Datasets

DATA-CAN is making high-quality datasets available to cancer researchers via the Health Data Research Innovation Gateway.

Click on the links below to find out more about the DATA-CAN datasets available via the Innovation Gateway.

Children’s Kidney Cancers – Great Ormond Street Hospital

GOSH is the custodian of the latest study dataset, IMPORT. All of the trial/study datasets include full patient and tumour demographics, associated congenital abnormalities, tumour stage, treatment received, and follow-up for relapse and death. Find this dataset on the Innovation Gateway.

Comprehensive Patient Records for Cancer Outcomes

The Comprehensive Patient Records research dataset covers the medical history of cancer patients prior to cancer, their diagnosis and treatment, and their long-term outcomes, together with the medical history of matched non-cancer patients who form a comparator cohort. Find this dataset on the Innovation Gateway.

Genomics England 100k Bioinformatics Data

Contains tables of genomic data and outputs from the GEL interpretation pipeline for participants from both the cancer and rare disease programmes. These tables do not directly include primary and secondary sources of clinical data. Find this dataset on the Innovation Gateway.

Genomics England 100k Cancer and Common

Cancer data are presented for either the patient-level cancer diagnosis (or “disease type”) or the tumour-specific sample details of participants in the Cancer arm of the 100,000 Genomes Project. Find this dataset on the Innovation Gateway.

Genomics England 100k NHS Digital Linked Data

NHS national datasets collect information from care records, systems and organisations on specific areas of health and care. Find this dataset on the Innovation Gateway.

Genomics England 100k Public Health England Linked Data

This dataset brings together data from more than 500 local and regional datasets to build a picture of an individual’s treatment from diagnosis. Available for patients diagnosed with cancer (ICD-10 C00-97, D00-48) between 1 January 1995 and 31 December 2017. Find this dataset on the Innovation Gateway.

Genomics England 100k Rare Disease and Common

Rare Disease (RD) data are presented at the level of RD families, RD pedigrees, and participants. Participants are consenting individuals who have had their genome sequenced. Pedigree members are extended members of the proband’s family. Find this dataset on the Innovation Gateway.

Genomics England 100k Quick View

Cancer data are presented for either the patient-level cancer diagnosis (or “disease type”) or the tumour-specific sample details of participants in the Cancer arm of the 100,000 Genomes Project. Find this dataset on the Innovation Gateway.

Leeds-IQVIA Collaboration

This uses the data within PPM+, the EHR for the Leeds Cancer Centre and Leeds Teaching Hospitals. It includes all patients diagnosed with cancer since 1990, all chemotherapy since 1993, and all radiotherapy since 1994. It integrates all sources of EHR data. Find this dataset on the Innovation Gateway.

National Cancer Trusted Research Environment

NHS Digital’s Trusted Research Environment (TRE) service for England provides approved researchers with access to essential linked, de-identified health data to quickly answer COVID-19 related research questions. The TRE service also provides researchers with support on data access requests (DARS). Find this dataset on the Innovation Gateway.

Population Health Management and Cancer Outcomes in the West Yorkshire and Harrogate Cancer Alliance

The purpose of this dataset is to enable the sharing of de-identified patient data for Population Health Management and Cancer Outcomes in the West Yorkshire and Harrogate Cancer Alliance (WY&H CA). Find this dataset on the Innovation Gateway.

Real Time Data Network (RTDN)

Near-real-time aggregated cancer activity data from eight major sites across the UK, for the purpose of testing the effect of the COVID-19 pandemic on cancer diagnostic and cancer treatment pathways. Find this dataset on the Innovation Gateway.

South West Primary Care Dataset

Through its collaboration with Exeter, the South West Cancer Alliance has submitted primary care datasets. Find this dataset on the Innovation Gateway.

Yorkshire Specialist Register of Cancer in Children and Young People

A regional asset for Yorkshire and Humber Data, a population-based database of all children and young people (0–29 years) diagnosed with cancer while residing in the Yorkshire and Humber region of England (10,000 tumour registrations in children aged 0–14 years). Find this dataset on the Innovation Gateway.

Visit the Innovation Gateway to find out more.

Curated Electronic Health Record (EHR) Network

In addition to the datasets on the Innovation Gateway, DATA-CAN has unique access to curate patient-level data at source within the electronic health records (EHR) of several large NHS Cancer Centres across the UK. These include Leeds Teaching Hospitals NHS Trust, University Hospitals Birmingham NHS Foundation Trust and NHS Lothian. The data are checked and completed within the clinical record, then extracted and combined to create bespoke datasets according to the requirements of project sponsors. The resulting datasets are analysed to generate real-world evidence, typically relating to patient characteristics, treatment regimens and outcomes. Only de-identified results are made available to project sponsors. Practising oncologists provide expert input at all stages of the studies, from scoping and protocol design to final reporting. All commercially funded projects are subject to approval by members of the DATA-CAN PPIE Group.

Get in touch to find out more.