Background Information

This page provides background information for bioinformaticians, informaticians and developers on the data in the CIP-API: how it is structured and how it can be queried.


Decision Support Systems

Decision Support Services (DSSs) provide Decision Support Systems (Web GUIs) that enable Clinical Scientists within the NHS to review genomes from cases processed through the Genomics England Interpretation Pipeline.

Currently, GeL is supported by these DSSs, which provide the following tools:

Rare Disease

  1. Fabric (formerly known as OMICIA) - Opal
  2. Congenica - Sapientia
  3. Illumina (TBD)

Cancer

  1. Illumina - BaseSpace Variant Interpreter (BSVI or VI) - Variant Interpreter

The flow of data from Genomics England to DSSs is managed by the CIP-API.


GeL Data Flow

The following details the high-level flow of results from GeL to the DSS and on to the GMC/GLH.

All of the data for each stage described below can be obtained from the API via GET requests.

See usage for more info and examples of queries to obtain data for each stage described below.

The table below explains the CIP-API statuses for a case:

CIP-API status            Description
waiting_payload           Interpretation Pipeline has completed
interpretation_generated  Interpretation Request has been created, ready to send to DSS
snp_check_pending         (GMS-Rare Disease) The referral test's samples do not have a complete set of all sample matching service results
snp_check_pass            (GMS-Rare Disease) The referral test's samples have a complete set of all samples passing the sample matching service comparison
snp_check_fail            (GMS-Rare Disease) At least one sample from a referral test has failed the sample matching service comparison
files_copied              VCF and tiering files have been sent to the DSS
dispatched                Interpretation Request has been sent to DSS
transfer_ready            DSS has completed their analysis of the case
transfer_complete         DSS has transferred their interpreted genome to the API
gel_qc_failed             The case has failed GeL QC checks and will be sent back to the DSS
rejected_wrong_format     Interpreted Genome has failed API validation
gel_qc_passed             The case has passed GeL QC checks
sent_to_gmcs              The case has been sent to the GMCs for them to generate the report in the DSS portal in Inuvika/web interface
report_generated          A report/summary of findings has been generated for this case by the GMC. More than one report can be generated
report_sent               The latest report/summary of findings' outcomes exit questionnaire has been submitted by the user
blocked                   Interpretation of the case has been blocked
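
Client code that polls case statuses often needs these descriptions to hand. The following is a minimal sketch, not part of the CIP-API itself: the names `CIP_API_STATUSES`, `GMS_RARE_DISEASE_ONLY` and `describe_status` are hypothetical, and the descriptions are copied from the table above.

```python
# Status descriptions copied from the CIP-API status table.
# The dictionary and helper names are illustrative, not part of the CIP-API.
CIP_API_STATUSES = {
    "waiting_payload": "Interpretation Pipeline has completed",
    "interpretation_generated": "Interpretation Request has been created, ready to send to DSS",
    "snp_check_pending": "The referral test's samples do not have a complete set of all sample matching service results",
    "snp_check_pass": "The referral test's samples have a complete set of all samples passing the sample matching service comparison",
    "snp_check_fail": "At least one sample from a referral test has failed the sample matching service comparison",
    "files_copied": "VCF and tiering files have been sent to the DSS",
    "dispatched": "Interpretation Request has been sent to DSS",
    "transfer_ready": "DSS has completed their analysis of the case",
    "transfer_complete": "DSS has transferred their interpreted genome to the API",
    "gel_qc_failed": "The case has failed GeL QC checks and will be sent back to the DSS",
    "rejected_wrong_format": "Interpreted Genome has failed API validation",
    "gel_qc_passed": "The case has passed GeL QC checks",
    "sent_to_gmcs": "The case has been sent to the GMCs for them to generate the report",
    "report_generated": "A report/summary of findings has been generated for this case by the GMC",
    "report_sent": "The latest report/summary of findings' outcomes exit questionnaire has been submitted",
    "blocked": "Interpretation of the case has been blocked",
}

# Statuses that the table labels "(GMS-Rare Disease)".
GMS_RARE_DISEASE_ONLY = {"snp_check_pending", "snp_check_pass", "snp_check_fail"}


def describe_status(status: str) -> str:
    """Return the description of a CIP-API status, or raise ValueError if it is unknown."""
    try:
        return CIP_API_STATUSES[status]
    except KeyError:
        raise ValueError(f"Unknown CIP-API status: {status!r}") from None
```

Raising on an unknown status, rather than returning a default, makes it obvious when the API introduces a status that the client does not yet handle.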

Case Data Flow

Note

Currently, the default interpretation services for Rare Disease cases are the Genomics England Tiering and Exomiser. The Decision Support Systems (DSSs) can provide an interpretation service as well. With DSS "Bronze Service", no Interpreted Genome for the case is created. With DSS "Silver Service", an Interpreted Genome for the case is created. Any participating interpretation service, including CIP provided interpretation services, can provide an Interpreted Genome in a modular fashion, which can then be presented visually in a DSS front end.

  1. Genomics England sends an Interpretation Request to an Interpretation Service
    • The payload of a Genomics England Interpretation Request contains a Pedigree and the location of the associated BAM and VCF files

  2. Interpretation Service processes the data and returns to Genomics England an Interpreted Genome
    • The payload of an Interpreted Genome contains any variants highlighted as candidates by an interpretation service (e.g. Genomics England Tiering, Exomiser)
    • For cancer cases, a summary of findings is automatically created with the Genomics England Cancer Tiering information

  3. Genomics England sends the case to a Decision Support System
    • Interpretation Data is sent to a Decision Support System. Once loaded, the case is ready to be reviewed by GMCs/GLHs

  4. Case is released to GMC/GLH in the Decision Support System
    • For cancer cases, GMCs/GLHs can see any supplementary findings from the Genomics England Cancer Interpretation pipeline (e.g. mutation burden) in the Genomics England Interpretation Portal.
    • The GMC/GLH receives an email alert from Genomics England that the case is ready to be reviewed

  5. GMC/GLH user browses variants in the Decision Support System, interprets the case, reviews the case and submits case back to GeL in the form of a Summary of Findings.
    • Primary findings are automatically exported from the DSS in the form of a GeL "Summary of Findings"
    • Genomics England make the data in the Summary of Findings available to GMCs/GLHs via the Genomics England Interpretation Portal

  6. GMC/GLH user completes Reporting Outcomes Exit Questionnaire in GeL Interpretation Portal
    • Variant exit questionnaire data is stored for each Summary of Findings generated from the DSS
    • Case is then set to status "Archived" in the Interpretation Portal once exit questionnaire for latest Summary of Findings is completed

Note

Cases can be reviewed and Summary of Findings generated via the Genomics England Interpretation Browser accessible via the Interpretation Portal (URL only accessible on HSCN)


GeLReportModels

Each .json payload (Interpretation Request, Interpreted Genome, Clinical Report) communicated to or from the CIP-API must agree with the specification described in GeLReportModels.

Github repository here: https://github.com/genomicsengland/GelReportModels/

Available on PyPi here: https://pypi.org/project/GelReportModels/

GeLReportModels is built using Apache Avro.

GelReportModels Versions used in the CIP-API and Version Control

The GelReportModels project defines the data models used by Genomics England. The project is organised in packages. Each package is versioned and can be developed independently of the others, so they can evolve separately. As the project packages are interdependent, a global version of GelReportModels is maintained to keep all of the packages in sync.

The CIP-API uses version 7.2 of GelReportModels, from which the following packages are used in API communication with end users and Decision Support Systems:

  • org.gel.models.report.avro - Version v6.0.1
  • org.gel.models.participant.avro - Version v1.1.2

The CIP-API uses version 7.3 of GelReportModels only for the Genomic Medicine Service Referral model. The Referral model can be returned alongside an interpretation request by adding the parameter ?extra-params=show_referral to the api/2/interpretation-request/{id}/{version} endpoint, or it can be queried directly from the referral list endpoint api/2/referral or the specific referral endpoint api/2/referral/{referral_id}. The model is contained in the following package:

  • org.gel.models.participant.avro - Version v1.2.0
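
The endpoints named above can be assembled as in the sketch below. This is an illustration only: the helper names are hypothetical, the host is a placeholder, the exact URL form (for example trailing slashes) is an assumption, and authentication is not covered.

```python
from urllib.parse import urlencode


def interpretation_request_url(host, ir_id, ir_version, show_referral=False):
    """Build the api/2/interpretation-request/{id}/{version} URL (hypothetical helper)."""
    url = f"https://{host}/api/2/interpretation-request/{ir_id}/{ir_version}"
    if show_referral:
        # Ask for the Referral model to be included in the response.
        url += "?" + urlencode({"extra-params": "show_referral"})
    return url


def referral_url(host, referral_id=None):
    """Build the referral list URL, or a specific referral URL when an id is given."""
    base = f"https://{host}/api/2/referral"
    return base if referral_id is None else f"{base}/{referral_id}"


if __name__ == "__main__":
    # "cipapi.example" is a placeholder host, not a real CIP-API address.
    print(interpretation_request_url("cipapi.example", 123, 1, show_referral=True))
    print(referral_url("cipapi.example"))
```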

For a practical example of using GelReportModels with data queried from the CIP-API, see Building objects from GelReportModels.

The table below maps each model to its key in the API response and to its GelReportModels package and project versions:

Model                                 GelReportModels Package Name     GelReportModels Package Version  GelReportModels Project Version  Key in api/2/interpretation-request response
InterpretationRequestRD               org.gel.models.report.avro       6.0.1                            7.2                              ["interpretation_request_data"]["json_request"]
InterpretationRequestCancer           org.gel.models.report.avro       6.0.1                            7.2                              ["interpretation_request_data"]["json_request"]
ClinicalReport (Summary of Findings)  org.gel.models.report.avro       6.0.1                            7.2                              ["clinical_report"][i]["clinical_report_data"]
InterpretedGenome                     org.gel.models.report.avro       6.0.1                            7.2                              ["interpreted_genome"][i]["interpreted_genome_data"]
ExitQuestionnaire                     org.gel.models.report.avro       6.0.1                            7.2                              ["clinical_report"][i]["clinical_report_data"]["exit_questionnaire"]["exit_questionnaire_data"]
Referral                              org.gel.models.participant.avro  1.2.0                            7.3                              ["referral"]["referral_data"]
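
As a rough sketch of how the keys in the table might be used, the function below pulls each payload out of an already-decoded api/2/interpretation-request response. The function name and the shape of the returned dictionary are hypothetical; only the key paths come from the table.

```python
def extract_payloads(response: dict) -> dict:
    """Collect the model payloads from a decoded interpretation-request response.

    Missing sections yield None or an empty list rather than raising,
    since not every case has reached every stage.
    """
    request_data = response.get("interpretation_request_data") or {}
    referral = response.get("referral") or {}
    return {
        "interpretation_request": request_data.get("json_request"),
        "interpreted_genomes": [
            genome.get("interpreted_genome_data")
            for genome in response.get("interpreted_genome") or []
        ],
        "clinical_reports": [
            report.get("clinical_report_data")
            for report in response.get("clinical_report") or []
        ],
        "referral": referral.get("referral_data"),
    }
```

Each returned payload is a plain dictionary that should agree with the corresponding GelReportModels schema.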

Note

The CIP-API currently does not use version 7.3 of GelReportModels for all packages, as it contains some non-backwards-compatible changes that downstream clients such as the Decision Support Systems have yet to integrate with.

Note

Since models in GelReportModels project version 3.1 are no longer supported, the CIP-API webservices extra parameter reports_v6=true is no longer needed. Older versions of the API (e.g. https://{cipapi-host}/api/1) do not support GMS cases, which are only available on API version 2 (e.g. https://{cipapi-host}/api/2). The Genomics England team will help with any questions you may have when moving to the supported version.

GelReportModels Packaging Structure Example

    {
      "version": "7.2",                                  ##This is the version of the entire GelReportModels project
      "packages": [
        {
          "package": "org.gel.models.participant.avro",  ##This is the model used for the participant package
          "python_package": "participant",
          "version": "1.1.2",                            ##This is the model version
          "dependencies": []
        },
        {
          "package": "org.gel.models.report.avro",       ##This is the model used for the report package
          "python_package": "reports",
          "version": "6.0.1",                            ##This is the model version
          "dependencies": [
            "org.gel.models.participant.avro"
          ]
        }
      ]
    }

    {
      "version": "7.3",                                  ##This is the version of the entire GelReportModels project
      "packages": [
        {
          "package": "org.gel.models.participant.avro",  ##This is the model used for the participant package specifically for Referrals
          "python_package": "participant",
          "version": "1.2.0",                            ##This is the model version
          "dependencies": []
        }
      ]
    }
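
A packaging structure like the one above can be read programmatically to find which package version belongs to a project version. The sketch below assumes the structure is available as plain JSON without the ## annotations; the `BUILD_7_2` constant and the `package_version` helper are hypothetical names.

```python
import json

# The 7.2 structure from the example above, without the ## annotations
# (standard JSON has no comment syntax).
BUILD_7_2 = json.loads("""
{
  "version": "7.2",
  "packages": [
    {
      "package": "org.gel.models.participant.avro",
      "python_package": "participant",
      "version": "1.1.2",
      "dependencies": []
    },
    {
      "package": "org.gel.models.report.avro",
      "python_package": "reports",
      "version": "6.0.1",
      "dependencies": ["org.gel.models.participant.avro"]
    }
  ]
}
""")


def package_version(build: dict, package: str) -> str:
    """Return the version of a package within one GelReportModels project build."""
    for entry in build["packages"]:
        if entry["package"] == package:
            return entry["version"]
    raise KeyError(f"{package} is not part of project version {build['version']}")


if __name__ == "__main__":
    print(package_version(BUILD_7_2, "org.gel.models.report.avro"))
```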


Interpretation Request

Rare Disease

The Rare Disease Interpretation Request represents data associated with a family ready for interpretation by an interpretation service or ingestion into a DSS.

Each interpretation request has an InterpretationRequestId and an InterpretationRequestVersion. If a family needs to be re-interpreted and re-sent to a DSS following new information, a new version of that request will be generated.

e.g. 123-1 (InterpretationRequestId=123, InterpretationRequestVersion=1) would be incremented to 123-2 (InterpretationRequestId=123, InterpretationRequestVersion=2) if/when a second analysis is performed by Genomics England on this case.
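
The id-version convention can be handled with a small helper such as the sketch below. It assumes both parts are purely numeric, as in the 123-1 example; `CaseId`, `parse_case_id` and `next_version` are hypothetical names, not part of the CIP-API.

```python
from typing import NamedTuple


class CaseId(NamedTuple):
    """An interpretation request id paired with its version."""
    interpretation_request_id: int
    version: int

    def __str__(self) -> str:
        return f"{self.interpretation_request_id}-{self.version}"


def parse_case_id(text: str) -> CaseId:
    """Parse a string such as "123-1" into its id and version."""
    request_id, separator, version = text.partition("-")
    if not separator or not request_id.isdigit() or not version.isdigit():
        raise ValueError(f"Expected '<id>-<version>', got {text!r}")
    return CaseId(int(request_id), int(version))


def next_version(case: CaseId) -> CaseId:
    """Return the identifier a re-interpretation of the same case would receive."""
    return case._replace(version=case.version + 1)


if __name__ == "__main__":
    print(next_version(parse_case_id("123-1")))
```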

A Rare Disease Interpretation Request contains:

  1. Pedigree:
    • A list of rare disease participants/patients (both sequenced AND non-sequenced) for that family. Each participant/patient has associated HPO terms, relationships, affection status etc.
    • Gene panels applied for that family and associated Panel coverage metrics
    • Whether the family is suspected of having a non-penetrant condition

  2. List of associated files:
    • BAMs
    • VCFs
    • BigWigs
    • Expansion Hunter STR Plots

  3. Gene Panels Coverage:
    • Coverage of each gene in each assigned gene panel per sequenced family member

Cancer

The Cancer Interpretation Request represents data associated with a participant that is ready for interpretation by a DSS.

Each interpretation request has a reportRequestId and a reportVersion. If a case needs to be re-sent to a DSS following new information, a new version of that request will be generated.

e.g. 123-1 (reportRequestId=123, reportVersion=1) would be incremented to 123-2 (reportRequestId=123, reportVersion=2) if/when a second interpretation is performed by GeL on this case.

A Cancer Interpretation Request contains:

  1. Cancer Participant:
    • Contains demographic and phenotypic data on the participant
    • Includes any tumour and germline sample information, and if they are matched (analysed together)

  2. List of associated file paths:
    • BAMs
    • VCFs
    • BigWigs


Interpreted Genome

An Interpretation Service (e.g. Exomiser, a DSS Interpretation Service) receives an Interpretation Request from the CIP-API, and returns its own candidate variants to the CIP-API in the form of an Interpreted Genome.

Interpretation Services might send multiple interpreted genomes for an Interpretation Request version (e.g. if the Interpretation Service re-processes a case), although this is not the usual behaviour. End users are expected to review the most recent interpreted genome added by each Interpretation Service.
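
Selecting the most recent interpreted genome per Interpretation Service could look like the sketch below. The key names `service` and `created_at`, and the ISO 8601 timestamp format, are assumptions made for illustration; check the actual API response for the real field names before relying on this.

```python
from datetime import datetime


def latest_per_service(interpreted_genomes, service_key="service", created_key="created_at"):
    """Return the most recently created interpreted genome for each service.

    service_key and created_key are hypothetical field names; the timestamp
    is assumed to be an ISO 8601 string such as "2020-01-31T10:00:00".
    """
    latest = {}
    for genome in interpreted_genomes:
        service = genome[service_key]
        created = datetime.fromisoformat(genome[created_key])
        # Keep this genome if it is the first seen for the service, or newer.
        if service not in latest or created > latest[service][0]:
            latest[service] = (created, genome)
    return {service: genome for service, (_, genome) in latest.items()}
```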

An Interpreted Genome contains a list of candidate variants. For each variant, information about how it is classified is given in the form of a report event. The full specification of a report event can be found in the GelModels Documentation; the most relevant fields are described here:

  1. Tier
    • Genomics England classification for germline (rare disease and cancer) variants
    • A description of the Genomics England tiering process for Rare Disease and Cancer cases can be found in the Genomics England Guides, which can be downloaded from here.
    • The list of GeL Tiered Variants contains an array of variants and their associated tiering report events

  2. Domain

  3. vendorSpecificScores
    • Other scores that the interpretation provider may add, for example phenotypically informed or family-informed scores (e.g. Phevor rank, VAAST score, etc.)

  4. reportEvent.comments
    • Comments made by the interpretation provider on the particular report event. This is a free-text field filled in by the producer of the interpreted genome, which is usually software. It should never be populated with sensitive data.

  5. variantAttributes.comments
    • Comments added at variant level by an interpretation service. This is a free-text field filled in by the producer of the interpreted genome, which is usually software. It should never be populated with sensitive data.
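
To illustrate how report events relate to variants, the sketch below counts report events per tier in a decoded Interpreted Genome payload. The `variants` and `reportEvents` keys and the example tier values are assumptions for illustration; the GelReportModels schema is the authority on the actual field names.

```python
from collections import Counter


def count_report_events_by_tier(interpreted_genome: dict) -> Counter:
    """Count report events per tier across all variants in the payload.

    Assumes each variant carries a "reportEvents" list and each report event
    a "tier" value; both key names are assumptions, not confirmed here.
    """
    tiers = Counter()
    for variant in interpreted_genome.get("variants") or []:
        for event in variant.get("reportEvents") or []:
            tiers[event.get("tier")] += 1
    return tiers


if __name__ == "__main__":
    # Made-up payload: the tier labels are placeholders.
    example = {
        "variants": [
            {"reportEvents": [{"tier": "TIER1"}, {"tier": "TIER3"}]},
            {"reportEvents": [{"tier": "TIER3"}]},
        ]
    }
    print(count_report_events_by_tier(example))
```

Because one variant can have several report events, the totals can exceed the number of variants.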

Note

Genetic variants are associated with one or more report events depending on the assigned gene panels, mode of inheritance, penetrance etc.

Attention

Currently, the Cancer Data Flow does not include any additional Interpreted Genome other than the default Genomics England Tiering service.


Summary of Findings

Note

The Summary of Findings was previously known as the Clinical Report. The term has changed but the model, schema, and json payload are identical to the Clinical Report in GeLReportModels.

Note

It is possible for Summary of Findings to be created in the Interpretation Browser of the Interpretation Portal

A GMC/GLH user (e.g. Clinical Scientist) will highlight variants in the DSS front end to mark them as Primary Findings. Once a case is closed in the DSS front end, the DSS will send a Summary of Findings (Clinical Report) JSON to the CIP-API. Once the Summary of Findings is successfully ingested in the CIP-API, the case status is updated to report_generated.

As in the GeL Interpretation Request, each variant in the Summary of Findings is associated with one or more ReportEvents depending on the panels applied, mode of inheritance, penetrance, etc.

Comments or classifications made by GMC/GLH users on variants within the DSS will feed through into the GeL Summary of Findings HTML which can be downloaded in the GeL Interpretation Portal.

It is possible for GMC/GLH users to create multiple versions of the Summary of Findings in the DSS, e.g. one Summary of Findings might be made to facilitate multidisciplinary team discussion before a final Summary of Findings is generated. All Summary of Findings versions are accessible from the API and in the GeL Interpretation Portal in HTML format.

A Summary of Findings contains:

  1. Variants - Variants reported by GMC/GLH user in the DSS as Primary Findings in the following formats:

  2. genomicInterpretation - Summary of the interpretation. This should reflect the positive conclusions of this interpretation of the case

  3. reportEvent.comments - Comments added at reportEvent level by an end user in either a DSS or the MDT tool. This is a free text field. It should never be populated with sensitive data.
  4. variantAttributes.comments - Comments added at variant level by an end user in either a DSS or the MDT tool. This is a free text field. It should never be populated with sensitive data.

Note

Note that GeL produces preliminary findings reports for the Cancer interpretation pipeline.


Exit Questionnaire

The Exit Questionnaire (Rare Disease and Cancer) is directly associated with a particular Summary of Findings. The Exit Questionnaire has family and variant level questions for Rare Disease cases, and case and variant level questions for Cancer cases.
It contains questions on whether variants have been technically validated, whether they have any actionability, and (in Rare Disease) whether they explain any phenotype(s) of the proband.

This information can be extracted from the CIP-API programmatically by GMC/GLH users as the Exit Questionnaire is a data element within the Summary of Findings.
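
A minimal sketch of that extraction is shown below. It follows the key path given for ExitQuestionnaire in the GelReportModels versions table and works on an already-decoded API response; the function name is hypothetical.

```python
def exit_questionnaires(response: dict) -> list:
    """Return (report_index, exit_questionnaire_data) pairs from a decoded response.

    Summaries of Findings without a submitted Exit Questionnaire are skipped.
    """
    found = []
    for index, report in enumerate(response.get("clinical_report") or []):
        report_data = report.get("clinical_report_data") or {}
        questionnaire = report_data.get("exit_questionnaire") or {}
        if "exit_questionnaire_data" in questionnaire:
            found.append((index, questionnaire["exit_questionnaire_data"]))
    return found
```

Returning the report index alongside the data keeps each questionnaire tied to the Summary of Findings it belongs to.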


Referral (NHS Genomic Medicine Service)

The NHS-commissioned Genomic Medicine Service (GMS) has additional clinical and referral metadata for cancer and rare disease cases. A GMS referral is a contained set of clinical data for a pedigree/patient together with a set of clinical genetics tests (both wet lab and genomics).

A referral can have many tests and samples that are unique or shared between tests. A patient can also have more than one referral. For each "referral test", there will be an interpreting organisation that will provide a Summary of Findings to the requesting organisation.

Note

Currently, the CIP-API only displays Whole Genome Sequencing results for the NHS Genomic Medicine Service. Therefore, GMS referrals will have only one "referral test", which is the WGS Test. In the future, Exomes and large panels may become additional tests.

The DSSs and Interpretation Services will not receive referrals but instead will continue to use the Interpretation Request as the primary means of API communications. However, the referral data is presented to the user in the Genomics England Interpretation Portal and can be retrieved from the CIP-API via the webservices (see GLH/GMC Tools Usage).