Curation by Design v11SM

The document discusses curating genomic data by design from creation to interpretation. It proposes sharing curation responsibility over the whole community through creating an independent DATA cooperative to handle exponential growth and complexity of curating individual citizen data on their behalf in a trusted and transparent manner between institutes.

Genomic Data Curation by Design

Peter Walgemoed (1), Bert Eussen (2)
(1) Carelliance Group BV, Eindhoven, NL; (2) Clinical Genetics, ErasmusMC, Rotterdam, NL
Sharing genomic data globally for all stakeholders from creation to interpretation is
a major challenge. The figure on the right shows this challenge and the different perspectives
of each stakeholder. Solutions are being developed at the institutional level.
Catalogues like Decipher and ClinVar are the repository end products of international
knowledge communities. Besides these reference datasets, clinical (HPO), imaging,
behaviour/lifestyle and diagnostic (SNOMED, LOINC) datasets are also used.
To support curation, we have developed a concept in which data is tagged from the
moment of creation and can be shared globally. Curation starts with raw data in a lab
or with the clinical work-up. The processed data is linked to a person as a research and/or
clinical subject at the institutional level. For each step of the process, standardization of
the format is required. The local institute is responsible for the master copy, as well as for
access to the privacy and sample IDs. The question is whether each institute is able to
take responsibility for curation. We propose sharing the responsibility across the whole
community, and creating a new, independent DATA co-operative.
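
The idea of tagging data from the moment of creation can be sketched in code. The example below is only an illustration under stated assumptions: the `AssetTag` fields, the `tag_asset` and `lineage` helpers, and the use of a SHA-256 content hash as identifier are hypothetical and are not taken from the poster. It shows one way a tag could record the process step, the standardized format, the institute holding the master copy, and the asset it was derived from.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AssetTag:
    """Hypothetical tag attached to a data asset at the moment of creation."""
    asset_id: str              # content hash of the payload (assumed identifier scheme)
    step: str                  # process step, e.g. "raw" or "processed"
    data_format: str           # standardized format used for this step
    institute: str             # institute responsible for the master copy
    parent_id: Optional[str]   # asset this one was derived from, if any
    created_at: str            # UTC timestamp in ISO 8601 form


def tag_asset(payload: bytes, step: str, data_format: str, institute: str,
              parent: Optional[AssetTag] = None) -> AssetTag:
    """Create the tag together with the data, not afterwards."""
    return AssetTag(
        asset_id=hashlib.sha256(payload).hexdigest(),
        step=step,
        data_format=data_format,
        institute=institute,
        parent_id=parent.asset_id if parent else None,
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def lineage(tag: AssetTag, index: Dict[str, AssetTag]) -> List[str]:
    """Walk back through parent tags and return the steps, oldest first."""
    steps = [tag.step]
    while tag.parent_id is not None:
        tag = index[tag.parent_id]
        steps.append(tag.step)
    return list(reversed(steps))
```

With tags like these, a processed file can always be traced back to the raw data it came from, whichever institute holds the master copy.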

[Figure: DATA: Co-operation between Institutes. Layers: end-user (client/citizen) value, applications, IT infrastructure, DATA; annotated "Data value creation". Labels: Data Service Curation; Managed Cloud Service; Institutional service; Citizens; Clients; Array service; NGS service; EU Diagnostic; LAB service; HPO; Privacy Keys; CN Analysis; Variant Analysis; P2G; Software; FAIR (master) Data Asset; IT-infra DATA Center; Any Trusted Center; Independent Master Data Copies.]


Trusted Document System: TrustDocA

The lab is a data collection point but it is driven by its clients (researchers and clinicians). These clients have the responsibility to manage the privacy for their clients (citizens).
Therefore data curation is on behalf of the citizen. All procedures and lab services are documented in a trusted, authoring document system (TrustDocA).

[Figure: Institute Services. Labels: Governance model; Medical Request; Informed consent; Clinical exam; Sample; Fraction; Technical Assay; Analysis; Interpretation; HPO Phenotype; Genotype; Privacy data.]

Data curation starts with a medical request for a citizen, and includes informed consent. A clear and uniform governance model should form the basis of this registration.
All data derived from the original request should be linked within this model. Diagnostic, research or shared processes are represented by the red and blue lines.
The governance metadata should be present in all the production files (assets) and should be included in the master copies of the generated assets.
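
As a minimal sketch of what "governance metadata in every asset" could mean in practice, the example below registers a governance record with the medical request and checks that each production file and master copy carries it. The `Governance` and `Asset` classes, their field names, and the `missing_governance` helper are assumptions made for illustration; the poster does not specify a data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Governance:
    """Hypothetical governance record created with the medical request."""
    request_id: str   # the original medical request
    consent_id: str   # the citizen's informed consent
    purpose: str      # "diagnostic", "research" or "shared"


@dataclass
class Asset:
    """A production file or its master copy, with optional governance metadata."""
    name: str
    governance: Optional[Governance]
    is_master_copy: bool = False


def missing_governance(assets: List[Asset], expected: Governance) -> List[str]:
    """Return the names of assets that do not carry the expected governance record."""
    return [a.name for a in assets if a.governance != expected]
```

A check like this could run whenever an asset or its master copy is written, so that data derived from a request never loses its link to the original consent.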

It will be challenging for citizens to curate their own data. The data is likely to grow exponentially, and handling phenotypic,
laboratory, treatment, municipal and personal health and lifestyle data will become very complex. Therefore a trusted and transparent co-operation between institutes is
required to curate the data on the citizen’s behalf. The DATA co-operative not only provides storage and preservation but also creates value by using
the data as much as possible. Data repositories like Decipher and ClinVar, and local applications, are service providers for
combined datasets and act on behalf of clinical and research clients in the DATA co-operative.
Transparent data collection systems are essential for consortia wanting to share data on behalf of their clients/citizens
as part of a FAIR data policy. Governance should be by design, and citizen informed consent implies
that a data copy is curated by the DATA co-operative and should be available for future generations.

Carelliance Group BV Eindhoven & Clinical Genetics ErasmusMC Email: [email protected] [email protected] 2016
