Data Services for Geochemical Data
Kerstin Lehnert
Lamont-Doherty Earth Observatory, Columbia University
Director IEDA/EarthChem/SESAR
President IGSN e.V.
Topics
● Geochemical Data Services: EarthChem, Astromat, & SESAR
○ Purpose, services, usage
○ Data structure, metadata content, data exchange
○ Community input and adoption
● OneGeochemistry
○ Why we need it
○ How we can do it
AGN Mini-Workshop, March 25, 2020
Overview
● EarthChem: NSF funded
EarthChem Evolution
● 1996 – 2010: Multiple NSF awards to develop and operate the PetDB database
● 2004 – 2009: NSF awards to develop and operate SESAR & the IGSN
● 2003: EarthChem formed as an alliance of PetDB, NAVDAT, & GEOROC (R. Carlson, A.
Hofmann, D. Walker, K. Lehnert)
● 2006 – 2011: NSF award to develop EarthChem as a global one-stop-shop for geochemical
data (EarthChem Portal) and infrastructure for community contributions
● 2007 – 2009: Editors Roundtable to develop & promote Best Practices for geochemical data
● 2010 – 2020: EarthChem & SESAR operated & maintained as part of the IEDA Cooperative
Agreement with NSF
● 2010: Release of the EarthChem Library (data repository)
● 2016: Inclusion of experimental petrology data systems LEPR and TraceDs
Geochemical Data Services
● Data Publication & Preservation: Repository (ECL, Geochron, AstroRep)
● Data Mining & Analysis: Synthesis (PetDB, AstroDB, LEPR/TraceDs)
● Global Data Network: Portal (EarthChem Portal)
● Best Practices, Data Standards
● System development, administration, & operation
● Review & curation of user submitted data
● Aggregation & ingestion of published data
● Community engagement (workshops, newsletter, etc.)
● Participation in & leadership of national & international initiatives for research data management
EarthChem Architecture & Data Flow
[Figure: EarthChem architecture and data flow. Components: EarthChem Synthesis and EarthChem Portal (each holding data, citation, sample metadata, and method metadata); NAVDAT, GEOROC, and USGS; Geochron (sample metadata & ages, analytical metadata); EarthChem Library (title, authors, keywords, license, etc., plus data files & metadata). Inputs and flows: lab instruments and manual upload, user submission, data ingestion by curators, XML data transfer.]
EarthChem Library
Astromat Repo
● Scope: Publication & archiving of geochemical, petrological,
mineralogical, and geochronological data (user contributed)
● Purpose: Support researchers in making their data FAIR
● Content: User submitted datasets (new data, compilations)
● Services: Data curation following international best practices
○ Guidelines & templates for users to document data provenance & quality
○ Web-based data submission & data search/access
○ Dataset review by dedicated curators
○ DOI registration (DataCite)
○ Long-term archiving in Amazon Glacier
○ Links data to publications, samples, funding awards, authors; interoperability
with data portals (e.g. DataONE), schema.org
● Structure: File-based system with metadata catalog
Repository
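The repository slide mentions schema.org interoperability and links from datasets to publications, samples, funding awards, and authors. As a rough illustration only, a dataset record could be expressed as a schema.org `Dataset` description in JSON-LD. The property names (`name`, `identifier`, `creator`, `keywords`, `license`, `citation`) are standard schema.org terms; the function name `dataset_jsonld` and all values (title, DOI, authors) are hypothetical and do not show the markup the EarthChem Library actually emits.

```python
import json


def dataset_jsonld(title, doi, authors, keywords, license_url, related_publication=None):
    """Build a minimal schema.org Dataset description as a plain dict."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": title,
        # A DOI is exposed as a resolvable URL.
        "identifier": f"https://doi.org/{doi}",
        "creator": [{"@type": "Person", "name": a} for a in authors],
        "keywords": list(keywords),
        "license": license_url,
    }
    if related_publication:
        # Link the dataset to the paper that reports it.
        record["citation"] = related_publication
    return record


# Hypothetical example values, for illustration only.
example = dataset_jsonld(
    title="Hypothetical basalt glass compositions",
    doi="10.0000/example-doi",
    authors=["A. Researcher", "B. Analyst"],
    keywords=["geochemistry", "basalt", "major elements"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
    related_publication="Researcher & Analyst (2020), hypothetical journal article",
)

print(json.dumps(example, indent=2))
```

Embedding such a record in a dataset landing page is one common way for data portals and search engines to harvest repository metadata.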
EarthChem’s Best Practices for Geochemical Data
● Following recommendations by the Editors Roundtable, developed between 2007 and
2009 and signed by publishers and data facilities (the precursor to COPDESS).
● Fairly high-level guideline for data reporting, including provenance metadata of
analytical measurements, sample identification, and sample metadata.
● EarthChem data submission templates are designed following this guideline.
● Not comprehensive.
● Not a technical standard.
Goldstein, Steven L.; Hofmann, Albrecht W.; Lehnert, Kerstin A. (2014): Requirements for the Publication of Geochemical Data. Interdisciplinary Earth Data Alliance (IEDA). http://dx.doi.org/10.1594/IEDA/100426
EarthChem Library Usage
● Accepts a wide range of data types
○ Observational data, mainly sample-based
■ Chemical & physical properties of different materials
(rock, soil, fluid, gas, mineral, etc.)
■ Sample descriptions
○ Data compilations
○ (Method descriptions)
● Not ready for models & software
Implementation of FAIR Data policy at AGU/Wiley
Recent ECL Datasets
● Major, trace element and Sr-Nd-Pb isotope data for lavas from Holes 559 and 561 of DSDP Leg 82
● Magmatism at the Southern End of the East African Rift System: Origin and Role During Early Stage Melting
● Susquehanna Shale Hills Critical Zone Observatory - Cole Farm groundwater surface water chemistry 2018
● Major and minor element data and mineralogical composition of products from kinetic dolomitization laboratory experiments
● Major and trace element geochemistry of igneous and sedimentary rocks from Lingshan Island, China
● Humic acids and soil organic matter in three forest types in Taiwan
● U-Pb detrital zircon geochronology of the Bagua Basin, Northern Peru
● The emissions of CO2 and other volatiles from the world’s subaerial volcanoes
● Open apatite Sr isotopic system in low-temperature hydrous regimes (Baogutu porphyry Cu deposit, Western Junggar, China)
● Geochemical analyses of ice cores to investigate the hydrology and biogeochemistry of Lake Eggers, McMurdo Sound, Antarctica
● Incubation experiment to investigate the effect of phosphorus slag on the physical-chemical properties of cadmium contaminated
soil
● Hafnium isotopes in Eastern North American tholeiites of the Central Atlantic Magmatic Province
● Elemental concentration of carbon and nitrogen in peat from Molokai, Hawaii
● Volatile and major elements in mid-ocean ridge basalt glasses from the Gulf of Aden
● Susquehanna Shale Hills Critical Zone Observatory - Cole Farm Geochemistry
● Pliocene Palaeoclimate off Southeastern Africa: Insights from IODP Expedition 361
PetDB (ECS)
LEPR, TraceDs
AstroDB
● Scope: Synthesis of geochemical and petrological data
● Purpose: Advanced data access & mining (access to individual
measurements)
● Content: Published data, processed (harmonized) and ingested
by data curators
○ Analytical and experimental observations; compositional (geochemistry,
mineralogy)
○ Thematic collections
● Services: Interactive search interface for data exploration &
retrieval at the granularity of individual samples/experiments
& measurements
● Structure: Relational database
● Special Features: Highly successful data infrastructure for
geochemistry since the late 1990s (860 citations in the literature)
Synthesis
Synthesis Databases: Data Model (ODM2)
● Feature of Interest: e.g. geological unit, volcano, lava flow
● Sampling Feature: e.g. specimen, core (can have hierarchy)
● Action: e.g. sample preparation, analysis
● Result: e.g. measured value
● Method
● Metadata
● Annotations (repeated across the diagram)
Horsburgh, J. S. et al. (2016): Observations Data Model 2: A community information model for spatially discrete Earth observations. Environmental Modelling & Software, 79, 55–74. https://doi.org/10.1016/j.envsoft.2016.01.010
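The chain of entities named on this slide (feature of interest, sampling feature, action, result, with method and annotations attached) can be sketched as a few linked record types. This is a simplified illustration, not the ODM2 schema itself: the class and field names follow the slide labels rather than ODM2 table names, and the `lineage` helper and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FeatureOfInterest:
    name: str  # e.g. a geological unit, volcano, or lava flow
    annotations: List[str] = field(default_factory=list)


@dataclass
class SamplingFeature:
    identifier: str  # e.g. a specimen or core
    feature_of_interest: FeatureOfInterest
    parent: Optional["SamplingFeature"] = None  # supports a sample hierarchy
    annotations: List[str] = field(default_factory=list)


@dataclass
class Method:
    name: str


@dataclass
class Action:
    kind: str  # e.g. "sample preparation" or "analysis"
    method: Method
    sampling_feature: SamplingFeature


@dataclass
class Result:
    variable: str
    value: float
    unit: str
    action: Action  # the action that produced this measured value


def lineage(sample: SamplingFeature) -> List[str]:
    """Return sample identifiers from the top-level parent down to `sample`."""
    chain = []
    current: Optional[SamplingFeature] = sample
    while current is not None:
        chain.append(current.identifier)
        current = current.parent
    return list(reversed(chain))


# Hypothetical records, for illustration only.
volcano = FeatureOfInterest("Hypothetical Volcano")
core = SamplingFeature("CORE-1", volcano)
specimen = SamplingFeature("CORE-1-A", volcano, parent=core)
analysis = Action("analysis", Method("ICP-MS"), specimen)
result = Result("Nb", 1.8, "ppm", analysis)

print(lineage(result.action.sampling_feature))
```

Because every result points back through its action to a sampling feature and a feature of interest, a single measured value keeps its full provenance.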
EarthChem Synthesis: Data Search & Retrieval
Example Queries
● Give me all 87Sr/86Sr and O isotopes for granites from location X and Y
● Give me all REE data for basaltic samples with Nb < 2 ppm
● Give me all data for samples with ages between 2 and 2.5 My
● Give me all olivine compositions for samples with Sr & Nd isotopes available
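To make one of these queries concrete, the sketch below applies "all REE data for basaltic samples with Nb < 2 ppm" to a small in-memory list of sample records. It does not use any EarthChem interface: the record layout (`sample_id`, `rock_type`, `values_ppm`), the function name, and the sample values are all assumptions made for illustration.

```python
# Naturally occurring rare earth elements (lanthanides, excluding Pm).
REE = ("La", "Ce", "Pr", "Nd", "Sm", "Eu", "Gd", "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu")


def ree_for_low_nb_basalts(samples, nb_max_ppm=2.0):
    """Return REE values for basalt samples whose Nb is below nb_max_ppm."""
    hits = []
    for sample in samples:
        nb = sample["values_ppm"].get("Nb")
        # Samples without an Nb measurement cannot satisfy the filter.
        if sample["rock_type"] == "basalt" and nb is not None and nb < nb_max_ppm:
            ree = {el: v for el, v in sample["values_ppm"].items() if el in REE}
            hits.append({"sample_id": sample["sample_id"], "ree_ppm": ree})
    return hits


# Hypothetical sample records, for illustration only.
samples = [
    {"sample_id": "S1", "rock_type": "basalt",
     "values_ppm": {"Nb": 1.5, "La": 2.1, "Ce": 6.0, "Sr": 110.0}},
    {"sample_id": "S2", "rock_type": "basalt",
     "values_ppm": {"Nb": 12.0, "La": 15.0}},
    {"sample_id": "S3", "rock_type": "granite",
     "values_ppm": {"Nb": 1.0, "La": 30.0}},
    {"sample_id": "S4", "rock_type": "basalt",
     "values_ppm": {"La": 3.0}},
]

hits = ree_for_low_nb_basalts(samples)
print(hits)
```

The filter only works because each value is stored per sample and per element, which is the granularity the synthesis databases provide.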
EarthChem Portal
● 22,074 publications
● 1,054,738 samples
● 30,059,995 analytical values
Access to data in a global federation of geochemical databases:
● PetDB
● SedDB
● GEOROC (Germany)
● USGS
● MetPetDB
● GANSEKI (Japan)
Features:
● Machine-readable interfaces
● Interoperability with modeling tools
● EarthChemXML for data integration
Value & Impact of Data Synthesis
● Minimize time for data wrangling & reuse
● Allow quick exploration & testing of research
hypotheses
● Allow real-world data use in the classroom
● Facilitate and inspire new research paradigms
○ Statistical Geochemistry
○ Machine Learning
○ Neural Network Analysis
Ueki, K., Hino, H., & Kuwatani, T. (2018). Geochemical discrimination and characteristics of magmatic tectonic settings: A machine-learning-based approach. Geochemistry, Geophysics, Geosystems, 19, 1327–1347. https://doi.org/10.1029/2017GC007401
EarthChem-in-a-Box
● Support community contributions to EarthChem Synthesis
● Support local data management, e.g. for projects or in geochemistry labs
[Figure: three nodes (Lab 1, Lab 2, Lab 3), each with the same stack: web applications (Admin, Search, Page…), a database (ODM2 compliant), and APIs. A Joint Metadata Index is shown alongside the lab nodes.]
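The slide does not say how the Joint Metadata Index is built. As one possible reading, the sketch below merges per-lab metadata records into a single index keyed by sample identifier, so that a search can find which lab nodes hold data for a sample. The function name `build_joint_index`, the record fields, and the sample values are all hypothetical.

```python
def build_joint_index(nodes):
    """Merge per-lab metadata records into one index keyed by sample id.

    `nodes` maps a lab name to a list of records, each with a
    "sample_id" and a list of measured "variables".
    """
    index = {}
    for lab, records in nodes.items():
        for record in records:
            entry = index.setdefault(
                record["sample_id"], {"labs": [], "variables": set()}
            )
            if lab not in entry["labs"]:
                entry["labs"].append(lab)
            entry["variables"].update(record["variables"])
    return index


# Hypothetical lab holdings, for illustration only.
nodes = {
    "Lab 1": [{"sample_id": "X-1", "variables": ["SiO2", "MgO"]}],
    "Lab 2": [
        {"sample_id": "X-1", "variables": ["87Sr/86Sr"]},
        {"sample_id": "Y-7", "variables": ["Nb"]},
    ],
    "Lab 3": [],
}

index = build_joint_index(nodes)
for sample_id in sorted(index):
    print(sample_id, index[sample_id]["labs"], sorted(index[sample_id]["variables"]))
```

Only metadata is copied into the index in this sketch; the measured values stay in each lab's own database.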
Challenges for Data Synthesis & Federation
● Lack of internationally endorsed data exchange standard for analytical data
○ Makes it difficult to automate data ingestion.
○ Hinders growth of federation and accessible data collections.
● Architectures becoming unfit for growing volume of data and increasing demand for
access to most or all of the data for data analytics.
● Lack of technical infrastructure or resources for a federation of partner systems.
○ Unable to deliver data in encoded form.
○ Unable to modify systems to align with new architecture or data exchange standards (incl. vocabularies).
● Lack of formal agreements or governance of the federation.
OneGeochemistry
“OneGeochemistry seeks to create a global geochemical data network that facilitates and
promotes discovery and access of geochemical data through coordination and
collaboration among international geochemical data providers.”
From: Lehnert K, Wyborn L, Bennett V C, Hezel D, McInnes B I A, Plank T, Rubin K (2019): OneGeochemistry: Towards an Interoperable Global Network of FAIR Geochemical Data. Abstract submitted to CODATA 2019.
The Model:
OneGeochemistry Goals
● Develop internationally endorsed best practices for FAIR geochemical data.
○ Define requirements for data documentation (method, samples, data quality, etc.)
● Develop and implement interoperability standards for geochemical data to enable
machine-to-machine exchange and integration of geochemical data.
○ Align with modern technology, e.g. semantic web standards.
○ Use, where possible, internationally endorsed vocabularies.
● Develop a governance model for the organization to ensure participation and trust.
● Develop a business model to ensure long-term sustainability.
“We must, indeed, all hang together or, most assuredly, we shall all hang separately.”
Benjamin Franklin
Geochemical Data Standards & Best Practices
● What are the requirements for interoperability and reusability of lab analytical data?
○ We need to define Best Practices for researchers reporting the data.
○ We need to define data structures, vocabularies, and protocols for data managers to allow global integration and
exchange of data.
● What standards and best practices exist that can be adopted or adapted?
● Which modern protocols do we need to align with?
● What are the best mechanisms to develop, adopt, and govern such data standards?
● How do we advance the necessary culture change in geochemistry toward open data
sharing?
Geochemistry Is Lagging Behind
