Materials Project computation and
database infrastructure
Anubhav Jain
Energy Technologies Area
Lawrence Berkeley National Laboratory
Berkeley, CA
Presentation given to Delaware Energy Institute, 2018
Slides (already) posted to https://hackingmaterials.lbl.gov
Outline
2
① Introduction to the Materials Project
② Materials Project computation infrastructure
③ Database considerations
The Materials Project database
• Online resource of density
functional theory simulation data
for ~85,000 inorganic materials
• Includes band structures, elastic
tensors, piezoelectric tensors,
battery properties and more
• >60,000 registered users
• Free
• www.materialsproject.org
3
Jain et al. Commentary: The Materials Project: A
materials genome approach to accelerating
materials innovation. APL Mater. 1, 11002 (2013).
4
Many data sets are available!
M. De Jong et al. Sci. Data, 2015, 2, 150009.
5
As well as “apps” for exploring the data
Outline
6
① Introduction to the Materials Project
② Materials Project computation infrastructure
③ Database considerations
A “black-box” view of performing a calculation
7
“something”
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Unfortunately, the inside of the “black box”
is usually tedious and “low-level”
8
lots of tedious,
low-level work…
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Input file flags
SLURM format
how to fix ZPOTRF?
❑ set up the structure coordinates
❑ write input files, double-check all the flags
❑ copy to supercomputer
❑ submit job to queue
❑ deal with supercomputer headaches
❑ monitor job
❑ fix error jobs, resubmit to queue, wait again
❑ repeat process for subsequent calculations in workflow
❑ parse output files to obtain results
❑ copy and organize results, e.g., into Excel
What would be a better way?
9
“something”
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
What would be a better way?
10
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Workflows to run
❑ band structure
❑ surface energies
✓ elastic tensor
❑ Raman spectrum
❑ QH thermal expansion
Ideally the method should scale to millions of calculations
11
Results!
researcher
Start with all binary
oxides, replace O->S,
run several different
properties
Workflows to run
✓ band structure
✓ surface energies
✓ elastic tensor
❑ Raman spectrum
❑ QH thermal expansion
❑ spin-orbit coupling
Atomate tries to make it easy, automatic, and flexible to
generate data with existing simulation packages
12
Results!
researcher
Run many different
properties of many
different materials!
Atomate contains a library of simulation procedures
13
VASP-based
• band structure
• spin-orbit coupling
• hybrid functional
calcs
• elastic tensor
• piezoelectric tensor
• Raman spectra
• NEB
• GIBBS method
• QH thermal
expansion
• AIMD
• ferroelectric
• surface adsorption
• work functions
Other
• BoltzTraP
• FEFF method
• LAMMPS MD
Mathew, K. et al., Atomate: A high-level interface to generate, execute, and analyze
computational materials science workflows, Comput. Mater. Sci. 139 (2017) 140–152.
Each simulation procedure translates high-level instructions
into a series of low-level tasks
14
quickly and automatically translate PI-style (minimal)
specifications into well-defined FireWorks workflows
What is the
GGA-PBE elastic
tensor of GaAs?
M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, et al.,
Charting the complete elastic properties of inorganic crystalline compounds,
Sci. Data. 2 (2015).
Atomate thus encodes and standardizes knowledge about
running various kinds of simulations from domain experts
15
K. Mathew J. Montoya S. Dwaraknath A. Faghaninia
All past and present knowledge, from everyone in the group,
everyone previously in the group, and our collaborators,
about how to run calculations
M. Aykol
S.P. Ong
B. Bocklund T. Smidt
H. Tang I.H. Chu M. Horton J. Dagdelen B. Wood
Z.K. Liu J. Neaton K. Persson A. Jain
+
16
Full operation diagram
job 1
job 2
job 3 job 4
structure workflow database of
all workflows
automatically submit + execute
output files + database
17
Full operation diagram
job 1
job 2
job 3 job 4
structure workflow database of
all workflows
automatically submit + execute
output files + database
• Pymatgen can retrieve crystal
structures from the Materials
Project database (MPRester class)
• It can also manipulate crystal
structures
– substitutions
– supercell creation
– order-disorder (shown at right)
– interstitial finding
– surface / slab generation
• A visual interface to many of these
tools is available in Materials Project’s
“Crystal Toolkit” app
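The substitution and supercell operations listed above can be illustrated with a dependency-free sketch. This is not pymatgen's API: the `Site` type and the `substitute` and `make_supercell` helpers are hypothetical names invented here, the cell is assumed to be orthogonal, and the coordinates are toy values. A real workflow would use pymatgen's own structure classes instead.

```python
from itertools import product
from typing import NamedTuple


class Site(NamedTuple):
    """One atom: element symbol plus fractional coordinates (hypothetical type)."""
    species: str
    frac: tuple


def substitute(sites, mapping):
    """Return new sites with species replaced according to mapping, e.g. {"O": "S"}."""
    return [Site(mapping.get(s.species, s.species), s.frac) for s in sites]


def make_supercell(lengths, sites, scale):
    """Repeat the cell scale[i] times along axis i.

    lengths: edge lengths (a, b, c) of an orthogonal cell, in angstroms.
    Returns (new_lengths, new_sites) with fractional coordinates rescaled
    so that they refer to the enlarged cell.
    """
    new_lengths = tuple(length * n for length, n in zip(lengths, scale))
    new_sites = []
    for shift in product(*(range(n) for n in scale)):
        for s in sites:
            frac = tuple((f + k) / n for f, k, n in zip(s.frac, shift, scale))
            new_sites.append(Site(s.species, frac))
    return new_lengths, new_sites


# Toy two-atom cubic cell (illustrative numbers, not a real crystal structure).
lengths = (4.0, 4.0, 4.0)
oxide = [Site("Mg", (0.0, 0.0, 0.0)), Site("O", (0.5, 0.5, 0.5))]

sulfide = substitute(oxide, {"O": "S"})                      # replace O -> S
big_lengths, big_sites = make_supercell(lengths, sulfide, (2, 2, 2))
```

The 2 x 2 x 2 supercell holds eight copies of the two-atom cell, so `big_sites` contains 16 sites.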
18
Crystal structure generation via pymatgen
Example: Order-disorder
resolve partial or mixed
occupancies into a fully
ordered crystal structure
(e.g., mixed oxide-fluoride site
into separate oxygen/fluorine)
19
Full operation diagram
job 1
job 2
job 3 job 4
structure workflow database of
all workflows
automatically submit + execute
output files + database
20
Atomate’s main goal – convert structures to workflows
Workflows consist of a series of jobs (“FireWorks”), each
with multiple tasks. Atomate jobs typically (i) run a
calculation and (ii) store the results in a database
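The job and task structure described above can be sketched in plain Python. This is an illustration under stated assumptions only: it does not use the FireWorks API, the `Job` class, `run_workflow`, and the two task functions are hypothetical names, and the dependencies between the four jobs are assumed for the example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Job:
    """A workflow step: an ordered list of tasks plus the jobs it depends on."""
    name: str
    tasks: list                                  # callables taking (context, database)
    parents: list = field(default_factory=list)


def run_calculation(context, database):
    """Task (i): stand-in for running a simulation code."""
    context["result"] = {"job": context["job"], "status": "completed"}


def store_result(context, database):
    """Task (ii): stand-in for inserting the result into a database."""
    database.append(context["result"])


def run_workflow(jobs, database):
    """Run every job after all of its parents, executing its tasks in order."""
    by_name = {job.name: job for job in jobs}
    graph = {job.name: set(job.parents) for job in jobs}   # node -> predecessors
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        context = {"job": name}
        for task in by_name[name].tasks:
            task(context, database)
    return order


tasks = [run_calculation, store_result]
workflow = [
    Job("job 1", tasks),
    Job("job 2", tasks, parents=["job 1"]),
    Job("job 3", tasks, parents=["job 2"]),      # assumed dependency
    Job("job 4", tasks, parents=["job 2"]),      # assumed dependency
]
database = []
order = run_workflow(workflow, database)
```

After the run, `database` holds one result document per job, and `order` lists the jobs with every parent ahead of its children.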
21
Full operation diagram
job 1
job 2
job 3 job 4
structure workflow database of
all workflows
automatically submit + execute
output files + database
FireWorks allows you to write your workflow once and
execute (almost) anywhere
22
• Execute workflows
locally or at a
supercomputing
center
• Queue systems
supported
– PBS
– SGE
– SLURM
– IBM LoadLeveler
– NEWT (a REST-based
API at NERSC)
– Cobalt (Argonne LCF)
Dashboard with status of all jobs
23
• Job provenance and automatic metadata storage
• Detect and rerun failures
• “Dynamic” workflows that change behavior based on
results
• Customize job priorities
• Much more…
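Two of the features above, rerunning failures and "dynamic" workflows, can be illustrated without FireWorks itself. Everything in this sketch is hypothetical: `run_with_retries`, `flaky_calculation`, and `next_jobs` are invented names, the failure is simulated, and the follow-up rule is an arbitrary example rather than a rule taken from atomate.

```python
def run_with_retries(job, max_attempts=3):
    """Run job(attempt) until it succeeds, recording every attempt as metadata."""
    history = []
    for attempt in range(1, max_attempts + 1):
        try:
            result = job(attempt)
        except RuntimeError as err:
            history.append({"attempt": attempt, "state": "FAILED", "error": str(err)})
            continue
        history.append({"attempt": attempt, "state": "COMPLETED"})
        return result, history
    raise RuntimeError(f"job failed after {max_attempts} attempts")


def flaky_calculation(attempt):
    """Simulated calculation that fails on the first attempt only."""
    if attempt == 1:
        raise RuntimeError("simulated convergence error")
    return {"band_gap": 0.0}


def next_jobs(result):
    """Dynamic step: pick follow-up jobs from the previous result (arbitrary rule)."""
    if result["band_gap"] > 0:
        return ["hybrid functional calculation"]
    return []


result, history = run_with_retries(flaky_calculation)
followups = next_jobs(result)
```

Here `history` records one failed and one completed attempt, and `followups` is empty because the simulated band gap is zero.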
24
Other features
25
Full operation diagram
job 1
job 2
job 3 job 4
structure workflow database of
all workflows
automatically submit + execute
output files + database
Atomate – builders
framework
26
“Builders” start with base
collections in a database and
create higher-level collections
that summarize information or
add metadata
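A builder of this kind can be sketched as a function that reads a base collection and writes a summary collection. The sketch below makes several assumptions: `build_materials` is a hypothetical name, the collections are plain Python lists, all numbers are made up, and grouping by formula and keeping the lowest-energy task is a simplified rule chosen for illustration.

```python
def build_materials(tasks):
    """Group task documents by formula and summarize each group in one document."""
    best = {}
    for task in tasks:
        formula = task["formula"]
        current = best.get(formula)
        if current is None or task["energy_per_atom"] < current["energy_per_atom"]:
            best[formula] = task
    return [
        {
            "formula": formula,
            "energy_per_atom": task["energy_per_atom"],
            "band_gap": task["band_gap"],
            # Added metadata: which base documents contributed to this summary.
            "task_ids": sorted(t["task_id"] for t in tasks if t["formula"] == formula),
        }
        for formula, task in sorted(best.items())
    ]


# Base collection: one document per calculation (all values are made up).
tasks = [
    {"task_id": 1, "formula": "GaAs", "energy_per_atom": -4.10, "band_gap": 0.2},
    {"task_id": 2, "formula": "GaAs", "energy_per_atom": -4.15, "band_gap": 0.3},
    {"task_id": 3, "formula": "GaP", "energy_per_atom": -4.50, "band_gap": 1.6},
]
materials = build_materials(tasks)
```

The three task documents collapse into two material documents, each carrying the IDs of the tasks it was built from.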
27
The atomate database makes it easy to perform various
analyses with pymatgen
atomate output
database(s)
phase
diagrams
Pourbaix
diagrams
diffusivity via MD
band structure analysis
28
Many research groups have run tens of thousands of
materials science workflows with atomate
also used by:
• Persson research group, UC Berkeley
• Ong research group, UC San Diego
• Neaton research group, UC Berkeley
• Liu research group, Penn State
• Groups not developing on atomate!
• e.g., see “Thermal expansion of quaternary nitride coatings” by
Tasnadi et al.
atomate now powers the Materials
Project and will be used to run
hundreds of thousands of
simulations in the next year
(www.materialsproject.org)
Outline
29
① Introduction to the Materials Project
② Materials Project computation infrastructure
③ Database considerations
30
About a decade ago, we were using a SQL infrastructure
Main problems we ran into:
• Too static – every time we wanted
to store a new kind of data, the DB
master needed to “design and
update” the database schema
• Too difficult for newcomers –
constructing queries (joins, etc.).
We actually designed a system to
help people make queries, which is
common
31
Since then, we have switched to MongoDB –
a “noSQL” database
Major advantages
• Very dynamic – easy to add
new data types without
interfering with old data
types or redesigning
everything. No central
“database master” needed
• Easy for newcomers – easy
syntax, no complex “joins”,
easy to visualize results
• Easy object-relational
mapping – built our
pymatgen code so that any
objects (e.g., band
structures, crystal
structures, etc.) could be
exported to a database or
imported from a database
easily
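The export and import idea in the last bullet can be sketched as a round trip between an object and a plain dictionary. The `BandGapRecord` class and its values are hypothetical and much simpler than a real band structure or crystal structure object. The `as_dict` / `from_dict` method names follow the convention pymatgen objects use, but the implementation here is only an illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class BandGapRecord:
    """Hypothetical stand-in for a richer object such as a band structure."""
    formula: str
    band_gap: float
    is_direct: bool

    def as_dict(self):
        """Serialize to plain types that a document database can store."""
        return {
            "@class": type(self).__name__,
            "formula": self.formula,
            "band_gap": self.band_gap,
            "is_direct": self.is_direct,
        }

    @classmethod
    def from_dict(cls, d):
        """Rebuild the object from a stored document."""
        return cls(d["formula"], d["band_gap"], d["is_direct"])


record = BandGapRecord("GaAs", 0.19, True)             # illustrative values
document = json.loads(json.dumps(record.as_dict()))    # round trip through JSON
restored = BandGapRecord.from_dict(document)
```

Because the document contains only strings, numbers, and booleans, it can be stored as-is in a document database and turned back into an equal object later.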
32
How we store computed data
Data is stored in “collections”. Each collection is a set of documents that can be queried.
Each document
consists of nested key-
value pairs
(“dictionaries”) or
arrays.
e.g. one can search for:
{“tags”: “phosphides”}
to retrieve all
documents tagged
with “phosphides”
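The tag query above relies on a MongoDB matching rule: a query value matches a document field either when the two are equal or when the field is an array that contains the value. A minimal in-memory sketch of that rule follows. The `matches` and `find` helpers and the example documents are hypothetical and cover only top-level equality queries, not MongoDB's full query language.

```python
def matches(document, query):
    """Return True if every query key matches the document.

    A value matches when it equals the stored value, or when the stored
    value is a list that contains it.
    """
    for key, wanted in query.items():
        value = document.get(key)
        if value == wanted:
            continue
        if isinstance(value, list) and wanted in value:
            continue
        return False
    return True


def find(collection, query):
    """Return all documents in an in-memory collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


# Made-up documents for illustration.
collection = [
    {"formula": "GaP", "tags": ["phosphides", "semiconductors"]},
    {"formula": "GaAs", "tags": ["arsenides", "semiconductors"]},
    {"formula": "InP", "tags": ["phosphides"]},
]
hits = find(collection, {"tags": "phosphides"})
```

With a real MongoDB collection accessed through pymongo, the same query document would be passed to the collection's `find` method.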
33
Each collection has a set of standard keys
Data is stored in “collections”. Each collection is a set of documents that can be queried.
materials collection – each
document represents a
material, with keys like
“formula” and “band_gap”
tasks collection – each
document represents a
DFT calculation, with keys
like “dir_name” and
“input.parameters”
workflows collection – each
document represents a
calculation workflow, with
keys like “nodes” and
“links”
Typically, each document within a collection will be of a uniform
format, but this is not a hard requirement in MongoDB.
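Keys such as “input.parameters” use dot notation to reach into nested dictionaries. The sketch below shows one possible shape for a document in each collection and a small helper that resolves a dotted key. Only the key names quoted above come from the slides. The `get_path` helper, the nested contents, and all values are assumptions made for illustration.

```python
def get_path(document, dotted_key, default=None):
    """Follow a dotted key such as "input.parameters" through nested dictionaries."""
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# Illustrative documents, one per collection.
material = {"formula": "GaAs", "band_gap": 0.19}
task = {
    "dir_name": "/path/to/calculation",
    "input": {"parameters": {"ENCUT": 520, "ISMEAR": 0}},
}
workflow = {"nodes": [1, 2, 3], "links": {"1": [2], "2": [3]}}

encut = get_path(task, "input.parameters.ENCUT")
missing = get_path(material, "input.parameters", default={})
```

Because documents in a collection need not share one format, the helper returns a default instead of raising an error when a key is absent.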
1. As described previously: for each data type (a
“material”, “task”, “workflow”, etc.) decide on a
set of fields that describe each instance of that
data type. In MongoDB, these fields can easily
be changed or added to later if needed.
2. Try to create a single collection and document
format that can handle any kind of materials
data!
– example 1: “PIF” file format from Citrine[1]
– example 2: MPContribs from Materials Project[2]
34
Two approaches to store data in MongoDB
[1] J. O’Mara, B. Meredig, K. Michel, Materials Data
Infrastructure: A Case Study of the Citrination Platform to
Examine Data Import, Storage, and Access, JOM (2016).
[2] P. Huck, D. Gunter, S. Cholia, D. Winston, A.T. N’Diaye, K. Persson, User
applications driven by the community contribution framework MPContribs
in the Materials Project, Concurr. Comput. Pract. Exp. 22 (2015)
35
MPContribs and MPFile – storing / querying any kind of
materials data into Materials Project
36
MPContribs portal (currently in beta testing,
access provided by request)
Funding: DOE-BES Materials Science Division, Computing: NERSC
37
Who to talk to next!
The current “Guardians of the MP infrastructure”
Slides (already) posted to https://hackingmaterials.lbl.gov

More Related Content

Similar to Materials Project computation and database infrastructure (20)

Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
Anubhav Jain
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
Anubhav Jain
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data sets
Anubhav Jain
 
Atomate: a tool for rapid high-throughput computing and materials discovery
Atomate: a tool for rapid high-throughput computing and materials discoveryAtomate: a tool for rapid high-throughput computing and materials discovery
Atomate: a tool for rapid high-throughput computing and materials discovery
Anubhav Jain
 
Software tools for calculating materials properties in high-throughput (pymat...
Software tools for calculating materials properties in high-throughput (pymat...Software tools for calculating materials properties in high-throughput (pymat...
Software tools for calculating materials properties in high-throughput (pymat...
Anubhav Jain
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...
Anubhav Jain
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomate
Anubhav Jain
 
Overview of accelerated materials design efforts in the Hacking Materials res...
Overview of accelerated materials design efforts in the Hacking Materials res...Overview of accelerated materials design efforts in the Hacking Materials res...
Overview of accelerated materials design efforts in the Hacking Materials res...
Anubhav Jain
 
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
PyData
 
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Anubhav Jain
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
University of California, San Diego
 
MongoDB San Francisco 2013: MongoDB for Collaborative Science presented by D...
MongoDB San Francisco 2013:  MongoDB for Collaborative Science presented by D...MongoDB San Francisco 2013:  MongoDB for Collaborative Science presented by D...
MongoDB San Francisco 2013: MongoDB for Collaborative Science presented by D...
MongoDB
 
The Materials Project: Experiences from running a million computational scien...
The Materials Project: Experiences from running a million computational scien...The Materials Project: Experiences from running a million computational scien...
The Materials Project: Experiences from running a million computational scien...
Anubhav Jain
 
Using MongoDB for Materials Discovery
Using MongoDB for Materials DiscoveryUsing MongoDB for Materials Discovery
Using MongoDB for Materials Discovery
Dan Gunter
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Anubhav Jain
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
aimsnist
 
NANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials designNANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials design
University of California, San Diego
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science research
Anubhav Jain
 
The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...
University of California, San Diego
 
ICME Workshop Jul 2014 - The Materials Project
ICME Workshop Jul 2014 - The Materials ProjectICME Workshop Jul 2014 - The Materials Project
ICME Workshop Jul 2014 - The Materials Project
University of California, San Diego
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
Anubhav Jain
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
Anubhav Jain
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data sets
Anubhav Jain
 
Atomate: a tool for rapid high-throughput computing and materials discovery
Atomate: a tool for rapid high-throughput computing and materials discoveryAtomate: a tool for rapid high-throughput computing and materials discovery
Atomate: a tool for rapid high-throughput computing and materials discovery
Anubhav Jain
 
Software tools for calculating materials properties in high-throughput (pymat...
Software tools for calculating materials properties in high-throughput (pymat...Software tools for calculating materials properties in high-throughput (pymat...
Software tools for calculating materials properties in high-throughput (pymat...
Anubhav Jain
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...
Anubhav Jain
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomate
Anubhav Jain
 
Overview of accelerated materials design efforts in the Hacking Materials res...
Overview of accelerated materials design efforts in the Hacking Materials res...Overview of accelerated materials design efforts in the Hacking Materials res...
Overview of accelerated materials design efforts in the Hacking Materials res...
Anubhav Jain
 
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
How Web APIs and Data Centric Tools Power the Materials Project (PyData SV 2013)
PyData
 
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Anubhav Jain
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
University of California, San Diego
 
MongoDB San Francisco 2013: MongoDB for Collaborative Science presented by D...
MongoDB San Francisco 2013:  MongoDB for Collaborative Science presented by D...MongoDB San Francisco 2013:  MongoDB for Collaborative Science presented by D...
MongoDB San Francisco 2013: MongoDB for Collaborative Science presented by D...
MongoDB
 
The Materials Project: Experiences from running a million computational scien...
The Materials Project: Experiences from running a million computational scien...The Materials Project: Experiences from running a million computational scien...
The Materials Project: Experiences from running a million computational scien...
Anubhav Jain
 
Using MongoDB for Materials Discovery
Using MongoDB for Materials DiscoveryUsing MongoDB for Materials Discovery
Using MongoDB for Materials Discovery
Dan Gunter
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Anubhav Jain
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
aimsnist
 
NANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials designNANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials design
University of California, San Diego
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science research
Anubhav Jain
 
The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...
University of California, San Diego
 

More from Anubhav Jain (20)

A Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
A Career at a U.S. National Lab: Perspective from a Mid-Career ScientistA Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
A Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
Anubhav Jain
 
Research opportunities in materials design using AI/ML
Research opportunities in materials design using AI/MLResearch opportunities in materials design using AI/ML
Research opportunities in materials design using AI/ML
Anubhav Jain
 
Accelerating materials discovery with big data and machine learning
Accelerating materials discovery with big data and machine learningAccelerating materials discovery with big data and machine learning
Accelerating materials discovery with big data and machine learning
Anubhav Jain
 
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Anubhav Jain
 
Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...
Anubhav Jain
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and Design
Anubhav Jain
 
An AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAn AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesis
Anubhav Jain
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software dissemination
Anubhav Jain
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software dissemination
Anubhav Jain
 
Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...
Anubhav Jain
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...
Anubhav Jain
 
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Anubhav Jain
 
Machine Learning for Catalyst Design
Machine Learning for Catalyst DesignMachine Learning for Catalyst Design
Machine Learning for Catalyst Design
Anubhav Jain
 
Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...
Anubhav Jain
 
Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...
Anubhav Jain
 
Accelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAccelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine Learning
Anubhav Jain
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …
Anubhav Jain
 
The Materials Project
The Materials ProjectThe Materials Project
The Materials Project
Anubhav Jain
 
Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...
Anubhav Jain
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...
Anubhav Jain
 
A Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
A Career at a U.S. National Lab: Perspective from a Mid-Career ScientistA Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
A Career at a U.S. National Lab: Perspective from a Mid-Career Scientist
Anubhav Jain
 
Research opportunities in materials design using AI/ML
Research opportunities in materials design using AI/MLResearch opportunities in materials design using AI/ML
Research opportunities in materials design using AI/ML
Anubhav Jain
 
Accelerating materials discovery with big data and machine learning
Accelerating materials discovery with big data and machine learningAccelerating materials discovery with big data and machine learning
Accelerating materials discovery with big data and machine learning
Anubhav Jain
 
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Predicting the Synthesizability of Inorganic Materials: Convex Hulls, Literat...
Anubhav Jain
 
Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...
Anubhav Jain
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and Design
Anubhav Jain
 
An AI-driven closed-loop facility for materials synthesis
Materials Project computation and database infrastructure

  • 1. Materials Project computation and database infrastructure. Anubhav Jain, Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA. Presentation given to Delaware Energy Institute, 2018. Slides (already) posted to https://hackingmaterials.lbl.gov
  • 2. Outline ① Introduction to the Materials Project ② Materials Project computation infrastructure ③ Database considerations
  • 3. The Materials Project database • Online resource of density functional theory simulation data for ~85,000 inorganic materials • Includes band structures, elastic tensors, piezoelectric tensors, battery properties and more • >60,000 registered users • Free • www.materialsproject.org. Reference: Jain et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 1, 11002 (2013).
  • 4. Many data sets are available! M. De Jong et al. Sci. Data, 2015, 2, 150009.
  • 5. As well as “apps” for exploring the data
  • 6. Outline ① Introduction to the Materials Project ② Materials Project computation infrastructure ③ Database considerations
  • 7. A “black-box” view of performing a calculation. Researcher: “What is the GGA-PBE elastic tensor of GaAs?” → “something” → Results!
  • 8. Unfortunately, the inside of the “black box” is usually tedious and “low-level”. Researcher: “What is the GGA-PBE elastic tensor of GaAs?” → lots of tedious, low-level work (input file flags, SLURM format, how to fix ZPOTRF?) → Results! ☐ set up the structure coordinates ☐ write input files, double-check all the flags ☐ copy to supercomputer ☐ submit job to queue ☐ deal with supercomputer headaches ☐ monitor job ☐ fix error jobs, resubmit to queue, wait again ☐ repeat process for subsequent calculations in workflow ☐ parse output files to obtain results ☐ copy and organize results, e.g., into Excel
  • 9. What would be a better way? Researcher: “What is the GGA-PBE elastic tensor of GaAs?” → “something” → Results!
  • 10. What would be a better way? Researcher: “What is the GGA-PBE elastic tensor of GaAs?” → Results! Workflows to run: ☐ band structure ☐ surface energies ☑ elastic tensor ☐ Raman spectrum ☐ QH thermal expansion
  • 11. Ideally the method should scale to millions of calculations. Researcher: “Start with all binary oxides, replace O->S, run several different properties” → Results! Workflows to run: ☑ band structure ☑ surface energies ☑ elastic tensor ☐ Raman spectrum ☐ QH thermal expansion ☐ spin-orbit coupling
  • 12. Atomate tries to make it easy, automatic, and flexible to generate data with existing simulation packages. Researcher: “Run many different properties of many different materials!” → Results!
  • 13. Atomate contains a library of simulation procedures. VASP-based: • band structure • spin-orbit coupling • hybrid functional calcs • elastic tensor • piezoelectric tensor • Raman spectra • NEB • GIBBS method • QH thermal expansion • AIMD • ferroelectric • surface adsorption • work functions. Other: • BoltzTraP • FEFF method • LAMMPS MD. Reference: Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows, Comput. Mater. Sci. 139 (2017) 140–152.
  • 14. Each simulation procedure translates high-level instructions into a series of low-level tasks: quickly and automatically translate PI-style (minimal) specifications, such as “What is the GGA-PBE elastic tensor of GaAs?”, into well-defined FireWorks workflows. Reference: M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, et al., Charting the complete elastic properties of inorganic crystalline compounds, Sci. Data. 2 (2015).
  • 15. Atomate thus encodes and standardizes knowledge about running various kinds of simulations from domain experts: all past and present knowledge, from everyone in the group, everyone previously in the group, and our collaborators, about how to run calculations. Contributors: K. Mathew, J. Montoya, S. Dwaraknath, A. Faghaninia, M. Aykol, S.P. Ong, B. Bocklund, T. Smidt, H. Tang, I.H. Chu, M. Horton, J. Dagdelen, B. Wood, Z.K. Liu, J. Neaton, K. Persson, A. Jain, and others
  • 16. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database
  • 17. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database
  • 18. Crystal structure generation via pymatgen • Pymatgen can retrieve crystal structures from the Materials Project database (MPRester class) • It can also manipulate crystal structures: substitutions, supercell creation, order-disorder (see example), interstitial finding, surface / slab generation • A visual interface to many of the tools is in Materials Project’s “Crystal Toolkit” app. Example (order-disorder): resolve partial or mixed occupancies into a fully ordered crystal structure (e.g., mixed oxide-fluoride site into separate oxygen/fluorine)
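To make two of the manipulations named on this slide concrete (substitution and supercell creation), here is a minimal standard-library sketch. It is not pymatgen code: the `Site` and `Crystal` classes and both functions are hypothetical stand-ins written for illustration, cell angles are ignored, and the two-site cell is a toy example rather than a real structure.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Site:
    species: str   # element symbol, e.g. "O"
    frac: tuple    # fractional coordinates (x, y, z) in [0, 1)


@dataclass(frozen=True)
class Crystal:
    lengths: tuple  # cell edge lengths a, b, c (angles omitted for brevity)
    sites: tuple    # tuple of Site objects


def substitute(crystal, old, new):
    """Return a copy in which every `old` species is replaced by `new`."""
    sites = tuple(
        Site(new if s.species == old else s.species, s.frac) for s in crystal.sites
    )
    return Crystal(crystal.lengths, sites)


def make_supercell(crystal, reps):
    """Repeat the cell reps = (na, nb, nc) times along a, b, c."""
    lengths = tuple(length * n for length, n in zip(crystal.lengths, reps))
    sites = []
    for shift in product(*(range(n) for n in reps)):
        for s in crystal.sites:
            # Shift the site into the chosen image cell, then rescale
            # so coordinates are fractional with respect to the larger cell.
            frac = tuple((f + i) / n for f, i, n in zip(s.frac, shift, reps))
            sites.append(Site(s.species, frac))
    return Crystal(lengths, tuple(sites))


# Toy two-site oxide cell (illustrative values only).
oxide = Crystal((4.2, 4.2, 4.2), (Site("Mg", (0.0, 0.0, 0.0)), Site("O", (0.5, 0.5, 0.5))))
sulfide = substitute(oxide, "O", "S")           # the O -> S replacement from slide 11
supercell = make_supercell(sulfide, (2, 2, 1))  # 2 x 2 x 1 repetition
```

Both functions return new objects instead of mutating their input, so the original oxide cell is still available after the substitution.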
  • 19. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database
  • 20. Atomate’s main goal: convert structures to workflows. Workflows consist of a series of jobs (“FireWorks”), each with multiple tasks. Atomate jobs typically (i) run a calculation and (ii) store the results in a database.
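The job/task structure described on this slide can be sketched with the standard library alone. This is not the FireWorks or atomate API: `run_workflow`, `run`, and `store` are hypothetical names, the shared dictionary stands in for the results database, and the dependency layout (job 1 feeding jobs 2 and 3, which feed job 4) is an assumption made for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_workflow(jobs, parents):
    """Run every job after all of its parents; a job is an ordered list of tasks.

    jobs:    {job_name: [task, ...]}, each task a callable taking the shared database
    parents: {job_name: {parent_job_name, ...}}
    """
    database = {}  # stands in for the results database
    order = []
    graph = {name: parents.get(name, set()) for name in jobs}
    for name in TopologicalSorter(graph).static_order():
        for task in jobs[name]:  # tasks within one job run sequentially
            task(database)
        order.append(name)
    return order, database


def run(label):
    """Hypothetical 'run a calculation' task."""
    def task(db):
        db.setdefault("ran", []).append(label)
    return task


def store(label):
    """Hypothetical 'store the results' task."""
    def task(db):
        db.setdefault("stored", []).append(label)
    return task


# Each job follows the pattern on the slide: (i) run, then (ii) store.
jobs = {name: [run(name), store(name)] for name in ("job 1", "job 2", "job 3", "job 4")}
parents = {"job 2": {"job 1"}, "job 3": {"job 1"}, "job 4": {"job 2", "job 3"}}
order, database = run_workflow(jobs, parents)
```

Keeping the dependency graph separate from the task lists means the same jobs can be rewired into a different workflow without touching the tasks themselves.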
  • 21. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database
  • 22. FireWorks allows you to write your workflow once and execute (almost) anywhere • Execute workflows locally or at a supercomputing center • Queue systems supported: PBS, SGE, SLURM, IBM LoadLeveler, NEWT (a REST-based API at NERSC), Cobalt (Argonne LCF)
  • 23. Dashboard with status of all jobs
  • 24. Other features • Job provenance and automatic metadata storage • Detect and rerun failures • “Dynamic” workflows that change behavior based on results • Customize job priorities • Much more…
  • 25. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database
  • 26. Atomate builders framework: “Builders” start with base collections in a database and create higher-level collections that summarize information or add metadata
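The builder idea on this slide (derive a higher-level collection from a base collection) can be illustrated with plain dictionaries. This is a sketch under stated assumptions, not atomate's builder code: `build_materials` is a hypothetical function, the collections are in-memory lists rather than MongoDB collections, and all field values are placeholders rather than real calculation results.

```python
def build_materials(tasks):
    """Summarize a base 'tasks' collection into one document per formula."""
    materials = {}
    for task in tasks:
        doc = materials.setdefault(
            task["formula"], {"formula": task["formula"], "task_ids": []}
        )
        doc["task_ids"].append(task["task_id"])  # record provenance
        if "band_gap" in task:
            doc["band_gap"] = task["band_gap"]   # lift a property to the summary
    return list(materials.values())


# Placeholder task documents (values are made up for illustration).
tasks = [
    {"task_id": 1, "formula": "GaAs", "dir_name": "/calc/a"},
    {"task_id": 2, "formula": "GaAs", "dir_name": "/calc/b", "band_gap": 0.5},
    {"task_id": 3, "formula": "GaP", "dir_name": "/calc/c"},
]
materials = build_materials(tasks)
```

Because the summary is computed entirely from the base collection, it can be discarded and rebuilt whenever the summarizing logic changes.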
  • 27. The atomate database makes it easy to perform various analyses with pymatgen: atomate output database(s) → phase diagrams, Pourbaix diagrams, diffusivity via MD, band structure analysis
  • 28. Many research groups have run tens of thousands of materials science workflows with atomate. Used by: • Persson research group, UC Berkeley • Ong research group, UC San Diego • Neaton research group, UC Berkeley • Liu research group, Penn State • Groups not developing on atomate! (e.g., see “Thermal expansion of quaternary nitride coatings” by Tasnadi et al.) atomate now powers the Materials Project and will be used to run hundreds of thousands of simulations in the next year (www.materialsproject.org)
  • 29. Outline ① Introduction to the Materials Project ② Materials Project computation infrastructure ③ Database considerations
  • 30. About a decade ago, we were using a SQL infrastructure. Main problems we ran into: • Too static: every time we wanted to store a new kind of data, the DB master needed to “design and update” the database schema • Too difficult for newcomers: constructing queries (joins, etc.). We actually designed a system to help people make queries, which is common
  • 31. Since then, we have switched to MongoDB, a “noSQL” database. Major advantages: • Very dynamic: easy to add new data types without interfering with old data types or redesigning everything. No central “database master” needed • Easy for newcomers: easy syntax, no complex “joins”, easy to visualize results • Easy object-relational mapping: built our pymatgen code so that any objects (e.g., band structures, crystal structures, etc.) could be exported to a database or imported from a database easily
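The export/import point above can be sketched as a dictionary round trip. pymatgen objects expose `as_dict()` and `from_dict()` methods for this purpose; the `ToyStructure` class below is a hypothetical stand-in that only mimics that pattern, and a JSON round trip stands in for writing to and reading from a document database.

```python
import json


class ToyStructure:
    """Hypothetical object that can be exported to and rebuilt from a plain dict."""

    def __init__(self, formula, lattice_abc):
        self.formula = formula
        self.lattice_abc = tuple(lattice_abc)

    def as_dict(self):
        # Only JSON-friendly types (str, list, float) so the document can be stored.
        return {
            "@class": type(self).__name__,
            "formula": self.formula,
            "lattice_abc": list(self.lattice_abc),
        }

    @classmethod
    def from_dict(cls, d):
        return cls(d["formula"], d["lattice_abc"])


original = ToyStructure("GaAs", (5.65, 5.65, 5.65))  # approximate cubic cell edge
stored = json.dumps(original.as_dict())              # "export to a database"
restored = ToyStructure.from_dict(json.loads(stored))  # "import from a database"
```

Storing the class name alongside the data is one simple way to know which `from_dict` to call when a document is read back.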
  • 32. How we store computed data. Data is stored in “collections”. Each collection is a set of documents that can be queried. Each document consists of nested key-value pairs (“dictionaries”) or arrays. For example, one can search for {“tags”: “phosphides”} to retrieve all documents tagged with “phosphides”
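The query on this slide can be mimicked in a few lines of plain Python. This is not pymongo and covers only a small subset of MongoDB matching (equality, membership in an array field, and dotted keys for nested documents); `get_path`, `matches`, and `find` are hypothetical helpers, and the documents and the `ENCUT` value are placeholders.

```python
def get_path(doc, dotted_key):
    """Follow a dotted key such as 'input.parameters.ENCUT' into nested dicts."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None, False
        current = current[part]
    return current, True


def matches(doc, query):
    """True if every key in the query equals the field or is contained in an array field."""
    for key, wanted in query.items():
        value, found = get_path(doc, key)
        if not found:
            return False
        if isinstance(value, list) and not isinstance(wanted, list):
            if wanted not in value:  # array field: match on membership
                return False
        elif value != wanted:        # scalar field: match on equality
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


# Placeholder documents (not real database entries).
collection = [
    {"formula": "GaP", "tags": ["phosphides", "semiconductors"]},
    {"formula": "GaAs", "tags": ["arsenides"]},
    {"formula": "InP", "tags": "phosphides", "input": {"parameters": {"ENCUT": 520}}},
]
phosphides = find(collection, {"tags": "phosphides"})
by_parameter = find(collection, {"input.parameters.ENCUT": 520})
```

The same query dictionary works whether `tags` holds a single string or an array, which is part of what makes this style of query approachable for newcomers.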
  • 33. Each collection has a set of standard keys • materials collection: each document represents a material, with keys like “formula” and “band_gap” • tasks collection: each document represents a DFT calculation, with keys like “dir_name” and “input.parameters” • workflows collection: each document represents a calculation workflow, with keys like “nodes” and “links”. Typically, each document within a collection will be of a uniform format, but this is not a hard requirement in MongoDB.
  • 34. Two approaches to store data in MongoDB: 1. As described previously: for each data type (a “material”, “task”, “workflow”, etc.) decide on a set of fields that describe each instance of that data type. In MongoDB, these fields can easily be changed or added to later if needed. 2. Try to create a single collection and document format that can handle any kind of materials data! Example 1: “PIF” file format from Citrine [1]. Example 2: MPContribs from Materials Project [2]. References: [1] J. O’Mara, B. Meredig, K. Michel, Materials Data Infrastructure: A Case Study of the Citrination Platform to Examine Data Import, Storage, and Access, JOM (2016). [2] P. Huck, D. Gunter, S. Cholia, D. Winston, A.T. N’Diaye, K. Persson, User applications driven by the community contribution framework MPContribs in the Materials Project, Concurr. Comput. Pract. Exp. 22 (2015).
  • 35. MPContribs and MPFile: storing / querying any kind of materials data into Materials Project
  • 36. MPContribs portal (currently in beta testing, access provided by request)
  • 37. Who to talk to next! The current “Guardians of the MP infrastructure”. Funding: DOE-BES Materials Science Division. Computing: NERSC. Slides (already) posted to https://hackingmaterials.lbl.gov