Discovering advanced materials for energy
applications by mining the scientific literature
Anubhav Jain
Energy Technologies Area
Lawrence Berkeley National Laboratory
Berkeley, CA
AFRL meeting, Jan 2020
Slides (already) posted to hackingmaterials.lbl.gov
• Often, materials are known for several decades
before their functional applications are known
– MgB2 sat on lab shelves for 50 years before its
identification as a superconductor in 2001
– LiFePO4 was known since 1938, but only identified as a Li-ion
battery cathode in 1997
• Even after discovery, optimization and
commercialization still take decades
• To get a sense for why this is so hard, let’s look at
the problem in more detail …
2
Typically, both new materials discovery and optimization
take decades
What constrains traditional approaches to materials design?
3
“[The Chevrel] discovery resulted from a lot of
unsuccessful experiments of Mg ions insertion
into well-known hosts for Li+ ions insertion, as
well as from the thorough literature analysis
concerning the possibility of divalent ions
intercalation into inorganic materials.”
-Aurbach group, on discovery of Chevrel cathode
for multivalent (e.g., Mg2+) batteries
Levi, Levi, Chasid, Aurbach
J. Electroceramics (2009)
4
Researchers are starting to fundamentally re-think how we
invent the materials that make up our devices
[Diagram: next-generation materials design, combining computer-aided materials design, natural language processing, and "self-driving laboratories"]
Outline
5
① Natural language processing - where are
we right now?
② What’s next for the NLP work?
6
Can ML help us work through our backlog of information we
need to assimilate from text sources?
[Figure: papers to read "someday" → NLP algorithms]
• It is difficult to look up all information on any given material
because chemical compositions can be written in many
different ways (see the normalization sketch after this slide)
– a search for "TiNiSn" will give different results than "NiTiSn"
– a search for "GaSb" won't match text that reads "Ga0.5Sb0.5"
– a search for "SnBi4Te7" won't match text that reads "we studied
SnBi4X7 (X = S, Se, Te)"
– a search for "AgCrSe2", if it has no hits, won't suggest
"CuCrSe2" as a similar result
• It is difficult to compile summaries, e.g.:
– A list of all materials studied for an application
– A list of all synthesis methods for a material
7
Traditional search doesn’t answer the questions we want
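Why naive string search fails here is easy to see once compositions are reduced to a canonical form. Below is a minimal sketch of that kind of normalization using pymatgen's Composition (an illustrative choice of tool, not necessarily the normalizer Matscholar uses):

```python
# Minimal sketch: canonicalizing composition strings with pymatgen
# (illustrative only; Matscholar's production normalizer is custom).
from pymatgen.core import Composition

queries = ["TiNiSn", "NiTiSn", "GaSb", "Ga0.5Sb0.5"]
for q in queries:
    comp = Composition(q)
    # reduced_formula orders elements canonically and clears common
    # factors, so "TiNiSn" and "NiTiSn" map to one key, and
    # "Ga0.5Sb0.5" reduces to "GaSb", the matches naive search misses.
    print(q, "->", comp.reduced_formula)
```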
What is matscholar?
• Matscholar is an attempt to organize the world’s
information on materials science, connecting
together topics of study, synthesis and
characterization methods, and specific materials
compositions
• It is also an effort to use state-of-the-art natural
language processing to make collective use of
the information in millions of articles
One of our main projects concerns named entity
recognition, or automatically labeling entities in text
9
10
• > 4 million papers collected
• 31 million properties
• 19 million materials mentions
• 8.8 million characterization methods
• 7.5 million applications
• 5 million synthesis methods
• Data collection: over 4 million full papers* collected from more than 2100 journals.
* Entities only extracted from abstracts deemed relevant to inorganic materials science (~2M) so far.
11
Now we can search!
Live on www.matscholar.com
12
Another example …
13
Ok so how does this work? High-level view
Weston, L. et al. Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature. J. Chem. Inf. Model. (2019).
Extracted 4 million abstracts of relevant scientific articles using various APIs from journal publishers.
Some are more difficult than others to obtain. Abstract collection continues …
14
Step 1 – data collection
• First, split the text into sentences
– Seems simple, but remember edge cases: "et al." or
"etc." does not necessarily signify the end of a sentence despite
the period
• Then split the sentences into words
– Tricky parts are detecting and normalizing chemical
formulas, selective lowercasing ("Battery" vs. "battery", or
"BaS" vs. "BAs"), homogenizing numbers, etc. (see the sketch after this slide)
• Done largely with ChemDataExtractor* plus
some custom improvements
– We may move to a fully custom tokenizer soon
16
Step 2 - tokenization
*http://chemdataextractor.org
• Part A is marking abstracts
as relevant / non-relevant
to inorganic materials
science
• Part B is tediously labeling
~600 abstracts
– Largely done by one person
– Spot-check of 25 abstracts
by a second person gave
87.4% agreement
18
Step 3 – hand label abstracts
• We use the word2vec algorithm (Google) to turn
each unique word in our corpus into a
200-dimensional vector
• These vectors encode the meaning of each word,
learned by trying to predict the context words
that surround the target word
20
Step 4a: the word2vec algorithm is used to "featurize" words
Barazza, L. How does Word2Vec's Skip-Gram work? Becominghuman.ai (2017).
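A minimal sketch of what this training step looks like in gensim; the 200-dimensional skip-gram setting matches the slide, while the other hyperparameters here are illustrative rather than the published ones:

```python
# Sketch: training skip-gram word2vec on tokenized abstracts with gensim.
from gensim.models import Word2Vec

# sentences: lists of tokens, e.g. output of the tokenizer in step 2
sentences = [
    ["the", "band", "gap", "of", "GaSb", "is", "<nUm>", "eV"],
    ["Bi2Te3", "is", "a", "well-known", "thermoelectric", "material"],
]

model = Word2Vec(
    sentences,
    vector_size=200,  # embedding dimension from the slide
    sg=1,             # skip-gram: predict context words from the target
    window=8,         # context window size (illustrative)
    min_count=1,      # keep rare tokens for this tiny demo
)
vec = model.wv["GaSb"]  # the 200-dimensional vector for "GaSb"
```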
“You shall know a word by
the company it keeps”
- John Rupert Firth (1957)
• The classic example is:
– “king” - “man” + “woman” = ? → “queen”
22
Word embeddings trained on "normal" text learn
relationships between words
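In gensim terms the analogy is a single most_similar query; a sketch, assuming a model trained on general-language text (a materials-only corpus may not contain these words):

```python
# "king" - "man" + "woman" -> ?
result = model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # [('queen', ...)] for a well-trained general-language model
```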
• If you read this sentence:
“The band gap of ___ is 4.5 eV”
It is clear that the blank should be filled in with a
material word (not a synthesis method, characterization
method, etc.)
How do we get a neural network to take context into
account (as well as properties of the word itself)?
24
Step 4b: How do we train a model to recognize context?
25
Step 4b: An LSTM neural net classifies words by reading
word sequences
Weston, L. et al. Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature. J. Chem. Inf. Model. (2019).
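To make the idea concrete, here is a minimal PyTorch sketch of a bidirectional LSTM tagger; the published model has more to it (training details, output layers), so treat the sizes and structure here as illustrative only:

```python
# Sketch of a bidirectional LSTM that assigns an entity tag to each token.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim=200, hidden_dim=64, n_tags=7):
        super().__init__()
        # In practice the embedding layer would be initialized from the
        # pre-trained word2vec vectors of step 4a.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, n_tags)  # score per entity tag

    def forward(self, token_ids):            # (batch, seq_len)
        x = self.embed(token_ids)             # (batch, seq_len, embed_dim)
        h, _ = self.lstm(x)                   # (batch, seq_len, 2*hidden_dim)
        return self.out(h)                    # per-token tag scores

# Reading the whole sequence gives the model context from both sides:
# in "The band gap of ___ is 4.5 eV", the surrounding words push the
# blank toward a material label.
tagger = BiLSTMTagger(vocab_size=100_000)
scores = tagger(torch.randint(0, 100_000, (1, 12)))  # tags for 12 tokens
```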
27
Step 5: Sit back and let the model label things for you!
Named Entity Recognition
• Custom machine learning models extract the most
valuable materials-related information.
• Utilizes a long short-term memory (LSTM) network
trained on ~1000 hand-annotated abstracts.
• F1 scores are ~0.9 overall; the F1 score for inorganic
materials extraction is >0.9.
Weston, L. et al. Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature. J. Chem. Inf. Model. (2019).
28
Live online …
29
Could these techniques also be used to predict which
materials we might want to screen for an application?
• The classic example is:
– “king” - “man” + “woman” = ? → “queen”
30
Remember that word embeddings seem to learn
relationships in text
31
For scientific text, the embeddings learn scientific concepts as well
[Figure: crystal structures of the elements]
Tshitoyan, V. et al. Unsupervised word embeddings capture latent
knowledge from materials science literature. Nature 571, 95–98 (2019).
32
There seems to be materials knowledge encoded in the
word vectors
Tshitoyan, V. et al. Unsupervised word embeddings capture latent
knowledge from materials science literature. Nature 571, 95–98 (2019).
33
Note that more data is not always better!
We want relevance
Tshitoyan, V. et al. Unsupervised word embeddings capture latent
knowledge from materials science literature. Nature 571, 95–98 (2019).
34
Word embeddings also have the periodic table encoded in them,
with no prior knowledge
[Figure: map of the elements' "word embeddings" next to the periodic table]
• Dot product of a composition word with
the word “thermoelectric” essentially
predicts how likely that word is to appear
in an abstract with the word
thermoelectric
• Compositions with high dot products are
typically known thermoelectrics
• Sometimes, compositions have a high dot
product with “thermoelectric” but have
never been studied as a thermoelectric
• These compositions usually have high
computed power factors!
(DFT+BoltzTraP)
35
Making predictions: dot products measure likelihood for
words to co-occur
Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from
materials science literature. Nature 571, 95–98 (2019).
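In code, this ranking is just cosine similarity (a normalized dot product) against the "thermoelectric" vector; a sketch assuming the word2vec model from the training step, with hypothetical candidate tokens:

```python
# Rank candidate composition tokens by similarity to "thermoelectric".
candidates = ["Bi2Te3", "PbTe", "Li3Sb", "NaCl"]  # example tokens
scores = {c: model.wv.similarity(c, "thermoelectric")
          for c in candidates if c in model.wv}
for comp, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{comp}: {score:.3f}")
```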
36
Try "going back in time" and ranking materials, then follow
what happens in later years
Tshitoyan, V. et al.
Unsupervised word
embeddings capture latent
knowledge from materials
science literature. Nature
571, 95–98 (2019).
– For every year since
2001, see which
compounds we would
have predicted using
only literature data until
that point in time
– Make predictions of
which materials are the
most promising
thermoelectrics using
data up to that year
– See if those materials
were actually studied as
thermoelectrics in
subsequent years
37
A more comprehensive “back in time” test
Tshitoyan, V. et al. Unsupervised word embeddings capture
latent knowledge from materials science literature. Nature
571, 95–98 (2019).
• Thus far, 2 of our top 20 predictions made in
~August 2018 have already been reported in the
literature for the first time as thermoelectrics
– Li3Sb was the subject of a computational study
(predicted zT = 2.42) in Oct 2018 [1]
– SnTe2 was experimentally found to be a moderately
good thermoelectric (expt. zT = 0.71) in Dec 2018 [2]
• We are working with an experimentalist on one
of the predictions (but as a "spare time" project)
38
How about “forward” predictions?
[1] Yang et al. "Low lattice thermal conductivity and
excellent thermoelectric behavior in Li3Sb and Li3Bi."
Journal of Physics: Condensed Matter 30.42 (2018):
425401
[2] Wang et al. "Ultralow lattice thermal conductivity and
electronic properties of monolayer 1T phase semimetal
SiTe2 and SnTe2." Physica E: Low-dimensional Systems and
Nanostructures 108 (2019): 53-59
39
How is this working?
"Context words" link together information from different sources
Outline
40
① Natural language processing - where are
we right now?
② What’s next for the NLP work?
• Currently, we only have word vectors for
compositions that explicitly appear in abstracts
• We can rank known materials for an application,
but for materials with little or no mention in the
scientific literature, we are stuck!
• How do we get word embeddings for
compositions that do not exist in the text?
(One plausible approach is sketched after this slide.)
41
Making predictions for entirely new compositions
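One plausible route (an assumption for illustration, not necessarily the method we will adopt) is to compose a vector for an unseen formula from the learned element embeddings, weighted by stoichiometry:

```python
# Sketch: build a vector for a composition never seen in the corpus by
# averaging element embeddings weighted by their atomic fractions.
# (Hypothetical approach; the element symbols must exist in the vocab.)
import numpy as np
from pymatgen.core import Composition

def composition_vector(formula: str, wv) -> np.ndarray:
    fracs = Composition(formula).fractional_composition
    # e.g. "Li3Sb" -> 0.75 * v("Li") + 0.25 * v("Sb")
    return sum(frac * wv[el.symbol] for el, frac in fracs.items())

# vec = composition_vector("Li3Sb", model.wv)  # works even if the token
#                                              # "Li3Sb" never appears
```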
42
“Hidden representation learning”
43
Initial results – predicting experimental band gap from
composition (~3000 data points)
44
Going beyond entity recognition towards relationship
extraction
45
Current approach is not good enough
• E.g., automatically generate databases from the
literature
– Materials and their numerical band gaps (or thermal
conductivities, or bulk modulus, or superconducting
temperature, etc.)
– If materials can be made n-type, p-type, or both
– Which synthesis techniques led to various sample
descriptors
• Will likely require more powerful techniques, e.g.,
attention-based algorithms (BERT, Google XLNet …)
– To be investigated …
46
Once the accuracy improves, we can start to make much
more powerful searches
47
D2S2 - data-driven synthesis science (just starting)
Can we combine natural language processing with theory
and experiments to control synthesis?
Title auto-generated from abstract → Published title
• "Dynamics of molecular hydrogen confined in narrow nanopores" → "Restricted dynamics of molecular hydrogen confined in activated carbon nanopores"
• "Microfluidic Generation of Polydisperse Solid Foams" → "Generation of Solid Foams with Controlled Polydispersity Using Microfluidics"
• "Minimum variance unbiased estimator of product performance" → "Assessing the lifetime performance index of gamma lifetime products in the manufacturing industry"
• "Angle resolved ultraviolet photoemission study of fluorescein films on Ag 110" → "The growth of thin fluorescein films on Ag 110"
48
... and also some fun things, like automatic title generation
49
Acknowledgements
Slides (already) posted to hackingmaterials.lbl.gov
• High-throughput DFT
– Gerbrand Ceder and “BURP” team
– Funding: Bosch / Umicore
• Natural language processing
– Gerbrand Ceder, Kristin Persson, and “Matscholar” team
– Funding: Toyota Research Institutes
• Overall work funded by US Department of Energy
50
The Matscholar team
Kristin Persson, Anubhav Jain, Gerbrand Ceder
John Dagdelen, Leigh Weston (now at Medium), Vahe Tshitoyan (now at Google), Amalie Trewartha, Alex Dunn, Viktoriia Baibakova
Funding from Toyota Research Institutes