GPT For Python-Coding (2025)
Volume 1198
Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
Editorial Board
Marco Dorigo, Université Libre de Bruxelles, Bruxelles, Belgium
Andries Engelbrecht, University of Stellenbosch, Stellenbosch, South Africa
Vladik Kreinovich, University of Texas at El Paso, El Paso, TX, USA
Francesco Carlo Morabito, Mediterranea University of Reggio Calabria, Reggio
Calabria, Italy
Roman Slowinski, Poznan University of Technology, Poznan, Poland
Yingxu Wang, Schulich School of Engineering, Calgary, AB, Canada
Yaochu Jin, Westlake University, Hangzhou, China
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods
of computational intelligence, as embedded in the fields of engineering, computer
science, physics and life sciences, as well as the methodologies behind them. The
series contains monographs, lecture notes and edited volumes in computational
intelligence spanning the areas of neural networks, connectionist systems, genetic
algorithms, evolutionary computation, artificial intelligence, cellular automata, self-
organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems.
Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.
Indexed by SCOPUS, DBLP, WTI AG (Switzerland), zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
Bernhard Eidel
Editor
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2025
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
In an era where the boundary between artificial intelligence and human capability
continues to blur, the advent of advanced models like GPT-4 stands as a testament to
the ever-expanding possibilities of technology. This book is dedicated to exploring
the fascinating intersection of artificial intelligence, specifically the capabilities of
GPT-4, with computational materials science and computational mechanics—fields
that are pivotal in shaping the future of engineering. The focus here is not only to
unveil the potential of GPT-4 in generating sophisticated Python coding assignments
but also to deepen the understanding and application of computational techniques in
modern engineering problems.
Unleashing the Potential of GPT-4 in Computational Sciences. The first objective of this book is to present a curated selection of intriguing problems from computational materials science and computational mechanics. These problems are carefully
chosen for their relevance to current research and industrial applications and their
suitability for showcasing the advanced capabilities of GPT-4 in code generation.
From predicting material behavior under various conditions to simulating complex
mechanical interactions, the problems serve as a canvas on which GPT-4 paints its
solutions, demonstrating not just accuracy but creativity in problem-solving.
Engineering of Prompts: The Art and Science Behind Effective Questioning.
At the heart of effectively utilizing GPT-4 lies the ‘engineering of prompts’—a
structured and methodological approach to formulating queries that elicit the most
coherent and comprehensive answers from the model. This book dedicates substantial
focus to this art, guiding readers through the nuances of crafting prompts that are
clear, precise, and tailored to extract specific outcomes. By mastering this skill,
engineers and researchers can leverage AI tools like GPT-4 to their fullest potential,
turning vague ideas into precise algorithmic actions and reliable solutions.
Rigorous Code Verification: Ensuring Reliability in AI-Generated Outputs.
Trust in AI-generated code demands rigorous verification. This book addresses the
critical need for robust testing frameworks to validate the code produced by GPT-4. Through detailed walkthroughs of testing strategies and verification protocols,
readers will learn how to ensure that the solutions provided by AI not only work
in theory but also perform flawlessly in real-world applications. This part not only
enhances the reliability of AI as a tool but also instills confidence in its users.
Why This Book Is Indispensable. For students, graduate engineers, and seasoned
researchers, this book serves as a vital resource that illustrates the practical integration
of AI in their work. It demystifies the process of interfacing with one of the most
sophisticated AI models to date, making cutting-edge technology accessible and
applicable. For educators, this book provides a blueprint for teaching computational
concepts using AI, enriching the curriculum, and preparing students for a future
where AI is a ubiquitous part of problem-solving.
Inspiring the Next Generation. By bridging traditional engineering problems
with modern AI technology, this book not only enhances the current educational and
professional landscape but also serves to inspire the next generation of engineers
and developers. It pushes the boundaries of what can be achieved when human
ingenuity combines with artificial intelligence, setting a new standard for innovation
and excellence in engineering.
This book is more than just a textbook or a reference manual; it is a gateway to the
future of engineering, a tool that empowers its readers to build on the cutting edge of
technology. It encourages them to explore, experiment, and excel in their respective
fields, equipped with the knowledge and tools to use AI not just as an aid, but as a
transformative force in computational science.
This book project originated from a module I introduced into the Master’s degree
program in Computational Materials Science (CMS) at TU Bergakademie Freiberg
(TUBAF) during the summer semester of 2023, followed by a continuation in the
summer semester of 2024. The co-authors of this volume are students of the CMS
program at TUBAF who participated in the course and developed their individual
projects as part of its requirements.
Contents
Conclusions
Learned Lessons – Recommendations
Bernhard Eidel, Rahul Narkhede, and Aagashram Neelakandan
Index
Contributors
Acronyms
AI Artificial Intelligence
ANN Artificial Neural Networks
BC Boundary Condition
BCC Body Centered Cubic
BCE Binary Cross Entropy
BGK Bhatnagar-Gross-Krook
BVP Boundary Value Problem
CA Cellular Automata
CFD Computational Fluid Dynamics
CFL Courant-Friedrichs-Lewy condition
CLAHE Contrast Limited Adaptive Histogram Equalization
CNN Convolutional Neural Network
CPU Central Processing Unit
CS Coordinate System
D2Q9 Two-Dimensional, Nine-Directional
DFT Density Functional Theory
DL Deep Learning
erf Error function
FCC Face Centered Cubic
FD Finite Differences
FDM Finite Difference Method
FDTD Finite Difference Time Domain
FEM Finite Element Method
FSL Fick’s Second Law
FTCS Forward Time Centered Space
FVM Finite Volume Method
GIF Graphics Interchange Format
GPT Generative Pre-trained Transformer
GPU Graphics Processing Unit
HCP Hexagonal Close Packed
HEA High Entropy Alloy
IC Initial Condition
IDE Integrated Development Environment
IoU Intersection over Union
IVP Initial Value Problem
LBM Lattice Boltzmann Method
LLM Large Language Model
MD Molecular Dynamics
MLP Multi Layer Perceptron
MSE Mean Square Error
NLP Natural Language Processing
NN Neural Network
OpenCV Open Source Computer Vision Library
PDE Partial Differential Equation
PF Phase Field
PML Perfectly Matched Layer
RK Reaction Kinetics
TE Transverse Electric
TL Transfer Learning
TM Transverse Magnetic
ZSL Zero-Shot Learning
Generation of Atomic Scale Single Crystals
Abstract This chapter investigates the coding ability of GPT-4 in the generation of pristine single crystals with Face-Centered Cubic (FCC), Body-Centered Cubic (BCC), and Hexagonal Close-Packed (HCP) structures on the atomic scale, in arbitrary orientation with respect to a Euclidean reference frame. Beyond that, the code enables the insertion of edge and screw dislocations into pristine FCC crystals. The generated
crystals in atomic resolution can be used in Molecular Dynamics (MD) or Molecular
Statics (MS) simulations. The resultant code is checked for correctness and analyzed
to gain deeper insights into GPT-4’s behavior. Furthermore, the ability of GPT-4
to incorporate human feedback is also examined by providing corrections to the
generated code during the evaluation process.
1 Introduction
A solid is considered a crystal when its atoms are arranged with precise and consistent periodicity in their positions. This organization is described by a lattice, which can be of two types: Bravais and non-Bravais. In a Bravais lattice, all lattice points are equivalent, meaning the atoms are arranged uniformly throughout the crystal. In a non-Bravais lattice, by contrast, the lattice points are not all equivalent, indicating that the atoms are not the same everywhere. Hence, a non-Bravais lattice is referred to as a lattice with a basis.
Figure 1a illustrates the overall structure of a crystal: we can think of it as the combination of two things, the lattice (the organized arrangement of points) and the basis (the specific arrangement of atoms at those points). In simpler terms:

Crystal Structure = Lattice + Basis

This relation captures the fundamental elements that determine the way atoms are organized in a crystal.
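The relation Crystal Structure = Lattice + Basis can be sketched in a few lines of Python. This is an illustrative two-dimensional example with assumed values, not code from the chapter:

```python
import numpy as np

# Illustrative sketch of "crystal structure = lattice + basis" (assumed values):
# a 2x2 patch of a square Bravais lattice with parameter a and a two-atom basis.
a = 1.0
lattice_points = a * np.array([[i, j] for i in range(2) for j in range(2)], dtype=float)
basis = a * np.array([[0.0, 0.0], [0.5, 0.5]])  # basis offsets within each cell

# Every atom position is a lattice point plus a basis offset.
atoms = (lattice_points[:, None, :] + basis[None, :, :]).reshape(-1, 2)
print(len(atoms))  # 4 lattice points x 2 basis atoms -> 8 atoms
```

The same lattice with a one-atom basis would be a Bravais crystal; adding a second basis atom per lattice point yields a non-Bravais (lattice-with-basis) structure.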
Figures 1c–e depict the prevalent arrangement of unit cells observed in a variety of metals. For cubic crystal systems, such as FCC or BCC, the lattice parameter is typically denoted by ‘a’ and represents the distance between adjacent lattice points along each edge of the cubic unit cell. In other crystal systems, such as hexagonal, there may be multiple lattice parameters that define the dimensions along different crystallographic axes. In Figs. 1c, d the parameter ‘a’, and in Fig. 1e both parameters ‘a’ and ‘c’, are referred to as lattice parameters or lattice constants.
Metallic specimens are composed of multiple unit cells repeated in a pattern.
However, the alignment between the overall orientation of the specimen and the
individual crystals it comprises may not necessarily coincide. An effective approach
to describe the crystal’s orientation relative to the specimen involves defining two
distinct coordinate systems (CS): a crystal CS and a global CS as shown in Fig. 1b.
The global (specimen) CS, often used in experiments, describes the orientation of a
sample based on its geometry, while the crystal CS in crystallography aligns with the
crystal lattice. Rotation matrices \(R\) facilitate relating these systems by transforming vectors from one system to another. By applying \(R\) to vectors in the global CS, they can be converted to the crystal CS.
Rotations of vectors from global to local are carried out according to (1) and vice versa according to (2),

\[
x' = R\,x, \tag{1}
\]
\[
x = R^{\mathsf{T}} x', \tag{2}
\]

where \(x\) and \(x'\) are vectors in the global and crystal CS, respectively.
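These transformations can be sketched directly in NumPy. The orientation below, with crystal axes along [110], [−110], and [001], is an assumed example rather than one taken from the chapter:

```python
import numpy as np

# Rows of R are the (normalized) crystal-axis directions expressed in the global CS.
axes = np.array([[1.0, 1.0, 0.0],    # crystal x-axis along [110] (assumed example)
                 [-1.0, 1.0, 0.0],   # crystal y-axis along [-110]
                 [0.0, 0.0, 1.0]])   # crystal z-axis along [001]
R = axes / np.linalg.norm(axes, axis=1, keepdims=True)

x_global = np.array([1.0, 0.0, 0.0])
x_crystal = R @ x_global        # Eq. (1): global -> crystal components
x_back = R.T @ x_crystal        # Eq. (2): back-transformation with R^T

# R is orthogonal, so transforming back and forth recovers the original vector.
assert np.allclose(x_back, x_global)
```

Because the rows of \(R\) are orthonormal, \(R^{\mathsf{T}} R = I\), which is exactly why (2) inverts (1).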
However, pristine crystals, characterized by highly ordered repeating atomic patterns, do not undergo plastic deformation and are thus relatively uninteresting. Conversely, real-world materials are predominantly governed by defects, which can be categorized by dimension: zero-dimensional (point) defects (such as vacancies, interstitials, and substitutions), one-dimensional (line) defects (such as dislocations), two-dimensional (planar) defects (such as grain boundaries, interfaces, and free surfaces), and three-dimensional (volume) defects (such as pores and voids).
In this section, our focus is directed solely towards defects stemming from dislocations. Dislocations represent linear or one-dimensional irregularities within a crystal lattice, where the alignment of atoms deviates from the normal arrangement. Dislocations are broadly categorized into two types: edge dislocations and screw dislocations. As illustrated in Fig. 1f, an edge dislocation is characterized by the insertion of an extra portion of a plane of atoms, the edge of which terminates within the crystal. This edge is called the dislocation line, abbreviated by \(\xi\); for an edge dislocation the Burgers vector \(b\) is perpendicular to \(\xi\). Within the region around the dislocation line there is some localized lattice distortion. The atoms above the dislocation line in Fig. 1f undergo compression, and those below undergo tension.
Fig. 1 Definition by sketches of (a) crystal structure, (b) coordinate systems, (c)–(e) unit cells representing the fundamental crystal structures (FCC, BCC, HCP), and representations of (f) edge and (g) screw dislocations within a crystal lattice
P. Periyasamy and B. Eidel
The displacement field of a screw dislocation (Fig. 1g) whose line runs along the z-axis has a single nonzero component, given according to [3] by

\[
u_z = \frac{b}{2\pi}\arctan\frac{y}{x}, \tag{3}
\]

while for an edge dislocation the in-plane displacement components read

\[
u_x = \frac{b}{2\pi}\left(\arctan\frac{y}{x} + \frac{x y}{2(1-\nu)(x^2+y^2)}\right), \quad
u_y = -\frac{b}{2\pi}\left(\frac{(1-2\nu)\ln(x^2+y^2)}{4(1-\nu)} + \frac{x^2-y^2}{4(1-\nu)(x^2+y^2)}\right). \tag{4}
\]
Here, \(\nu\) is the Poisson’s ratio of the material. If the dislocation line \(\xi\) is parallel to the X- or Y-axis, the positions of atoms in (3) and (4) have to be adjusted consistently, cf. [3].
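As a quick numerical illustration, the fields (3) and (4) can be evaluated directly; the Burgers vector magnitude and Poisson’s ratio below are assumed example values, not values from the chapter:

```python
import numpy as np

b, nu = 0.256, 0.33          # assumed: Burgers vector magnitude and Poisson's ratio
x, y = 1.0, 0.5              # evaluation point relative to the dislocation line
theta = np.arctan2(y, x)
r2 = x**2 + y**2

u_z = b * theta / (2 * np.pi)                                     # screw, Eq. (3)
u_x = (b / (2 * np.pi)) * (theta + x * y / (2 * (1 - nu) * r2))   # edge, Eq. (4)
u_y = (-b / (2 * np.pi)) * ((1 - 2 * nu) * np.log(r2) / (4 * (1 - nu))
                            + (x**2 - y**2) / (4 * (1 - nu) * r2))
```

Using `np.arctan2` rather than `np.arctan(y/x)` keeps the angle well-defined in all four quadrants, which matters when the fields are applied to every atom around the core.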
In the subsequent sections of this chapter, the prompts generated will assess the
Python coding proficiency of GPT-4 in creating single crystals, specifically focusing
on the FCC, BCC, and HCP structures in any arbitrary orientations. Additionally,
starting from the pristine or defect-free FCC crystal, we will introduce edge and screw
dislocations. The visualization of the generated results will be facilitated through OVITO [8], a scientific software package widely employed for visualizing and analyzing molecular and materials simulation data in computational materials science and engineering.
2 Prompt
Users have the flexibility to input a wide range of questions to obtain answers to their
queries. However, the key lies in developing the skill to pose questions in a manner
that an Artificial Intelligence (AI) engine can comprehend, leading to the production
of high-quality and reliable results. An example of such a well-constructed prompt,
achieved through prompt engineering, is presented in Fig. 4.
The prompt offers a comprehensive roadmap of the Python program’s objectives,
detailing specific instructions for generating crystal structures for FCC, BCC, and
HCP systems, with an emphasis on accurately determining and saving atom coor-
dinates. Clear function requirements are outlined such as the function name and
user inputs. Primitive and basis vectors for each crystal type are explicitly defined
to ensure precise calculations for atom positions. The desired output file name is
specified, along with the file format identified as XYZ, compatible with visualiza-
tion tools like OVITO. User interface aspects, including prompts for crystal type,
lattice constants, box length with periodicity in each direction, and orientation, are
also described. Additionally, suggestions for error-handling measures to validate
user inputs, especially for lattice constants and crystal types, are provided. To ensure
reproducibility, two key factors are emphasized: clear prompts defining crucial loop
adjustments in for loops and a reminder that primitive and basis vectors must be
rotated before generating atom positions. The prompt intentionally refrains from specifying the function for calculating the periodicity length; an additional prompt (Fig. 3) is provided to calculate it separately. This decision is made to mitigate the inherent randomness in GPT-4’s behavior, ensuring that the code remains reproducible, at least after a few iterations. These topics will be discussed further in detail.
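For reference, the XYZ format requested in the prompt is minimal: an atom count, a comment line, and one "Element x y z" line per atom. A hand-written sketch (the file name and atom positions are assumed for illustration):

```python
# Minimal sketch of the XYZ file format targeted by the prompt (assumed data).
atoms = [("Cu", 0.0, 0.0, 0.0), ("Cu", 1.755, 1.755, 0.0)]

with open("example.xyz", "w") as f:
    f.write(f"{len(atoms)}\n")                    # line 1: number of atoms
    f.write("FCC fragment, readable by OVITO\n")  # line 2: comment line
    for element, x, y, z in atoms:
        f.write(f"{element} {x} {y} {z}\n")
```

A file of this shape can be dragged directly into OVITO for inspection, which is why the prompt fixes the output format up front.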
The user prompt displayed in Fig. 3 showcases the input provided for the periodicity calculation. The Python code generated by GPT-4 is given in Code Listing 1.1, and the corresponding user inputs and code output are detailed in Tables 1 and 2, respectively.
# ... (beginning of generate_crystal_structure not reproduced in the printed listing)
    periodicity_length_xy = (np.array(number_of_unit_cells[:2]) * a * norm_primitive[:2]).astype(int)
    periodicity_length_z = (number_of_unit_cells[2] * c * norm_primitive[2]).astype(int)
    periodicity_length = np.concatenate([periodicity_length_xy, [periodicity_length_z]])

    return periodicity_length

def get_user_inputs():
    while True:
        crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").upper()
        if crystal_type in ['FCC', 'BCC', 'HCP']:
            break
        print("Invalid crystal type. Please choose from 'FCC', 'BCC', or 'HCP'.")

    a = float(input("Enter lattice constant 'a': "))

    if crystal_type == 'HCP':
        c = float(input("Enter additional lattice constant 'c' (must be greater than 'a'): "))
    else:
        c = None  # Not required for FCC and BCC

    number_of_unit_cells = input("Enter the number of unit cells in each direction for the simulation box (x, y, z): ")
    number_of_unit_cells = tuple(map(int, number_of_unit_cells.split(',')))

    return crystal_type, a, c, number_of_unit_cells

if __name__ == "__main__":
    try:
        crystal_type, a, c, number_of_unit_cells = get_user_inputs()
        periodicity_length = generate_crystal_structure(crystal_type, a, c, number_of_unit_cells)
        print("Periodicity lengths (x, y, z):", periodicity_length)
    except ValueError as e:
        print("Error:", e)
Code Listing 1.1 Code generated by GPT-4 for periodicity calculation
Following the execution of code for periodicity calculation, the user input displayed
in Fig. 4 is provided to GPT-4 to generate crystal structures with arbitrary orientations.
Code Listing 1.2 shows the final code snippet generated by GPT-4, with the corresponding user inputs outlined in Table 3. The outcomes of the code execution are visually depicted in Fig. 5.
#!/usr/bin/env python
# coding: utf-8

import numpy as np

# Rotation Matrix Calculation from Miller Indices
def rotation_matrix_from_miller_indices(hkl_x, hkl_y, hkl_z):
    # Normalize input vectors
    x = np.array(hkl_x) / np.linalg.norm(hkl_x)
    y = np.array(hkl_y) / np.linalg.norm(hkl_y)
    z = np.array(hkl_z) / np.linalg.norm(hkl_z)

    # Ensure orthogonality and compute the rotation matrix
    if np.dot(x, y) == 0 and np.dot(x, z) == 0 and np.dot(y, z) == 0:
        return np.array([x, y, z]).T
    else:
        raise ValueError("Input Miller indices are not orthogonal.")

# Generate the crystal structure
def generate_crystal_structure(crystal_type, a, c, periodicity_length, orientation):
    rot_matrix = rotation_matrix_from_miller_indices(*orientation)
    positions = []

    if crystal_type == 'FCC':
        primitive_vectors = np.array([[0.5*a, 0.5*a, 0], [0.5*a, 0, 0.5*a], [0, 0.5*a, 0.5*a]])
        basis_vectors = np.array([[0, 0, 0]])
    elif crystal_type == 'BCC':
        primitive_vectors = np.array([[-0.5*a, 0.5*a, 0.5*a], [0.5*a, -0.5*a, 0.5*a], [0.5*a, 0.5*a, -0.5*a]])
        basis_vectors = np.array([[0, 0, 0]])
    elif crystal_type == 'HCP':
        primitive_vectors = np.array([[0.5*a, -0.5*(3**(1/2))*a, 0], [0.5*a, 0.5*(3**(1/2))*a, 0], [0, 0, c]])
        basis_vectors = np.array([[0.5*a, 0.5*(3**(-1/2))*a, 0.25*c], [0.5*a, -0.5*(3**(-1/2))*a, 0.75*c]])
    else:
        raise ValueError("Invalid crystal type.")

    # Apply rotation to primitive and basis vectors
    primitive_vectors = np.dot(primitive_vectors, rot_matrix)
    basis_vectors = np.dot(basis_vectors, rot_matrix)

    # Generate atom positions
    for i in range(-periodicity_length[0], periodicity_length[0] + 1):
        for j in range(-periodicity_length[1], periodicity_length[1] + 1):
            for k in range(-periodicity_length[2], periodicity_length[2] + 1):
                cell_origin = i * primitive_vectors[0] + j * primitive_vectors[1] + k * primitive_vectors[2]
                for basis in basis_vectors:
                    pos = cell_origin + basis
                    if all(0 <= pos[n] <= (periodicity_length[n] + 1) for n in range(3)):
                        positions.append(pos)

    return positions

# Save positions to XYZ file
def save_to_xyz_file(atom_positions):
    with open("Crystal_structure.xyz", "w") as file:
        file.write(f"{len(atom_positions)}\n\n")
        for pos in atom_positions:
            file.write(f"Atom {pos[0]} {pos[1]} {pos[2]}\n")

# User Interface
def main():
    crystal_type, a, c, periodicity_length, orientation = get_user_input()
    positions = generate_crystal_structure(crystal_type, a, c, periodicity_length, orientation)
    save_to_xyz_file(positions)
    print("Crystal structure generated and saved to Crystal_structure.xyz.")

def get_user_input():
    # Collect and validate crystal type
    crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").strip().upper()
    while crystal_type not in ['FCC', 'BCC', 'HCP']:
        print("Invalid crystal type. Please enter FCC, BCC, or HCP.")
        crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").strip().upper()

    # Collect and validate lattice constant 'a'
    a = float(input("Enter lattice constant 'a': "))

    # Initialize 'c' for HCP and validate if needed
    c = 0
    if crystal_type == 'HCP':
        c = float(input("Enter additional lattice constant 'c' (must be > a): "))
        while c <= a:
            print("Invalid 'c' value. It must be greater than 'a'.")
            c = float(input("Enter additional lattice constant 'c': "))

    # Collect and validate periodicity length (x, y, z)
    periodicity_length_str = input("Enter periodicity length (x, y, z): ")
    x, y, z = map(int, periodicity_length_str.split(','))

    # Collect and validate orientation
    print("Enter orientation as Miller indices (h, k, l) for x, y, z directions:")
    hkl_x = tuple(map(int, input("Orientation for x direction: ").split(',')))
    hkl_y = tuple(map(int, input("Orientation for y direction: ").split(',')))
    hkl_z = tuple(map(int, input("Orientation for z direction: ").split(',')))

    # Ensure orientation vectors are orthogonal
    # Note: This is a simplified check and may need refinement based on specific requirements
The outcomes of generating single crystals of FCC, BCC, and HCP in native orientations are illustrated in Fig. 5a–c, as induced by the corresponding input parameters listed in Table 3; the inputs for non-native orientations follow in Table 4.
Table 3 User inputs and output for crystal structure generation in native orientation

Requested input parameters:            FCC           BCC           HCP
Crystal type (FCC, BCC, HCP):          FCC           BCC           HCP
Lattice constant ‘a’:                  3.51          3.3           3.21
HCP lattice constant ‘c’ (c > a):      –             –             5.21
Periodicity length (x, y, z):          43, 43, 43    38, 38, 38    51, 51, 135
Orientation as Miller indices (h, k, l):
  For x direction:                     1, 0, 0       1, 0, 0       1, 0, 0
  For y direction:                     0, 1, 0       0, 1, 0       0, 1, 0
  For z direction:                     0, 0, 1       0, 0, 1       0, 0, 1
Output: Crystal structure generated and saved to Crystal_structure.xyz
Table 4 User inputs for structure generation in orientation other than native

Requested input parameters:            FCC           BCC           HCP
Crystal type (FCC, BCC, HCP):          FCC           BCC           HCP
Lattice constant ‘a’:                  3.51          3.3           3.21
HCP lattice constant ‘c’ (c > a):      –             –             5.21
Periodicity length (x, y, z):          43, 43, 43    38, 38, 38    51, 51, 135
Orientation as Miller indices (h, k, l):
  For x direction:                     1, −1, 2      1, −1, 2      1, −1, 2
  For y direction:                     −1, 1, 1      −1, 1, 1      −1, 1, 1
  For z direction:                     1, 1, 0       1, 1, 0       1, 1, 0
Output: Crystal structure generated and saved to Crystal_structure.xyz
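The Miller-index triplet in Table 4 must pass the orthogonality check enforced by the generated code before a rotation matrix is built. This can be verified in a few lines (a sketch using the Table 4 values):

```python
import numpy as np

# The three direction vectors from Table 4 must be mutually orthogonal,
# otherwise rotation_matrix_from_miller_indices raises a ValueError.
hkl_x = np.array([1, -1, 2])
hkl_y = np.array([-1, 1, 1])
hkl_z = np.array([1, 1, 0])

assert hkl_x @ hkl_y == 0 and hkl_x @ hkl_z == 0 and hkl_y @ hkl_z == 0

# Normalizing the three directions yields an orthogonal rotation matrix
# (orthogonal up to handedness; the code does not check the sign of det R).
R = np.array([v / np.linalg.norm(v) for v in (hkl_x, hkl_y, hkl_z)]).T
assert np.allclose(R.T @ R, np.eye(3))
```

Pairwise zero dot products are the whole admissibility condition here; any triplet failing it would be rejected by the generated code at runtime.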
Fig. 5 Generation of single crystals in native orientation (edges align with ⟨100⟩ directions), visualized using OVITO: (a) FCC, (b) BCC, (c) HCP crystal structure
Fig. 6 Generation of single crystals in orientations other than native, visualized using OVITO: (a) FCC, (b) BCC, (c) HCP crystal structure
Following the crystal generation process, the user prompt depicted in Fig. 7 was
utilized once more to introduce screw and edge dislocations into the pristine FCC
samples.
Code Listing 1.3 shows the final code snippet generated by GPT-4, with the corresponding user inputs outlined in Tables 5 and 6. The outcomes of the code execution are visually depicted in Figs. 8 and 9.
import numpy as np

def calculate_edge_displacement(x1, x2, b, nu):
    theta = np.arctan2(x2, x1)
    r2 = x1**2 + x2**2  # Square of the radial distance r
    u1 = (b / (2 * np.pi)) * (theta + (x1 * x2) / (2 * (1 - nu) * r2))
    u2 = (-b / (2 * np.pi)) * ((1 - 2 * nu) * np.log(r2) / (4 * (1 - nu)) + (x1**2 - x2**2) / (4 * (1 - nu) * r2))
    return u1, u2

def calculate_screw_displacement(x1, x2, b):
    theta = np.arctan2(x2, x1)
    u3 = b * theta / (2 * np.pi)
    return u3

def apply_displacement(atom_positions, dislocation_position, b, nu, direction, dislocation_type):
    modified_atoms = []
    for atom in atom_positions:
        element, x, y, z = atom
        if direction == 'z':
            dx, dy = x - dislocation_position[0], y - dislocation_position[1]
        elif direction == 'x':
            dy, dz = y - dislocation_position[0], z - dislocation_position[1]
        elif direction == 'y':
            dx, dz = x - dislocation_position[0], z - dislocation_position[1]

        if dislocation_type == 'edge':
            if direction == 'z':
                u1, u2 = calculate_edge_displacement(dx, dy, b, nu)
                modified_atoms.append([element, x + u1, y + u2, z])
            elif direction == 'x':
                u1, u2 = calculate_edge_displacement(dy, dz, b, nu)
                modified_atoms.append([element, x, y + u1, z + u2])
            elif direction == 'y':
                u1, u2 = calculate_edge_displacement(dx, dz, b, nu)
                modified_atoms.append([element, x + u1, y, z + u2])
        elif dislocation_type == 'screw':
            u3 = calculate_screw_displacement(dx, dy, b)
            if direction == 'z':
                modified_atoms.append([element, x, y, z + u3])
            elif direction == 'x':
                modified_atoms.append([element, x + u3, y, z])
            elif direction == 'y':
                modified_atoms.append([element, x, y + u3, z])
    return modified_atoms

def insert_dislocation(input_file_name, output_file_name, lattice_constant, dislocation_position, direction, dislocation_type, nu):
    with open(input_file_name, 'r') as file:
        lines = file.readlines()

    atoms = [[line.split()[0], float(line.split()[1]), float(line.split()[2]), float(line.split()[3])] for line in lines[2:]]

    b = np.sqrt(2) * lattice_constant / 2  # Burgers vector for FCC

    modified_atoms = apply_displacement(atoms, dislocation_position, b, nu, direction, dislocation_type)

    with open(output_file_name, 'w') as file:
        file.write(f"{len(modified_atoms)}\n")
        file.write(f"Atoms with {dislocation_type} dislocation\n")
        for atom in modified_atoms:
            file.write(" ".join(map(str, atom)) + "\n")

# User input handling
input_file_name = input("Enter the name of the input .xyz file: ")
output_file_name = input("Enter the name of the output .xyz file: ")
lattice_constant = float(input("Enter the lattice constant: "))
nu = float(input("Enter the Poisson's ratio: "))
direction = input("Enter the direction of the dislocation line (x, y, or z): ").lower()
dislocation_type = input("Enter the type of dislocation (screw or edge): ").lower()

# Request additional coordinates based on the direction of the dislocation line
if direction == 'x':
    dislocation_y = float(input("Enter the y-coordinate of the dislocation line: "))
The outcome of the code for inserting a screw dislocation into the crystal is
illustrated in Fig. 8. Note that for screw dislocations, the line direction ξ aligns
with the Burgers vector.
Fig. 8 Screw dislocations in FCC crystals for different orientations. The green arrow indicates
the direction of the Burgers vector, the red arrow represents the dislocation line direction.
Rendering using OVITO
The outcome of the code for inserting an edge dislocation into the crystal is
illustrated in Fig. 9. Note that for edge dislocations, the direction ξ of the
dislocation line is perpendicular to the Burgers vector.
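The helper functions `calculate_edge_displacement` and `calculate_screw_displacement` invoked in the listing above are not shown in this excerpt. A minimal sketch of what they might look like, assuming the standard isotropic-elasticity (Volterra) displacement fields for a straight dislocation with Burgers vector magnitude b and Poisson's ratio nu, is:

```python
import numpy as np

def calculate_edge_displacement(d1, d2, b, nu):
    """Volterra displacement field of an edge dislocation in isotropic
    elasticity; d1, d2 are coordinates relative to the dislocation line."""
    r2 = d1 ** 2 + d2 ** 2
    u1 = b / (2 * np.pi) * (np.arctan2(d2, d1)
                            + d1 * d2 / (2 * (1 - nu) * r2))
    u2 = -b / (2 * np.pi) * ((1 - 2 * nu) / (4 * (1 - nu)) * np.log(r2)
                             + (d1 ** 2 - d2 ** 2) / (4 * (1 - nu) * r2))
    return u1, u2

def calculate_screw_displacement(d1, d2, b):
    """Screw dislocation: the only displacement is along the line direction."""
    return b / (2 * np.pi) * np.arctan2(d2, d1)
```

Note the use of `np.arctan2`, which resolves the quadrant of the field point; the displacement jump of magnitude b across the cut plane is exactly what produces the dislocation.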
Fig. 9 Edge dislocations in FCC crystals for different orientations. The green arrow indicates
the direction of the Burgers vector, while the blue arrows represent the dislocation line direction.
Rendering using OVITO
In the initial formulation of the prompt, the problem objective was clearly articulated,
as depicted in Fig. 10. GPT-4 demonstrated its proficiency by generating representations
of all three crystal structures (FCC, BCC, and HCP) based on this initial prompt.
However, a notable observation surfaced during the simulation box depiction, where
it became apparent that the atoms did not entirely occupy the designated space, as
illustrated in Fig. 11. This discrepancy persisted despite the explicit specification
of this condition in the user prompt. The incongruity raises a significant concern
regarding adherence to the specified conditions during the generation process.
To address this issue in the subsequent iteration, a more generic prompt was
presented to the language model, as depicted in Fig. 12.
Fig. 13 Consequence of a generic prompt: failure to detect all valid crystal structures
In subsequent iterations, efforts were made to curb the model’s random behavior by
providing more precise information. To retrieve the equations for primitive and basis
vectors from the initial prompt, GPT-4 was specifically instructed to use the primitive
vectors from the initial input. However, due to its limited short-term memory, the
model encountered difficulties in recalling all the required information from the initial
prompt shown in Fig. 10. Consequently, primitive and basis vectors were reintroduced
In the subsequent iteration, the model successfully generated correct code capable
of producing all three crystal structures in arbitrary orientations. Acknowledging
the significant influence of prompt engineering on the performance of LLMs, GPT-4
was specifically tasked with generating a prompt that would enable the production of
precise code consistently. Despite incorporating a comprehensive prompt as depicted
in Fig. 14 that encapsulated all the necessary information for the precise regeneration
of the code, the model encountered challenges in reproducing identical results.
Due to its constrained domain knowledge and limited reasoning capability, the model
had difficulty distinguishing between the concepts of the number of unit cells and
the calculation of the box length accounting for periodicity. Specifically, it
struggled with prompts related to the periodicity calculation, particularly for the
HCP structure, as HCP possesses a distinct lattice constant in the z-direction.
Despite numerous attempts to generate prompts, the model remained inconsistent
in producing accurate code. This inconsistency can be attributed to the inherent
limitations of the model's understanding and its tendency to overlook specific aspects
of the input prompts.
In addition, the model's performance was hindered by the abundance of inputs
provided, which may have overwhelmed its attention and led to confusion. When
presented with an excessive amount of information, the model may struggle to
prioritize and comprehend the key components necessary for generating accurate code.
As a result, splitting the prompt into two distinct parts, one for calculating the box
length with periodicity (Fig. 3) and the other for generating a crystal structure with
arbitrary orientation (Fig. 4), was deemed necessary to enhance code reproducibility.
This approach aims to streamline the input process and mitigate the model's tendency
to overlook essential details, ultimately improving the reliability of the generated code.
After splitting the entire code into smaller, more manageable prompts, the
randomness in the model's responses became controllable. This allowed the model to
produce code for calculating periodicity more accurately, but it still required a few
iterations to refine the output and obtain exact code for structure generation. Overall,
this approach proved far better than using a single prompt for generating
the entire code.
7 Minor Errors
The code generated by the model might need some adjustments, as common errors
occurred during its development. Table 7 lists the frequently identified issues along
with the additional guidance provided for their rectification. With these modifications
implemented, one can replicate the process with minimal effort, albeit with some
minor periodicity-related issues.
8 Discussion
ongoing need for human intervention in such scenarios. Despite receiving feedback,
issues pertaining to the periodicity of surface atoms persist in the final output
shown in Figs. 5 and 6.
• Autonomous Decision-Making in Mathematical Computations: It is
noteworthy that the model's proficiency in handling tasks related to dislocation
insertion surpasses its capability in generating single crystals with arbitrary
rotations. In this scenario, it autonomously figured out the precise equations shown
in (3) and (4) for calculating displacements, even in the absence of explicit user
prompts. Furthermore, it opted to utilize the np.arctan2(y, x) function from
NumPy instead of np.arctan(y/x), as the former accounts for quadrant
distinctions, unlike the latter, which cannot differentiate between quadrants.
These decisions were made without specific instructions from the user, highlighting
the model's ability to autonomously make informed choices in its computations.
These capabilities suggest an advanced level of mathematical understanding and
problem-solving skill in GPT-4. However, it is important to note that while the
model-generated code demonstrates competency in many areas, minor issues may
still arise in the insertion of dislocations in specific orientations. Nonetheless, it
has consistently showcased its capability to generate entire code segments within
a few iterations, a notable feat when compared to single crystal generation.
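The quadrant distinction mentioned above is easy to demonstrate: np.arctan(y/x) cannot tell a point in the third quadrant from one in the first, while np.arctan2(y, x) can. A minimal illustration:

```python
import numpy as np

# arctan(y/x) collapses opposite quadrants: for the point (-1, -1) the ratio
# y/x equals 1, so it is indistinguishable from the point (1, 1).
theta_naive = np.arctan(-1.0 / -1.0)   # pi/4, as if the point were in Q1
theta_quad = np.arctan2(-1.0, -1.0)    # -3*pi/4, correctly placed in Q3

assert np.isclose(theta_naive, np.pi / 4)
assert np.isclose(theta_quad, -3 * np.pi / 4)
```

For the dislocation displacement fields, which depend on the polar angle of each atom around the dislocation line, this distinction is essential.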
• Sensitivity to User Prompts and Memory Constraints: The model's
responses are highly sensitive to even minor changes in the input prompt, resulting
in inconsistent outputs. Additionally, it has been observed that GPT-4 sometimes
struggles to maintain coherence and consistency over extended conversational
contexts [7]. Understanding the memory mechanism of ChatGPT Plus is essential for
optimizing interactions with the AI. ChatGPT Plus, based on the GPT-4 architecture,
has a short-term memory capacity limited by an 8,000-token constraint [6].
As this limit is reached, the AI begins to forget the earliest parts of the conversation,
unable to recall details beyond its immediate context window. Instead, it relies on
patterns and knowledge acquired during its training on a vast collection of internet
texts to generate relevant responses. Recognizing this limitation is crucial for
generating effective prompts and achieving more meaningful engagements with
GPT-4 [5]. Of course, well-established software systems for Molecular Dynamics
(MD) simulations with built-in atomic structure generation are available, such as
LAMMPS [9] or Atomsk [3]. Their functionality goes far beyond the code presented
in this chapter, ranging from straightforward crystal structures to more intricate
designs such as nanowires and non-periodic structures.
9 Conclusion
References
1. Anderson, P.M., Hirth, J.P., Lothe, J.: Theory of Dislocations. Cambridge University Press (2017)
2. Callister, W., Rethwisch, D.: Materials Science and Engineering: An Introduction. John Wiley & Sons, New York (2007)
3. Hirel, P.: Atomsk: A tool for manipulating and converting atomic data files, pp. 212-219 (2015). https://www.sciencedirect.com/science/article/pii/S0010465515002817
4. Hull, D., Bacon, D.J.: Introduction to Dislocations, Vol. 37. Elsevier (2011)
5. OpenAI ChatGPT memory span. https://www.4fsh.com/ [Online; accessed Feb-2024]
6. Chat GPT-4 Plus Memory Limit. community.openai.com, n.d. [Online; accessed Feb-2024]
7. Ray, P.: ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems 3, pp. 121-154 (2023). https://www.sciencedirect.com/science/article/pii/S266734522300024X
8. Stukowski, A.: Visualization and analysis of atomistic simulation data with OVITO - the Open Visualization Tool. Modelling and Simulation in Materials Science and Engineering (2010)
9. Thompson, A.P., Aktulga, H.M., Berger, R., et al.: LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales, pp. 108171 (2022). https://www.sciencedirect.com/science/article/pii/S0010465521002836
Molecular Dynamics Simulation of Noble
Gases
1 Introduction
Molecular dynamics (MD) simulations touch on many aspects of physics and are
a valuable resource for comparing theoretical models to experimental results. MD
uses computer simulation with statistical mechanics to compute static and dynamic
properties of a classical many-body system [1-4]. In contrast to Molecular Statics
(MS), whose simulations are carried out at a temperature of 0 K, MD simulations
are carried out at temperatures greater than 0 K. The classical MD method numerically
solves Newton's equations of motion for the interacting many-particle system with
pair potentials. Here, we treat atoms as classical Newtonian particles, so we can
compute the acceleration of any atom. The force between atoms is determined
from the interatomic potential by the gradient of the pair potential function. In this
project, we use the Lennard-Jones potential (LJ potential), also known as the 12-6
potential.
This is a valid potential for chemically inert gases such as argon (Ar). Argon is
a noble gas. It is one of the elements in Group 18 of the periodic table, which
includes other noble gases like helium, neon, krypton, xenon, and radon. These
gases are characterized by their lack of reactivity due to having a full valence electron
shell, making them very stable and unlikely to form chemical bonds under normal
conditions. Argon is the third noble gas, following helium and neon.
Temperature and time play a role in MD simulations as compared to MS simulations.
There are several common ensembles in MD: the NVE (microcanonical) ensemble,
the NVT (canonical) ensemble, and the NPT ensemble. Here, N is the number of
particles, V is the volume of the simulation box, E is the total energy of the
simulation box, T is the temperature of the simulation box, and P is the pressure in
the simulation box. In each ensemble, the variables in its acronym are kept constant;
in the NVE ensemble these are N, V, and E. For the NVT and NPT ensembles,
additional couplings are used, namely thermostats and barostats, respectively.
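As a minimal illustration of such a coupling (this simple velocity-rescaling thermostat is an assumed example, not taken from the chapter's code), one can rescale all velocities so the instantaneous kinetic temperature matches a target value:

```python
import numpy as np

kB = 1.380649e-23  # Boltzmann constant (J/K)

def rescale_to_temperature(velocities, mass, T_target):
    """Crude NVT coupling: rescale velocities so the instantaneous
    kinetic temperature T_inst = m <v^2> / (dim * kB) equals T_target."""
    n, dim = velocities.shape
    T_inst = mass * (velocities ** 2).sum() / (n * dim * kB)
    return velocities * np.sqrt(T_target / T_inst)

# usage sketch with assumed argon mass and a random initial velocity set
rng = np.random.default_rng(1)
v = rescale_to_temperature(rng.normal(0, 300.0, size=(1000, 3)),
                           6.63e-26, 300.0)
```

After the rescaling, the kinetic temperature computed from v is exactly 300 K by construction; production thermostats (Berendsen, Nosé-Hoover) are gentler variants of the same idea.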
The short-range repulsive term models the Pauli exclusion principle by stopping
particles from getting too close. The r^-12 exponent does not have a rigorous physical
justification; it is computationally convenient, as it is just the square of the r^-6
term.
Justification for Use in Modeling Noble Gases Noble gases like argon are
characterized by their closed-shell electronic configurations, meaning that they have no
permanent dipole moments and are chemically inert. The interactions between noble
gas atoms are primarily due to weak van der Waals forces. The Lennard-Jones
potential is particularly well-suited to modeling these interactions for several reasons:
• Van der Waals Forces Dominance: For noble gases, the interaction is
predominantly due to dispersion forces, which the Lennard-Jones potential effectively
captures through its (σ/r)^6 term.
• Simplicity and Computational Efficiency: The Lennard-Jones potential is
computationally simple and efficient to calculate, making it ideal for large-scale
simulations of noble gases where more complex potentials might be unnecessary or
impractical.
• Empirical Fit to Experimental Data: The parameters ε and σ can be fitted to
experimental data, allowing the Lennard-Jones potential to accurately reproduce the
properties of noble gases such as their phase behavior (liquid-gas coexistence
curve), transport properties (viscosity, diffusion), and thermodynamic properties
(equation of state).
• Historical Success: The Lennard-Jones potential has a long history of successful
application in simulating noble gases. It was originally developed based on
experimental observations of argon, and it has been validated extensively through its
ability to reproduce the experimentally observed behaviors of noble gases.
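The role of ε (well depth) and σ (zero-crossing distance) is easy to see in code. The following sketch uses the argon parameter values that also appear in the code listing below (ε = 1.65e-21 J, σ = 3.4e-10 m) and checks the two defining properties of the 12-6 form: V(σ) = 0 and V(2^(1/6) σ) = -ε at the potential minimum.

```python
import numpy as np

def lj_potential(r, epsilon, sigma):
    """12-6 Lennard-Jones pair potential V(r) = 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4 * epsilon * (sr6 ** 2 - sr6)

epsilon, sigma = 1.65e-21, 3.4e-10     # argon (J, m)
r_min = 2 ** (1 / 6) * sigma           # location of the potential minimum

assert np.isclose(lj_potential(sigma, epsilon, sigma), 0.0)   # V(sigma) = 0
assert np.isclose(lj_potential(r_min, epsilon, sigma), -epsilon)
```

Fitting ε and σ thus amounts to matching the depth and position of this single well to experimental data.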
Newton's second law of motion is given by the time derivative of the linear momentum
p = mv and the force F according to

F = dp/dt,    (2)

which simplifies for constant mass, hence ṁ = 0, to

F = ma,    (3)

x(t + Δt) = x(t) + v(t) Δt + (1/2) a(t) Δt²,    (4)

v(t + Δt) = v(t) + [a(t) + a(t + Δt)] Δt / 2,    (5)
where Δt is the time step size over which the integration is performed to obtain the
predicted quantities. The Velocity Verlet time integration method is employed since
it is fast, requires little memory, and remains well-behaved for comparatively long
time steps.
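Equations (4) and (5) can be exercised on a toy problem. The sketch below (an assumed 1D harmonic oscillator with m = k = 1, not part of the chapter's simulation) integrates for 10,000 steps and checks that the total energy stays essentially constant, which is the hallmark of the Velocity Verlet scheme:

```python
def accel(x):
    # harmonic oscillator with m = k = 1: a(x) = -x
    return -x

x, v, dt = 1.0, 0.0, 0.01
e0 = 0.5 * v ** 2 + 0.5 * x ** 2    # initial total energy
a = accel(x)
for _ in range(10000):
    x += v * dt + 0.5 * a * dt ** 2      # Eq. (4): position update
    a_new = accel(x)                     # force at the new position
    v += 0.5 * (a + a_new) * dt          # Eq. (5): velocity update
    a = a_new

e = 0.5 * v ** 2 + 0.5 * x ** 2
assert abs(e - e0) < 1e-4   # energy drift stays tiny over 10^4 steps
```

A naive Euler integrator accumulates a systematic energy drift on the same problem, which is why Velocity Verlet is the standard choice in MD.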
In this work, boundary conditions (BC) are periodic. They mimic the behavior of
the infinite bulk surrounding the sample. In this way, surface effects are removed.
It should also be noted that if the particles go through a boundary of the simulation
box, they will appear on the other side of the box.
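In code, this wrap-around behavior is a one-line modulo operation on the positions; a minimal sketch with an assumed box length L:

```python
import numpy as np

L = 10.0                                   # box length (arbitrary units)
positions = np.array([9.5, 0.2, 5.0])      # three particles in 1D
positions += np.array([1.0, -0.5, 0.0])    # advance one step
positions %= L                             # re-enter through the opposite face

assert np.allclose(positions, [0.5, 9.7, 5.0])
```

For the interaction forces, the complementary minimum-image convention (subtracting `L * np.round(r_ij / L)` from each pair separation) ensures each particle interacts with the nearest periodic image, as done in the code listing below.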
We will present an MD simulation of argon particles in the gas phase, whose
particles move with velocities drawn from the Maxwell-Boltzmann distribution,
as shown in Fig. 1. The Maxwell-Boltzmann distribution describes the velocities
of particles in ideal gases at thermodynamic equilibrium, where particles exchange
energy through brief collisions and their velocities follow Maxwell-Boltzmann
statistics based on kinetic energy. In MD simulations we first specify the initial
positions and momenta of the particles; the latter refer to the Maxwell-Boltzmann
distribution.
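Sampling such initial velocities is straightforward: each Cartesian component is Gaussian with variance kB*T/m. The sketch below (constants assumed for argon at 298 K) draws a large sample, removes the net drift, and checks that the mean-square speed matches the equipartition value 3*kB*T/m:

```python
import numpy as np

rng = np.random.default_rng(0)
kB, T, mass = 1.380649e-23, 298.0, 6.63e-26   # J/K, K, kg (argon, assumed)
n = 100000

# Each velocity component is normal with standard deviation sqrt(kB*T/m)
v = rng.normal(0.0, np.sqrt(kB * T / mass), size=(n, 3))
v -= v.mean(axis=0)   # remove net drift so the total momentum is zero

# sample mean-square speed should match the equipartition value 3*kB*T/m
assert np.isclose((v ** 2).sum(axis=1).mean(), 3 * kB * T / mass, rtol=0.02)
```

The same recipe appears as `initialize_velocities` in the code listing below.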
Statistical ensembles define the conditions under which atomistic simulations are
carried out, dictating what thermodynamic variables are conserved or controlled.
Different ensembles correspond to different sets of thermodynamic variables that
are held constant during the simulation. Below is a description of the most common
ensembles: the microcanonical ensemble NVE, the canonical ensemble NVT,
and the isothermal-isobaric ensemble NPT, where the acronyms reflect the variables
involved: the number of particles N, the volume V, the energy E, the temperature T,
and the pressure P.
Fig. 1 Probability density functions of the Maxwell-Boltzmann distribution for the velocities of
noble gases (helium, neon, argon, and xenon) at a temperature of 298 K
Here we use the NVE ensemble, which keeps N, V, and the internal energy E
constant, as follows for an isolated system. It meets the requirements of the present
goal of simulating a system with no heat exchange with the surroundings, analogous
to an isolated system in thermodynamics, and it provides a natural way to observe the
dynamical evolution of a system without external influences, often used for studying
the intrinsic properties of a system.
2 Prompt
The parameters and constants relevant to the simulation are given in Table 1.
Figure 2 shows the initial positions of the particles for the simulation. Then we
evolve the system according to Newton’s second law of motion for which we let the
particles interact through an LJ potential using an NVE ensemble. Finally, we measure
physical quantities as functions of the particle positions and momenta.
x_i(t + Δt) = x_i(t) + v_i(t) Δt + F_i(t) Δt² / (2 m_i)
v_i(t + Δt/2) = v_i(t) + F_i(t) Δt / (2 m_i)
10 end
11 Force Calculation:
12 Recalculate forces F_i(t + Δt) on each particle due to interactions using the interatomic
potential;
13 foreach particle i do
14 Update velocities:
v_i(t + Δt) = v_i(t + Δt/2) + F_i(t + Δt) Δt / (2 m_i)
15 end
16 Apply Boundary Conditions:
17 Apply boundary conditions (e.g., periodic boundaries, reflective walls) to the updated
positions and velocities. Here periodic boundary conditions are to be applied;
18 Calculate System Properties:
19 Compute kinetic energy E_kin(t), potential energy E_pot(t), and total energy E_tot(t);
20 end
21 Output:
22 Return particle trajectories x_i(t), velocities v_i(t), and plot E_kin(t), E_pot(t), and E_tot(t)
over time step t;
The final prompt that worked is given in Fig. 3; it was used to generate Code
Listing 2.1.
Generate a Python code for the molecular dynamics simulation for argon
gas in a 2D rectangular domain. The simulation box has dimensions of 10
nanometers by 10 nanometers. Consider the FCC lattice and make sure
to initialize the positions of atoms in the FCC lattice itself to avoid the
overlapping of the atoms.
Simulation Parameters: Set the time step for the simulation to 2 femtosec-
onds and simulate 5000 time steps.
Output: Plot the kinetic energy, potential energy, and total energy to the
number of time steps NOT Time.
Print the values of kinetic energy, potential energy and total energy at the
end of the simulation.
Create a snapshot in the form of an image to visualize particles and save the
resulting snapshot output in the user’s desktop path.
3 Code Listing
Code Listing 2.1 displays the Python code generated by ChatGPT-4o for the
prompt given in Fig. 3.
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib.animation import PillowWriter

def get_desktop_path():
    home = os.path.expanduser("~")            # Gets the home directory
    desktop = os.path.join(home, 'Desktop')   # Appends Desktop to the home path
    return desktop

# Constants
kB = 1.38064852e-23    # Boltzmann constant (J/K)
epsilon = 1.65e-21     # Lennard-Jones potential well depth (J)
sigma = 3.4e-10        # Lennard-Jones potential distance parameter (m)
mass = 6.63e-26        # Mass of argon atom (kg)
T = 300                # Temperature (K)
L = 10e-9              # Length of the simulation box (m)
num_particles = 100    # Number of particles
dt = 2e-15             # Time step (s)
num_steps = 5000       # Number of simulation steps

def initialize_positions(num_particles, L):
    """Initialize positions of particles in an FCC lattice"""
    positions = np.zeros((num_particles, 2))
    num_cells_per_side = int(np.ceil(np.sqrt(num_particles)))
    cell_size = L / num_cells_per_side
    particle_idx = 0

    for i in range(num_cells_per_side):
        for j in range(num_cells_per_side):
            if particle_idx >= num_particles:
                break
            positions[particle_idx] = np.array([i * cell_size, j * cell_size])
            particle_idx += 1

    # Ensure particles are not on the edge of the box
    positions += cell_size / 2
    return positions

def initialize_velocities(num_particles, T, mass):
    """Initialize velocities of particles to follow Maxwell-Boltzmann distribution"""
    velocities = np.random.normal(0, np.sqrt(kB * T / mass), (num_particles, 2))
    velocities -= np.mean(velocities, axis=0)  # Ensure zero net momentum
    return velocities

def compute_forces(positions):
    """Compute forces and potential energy using Lennard-Jones potential"""
    forces = np.zeros_like(positions)
    potential_energy = 0.0

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r_ij = positions[i] - positions[j]
            r_ij -= L * np.round(r_ij / L)  # Apply periodic boundary conditions
            r2 = np.dot(r_ij, r_ij)
            if r2 < (3 * sigma) ** 2:  # Apply a cutoff to save computation
                r2_inv = sigma ** 2 / r2
                r6_inv = r2_inv ** 3
                r12_inv = r6_inv ** 2
                force_magnitude = 24 * epsilon * (2 * r12_inv - r6_inv) / r2
                forces[i] += force_magnitude * r_ij
                forces[j] -= force_magnitude * r_ij
                potential_energy += 4 * epsilon * (r12_inv - r6_inv)

    return forces, potential_energy

def velocity_verlet(positions, velocities, forces, dt):
    """Perform a single step of Velocity Verlet integration"""
    positions += velocities * dt + 0.5 * forces * dt ** 2 / mass
    # NOTE: the remainder of this function and the helpers save_snapshot and
    # plot_energies are not reproduced in the source; the lines below are
    # reconstructions inferred from the algorithm and the call sites.
    positions %= L                          # apply periodic boundary conditions
    velocities += 0.5 * forces * dt / mass  # first half-step velocity update
    forces, potential_energy = compute_forces(positions)
    velocities += 0.5 * forces * dt / mass  # second half-step with new forces
    return positions, velocities, forces, potential_energy

def save_snapshot(positions, filename):
    """Save a scatter-plot snapshot of the particle positions (reconstructed)"""
    fig, ax = plt.subplots()
    ax.scatter(positions[:, 0], positions[:, 1], s=10)
    ax.set_xlim(0, L)
    ax.set_ylim(0, L)
    ax.set_xlabel('X Position (m)')
    ax.set_ylabel('Y Position (m)')
    fig.savefig(filename)
    plt.close(fig)

def plot_energies(kinetic_energies, potential_energies, total_energies, filename):
    """Plot kinetic, potential, and total energy vs. time step (reconstructed)"""
    fig, ax = plt.subplots()
    ax.plot(kinetic_energies, label='Kinetic energy')
    ax.plot(potential_energies, label='Potential energy')
    ax.plot(total_energies, label='Total energy')
    ax.set_xlabel('Time step')
    ax.set_ylabel('Energy (J)')
    ax.legend()
    fig.savefig(filename)
    plt.close(fig)

# Test 1: Periodicity Test with Animation
def test_periodicity(L):
    """Test for periodicity by checking if a single atom reenters the box with animation"""
    positions = np.array([[0, L / 2]])   # Start the particle at the left edge
    velocities = np.array([[1e4, 0]])    # Increased velocity for faster movement
    fig, ax = plt.subplots()
    ax.set_xlim(0, L)
    ax.set_ylim(0, L)
    ax.set_xlabel('X Position (m)')
    ax.set_ylabel('Y Position (m)')
    particle, = ax.plot([], [], 'ro', markersize=12)

    def init():
        particle.set_data([], [])
        return particle,

    def update(frame):
        nonlocal positions, velocities
        forces = np.zeros_like(positions)  # No other particles, so no forces
        positions, velocities, _, _ = velocity_verlet(positions, velocities, forces, dt)
        particle.set_data([positions[0, 0]], [positions[0, 1]])  # Pass as lists
        return particle,

    num_frames = int(L / (velocities[0, 0] * dt)) + 1  # Ensure enough frames to cover the entire path
    ani = animation.FuncAnimation(fig, update, frames=num_frames, init_func=init, blit=True)
    gif_path = os.path.join(get_desktop_path(), 'periodicity_test.gif')
    ani.save(gif_path, writer=PillowWriter(fps=30))
    plt.close(fig)
    print(f"Periodicity Test animation saved as {gif_path}")

# Test 2: Energy Conservation Test with Animation
def test_energy_conservation():
    """Test for energy conservation with two particles colliding elastically with animation"""
    v_initial = 1e5  # Initial velocity of the atoms (m/s)

    # Initial conditions
    x1 = L / 4       # Initial position of atom 1 (m)
    x2 = 3 * L / 4   # Initial position of atom 2 (m)
    v1 = v_initial   # Initial velocity of atom 1 (m/s)
    v2 = -v_initial  # Initial velocity of atom 2 (m/s)

    # Lists to store positions and energies
    positions1 = []
    positions2 = []
    kinetic_energies = []

    # Simulation loop
    for step in range(num_steps):
        # Update positions
        x1 += v1 * dt
        x2 += v2 * dt

        # Check for collision and update velocities
        if x1 >= x2:
            v1, v2 = v2, v1

        # Save positions and energies
        positions1.append(x1)
        positions2.append(x2)
        kinetic_energy = 0.5 * mass * (v1 ** 2 + v2 ** 2)
        kinetic_energies.append(kinetic_energy)

        # Reflect atoms at the boundaries (elastic collision with the wall)
        if x1 < 0 or x1 > L:
            v1 = -v1
        if x2 < 0 or x2 > L:
            v2 = -v2

    # Create animation
    fig, ax = plt.subplots()
    ax.set_xlim(0, L)
    ax.set_ylim(0, L)
    ax.set_xlabel('X Position (m)')
    ax.set_ylabel('Y Position (m)')

    line1, = ax.plot([], [], 'ro', label='Atom 1')
    line2, = ax.plot([], [], 'bo', label='Atom 2')

    def init():
        line1.set_data([], [])
        line2.set_data([], [])
        return line1, line2

    def update(frame):
        # Update with sequences instead of single values
        line1.set_data([positions1[frame]], [L / 2])
        line2.set_data([positions2[frame]], [L / 2])
        return line1, line2

    ani = animation.FuncAnimation(fig, update, frames=num_steps, init_func=init, blit=True)

    # Save the animation as a GIF using PillowWriter
    desktop_path = os.path.join(os.path.expanduser("~"), "Desktop")
    gif_path = os.path.join(desktop_path, "atomic_collision_simulation.gif")
    ani.save(gif_path, writer='pillow', fps=60)

    plt.show()

    print(f"Simulation complete. GIF saved to: {gif_path}")

# Main simulation loop
positions = initialize_positions(num_particles, L)
velocities = initialize_velocities(num_particles, T, mass)
forces, potential_energy = compute_forces(positions)

# Save initial snapshot
save_snapshot(positions, os.path.join(get_desktop_path(), 'initial_snapshot.png'))

kinetic_energies = []
potential_energies = []
total_energies = []

for step in range(num_steps):
    positions, velocities, forces, potential_energy = velocity_verlet(positions, velocities, forces, dt)
    kinetic_energy = 0.5 * mass * np.sum(velocities ** 2)
    total_energy = kinetic_energy + potential_energy

    kinetic_energies.append(kinetic_energy)
    potential_energies.append(potential_energy)
    total_energies.append(total_energy)

# Plot energies
plot_energies(kinetic_energies, potential_energies, total_energies,
              os.path.join(get_desktop_path(), 'energy_plot.png'))

# Save final snapshot
save_snapshot(positions, os.path.join(get_desktop_path(), 'final_snapshot.png'))

# Print final energies
print(f"Final Kinetic Energy: {kinetic_energies[-1]}")
print(f"Final Potential Energy: {potential_energies[-1]}")
print(f"Final Total Energy: {total_energies[-1]}")

# Run the additional tests with animations
test_periodicity(L)
test_energy_conservation()
Code Listing 2.1 Code generated by ChatGPT-4o for the 2D MD simulation of argon gas
The output of ChatGPT-4o does not guarantee that the generated code will
work without any flaws. Thorough tests are indispensable for that reason.
48 A. Neelakandan et al.
In the following, two tests of rather basic nature are carried out. They are based on
a one-atom system to check the periodic BC, and a two-atom system which checks
quantitatively the energy conservation.
Figure 4 shows the trail of one atom moving from left to right (snapshots
at constant time increments Δt) in the left simulation box; once it leaves that box
through its periodic boundary at the right edge, it simultaneously re-enters the simu-
lation box as displayed in the second window. The plot at the bottom of Fig. 4 displays
the corresponding kinetic energy for the entire process; the constant value over the
full trail, of a length twice the box size, verifies (kinetic) energy conservation and,
implicitly, that the velocity is constant.
In conclusion, the test verifies the proper functionality of the periodic boundary
condition for the MD simulation.
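The one-atom periodicity check essentially reduces to a modulo wrap of the position. The following standalone sketch illustrates the idea; box length, mass, and time step are arbitrary placeholder values, not those of the argon simulation:

```python
import numpy as np

def wrap(position, L):
    """Map a position back into the periodic box [0, L)."""
    return position % L

# One free atom with constant velocity: its kinetic energy must not change
# when it leaves the box and re-enters on the opposite side.
L = 10.0                  # box length (placeholder value)
mass = 1.0                # placeholder mass
dt = 0.01                 # placeholder time step
x = np.array([9.5, 5.0])  # start close to the right boundary
v = np.array([1.0, 0.0])  # moving to the right

e_kin0 = 0.5 * mass * np.sum(v**2)
for _ in range(200):      # long enough to cross the boundary once
    x = wrap(x + v * dt, L)
    e_kin = 0.5 * mass * np.sum(v**2)
    assert np.isclose(e_kin, e_kin0)  # kinetic energy unchanged at the wrap

assert 0.0 <= x[0] < L    # the atom re-entered through the left edge
```

Because no force acts on a single atom, the wrap must leave the velocity, and hence the kinetic energy, untouched.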
Fig. 4 Periodic BC test for an MD simulation shows the constant kinetic energy as the atom moves
across the periodic boundary, from left to right
Molecular Dynamics Simulation of Noble Gases 49
Figure 5 shows the setup for the energy conservation test. In the simulation box
of size L = 10 nm there are only two atoms (initial positions at t₀ = 0:
x₁(t₀) = L/6, x₂(t₀) = 5L/6 with |x₂(t₀) − x₁(t₀)| ≥ r_cutoff), which fly with the
same speed along the same line towards each other, thus having velocity vectors
of opposite sign (initial velocities v₁(t₀) = −v₂(t₀), |v₁(t₀)| = 10⁴ m/s). The kinetic
energy is at its maximum in the left and right images of Fig. 5, where the atoms move
freely outside their interaction range, whereas the potential energy is at its maximum
in the center image of Fig. 5. However, in order to make this effect happen within the
compatible range of Δt = 2 × 10⁻¹⁵ s, the parameter σ in the LJ potential has been
increased by one order of magnitude to 3.4 × 10⁻⁹ m compared to the argon value
tabulated in Table 1.
Fig. 5 Quantitative test for energy conservation consisting of two atoms traveling towards each
other along the same path with an initial speed of 10⁴ m/s. When they come closer, the atoms enter
the cut-off radius of the LJ potential, resulting in an increase of the potential energy and a decrease
of the kinetic energy because of their repulsion. This makes the atoms move away from each other,
resulting in a decrease of the potential energy and an increase of the kinetic energy, until they
experience the same interaction again near the boundary because of the periodic boundary
conditions
For this setting, we test the proper calculation of the initial energies as well; for
|x₂(t₀) − x₁(t₀)| ≥ r_cutoff, the potential energy at t₀ is zero, E_pot(t₀) = 0. The initial
kinetic energy is calculated with v₀ = |v₁(t₀)| = |v₂(t₀)| = 10⁴ m/s and the mass of
argon atoms (see Table 1) according to

    E_kin^calc(t₀) = 2 · (1/2) m v₀² = 6.6299999999999994 × 10⁻¹⁸ J = E_tot(t₀).   (6)

The value of E_kin^calc(t₀) from the manual calculation exhibits a minor deviation from the
value computed by the code, E_kin(t₀) = 6.630605061255307 × 10⁻¹⁸ J.
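The manual figure in (6) can be reproduced in two lines; a minimal check (the argon mass value 6.63 × 10⁻²⁶ kg is an assumption taken to match Table 1):

```python
m_argon = 6.63e-26  # kg, argon atomic mass (value assumed from Table 1)
v0 = 1.0e4          # m/s, initial speed of each of the two atoms

# Both atoms contribute (1/2) m v0^2; the potential energy is zero because
# the initial separation exceeds the cut-off radius.
e_kin_calc = 2 * 0.5 * m_argon * v0**2
```

Up to floating-point rounding this reproduces the value 6.63 × 10⁻¹⁸ J of Eq. (6).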
Figure 6 displays the energies of the 100-atom system during the first 5000 time
steps. While the kinetic and the potential energies show some scatter, they sum up
to a constant total energy (4.0592186387714405 × 10⁻¹⁹ J), indicating energy
conservation.
Note that the system evolves from a regular geometric setting where atoms reside
on nodes of a grid with square cells as displayed in Fig. 2. It is the heterogeneous
velocity distribution at the simulation start which drives the system to evolve into a
heterogeneous atom distribution after 5000 time steps as displayed in Fig. 7.
5 Discussion
Some important aspects that are observed while working on prompts and generating
codes are:
• Completeness of the program: Generally, the chatbots (ChatGPT-4 and ChatGPT-
4o) account for all the details provided in the prompt, but sometimes they overlook
parts of it; typically, some variables then remain undefined in the code. A remedy
is to mark those parts of the prompt in bold characters and to insist that the chatbot
does not forget them.
• Short-term memory capacity: Recent chatbots like ChatGPT-4o tend to retain
context for a long time, producing the expected code output over extended sessions.
After a certain point, however, they tend to drop parts of the code and generate
output that is not even mentioned in the prompt. The prompt therefore has to be
very specific about the equations to be used, the constants, the initial variables,
the time integration method, and the output plots.
Fig. 6 Energies of the 100-atom system versus the number of time steps indicates the conservation
of the total energy while the kinetic and the potential energies show fluctuations
• Reproducibility: If the same prompts given in this report are inserted into the
chatbots, the output code will probably be similar but not identical to the code
given here, because chatbots are not consistent in their results. The basic algorithm
of the code will be the same, with some minor differences. Recent versions such
as ChatGPT-4o generate code very similar to that given in this report.
• Reliability: The final code generated by the chatbots is reliable in its results only
after the prompts have been optimized. Code generated by recent versions such as
ChatGPT-4o is more reliable than that of older versions such as ChatGPT-4.
• Hallucinations: Hallucinations in chatbots refer to instances where the chatbots
generate responses that are factually incorrect, nonsensical, or completely fabri-
cated, despite being delivered with apparent confidence. We have encountered only
minor hallucinations, which could be rectified easily; in our experience, halluci-
nations in this task remain minor or low-level.
• Lessons learned: The main lesson we have learned is that the prompt must be
highly optimized if exact and consistent results matching the research articles are
required. Even minor information must be added to the prompt so that the code
works without logical errors. The quality and correctness of the output depend
mainly on the prompt, which should therefore be specific, refined, and optimized.
Fig. 7 Heterogeneous atom distribution after time step 5000 in the MD simulation of argon gas
6 Conclusion
The main aim of this chapter was to construct a prompt for ChatGPT-4o to generate
a Python code for the 2D MD simulation of noble gases such as argon. To put
things into perspective, building blocks of MD simulations were presented such as
Newton’s equation of motion, the Lennard-Jones (12–6) pair potential, statistical
ensembles, and the velocity Verlet algorithm for integrating Newton’s second law.
The code structure was illustrated by a descriptive pseudocode. The resultant Python
code was successfully assessed through a series of tests; the functionality of periodic
boundary conditions was verified and for the collision of two atoms the conservation
of energy along with its dynamic decomposition into kinetic and potential parts was
observed. In conclusion, with some minor manual intervention, ChatGPT-4o has
demonstrated its ability to generate an accurate and reliable code for the molecular
dynamics simulations of noble gases.
References
Abstract In the current chapter the assignment for GPT-4 on ChatGPT Plus is
to generate a Python code for grain growth simulation by the phase field method.
Specifically, the non-conserved Allen-Cahn equation with a suitable free energy
functional is solved in 2D using the finite difference method and the explicit Euler
forward time-stepping scheme. By virtue of a specific prompt design for GPT-4 the
resulting Python code allows solving the phase field equations for any generalized
initialization. To this end, two variants of initial grain structures are considered; (i)
a spherical grain embedded in a larger grain, and (ii) a Voronoi tessellation-based
initial structure. Key aspects such as prompt design, code verification and testing of
the outcome are discussed.
1 Introduction
where f is the local free energy density and κᵢ are the gradient energy coefficients.
The specific form of the orientation-independent f is given in the Fan and Chen
model [6] as

    f(η₁, η₂, …, η_N) = Σ_i^N ( −(A/2) ηᵢ² + (B/4) ηᵢ⁴ ) + Σ_i^N Σ_{j≠i}^N ηᵢ² ηⱼ² ,   (3)

so that the evolution equation for each order parameter reads

    ∂ηᵢ/∂t = −Lᵢ ( −A ηᵢ + B ηᵢ³ + 2 ηᵢ Σ_{j≠i}^N ηⱼ² − κᵢ ∇²ηᵢ ) .   (4)
The Laplacian operator in (4) is approximated using the five-point stencil of the
finite difference (FD) method, and the time integration follows the explicit Euler
forward scheme for i = 1, 2, …, N grains, with ηᵢⁿ⁺¹ = ηᵢ(tₙ₊₁), ηᵢⁿ = ηᵢ(tₙ), and
Δt = tₙ₊₁ − tₙ.
In the equations describing the model, η is dimensionless, Lᵢ has the dimen-
sions [L²T⁻¹], κᵢ has the dimensions [ML²T⁻²], F has the dimensions of energy
[ML²T⁻²], and f has the dimensions of energy density [ML⁻¹T⁻²]. All variables
of the model are treated in a non-dimensional form in the simulation.
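A single explicit Euler update of (4) can be sketched as follows. The parameter values and grid size are illustrative placeholders, and the `laplacian` helper assumes periodic boundaries via `np.roll`, consistent with the generated code shown later in the chapter:

```python
import numpy as np

# Placeholder model parameters (illustrative, non-dimensional)
A, B, L_i, kappa, dt, dx = 1.0, 1.0, 5.0, 0.1, 0.005, 0.5

def laplacian(eta, dx):
    """Five-point finite-difference Laplacian with periodic boundaries."""
    return (np.roll(eta, 1, axis=0) + np.roll(eta, -1, axis=0)
            + np.roll(eta, 1, axis=1) + np.roll(eta, -1, axis=1)
            - 4.0 * eta) / dx**2

def euler_step(eta_i, sum_eta_sq_others, dx, dt):
    """One explicit Euler forward step of Eq. (4) for a single grain i."""
    rhs = -L_i * (-A * eta_i + B * eta_i**3
                  + 2.0 * eta_i * sum_eta_sq_others
                  - kappa * laplacian(eta_i, dx))
    return eta_i + dt * rhs

eta = np.random.rand(64, 64)    # toy order-parameter field for grain i
others = np.zeros_like(eta)     # no competing grains in this sketch
eta_new = euler_step(eta, others, dx, dt)
```

For a uniform field ηᵢ = 1 with no competing grains, the right-hand side of (4) vanishes, which provides a quick sanity check of the implementation.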
2 Prompt
The prompt provided to the chatbot is structured into the following parts:
• System
• Context
• Specific instructions for numerical implementation
• Initialization cases
2.1 System
The system prompt is mainly aimed for the chatbot to adopt a persona that has a con-
textual understanding of the broad field and maintains consistency in all responses.
Further, the system prompt can also be used to obtain a specified style of program-
ming. We also enforce the chatbot to provide an error-free code by providing such a
system prompt.
In response to the prompt shown in Fig. 1, ChatGPT 4 mentions the applications
of the PF method. It further elaborates by providing some mathematical aspects of
the PF method such as its dependence on differential equations, commonly used
numerical methods, initialization of random grain structures, use of Matplotlib and
possible ways of evaluating results. It also gives a general example task and briefly
explains the steps involved. This response indicates that the large language model
acknowledges the persona and has an understanding of the PF method.
58 R. Narkhede and B. Eidel
Prompt 1
Your role: You are an expert in computational materials science. You can
understand complex mathematical model of physical phenomena in mate-
rials, which are usually described in terms of differential equations. Your
particular research focus is the modeling of grain growth by the phase-field
method. You understand the phase-field method in depth and know existing
research in its implementation on grain growth modeling. You can judge the
results of a numerical method based on its outputs, specifically plots. Further,
you also know the best ways to initialize random grain structures in 2D. Your
current role requires you to generate Python programs that are error-free
(logically, semantically and numerically). You know the best practices of
using the library Matplotlib to generate publication-ready plots.
Fig. 1 System prompt to provide ChatGPT 4 chatbot a persona that is an expert in the domain of
our problem, understands the numerical methods and enforces error-free code generation
2.2 Context
The model description for our problem setting is provided in the context prompt.
It includes the specific equations mentioned in the introduction. It is observed that
providing the equations of the PF model (1), (2) and (3) in LaTeX format ensures
mathematical correctness in the code implementation.
ChatGPT 4 provides a clear understanding of the terms of the equations in its
response to the prompt in Fig. 2. It first mentions some applications of the Allen-Cahn
equation. Then it explains the individual terms, i.e. the order parameter ηᵢ, the mobil-
ity coefficient Lᵢ, the free energy F, the gradient energy coefficients κᵢ and the local
free energy density f. It reproduces the equations provided and then briefly outlines
the simulation approach, with suggestions for numerical methods and their possible
limitations.
Now, specific instructions are provided in a concise prompt shown in Fig. 3. To obtain
a Python code that can be used for any type of initialized domain, we first consider
a rectangular domain with circular grains at random locations with radii within a
specified range. The motive is to reuse this generated phase field method code for
other modified initializations such as Voronoi tessellations.
Phase Field Modeling of Grain Growth 59
Fig. 2 Context prompt providing the details of the PF model used in the problem, with the specific
equations written in LaTeX format
Care has been taken to make the instructions clear and precise to restrict ChatGPT 4
from making any assumptions on its own. The instructions also emphasize the desired
dimensions of the outputs. For example, the initialized order parameters
should be stored in an array of dimensions (Nx, Ny, ngrains), where (Nx, Ny)
specifies the grid size and ngrains is the number of grains. The discretized equa-
tion for time-stepping (4) is also provided in LaTeX format to ensure mathematical
correctness in its code implementation (Fig. 3).
Fig. 3 Prompt with specific instructions for numerical implementation provided as a one-thrust
prompt
Fig. 3 (continued)
Fig. 4 Prompt to generate pseudocode for the grain growth simulation based on the specific instruc-
tions provided
ChatGPT 4 takes a relatively long time to process the response and indicates that
it is analyzing its code. After completion, it generates a plot of area fractions versus
time steps as the output of this verification. This shows that ChatGPT 4 analyzes its
generated code prior to responding and also adds the resulting output plot to its
response. The generated Python code is shown in Code Listing 3.1.
The update of the order parameter, and thus the grain growth over time steps is
conducted by the operations mentioned in the algorithm 2 generated by the prompt
shown in Fig. 4.
    ∇²ηᵢ = (1/Δx²) [ ηᵢ(x+Δx, y) + ηᵢ(x−Δx, y) + ηᵢ(x, y+Δy) + ηᵢ(x, y−Δy) − 4 ηᵢ(x, y) ]
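The five-point stencil can be cross-checked against a field with a known Laplacian. A small illustrative verification follows; the `np.roll`-based form mirrors the implementation in Code Listing 3.1 with periodic boundaries and dx = dy:

```python
import numpy as np

def laplacian_roll(eta, dx):
    """np.roll version of the five-point stencil (periodic boundaries)."""
    return (np.roll(eta, -1, axis=0) + np.roll(eta, 1, axis=0)
            + np.roll(eta, -1, axis=1) + np.roll(eta, 1, axis=1)
            - 4.0 * eta) / dx**2

# Periodic test field eta = sin(2*pi*x/Lx); its exact Laplacian is
# -(2*pi/Lx)**2 * eta, so the stencil error can be measured directly.
N, dx = 128, 0.5
Lx = N * dx
x = np.arange(N) * dx
eta = np.sin(2 * np.pi * x / Lx)[:, None] * np.ones((1, N))

num = laplacian_roll(eta, dx)
exact = -(2 * np.pi / Lx)**2 * eta
assert np.max(np.abs(num - exact)) < 1e-3  # second-order accurate stencil
```

The observed error shrinks with Δx², as expected for a second-order stencil.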
The Python code obtained from the prompts so far is used along with two different
initializations. The first case of ideal grain growth is considered for verification of the
Python code for the PF method. The second case is based on Voronoi tessellations
as initial domains and represents a general application in materials science.
Ideal grain growth is a special case of normal grain growth, where the grain boundary
motion is driven only by the local curvature of the grain boundary. Here, a spherical
grain embedded in a large second grain is considered (Fig. 5).
For a more generalized application of the generated PF method code, the domain is
initialized by a Voronoi tessellation. Initially, a general random seed based Voronoi
tessellation is generated and the order parameters are initialized. Next, it is modified
for a Voronoi tessellation with a gradient in the cell size. Specific instructions are
provided to initialize the order parameters to eliminate any discrepancies.
Prompt 4
Provide a Python function based on following instructions. Initialize the
order parameters etas using following instructions. Rectangular grid of size
Nx = Ny = 64 with grid spacing dx = dy = 0.5. We want to initialize a
circular grain embedded in another large grain of the size of the rectangular
grid. The circular grain of radius 14 is centered at the center of the grid.
Grain 1 is the larger grain and grain 2 is the embedded circular grain. The
order parameters for each grain are initialized by an array of shape (Nx, Ny,
ngrains), where ngrains = 2 is the number of grains. Each ith 2D sub-array
of shape (Nx, Ny) specifies the order parameter at all grid points for the ith
grain.
Initialization of grain 1: The order parameters over the rectangular grid for
grain 1 are initialized to 1. But all points falling within the radius of the
embedded circular grain have order parameter 0.
Initialization of grain 2: The order parameters over the rectangular grid for
grain 0 are initialized to 0. But all points falling within the radius of the
embedded circular grain have order parameter 1.
Prompt 5
Create a Voronoi tessellation in a domain of size 32x32. Consider 25 grains.
Ensure periodic boundary conditions. Extend any Voronoi cells near the
domain boundary till the domain boundary such that we have the domain
completely filled with the Voronoi cells. Plot the Voronoi tessellation.
After creating this Voronoi tessellation, initialize the order parameters for
the PF method in the grains that lie in the trimmed Voronoi tessellation
domain. Follow these steps.
1. Input: The Voronoi tessellation and its data from generated Voronoi tes-
sellation. Create a rectangular grid of size 64 x 64 with grid spacing of
0.5.
2. Initialize Order Parameter: Create an array to store the order parameter
values for each grain at each grid point. Create an array of dimensions
(64, 64, ngrains), where ’ngrains’ is the number of grains, i.e. the number
of Voronoi cells.
3. For each grain, loop through all grid points. For each grid point (i, j)
in the 64x64 grid, determine the corresponding Voronoi cell using the
tessellation.
4. If the point is inside the Voronoi cell, set the order parameter for that point
to 1.0. If the grid point is outside the Voronoi cell, set the order parameter
for that point to 0.0.
5. Repeat for all grains in the Voronoi tessellation. Provide the final array
of order parameters of dimensions (64, 64, ngrains).
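The steps of Prompt 5 amount to a nearest-seed assignment on the grid; below is a minimal sketch under periodic (minimum-image) distances. Function and variable names are illustrative only and are not taken from the generated code:

```python
import numpy as np

def init_etas_from_seeds(seeds, domain_size, Nx, Ny, dx):
    """Assign each grid point to the nearest seed under periodic
    (minimum-image) distances and set the order parameters accordingly."""
    ngrains = len(seeds)
    etas = np.zeros((Nx, Ny, ngrains))
    xs = (np.arange(Nx) + 0.5) * dx
    ys = (np.arange(Ny) + 0.5) * dx
    X, Y = np.meshgrid(xs, ys, indexing='ij')
    # Periodic distances to every seed; shapes broadcast to (Nx, Ny, ngrains)
    ddx = np.abs(X[..., None] - seeds[:, 0])
    ddy = np.abs(Y[..., None] - seeds[:, 1])
    ddx = np.minimum(ddx, domain_size - ddx)
    ddy = np.minimum(ddy, domain_size - ddy)
    owner = np.argmin(ddx**2 + ddy**2, axis=-1)  # Voronoi cell per grid point
    for i in range(ngrains):
        etas[:, :, i] = (owner == i).astype(float)  # 1 inside cell i, else 0
    return etas

rng = np.random.default_rng(0)
seeds = rng.random((25, 2)) * 32.0       # 25 seeds in a 32 x 32 domain
etas = init_etas_from_seeds(seeds, 32.0, 64, 64, 0.5)
```

Each grid point belongs to exactly one cell, so the order parameters sum to one everywhere, matching the (64, 64, ngrains) layout requested in the prompt.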
Prompt 6
Modify the previous Voronoi tessellation and initialization of order param-
eters with a strongly decreasing Poisson point process. The gradient in the
grain size should be easily noticeable. Ensure that the final array of etas has
the same number of grains taken as an input.
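The gradient requested in Prompt 6 can be imitated by rejection sampling of the seed points; this is an illustrative sketch only, not the code ChatGPT 4 produced:

```python
import numpy as np

def graded_seeds(nseeds, domain_size, rng):
    """Rejection-sample seed points whose density decreases with height,
    giving small Voronoi cells at the bottom and large cells at the top."""
    seeds = []
    while len(seeds) < nseeds:
        x, y = rng.random(2) * domain_size
        # Acceptance probability drops with y, thinning the seeds on top
        if rng.random() < (1.0 - y / domain_size)**2:
            seeds.append((x, y))
    return np.array(seeds)

rng = np.random.default_rng(1)
seeds = graded_seeds(25, 32.0, rng)
```

Feeding such seeds into the Voronoi construction yields small cells where the acceptance probability is high and large cells where it is low, while the total number of grains stays fixed.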
The Python codes generated in response to the prompts mentioned in Sect. 2 are
provided in this section with their output plots showing grain growth evolution.
The generated code after prompt 3 is displayed in the Code Listing 3.1. Therein, the
number of steps was adapted manually to the target time period for grain evolution.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Parameters
Nx, Ny = 64, 64          # Grid size
dx, dy = 0.5, 0.5        # Grid spacing
ngrains = 3              # Number of grains
L = 5.0                  # Mobility coefficient
A = B = 1                # Free energy parameters
kappa = 0.1              # Gradient energy coefficient
dt = 0.005               # Time step
nsteps = 1000            # Number of steps
threshold = 0.5          # Threshold for area fraction
radius_range = (7, 14)   # Range of radii for the grains
volume_fraction_threshold = 0.001  # Threshold for grain existence

# Initialize the domain with circular grains
def initialize_grains(Nx, Ny, ngrains, radius_range):
    eta = np.zeros((Nx, Ny, ngrains))
    centers = np.random.rand(ngrains, 2) * np.array([Nx, Ny])
    radii = np.random.randint(radius_range[0], radius_range[1], ngrains)

    for i in range(ngrains):
        for x in range(Nx):
            for y in range(Ny):
                if (x - centers[i, 0])**2 + (y - centers[i, 1])**2 <= radii[i]**2:
                    eta[x, y, i] = 1.0
    return eta

# Function to approximate the Laplacian using a five-point stencil
def laplacian(eta, dx, dy):
    laplace_eta = (np.roll(eta, -1, axis=0) + np.roll(eta, 1, axis=0) - 2 * eta) / dx**2 + \
                  (np.roll(eta, -1, axis=1) + np.roll(eta, 1, axis=1) - 2 * eta) / dy**2
    return laplace_eta

# Phase field model evolution function
def phase_field_evolve(Nx, Ny, ngrains, dt, nsteps, eta_initial):
    eta_hist = [eta_initial]
    volume_fractions = np.zeros((nsteps, ngrains))
    grain_status = np.ones((nsteps, ngrains))

    for step in range(nsteps):
        eta_next = eta_hist[-1].copy()
        for i in range(ngrains):
            if grain_status[step - 1, i] == 0:  # Skip extinct grains
                continue
            eta_i = eta_hist[-1][:, :, i]
            sum_eta_squared = sum([eta_hist[-1][:, :, j]**2 for j in range(ngrains) if j != i])
            laplace_eta_i = laplacian(eta_i, dx, dy)

            # Discretized Allen-Cahn equation
            d_eta_dt = -L * (-A * eta_i + B * eta_i**3 + 2 * eta_i * sum_eta_squared - kappa * laplace_eta_i)
            eta_next[:, :, i] += d_eta_dt * dt

            # Enforce bounds on order parameters
            eta_next[:, :, i] = np.clip(eta_next[:, :, i], 0.0001, 0.999)

            # Calculate volume fraction
            volume_fraction = np.sum(eta_next[:, :, i]) / (Nx * Ny)
            volume_fractions[step, i] = volume_fraction
            if volume_fraction < volume_fraction_threshold:
                grain_status[step, i] = 0  # Mark grain as extinct

        eta_hist.append(eta_next)

    return eta_hist, volume_fractions, grain_status

# Function to compute the area fraction for each grain at each time step
def compute_area_fraction(eta_hist, Nx, Ny, threshold):
    area_fractions = np.zeros((len(eta_hist), ngrains))

    for step, eta in enumerate(eta_hist):
        for i in range(ngrains):
            eta_i = eta[:, :, i]
            area_fraction = np.sum(eta_i > threshold) / (Nx * Ny)
            area_fractions[step, i] = area_fraction

    return area_fractions

# Initialize grains
eta_initial = initialize_grains(Nx, Ny, ngrains, radius_range)

# Run the phase field evolution
eta_hist, volume_fractions, grain_status = phase_field_evolve(Nx, Ny, ngrains, dt, nsteps, eta_initial)

# Compute area fractions
area_fractions = compute_area_fraction(eta_hist, Nx, Ny, threshold)

# Plotting area fraction vs. time for each grain
plt.figure(figsize=(10, 6))
for i in range(ngrains):
    plt.plot(np.arange(nsteps + 1), area_fractions[:, i], label=f'Grain {i+1}')
plt.xlabel('Time step')
plt.ylabel('Area fraction')
plt.title('Area fraction vs. Time for each grain')
plt.legend()
plt.show()

# The create_animation function was generated in a separate prompt response
# with minor modifications.
from matplotlib.colors import Normalize

def create_animation_with_colorbar(eta_hist, interval=100, nprint=10):
    """
    Create an animation visualizing the grain growth over time with a colorbar,
    using the squared sum of eta values and ensuring safe normalization.

    Parameters:
    - eta_hist: A list of numpy arrays, each representing the state of the
      system at a given time step.
    - interval: Time interval between frames in milliseconds.
    - nprint: Interval of time steps to update the animation.

    Returns:
    - anim: The Matplotlib animation object.
    """
    fig, ax = plt.subplots()
    ax.set_title('Grain Growth Over Time')

    # Compute the squared sum of eta values for the initial frame and normalize
    data = np.sum(eta_hist[0]**2, axis=2)
    max_val = np.max(data) if np.max(data) != 0 else 1  # Avoid division by zero
    data_normalized = data / max_val
    im = ax.imshow(data_normalized, animated=True, cmap='viridis', norm=Normalize(vmin=0, vmax=1))

    # Create colorbar
    fig.colorbar(im, ax=ax)

    def update(frame):
        """Update the plot for the animation, normalizing the data for each frame."""
        new_data = np.sum(eta_hist[frame]**2, axis=2)
        max_val = np.max(new_data) if np.max(new_data) != 0 else 1  # Avoid division by zero
        new_data_normalized = new_data / max_val
        im.set_array(new_data_normalized)
        return (im,)

    anim = animation.FuncAnimation(fig, update, frames=range(0, len(eta_hist), nprint), interval=interval, blit=True)

    return anim

# Usage example (assuming eta_hist is already computed):
anim = create_animation_with_colorbar(eta_hist, interval=50, nprint=10)
# To display in a Jupyter Notebook:
from IPython.display import HTML
HTML(anim.to_html5_video())

# To save the animation as a file (uncomment and use in a local environment):
# anim.save('grain_growth_with_colorbar.mp4', writer='ffmpeg', dpi=300)
Code Listing 3.1 Code generated by ChatGPT 4 after prompt 3. The code includes initialization
of the domain and the numerical implementation as per steps specified in prompt 3
The output of the code creates a plot of area fractions with respect to time for
all grains and an animation of the grain growth. The function that creates the ani-
mation was modified with minor changes to increase sharpness and to add a colorbar.
The code also returns a list of arrays of dimension (Nx, Ny, ngrains) which stores
the order parameters over the grid for all grains at all time instants. Moreover, the
tracked grain status, area fractions and volume fractions are also obtained as output.
The initial (t = 0), intermediate (t = nsteps/2 · dt) and final (t = nsteps · dt) states
of the grain growth evolution are shown in Fig. 7; the final state refers to the end of
the time interval considered for the simulation.
Fig. 7 Grain growth evolution for the generalized initialization in prompt 3 with three randomly
located circular grains at the initial, intermediate (step 500) and final states. The total simulation was
conducted for 1000 time steps with a time-step size of 0.005. The domain size is 64 × 64 with a grid
spacing of 0.5
Case I with the ideal grain growth initialization was generated using the prompt 4.
The resulting initialized order parameters were used in the PF method code listed in
Code Listing 3.1.
import numpy as np
import matplotlib.pyplot as plt

def initialize_ideal_grain_growth(Nx, Ny, dx, dy, radius_factor, ngrains=2):
    """
    Initialize the order parameters for ideal grain growth with one circular
    grain embedded in a larger grain.
    Parameters:
    - Nx, Ny: Size of the rectangular grid.
    - dx, dy: Grid spacing.
    - radius_factor: Factor to determine the radius of the circular grain.
    - ngrains: Number of grains, default is 2.
    Returns:
    - eta: Initialized order parameters for each grain in the form of an
      array of shape (Nx, Ny, ngrains).
    """
    # Initialize the order parameter array
    eta = np.zeros((Nx, Ny, ngrains))
    # Calculate the center of the grid
    center_x, center_y = Nx / 2, Ny / 2
    # Calculate the radius of the circular grain
    radius = radius_factor * dx

    for x in range(Nx):
        for y in range(Ny):
            # Calculate the distance of each point from the center
            distance = np.sqrt((x - center_x)**2 + (y - center_y)**2)
            # Initialize grain 1 (the larger grain)
            if distance <= radius:
                eta[x, y, 0] = 0  # Inside the circular grain, set grain 1 to 0
                eta[x, y, 1] = 1  # Set grain 2 to 1
            else:
                eta[x, y, 0] = 1  # Outside the circular grain, set grain 1 to 1
    return eta

# Parameters for initialization
Nx, Ny = 64, 64     # Grid size
dx, dy = 0.5, 0.5   # Grid spacing
radius_factor = 28  # Multiplier for the radius based on dx

# Initialize the order parameters
eta_ideal_gg = initialize_ideal_grain_growth(Nx, Ny, dx, dy, radius_factor)

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
titles = ['Grain 1 (Large Grain)', 'Grain 2 (Embedded Circular Grain)']
for i in range(2):
    im = axes[i].imshow(eta_ideal_gg[:, :, i], cmap='viridis', origin='lower')
    axes[i].set_title(titles[i])
    axes[i].axis('off')
    plt.colorbar(im, ax=axes[i])
plt.tight_layout()
plt.show()
Code Listing 3.2 Code generated by ChatGPT 4 after prompt 4 for initialization based on ideal
grain growth case. The domain is initialized with a spherical grain embedded in a larger grain. The
code results in an array of order parameters based on this initialization
The output of the generated PF code was an animation. Here, the initial (t = 0),
intermediate (t = nsteps/2 · dt) and final (t = nsteps · dt) states of the ideal grain
growth case are shown in Fig. 8.
Fig. 8 Grain growth evolution for the ideal grain growth initialization as per prompt 4, with a spherical grain embedded in a larger grain. The evolution at the initial, intermediate (step 2500), and final states shows shrinkage of the spherical grain. The total simulation was conducted for 5000 time steps with a time-step size of 0.005. The domain size is 64 × 64 with a grid spacing of 0.5
Two types of Voronoi tessellations were considered for initialization. The first has randomly distributed seeds following a uniform distribution. The second has a decreasing gradient in the seed density, producing a variation in the grain size along the vertical axis. The corresponding generated codes are provided in Code Listing 3.3 and Code Listing 3.4.
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

def generate_voronoi(domain_size, ngrains):
    """
    Generate a Voronoi tessellation for a given domain size and number of grains.

    Parameters:
    - domain_size: The size of the domain.
    - ngrains: The number of grains to generate.

    Returns:
    - Voronoi tessellation object.
    """
    # Generate random points within the domain
    points = np.random.rand(ngrains, 2) * domain_size
    # Create the Voronoi tessellation
    vor = Voronoi(points)
    return vor

def plot_voronoi(vor, domain_size):
    """
    Plot a Voronoi tessellation, extending cells to the domain boundaries.

    Parameters:
    - vor: Voronoi tessellation object.
    - domain_size: The size of the domain.
    """
    fig, ax = plt.subplots()
    voronoi_plot_2d(vor, ax=ax, show_vertices=False, show_points=False)
    ax.set_xlim(0, domain_size)
    ax.set_ylim(0, domain_size)
    ax.set_title('Voronoi Tessellation with Extended Cells')
    plt.show()

def initialize_order_parameters(vor, grid_size, dx, ngrains):
    """
    Initialize the order parameters for the phase field method based on the
    Voronoi tessellation.

    Parameters:
    - vor: Voronoi tessellation object.
    - grid_size: The size of the grid.
    - dx: The grid spacing.
    - ngrains: The number of grains.

    Returns:
    - Initialized order parameters array.
    """
    eta = np.zeros((grid_size, grid_size, ngrains))
    for i in range(grid_size):
        for j in range(grid_size):
            # Find the closest Voronoi seed for each grid point
            x, y = (i + 0.5) * dx, (j + 0.5) * dx
            distances = np.sqrt((vor.points[:, 0] - x)**2 + (vor.points[:, 1] - y)**2)
            closest = np.argmin(distances)
            eta[i, j, closest] = 1.0
    return eta

# Parameters
domain_size = 32
ngrains = 25
grid_size = 64
dx = 0.5

# Generate and plot Voronoi tessellation
vor = generate_voronoi(domain_size, ngrains)
plot_voronoi(vor, domain_size)

# Initialize order parameters based on the Voronoi tessellation
eta = initialize_order_parameters(vor, grid_size, dx, ngrains)

72 R. Narkhede and B. Eidel
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

def generate_strongly_decreasing_points(domain_size, ngrains):
    """
    Generate points for a Poisson Voronoi tessellation with a strongly
    decreasing density, ensuring a noticeable gradient in grain size
    across the domain.
    """
    points = []
    while len(points) < ngrains:
        x = np.random.uniform(0, domain_size)
        y = np.random.uniform(0, domain_size)
        # Use a non-linear decreasing function for point density
        probability = (domain_size - x)**2 / domain_size**2  # Quadratically decreasing probability
        if np.random.rand() < probability:
            points.append([x, y])
        if len(points) > ngrains:  # Reset if over-generated
            points = []

    return np.array(points[:ngrains])

def initialize_order_parameters_exactly(grid_size, dx, points, ngrains):
    """
    Initialize the order parameters for the phase field method, ensuring
    the specified number of grains.
    """
    eta = np.zeros((grid_size, grid_size, ngrains))

    for i in range(grid_size):
        for j in range(grid_size):
            pos = np.array([i * dx, j * dy])  # dy is taken from module scope (dx = dy = 0.5 below)
            distances = np.linalg.norm(points - pos, axis=1)
            closest_point_index = np.argmin(distances)
            eta[i, j, closest_point_index] = 1.0

    # Ensure the eta array reflects exactly 'ngrains' grains
    assert eta.shape[2] == ngrains, "The number of grains in eta does not match ngrains."

    return eta

# Parameters remain the same as in the previous example
domain_size = 32
ngrains = 25
grid_size = 64
dx = dy = 0.5

# Generate points with a strongly decreasing process
points_strong_gradient = generate_strongly_decreasing_points(domain_size, ngrains)

# Create the Voronoi tessellation
voronoi_strong_gradient = Voronoi(points_strong_gradient)

# Initialize the order parameters
eta_strong_gradient = initialize_order_parameters_exactly(grid_size, dx, points_strong_gradient, ngrains)

# Plot the Voronoi tessellation with a strong gradient
fig, ax = plt.subplots()
voronoi_plot_2d(voronoi_strong_gradient, ax=ax, show_vertices=False)
ax.set_xlim(0, domain_size)
ax.set_ylim(0, domain_size)
ax.set_title("Poisson Voronoi Tessellation with Strong Gradient")
plt.show()
Code Listing 3.4 Code generated by ChatGPT 4 after prompt 6 for the initialization with a Voronoi tessellation having a decreasing gradient. The code results in an array of order parameters based on this initialization and a plot of the Voronoi tessellation. ChatGPT 4 assumes a suitable non-linear function for the point density, based on which the seeds of the Voronoi tessellation are placed
The initial (t = 0), intermediate (t = nsteps/2 · dt), and final (t = nsteps · dt) states of the Voronoi tessellation based grain growth are shown for the two cases in Figs. 9 and 12, respectively.
Fig. 9 Grain growth evolution for the general Voronoi tessellation based initialization as described
in prompt 5. The evolution at initial, intermediate (step 500) and final state is shown in the plots.
The total simulation was conducted for 1000 time steps and a time-step size of 0.005. The domain
size is 64 × 64 with a grid spacing of 0.5
In the current problem setting, errors that prevent the code from running stem from indexing errors and array dimension errors. Using ChatGPT 4 and adhering to the prompt engineering practices, it is observed that the code generated from the prompts runs without such errors. However, the logical and mathematical correctness of the numerical implementation needs to be thoroughly verified. We discuss these verification aspects in three stages:
• Initialization
• Numerical implementation
• Verification with expected physical observations
Minor errors in the specifics of plotting and visualization are easily rectified by instructing the chatbot in subsequent prompts. For example, the function that creates the animation, create_animation_with_colorbar, was modified with minor changes to produce an animation with greater sharpness and to add a colorbar.
4.1 Initialization
The two initialization cases considered are checked as follows. First, the dimensions of the output array of initialized order parameters are checked. These should be (Nx, Ny, ngrains), where (Nx, Ny) is the grid size of the domain and ngrains is the number of grains, both provided as input. Next, individual heat-maps of some of the initialized grains are plotted and verified visually. Figure 10 shows the initialization for the ideal grain growth case as per the prompt in Sect. 2.4.1.
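These two checks can be scripted directly. The sketch below builds a toy stand-in for an initialized order parameter array (the geometry and sizes here are illustrative, not taken from the generated code) and verifies the expected (Nx, Ny, ngrains) shape, plus the fact that exactly one order parameter is active at every grid point:

```python
import numpy as np

# Hypothetical stand-in for an initialized order parameter array:
# grain 1 fills the domain except for a circular patch assigned to grain 2
Nx, Ny, ngrains = 64, 64, 2
eta = np.zeros((Nx, Ny, ngrains))
X, Y = np.meshgrid(np.arange(Nx), np.arange(Ny), indexing='ij')
inside = (X - Nx // 2)**2 + (Y - Ny // 2)**2 <= 28**2
eta[..., 1] = inside.astype(float)
eta[..., 0] = 1.0 - eta[..., 1]

# Check 1: the array has the expected (Nx, Ny, ngrains) dimensions
assert eta.shape == (Nx, Ny, ngrains)
# Check 2: exactly one grain is active at every grid point
assert np.allclose(eta.sum(axis=2), 1.0)
```

The same shape and partition-of-unity assertions apply unchanged to the Voronoi-based initializations.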
Even though the codes run without errors, they can still have logical and mathematical
errors in terms of the numerical implementation. These are checked by the following
steps:
• Verifying that the initialized arrays used to store updated variables have the correct dimensions.
• Checking the correct implementation of the five-point stencil approximating the Laplacian operator. In some trials, it was observed that the division by the square of the step size was missing in the generated code.
• Checking the mathematically correct implementation of the discretized time-stepping equation (4). Specifically, the signs in front of each individual term need to be checked, and it must be ensured that the sum in the interaction term, i.e. ∑_{j≠i}^{N} (η_j^n)², has been correctly evaluated. There were instances where the condition j ≠ i in the summation operator was not strictly followed, and the overall sum of all η_j² resulted instead. Such issues were overcome by providing the discretized time-stepping equation in LaTeX format in the prompt.
• Checking if volume fraction is correctly computed and the condition for extinction
of grains is followed.
• Checking if the bounds on order parameters are enforced.
• Checking if any specified boundary conditions are imposed. Surprisingly, ChatGPT 4 omitted the line in prompt 3, instruction 2 (see Fig. 4), which instructs imposing Neumann boundary conditions on the right edge of the domain. However, this is easily rectified by instructing ChatGPT 4 in a subsequent prompt.
• Checking computation of area fractions.
• Verifying that the parameters mentioned in the prompt are correctly assigned before
running the specific functions of the code.
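Two of these checks can themselves be automated. The sketch below (function and variable names are illustrative, not from the generated code) tests a five-point stencil that includes the division by the squared step size, and builds the interaction sum so that the j = i term is excluded:

```python
import numpy as np

def laplacian(f, dx):
    # Five-point stencil with periodic boundaries; note the division by dx**2,
    # which was missing in some generated trials
    return (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
            + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4.0 * f) / dx**2

# Check the stencil on f = x^2 + y^2, whose exact Laplacian is 4 everywhere
N, dx = 64, 0.5
x = np.arange(N) * dx
X, Y = np.meshgrid(x, x, indexing='ij')
lap = laplacian(X**2 + Y**2, dx)
assert np.allclose(lap[2:-2, 2:-2], 4.0)  # exact away from the periodic wrap-around

# Interaction term: for each grain i, the sum over j != i of eta_j^2.
# Subtracting eta_i^2 from the total enforces the j != i condition.
eta = np.random.default_rng(0).random((8, 8, 3))
interaction = np.sum(eta**2, axis=2, keepdims=True) - eta**2
assert np.allclose(interaction[..., 0], eta[..., 1]**2 + eta[..., 2]**2)
```

The subtraction trick avoids an explicit loop over grains and makes the j ≠ i condition impossible to forget.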
Ideal grain growth simulated with the code generated by ChatGPT 4 can be used for verification against physical observations. For the shrinking spherical grain embedded in a large second grain, it is known that the change in radius of the shrinking grain can be approximated as

D² − D₀² = kt    (7)
Fig. 10 Verifying initialization for ideal grain growth case by plotting heat-maps of individual
grains from the initialized order parameter array. The heat-maps show the desired initialization of
the spherical grain 2 embedded in the larger grain 1
Phase Field Modeling of Grain Growth 77
Fig. 11 Area fraction versus time step plot for ideal grain growth case
where D and D₀ are the current and the initial grain radius, respectively, and k is a temperature-dependent constant. A similar behavior can be observed in the plot of area fractions versus time steps computed by the generated code. This is shown in Fig. 11, where the spherical grain has a radius of 28 dx, with dx the grid spacing, and the evolution is carried out with a time-step size of 0.005 for 5000 steps.
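Since the grain area scales with D², Eq. (7) implies that the area fraction of the shrinking grain decreases linearly in time, so the constant k can be estimated from a straight-line fit. A minimal sketch of this check, in which `area_fraction` is a hypothetical stand-in for the values recorded by the generated code:

```python
import numpy as np

dt, nsteps = 0.005, 5000
t = np.arange(nsteps) * dt

# Hypothetical stand-in: a linearly shrinking area fraction, clipped at zero
# once the grain has vanished (real data would come from the simulation)
area_fraction = np.clip(0.6 - 0.03 * t, 0.0, None)

# Fit a straight line over the time window before extinction
mask = area_fraction > 0
slope, intercept = np.polyfit(t[mask], area_fraction[mask], 1)
assert slope < 0  # the grain shrinks; |slope| is proportional to k in Eq. (7)
```

A small residual of the fit over the pre-extinction window indicates agreement with the kinetics of Eq. (7).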
Similarly, for the Voronoi tessellation based initialization, it is observed that the evolution follows the physical observation that larger grains grow and smaller ones eventually disappear. This can be observed in Fig. 12. The same is also reflected in the area fraction versus time step plot shown in Fig. 13. The area fractions of the two small grains with indices 3 and 6 decrease to zero, as shown by the red and pink plot lines, respectively.
5 Discussion
Fig. 12 Grain growth evolution for the Voronoi tessellation with a decreasing gradient as described in prompt 6. As indicated by the red ellipse, the two small grains shrink and eventually vanish
Fig. 13 Area fraction versus time step plot for the first 10 grains in the Voronoi tessellation with
a decreasing gradient as described in prompt 6. The area fractions of the two small grains with
indices 3 (red) and 6 (pink) decrease and reach zero indicating that these grains have disappeared.
The larger grains grow with varying growth rates
• Write clear instructions: Clarity in instructions leaves limited room for the LLM to make its own assumptions and, in worse cases, produce hallucinations. However, including too many unnecessary details is counterproductive. This was noticed when details of the Voronoi tessellation algorithm and the specific programming steps were mentioned in a detailed prompt to create the Voronoi tessellation based initialization. Such details caused programming errors such as those related to indexing and broadcasting. When only limited instructions were given with certain restrictions, such as periodic boundary conditions and a completely filled domain, as shown in prompt 5 in Fig. 6, the LLM produced error-free code with the desired outputs. This shows that prompts should include specific instructions that build upon the logic of solving the problem, but the specifics of programming those instructions into code need not be included.
• Split complex tasks into simpler subtasks: Breaking complex tasks into substeps is a good approach to provide the LLM with the rigorous logic of solving the problem. Instead of sequential prompting, one can also concatenate all the steps concisely, as done in prompt 3 shown in Fig. 3. Existing functions generated in response to previous prompts in one chat can be modified for different cases, as done for prompt 6, which modifies the function generated from prompt 5 shown in Fig. 6. However, recalling code or information from responses at an early stage of a long chat can cause errors due to the limited short-term memory of the model. This can be remedied by restating the code or information from such a previous response in the new prompt. This also helps in maintaining the consistency of code variables and functions as we proceed through sequential subtasks.
• Ask model to adopt a persona: It is observed that providing a system prompt, as shown in Fig. 1, improves the generation of relevant responses and maintains consistency in the style of code generation. Moreover, specific instructions to obtain a certain style of response can also be added to a system prompt if needed.
• Providing necessary equations: It is important to provide the context of the problem under consideration through the required equations. This enables the LLM to develop an understanding of the problem to be solved and improves the relevance of the responses. In our case, we provide all the equations of the model description in prompt 2 shown in Fig. 2, since the free energy function as defined in (2) is not a standard, well-known differential equation like the Laplace equation in heat transfer or Burgers' equation in fluid dynamics. Without specifying the equations in the model description, ChatGPT 4 assumed some equations which did not exactly resemble the equations describing the PF model.
ChatGPT 4 could also perform the relatively complex mathematical operation of differentiating the free energy function (2) and the local free energy density (3) and use the result in the non-conserved Allen-Cahn equation (1). Further, it implemented the specified numerical methods to obtain the discretized Eq. (4) for the time-stepping.
Designing a prompt that adheres to these tactics is an iterative process and may require several trials. It is usually not prudent to continue with one chat for multiple variations of the same prompt. Rectifying errors generated by a sub-optimal prompt can work when only a few errors are present. However, if the generated code contains several syntax, semantic, or logical errors, it is advisable to redraft a better prompt and use it in a new chat. This iterative process of prompt design also underscores the importance of human intervention when using ChatGPT 4 for code generation.
6 Conclusion
ChatGPT 4 has been used to generate a PF code for grain growth evolution. The
Fan and Chen [6] model has been applied and the equations describing the model
have been solved numerically using the explicit Euler forward method along with
the five-point stencil of the finite difference method for approximating the Laplace
operator. The complex problem has been broken into steps that instruct ChatGPT 4
to generate the code. To this end, specific prompt engineering tactics have been used
to design prompts that generate a PF method code which can be applied to generalized initializations of the order parameters. Specific applications of this code are shown by developing two initialization cases: (1) ideal grain growth with a spherical grain embedded in a larger grain, and (2) Voronoi tessellations. The initialization has also been carried out by code generated by ChatGPT 4 using suitable prompts. The resulting code generated by ChatGPT 4 is free of syntax, semantic, and logical errors. Mathematical correctness of the numerical implementation has
been ensured by rigorous checking of the generated code. The final results have
been verified with expected results based on general physical observations in grain
growth evolution. Details of the prompt engineering tactics applied in the design of
the prompt have been explained.
As an extension of this work, newer prompt engineering methods like zero-shot chain-of-thought prompting [8] can be used, which allow the LLM to generate its own reasoning for solving the problem. This reasoning can serve as a starting point for suitable prompt design. Another approach to solving complex code generation tasks is to provide the steps to solve the task in the form of a code template with comments. The aim is to give the LLM some direction for solving the problem as well as a template to structure the code, and to allow it to develop detailed solutions based on this limited direction. For complex initializations, such as a specific grain microstructure, an image of it can be provided as input in the prompt, along with the required conditions, asking ChatGPT 4 to generate code that produces a similarly initialized microstructure. With LLMs like ChatGPT 4 and their continuous development, prompt engineering based code generation will become an indispensable approach for computational materials scientists to tackle complex problems.
References
1. P. Blikstein and A. P. Tschiptschin. Monte Carlo simulation of grain growth. Materials Research,
2:133–137, 1999.
2. F. Wakai, N. Enomoto, and H. Ogawa. Three-dimensional microstructural evolution in ideal
grain growth-general statistics. Acta Materialia, 48(6):1297–1311, 2000.
3. H. J. Frost and C. V. Thompson. Computer simulation of grain growth. Current Opinion in Solid
State and Materials Science, 1(3):361–368, 1996.
4. Y. Liu, T. Baudin, and R. Penelle. Simulation of normal grain growth by cellular automata. Scripta Materialia, 34(11), 1996.
5. S. Biner. Solving phase-field models with finite difference algorithms. In Programming Phase-Field Modeling, pp. 17–97, 2017.
6. D. Fan and L.-Q. Chen. Computer simulation of grain growth using a continuum field model.
Acta Materialia, 45(2):611–622, 1997.
7. D. Weygand, Y. Brechet, and J. Lepinoux. A vertex dynamics simulation of grain growth in two
dimensions. Philosophical Magazine B, 78(4):329–352, 1998.
8. T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa. Large language models are zero-shot
reasoners, 2023.
Modeling Corrosion Using a Cellular
Automaton
1 Introduction
Cellular Automata (CA) are computational models that utilize discrete grid systems
to simulate the evolution of complex systems over time. Each cell within the grid
can exist in a finite number of states and interact with its neighboring cells based on
predefined rules. The classification of cells in the corrosion process is illustrated in
Fig. 1. This figure shows how each cell transitions during the corrosion process, highlighting the different states and interactions that are critical to the simulation model. The diagram helps to visualize the cellular automaton framework.
Fig. 1 The cell classification model and how each cell changes in the corrosion process
Modeling Corrosion Using a Cellular Automaton 85
∂C/∂t = D ∇²C ,    (1)
where C is the concentration, t is time, and D is the diffusion coefficient. This equation is discretized and applied within the CA framework to simulate the diffusion
of corrosive agents.
86 M. Bakhshi Zadeh and B. Eidel
The implementation of FSL in CA involves discretizing the spatial domain into a grid where each cell represents a concentration of the diffusing
species and updating the concentration based on the diffusion coefficient and the
concentration gradient. This approach allows for a detailed representation of the
spatial and temporal evolution of the concentration field, capturing the effects of
diffusion on the corrosion process.
Reaction kinetics (RK) are incorporated to model the chemical reactions occurring
at the material’s surface, which contribute to the corrosion process. The rate of these
reactions is governed by factors such as temperature, concentration of reactants, and
the presence of catalytic agents. In our CA model, these reactions are represented
through state transitions of the cells, where the probability of transition depends on
the local concentration of reactive species. For example, the interaction between the
diffusing oxygen and metal atoms to form oxides can be modeled as a probabilistic
state transition [3]. This probabilistic approach ensures that the model can capture
the inherent variability and stochastic nature of corrosion reactions.
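As an illustration, such a probabilistic oxide-forming transition can be sketched as follows; the state encoding, the `rate` constant, and the function name are assumptions for this example, not the chapter's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed keeps the stochastic step reproducible

def oxidation_step(state, oxygen, rate=0.5):
    """One CA step: a metal cell (0) becomes oxide (1) with a probability
    proportional to the local oxygen concentration."""
    p = np.clip(rate * oxygen, 0.0, 1.0)
    flips = (rng.random(state.shape) < p) & (state == 0)
    new_state = state.copy()
    new_state[flips] = 1
    return new_state

metal = np.zeros((4, 4), dtype=int)
oxygen = np.full((4, 4), 0.8)  # uniform oxygen concentration
after = oxidation_step(metal, oxygen)
assert after.shape == (4, 4) and set(np.unique(after)) <= {0, 1}
```

Coupling the transition probability to the local concentration field is what lets the CA reproduce the stochastic character of the reaction while remaining grid-local.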
In the context of high-temperature corrosion, the diffusion of oxygen and chloride ions through molten salt layers and their reaction with metal surfaces is critical. Wang et al. [1] provide a detailed framework for modeling these interactions, showing that the formation of protective layers and the degradation of the metal substrate can be effectively simulated using CA. This detailed framework is essential for understanding how different environmental conditions and material properties influence
the corrosion process, enabling the development of targeted strategies for corrosion
mitigation.
This study aims to address the challenge of predicting and managing corrosion in
industrial applications by utilizing a 2D CA model. The specific problem involves
simulating the diffusion of corrosive species and the resultant chemical reactions
using FSL and reaction kinetics. The parameters used in the corrosion simulation are
summarized in Table 1.
In the following, we provide the prompt used to generate the algorithm for the
corrosion modeling by ChatGPT-4. This prompt is specifically used to instruct the
AI to create an algorithm based on FSL of diffusion and reaction kinetics within a
2D cellular automaton framework.
The algorithm, generated as LaTeX pseudocode by the prompt in Fig. 2, is given in Algorithm Box 3.
Fig. 2 Prompt for generating LaTeX pseudocode of an algorithm for corrosion simulation along with CA
10 end
11 Corrosion Reaction Kinetics:
12 foreach grid point (i, j) do
13     Calculate the local potential V(i, j)^n based on the concentration C(i, j)^n and other factors (e.g., electrolyte, material properties);
14     if V(i, j)^n > V_th and S(i, j)^n = 1 then
15         Update concentration due to anodic reaction;
16     end
17     else if V(i, j)^n < V_th and S(i, j)^n = 1 then
18         Update concentration due to cathodic reaction;
19     end
20 end
21 Cellular Automaton Update:
22 foreach grid point (i, j) do
23     Update the corrosion state S(i, j)^{n+1} based on the local concentration C(i, j)^{n+1} and the neighboring states:
24     if C(i, j)^{n+1} > C_crit or neighboring sites S(i+1, j), S(i-1, j), S(i, j+1), S(i, j-1) are corroded then
25         S(i, j)^{n+1} = 1;  // Activate corrosion if the concentration exceeds the threshold or neighboring sites are corroded
26     end
27 end
28 Apply Boundary Conditions:
29 Apply appropriate boundary conditions (e.g., Dirichlet, Neumann) to the concentration and corrosion state fields;
30 end
31 Output:
32 Return the concentration field C(x, t) and corrosion state S(x, t) over time;
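Steps 21–27 of the algorithm box can be sketched in Python as below; the threshold value and the periodic neighbor handling via `np.roll` are assumptions of this illustration, not the chapter's final code:

```python
import numpy as np

def ca_update(S, C, C_crit=0.5):
    """A site corrodes (S = 1) if its concentration exceeds C_crit or if one
    of its von Neumann neighbors is already corroded (periodic boundaries)."""
    corroded_neighbor = (
        (np.roll(S, 1, axis=0) == 1) | (np.roll(S, -1, axis=0) == 1) |
        (np.roll(S, 1, axis=1) == 1) | (np.roll(S, -1, axis=1) == 1)
    )
    S_next = S.copy()
    S_next[(C > C_crit) | corroded_neighbor] = 1
    return S_next

S = np.zeros((5, 5), dtype=int)
S[2, 2] = 1               # a single corroded seed cell
C = np.zeros((5, 5))      # concentration everywhere below the threshold
S1 = ca_update(S, C)
assert S1.sum() == 5      # the seed plus its four von Neumann neighbors corrode
```

Vectorizing the neighbor check with shifted copies of the state array avoids an explicit double loop over grid points.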
2 Prompt
In the following, the sketch of the simulation domain in Fig. 3 with its features is
introduced and explained as follows. The coordinate system, labeled with x and y axes, intersects at the origin (0, 0), defining the framework of the simulation.
Fig. 3 Detailed sketch illustrating the 2D corrosion simulation process on metal surfaces using a
Cellular Automaton, highlighting critical parts, diffusion paths, and state transitions
Fig. 4 (continued)
3 Code Listing
4 Verification
This section verifies the implementation of FSL in the corrosion simulation code.
Since the numerical solution of the Initial Boundary Value Problem (IBVP) is carried out by an explicit finite difference (FD) method, the verification comprises a test of the FD scheme, a check of the stability conditions required by the explicit scheme, and unit tests.
• Finite Difference Scheme Verification. Fick's second law in two dimensions is given in (2):

∂C(x, y, t)/∂t = D ( ∂²C(x, y, t)/∂x² + ∂²C(x, y, t)/∂y² )    (2)

Given the parameters Δx = Δy = 1 and the diffusion coefficients D_top = 1.4 and D_bottom = 1.35, the stability limits are obtained as Δt ≤ 1²/(2 · 1.4) ≈ 0.357 and Δt ≤ 1²/(2 · 1.35) ≈ 0.370 for the top and bottom regions, respectively. With Δt set to 0.2 in the simulations, the stability condition is fulfilled.
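The stability check can be expressed directly in code; this sketch simply recomputes the limits stated above:

```python
dx = 1.0
dt = 0.2
D_top, D_bottom = 1.4, 1.35

# Explicit scheme stability limit dt <= dx^2 / (2 D), evaluated per region
dt_max_top = dx**2 / (2 * D_top)        # ~0.357
dt_max_bottom = dx**2 / (2 * D_bottom)  # ~0.370
assert dt <= min(dt_max_top, dt_max_bottom)  # dt = 0.2 is stable in both regions
```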
• Unit Tests Based on a Specific Diffusion Function: The verification code, as provided in Code Listing 4.2, is generated using ChatGPT-4o based on the algorithm outlined in Algorithm Box 4. In these tests, a simplified version of the diffusion function is used, excluding the random factor for clarity.
Output:
• C^{n+1}: 2D array representing the updated concentration field after one time step
Steps:
1. Initialize C^{n+1} ← C (create a copy of the initial concentration array).
2. For each interior grid point (i, j) where 1 ≤ i ≤ Nx − 2 and 1 ≤ j ≤ Ny − 2:
   a. If C[i, j] ≠ −1 and C[i, j] ≠ −0.75 then:
      i. Update C^{n+1}[i, j] using the following formula:
         C(i, j)^{n+1} = C(i, j)^n + D Δt [ (C(i+1, j)^n − 2 C(i, j)^n + C(i−1, j)^n) / Δx²
                                           + (C(i, j+1)^n − 2 C(i, j)^n + C(i, j−1)^n) / Δy² ]
   b. End If.
3. End For.
4. Return C^{n+1} (the updated concentration field).
import numpy as np

def diffuse_test(C, D, delta_t, delta_x, delta_y):
    C_n_plus_1 = C.copy()
    for i in range(1, C.shape[0] - 1):
        for j in range(1, C.shape[1] - 1):
            # Skip the special cells marked as -1 and -0.75
            if C[i, j] != -1 and C[i, j] != -0.75:
                C_n_plus_1[i, j] += D * delta_t * ((C[i+1, j] - 2 * C[i, j] + C[i-1, j]) / delta_x ** 2
                                                   + (C[i, j+1] - 2 * C[i, j] + C[i, j-1]) / delta_y ** 2)
    return C_n_plus_1

def test_diffusion():
    C = np.zeros((5, 5))
    C[2, 2] = 1  # Initial concentration at the center
    D = 1.0
    delta_t = 0.2  # Time step for stability
    delta_x = delta_y = 1  # Grid spacing
    C_n_plus_1 = diffuse_test(C, D, delta_t, delta_x, delta_y)
    expected_center = 1 - 4 * D * delta_t / delta_x ** 2
    expected_adjacent = D * delta_t / delta_x ** 2
    assert np.isclose(C_n_plus_1[2, 2], expected_center), f"Central value should decrease to {expected_center}"
    assert np.isclose(C_n_plus_1[1, 2], expected_adjacent), f"Adjacent cells should increase to {expected_adjacent}"

test_diffusion()
print("Test passed!")
Code Listing 4.2 This code is used to verify the implementation of FSL
4.1 Results
Initial conditions are set with specific cells at the corrosion threshold. The reaction kinetics function is applied, with debug statements providing insights into the process. As expected, the cells at [4, 5] and [6, 5] corrode, confirming the function's behavior under controlled conditions, as shown in Fig. 5.
The verification process demonstrates that the RK implementation exhibits the expected behavior for the given parameters. By using a fixed random seed and forcing specific cells to corrode, deterministic outcomes are ensured, confirming the implementation's correctness. This approach can be extended to validate other aspects of the corrosion simulation.
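The fixed-seed tactic can be checked in isolation: seeding the generator identically must reproduce the same random draws, which makes the stochastic corrosion step deterministic for testing. A minimal sketch (the helper name is illustrative):

```python
import numpy as np

def draws(seed, n=5):
    # A stand-in for any stochastic step in the CA model
    rng = np.random.default_rng(seed)
    return rng.random(n)

# Identical seeds reproduce identical sequences, so a stochastic CA step
# driven by this generator yields deterministic, testable outcomes
assert np.array_equal(draws(123), draws(123))
# Different seeds (almost surely) produce different sequences
assert not np.array_equal(draws(123), draws(124))
```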
The code provided in the repository [6] is used to specify the initial and boundary conditions of the problem. The feasibility of the approach is further supported by verifying that the obtained results match the expected results with a high degree of accuracy.
Fig. 5 (a) Initial concentration matrix with cells [4, 5] and [6, 5] meeting the corrosion threshold; (b) final matrix after corrosion, with cells [4, 5] and [6, 5] corroded
5 Discussion
• Error in Updating Cells: Initially, cells outside the damage area did not corrode.
A new prompt for the reaction kinetics function corrected this, ensuring the entire
metal could corrode.
• Neglect of Defined Boundary Conditions: Corrosion is intended to occur only if a neighboring cell is corroded, but the initial code allowed corrosion anywhere. This problem is fixed by adding a new if statement with the appropriate condition.
• Incorrect Display of Primary Damage: The chatbot struggled to display the initial damage correctly. The problem is solved by adjusting the order in which the cell types are defined.
The prompt shown in Fig. 6 is used to edit the code generated by the initial prompt
shown in Fig. 4. The final code is shown in Code Listing 4.1.
Fig. 6 (continued)
This section presents visual documentation of the corrosion simulation process at different stages, highlighting in Fig. 7(a) the initial, in (b) an intermediate, in (c) an advanced, and in (d) the final stage of the corrosion progress. Faster progress in the upper part of the simulation domain, due to the larger diffusion coefficient there (compare the values in Table 1), is clearly visible.
6 Conclusion
The objective of this study was to develop a robust 2D CA model for simulating corrosion processes, incorporating FSL of diffusion and RK. The simulation tool serves to predict the initiation and propagation of corrosion under various environmental conditions, with the ultimate goal of effective corrosion prevention and control strategies.
The main results of the study are summarized as follows:
Model Development: A 2D CA model was successfully developed to simulate
corrosion. The model accurately captures the transport of corrosive species and the
subsequent chemical reactions at the material’s surface.
Diffusion Verification: The implementation of FSL was rigorously tested. The
verification process confirmed that the model correctly simulates the diffusion pro-
cess, adhering to theoretical expectations and stability conditions.
Reaction Kinetics Verification: The reaction kinetics were verified through con-
trolled testing. The results show that the model accurately simulates the corrosion
process, taking into account the different corrosion rates for various metals and the
influence of initial conditions.
Simulation Results: The simulation effectively demonstrated the progression of
corrosion, showcasing the protective layer’s impact and the different corrosion rates
of the metals. The refined prompts and code adjustments ensured that the simulation
adhered to the specified conditions.
Fig. 7 Different stages a–d of the corrosion process along with e temporal evolution of the different
cell types
106 M. Bakhshi Zadeh and B. Eidel
During the development of the 2D CA model for simulating corrosion, several chal-
lenges were encountered with the initial code generated by ChatGPT. Major issues
include improper rule enforcement, such as allowing corroded cells to revert to their
original state, and visualization challenges that require manual fixes. Minor issues
involve plot labeling errors resolved by adjusting the plot size. The initial prompt
lacks sufficient details, leading to issues like incorrect implementation of FSL and
inadequate initial conditions. These issues are addressed through iterative prompts
and manual edits.
Despite these challenges, the process highlights the importance of precise and
detailed prompts to ensure the accuracy of the generated code. While the structure
of the code is consistent, variations in finer details emphasize the need for careful
review and expert oversight. ChatGPT proves to be a powerful research assistant,
significantly aiding the development process. However, its effectiveness depends on
the clarity and specificity of the prompts, as well as the expertise applied in reviewing
and refining its outputs.
Future research can focus on extending the model to include more complex envi-
ronmental conditions and different types of materials, and incorporating machine
learning techniques to enhance the model’s predictive capabilities.
In conclusion, the developed 2D CA model for simulating corrosion is a robust tool
for predicting and managing corrosion processes. The iterative process of refining
prompts and verifying the model ensured that the final simulation met the desired
standards of accuracy and reliability. This study underscores the potential of AI-
assisted tools like ChatGPT in scientific research, while emphasizing the need for
continuous refinement and expert oversight.
References
1. Wang, Y., Guan, B., Wei, X., Lu, J., Ding, J. High-temperature corrosion of Ni-based alloys in
chloride molten salts: A cellular automaton model. Corrosion Science. 150, pp. 17-28 (2019).
https://doi.org/10.1016/j.solmat.2019.110170.
2. Reinoso-Burrows, J.C., Toro, N., Cortés, M. Cellular automata modeling as a tool in corrosion
management. Materials. 16(6051) (2023). https://doi.org/10.3390/ma16176051.
3. Stafiej, J., di Caprio, D., Bartosik, L. Corrosion-passivation processes in a cellular automata-
based simulation study. Journal of Supercomputing. 65, pp. 697-709 (2013). https://doi.org/10.
1007/s11227-013-0933-8.
4. Valor, A., Caleyo, F., Hallen, J.M. Pitting corrosion modeling in low-carbon steels using cellular
automata. Corrosion. 66(2), pp. 025004-1–025004-8 (2010).
5. Chen, H., Chen, Y., Zhang, J. Cellular automaton modeling on the corrosion/oxidation mecha-
nism of steel in a liquid metal environment. Progress in Nuclear Energy. 50(5-6), pp. 587–593
(2008). Elsevier. https://doi.org/10.1016/j.pnucene.2007.11.044.
6. Gooo, T. Cellular Automata Corrosion. (2021). Available at: https://github.com/tracygooo/
cellular-automata-corrosion. Code posted: 21 June 2021, Accessed: 5 May 2024.
7. Chen, M.C., Wen, Q.Q. Simulation of corrosion process for structure with the cellular automata
method. IOP Conference Series: Materials Science and Engineering. 216, 012012 (2017). IOP
Publishing. https://doi.org/10.1088/1757-899X/216/1/012012.
8. Xiao, Z., Jun, H., Yuqi, W., Maosheng, Z., Zaoxiao, Z. Simulation of pitting corrosion for Ni-
based alloy using a cellular automata model. Rare Metal Materials and Engineering. 44(10),
pp. 2347–2352 (2015). Elsevier. https://doi.org/10.1016/S1875-5372(16)30018-2
Instationary Heat Conduction
on Rectangular Domains with Arbitrary
Circular Holes
Abstract This chapter presents the generation and evaluation of Python code for
simulating two-dimensional, unsteady heat conduction in rectangular domains with
multiple holes using ChatGPT-4. The focus is on solving the Fourier heat conduc-
tion equation through the Forward Time Centered Space (FTCS) scheme, a finite
difference method. The chapter begins with a detailed derivation of the algorithmic
approach, transitioning from the governing partial differential equation to the discrete
FTCS scheme for temperature updates. The accuracy and reliability of the generated
code are rigorously tested, including a comparison with analytical solutions, demon-
strating the effectiveness of AI-assisted coding in computational materials science
and mechanics.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 109
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_5
110 A. Neelakandan and B. Eidel
where $T$ represents the temperature, $t$ is time, $x$ and $y$ are the spatial coordinates,
and $\alpha$ is the thermal diffusivity. This equation expresses the conservation of energy;
it relates the temporal change in temperature to the spatial variations and accounts
for the thermal diffusivity of the material.
The FTCS scheme is a numerical method used to approximate the solutions of
partial differential equations. It is based on the Finite Difference Method (FDM)
and discretizes both space and time. The FTCS scheme approximates the spatial
derivatives using central differences and the time derivative using the forward
Euler method.
To apply the FTCS scheme, we discretize the spatial and temporal domains. Let
$\Delta x$ and $\Delta y$ be the spatial step sizes in the $x$ and $y$ directions, respectively, and let
$\Delta t$ be the time step. Define the grid points as $x_i = i\,\Delta x$ and $y_j = j\,\Delta y$ for integers
$i$ and $j$, and $t_n = n\,\Delta t$ for integer $n$. The temperature at a grid point $(x_i, y_j)$ at time
$t_n$ is denoted as $T_{i,j}^{n}$.
The finite difference approximations for the first and second derivatives are given
by

$$\frac{\partial T}{\partial t} \approx \frac{T_{i,j}^{n+1} - T_{i,j}^{n}}{\Delta t}, \qquad (2)$$

$$\frac{\partial^2 T}{\partial x^2} \approx \frac{T_{i+1,j}^{n} - 2T_{i,j}^{n} + T_{i-1,j}^{n}}{(\Delta x)^2}, \qquad (3)$$

$$\frac{\partial^2 T}{\partial y^2} \approx \frac{T_{i,j+1}^{n} - 2T_{i,j}^{n} + T_{i,j-1}^{n}}{(\Delta y)^2}. \qquad (4)$$
Substituting equations (2), (3), and (4) into the heat conduction equation (1), we
obtain the FTCS scheme

$$T_{i,j}^{n+1} = T_{i,j}^{n} + \alpha\,\Delta t \left( \frac{T_{i+1,j}^{n} - 2T_{i,j}^{n} + T_{i-1,j}^{n}}{(\Delta x)^2} + \frac{T_{i,j+1}^{n} - 2T_{i,j}^{n} + T_{i,j-1}^{n}}{(\Delta y)^2} \right), \qquad (5)$$

where $T_{i,j}^{n+1}$ is the temperature at time $t_{n+1}$ calculated from the known temperature $T_{i,j}^{n}$
at time $t_n$, and $\Delta t = t_{n+1} - t_n$ is the discretized time step, with $\Delta x$ and $\Delta y$ being the
spatial step sizes, as shown in Fig. 1.
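One update according to Eq. (5) can also be written in vectorized form with array slicing. The sketch below is illustrative, not the generated code of Listing 5.1: the function name and the choice to hold the boundary rows and columns fixed (Dirichlet values) are assumptions.

```python
import numpy as np

def ftcs_step(T, alpha, dt, dx, dy):
    """One FTCS update of Eq. (5): interior points are advanced,
    boundary values are left unchanged (fixed Dirichlet values here)."""
    Tn = T.copy()
    Tn[1:-1, 1:-1] = T[1:-1, 1:-1] + alpha * dt * (
        # central difference in x (rows) ...
        (T[2:, 1:-1] - 2.0 * T[1:-1, 1:-1] + T[:-2, 1:-1]) / dx**2 +
        # ... and in y (columns)
        (T[1:-1, 2:] - 2.0 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dy**2
    )
    return Tn
```

Note that the explicit scheme is only stable if $\alpha\,\Delta t\,(1/\Delta x^2 + 1/\Delta y^2) \le 1/2$; for $\Delta x = \Delta y$ this reduces to $\Delta t \le \Delta x^2/(4\alpha)$.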
The algorithm for instationary heat conduction is given in Algorithm 1, and the
prompt used for generating it is shown in Fig. 2.
Fig. 1 2D representation of the FTCS scheme based on FDM, indicating the spatial points
($T_{i,j}^{n}$, $T_{i+1,j}^{n}$, $T_{i-1,j}^{n}$, $T_{i,j+1}^{n}$, $T_{i,j-1}^{n}$) used for calculating the next temporal point ($T_{i,j}^{n+1}$)
Generate the LaTeX code for an algorithm that solves a 2D transient heat
conduction problem using the Forward Time Centered Space (FTCS) scheme.
The algorithm should be formatted using the algorithm environment in LaTeX.
Fig. 2 Prompt for generating the algorithm in LaTeX pseudocode for instationary heat conduction
using an FTCS scheme in 2D
9 end
10 Apply Boundary Conditions:
11 Apply the specified boundary conditions to the temperature field $T(x, t)$;
12 end
13 Output:
14 Return the temperature field $T(x, t)$ over time;
2 Prompt
The prompt aims to generate code with ChatGPT-4 that numerically computes the
time-dependent temperature field in a rectangular 2D domain of arbitrary dimensions,
with an arbitrary number of different circular holes in the domain, by applying the
FTCS scheme introduced above to solve the heat conduction equation.
For formulating a prompt for ChatGPT-4 to generate a 2D transient heat conduc-
tion simulation in Python using the FTCS method, we have to mention every critical
parameter. These parameters encompass plate dimensions, thermal diffusivity, spa-
tial step, and maximum iteration time. The prompt 5.2 outlines precise boundary
conditions, incorporating Dirichlet at the top and Neumann on the other sides while
describing hole configurations with coordinates, radius, type, and value. The prompt
also includes defining functions for grid initialization, hole condition application,
temperature distribution computation, and high-resolution visual output generation.
Experiments with prompts reveal that some functions need heightened emphasis,
such as hole evolution, visualization, and verbose logging. The purpose of verbose
logging is to provide extensive, detailed information about the program's status,
often used for debugging or monitoring. The prompt also calls for library availability
checks, prioritizing OpenCV for video creation and seamlessly falling back to Pillow
if required. The anticipated outcome is a meticulously documented, self-contained
script, poised to deliver precise simulations with explicit hole evolution visibility,
archived as both a video and a GIF (Fig. 3).
3 Generation Results
ChatGPT-4 cannot consistently generate identical results because of its
non-deterministic nature, which is influenced by random sampling during generation
and by contextual dependencies. The model's training on a diverse dataset also
contributes to response variability. To alleviate these deficiencies and speed up programming, one
Fig. 4 Contour plot of the plate domain (quantities on the abscissa $x$ and ordinate $y$ in mm) with
two circular holes at time step 749, with a time step size of $\Delta t = 0.125$ s. The top boundary
is maintained at a temperature of 100 K, while the other boundaries are kept at 0 K. The first hole,
centered at (10, 35), has a temperature of 0 K, and the second hole, centered at (30, 27), has a
temperature of 100 K
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from PIL import Image
import os

# Check for OpenCV availability
try:
    import cv2
    OPENCV_AVAILABLE = True
except ImportError:
    OPENCV_AVAILABLE = False

# Fixed Numerical Parameters
plate_length = 50.0
max_iter_time = 750
alpha = 2.0
delta_x = 1.0
delta_t = (delta_x ** 2) / (4 * alpha)  # Stability criterion
gamma = alpha * delta_t / (delta_x ** 2)

# Boundary Conditions
boundaries = {'top': 100.0, 'bottom': 0.0, 'left': 0.0, 'right': 0.0}

# Hole Configuration
holes = [{'center': (25, 25), 'radius': 5, 'type': 'D', 'value': 0.0}]

def initialize_grid():
    """Initializes the temperature grid with boundary conditions."""
    grid = np.zeros((int(plate_length), int(plate_length)))
    grid[0, :] = boundaries['top']
    # Neumann conditions on other sides are implicitly set to zero
    return grid

def apply_hole_conditions(grid):
    """Applies hole configurations to the grid."""
    for hole in holes:
        x_center, y_center = hole['center']
        radius = hole['radius']
        for i in range(grid.shape[0]):
            for j in range(grid.shape[1]):
                if (i - x_center) ** 2 + (j - y_center) ** 2 < radius ** 2:
                    grid[i, j] = hole['value']
    return grid

def FTCS(grid):
    """Performs a single time step of the FTCS method."""
    new_grid = np.copy(grid)
    for i in range(1, grid.shape[0] - 1):
        for j in range(1, grid.shape[1] - 1):
            new_grid[i, j] = grid[i, j] + gamma * (grid[i + 1, j] + grid[i - 1, j]
                                                   + grid[i, j + 1] + grid[i, j - 1]
                                                   - 4 * grid[i, j])
    return apply_hole_conditions(new_grid)

def save_frame(frame, iter_time, frames_dir):
    """Saves a single frame of the simulation."""
    plt.imshow(frame, cmap='hot', interpolation='nearest')
    plt.title(f'Time Step: {iter_time}')
    plt.colorbar()
    plt.savefig(f"{frames_dir}/frame_{iter_time:04d}.png")
    plt.close()

def compile_video(frames_dir, output_file):
    """Compiles frames into a video."""
    if not OPENCV_AVAILABLE:
        return
    images = [img for img in os.listdir(frames_dir) if img.endswith(".png")]
    frame = cv2.imread(os.path.join(frames_dir, images[0]))
    height, width, layers = frame.shape
    video = cv2.VideoWriter(output_file, cv2.VideoWriter_fourcc(*'DIVX'), 15,
                            (width, height))
    for image in images:
        video.write(cv2.imread(os.path.join(frames_dir, image)))
    cv2.destroyAllWindows()
    video.release()

def compile_gif(frames_dir, output_file):
    """Compiles frames into a GIF."""
    images = [Image.open(os.path.join(frames_dir, img))
              for img in os.listdir(frames_dir) if img.endswith(".png")]
    images[0].save(output_file, save_all=True, append_images=images[1:],
                   optimize=False, duration=40, loop=0)

def run_simulation():
    """Runs the entire simulation."""
    grid = initialize_grid()
    grid = apply_hole_conditions(grid)
    frames_dir = 'frames'
    os.makedirs(frames_dir, exist_ok=True)

    for iter_time in range(max_iter_time):
        print(f"Iteration {iter_time}/{max_iter_time}")
        save_frame(grid, iter_time, frames_dir)
        grid = FTCS(grid)

    if OPENCV_AVAILABLE:
        compile_video(frames_dir, 'heat_conduction_simulation.avi')
    else:
        compile_gif(frames_dir, 'heat_conduction_simulation.gif')

if __name__ == "__main__":
    run_simulation()
Code Listing 5.1 Output by ChatGPT for solving 2D Fourier transient heat conduction problem
Fig. 5 MSE plotted in logarithmic scale between FTCS and Analytical Solutions over time steps
along the middle of the plate
The analytical solution used for comparison reads

$$T(x, t) = T_1 - (T_1 - T_2)\,\operatorname{erf}\!\left(\frac{x}{2\sqrt{\alpha t}}\right), \qquad (6)$$

where

$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^{z} e^{-y^2}\, dy. \qquad (7)$$

The mean squared error (MSE) between the analytical and the numerical solution is
defined as

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(T_{\mathrm{analytical},i} - T_{\mathrm{FTCS},i}\right)^2, \qquad (8)$$

where $n$ is the number of data points along the plate's midsection, $T_{\mathrm{analytical},i}$ is the
temperature at point $i$ computed using the analytical solution, and $T_{\mathrm{FTCS},i}$ is the
temperature at point $i$ computed using the FTCS scheme.
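Equations (6)-(8) can be evaluated with a few lines of Python. The sketch below uses math.erf from the standard library; the function names and the roles of $T_1$ (boundary temperature) and $T_2$ (far-field temperature) as arguments are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt

def analytical_T(x, t, T1, T2, alpha):
    """Eq. (6): T(x, t) = T1 - (T1 - T2) * erf(x / (2 * sqrt(alpha * t)))."""
    return np.array([T1 - (T1 - T2) * erf(xi / (2.0 * sqrt(alpha * t)))
                     for xi in x])

def mse(T_analytical, T_ftcs):
    """Eq. (8): mean squared error between the two temperature profiles."""
    diff = np.asarray(T_analytical) - np.asarray(T_ftcs)
    return np.mean(diff ** 2)
```

At $x = 0$ the error function vanishes, so the profile starts at $T_1$ and decays toward $T_2$ far from the heated boundary, which is the behavior the FTCS midsection profile is checked against.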
The diagrams in Figs. 5 and 6 illustrate the MSE between the temperatures computed
by the analytical solution and by the FTCS scheme along the plate's midsection. The
decreasing error over extended time intervals, coupled with overall error bounds
within 10% for a 1000 mm plate, indicates that using the code generated by
ChatGPT-4 directly, with or without minor adjustments, is a viable option for
solving a scientific task.
Fig. 6 Contour plot of the logarithm of the MSE between the FTCS and the analytical solutions
over the spatial distance $x$ in the middle of the plate at $y = 500$ mm, displayed up to a distance of
$x \le 200$ mm. The contour levels are colored to indicate the time steps of the simulation
5 Discussion
The output of the code generated by ChatGPT-4 and its verification lays a strong
foundation for using Large Language Models (LLMs) as a critical tool in solving
scientific problems. LLMs have issues with the faithfulness of generated content,
known as hallucinations [2], caused by the sources of data, by training, and by
inference; these may result in outputs that are factually incorrect or flawed in
reasoning due to memorized data. Various methods have been developed to reduce
their effects. In the back-end, the model assigns higher probability to statements
it has seen during training than to the ones it is generating, and high uncertainty
is associated with tokens that are not present in the user-given prompt. While the
standard user cannot control the effects arising from hallucinations rooted in data
and training, inference-caused errors can be mitigated by Faithfulness Enhanced
Decoding, which emphasizes the user instruction and its alignment with the generated
content [1]. The authors of [4] develop a method that prioritizes the inputs/tokens
from the user context when there is a high discrepancy or uncertainty between the
output of the LLM and the user's prompt. Hence, if the given input is sufficiently
detailed and does not rely on external facts, the output is less likely to contain a
large share of hallucinated content; this is one of the reasons why detailed user
prompts are important for obtaining low-error output from ChatGPT for scientific
problems. However, the task of generating an extremely detailed
prompt is not always trivial, as there is no natural end point to providing details.
The amount of information and how it is structured can also influence the output of
ChatGPT. During many initial tests, even though ChatGPT-4 has a higher chance of
generating code that works straight out of the box without regeneration, the given
prompt produced complete code with ChatGPT-3.5 rather than with ChatGPT-4,
since the latter assumed that the user was asking for the structure of the code and
provided only the overall framework rather than the actual implementation. A
possible mitigation is either using a chain of commands or adjusting the prompt to
force ChatGPT to provide complete code every time; this study employs the latter
method.
A useful strategy in scientific computing is to use LaTeX-generated pseudocode as
an intermediate step before code generation. This method offers experienced
developers greater control by first generating a LaTeX algorithm via ChatGPT,
reviewing and refining it, and then using it to produce code. This approach ensures
a clear, logical structure before implementation, reducing potential errors. While
LaTeX provides precision and consistency, this method introduces challenges, such
as the model's need to decode LaTeX, which can complicate longer prompts or
complex modifications. Additionally, this approach may not scale well for complex
problems and might be less accessible for users unfamiliar with pseudocode. For
proficient coders, the main advantage of ChatGPT may be time-saving rather than
enhanced understanding, making this algorithmic approach not universally applicable.
6 Conclusion
The central purpose of this chapter was to explore the performance of ChatGPT-4 in
generating code for complex scientific problems, which, in this case, focuses on the
implementation of the 2D Fourier transient heat conduction problem using the FTCS
scheme for a rectangular plate with an arbitrary number of holes. The prompt was
formulated in a way such that it encapsulates every parameter used for the simulation
along with the precise boundary condition, and it also highlights the use of libraries
required to produce the plots requested by the user. The code generated by ChatGPT-
4 was tested and verified with the analytical solution, which demonstrated that the
results are in agreement with the analytical solution. Remaining errors are minor
and can be traced back to the numerical method (FTCS) realized by a flawless code.
The discussion elucidated the general problem of hallucination that is intimately
related to LLMs, and we suggested effective measures to overcome these issues,
such as providing detailed instructions in the prompt. Overall, the results of this study
strongly suggest that LLMs will play a major role in the future as an assistant in
the development and iteration of faster solutions, thus pushing the boundaries of
scientific computing and problem-solving.
References
1. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X.,
Qin, B. & Others. A survey on hallucination in large language models: Principles, taxonomy,
challenges, and open questions. ArXiv Preprint ArXiv:2311.05232. (2023)
2. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A. & Fung,
P. Survey of hallucination in natural language generation. ACM Computing Surveys. 55, 1-38
(2023)
3. Oliveira, J., Nascimento, J., Ribeiro, S., Oliveira, G., Costa, G., Guimarães, G. & Santos Jr, J.
Analytical solution of a 2d transient heat conduction problem using green’s functions. Revista
De Engenharia Térmica. 19, 66-71 (2020)
4. Poel, L., Cotterell, R. & Meister, C. Mutual information alleviates hallucinations in abstractive
summarization. Proceedings Of The 2022 Conference On Empirical Methods In Natural Lan-
guage Processing. pp. 5956-5965 (2022,12), https://aclanthology.org/2022.emnlp-main.399
5. Suárez-Carreño, F. & Rosales-Romero, L. Convergency and stability of explicit and implicit
schemes in the simulation of the heat equation. Applied Sciences. 11, 4468 (2021)
Transfer Learning for Alloy
Classification Based on Microstructure
Images
Abstract This study investigates the performance of large language models (LLMs),
specifically GPT-4, in developing a deep learning (DL) model to predict alloy types
based on scanning electron microscopy (SEM) images of steel microstructures. The
approach utilizes transfer learning (TL) and an ensemble of two pre-trained models,
ResNet-50 and DenseNet-121, fine-tuned on SEM scans of 33 types of steels. The
two models achieve validation accuracies of 97.6% and 98.4%, respectively, with the
ensemble model reaching a test accuracy of 99.2%. The results underscore the poten-
tial of LLM-assisted coding in computer vision tasks, such as image classification,
within computational materials science. The limitations are also discussed.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 125
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_6
126 A. Deshmukh and B. Eidel
1.1 Data
We use raw SEM scans of microstructures of 33 types of 9% Cr steels [4]. The data
is divided into three folders, one for each alloy (CPJ, HR, P92), each consisting of
subfolders of specific alloys (e.g., CPJ7J), wherein the microstructure images of that
specific alloy are stored. These three families of alloys are closely related in terms
of their functions. They are heat-resistant and exhibit good creep resistance. In total,
there are 837 images and 33 alloys, with a varying number of images for each alloy.
Figure 1 shows an example of the SEM scan of one microstructure.
Remark. In the context of commercial alloys, the abbreviation HR typically
stands for Heat-Resistant. This designation is often used to identify alloys that
are specifically designed to withstand high temperatures while maintaining their
mechanical properties and corrosion resistance. The abbreviation CPJ stands for
Copper-Phosphorus-Joint alloys. These are specialized brazing alloys used for join-
ing copper and copper-based materials. The term P92 refers to a specific type of
alloy used in high-temperature applications, particularly in the power generation and
petrochemical industries. P92 alloy is a type of creep-resistant, ferritic-martensitic
steel, known for its excellent high-temperature strength, oxidation resistance, and
corrosion resistance. It is commonly used in the manufacturing of components like
pipes, tubes, and fittings in high-temperature steam and pressure environments.
The introduction of data science, especially deep learning (DL), to the discipline
of materials science has been revolutionary [1]. DL is a subset of machine learning
(ML) that uses artificial neural networks (NNs): stacks of repeating computational
layers that apply linear and non-linear operations to transform input data into new,
useful representations and finally extract knowledge. DL is used to solve several
types of problems, such as computer vision and natural language processing, in
various disciplines, such as medicine, economics, earth sciences, and, of course,
computational materials science. In computational materials science, deep learning
is primarily used to accelerate the process of materials optimization and discovery,
which includes problems related to forward simulation and inverse design, in order
to better understand or exploit the property-structure-process linkages [5].
In theory, sufficiently deep DL models can approximate any function, and they are
therefore quite powerful. However, the goal is not only to fit the input data, called
the training data. A model is considered usable only if it generalizes, that is, if it
also performs well on data not seen during training. Therefore, models are tested on
test data, and the performance metrics associated with this test data indicate how
well the model
may perform in the wild. Training DL models also requires a third data set, called
the validation set, to tune the hyperparameters associated with the model, such as the
number of epochs, which is the number of times the model processes the whole dataset
to learn its parameters or weights, or the learning rate, an optimization parameter that
governs the step size the optimization algorithm takes. In supervised deep learning,
at every optimization step, the deep learning model outputs its predictions. These
are compared against the ground truth via the loss function. Minimization of loss
indicates a good fit. An optimization step is taken in the direction that reduces the
loss. After several epochs and consequent weight updates, the model finds the optimal
weights that minimize the loss function, and the problem is considered solved.
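The training loop just described (predict, compare against the ground truth via the loss, step in the direction that reduces the loss) can be made concrete with a deliberately tiny example: fitting a single weight $w$ by gradient descent on an MSE loss. The data and hyperparameters are invented purely for illustration.

```python
import numpy as np

# Toy supervised problem: learn w in y = w * x (ground truth uses w = 2.5)
x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = 2.5 * x

w = 0.0                 # initial weight
learning_rate = 0.01    # step size taken by the optimizer
losses = []
for epoch in range(200):                         # one epoch = one pass over the data
    y_pred = w * x                               # model prediction
    loss = np.mean((y_pred - y_true) ** 2)       # MSE loss against the ground truth
    grad = np.mean(2.0 * (y_pred - y_true) * x)  # gradient dLoss/dw
    w -= learning_rate * grad                    # step in the loss-reducing direction
    losses.append(loss)
```

After enough epochs the loss approaches zero and $w$ approaches the generating value 2.5; too large a learning rate would instead make the updates diverge, which is one reason why such hyperparameters are tuned on a validation set in practice.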
Although the amount of data in materials science is on the rise, this increase mainly
corresponds to certain subsets of materials classes and properties. Materials experi-
ments are expensive.
Most of the big data available originates from computations; e.g., DFT databases,
finite element analysis. Materials data is typically not big data for many interesting
problems. This limits materials scientists from utilizing the full potential of DL.
For instance, DL cannot be used for several materials problems where the available
data is scarce because DL is typically data-hungry. However, some DL models are
transferable. For instance, the shallower layers of a convolutional neural network
(CNN) learn more general concepts about an image. In the deeper convolutions,
the kernels reflect more specific knowledge. When two problems are comparable,
a DL model trained for the first problem can be used to fine-tune a model for the
second problem. This is called Transfer Learning (TL), and it is an immensely useful
approach when one has big data for one problem and small data for the problem at
hand—a common scenario in materials science. Refer to [2] for a thorough discussion
on TL.
Because the microstructures of steels look quite similar within the specific class
of steel, predicting with small data alone is challenging—hence, we use TL. We
use ResNet-50 and DenseNet-121, both trained on millions of images from the
ImageNet data, as the two pre-trained models of the ensemble [3, 6]. ResNet-50’s
architecture includes residual connections that help mitigate the vanishing gradient
problem, thereby improving learning in deep networks. DenseNet-121, on the other
hand, features dense connectivity, where each layer is connected to every other layer,
which enhances feature propagation and reduces the number of parameters, leading
to improved performance. Owing to the architectural differences, the two models
may produce different outputs. By combining them in an ensemble, we combine the
strengths of both models.
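The chapter does not spell out the exact combination rule of the ensemble; a common choice, shown here as an assumption, is soft voting, i.e., averaging the two models' class-probability outputs before taking the argmax:

```python
import numpy as np

def ensemble_predict(probs_a, probs_b):
    """Soft-voting ensemble: average the per-class probabilities of two
    models and return the index of the most probable class per sample."""
    avg = (np.asarray(probs_a) + np.asarray(probs_b)) / 2.0
    return np.argmax(avg, axis=1)
```

When the two networks disagree on a sample, the averaged probabilities act as a tie-breaker weighted by each model's confidence, which is how the ensemble can outperform either member alone.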
ResNet-50 and DenseNet models are a good choice because our problem is essen-
tially an image classification problem, and these models, having their weights trained
on ImageNet, should be able to recognize the general image patterns and features.
When doing TL, the pre-trained model is stacked with new layers, or, at minimum,
stacked with only one layer, to match the output dimensions of the second task. Here,
only the final fully connected layer is replaced by a layer that outputs a tensor of
size matching the number of alloy types. Because ImageNet likely does not contain
many images closely resembling alloy microstructures, we fine-tune the weights of
the output layer and the inner layers. To do this, we keep the default setting for
freezing, which is unfrozen. The overall model, with weights imported from the base
pre-trained models and fine-tuned on our small data set, solves our problem (Fig. 2).
Fig. 2 Transfer Learning. A model trained for Task A on big data is used to build another model
for another task, Task B, which is fine-tuned on the smaller data set. In addition to the previous
model, the new model at minimum has an output layer corresponding to the output dimensions for
Task B
2.2 Prompt
3 Code Listing
Based on the prompt described in Fig. 3, GPT-4 generated the code listed here.
# Import necessary libraries
import os
import copy
import pandas as pd
import random
import numpy as np
from sklearn.model_selection import train_test_split
import torch
from torchvision import transforms, models
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
from PIL import Image
from sklearn.preprocessing import LabelEncoder

# Set random seeds for reproducibility
random_seed = 99
torch.manual_seed(random_seed)
torch.cuda.manual_seed(random_seed)
np.random.seed(random_seed)
random.seed(random_seed)

# Data Preparation

# Directory path
base_path = '/Users/adityadeshmukh/Desktop/RJSC'

# Alloy folders
alloy_folders = {
    'hr_alloys20220208': 'hr',
    'CPJ_alloys': 'cpj',
    'P92 OTHER': 'p92'
}

# Creating a dataframe
data = {'alloy': [], 'image': [], 'alloy type': []}

# Looping through each main folder and its subfolders
for folder, alloy in alloy_folders.items():
    folder_path = os.path.join(base_path, folder)
    for subfolder in os.listdir(folder_path):
        subfolder_path = os.path.join(folder_path, subfolder)
        if os.path.isdir(subfolder_path):
            for file in os.listdir(subfolder_path):
                if file.endswith('.bmp'):
                    file_path = os.path.join(subfolder_path, file)
                    data['alloy'].append(alloy)
                    data['image'].append(file_path)
                    data['alloy type'].append(subfolder)

# Converting to pandas DataFrame
df = pd.DataFrame(data)
print(df.head())

# Label Encoding: Convert string labels to numerical labels
# (which is what the model expects)
label_encoder = LabelEncoder()
df['alloy type'] = label_encoder.fit_transform(df['alloy type'])

# Splitting data
# The first line splits data into training:test sets in the ratio 70:30.
# The second line splits the test set into two halves: validation set and
# test set, so that ultimately we have train:valid:test = 70:15:15.
train_df, test_df = train_test_split(df, test_size=0.3,
                                     stratify=df['alloy type'],
                                     random_state=random_seed)
val_df, test_df = train_test_split(test_df, test_size=0.5,
                                   stratify=test_df['alloy type'],
                                   random_state=random_seed)

# Data augmentation and normalization.
# Two different sets of transformations are necessary because data
# augmentation transformations only apply to the training set,
# not test and validation.
data_transforms = {
    'train': transforms.Compose([
        transforms.ToTensor(),
        # Crop the top and bottom 25 pixels to remove miscellaneous
        # SEM annotations
        transforms.Lambda(lambda x: x[:, 25:-25, :]),
        # Randomly crop and resize images to 224x224 (required for
        # pre-trained models)
        transforms.RandomResizedCrop(224),
        # Randomly apply horizontal flipping and rotation for data
        # augmentation: each time an image is loaded during training, the
        # DataLoader applies the transformations randomly. Essentially,
        # this means the model sees slightly different versions of the
        # training images throughout the training process, which helps it
        # generalize better by learning from a more diverse set of data
        # representations.
        transforms.RandomHorizontalFlip(),
        transforms.RandomRotation(15),
        # Normalize images based on pre-defined mean and standard
        # deviation of the ImageNet dataset
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x[:, 25:-25, :]),
        transforms.Resize((224, 224)),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# Custom dataset class to load images and apply transformations
class SteelDataset(torch.utils.data.Dataset):
    def __init__(self, dataframe, transform=None):
        self.dataframe = dataframe
        self.transform = transform

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        img_name = self.dataframe.iloc[idx, 1]
        image = Image.open(img_name)
        label = self.dataframe.iloc[idx, 2]

        if self.transform:
            image = self.transform(image)

        return image, label


# Datasets
train_dataset = SteelDataset(train_df, transform=data_transforms['train'])
val_dataset = SteelDataset(val_df, transform=data_transforms['val'])
test_dataset = SteelDataset(test_df, transform=data_transforms['val'])

# DataLoaders in PyTorch are iterators that enable efficient loading of
# data during the training, validation, and testing phases of a machine
# learning model.
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True,
                          worker_init_fn=lambda _: np.random.seed(random_seed))
val_loader = DataLoader(val_dataset, batch_size=batch_size,
                        worker_init_fn=lambda _: np.random.seed(random_seed))
test_loader = DataLoader(test_dataset, batch_size=batch_size,
                         worker_init_fn=lambda _: np.random.seed(random_seed))

# Device configuration
# If your device has a GPU, it needs to be declared explicitly.
# For MacBooks, the GPU is mps; on Windows or Linux, try cuda instead.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Load and modify pretrained models
def load_model(model_name, num_classes):
    if model_name == "resnet":
        # pretrained=True initializes the model with weights that have
        # already been learned.
        model = models.resnet50(pretrained=True)
        # Extract the number of input features to the last fully
        # connected layer (fc)
        num_ftrs = model.fc.in_features
        # Replace the last fc with a new fc tailored for our task by
        # setting its output to the number of classes, i.e. the number
        # of steel alloys.
        model.fc = nn.Linear(num_ftrs, num_classes)
    elif model_name == "densenet":
        model = models.densenet121(pretrained=True)
        num_ftrs = model.classifier.in_features
        model.classifier = nn.Linear(num_ftrs, num_classes)
    return model


# Number of classes
num_classes = len(df['alloy type'].unique())

# Load models
resnet = load_model("resnet", num_classes).to(device)
densenet = load_model("densenet", num_classes).to(device)
torch.manual_seed(random_seed)
torch.cuda.manual_seed(random_seed)

# Loss and optimizer
# ... (listing lines 160-194 are not reproduced in this excerpt;
# the listing resumes inside the training loop) ...

        running_loss = 0.0
        running_corrects = 0

        # Iterate over data
        for inputs, labels in dataloaders[phase]:
            inputs = inputs.to(device)
            labels = labels.to(device)

            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward
            with torch.set_grad_enabled(phase == 'train'):
                outputs = model(inputs)
                _, preds = torch.max(outputs, 1)
                loss = criterion(outputs, labels)

                # Backward + optimize only if in training phase
                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            # Statistics
            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)

        epoch_loss = running_loss / len(dataloaders[phase].dataset)
        epoch_acc = running_corrects.float() / len(dataloaders[phase].dataset)

        # Record metrics
        if phase == 'train':
            train_losses.append(epoch_loss)
            train_accs.append(epoch_acc.item())
        else:
            val_losses.append(epoch_loss)
            val_accs.append(epoch_acc.item())
            scheduler.step(epoch_loss)  # Adjust learning rate based on
                                        # validation loss

        print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss,
                                                   epoch_acc))

        # Deep copy the best-performing model
        if phase == 'val' and epoch_acc > best_acc:
            best_acc = epoch_acc
            best_model_wts = copy.deepcopy(model.state_dict())

    print()

print('Best val Acc: {:4f}'.format(best_acc))
136 A. Deshmukh and B. Eidel
• Split the data into training (70%), validation (15%), and test (15%) sets
stratified according to the label ‘alloy type.’ Use train_test_split twice to
achieve this.
• Do data augmentation.
• Use the validation loss to find the optimum number of epochs. Also, use
ReduceLROnPlateau scheduler to tune the learning rate. Set the number
of epochs to 500 for each model.
• Record training loss, training accuracy, validation loss, validation accuracy
for each epoch for each model. Ultimately, record test accuracy for
the ensemble. Write code to make relevant graphs.
Do not give me only the code skeleton; I want the whole program in one
code block.
Fig. 3 (continued)
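The ReduceLROnPlateau scheduler requested in the prompt can be sketched in isolation; the validation-loss values below are hypothetical:

```python
import torch

# Halve the learning rate once the monitored validation loss has stopped
# improving for `patience` consecutive epochs.
model = torch.nn.Linear(4, 3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=2)

for val_loss in [1.0, 0.9, 0.9, 0.9, 0.9]:  # plateau after the second epoch
    scheduler.step(val_loss)                # called once per epoch

print(optimizer.param_groups[0]['lr'])      # → 0.0005
```

With patience=2, the third consecutive non-improving epoch triggers the reduction; the `factor` and `patience` defaults are exactly the hyperparameters the chapter later reports having to tune by hand.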
4 Performance
Figures 4 and 5 show the performance of the fine-tuned ResNet-50 model and the
fine-tuned DenseNet-121 model respectively. For both models, with an increasing
number of epochs, the training loss and validation loss decrease, and consequently
the training accuracy and validation accuracy increase. At the last epoch, we observe
a validation accuracy of 97.6190% and 98.4127% for ResNet-50 and DenseNet-
121 respectively, as shown in Table 1, with the test accuracy of the ensemble being
99.2063%.
Fig. 4 Performance of ResNet50 Model. (Left:) Training and validation loss. (Right:) Training and
validation accuracy
Fig. 5 Performance of DenseNet121 Model. (Left:) Training and validation loss. (Right:) Training
and validation accuracy
Figure 6 displays a random sampling of 12 images from the test set. The ground
truths and the predicted labels are in good agreement. Note that the images appear
different from the sample image shown in Fig. 1 because of ImageNet normalization,
a necessary pre-processing step when using pre-trained models like ResNet-50.
The observation in the left panels of Figs. 4 and 5 that the validation loss is almost
consistently smaller than the training loss can have several causes. Augmented
samples may be harder for the model to learn, and because data augmentation is
applied only to the training set and not at all to the validation set, the validation
results remain unaffected. Moreover, the smaller validation set has a smaller intrinsic
variance than the larger training set; recall their ratio of 70:15. Or it may simply be
an accident of a data split that happens to favor such behavior.
Fig. 6 A random sampling of 12 images from the test set with their ground truths and predicted
labels
5 Discussion
The results show that GPT-4 successfully generates working code for the problem
of image classification on SEM scans of steel microstructures using TL. While almost
the entire code is generated via the main prompt, some minor tweaking through
supplemental prompts was necessary to arrive at fully functional code.
5.1 Errors
The errors are described below in three categories: major errors, i.e. errors that
require a correction for the code to run; minor errors, i.e. errors with which the code
will run but perform sub-optimally; and ignored tasks.
• Import statements necessary for certain functions are not always written by default
(e.g., from PIL import Image).
• While creating the dataframe, traversing the relevant directories requires excep-
tion handling to ignore the Not a directory error caused by the .DS_Store
files present on a Mac. Note that handling such an error in advance is a foresight
that should not be expected of LLMs, because this is an exception, not the norm;
therefore, it is not necessarily a drawback.
• Label encoding is required because the original labels are strings, whereas
PyTorch requires numbers. The error AttributeError: ’tuple’ object
has no attribute ’to’ is encountered when executing labels =
labels.to(device); this suggests that labels is a tuple where a PyTorch
tensor is expected.
• Default values for the parameters of the LR scheduler are used. We test several
models to tune these values. GPT-4 cannot uniquely determine them, because
their optimization requires scanning the hyperparameter space with repeated
trials; this obviously falls beyond what LLMs can do on their own.
• Training loss, training accuracy, validation loss, and validation accuracy are not
recorded for each epoch. The visualization task is fully ignored.
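The label-encoding fix can be reproduced in isolation; the alloy-type strings below are hypothetical stand-ins for the actual folder names:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels, as they would be read from the folder names
df = pd.DataFrame({'alloy type': ['P92', 'hr22', 'P92', 'cpj1']})

encoder = LabelEncoder()
# fit_transform sorts the unique strings and maps each to an integer index
df['alloy type'] = encoder.fit_transform(df['alloy type'])
print(df['alloy type'].tolist())  # → [0, 2, 0, 1]
print(list(encoder.classes_))     # → ['P92', 'cpj1', 'hr22']
```

The integer labels can then be collated into tensors by the DataLoader, so `labels.to(device)` works as expected.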
5.2 Fixes
The above errors have easy fixes. GPT-4 is good at troubleshooting; one only needs
to copy and paste the error reported by the Python IDE into GPT-4, which then offers
one or more potential solutions. Some errors are quite elusive. GPT-4 does not figure
out the case for label encoding right away. It first examines the SteelDataset class’s
__getitem__ method and the unpacking of labels inside the training loop. When we
explicitly state that both of these are correct, it lists more possibilities. One of these
is label transformation, which reminds us that the labels are originally strings,
whereas PyTorch requires labels to be numerically encoded. GPT-4 finally suggests
using label encoding. This is a lengthy process; it could have been avoided by
pointing this out right in the main prompt. When fixing other problems, such as
replacing the ensemble function or obtaining code for a previously ignored task, a
simple instruction is usually sufficient. Table 2 lists these supplemental prompts.
In addition to these fixes, the code is slightly modified to make the results repro-
ducible: as long as the random seed, whose purpose is to keep the random numbers
generated in the program consistent, is left unchanged, one obtains the same results
every time. Please note that we train the model for 500 epochs to observe whether
any learning happens over the long run; however, similar results are achievable with
only 50 epochs.
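The role of the random seed can be demonstrated with a minimal standard-library sketch (illustrative only, not the chapter's seeding code):

```python
import random

def seeded_draws(seed, n=3):
    # Re-seeding before drawing makes the "random" numbers deterministic
    random.seed(seed)
    return [random.randint(0, 99) for _ in range(n)]

print(seeded_draws(99) == seeded_draws(99))  # → True
```

The same principle underlies the `torch.manual_seed`, `np.random.seed`, and `random.seed` calls at the top of the listing.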
6 Conclusion
We obtain working code through GPT-4 to predict an alloy from an SEM scan of
its microstructure. The resulting model is an ensemble of two pre-trained models,
ResNet-50 and DenseNet-121, fine-tuned on a training set that includes SEM scans
of microstructures of 33 different types of 9% Cr steels. The model achieves an
accuracy of 99.20% on the test set, indicating that it performs quite well.
GPT-4 is good at unpacking a prompt, understanding it, writing code, and explaining
the code both broadly and line-by-line when asked to; the generated code is also
well documented. It helps to have a sufficiently detailed prompt, rich with information
on not only the nature of the problem but also implementation details. Troubleshooting
is easy: simply provide GPT-4 with the traceback of the errors encountered. Code
that needs to be modified or rewritten can be edited with additional simple prompts.
By solving this problem, we demonstrate that, with human intervention, GPT-
4 is capable of generating code for common computer vision tasks such as image
classification in computational materials science.
References
1. Agrawal, A. & Choudhary, A. Perspective: Materials informatics and big data: Realization of
the “fourth paradigm” of science in materials science. Apl Materials. 4 (2016)
2. Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H. & He, Q. A Comprehensive
Survey on Transfer Learning. Proceedings Of The IEEE. 109, 43-76 (2021)
3. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Proceedings
Of The IEEE Conference On Computer Vision And Pattern Recognition. pp. 770-778 (2016)
4. Rozman, K., Doğan, Ö., Chinn, R., Jablonski, P., Detrois, M. & Gao, M. Dataset for machine
learning of microstructures for 9% Cr steels. Data In Brief. 45 pp. 108714 (2022)
5. Choudhary, K. et al. Recent Advances and Applications of Deep Learning Methods in
Materials Science. Npj Computational Materials. 8, 1-26 (2022). https://doi.org/10.1038/s41524-022-00734-6
6. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Densely Connected Convolutional
Networks. 2017 IEEE Conference On Computer Vision And Pattern Recognition (CVPR).
https://doi.org/10.1109/cvpr.2017.243
Transfer Learning for Microstructure
Image Segmentation
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 145
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_7
146 R. Narkhede and B. Eidel
mated methods have been developed, significantly enhancing accuracy and efficiency
[1, 2].
Transfer learning is a powerful strategy in machine learning where a model devel-
oped for one task is repurposed for a second, related task. This approach is espe-
cially valuable in environments where data are scarce or where training a new model
from scratch is computationally prohibitive. By utilizing models pre-trained on large
datasets, transfer learning not only conserves resources but also enhances model
accuracy, making it a vital tool in domains where data acquisition is challenging [3].
Among the several deep learning architectures available, the ResNet50 [4] archi-
tecture is known for its depth and the use of residual connections, which enabled
training significantly deeper networks than previously feasible. This architecture is
adept at extracting detailed features that are crucial for accurate segmentation, and
is hence often employed as an encoder in segmentation tasks. The encoder trans-
forms input images to rich, feature-dense representations for the model to capture
the image context. Then, these representations are translated back into meaningful
segmentation maps using decoder architectures like UNet++. UNet++ [5] refines
the classic UNet architecture and improves segmentation accuracy. The overall
framework of the microstructure segmentation task is shown in Fig. 1.
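The encoder-decoder idea can be sketched in plain PyTorch with a deliberately tiny network; the chapter's actual model pairs a pre-trained ResNet50 encoder with a UNet++ decoder (via segmentation_models_pytorch), which this toy module only mimics structurally:

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Minimal encoder-decoder: downsample to a feature map, upsample to a
    per-pixel class map (toy stand-in, not ResNet50/UNet++)."""
    def __init__(self, n_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(   # stand-in for the pre-trained encoder
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(   # stand-in for the UNet++ decoder
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(8, n_classes, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.randn(1, 3, 64, 64)           # one RGB microstructure image
out = TinySegNet()(x)
print(out.shape)  # → torch.Size([1, 3, 64, 64])  one channel per phase class
```

The decoder restores the input resolution, so each pixel receives a score per class (matrix, secondary, tertiary); UNet++ additionally adds skip connections between encoder and decoder stages, which this sketch omits.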
The task in this chapter is to apply transfer learning for segmentation of microstruc-
ture images of nickel-based superalloys into matrix, secondary and tertiary phases.
In doing so, it uses the ResNet50 architecture as the encoder and the UNet++ as the
decoder. The encoder is pre-trained on a large microscopy dataset named MicroNet.
The application of a pre-trained encoder thus implies the use of transfer learning for
the segmentation of the nickel-based superalloy dataset. The work in this chapter is
largely based on the example notebook multiclass_segmentation
_example.ipynb provided as supplementary material with [6]. This segmenta-
tion task encompasses the following key steps:
Fig. 1 Schematic showing the use of encoder pre-trained on the MicroNet dataset for microstructure
image segmentation via transfer learning. The encoder (top), like ResNet50, is a convolutional neural
network followed by a dense neural network classifier (shown in the gray rectangular box on the
top), which is trained to classify microscopy images into different material classes. The encoder
is pre-trained on the MicroNet dataset containing more than 100,000 microstructure images. By
transfer learning, the pre-trained encoder is then used in the encoder-decoder segmentation model
(shown in the bottom), which is then trained on a small dataset of microstructure images to segment
the microstructure into separate classes
2 Prompt
Considering these steps, six prompts have been used to generate the desired Python
code for the segmentation task:
• Model setup
The system prompt orients the chatbot to the task. In this system prompt, ChatGPT
4 is instructed to adopt a persona that commands the requisite libraries and knowledge
to execute the segmentation task. Through such a system prompt, the chatbot is also
given specific instructions to mitigate persistently observed errors in the generated
code.
In response to the prompt shown in Fig. 2, ChatGPT 4 acknowledges that it is prepared
to help the user with the task and that it anticipates further instructions.
Prompt 1
You are an expert computer vision and machine learning engineer. You
have the fundamental knowledge of deep learning, the methods and their
working, popular deep learning architectures for image classification and
segmentation. You are specifically well versed in PyTorch and allied libraries
like albumenations, segmentation models pytorch, DataLoader and so on.
You can load pre-trained models from urls and fine-tune them for the specific
task at hand. You write an error-free code. Specifically, your code encounters
no issues in the dimensions of the various tensors and arrays in the deep
learning process. You analyze code for any errors before presenting it. You
generate complete code with documentation. You write code for the part that
you are asked for, no further steps. Only what’s specifically instructed. Do not
show sample usage until asked for. You track matrix dimensions throughout
the code and handle any dimension errors. You also ensure that any torch
tensor is being moved to the CPU before it is converted to a NumPy array.
Any array dimension errors will leave a poor score on your performance.
The task I want to achieve is complex and I shall tell you the instructions in
a step-wise fashion.
Fig. 2 System prompt provided to ChatGPT 4 before specific instructions for the segmentation
task
As the first and one of the most crucial steps of the segmentation task, ChatGPT 4 is
provided with all the details to load the data in a suitable format for further processing
and for applying the deep learning model for segmentation.
The prompt begins with a brief description of the overall segmentation task with
details about the model architecture. This short description acts as a context for the
chatbot in its subsequent responses. It is also asked to add randomization seeds to
ensure that the results are reproducible.
The location of the training, validation and testing data is specified in the prompt.
The dimensions of the images and masks and their file formats are also specified.
Specific code instructions to read the images and masks, and then to assign classes to
the masks are given. The detailed instructions are important to maintain the correct
assignment of annotated masks to the images.
Pre-processing the images involves data augmentation and normalizing the image
data. Data augmentation in image segmentation involves applying transformations
like flipping, rotation, and adjustments to contrast and brightness to existing images,
enhancing dataset diversity and model generalization without increasing the number
of original images. This method helps prevent overfitting, especially with smaller or
less diverse datasets. After augmentation, the images are normalized using the mean
and standard deviation of the pre-trained model, in this case, the ‘ResNet50’ model
pre-trained with the ‘imagenet’ dataset.
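The normalization step can be sketched with the ImageNet channel statistics mentioned above (plain NumPy, assuming images already scaled to [0, 1]):

```python
import numpy as np

# ImageNet channel statistics used to normalize inputs for an encoder
# pre-trained on the 'imagenet' dataset
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def normalize(image):
    """image: H x W x 3 float array in [0, 1]; returns zero-centered,
    unit-variance channels (with respect to ImageNet statistics)."""
    return (image - MEAN) / STD

img = np.full((4, 4, 3), 0.5)       # hypothetical mid-gray image
out = normalize(img)
print(np.round(out[0, 0], 3))       # → [0.066 0.196 0.418]
```

Matching the pre-trained encoder's normalization is essential; feeding it unnormalized pixels would shift every feature distribution it was trained on.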
to load the final pre-processed data for model training, testing and validation (Fig. 5)
are provided in prompt 3.
Fig. 4 Part one of prompt 3 which provides instructions for pre-processing the data
Fig. 4 (continued)
Fig. 5 Part two of prompt 3 which provides instructions for loading the data and visualizing some
images with their masks
For the microstructure image segmentation, the ‘resnet50’ encoder is used with the
‘UNet++’ segmentation model. The pre-trained encoder weights are downloaded
from a URL in the model setup. The function get_pretrained_microscopy
_url that generates the URL is used as is from the source [6]. In prompt 4 (Fig. 6),
these details about the architecture and the source URL to obtain the pre-trained
encoder weights are provided, along with further details about the inputs for the
model setup, the activation function of the model output, the usage of a GPU if
available, and the expected output.
To track the loss during training, a custom loss function that combines the Dice and
binary cross-entropy (BCE) losses, as used in the example notebook
multiclass_segmentation_example.ipynb, is then created. The model
performance is checked using the intersection over union (IoU) metric, which can
Fig. 7 Prompt to create the loss function and the model evaluation metric
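A minimal sketch of such a combined loss, assuming binary targets and raw logits; this is an illustrative formulation, not the notebook's exact implementation:

```python
import torch
import torch.nn.functional as F

def dice_bce_loss(logits, targets, eps=1e-6):
    # BCE on the raw logits plus (1 - Dice) on the sigmoid probabilities
    probs = torch.sigmoid(logits)
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    inter = (probs * targets).sum()
    dice = (2 * inter + eps) / (probs.sum() + targets.sum() + eps)
    return bce + (1 - dice)

logits = torch.tensor([[2.0, -2.0], [1.0, -1.0]])
targets = torch.tensor([[1.0, 0.0], [1.0, 0.0]])
loss = dice_bce_loss(logits, targets)
print(loss.item() > 0)  # → True
```

Combining the two terms balances per-pixel accuracy (BCE) against region overlap (Dice), which matters when one phase occupies only a small fraction of the image.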
After setting up the training, validation and test data, the model, the loss function
and the metric, a function to train the model is required. This function puts the
individual components together and creates the training loop. As shown in the prompt
in Fig. 8, ChatGPT 4 is instructed to set up an optimizer, use the losses as the model
criterion and then set up the training loop. It is also given further details, such as the
use of early stopping, evaluated via the IoU metric on the validation data to avoid
overfitting, the use of mixed-precision training to reduce memory usage, and the
saving of model checkpoints regularly after a given number of epochs. Specific
instructions are also provided to ensure that certain commonly observed errors are
eliminated.
The performance of the trained model is evaluated on the test data, again with the IoU
metric, but also with the help of visualizations. For this purpose, a function is created
that runs one forward pass on the test data by instantiating the trained model. In the
forward pass, the loss and the IoU metric over the test samples are calculated and
their average values are returned. To ensure that this is executed without errors, the
prompt includes a breakdown of the steps involved in a forward pass.
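For a single binary class mask, the IoU metric reduces to the following sketch:

```python
import numpy as np

def iou(true_mask, pred_mask):
    """Intersection over union for one binary class mask."""
    inter = np.logical_and(true_mask, pred_mask).sum()
    union = np.logical_or(true_mask, pred_mask).sum()
    return inter / union if union else 1.0

t = np.array([[1, 1], [0, 0]])  # hypothetical ground-truth mask
p = np.array([[1, 0], [1, 0]])  # hypothetical prediction
print(iou(t, p))  # → 0.3333333333333333
```

For multi-class segmentation, the IoU is typically computed per class and averaged; an empty union is conventionally scored as perfect, as done here.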
The predictions are visualized in two ways: (1) by comparing the true masks of
the test samples with the predicted ones, and (2) by overlaying the predicted masks
over the true masks and marking the true positive, true negative, false positive and
false negative pixels with different colors. The functions for both these visualizations
are generated with the instructions in the prompt shown in Fig. 9.
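The second visualization can be sketched as a per-pixel confusion map; the color coding itself is left to the plotting code, and the masks below are hypothetical:

```python
import numpy as np

def confusion_overlay(true_mask, pred_mask):
    """Classify each pixel of a binary mask prediction.

    Returns an H x W array of codes: 0=TN, 1=TP, 2=FP, 3=FN, which a
    plotting function can map to distinct colors.
    """
    tp = (true_mask == 1) & (pred_mask == 1)
    fp = (true_mask == 0) & (pred_mask == 1)
    fn = (true_mask == 1) & (pred_mask == 0)
    overlay = np.zeros_like(true_mask)
    overlay[tp], overlay[fp], overlay[fn] = 1, 2, 3
    return overlay

t = np.array([[1, 1], [0, 0]])
p = np.array([[1, 0], [1, 0]])
print(confusion_overlay(t, p).tolist())  # → [[1, 3], [2, 0]]
```

Overlaying these codes as colors on the image makes systematic errors (e.g. false positives clustered at phase boundaries) immediately visible.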
The code generated based on the prompts provided for the microstructure image
segmentation is presented in this section. Along with the code listings, the outputs
generated and some remarks on the generated code are also mentioned.
From the prompts in Figs. 3, 4 and 5, the following Code Listing 1 was generated.
ChatGPT 4 generates a Python script that imports the required libraries, creates a
function that inserts randomization seeds and then a class named
MicrostructureDataset which extracts images and masks according to the
specified details, applies transformations and returns two variables, images and
masks, which contain the extracted data.
In response to the prompt mentioned in Figs. 4 and 5, ChatGPT 4 first creates
the function get_training_augmentation, which places all the transforma-
tions mentioned in the prompt into the pipeline in the specified sequence. It also
uses the appropriate function get_preprocessing_fn to normalize the image
data with the mean and standard deviation of the ‘resnet50’ model architecture
pre-trained on the ‘imagenet’ dataset. Similarly, the function
get_validation_augmentation is created, which only normalizes the image data.
After the data augmentation, the chatbot uses the DataLoader function from
PyTorch, along with the previously created MicrostructureDataset, to create
dataloaders for training, validation and testing. Finally, a visualization function is
also created to plot the augmented images and their corresponding masks.
Fig. 9 Prompt to test the trained model and visualize the predictions

import os
import numpy as np
import cv2
import torch
import random
from torch.utils.data import Dataset, DataLoader
from albumentations import Compose, Normalize, Resize
from albumentations.pytorch import ToTensorV2

# ... (listing lines 11-21 are not reproduced in this excerpt) ...

# Paths to dataset
base_directory = '/kaggle/input/ni-super-alloy/Super1'
train_images_dir = os.path.join(base_directory, 'train')
train_masks_dir = os.path.join(base_directory, 'train_annot')
val_images_dir = os.path.join(base_directory, 'val')
val_masks_dir = os.path.join(base_directory, 'val_annot')
test_images_dir = os.path.join(base_directory, 'test')
test_masks_dir = os.path.join(base_directory, 'test_annot')

class MicrostructureDataset(Dataset):
    def __init__(self, images_dir, masks_dir, transform=None):
        self.images_dir = images_dir
        self.masks_dir = masks_dir
        self.transform = transform
        self.images = sorted(os.listdir(images_dir))
        self.masks = sorted(os.listdir(masks_dir))

    def __len__(self):
        return len(self.images)

    # ... (listing lines 42-45, the beginning of __getitem__,
    # are not reproduced in this excerpt) ...

        # Read image
        image = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)

        # Read mask
        mask = cv2.imread(mask_path, 1)

        # ... (listing lines 51-59 are not reproduced in this excerpt) ...

        # Stack masks
        mask = np.stack(masks, axis=-1)

        if self.transform:
            augmented = self.transform(image=image, mask=mask)
            image = augmented['image']
            mask = augmented['mask']

# ... (listing lines 69-72 are not reproduced in this excerpt) ...

import albumentations as A
from albumentations.pytorch import ToTensorV2
from segmentation_models_pytorch.encoders import get_preprocessing_params

# ... (listing lines 77-153 are not reproduced in this excerpt) ...

plt.tight_layout()
plt.show()
It is important for the user to visualize the augmented images and their masks. The augmented images should encapsulate some of the transformations in the augmentation function and should therefore differ from each other. Moreover, the masks 'matrix', 'secondary' and 'tertiary' should be consistent for all images. A common error observed was overlooking the instruction to update the 'matrix' mask with all the remaining unassigned pixels after extracting the 'secondary' and 'tertiary' masks. Errors in extracting the masks and assigning them to the corresponding pixels in the image will hinder the model from training well. The major issue is the incorrect assignment of labels to pixels, which leads to a high loss despite multiple epochs of training. Figure 10 shows one of the augmented images and its corresponding masks.
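The fix described above can be sketched as follows. This is a minimal, hypothetical version of the mask-extraction logic; the actual annotation colors come from the dataset, so secondary_color and tertiary_color are placeholders. The crucial point is that the matrix mask takes every pixel not claimed by the other two masks:

```python
import numpy as np

def extract_masks(mask_rgb, secondary_color, tertiary_color):
    """Split an RGB annotation into binary 'matrix', 'secondary', 'tertiary' masks."""
    secondary = np.all(mask_rgb == secondary_color, axis=-1)
    tertiary = np.all(mask_rgb == tertiary_color, axis=-1)
    # crucial step: the matrix mask takes ALL remaining unassigned pixels
    matrix = ~(secondary | tertiary)
    return np.stack([matrix, secondary, tertiary], axis=-1).astype(np.float32)
```

Because the matrix mask is defined as the complement of the other two, every pixel receives exactly one label, avoiding the high-loss behavior described above.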
Using the model setup prompt in Fig. 6, the functions to set up the segmentation model from the pre-trained weights, to compute the combined Dice and BCE loss, and to compute the IoU metric were created. The function get_pretrained_microscopynet_url, which generates the URL for downloading the pre-trained weights, is also included here. It is taken as is from the GitHub repository of reference [6] and not generated by ChatGPT 4.
import torch
import torch.utils.model_zoo as model_zoo
from segmentation_models_pytorch import UnetPlusPlus
from segmentation_models_pytorch.encoders import get_preprocessing_params

# ... (signature of get_pretrained_microscopynet_url abridged)
    """
    Args:
        encoder (str): pretrained encoder model name (e.g. resnet50)
        encoder_weights (str): pretraining dataset, either 'micronet' or
            'imagenet-micronet', with the latter indicating the encoder
            was first pretrained on imagenet and then finetuned on microscopynet
        version (float): model version to use, defaults to latest.
            Current options are 1.0 or 1.1.
        self_supervision (str): self-supervision method used. If self-supervision ...
    Returns:
        str: url to download the pretrained model
    """

    # setup self-supervision
    if self_supervision != '':
        version = 1.0
        self_supervision = '_' + self_supervision

    # ... (listing abridged)

    # get url
    url_base = 'https://nasa-public-data.s3.amazonaws.com/microscopy_segmentation_models/'
    url_end = '_v%s.pth.tar' % str(version)
    return url_base + f'{encoder}{self_supervision}_pretrained_{encoder_weights}' + url_end

# Model Setup

def setup_segmentation_model(encoder_name='resnet50', class_values=None, encoder_weights='micronet'):
    if class_values is None:
        raise ValueError("class_values must be provided and should not be None")

    # ... (model construction abridged)

    # Determine device
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    map_location = 'cuda' if torch.cuda.is_available() else 'cpu'

    # Load custom weights
    url = get_pretrained_microscopynet_url(encoder_name, encoder_weights)
    state_dict = model_zoo.load_url(url, map_location=map_location)
    model.encoder.load_state_dict(state_dict)

# ... (listing abridged)

import torch
import torch.nn.functional as F
import numpy as np
import segmentation_models_pytorch as smp
As shown in Code Listing 2, a function for computing the combined Dice and BCE loss is created. In some trials, ChatGPT 4 overlooked the specific instruction to use the torch.reshape method to change the dimensions of the outputs and targets in the loss function. It used the torch.Tensor.view method instead, which serves the same purpose of changing array dimensions but only works on contiguous tensors, i.e. it is not generally applicable. The IoU is computed according to the example provided in the prompt.
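A minimal sketch of such a combined loss, written here from scratch rather than copied from the generated listing (function name and weighting are illustrative), shows the intended use of torch.reshape:

```python
import torch
import torch.nn.functional as F

def dice_bce_loss(outputs, targets, smooth=1.0, bce_weight=0.5):
    # torch.reshape handles contiguous and non-contiguous tensors alike,
    # unlike torch.Tensor.view
    outputs = torch.reshape(outputs, (-1,))
    targets = torch.reshape(targets, (-1,))
    bce = F.binary_cross_entropy(outputs, targets)
    intersection = (outputs * targets).sum()
    dice = 1.0 - (2.0 * intersection + smooth) / (outputs.sum() + targets.sum() + smooth)
    return bce_weight * bce + (1.0 - bce_weight) * dice
```

For a near-perfect prediction the loss approaches zero, while a confidently wrong prediction is dominated by the BCE term.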
Putting all the previous functions to use, the model is trained by creating a train_model function according to the prompt in Fig. 8. An appropriate train_model function is created following the instructions in the prompt. However, the instructions for ensuring that errors are avoided are not always followed. The train_model function can then be used by asking the chatbot for an example usage. The number of epochs, the patience parameter for early stopping, the learning rate and the weighting of the terms in the loss function need to be adjusted to train the model to a higher IoU metric value. Additionally, the user can also prompt ChatGPT 4 to make a plot showing the decrease in the loss values over the epochs.
import os
import torch
import torch.optim as optim
from torch.cuda.amp import GradScaler, autocast
from torch.utils.data import DataLoader
import numpy as np
from tqdm import tqdm

# ... (listing abridged)
scaler = GradScaler()

# ... (epoch and phase loops abridged)
        running_loss = 0.0
        running_iou = 0.0

        # ... (batch loop abridged)

            # Forward pass
            with autocast(enabled=True):
                outputs = model(inputs)
                if outputs.shape[1] == 1:
                    outputs = torch.sigmoid(outputs)
                else:
                    outputs = torch.softmax(outputs, dim=1)

            # ... (listing abridged)
            if phase == 'train':
                # Backward pass and optimization
                scaler.scale(loss).backward()
                scaler.step(optimizer)
                scaler.update()

            # Statistics
            running_loss += loss.item() * inputs.size(0)
            iou = compute_iou(outputs, masks)
            running_iou += iou * inputs.size(0)

        # ... (listing abridged)
        if phase == 'train':
            train_losses.append(epoch_loss)
            train_ious.append(epoch_iou)
        else:
            val_losses.append(epoch_loss)
            val_ious.append(epoch_iou)

        # Early stopping
        if phase == 'val':
            if epoch_iou > best_iou:
                best_iou = epoch_iou
                best_model_wts = model.state_dict()
                epochs_no_improve = 0
                torch.save(model.state_dict(), 'best_model.pth')
            else:
                epochs_no_improve += 1

# ... (listing abridged)

# Save losses and ious for plotting, ensuring tensors are moved to CPU
np.save('train_losses.npy', np.array(train_losses))
np.save('val_losses.npy', np.array(val_losses))
np.save('train_ious.npy', np.array([iou.cpu().numpy() for iou in train_ious]))
np.save('val_ious.npy', np.array([iou.cpu().numpy() for iou in val_ious]))

# Usage
# ... (example usage abridged)
• Ensure that the training and validation losses and IoU values are stored after each epoch.
• Expect a relatively large and quick decrease in the training loss in the first few epochs, and a consistent increase in the validation IoU value as the losses decrease.
• Check whether early stopping is triggered if the validation IoU value of subsequent epochs does not exceed the last best validation IoU value.
Using the function in Code Listing 3, the model was trained for 250 epochs with a patience of 30 epochs. The plot in Fig. 11 shows the change in training and validation loss over the epochs.
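A plot like the one in Fig. 11 can be reproduced from the .npy files saved by the training function. The following is a small sketch with our own function and file names; only the saved array files ('train_losses.npy', 'val_losses.npy') come from the listing above:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for scripts
import matplotlib.pyplot as plt

def plot_history(train_losses, val_losses, out_path="loss_curve.png"):
    """Plot training and validation loss over the epochs and save the figure."""
    epochs = np.arange(1, len(train_losses) + 1)
    plt.figure()
    plt.plot(epochs, train_losses, label="training loss")
    plt.plot(epochs, val_losses, label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.tight_layout()
    plt.savefig(out_path)
    plt.close()
    return out_path

# Usage with the arrays saved by train_model:
# plot_history(np.load("train_losses.npy"), np.load("val_losses.npy"))
```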
Functions for model evaluation and the required visualizations are generated by
ChatGPT 4 as per the prompt Fig. 9. At times, the dimensions of the masks might not
be handled correctly and in such a scenario, minor human intervention is required.
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# ... (test_model function abridged)
    with torch.no_grad():
        # ... (batch loop abridged)
            outputs = model(images)
            if outputs.shape[1] > 1:  # Multiclass segmentation
                outputs = torch.softmax(outputs, dim=1)
            else:  # Binary segmentation
                outputs = outputs  # Use logits directly

            # ... (listing abridged)
            test_loss.append(loss.item())
            test_iou.append(iou.cpu().numpy())  # Move IoU to CPU for numpy operations

    avg_test_loss = np.mean(test_loss)
    avg_test_iou = np.mean(test_iou)

# ... (visualization function abridged)
    for i in range(num_samples):
        idx = random.randint(0, len(test_dataset) - 1)
        image, true_mask = test_dataset[idx]
        image = image.to(device).unsqueeze(0)

        with torch.no_grad():
            output = model(image)
            if output.shape[1] == 1:
                output = torch.sigmoid(output)
            else:
                output = torch.softmax(output, dim=1)
            output = output.squeeze().cpu().numpy()

        # ... (plotting abridged)
        for j in range(4):
            axs[2 * i, j].axis('off')
            axs[2 * i + 1, j].axis('off')

    plt.tight_layout()
    plt.show()

# ... (overlay visualization function abridged)
    with torch.no_grad():
        output = model(image)
        if output.shape[1] == 1:
            output = torch.sigmoid(output)
        else:
            output = torch.softmax(output, dim=1)

# ... (listing abridged)

# Usage
# ... (example usage abridged)
The test_model function runs a forward pass of the trained model on the test dataset and computes the loss and the IoU metric. While this function posed no errors in trials, it is still suggested to check whether the test loss and IoU value are close to the ones observed for the epoch where the best model was saved. If there is a significant deviation, the first step is to check whether the test_model function has been implemented correctly. If the deviation persists despite a correctly implemented function, then there are possible issues in the training of the model. However, such errors did not arise during any of the trials.
Besides checking the loss and IoU values, a visual observation can also provide a good idea of whether the model is performing well. Here, the visualization functions are useful. In Fig. 12a and b, the predicted masks and the true masks can be observed for two models: one trained for 50 epochs and the other trained for 250 epochs. The prediction accuracy can also be checked by overlaying the predicted masks over the true masks and marking the false positive and false negative pixels. Figure 13a and b show this visualization for the model trained for 50 epochs and the one trained for 250 epochs.
Fig. 12 True and predicted masks for a model trained for different numbers of epochs; panel (a) 50 epochs, panel (b) 250 epochs. The first row in a and b each displays the true masks and the second row in a and b shows the predicted masks for the same image
4 Discussion
Fig. 13 Prediction accuracy visualization for the model trained for different numbers of epochs; panel (a) 50 epochs. False negative pixels are colored pink and false positive pixels are colored blue
Tasks such as the present segmentation workflow can be hugely aided by LLMs like ChatGPT 4. However, if the solution approach is already decided, then the LLM should be instructed with the details and specifics required to expect desirable outcomes. Here, the solution approach follows the example notebook multiclass_segmentation_example.ipynb provided with [6], and thus any room for assumptions must be eliminated by providing detailed instructions. With appropriate prompts, it can be observed that ChatGPT 4 provides suitable Python code for executing the task. The generated code had rare instances of syntax errors and a few logical errors. In this section, these errors are discussed. In addition, the changes made to the prompts and directly to the generated code to avoid these errors are also included. The errors are further categorized as 'major errors' and 'minor errors'.
• Loading the data correctly: The chatbot must be provided with the desired augmentation parameters, along with the correct mean and standard deviation values to normalize the images. The prompts in Figs. 3 and 4 ensure that the images are loaded with the desired augmentations and normalization.
• Assigning masks to pixels: In the original data, the pixels are annotated in three different colors to indicate the three phases in the microstructure. Without specific instructions for extracting the masks, ChatGPT 4 converts all the masks to grayscale and then loads them into the data for performing the segmentation task. Furthermore, ChatGPT 4 often overlooked the instruction to update the matrix mask by assigning all the remaining pixels to it after the secondary and tertiary masks had been assigned. The result was that the matrix phase comprised only the pixels that were black. These errors result in incorrect labels for the sample images, which eventually hinders the model training.
As a fix to this issue, very specific instructions are provided in the prompt in Fig. 3 to extract masks from the directory and to assign them to the pixels. In case the specific instruction for updating the matrix mask after assigning all other masks to pixels was overlooked, the same prompt was provided to the chatbot again. Usually, after providing the same prompt again, all the specific details are considered in the code generation.
• Sticking to old versions of imported library methods: In some specific cases, e.g. when using the IoU metric from the segmentation_models_pytorch library, ChatGPT 4 used the syntax of older versions. Until the latest syntax of the functions was provided explicitly through an example usage from the library documentation, ChatGPT 4 continued to use the older version. This is possibly because LLMs such as ChatGPT 4 are not retrained frequently, and the current version may have been trained when only older versions of these libraries existed.
By just providing an example usage in the prompt in Fig. 6, ChatGPT 4 could adapt its code response to the latest syntax of the IoU metric.
• Model checkpointing: Despite specific instructions for model checkpointing in the prompt in Fig. 8, ChatGPT 4 missed including regular model checkpointing after a given number of epochs. This step is crucial if the training takes place over a high number of epochs and requires long training periods.
• Handling array dimensions: An instruction to handle array and tensor dimensions carefully is included in the system prompt in Fig. 2. However, despite such an instruction, the code had some cases of dimension errors. Another important step to reduce the number of dimension errors is to specify the shape of the image and mask files in the prompt. This reduces the frequency of dimension errors, especially in loading the data. Finally, ChatGPT 4 is instructed to raise errors if the dimensions are inappropriate, e.g. in the train_model function, before passing the masks and the outputs to the loss functions.
• Moving variables to CPU before typecasting to NumPy arrays: Since the model
is allowed to work on the GPU for training, the PyTorch tensors and the model are
all mapped to the GPU. However, when certain functions or visualizations require
the variables as NumPy arrays, the PyTorch tensors are not always moved to the
CPU before converting them to NumPy arrays.
By providing specific instructions in the prompts, e.g. in prompts Figs. 2, 8 and 9,
this error could be eliminated.
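A small helper makes this conversion safe regardless of where the tensor lives. This is a sketch with our own function name; the generated code calls .cpu() inline instead:

```python
import torch

def to_numpy(t):
    # detach from the autograd graph, move to the CPU (a no-op for CPU
    # tensors), then convert; calling .numpy() directly on a CUDA tensor fails
    return t.detach().cpu().numpy()
```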
• Reshaping PyTorch tensor dimensions: Two methods are commonly used to reshape PyTorch tensors: torch.reshape and torch.Tensor.view. The latter is not applicable to non-contiguous tensors, i.e. it is not generally applicable. This caused errors in the loss function for the combined Dice and BCE losses, because ChatGPT 4 often overlooked the instruction to use torch.reshape.
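The difference is easy to demonstrate: a transposed tensor is non-contiguous, so view fails where reshape succeeds.

```python
import torch

t = torch.arange(6).reshape(2, 3).t()  # transposing makes the tensor non-contiguous
try:
    t.view(-1)          # view requires contiguous memory
    view_worked = True
except RuntimeError:
    view_worked = False

flat = torch.reshape(t, (-1,))  # reshape copies the data if necessary
```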
5 Conclusion
References
1. Akers, S., Kautz, E., Trevino-Gavito, A., Olszta, M., Matthews, B., Wang, L., Du, Y. & Spurgeon,
S. Rapid and flexible segmentation of electron microscopy data using few-shot machine learning.
Npj Computational Materials. 7, 187 (2021), https://doi.org/10.1038/s41524-021-00652-z
2. Stan, T., Thompson, Z. & Voorhees, P. Building towards a universal neural network to segment
large materials science imaging datasets. Developments In X-Ray Tomography XII. 11113 pp.
111131G (2019), https://doi.org/10.1117/12.2525290
3. Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H. & He, Q. A Comprehensive
Survey on Transfer Learning. (2020)
4. He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. (2015)
5. Zhou, Z., Siddiquee, M., Tajbakhsh, N. & Liang, J. UNet++: A Nested U-Net Architecture for
Medical Image Segmentation. (2018)
6. Stuckner, J., Harder, B. & Smith, T. Microstructure segmentation with deep learning encoders
pre-trained on a large microscopy dataset. Npj Computational Materials. 8, 200 (2022),
https://doi.org/10.1038/s41524-022-00878-5
Elastic Wave Propagation
Abstract This chapter investigates the ability of ChatGPT to generate code in the
field of Computational Materials Science, specifically using GPT-4. The focus of this
study is on solving the wave equation in 2D, considering elastic wave propagation,
and visualizing the results using Python with a finite difference time domain (FDTD)
scheme. The wave equation is fundamental in understanding how waves, such as
sound or seismic waves, travel through different media. The FDTD method is a
numerical technique used to model wave propagation by discretizing both time and
space. This involves setting up a computational grid, applying initial and boundary
conditions, and iteratively solving the wave equation to simulate the wave’s behavior
over time. Visualization of the results helps in interpreting the physical phenomena
and verifying the accuracy of the computational model.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 181
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_8
182 M. S. Qureshi and B. Eidel
Finite difference methods are numerical techniques that approximate derivatives by finite differences. The FDTD method was introduced by Kane S. Yee for numerically solving the time-dependent Maxwell's equations [2]. The method discretizes the spatial and temporal partial derivatives based on central-difference approximations [5]. Applying the FDTD approximations to (1), we obtain, exemplarily for the derivatives of the displacement component u_x,

\frac{\partial^2 u_x}{\partial x^2} \approx \frac{u_x(x+h, y; t) - 2 u_x(x, y; t) + u_x(x-h, y; t)}{h^2},   (2)

\frac{\partial^2 u_x}{\partial y^2} \approx \frac{u_x(x, y+h; t) - 2 u_x(x, y; t) + u_x(x, y-h; t)}{h^2},   (3)

\frac{\partial^2 u_x}{\partial t^2} \approx \frac{u_x(x, y; t+\Delta t) - 2 u_x(x, y; t) + u_x(x, y; t-\Delta t)}{\Delta t^2}.   (4)
Without loss of generality, we consider a square domain with a uniform spatial grid of size h both in x and y direction and a uniform temporal discretization with constant time step size Δt. Using (2), (3), (4) in (1) along with the notations u(x, y; t) =: u^n_{x,y}, u(x, y; t ± Δt) =: u^{n±1}_{x,y}, u(x ± h, y; t) =: u^n_{x±1,y}, u(x, y ± h; t) =: u^n_{x,y±1}, where n refers to the current time t_n and n ± 1 to the times t_n ± Δt, we obtain

\frac{1}{\Delta t^2}\left(u^{n+1}_{x,y} - 2 u^n_{x,y} + u^{n-1}_{x,y}\right) = \frac{c^2}{h^2}\left(u^n_{x+1,y} + u^n_{x-1,y} + u^n_{x,y+1} + u^n_{x,y-1} - 4 u^n_{x,y}\right).   (5)
Solving (5) for the displacement at the new time step yields the explicit update formula

u^{n+1}_{x,y} = \frac{c^2 \Delta t^2}{h^2}\left(u^n_{x+1,y} + u^n_{x-1,y} + u^n_{x,y+1} + u^n_{x,y-1} - 4 u^n_{x,y}\right) + 2 u^n_{x,y} - u^{n-1}_{x,y}.   (6)
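The update formula (6) translates directly into a vectorized NumPy sketch; the function name is ours and does not appear in the generated listings:

```python
import numpy as np

def fdtd_step(u, u_prev, c, dt, h):
    """One explicit FDTD update of the interior points according to Eq. (6)."""
    u_next = np.zeros_like(u)
    # discrete Laplacian stencil on the interior points
    lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
           - 4.0 * u[1:-1, 1:-1])
    u_next[1:-1, 1:-1] = ((c * dt / h) ** 2 * lap
                          + 2.0 * u[1:-1, 1:-1] - u_prev[1:-1, 1:-1])
    return u_next
```

The boundary values are left at zero here; they are set separately by the chosen boundary condition.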
Furthermore, initial conditions (ICs) and boundary conditions (BCs) have to be defined for a full-fledged initial boundary value problem (IBVP); the numerical solution scheme requires the choice of the discretization parameters Δx, Δy and Δt.
Two types of excitation functions are considered for defining the ICs in the domain.
• Sinusoidal function

u(t) = A \sin(2\pi f t + \phi),   (7)

where A is the amplitude, f is the frequency and φ is the phase angle, see Fig. 1 (bottom).
• Gaussian function

u(t) = a \exp\left(-\frac{(t-b)^2}{2 c^2}\right),   (8)

where a is the height of the curve's peak, b is the center of the peak, and c is the standard deviation, see Fig. 1 (top).
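Both excitation functions (7) and (8) can be sketched as a single Python helper; the parameter defaults below are illustrative, not taken from the problem tables:

```python
import numpy as np

def excitation(t, kind="sinusoidal", A=80.0, f=0.04, phi=0.0,
               a=1.0, b=20.0, c=5.0):
    if kind == "sinusoidal":
        return A * np.sin(2.0 * np.pi * f * t + phi)         # Eq. (7)
    return a * np.exp(-((t - b) ** 2) / (2.0 * c ** 2))      # Eq. (8)
```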
• Mur or absorbing BC, first introduced for the time-dependent Maxwell equations [3]: total absorption at the boundaries, no reflection of the waves. In the discretized setting, the Mur BCs read [7]

u^{n+1}_{0,y} = u^n_{1,y} - \frac{h - c\Delta t}{h + c\Delta t}\left(u^{n+1}_{1,y} - u^n_{0,y}\right),   (10)

u^{n+1}_{N,y} = u^n_{N-1,y} + \frac{h - c\Delta t}{h + c\Delta t}\left(u^n_{N,y} - u^{n+1}_{N-1,y}\right),   (11)

u^{n+1}_{x,0} = u^n_{x,1} - \frac{h - c\Delta t}{h + c\Delta t}\left(u^{n+1}_{x,1} - u^n_{x,0}\right),   (12)

u^{n+1}_{x,N} = u^n_{x,N-1} + \frac{h - c\Delta t}{h + c\Delta t}\left(u^n_{x,N} - u^{n+1}_{x,N-1}\right).   (13)

Here, (10) refers to waves travelling in the −x direction, i.e. towards the left side of the square domain, (11) to waves travelling in the +x direction (right side), (12) to waves travelling in the −y direction (bottom side), and (13) to waves travelling in the +y direction (top side).
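Equations (10)-(13) map onto the four edges of the grid. The following is a sketch with our own function name, for a square field of shape (N+1, N+1):

```python
def apply_mur_bc(u_next, u, c, dt, h):
    """Apply the first-order Mur absorbing BC, Eqs. (10)-(13), in place."""
    k = (h - c * dt) / (h + c * dt)
    u_next[0, :]  = u[1, :]  - k * (u_next[1, :]  - u[0, :])    # Eq. (10), left
    u_next[-1, :] = u[-2, :] + k * (u[-1, :] - u_next[-2, :])   # Eq. (11), right
    u_next[:, 0]  = u[:, 1]  - k * (u_next[:, 1]  - u[:, 0])    # Eq. (12), bottom
    u_next[:, -1] = u[:, -2] + k * (u[:, -1] - u_next[:, -2])   # Eq. (13), top
    return u_next
```

For c·Δt = h the coefficient k vanishes and each boundary value simply takes the previous value of its inner neighbor, i.e. the wave leaves the domain in one step.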
For numerical solutions of PDEs involving explicit time integration, the Courant-Friedrichs-Lewy (CFL) condition is a necessary condition for convergence. As a consequence of the explicit FDTD time integrator, the time step must stay below a certain upper bound for a fixed spatial increment; here, with c_x and c_y the wave velocities in x and y direction, the CFL condition for a uniform spatial grid size h reads, according to [6],

\frac{c_x \Delta t}{h} + \frac{c_y \Delta t}{h} \leq C_{max},   (14)

which is met for a value of C_max = 1. The equality in (14) is exploited to calculate the critical time step Δt_max. To satisfy the CFL condition, the time step Δt must be equal to or smaller than Δt_max.
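The critical time step follows from (14) with equality; a one-line helper (our naming) makes this explicit:

```python
def critical_time_step(cx, cy, h, c_max=1.0):
    """Largest stable time step from the CFL condition (14), using equality."""
    return c_max * h / (cx + cy)
```

For the homogeneous case c_x = c_y = c this reduces to Δt_max = h / (2c), matching the expression dt = dx / (max(c1, c2) * 2) used in Code Listing 2.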
Once the updated displacement field is computed for the entire domain, elastic strains are calculated using the gradient of the displacement field, and, based on these, stresses using Hooke's law. Isotropic, linear elastic material behavior is assumed to hold.

\varepsilon_x = \frac{\partial u_x}{\partial x}, \qquad \varepsilon_y = \frac{\partial u_y}{\partial y}, \qquad \gamma_{xy} = \frac{\partial u_y}{\partial x} + \frac{\partial u_x}{\partial y}.   (15)

Here, ε_x and ε_y are the normal strain components, and γ_xy is the shear strain.
\sigma_{xx} = \frac{E}{1-\nu^2}\left(\varepsilon_x + \nu \varepsilon_y\right),   (16)

\sigma_{yy} = \frac{E}{1-\nu^2}\left(\varepsilon_y + \nu \varepsilon_x\right),   (17)

\sigma_{xy} = \frac{E}{2(1+\nu)}\,\gamma_{xy}.   (18)

Here, σ_xx and σ_yy are the normal stresses, and σ_xy is the shear stress; E is the Young's modulus and ν is the Poisson's ratio of the material. From the normal stresses and the shear stress, the principal stresses are calculated according to

\sigma_1, \sigma_2 = \frac{\sigma_{xx} + \sigma_{yy}}{2} \pm \sqrt{\left(\frac{\sigma_{xx} - \sigma_{yy}}{2}\right)^2 + \sigma_{xy}^2}.   (19)
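Equation (19) lends itself to vectorized evaluation over the whole stress field; a short sketch with our own function name:

```python
import numpy as np

def principal_stresses(sxx, syy, sxy):
    """Principal stresses sigma_1, sigma_2 according to Eq. (19)."""
    sigma_avg = (sxx + syy) / 2.0
    radius = np.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return sigma_avg + radius, sigma_avg - radius
```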
1.7 Problems
The proper functionality of the generated code for the wave equation model and the FDTD solution scheme shall be tested on two problems. The problem defined in Table 1 has a corresponding reference code available on the internet [7]; hence, all the parameters and values are selected to match the reference code. Here, the time step size is significantly lower than the critical time step. Since this problem merely aims at a quantitative, numerical analysis of wave propagation, we can safely specify the parameters by their numerical values alone, without explicit units. The output from GPT-4's code and the reference code is compared and discussed in Sect. 4.
The problem in Table 2 adds further complexity to the previous problem in terms of a heterogeneous domain, two types of excitation functions, strain and stress calculations, and two types of stress plots. Here, the time step size is equal to the critical time step.
2 Prompt
The final prompt is given in Fig. 2. It is used to generate Code Listing 1. The proper labelling of the plots had to be done manually.
For Problem 2, the stress study, the aimed-for level of flexibility could not be captured in a single prompt. Code Listing 2 is generated iteratively. In addition, manual intervention is carried out to label the plots.
3 Code Listing
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Simulation parameters
size = 200   # domain size
dx = dy = 1.0  # spatial discretization
dt = 1.0     # time step
c = 0.2      # wave speed

# ... (listing abridged)

# Initialize fields
u = np.zeros((size, size))       # current wave field
u_prev = np.zeros((size, size))  # previous wave field
u_next = np.zeros((size, size))  # next wave field

# Source parameters
source_amplitude = 80
source_frequency = 0.04
source_position = (100, 100)

def source(t):
    """ Returns the source value at time t """
    return source_amplitude * np.sin(2 * np.pi * source_frequency * t)

# ... (time-update function abridged)

    # Apply source
    u_next[source_position] += source(t)

# ... (listing abridged)

def animate(i):
    """ Animation function """
    update(u, u_prev, i, boundary_condition='Mur')  # Toggle 'Mur' or 'Dirichlet'
    img.set_data(u)
    img.set_clim(-source_amplitude, source_amplitude)
    return img,

# ... (listing abridged)

# Create animation
ani = FuncAnimation(fig, animate, frames=1000, interval=50, blit=True)
# ... (Code Listing 2 begins; first lines abridged)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation, rc

# ... (listing abridged)

domain_size = 200
dx = 1.0
dy = 1.0
dt = dx / (max(c1, c2) * (2))  # CFL condition for stability

calculate_principal_stresses = True
boundary_type = 'Mur'            # Can be 'Mur' or 'Dirichlet'
excitation_type = 'gaussian'     # Can be 'sinusoidal' or 'gaussian'

# Initialize fields
x = np.linspace(0, domain_size-1, domain_size)
y = np.linspace(0, domain_size-1, domain_size)
u = np.zeros((domain_size, domain_size))
u_prev = np.zeros((domain_size, domain_size))
u_next = np.zeros((domain_size, domain_size))
stress_xx = np.zeros((domain_size, domain_size))
stress_yy = np.zeros((domain_size, domain_size))
stress_xy = np.zeros((domain_size, domain_size))

# Source definitions
sources = [{"position": (100, 100), "frequency": 0.04, "amplitude": 80}]

# ... (excitation function abridged)
    if excitation_type == 'sinusoidal':
        omega = 2 * np.pi * frequency
        return amplitude * np.sin(omega * t)

# ... (time-update function abridged)

    # Apply sources
    for source_info in sources:
        src_x, src_y = source_info["position"]
        u_next[src_x, src_y] += source(t, source_info["frequency"],
                                       source_info["amplitude"], excitation_type)

    # ... (listing abridged)
    u_prev, u = u, u_next.copy()

    # Recalculate stresses
    grad_u_x = np.gradient(u, axis=0)  # Gradient along x-axis
    grad_u_y = np.gradient(u, axis=1)  # Gradient along y-axis
    for i in range(1, domain_size-1):
        E, nu, _ = get_properties(i)
        stress_xx[i, 1:-1] = E / (1 - nu**2) * (grad_u_x[i, 1:-1] + nu * grad_u_y[i, 1:-1])
        stress_yy[i, 1:-1] = E / (1 - nu**2) * (grad_u_y[i, 1:-1] + nu * grad_u_x[i, 1:-1])
        stress_xy[i, 1:-1] = E / (2 * (1 + nu)) * (grad_u_x[i, 1:-1] + grad_u_y[i, 1:-1])

    if calculate_principal_stresses:
        # Calculate principal stresses
        sigma_avg = (stress_xx + stress_yy) / 2
        sigma_diff = (stress_xx - stress_yy) / 2
        R = np.sqrt(sigma_diff**2 + stress_xy**2)
        principal_stress_1 = sigma_avg + R
        principal_stress_2 = sigma_avg - R
        return principal_stress_1, principal_stress_2
    else:
        return stress_xx, stress_yy, stress_xy
For the displacement study in Problem 1, a reference code [7] (also based on the FDTD method) is used to test the output of GPT-4's code. The problem parameters listed in Table 1 are chosen to be exactly the same for a proper comparison.
4.1 Test 1
the cases. Hence, it is concluded that the ICs, BCs and the FDTD method are working
as intended.
4.2 Test 2
For the stress study in Problem 2, the solution shall be tested for plausibility for a layered domain in terms of stresses and strains. Note that there are no units associated with the colormap. In the code, the minimum and maximum of the colormap are defined as the minimum and maximum of the respective stress, but due to the dynamic nature of the plot, the program is unable to determine a useful range for the colormap. From Code Listing 2, the correct calculation of strain and stress is easily verified. Furthermore, Fig. 5 indicates that the stress field travels faster in
Fig. 5 Stress contour plots for Gaussian excitation along with Mur BCs and a higher speed of
sound in the lower half of the simulation domain
the lower half (chromium) of the domain than in the upper half (aluminum), which
is consistent with chromium having a higher wave speed constant, 6280 m/s, than
aluminum, 5090 m/s.
These conclusions can further be applied to the other excitation function and BC.
5 Discussion
• Finally, modify the prompt with the knowledge gained from the iterations. Ask GPT-4 to focus especially on areas where the most frequent errors occurred. Briefly explain the structure of the required code in terms of its different functions etc., using the knowledge of the code with correct output.
The methodology mentioned above works for the displacement study of Problem 1, but does not work properly for the stress study in a heterogeneous domain referred to as Problem 2; the reason is that the multitude of requested features seemingly cannot be captured in a single prompt in a way the chatbot can process.
For that reason, Problem 2 calls for a prompt engineering strategy in which all the required features of the code are communicated over several prompts. All attempts to generalize these successive prompts have failed, since they resulted in a large variety of different code outputs generated by GPT-4 in each try. It is this kind of instability, where the same prompt does not yield a unique code of guaranteed quality, which forbids presenting a single prompt for Problem 2 in this chapter. Instead, we restrict ourselves to the generated code that finally solved the problem.
Some of the important aspects that are noticed while working on the prompts,
generating code and testing are:
• GPT-4 frequently uses variables that were not defined in the code. Hence, a
NameError shows up while running the code.
• When a detailed prompt is provided to GPT-4, it sometimes ignores certain features completely. Sometimes it defines a certain function only as a comment line describing its usage and leaves the implementation empty.
• Another common error of GPT-4 is not defining the origin of the domain correctly. The x-axis of the plot would start from the bottom left corner of the square and the y-axis from the top left corner of the square.
• GPT-4 makes a lot of errors regarding the animation part of the code. A lot of
debugging and iterations are required to make that part of the code function as
intended.
• Even if an expression is explicitly defined, GPT-4 would sometimes use a different one. For example, the CFL formula to calculate Δt would be explained in the prompt, but GPT-4 would use a different formula.
• A recurring issue is GPT-4 coming to a halt while generating code. The speed of
the text generation is very slow.
• Reproducibility remains a severe issue for GPT-4. While testing the prompt that generated Code Listing 1, the correct output is produced most of the time, but not always. In case there are some errors in the output code, regenerating the code usually removes those errors.
• GPT-4 makes a lot of errors while implementing absorbing BC, which would result
in reflection of waves at the boundaries. To fix this, the discretized formula (10)
is included in the prompt as a reference, which resolved the issue.
200 M. S. Qureshi and B. Eidel
6 Conclusion
GPT-4 was used to numerically solve the 2D wave equation with a finite difference time domain scheme. As it turned out, GPT-4 exhibits enough domain expertise both
for the wave problem and the numerical method to generate a correct code solution
for a wave propagation displacement study along with absorbing as well as reflecting
boundary conditions. In a problem of wave propagation in a heterogeneous medium
along with stress computation, hence a task of increased complexity, GPT-4 revealed
deficits which called for manual/human intervention. Despite its demonstrated per-
formance in code generation, GPT-4 showed its current limitations in this project;
it suffers (i) from unstable code generation, where exactly the same prompt results
in different outputs, and (ii) from the inability to account for all the different tasks
in a prompt for complex problems. These disadvantages make formulating single
prompts for complex problems difficult, and make successive prompts and manual
intervention unavoidable.
References
1. Kashefi, A., Mukerji, T. ChatGPT for programming numerical methods. Journal of Machine
Learning for Modeling and Computing. 4 pp. 1–74 (2023), http://dx.doi.org/10.1615/
JMachLearnModelComput.2023048492
2. Yee, K. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. IEEE Transactions on Antennas and Propagation. 14 pp. 302–307 (1966), http://dx.doi.org/10.1109/TAP.1966.1138693
3. Mur, G. Absorbing Boundary Conditions for the Finite-Difference Approximation of the Time-
Domain Electromagnetic-Field Equations. IEEE Transactions on Electromagnetic Compatibil-
ity. EMC-23 pp. 377–382 (1981), http://dx.doi.org/10.1109/TEMC.1981.303970
4. Agarwal, R.P., O’Regan, D. Two-Dimensional Wave Equation. In: Ordinary and Partial Differ-
ential Equations. Springer, New York, NY. pp. 292–299 (2009), https://doi.org/10.1007/978-0-
387-79146-3_37
5. Linge, S., Langtangen, H.P. Wave Equations. In: Finite Difference Computing with PDEs. Texts
in Computational Science and Engineering, vol 16. Springer, Cham. pp. 93–205 (2017), https://
doi.org/10.1007/978-3-319-55456-3_2
6. Wikipedia contributors. Courant–Friedrichs–Lewy condition, https://en.wikipedia.org/wiki/Courant-Friedrichs-Lewy_condition [Online; accessed July 2024]
7. Hill, C. The two-dimensional wave equation, https://scipython.com/blog/the-two-dimensional-wave-equation/ [Online; accessed July 2024]
Electromagnetic Wave Propagation
in Dielectric Media
1 Introduction
The study of electromagnetic wave propagation through various media is essential for
developments in telecommunications, radar, and electronic systems engineering. The
Finite-Difference Time-Domain (FDTD) method, pioneered by Yee in 1966, offers a
dynamic numerical technique for solving Maxwell’s equations in the time-domain.
This method is highly regarded for its direct algorithmic form, which is capable of
managing complex boundary conditions (BC) and varying media properties.
With the advancement of computational methods, FDTD has been widely applied
in conjunction with fluid dynamics concepts to enhance the study of wave propa-
gation in various media. Fluid dynamics principles, such as those encapsulated by
the Boltzmann equation, allow for the modeling of wave interactions in complex
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 201
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_9
202 T. V. Anvekar and B. Eidel
Fig. 1 Transition from 3D to 2D simulation for efficient analysis. The left image (a) shows a 3D simulation setup, while the right image (b) illustrates the 2D cross-sectional view used for the simulation with a cylindrical dielectric medium with a diameter d of 20 cm. The dielectric cylinder is positioned at the center of the domain. The Gaussian source emits in all directions.
accuracy and effectiveness. This investigation not only evaluates the reliability of AI in generating viable scientific computing algorithms but also examines its potential to simplify complex simulation tasks [1, 6, 7].
\nabla \cdot \mathbf{D} = \rho, \qquad (1)

\nabla \cdot \mathbf{B} = 0, \qquad (2)

\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad (3)

\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}, \qquad (4)

where E is the electric field, H is the magnetic field, D is the electric flux density, B is the magnetic flux density, ρ is the charge density, and J is the current density. These equations can be discretized for computational purposes using the FDTD method.
Fig. 2 Illustration of a Transverse Magnetic (TM) wave. The magnetic field is confined to the x-y plane, while the electric field has a component in the z direction.
visualized in Fig. 2. Maxwell's curl equations for this mode are

\frac{\partial H_x}{\partial t} = -\frac{1}{\mu}\frac{\partial E_z}{\partial y}, \qquad (5)

\frac{\partial H_y}{\partial t} = \frac{1}{\mu}\frac{\partial E_z}{\partial x}, \qquad (6)

\frac{\partial E_z}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y}\right), \qquad (7)

where ε is the permittivity and μ is the permeability of the medium. These equations are discretized in both time and space, allowing for the fields to be updated iteratively at each time step.
\mathrm{TE}: \quad E_z = 0, \quad H_x = H_y = 0, \qquad (8)

\mathrm{TM}: \quad H_z = 0, \quad E_x = E_y = 0. \qquad (9)
Considering the TM wave, the update equation for the electric field component E_z is

E_z^{n+1}(i,j) = E_z^{n}(i,j) + \frac{\Delta t}{\varepsilon}\left[\frac{H_y^{n}(i,j) - H_y^{n}(i-1,j)}{\Delta x} - \frac{H_x^{n}(i,j) - H_x^{n}(i,j-1)}{\Delta y}\right]. \qquad (10)
For the magnetic field components H_x and H_y the update equations are

H_x^{n+1/2}(i,j) = H_x^{n-1/2}(i,j) - \frac{\Delta t}{\mu}\,\frac{E_z^{n}(i,j+1) - E_z^{n}(i,j)}{\Delta y}, \qquad (11)

H_y^{n+1/2}(i,j) = H_y^{n-1/2}(i,j) + \frac{\Delta t}{\mu}\,\frac{E_z^{n}(i+1,j) - E_z^{n}(i,j)}{\Delta x}. \qquad (12)
Here, Δx and Δy represent the spatial steps in the x and y directions, respectively, while Δt denotes the time step size. The indices i and j refer to the grid points in space, and n indicates the time step. The magnetic field components H_x and H_y are staggered with respect to the electric field component E_z. This staggering is crucial for accurately calculating the fields at each time step, following the Yee cell structure, which is a standard approach in FDTD simulations. The interleaving of these fields, as depicted in Fig. 3, ensures that the electromagnetic fields propagate correctly throughout the computational domain. These update equations are applied iteratively to achieve this propagation.
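To make the staggered update concrete, the following sketch implements Eqs. (10)–(12) with vectorized NumPy slicing on a small grid; the grid size, material values, and the impulse source are illustrative assumptions, not the chapter's exact setup:

```python
import numpy as np

nx = ny = 100
dx = dy = 0.01
c0 = 3e8
eps = 8.854e-12 * np.ones((nx, ny))  # permittivity field (free space here)
mu = 4 * np.pi * 1e-7                # permeability (free space)
dt = 0.9 * dx / (c0 * np.sqrt(2))    # CFL-limited time step

Ez = np.zeros((nx, ny))
Hx = np.zeros((nx, ny))
Hy = np.zeros((nx, ny))

def fdtd_step(Ez, Hx, Hy):
    # Eq. (10): E-update from backward differences of H
    Ez[1:, 1:] += dt / eps[1:, 1:] * (
        (Hy[1:, 1:] - Hy[:-1, 1:]) / dx
        - (Hx[1:, 1:] - Hx[1:, :-1]) / dy)
    # Eqs. (11)-(12): H-updates from forward differences of Ez
    Hx[:, :-1] -= dt / mu * (Ez[:, 1:] - Ez[:, :-1]) / dy
    Hy[:-1, :] += dt / mu * (Ez[1:, :] - Ez[:-1, :]) / dx

Ez[nx // 2, ny // 2] = 1.0  # impulse excitation at the domain center
for _ in range(20):
    fdtd_step(Ez, Hx, Hy)
```

The slicing mirrors the staggering: E is updated from backward differences of H, H from forward differences of E, half a step apart in time.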
Fig. 3 Interleaving of the E and H fields for the two-dimensional TM formulation, sketch adopted from [4]
As illustrated in Fig. 4, the parameter distribution within the PML ensures that outgo-
ing waves are effectively absorbed. The overlapping regions at the corners indicate
where both sets of parameters intersect, which is crucial for minimizing reflections.
The parameters g_{i2}, g_{i3}, g_{j2}, g_{j3} are given by

g_{i2} = g_{j2} = \frac{1}{1 + \dfrac{\sigma \Delta t}{2\varepsilon_0}}, \qquad (14)

g_{i3} = g_{j3} = \frac{\dfrac{\sigma \Delta t}{2\varepsilon_0}}{1 + \dfrac{\sigma \Delta t}{2\varepsilon_0}}. \qquad (15)
Fig. 4 This figure illustrates the parameter distribution within the PML (Perfectly Matched Layer)
region of a 2D FDTD simulation. The overlapping regions at the corners indicate where both sets
of parameters intersect, ensuring effective absorption of outgoing waves
The total magnetic field in the y-direction can be evaluated recursively as follows

H_y^{n+1/2}\!\left(i+\tfrac{1}{2}, j\right) = g_{i3}\!\left(i+\tfrac{1}{2}\right) H_y^{n-1/2}\!\left(i+\tfrac{1}{2}, j\right) - 0.5\, g_{i2}\!\left(i+\tfrac{1}{2}\right)\left(\nabla \times \mathbf{E}\right) - g_{i1}(j)\, H_y^{n-1/2}\!\left(i+\tfrac{1}{2}, j\right). \qquad (18)
The effectiveness of the PML depends on the choice of parameters such as the con-
ductivity profile. A common approach is to use a polynomial grading of conductivity,
which gradually increases from zero at the interface to a maximum value at the outer
edge of the PML. The conductivity .σ in the PML can be expressed as [4]
\sigma(x) = \sigma_{\max}\left(\frac{x}{d_{\mathrm{PML}}}\right)^{m}, \qquad (22)
where d_PML is the thickness of the PML, σ_max is the maximum conductivity, and m is the grading order. By carefully selecting these parameters, the PML can effectively absorb outgoing waves over a broad range of angles and frequencies, ensuring minimal reflection and accurate simulation results.
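The polynomial grading of Eq. (22) can be sketched as follows; the function name and the cell-centred sampling are illustrative choices:

```python
import numpy as np

def pml_sigma_profile(n_cells, dx, sigma_max, m=3):
    # Eq. (22): sigma(x) = sigma_max * (x / d_PML)**m,
    # evaluated at cell centres through the PML depth.
    d_pml = n_cells * dx
    x = (np.arange(n_cells) + 0.5) * dx
    return sigma_max * (x / d_pml) ** m

sigma = pml_sigma_profile(n_cells=20, dx=0.01, sigma_max=0.1)
```

The conductivity rises smoothly from nearly zero at the interface to σ_max at the outer edge, which is what keeps the layer reflectionless in practice.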
where E_0 is the peak amplitude, t_0 is the time delay, σ is the pulse width, and f_c is the central frequency of the sinusoidal wave. The Gaussian envelope ensures that the pulse is localized in time, while the sinusoidal component determines its frequency content.
In the context of FDTD, the source can be introduced into the simulation domain
by updating the electric field component at a specific grid point. This allows the study
of the pulse’s interaction with various media and boundaries.
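The Gaussian-modulated sinusoid described above can be written as a small helper; the names are illustrative, and the soft-source injection shown in the comment mirrors the role of the source in the generated listing:

```python
import numpy as np

def gaussian_sine_pulse(t, t0, width, f_c, e0=1.0):
    # Gaussian envelope centred at t0, modulating a sinusoid of frequency f_c
    return e0 * np.exp(-((t - t0) ** 2) / width ** 2) * np.cos(2 * np.pi * f_c * t)

# Soft source: add the pulse value to Ez at one grid point every time step, e.g.
#   Ez[source_position] += gaussian_sine_pulse(n * dt, 4 * pulse_width, pulse_width, f0)
```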
This implies that changes in energy density within a volume are due to the
net flux of energy across the volume’s boundaries. The stabilization of energy in
the graph indicates conservation of internal energy after accounting for boundary
absorption [2].
In this work, a circular dielectric medium is introduced into the 2D FDTD computational domain to investigate the interaction of electromagnetic waves with a dielectric object, as illustrated in Fig. 1. The dielectric cylinder is characterized by a relative permittivity (ε_r) of 30 and a conductivity (σ) of 0.3 S m⁻¹. The cylinder, with a diameter of 20 cm, is used to represent both lossless and lossy linear isotropic homogeneous media (LIHMs).
A lossless linear isotropic homogeneous medium (LIHM) is one where the material has zero conductivity (σ = 0 S m⁻¹). In such a medium, electromagnetic waves can propagate without any energy loss. The material's properties, like the relative permittivity ε_r, affect the speed and direction of the wave, but not its amplitude.
In contrast, a lossy LIHM is a material with non-zero conductivity (σ > 0 S m⁻¹). In this case, the medium absorbs some of the electromagnetic wave's energy, leading to attenuation of the wave as it propagates. The dielectric cylinder in this work, characterized by σ = 0.3 S m⁻¹, is an example of a lossy medium, which allows the study of both idealized (lossless) and realistic (lossy) conditions.
The presence of the dielectric cylinder affects the propagation of the electromag-
netic waves, causing phenomena such as reflection, refraction, and scattering. These
effects are crucial for understanding the behavior of waves in real-world applications,
such as radar and wireless communications.
The update equations for the FDTD simulation in the presence of a dielectric
medium are modified to account for the material properties. For the electric field
component . E z , the update equation reads
E_z^{n+1}(i,j) = \frac{1 - \dfrac{\sigma \Delta t}{2\varepsilon}}{1 + \dfrac{\sigma \Delta t}{2\varepsilon}}\, E_z^{n}(i,j) + \frac{\Delta t/\varepsilon}{1 + \dfrac{\sigma \Delta t}{2\varepsilon}} \left[\frac{H_y^{n}(i,j) - H_y^{n}(i-1,j)}{\Delta x} - \frac{H_x^{n}(i,j) - H_x^{n}(i,j-1)}{\Delta y}\right], \qquad (24)
where ε is the permittivity of the dielectric medium, Δt is the time step, and Δx and Δy are the spatial steps in the x and y directions, respectively.
For the magnetic field components H_x and H_y, the update equations remain the same as in the free-space case, as the magnetic permeability is assumed to be that of free space (μ = μ₀)
H_x^{n+1/2}(i,j) = H_x^{n-1/2}(i,j) - \frac{\Delta t}{\mu_0}\,\frac{E_z^{n}(i,j+1) - E_z^{n}(i,j)}{\Delta y}, \qquad (25)

H_y^{n+1/2}(i,j) = H_y^{n-1/2}(i,j) + \frac{\Delta t}{\mu_0}\,\frac{E_z^{n}(i+1,j) - E_z^{n}(i,j)}{\Delta x}. \qquad (26)
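In practice, the fractions appearing in Eq. (24) are often precomputed as per-cell coefficients before the time loop; a sketch under the assumption of a uniform time step (the names are hypothetical):

```python
import numpy as np

eps0 = 8.854e-12
dt = 2.12e-11  # illustrative time step (s)

def lossy_update_coefficients(eps_r, sigma):
    # Eq. (24) rearranged as Ez_new = ca * Ez_old + cb * (discrete curl of H)
    eps = eps_r * eps0
    loss = sigma * dt / (2.0 * eps)
    ca = (1.0 - loss) / (1.0 + loss)
    cb = (dt / eps) / (1.0 + loss)
    return ca, cb

ca, cb = lossy_update_coefficients(eps_r=30.0, sigma=0.3)
```

For σ = 0 the coefficient ca reduces to 1 and the lossless update of Eq. (10) is recovered; for σ > 0, ca < 1 and the field is damped each step.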
By integrating the circular dielectric medium into the FDTD simulation, the study
aims to provide insights into the complex interactions of electromagnetic waves with
dielectric materials. Understanding these interactions is essential for the design and
analysis of various electromagnetic systems, such as radar and wireless commu-
nication systems. The simulation helps to visualize and quantify the effects of the
dielectric medium on wave propagation, enabling better prediction and optimization
of system performance [4].
The algorithm for simulations of electromagnetic wave propagation in dielectric
media is given in the Algorithm Box 1.
17 – Store results every 10 steps and calculate total energy: if n mod 10 == 0 then
18 – Store current E_z field for animation
19 – Compute and store total energy in the system
20 end
21 Visualization:
22 – Create 2D animation of E_z over time
23 – Generate 3D surface plots of E_z at selected time steps
24 – Plot total electromagnetic energy as a function of time
Table 1 Summary of constants, material properties, source configuration, and PML BCs

Parameter | Value/Type | Unit
Constants
Speed of light (c0) | 3 × 10^8 | m s^-1
Vacuum permittivity (eps0) | 8.854 × 10^-12 | F m^-1
Vacuum permeability (mu0) | 4π × 10^-7 | H m^-1
Material properties
Relative permittivity (eps_r) | 30 | –
Conductivity (sigma) | 0.30 | S m^-1
Diameter of dielectric cylinder | 0.20 | m
Simulation parameters
Spatial step in x (dx) | 0.01 | m
Spatial step in y (dy) | 0.01 | m
Time step (dt) | 0.9 × dx / (c0 √2) | s
Domain length in x (Lx) | 2.0 | m
Domain length in y (Ly) | 2.0 | m
Position of dielectric center | (1.0, 1.0) | m
Total simulation time (T) | 1 × 10^-7 | s
Source configuration
Position (Nx/4, Ny/2) | (0.5, 1.0) | m
Pulse width | 1 × 10^-10 | s
Frequency (f0) | 5 × 10^6 | Hz
PML BC
Thickness | 20 | cells
Max conductivity (sigma_max) | 0.1 | S m^-1
2 Prompt
• FDTD Setup: Instructions to define the spatial and time steps, initialize the field
arrays, and apply the FDTD update equations for both electric and magnetic fields.
Instructions to incorporate the material properties and PML BC into the simulation.
• Source Implementation: Instructions to define the source configuration, including
its position, pulse width, and frequency.
• Post-Processing: Instructions to compute the outputs like electric and magnetic field distributions, visualize the fields over time, and analyze the wave interactions.
3 Code Listing
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

# Constants
c0 = 3e8                # Speed of light in vacuum
eps0 = 8.854e-12        # Permittivity of free space
mu0 = 4 * np.pi * 1e-7  # Permeability of free space

# Simulation parameters
dx = 0.01  # Spatial step (m)
dy = 0.01  # Spatial step (m)
dt = 0.9 * dx / (c0 * np.sqrt(2))  # Time step (s)
Lx = 2.0   # Domain length in x (m)
Ly = 2.0   # Domain length in y (m)
T = 1e-7   # Total simulation time (s)

# ...

for i in range(Nx):
    for j in range(Ny):
        if (i - center_x)**2 + (j - center_y)**2 <= radius**2:
            eps_r[i, j] = 30
            sigma[i, j] = 0.3

# Field arrays
Ez = np.zeros((Nx, Ny))
Hx = np.zeros((Nx, Ny))
Hy = np.zeros((Nx, Ny))

# Source parameters
source_position = (Nx // 4, Ny // 2)
pulse_width = 1e-10

f0 = 5e6  # Frequency (Hz)

# ...

def run_simulation():
    frames = []
    for n in range(Nt):
        t = n * dt
        pulse = np.exp(-((t - 4 * pulse_width) ** 2) / (pulse_width ** 2)) \
            * np.cos(2 * np.pi * f0 * t)
        Ez[source_position] += pulse

        # ...

        if n % 10 == 0:
            frames.append(np.copy(Ez))

    return frames

def animate(frames, title):
    fig, ax = plt.subplots()

    def update(frame):
        ax.clear()
        im = ax.imshow(frame.T, cmap=cm.viridis, vmin=-0.01, vmax=0.01,
                       animated=True)
        ax.set_title(title)
        return [im]

# ...

    ax.set_title(title)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Electric Field (Vm$^{-1}$)')

    plt.show()

# ...

        if n % 10 == 0:
            energy = calculate_energy(Ez, Hx, Hy, eps_r, dx, dy)
            energies.append(energy)
    return energies

# Run the simulations and plotting
energies = run_simulation_with_energy()
plt.figure()
plt.plot(np.arange(0, Nt, 10) * dt, energies)
plt.xlabel('Time (s)')
plt.ylabel('Total Electromagnetic Energy (J)')
plt.title('Energy Conservation in 2D FDTD Simulation')
plt.grid(True)
plt.show()

# Run simulation and create animation, create 3D surface plots
frames = run_simulation()
frames_1 = run_simulation_3D_plot()

time_labels = [100, 500, 1000, 1200]
animate(frames, '2D FDTD Simulation of Gaussian Pulse')
for frame_1, label in zip(frames_1, time_labels):
    plot_3d_surface(frame_1, f'2D FDTD Simulation at T={label} s')

Listing 1 Output by ChatGPT for solving electromagnetic wave propagation in a dielectric medium (excerpt; elided passages are marked with # ...)
The most important part of the work is to verify the program generated by ChatGPT-4
and ensure that the results conform with the solution of the problem stated in the
problem description.
This work’s verification is done by comparing the FDTD simulation results with
those presented in other technical papers and through self-verification. The FDTD
program generated by ChatGPT is run on an Integrated Development Environment
(IDE). It generates visualizations of the electromagnetic wave propagation, including
reflection, refraction, and absorption, which are compared to the expected results.
The comparison involves examining the accuracy of the field distributions and energy
conservation over time.
Next, the verification by comparison with similar graphs from published research is discussed. The results from the FDTD simulation are compared to those from established research to ensure the accuracy and validity of the ChatGPT-generated code.
The result from the FDTD simulation at time step t = 1200 aligns well with the outcomes observed in similar studies, as discussed in [5]. The symmetry, wave interaction pattern, and boundary conditions all reflect the accuracy of the simulation generated by the FDTD program.
The image from [5] shows a simulation of a plane wave impinging on a dielectric cylinder at time step t = 75, while Fig. 5 shows the FDTD simulation of a Gaussian wave at time step t = 1200. Despite different waveforms and simulation parameters, both results demonstrate consistent interactions with the dielectric media, enabling a comparative analysis of wave reflection and refraction patterns.
• Wave Interaction Pattern: Both visualizations show two distinct peaks with a
valley in between, indicating similar wave interaction patterns with the medium.
• Symmetry: The wave propagation appears symmetric in both images, suggesting
correct modeling of the medium’s properties and boundaries.
• Boundary Conditions: Both simulations use absorbing boundary conditions
(PML), as evident from the absence of reflected waves at the edges.
These similarities validate the FDTD simulation approach employed and confirm its
reliability in modeling electromagnetic wave interactions in dielectric media.
The generated program includes its own validation of the results via energy conservation, as discussed below.
The graph in Fig. 6 shows the total electromagnetic energy as a function of time.
The corresponding snapshots in Fig. 7 visually depict the wave’s interaction with the
dielectric medium at different time steps. Key observations include:
Fig. 6 Total electromagnetic energy as a function of time in a 2D FDTD simulation. The plot
demonstrates energy conservation within the computational domain and validates the accuracy of
the FDTD simulation over time
Fig. 7 These time-lapse visualizations illustrate the reflection, refraction, and absorption of the
electromagnetic wave as it interacts with the dielectric medium and exits the simulation domain.
The progression from the initial pulse propagation to wave dissipation provides insights into energy
conservation and wave dynamics within the medium
• Initial Energy Peak: The energy starts at a peak value due to the initial source
input, as seen in Fig. 7a, where the wave is centered within the dielectric medium.
• Energy Decay: Rapid decay in the initial phase, likely due to absorption by the
PML boundaries, can be correlated with Fig. 7c and d, where the wave begins to
exit the medium and is partially absorbed by the boundaries.
• Energy Conservation: The energy drops to nearly zero after the initial decay, indicating that the wave has left the simulation box and a steady state is reached in which the internal energy is conserved. This is visually supported by Fig. 7f, where the wave has exited the simulation domain, leaving behind minimal residual energy.
According to the Poynting theorem, the energy density u and the Poynting vector S satisfy

\frac{\partial u}{\partial t} + \nabla \cdot \mathbf{S} = -\mathbf{J} \cdot \mathbf{E}. \qquad (27)
For a lossless medium with no free currents (. J = 0), this simplifies to
\frac{\partial u}{\partial t} + \nabla \cdot \mathbf{S} = 0. \qquad (28)
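The discrete energy used for the check in Fig. 6 can be evaluated as the cell-wise sum of electric and magnetic energy densities; this is a sketch, and the function name and signature are assumptions, not necessarily those of the generated calculate_energy:

```python
import numpy as np

eps0 = 8.854e-12
mu0 = 4 * np.pi * 1e-7

def total_energy(Ez, Hx, Hy, eps_r, dx, dy):
    # u = 0.5 * (eps * Ez^2 + mu0 * (Hx^2 + Hy^2)), integrated over the grid
    u = 0.5 * (eps_r * eps0 * Ez ** 2 + mu0 * (Hx ** 2 + Hy ** 2))
    return float(np.sum(u) * dx * dy)
```

Tracking this scalar every few steps yields a curve like Fig. 6: constant while the pulse is inside the domain, decaying as the PML absorbs it.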
5 Discussion
After ChatGPT-4 has generated the program, there are several aspects to verify before running it in any Integrated Development Environment (IDE). These aspects are summarized as follows:
• Completeness of the program: You must confirm that ChatGPT-4 has supplied the entire program for this kind of work, where outcomes must be demonstrated rather than mere numbers generated. The main problem is that each time you enter the prompt, a slightly different version of the program is generated, making it difficult to anticipate the outcome.
• In steps: Occasionally, ChatGPT-4 produces code in a step-by-step fashion rather than as a single, integrated program. If this happens, request the whole program.
• Short-term memory capacity: There is a limit to how much of the prompt ChatGPT-4 follows precisely in its output; beyond that, you must prompt again to refocus it on the details provided in the first prompt. The most recent version, ChatGPT-4o, claims to read the prompt in its entirety and to produce a program that follows an accurate user prompt as exactly as possible.
• Updating and frames: For the visualization objective, it is mandatory to check that the updated electric field equation captures the frames at every step for a smooth simulation.
6 Conclusion
The work aimed to generate a Python program using ChatGPT-4 for simulating elec-
tromagnetic wave propagation in a dielectric medium using the 2D Finite-Difference
Time-Domain (FDTD) method. The objective was to examine wave interactions such
as reflection, transmission, and absorption, within different dielectric environments.
It was observed that ChatGPT-4 requires comprehensive and precise inputs to
perform accurately, as discussed in the prompt section. Detailed task descriptions
and explicit instructions are necessary to ensure that ChatGPT-4 does not make any
assumptions about key factors in the FDTD simulation setup.
Comparing the FDTD program generated by ChatGPT-4 with theoretical models and established methods confirmed that the solutions were correctly formulated. The results of the simulations, including the electric and magnetic field distributions, were consistent with theoretical predictions and comparable to those produced by other established methods.
Verification with graph solutions indicated that the FDTD results converged with
increased grid resolution. The simulation of the Gaussian sinusoidal pulse and its
interaction with a cylindrical dielectric medium provided accurate visualizations of
wave propagation phenomena. The implementation of PML BC effectively mini-
mized reflections at the boundaries, further validating the simulation’s accuracy.
The energy conservation analysis confirmed the physical accuracy of the generated
code, with the total energy stabilizing over time after initial absorption by the PML
boundaries. The overall performance of the ChatGPT-4 generated FDTD simulation
demonstrated high accuracy and reliability.
In summary, this study validates ChatGPT-4’s ability to produce FDTD simula-
tion programs that are accurate and dependable, greatly simplifying and improving
the process of resolving challenging electromagnetic wave propagation issues. This
effective application of AI in computational electromagnetics shows how broadly
applicable it may be in scientific computing and provides a viable method for
automating the creation of intricate simulation tools.
References
Abstract This chapter delves into the generation of Python code by ChatGPT-4o
for 2D simulations of fluid flow around obstacles of circular or square shape. The
numerical solution is achieved using the Lattice Boltzmann Method (LBM) with
the D2Q9 model. A significant portion of the chapter is dedicated to the intricacies
of prompt engineering for this specific task, as well as the thorough verification
of the generated code. The performance of the code is rigorously tested against
results obtained from a commercial solver, focusing on velocity fields, pressure
fields, and pressure distribution. The remarkable quantitative agreement across all
criteria demonstrates the effectiveness of GPT-4o in producing accurate and reliable
code for complex fluid dynamics simulations.
1 Introduction
The study of fluid dynamics involves understanding the behavior of fluid flow around obstacles, a task that requires advanced computational methods. The Lattice Boltzmann Method (LBM) is one such method in Computational Fluid Dynamics (CFD), based on a lattice-grid approach. It models fluid behavior at a microscopic level using particle distribution functions and kinetic theory, which can then
be translated into macroscopic fluid properties. For a comprehensive overview of
the Lattice Boltzmann Method and its applications, we refer to [1] and for a sound
introduction to the LBM in fluid flows [2].
This chapter is organized into several sections to comprehensively discuss the AI-assisted coding project. After this introduction, the next section details the Formulation of the Coding Task, explaining the problem setup and the approach taken.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 227
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_10
228 F. Lathiya and B. Eidel
Following this, the Code Listing section presents the AI-generated Python code for
the LBM simulation. The Code Testing/Performance section evaluates the perfor-
mance and accuracy of the code through tests and comparisons with ANSYS Fluent.
The Discussion section provides an analysis of the results and insights gained from
the project. Finally, the Conclusions section summarizes the findings and implica-
tions of the study. Additional details and data are included in the Appendix.
The main idea behind the LBM is to represent the fluid using particles that move and collide with each other on a discrete lattice grid. The key part of this method lies in the Boltzmann equation, which describes the evolution of the particle distribution function. A critical component of this equation is the collision step [7], which is given by

f^{\mathrm{out}} = f^{\mathrm{in}} - \omega\left(f^{\mathrm{in}} - f^{\mathrm{eq}}\right), \qquad (1)

where f^in is the incoming particle distribution function, f^out is the outgoing particle distribution function after the collision, ω the relaxation parameter, and f^eq the equilibrium distribution function.
LBM is advantageous due to its simplicity in handling complex boundary con-
ditions (BC) and its flexibility in simulating various types of flows, which makes it
ideal for studying fluid flow around obstacles. The method translates the microscopic
particle dynamics into macroscopic flow properties, such as velocity and pressure
fields, by averaging them.
In LBM, the fluid is modeled as a collection of particles that reside on the nodes of
a discrete lattice. These particles propagate to neighboring nodes and collide, redis-
tributing their velocities according to predefined rules. The macroscopic properties
of the fluid, such as density and velocity, are obtained by taking moments of the
particle distribution functions.
One of the significant advantages of LBM is its ability to handle complex geome-
tries and BC with ease. Traditional CFD methods often require sophisticated meshing
techniques and BC formulations. In contrast, LBM uses simple bounce-back rules to
impose no-slip BC on solid surfaces, making it more straightforward to implement
for problems involving obstacles.
The standard lattice model used in LBM is the D2Q9 model (see Sect. 1.2) for two-
dimensional flows. In this model, each node in the lattice has nine possible velocity
vectors, including one stationary and eight moving in different directions. The particle
distribution function at each node evolves according to the LBM equation [3]

f_i(\mathbf{x} + \mathbf{e}_i, t + 1) = f_i(\mathbf{x}, t) + \Omega_i, \qquad (2)

where f_i is the particle distribution function in the direction e_i, and Ω_i represents the collision operator that models the redistribution of particles due to collisions.
Flow Around an Obstacle Using the Lattice Boltzmann Method 229
where w_i are the weights associated with each direction, ρ is the fluid density, u is the macroscopic velocity, and c_s is the speed of sound in the lattice.
The macroscopic fluid properties, such as the density ρ and velocity u, are obtained by taking moments of the particle distribution function [4]
\rho = \sum_i f_i, \qquad (5)

\rho\,\mathbf{u} = \sum_i f_i\,\mathbf{e}_i. \qquad (6)
• e_5 = (1, 1), e_6 = (−1, 1), e_7 = (−1, −1), e_8 = (1, −1): movement to the next-nearest neighbors.
Weights and distribution functions: The weights w_i associated with these velocity vectors through (4) play a fundamental role in the particle distribution calculations within the model [6]:

w_0 = 4/9 for the rest particle,
w_{1,2,3,4} = 1/9 for particles moving toward nearest neighbors, (7)
w_{5,6,7,8} = 1/36 for particles moving toward next-nearest neighbors.
These weights help define the equilibrium distribution function f^eq, which integrates these weights with macroscopic variables like density and velocity. This function ensures compliance with the macroscopic equations of mass and momentum conservation, thereby providing accurate fluid dynamics simulations under various conditions.
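The equilibrium distribution combining the weights of Eq. (7) with density and velocity is commonly written to second order in u, with lattice speed of sound c_s² = 1/3; a sketch under that standard assumption:

```python
import numpy as np

w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)   # Eq. (7)
e = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)

def f_equilibrium(rho, u):
    # f_i^eq = w_i * rho * (1 + 3 e_i.u + 4.5 (e_i.u)^2 - 1.5 |u|^2)
    eu = np.einsum('id,dxy->ixy', e, u)
    usq = np.einsum('dxy,dxy->xy', u, u)
    return w[:, None, None] * rho * (1.0 + 3.0 * eu + 4.5 * eu ** 2 - 1.5 * usq)

feq = f_equilibrium(np.ones((3, 3)), np.zeros((2, 3, 3)))
```

Summing f^eq over the nine directions recovers the density, which is the mass-conservation property mentioned above.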
In LBM, the simulation proceeds through two main steps: collision and streaming. During the collision step, particles at each lattice node collide and redistribute their velocities according to the collision operator. This operator is typically based on the Bhatnagar-Gross-Krook (BGK) approximation, which simplifies the collision process by assuming a single relaxation parameter ω. The post-collision distribution function is given by [3]

f_i^{*}(\mathbf{x}, t) = f_i(\mathbf{x}, t) + \omega\left(f_i^{\mathrm{eq}}(\mathbf{x}, t) - f_i(\mathbf{x}, t)\right). \qquad (8)
In the streaming step, the particles move to neighboring lattice nodes based on their velocities. The distribution function is updated accordingly [3].
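With periodic wrap-around, the streaming step amounts to a per-direction shift of the populations; a sketch using np.roll, with the velocity ordering assumed as before:

```python
import numpy as np

e = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])

def stream(f):
    # Shift each population f_i one lattice site along its velocity e_i
    for i, (ex, ey) in enumerate(e):
        f[i] = np.roll(np.roll(f[i], ex, axis=0), ey, axis=1)
    return f
```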
Applying appropriate BC is crucial for accurately simulating fluid flow around obsta-
cles. In LBM, several BC can be implemented easily [8]:
• No-slip BC: This condition is applied at solid boundaries (such as the surface of an
obstacle) using the bounce-back rule. Particles that hit the boundary are reflected
back along their incoming direction, ensuring that the velocity at the boundary is
zero.
• Inlet and outlet BC: These conditions are used to specify the fluid flow at the
boundaries of the simulation domain. Common approaches include specifying a
constant velocity profile at the inlet and a zero-gradient condition at the outlet.
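The bounce-back rule for no-slip walls simply reverses each population on obstacle nodes; a sketch of the full-way variant, where the direction ordering and its opposite map are an assumed convention:

```python
import numpy as np

# Opposite-direction index for the D2Q9 ordering
# 0 = rest, 1..4 axis directions, 5..8 diagonals (assumed convention)
opposite = [0, 3, 4, 1, 2, 7, 8, 5, 6]

def bounce_back(f, obstacle):
    # On obstacle nodes, replace each population by the one travelling
    # in the reverse direction, which enforces zero velocity at the wall.
    f_obs = f[:, obstacle].copy()
    for i in range(9):
        f[i, obstacle] = f_obs[opposite[i]]
    return f
```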
The algorithm for LBM simulations of 2D fluid flow is given in the Algorithm
Box 1.
Algorithm 1: Lattice Boltzmann Method for 2D Fluid Flow
Input: Lattice size (Nx, Ny), time steps T, Reynolds number Re, initial distribution function f_i(x, t = 0), boundary conditions, obstacle type (square/circle), obstacle parameters (center (cx, cy) and size)
Output: Velocity field u(x, T), density field ρ(x, T)
1 Initialization:
2 Initialize the lattice grid with size (Nx, Ny) and set the obstacle shape and position (square or circle);
3 Calculate relaxation parameter ω based on Re;
4 Set initial macroscopic variables: density ρ(x, 0) and velocity u(x, 0);
5 Initialize the distribution function f_i(x, 0) according to the equilibrium distribution function f_i^eq(ρ, u);
6 for t = 0 to T do
7   Collision Step:
8   foreach lattice node x ∈ (Nx, Ny) do
9     Calculate macroscopic variables ρ(x, t) and u(x, t) from f_i(x, t);
10    Compute equilibrium distribution f_i^eq(ρ, u);
11    Update distribution function using the BGK approximation:
        f_i(x, t + Δt) = f_i(x, t) − ω [ f_i(x, t) − f_i^eq(ρ, u) ]
12  end
13  Streaming Step:
14  foreach lattice direction i do
15    Move the distribution function to neighboring nodes;
16  end
17  Apply Boundary Conditions:
18  Apply appropriate boundary conditions (e.g., bounce-back for walls, periodic, outflow conditions, etc.) on f_i(x, t + Δt);
19  Update Macroscopic Quantities:
20  Calculate macroscopic quantities ρ(x, t + Δt) and u(x, t + Δt);
21 end
22 Output:
23 Return the final velocity field u(x, T) and density field ρ(x, T);
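The loop structure of Algorithm 1 can be sketched compactly in Python. The following is a minimal, self-contained illustration in lattice units with periodic streaming via np.roll; it is not the GPT-4o-generated listing of Sect. 3, and all names and default values are placeholders of this sketch:

```python
import numpy as np

# D2Q9 velocities, weights, and opposite-direction indices
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
opp = np.array([0, 3, 4, 1, 2, 7, 8, 5, 6])

def feq(rho, u):
    """Equilibrium distribution (9, ny, nx) from rho (ny, nx), u (2, ny, nx)."""
    cu = np.einsum('id,dyx->iyx', c, u)
    return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2
                                     - 1.5*(u[0]**2 + u[1]**2))

def lbm_run(nx=40, ny=20, steps=100, u_in=0.04, Re=50, obstacle=None):
    nu = u_in * ny / Re                # viscosity chosen to match Re
    omega = 1.0 / (3.0 * nu + 0.5)     # BGK relaxation parameter
    obstacle = np.zeros((ny, nx), bool) if obstacle is None else obstacle
    rho = np.ones((ny, nx))
    u = np.zeros((2, ny, nx))
    u[0] = u_in
    f = feq(rho, u)
    for _ in range(steps):
        rho = f.sum(axis=0)                          # macroscopic density
        u = np.einsum('id,iyx->dyx', c, f) / rho     # macroscopic velocity
        fin = f.copy()
        f += omega * (feq(rho, u) - f)               # collision (BGK)
        f[:, obstacle] = fin[opp][:, obstacle]       # bounce-back no-slip
        for i in range(9):                           # streaming (periodic)
            f[i] = np.roll(np.roll(f[i], c[i, 1], axis=0), c[i, 0], axis=1)
    return rho, u
```

With no obstacle, a uniform inflow is an exact steady state of this scheme, which makes a convenient sanity check before adding the obstacle mask and inlet/outlet conditions.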
Ideally, the program generated by AI for this project should use the D2Q9 model to simulate fluid flow around a square or circular obstacle and consist of the following key components [5]:
1. Initialization: The lattice grid should be defined, and the particle distribution
functions should be initialized.
2. Collision and streaming: The code should iteratively apply the collision and
streaming steps to update the particle distribution functions.
3. Boundary conditions: No-slip BC should be applied using the bounce-back rule.
Inlet and outlet conditions should be specified to maintain the flow.
4. Visualization: The macroscopic properties, such as velocity magnitude and pres-
sure fields, should be computed and visualized to analyze the flow behavior.
The AI-generated code demonstrates the potential of artificial intelligence in
assisting with complex computational tasks. By providing a detailed prompt, the
AI was able to generate a functional code for simulating fluid flow using the LBM.
This project uses AI, specifically GPT-4o, to assist in coding the LBM for sim-
ulating fluid flow around an obstacle. The goal is to assess the capability of AI in
generating Python code for this complex task. The subsequent sections will present
the specific problem to be solved, the AI-generated prompt, the resulting code, out-
comes, its verification and some remarks about the GPT-4o.
By looking at the intersection of AI and CFD, this report aims to highlight the strengths and limits of AI-assisted coding in scientific research.
2 Prompt
The simulation involves modeling fluid flow around an obstacle within a 2D rectan-
gular domain using the LBM. The domain is defined with the following parameters:
• Domain size: The computational domain has a width nx and a height ny. One lattice spacing equals 1 mm, i.e., there are nx lattice nodes in the x direction for nx mm of width and ny lattice nodes in the y direction for ny mm of height.
• Coordinate system: The origin (0, 0) is located at the bottom-left corner of the domain. The x-axis extends horizontally to the right, and the y-axis extends vertically upwards.
• Obstacle: The obstacle can be either a square or a circle, positioned within the domain based on user input.
  – For a square obstacle, the user specifies the x and y coordinates of the center (cx, cy) and the side length.
234 F. Lathiya and B. Eidel
Fig. 2 Computational domain with (top) a circular and (bottom) a square obstacle
  – For a circular obstacle, the user specifies the x and y coordinates of the center (cx, cy) and the radius.
• Discretization: The lattice nodes are evenly spaced, forming a grid over the domain. The flow simulation uses these discrete points to model fluid dynamics.
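Such a user-selectable obstacle is conveniently represented as a boolean mask over the lattice. A possible sketch, where the function name and argument convention are illustrative assumptions rather than the chapter's code:

```python
import numpy as np

def obstacle_mask(nx, ny, shape, cx, cy, size):
    """Boolean array of shape (ny, nx), True inside the obstacle.
    'size' is the radius for a circle and the side length for a square."""
    x, y = np.meshgrid(np.arange(nx), np.arange(ny))
    if shape == 'circle':
        return (x - cx)**2 + (y - cy)**2 <= size**2
    if shape == 'square':
        return (np.abs(x - cx) <= size / 2) & (np.abs(y - cy) <= size / 2)
    raise ValueError("shape must be 'circle' or 'square'")
```

The mask can then be used both to zero out the velocity inside the obstacle and to select the nodes on which the bounce-back rule is applied.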
Table 1  Parameters chosen for the fluid simulation study with obstacles

Parameter                   Value/Type        Unit
Domain width (nx)           400               mm
Domain height (ny)          100               mm
Obstacle type               Circular/Square   –
Obstacle center (cx, cy)    (100, 50)         (mm, mm)
Radius of circle            25                mm
Length of square            40                mm
(continued)
The final, working prompt is given in Fig. 3. It is used for generating the Code
Listing 1.
Table 1 (continued)

Parameter                   Value/Type                                Unit
LBM cell size               1                                         mm
Time step size Δt           1                                         s
Total time steps            4000                                      –
Reynolds number (Re)        50                                        –
Upstream velocity u         0.04                                      mm/s
Model type                  Bounce-back                               –
Obstacle BC                 No-slip                                   –
Output                      Velocity, pressure and streamline plots   –
3 Code Listing
The final code version generated by GPT-4o is given in the Code Listing 1.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import pandas as pd

# ... (intermediate lines of the listing omitted) ...

# Set velocity to zero inside the obstacle (no flow inside the obstacle)
u[0, obstacle] = 0
u[1, obstacle] = 0

# ... (intermediate lines of the listing omitted) ...

plt.subplot(122)
# Plot pressure field
plt.imshow(pressure.transpose(), cmap=cm.viridis)
cbar = plt.colorbar(location='bottom')
cbar.ax.tick_params(labelsize=13)
plt.title('Pressure (Pa)', size='16')
plt.xticks(fontsize=13)
plt.yticks(fontsize=13)

plt.tight_layout()
plt.show()
The output of GPT-4o does not guarantee that the generated code will work without any flaws. The understanding of GPT-4o is discussed in more detail in Sect. 5.
The element size implemented in the simulation is h = 2.0616 × 10⁻⁵ mm. Similar to the LBM counterpart in this work, a constant time-stepping method is employed with a fixed time step size of Δt = 0.04 s.
The BCs include a uniform inflow with a slight perturbation, an outflow condition at the domain's exit, and no-slip conditions on the obstacle surface to simulate the interaction between the fluid and the solid boundary.
The results from ANSYS Fluent provide a benchmark to compare and validate
the results obtained from the AI-generated Python code. The simulation results,
including velocity and pressure fields, are compared against the Python-based Lattice
Boltzmann Method (LBM) simulation in Sect. 4.1.
4.1 Comparison
To verify the results, we compare the velocity, the streamline, and the pressure profiles (Figs. 4 and 5) as well as the velocity profiles along the midline of the geometry (y = 50 mm) for both the Python code and the ANSYS simulations (Fig. 6). Velocity data is extracted from both the Python code and the ANSYS simulations for this purpose. The comparison in Fig. 6a and b for the x-velocity and y-velocity profiles indicates excellent agreement of the present code with the results of the commercial solver.
The profiles obtained from the Python code and ANSYS Fluent simulations show
a high degree of agreement. The streamline patterns and velocity magnitude plots
exhibit similar flow characteristics around the obstacle. Specifically:
• Streamline plots: The results of both simulation frameworks (Fig. 4a and c) exhibit
the expected vortex shedding behind the obstacle, indicating that the flow separa-
tion and recirculation regions are captured accurately.
• Velocity contour plots: The present simulation results agree with those of the commercial solver with respect to the velocity distributions and the maxima (Fig. 4a and c).
• Velocity component profiles: The x-velocity and y-velocity profiles at the midline of the geometry (Fig. 6) are in excellent agreement between the present Python code and the commercial solver, which further strongly suggests the correctness of the GPT-4o-generated code.
Despite minor discrepancies, the overall agreement between the two sets of results
demonstrates that the GPT-4o-generated code accurately simulates the fluid flow
around the obstacle. The close match between the streamline patterns, velocity mag-
nitude distributions, and velocity profiles indicates that the Python code of this work
performs on par with the sophisticated commercial software ANSYS Fluent.
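The profile comparison described above can be quantified with a small helper; the array layout and function names here are illustrative assumptions, not the chapter's code:

```python
import numpy as np

def midline_profiles(u, ny):
    """Extract the x- and y-velocity along the horizontal midline y = ny//2
    from a velocity field u of shape (2, ny, nx)."""
    return u[0, ny // 2, :], u[1, ny // 2, :]

def rel_l2_diff(a, b):
    """Relative L2 difference between a computed and a reference profile."""
    return float(np.linalg.norm(a - b) / np.linalg.norm(b))
```

Interpolating the ANSYS midline data onto the lattice nodes first (e.g., with np.interp) puts both profiles on the same abscissa before the norm is taken.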
Fig. 4 Simulation results from the Python code (a, b) with a comparison to a reference solution (c, d) from a commercial solver for the circular obstacle: (a) velocity (top) and streamline (bottom) plots; (c) velocity (top) and streamline (bottom) plots from the ANSYS Fluent simulation; (d) pressure distribution plot from the ANSYS Fluent simulation
Fig. 5 Simulation results from the Python code (a, b) for the square obstacle: (a) velocity (top) and streamline (bottom) plots
Beyond the comparison carried out above, the characteristics of the flow around the circular obstacle at a Reynolds number of 50 are briefly analyzed and compared with additional results for Re = 300, which reveals interesting fluid dynamics phenomena. These can be grouped into several distinct features:
• Flow Deceleration and Stagnation Point: As the fluid approaches the circular
obstacle, it begins to decelerate due to the obstruction. Directly in front of the
obstacle, a stagnation point forms where the flow velocity reduces to zero. At this
point, the fluid is diverted around the obstacle.
• Pressure Increase: The deceleration of fluid near the stagnation point causes a rise in pressure in front of the obstacle, following Bernoulli's principle. The flow lines are closer together, indicating a pressure build-up. Note that for Re = 50 the pressure maximum lies at some distance from the obstacle (Fig. 4b), whereas for Re = 300 it lies right at the obstacle (Fig. 7b).

Fig. 6 Comparison of velocity profiles at the midline (y = 50 mm) for the circular obstacle between the present Python simulation result and a reference solution from the commercial solver ANSYS
• Boundary Layer Development: As the fluid moves around the obstacle, a bound-
ary layer forms along the surface of the obstacle. This boundary layer is initially
laminar due to the low Reynolds number but starts thickening as it progresses
along the sides.
• Flow Separation: Given that the Reynolds number is around 50, flow separation
occurs on the sides of the obstacle. The flow cannot remain attached to the surface
of the obstacle as it curves around the sides. This results in the boundary layer
separating from the surface at some point, usually on the rear half of the obstacle.
• Symmetry in Separation: At Re = 50, the separation is symmetric in that the separation points on both sides of the obstacle occur at roughly the same position.
248 F. Lathiya and B. Eidel
For Re .= 300, the separation shows minor asymmetry leading to the formation of
alternating vortices in the wake.
• Recirculation Zone: Behind the obstacle, a recirculation zone forms due to the flow separation. In this region, the fluid moves in the opposite direction to the main flow, creating vortices. For Re = 50, these vortices (known as recirculating eddies or vortex pairs) show high symmetry. The length of the recirculation zone is relatively short at this Reynolds number. For Re = 300, in contrast, the recirculation zone exhibits alternating vortices. The flow enters a periodic vortex shedding regime, leading to the formation of a von Kármán vortex street. This phenomenon is clearly visible in the vorticity and streamline plots, where the alternating vortices on either side of the wake are prominent.
• Vortex Shedding Onset: At a Reynolds number of around 50, the flow is in a transitional regime where vortex shedding might start to occur; in the present case, however, there is no clear vortex shedding. For Re = 300, the wake behind the obstacle is characterized by alternating vortices forming on either side of the wake.
• Wake Symmetry: At Re = 50, the wake is symmetric, meaning that the vortices are of similar size and strength on both sides of the obstacle. Farther downstream, at some distance from the obstacle, the flow returns to laminar. For Re = 300, the wake is inherently asymmetric due to the unsteady vortex shedding. The vortices alternate in strength and size, which is characteristic of the vortex shedding process in this Reynolds number range.
• Transition to Turbulence: The flow exhibits characteristics of transition towards turbulence in the wake region. The alternating vortices begin to interact with each other, and their dynamics contribute to the development of complex, unsteady flow patterns further downstream.
5 Discussion
This section discusses various aspects of the project, including errors and omissions,
completeness, reproducibility, and the lessons learned during the process.
During the code generation process, various errors and omissions were encountered, which can be categorized into major and minor issues.
• Major Issues:
– Index mismatching errors: These errors were frequent and challenging to rec-
ognize, often leading to the code failing to execute properly.
– Non-reproducibility of code: The generated code varied significantly with each attempt, even with the same prompt. This inconsistency makes it difficult to achieve reliable results.
– Incomplete outputs: On several occasions, the code ran without errors but failed to produce any output, resulting in blank profiles.
• Minor Issues:
Unfortunately, the reproducibility of the code using the same prompt is not assured. Several observations were noted during the generation of the code and the formulation of the prompt:
• GPT-4o generates different codes for each trial, with many instances resulting in
compilation errors such as array dimension mismatches.
• Assigning GPT-4o a persona of an expert in the respective domain is beneficial to a large extent. Additionally, when dealing with specialized subjects, providing GPT-4o with useful principles and equations enhances the accuracy of the results.
• The previous history of the session fed to GPT-4o plays a critical role in gener-
ating further results more accurately according to the need. Therefore, the same
assignment, even with the same formulation, may not produce identical outputs
due to the influence of the session history.
Overall, this project highlights both the strengths and limitations of using AI
tools like GPT-4o for CFD simulations. The experience underscores the importance
of combining AI capabilities with human expertise to achieve reliable and accurate
results.
6 Conclusion
References
Abstract This chapter condenses the lessons learned in different chapters of this
book into recommendations for prompt engineering of Python programming tasks
for the chatbot GPT-4/4o in the areas of computational materials and mechanics.
Beyond that, detailed recommendations for code verification are provided.
Expert knowledge in the areas in which the chatbot is used as a programming assis-
tant is absolutely necessary for responsible use. This refers to the domain of model-
ing problems in nature and the sciences by differential equations and to numerical
methods employed for their solution. It no less refers to the field of professional
programming.
Hence, this expert knowledge must be readily available before the prompt is
shaped—and will certainly be of great help to craft the prompt. This expert knowledge
equally is required in rigorous tests of the code implying verification tests as well as
validation tests. In this book we restricted ourselves to verification, which is, briefly, the
assessment of the accuracy of the code and its solution to a computational model by
comparison with known solutions. Validation is the assessment of the accuracy of a
computational simulation by comparison with experimental data.
While GPT-4 proved to be a powerful tool in generating complex algorithms, human intervention was necessary throughout all chapters of this book.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025 255
B. Eidel (ed.), GPT for Python-Coding in Computational Materials Science
and Mechanics, Studies in Computational Intelligence 1198,
https://doi.org/10.1007/978-3-031-85470-5_11
256 B. Eidel et al.
In working with GPT-4 to generate Python code for computational problems, par-
ticularly in computational materials science, solid mechanics, wave propagation,
and fluid mechanics, we have distilled several key lessons and strategies that can
enhance the effectiveness of AI interactions. The following are detailed recommen-
dations for prompt engineering that have proven valuable in eliciting high-quality
responses from GPT-4. We include links to the general characteristics of LLMs and
their performance in Chatbots.
In a book context, where the interaction with GPT-4 is limited to one or a few well-
constructed prompts, it becomes essential to design each prompt carefully to ensure
it leads to high-quality code generation. Multiple iterative prompts or feedback loops
are often impractical, so the focus should be on refining each prompt to be as effective
as possible from the outset.
1. Persona Assignment: Generating Context
One effective technique is the strategic use of persona assignment, where GPT-4
is attributed a specific role or context. By assigning a persona as an expert with
knowledge in, e.g., computational materials science with particular expertise in
grain growth, the generated solutions are more aligned with user expectations.
This helps GPT-4 frame its answers appropriately, adding a layer of sophistication.
This approach provides context, ensuring that the output is relevant both in terms
of solution and explanation. The persona assignment can also be used to tailor the responses generated by the chatbot to follow a certain format and to take certain general instructions into consideration.
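As an illustration, such a persona preamble can be assembled programmatically before the task description is sent to the chatbot; the wording below is a hypothetical example, not a prompt from this book:

```python
def build_prompt(persona, task, constraints):
    """Prepend a persona and explicit constraints to a task description."""
    lines = [f"You are {persona}.", "", task, "", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)

prompt = build_prompt(
    persona="an expert in computational fluid dynamics and Python",
    task="Write a Python D2Q9 Lattice Boltzmann solver for 2D flow "
         "around a circular obstacle.",
    constraints=["Use only numpy and matplotlib.",
                 "Apply bounce-back no-slip conditions on the obstacle."],
)
```

Keeping persona, task, and constraints as separate fields makes each prompt easy to vary systematically across trials.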
2. Be Explicit in What You Need: Specify Language, Output Format, or Tech-
niques
Clarity in prompts leads to clarity in responses. It is essential to specify the coding
language, the methods or libraries to be used, and the desired output format. Some
measures to ensure clarity in the prompts are listed here:
over data and training-related hallucinations, errors at the inference level can
be minimized by employing strategies that prioritize faithfulness to the user’s
prompt.
For instance, Faithfulness Enhanced Decoding techniques adjust the model’s
probability weighting to prioritize user-provided context and instruction align-
ment, effectively reducing hallucinatory outputs by focusing on the most relevant
tokens [1]. Moreover, detailed prompts with clear and precise context reduce
ambiguity, helping the model generate responses that are more likely to adhere
closely to the input and reduce error. When the input prompt is sufficiently
detailed, specific and clear, the likelihood of hallucinations decreases, enhancing
response accuracy.
Therefore, for high-stakes scientific applications, detailed, well-structured prompts are critical. This includes iterative refinement, prompt adjustments, and, if needed, employing structured query patterns like chaining commands to guide GPT-4 in generating fully operational responses.
Tests for code verification are indispensable. We can distinguish between testing actions that we ask the chatbot itself to carry out, built into the prompt, and tests that we devise and design ourselves.
1. Incorporate Testing and Verification: Request Edge Case Handling
Given the importance of verification, prompt GPT-4 to include testing cases,
especially for edge conditions. This promotes robust code generation, essential
in computational mechanics.
2. Check Initialization of Problems
Several problems require an initialization of variables over a domain or an initialization of the parameters of a model. If the initialization is incorrect, an otherwise correct solution procedure can produce wrong results. Thus, the problem initialization should be checked before moving on to the verification of the rest of the solution provided by GPT-4.
3. Verify Boundary Conditions
Despite clear instructions, GPT-4 may omit some boundary conditions of the problem. A simple check of the values of the variables at these boundaries can be made to ensure that they are included correctly in the problem.
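As a minimal illustration, such a boundary check can be a one-line assertion; the array layout is an assumption of this sketch, not taken from the book's code:

```python
import numpy as np

def check_no_slip(u, solid, tol=1e-12):
    """Return True if the velocity field u (shape (2, ny, nx)) vanishes
    on all solid nodes, i.e., the no-slip condition is enforced."""
    return bool(np.all(np.abs(u[:, solid]) < tol))
```

Analogous checks can be written for inlet velocity profiles or zero-gradient outlet conditions.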
4. Ensure Data Inputs are Correct
For problems requiring external sources of data to be imported into the model, the correct processing of the data should be ensured. In image segmentation, for example, if the masks are not assigned correctly, then the models will not produce the correct segmentation.
5. Mathematical Accuracy of Numerical Implementation
Numerical methods applied for solving differential equations can be implemented with minor inaccuracies which can cause major deviations from the expected outputs. For example, in the summation Σ_{i≠j}^N (η_i^n)², the generated code can miss enforcing i ≠ j, resulting in an erroneous summation. Such details of the numerical implementation should be checked.
Learned Lessons—Recommendations 259
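The summation pitfall above is easy to test in isolation; the helper below is a hypothetical illustration of the correct exclusion:

```python
import numpy as np

def sum_sq_excluding(eta, j):
    """Sum of eta_i^2 over all i except i == j, the constraint that a
    generated code can silently drop."""
    eta = np.asarray(eta, dtype=float)
    return float(np.sum(eta**2) - eta[j]**2)
```

A hand-computed value for a tiny input (here, 1 + 9 = 10 for eta = [1, 2, 3], j = 1) catches the missing exclusion immediately.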
6. Comparison with Analytical Solutions in Particular Settings
Some complex problems can be reduced to a form where an analytical solution exists. Providing this information within the prompt helps GPT-4 generate the corresponding analytical equations, which can then be coded to verify the results. If the generated results are unsatisfactory or incorrect, providing the analytical solution explicitly in the form of equations can also help.
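A classical example for flow solvers is plane Poiseuille flow, whose parabolic velocity profile is known in closed form. A sketch of such a check, with illustrative names and arguments:

```python
import numpy as np

def poiseuille_exact(y, H, u_max):
    """Analytical parabolic velocity profile of plane Poiseuille flow
    in a channel of height H with centerline velocity u_max."""
    return 4.0 * u_max * y * (H - y) / H**2

def max_rel_error(u_num, u_exact):
    """Maximum pointwise error relative to the profile's peak value."""
    return float(np.max(np.abs(u_num - u_exact)) / np.max(np.abs(u_exact)))
```

Running the generated solver on this reduced configuration and comparing against poiseuille_exact gives a quantitative pass/fail criterion instead of a visual one.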
7. Comparison with (Commercial) Software Systems
When verifying codes against commercial software, there are several steps and considerations to ensure a meaningful and robust comparison: (i) define the scope and objectives clearly (choice of a problem that can be modeled equivalently in the generated code and in the commercial software; choice of comparison metrics; preference for rather simple benchmark problems); (ii) ensure model consistency (geometry, material properties, boundary conditions, loading, mesh, etc.); (iii) match solver settings (for, e.g., finite elements: element formulation, numerical integration, solver algorithm, tolerances); (iv) handle nonlinearities carefully.
Observed differences should be interpreted cautiously, as they can arise for hidden reasons. For example, commercial software often uses highly optimized and proprietary algorithms that may differ in subtle ways. Moreover, slight differences in floating-point arithmetic can accumulate over iterative calculations. Finally, we recommend treating commercial software as a "gold standard" only with caution; its results are not inherently infallible.
References
1. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., Liu, T.: A survey on hallucination in large language models: principles, taxonomy, challenges, and open questions. ACM Trans. Inf. Syst. (2024)
2. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A., Fung, P.: Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 1–38 (2023)
3. Asai, A., Wu, Z., Wang, Y., Sil, A., Hajishirzi, H.: Self-RAG: learning to retrieve, generate, and critique through self-reflection. arXiv preprint arXiv:2310.11511 (2023)
4. OpenAI: OpenAI Platform - Models Documentation (2024). https://platform.openai.com/docs/models/. Accessed 26 Sep 2024
Index
B
Basis, 3
Bernoulli's principle, 245
Bhatnagar-Gross-Krook (BGK) approximation, 229, 230
Body Centered Cubic (BCC), 5, 24
Bounce-back rule, 231
Boundary Conditions (BCs), 183
Boundary layer, 247
Bravais lattice, 3
Burgers vector, 4, 6, 22

C
Cellular Automata (CA), 84
Charge density, 203
Contributors, ix
Convolutional neural network, 128
Courant-Friedrichs-Lewy (CFL) condition, 185
Crystal, 3, 5
Crystal lattice, 5
Current density, 203

D
D2Q9 model, 228, 229
Data augmentation, 130
Data splitting, 129

E
Edge dislocation, 4
Edge displacement, 17
Electric field, 203
Electric flux density, 203
Electromagnetic energy, 221
Energy conservation, 50, 220
Ensembling, 130
Equilibrium distribution, 229
Error function (erf), 117
Excitation functions, 183

F
Face Centered Cubic (FCC), 5, 24
Fick's Second Law (FSL), 85
Finite difference method, 56
Finite Difference Time Domain (FDTD), 182, 210
Finite Volume Method, 242
Flow separation, 243
Fluent, 242
FTCS scheme, 109

G
Gaussian function, 183
Gaussian sinusoidal pulse, 209
H
Hallucinations, 77, 119, 120
Heat conduction, 109
Hexagonal Close Packed (HCP), 5, 24
High-temperature corrosion, 85
Human expertise, 250
Hyperparameters, 127

I
ImageNet, 128
Initial Boundary Values Problem (IBVP), 183
Initial Conditions (ICs), 183
Inlet and outlet BC, 231
Integrated Development Environment (IDE), 223

L
Laminar, 247
Large Language Models (LLMs), 23, 26, 119, 181
Lattice Boltzmann Method (LBM), 228
Lattice constant, 4, 8
Lattice grid, 232
Lattice parameter, 4
LBM collision, 230
LBM-grid, 229
LBM streaming, 230
Learning rate, 127, 142
Lennard-Jones potential, 34

M
Magnetic field, 203
Magnetic flux density, 203
Maxwell-Boltzmann distribution, 36
Maxwell equations, 203
Mean Square Error (MSE), 118
Mesh density, 242
MicroNet, 146
Microstructure, 142
Mur BC, 184

N
Native orientation, 16
Navier-Stokes equations, 242
Non-Bravais lattice, 3

O
Optimization, 130
Overfitting, 130
OVITO, 16, 23

P
Parameters, 127
Partial Differential Equation (PDE), 182
Perfectly Matched Layer (PML), 206, 208
Periodicity, 9
Permeability, 204
Permittivity, 204, 210
Pitting corrosion, 85
Poisson's ratio, 7
Pristine crystal, 17
PyTorch, 130

R
Reaction Kinetics (RK), 86
Recirculation, 248
Reflecting BC, 184
Reflection, 210
Refraction, 210
ReLU, 142
Reproducibility, 26, 249
ResNet50, 128, 146
Rotation matrix, 27

S
Scanning Electron Microscope (SEM), 125, 142
Scattering, 210
Screw dislocation, 4
Screw displacement, 18
Segmentation, 145
Short-term memory capacity, 30
Short term memory, 25
Single crystal, 16
Sinusoidal function, 183
Stagnation point, 245
Statistical ensembles, 36
Structure generation, 15
Superalloy, 146

T
Test, 129
Training, 129