
Studies in Computational Intelligence

Volume 1198

Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

Editorial Board
Marco Dorigo, Université Libre de Bruxelles, Bruxelles, Belgium
Andries Engelbrecht, University of Stellenbosch, Stellenbosch, South Africa
Vladik Kreinovich, University of Texas at El Paso, El Paso, TX, USA
Francesco Carlo Morabito, Mediterranea University of Reggio Calabria, Reggio
Calabria, Italy
Roman Slowinski, Poznan University of Technology, Poznan, Poland
Yingxu Wang, Schulich School of Engineering, Calgary, AB, Canada
Yaochu Jin, Westlake University, Hangzhou, China
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.
Indexed by SCOPUS, DBLP, WTI AG (Switzerland), zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
Bernhard Eidel
Editor

GPT for Python-Coding in Computational Materials Science and Mechanics

From Prompt Engineering to Solutions in Worked-Out Examples

Editor
Bernhard Eidel
TU Bergakademie Freiberg
Freiberg, Germany

ISSN 1860-949X ISSN 1860-9503 (electronic)


Studies in Computational Intelligence
ISBN 978-3-031-85469-9 ISBN 978-3-031-85470-5 (eBook)
https://doi.org/10.1007/978-3-031-85470-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2025

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

If disposing of this product, please recycle the paper.


Preface

In an era where the boundary between artificial intelligence and human capability
continues to blur, the advent of advanced models like GPT-4 stands as a testament to
the ever-expanding possibilities of technology. This book is dedicated to exploring
the fascinating intersection of artificial intelligence, specifically the capabilities of
GPT-4, with computational materials science and computational mechanics—fields
that are pivotal in shaping the future of engineering. The focus here is not only to
unveil the potential of GPT-4 in generating sophisticated Python coding assignments
but also to deepen the understanding and application of computational techniques in
modern engineering problems.
Unleashing the Potential of GPT-4 in Computational Sciences. The first objective of this book is to present a curated selection of intriguing problems from computational materials science and computational mechanics. These problems are carefully
chosen for their relevance to current research and industrial applications and their
suitability for showcasing the advanced capabilities of GPT-4 in code generation.
From predicting material behavior under various conditions to simulating complex
mechanical interactions, the problems serve as a canvas on which GPT-4 paints its
solutions, demonstrating not just accuracy but creativity in problem-solving.
Engineering of Prompts: The Art and Science Behind Effective Questioning.
At the heart of effectively utilizing GPT-4 lies the ‘engineering of prompts’—a
structured and methodological approach to formulating queries that elicit the most
coherent and comprehensive answers from the model. This book dedicates substantial
focus to this art, guiding readers through the nuances of crafting prompts that are
clear, precise, and tailored to extract specific outcomes. By mastering this skill,
engineers and researchers can leverage AI tools like GPT-4 to their fullest potential,
turning vague ideas into precise algorithmic actions and reliable solutions.
Rigorous Code Verification: Ensuring Reliability in AI-Generated Outputs.
Trust in AI-generated code demands rigorous verification. This book addresses the
critical need for robust testing frameworks to validate the code produced by GPT-
4. Through detailed walkthroughs of testing strategies and verification protocols,
readers will learn how to ensure that the solutions provided by AI not only work


in theory but also perform flawlessly in real-world applications. This part not only
enhances the reliability of AI as a tool but also instills confidence in its users.
Why This Book Is Indispensable. For students, graduate engineers, and seasoned
researchers, this book serves as a vital resource that illustrates the practical integration
of AI in their work. It demystifies the process of interfacing with one of the most
sophisticated AI models to date, making cutting-edge technology accessible and
applicable. For educators, this book provides a blueprint for teaching computational
concepts using AI, enriching the curriculum, and preparing students for a future
where AI is a ubiquitous part of problem-solving.
Inspiring the Next Generation. By bridging traditional engineering problems
with modern AI technology, this book not only enhances the current educational and
professional landscape but also serves to inspire the next generation of engineers
and developers. It pushes the boundaries of what can be achieved when human
ingenuity combines with artificial intelligence, setting a new standard for innovation
and excellence in engineering.
This book is more than just a textbook or a reference manual; it is a gateway to the
future of engineering, a tool that empowers its readers to build on the cutting edge of
technology. It encourages them to explore, experiment, and excel in their respective
fields, equipped with the knowledge and tools to use AI not just as an aid, but as a
transformative force in computational science.
This book project originated from a module I introduced into the Master’s degree
program in Computational Materials Science (CMS) at TU Bergakademie Freiberg
(TUBAF) during the summer semester of 2023, followed by a continuation in the
summer semester of 2024. The co-authors of this volume are students of the CMS
program at TUBAF who participated in the course and developed their individual
projects as part of its requirements.

Freiberg, Germany
July 2024

Bernhard Eidel
Contents

Topics of Computational Materials Science


Generation of Atomic Scale Single Crystals . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Pradeep Periyasamy and Bernhard Eidel
Molecular Dynamics Simulation of Noble Gases . . . . . . . . . . . . . . . . . . . . . . 33
Aagashram Neelakandan, Vishal Vijendra Badami, and Bernhard Eidel
Phase Field Modeling of Grain Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Rahul Narkhede and Bernhard Eidel
Modeling Corrosion Using a Cellular Automaton . . . . . . . . . . . . . . . . . . . . . 83
Mehdi Bakhshi Zadeh and Bernhard Eidel
Instationary Heat Conduction on Rectangular Domains
with Arbitrary Circular Holes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Aagashram Neelakandan and Bernhard Eidel

Topics of Deep Learning Based Materials Science


Transfer Learning for Alloy Classification Based on Microstructure
Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Aditya Deshmukh and Bernhard Eidel
Transfer Learning for Microstructure Image Segmentation . . . . . . . . . . . . 145
Rahul Narkhede and Bernhard Eidel

Topics of Computational Analysis of Waves and Fluid Mechanics


Elastic Wave Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Muhammad Saad Qureshi and Bernhard Eidel
Electromagnetic Wave Propagation in Dielectric Media . . . . . . . . . . . . . . . 201
Tejas Viresh Anvekar and Bernhard Eidel


Flow Around an Obstacle Using the Lattice Boltzmann Method . . . . . . . . 227


Fenil Lathiya and Bernhard Eidel

Conclusions
Learned Lessons-Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Bernhard Eidel, Rahul Narkhede, and Aagashram Neelakandan

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Contributors

Tejas Viresh Anvekar M5, IMFD, TUBAF, Freiberg, Germany


Vishal Vijendra Badami M5, IMFD, TUBAF, Freiberg, Germany
Mehdi Bakhshi Zadeh M5, IMFD, TUBAF, Freiberg, Germany
Aditya Deshmukh M5, IMFD, TUBAF, Freiberg, Germany
Bernhard Eidel M5, IMFD, TUBAF, Freiberg, Germany
Fenil Lathiya M5, IMFD, TUBAF, Freiberg, Germany
Rahul Narkhede M5, IMFD, TUBAF, Freiberg, Germany
Aagashram Neelakandan M5, IMFD, TUBAF, Freiberg, Germany
Pradeep Periyasamy M5, IMFD, TUBAF, Freiberg, Germany
Muhammad Saad Qureshi M5, IMFD, TUBAF, Freiberg, Germany

Acronyms

AI Artificial Intelligence
ANN Artificial Neural Networks
BC Boundary Condition
BCC Body Centered Cubic
BCE Binary Cross Entropy
BGK Bhatnagar-Gross-Krook
BVP Boundary Value Problem
CA Cellular Automata
CFD Computational Fluid Dynamics
CFL Courant-Friedrichs-Lewy condition
CLAHE Contrast Limited Adaptive Histogram Equalization
CNN Convolutional Neural Network
CPU Central Processing Unit
CS Coordinate System
D2Q9 Two-Dimensional, Nine-Directional
DFT Density Functional Theory
DL Deep Learning
erf Error function
FCC Face Centered Cubic
FD Finite Differences
FDM Finite Difference Method
FDTD Finite Difference Time Domain
FEM Finite Element Method
FSL Fick’s Second Law
FTCS Forward Time Centered Space
FVM Finite Volume Method
GIF Graphics Interchange Format
GPT Generative Pre-trained Transformer
GPU Graphics Processing Unit
HCP Hexagonal Close Packed
HEA High Entropy Alloy


IC Initial Condition
IDE Integrated Development Environment
IoU Intersection over Union
IVP Initial Value Problem
LBM Lattice Boltzmann Method
LLM Large Language Model
MD Molecular Dynamics
MLP Multi Layer Perceptron
MSE Mean Square Error
NLP Natural Language Processing
NN Neural Network
OpenCV Open Source Computer Vision Library
PDE Partial Differential Equation
PF Phase Field
PML Perfectly Matched Layer
RK Reaction Kinetics
TE Transverse Electric
TL Transfer Learning
TM Transverse Magnetic
ZSL Zero-Shot Learning
Generation of Atomic Scale Single Crystals

Pradeep Periyasamy and Bernhard Eidel

Abstract This chapter investigates the coding ability of GPT-4 in the generation of pristine single crystals with Face Centered Cubic (FCC), Body Centered Cubic (BCC), and Hexagonal Close Packed (HCP) structures on the atomic scale in arbitrary orientation with respect to a Euclidean reference frame. Beyond that, the code enables the insertion of edge and screw dislocations into FCC pristine crystals. The generated
crystals in atomic resolution can be used in Molecular Dynamics (MD) or Molecular
Statics (MS) simulations. The resultant code is checked for correctness and analyzed
to gain deeper insights into GPT-4’s behavior. Furthermore, the ability of GPT-4
to incorporate human feedback is also examined by providing corrections to the
generated code during the evaluation process.

1 Introduction

A solid is considered a crystal when its atoms are arranged with precise and consistent periodicity in their positions. This organization is described by a lattice, which can be of two types: Bravais and non-Bravais. In a Bravais lattice, all points of the lattice are equivalent, meaning the atoms are arranged uniformly throughout the crystal. A non-Bravais lattice, on the other hand, has inequivalent lattice points, indicating that the atoms are not the same everywhere; a non-Bravais lattice is therefore referred to as a lattice with a basis.
Figure 1a conveys the overall structure of a crystal: we can think of it as the combination of two things, the lattice (the organized arrangement of points) and the basis (the specific arrangement of atoms at those points). So, in simpler terms:

Crystal Structure = Lattice + Basis


This equation helps us grasp the fundamental elements that contribute to the way
atoms are organized in a crystal.
Figures 1c–e depict the prevalent arrangement of unit cells observed in a variety of metals. For cubic crystal systems, such as FCC or BCC, the lattice parameter is typically denoted by ‘a’ and represents the distance between adjacent lattice points along each edge of the cubic unit cell. In other crystal systems, such as hexagonal, there may be multiple lattice parameters that define the dimensions along different crystallographic axes. In Figs. 1c, d the parameter ‘a’, and in Fig. 1e both parameters ‘a’ and ‘c’, are referred to as lattice parameters or lattice constants.
Metallic specimens are composed of multiple unit cells repeated in a pattern. However, the alignment between the overall orientation of the specimen and the individual crystals it comprises may not necessarily coincide. An effective approach to describe the crystal’s orientation relative to the specimen involves defining two distinct coordinate systems (CS): a crystal CS and a global CS as shown in Fig. 1b. The global (specimen) CS, often used in experiments, describes the orientation of a sample based on its geometry, while the crystal CS in crystallography aligns with the crystal lattice. Rotation matrices R facilitate relating these systems by transforming vectors from one system to another. By applying R to vectors in the global CS, they can be converted to the crystal CS.
Rotations of vectors from global to local are carried out according to (1) and vice versa according to (2):

$$ x' = R\,x, \qquad (1) $$
$$ x = R^{T} x', \qquad (2) $$

where x and x' are vectors in the global and crystal CS, respectively.
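As a minimal sketch of how (1) and (2) can be realized in NumPy (an illustrative fragment, not one of the chapter’s generated listings; the example indices are the mutually orthogonal triple used later in Table 4), the rows of R are the normalized crystal directions:

import numpy as np

# Rows of R are the normalized target axes, given as orthogonal Miller indices.
hkl_x, hkl_y, hkl_z = (1, -1, 2), (-1, 1, 1), (1, 1, 0)
R = np.array([np.array(v, dtype=float) / np.linalg.norm(v)
              for v in (hkl_x, hkl_y, hkl_z)])

x = np.array([0.0, 0.0, 1.0])  # a vector in the global CS
x_prime = R @ x                # Eq. (1): global -> crystal CS
x_back = R.T @ x_prime         # Eq. (2): recovers the global vector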
However, pristine crystals, characterized by highly ordered repeating atomic patterns, do not undergo plastic deformation and are thus relatively uninteresting. Conversely, real-world materials are predominantly governed by defects, which can be categorized into various dimensions: zero-dimensional (point) defects (such as vacancies, interstitials, and substitutions), one-dimensional (line) defects (such as dislocations), two-dimensional (planar) defects (such as grain boundaries, interfaces, and free surfaces), and three-dimensional (volume) defects (such as pores and voids).
In this section, our focus is directed solely towards the examination of defects stemming from dislocations. Dislocations represent linear or one-dimensional irregularities within a crystal lattice, where the alignment of atoms deviates from the normal arrangement. Dislocations are broadly categorized into two types: edge dislocations and screw dislocations. As illustrated in Fig. 1f, an edge dislocation is characterized by the insertion of an extra portion of a plane of atoms, the edge of which terminates within the crystal. This edge is called a dislocation line, abbreviated by ξ; for an edge dislocation the Burgers vector b is perpendicular to ξ. Within the region around the dislocation line there is some localized lattice distortion. The atoms above the dislocation line in Fig. 1f undergo compression, and those below undergo tension.

(a) Crystal structure: Basis + Lattice = Crystal (b) Coordinate systems (CS) (c) FCC crystal structure (d) BCC crystal structure (e) HCP crystal structure (f) Edge dislocation (g) Screw dislocation

Fig. 1 Definition by sketches of (a) crystal structure, (b) of coordinate systems, (c)–(e) of unit
cells representing fundamental crystal structures, and representations of (f) edge and (g) screw
dislocations within a crystal lattice

(a) Visualization of parameters involved in screw dislocation for ξ ∥ Z-axis (b) Visualization of parameters involved in edge dislocation for ξ ∥ Z-axis

Fig. 2 Illustration of parameters involved in screw and edge dislocation

A screw dislocation may be thought of as being formed by a shear stress that is applied to produce the distortion shown in Fig. 1g. Here, the upper front region of the crystal is shifted one atomic distance to the right relative to the bottom portion. In the screw dislocation, the Burgers vector b is parallel to the ξ-direction. While most dislocations in crystalline materials likely possess characteristics of both edge and screw dislocations, known as mixed dislocations, our focus here remains on pure forms for convenience. For a deeper exploration of crystal defects, comprehensive insights are available in [2]; for dislocations in particular see [1, 4].
In general, a dislocation is introduced using the displacements predicted in the theory of dislocations. If for a screw dislocation ξ is parallel to the Z-axis as shown in Fig. 2a, then each atom is displaced by a distance u_z parallel to the dislocation line (ξ), hence the Z-direction, and proportional to the norm of the Burgers vector b := |b|, according to [3]

$$ u_z = \frac{b}{2\pi}\,\arctan\!\left(\frac{y}{x}\right), \qquad (3) $$

where x and y represent the positions of atoms in the plane normal to ξ.


Similarly, for an edge dislocation with ξ parallel to the Z-axis as shown in Fig. 2b, the displacements (u_x and u_y) applied to atoms are contained in the plane normal to ξ according to

$$ u_x = \frac{b}{2\pi}\left[\arctan\!\left(\frac{y}{x}\right) + \frac{x\,y}{2(1-\nu)(x^2+y^2)}\right], $$
$$ u_y = -\frac{b}{2\pi}\left[\frac{(1-2\nu)\,\ln(x^2+y^2)}{4(1-\nu)} + \frac{x^2-y^2}{4(1-\nu)(x^2+y^2)}\right]. \qquad (4) $$

Here, ν is the Poisson’s ratio of the material. If the position of ξ is parallel to the X- or Y-axis, the positions of atoms in (3) and (4) have to be adjusted consistently, cf. [3].
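For concreteness, a minimal NumPy sketch of the displacement fields (3) and (4) for a dislocation line along the Z-axis might read as follows (an illustrative fragment, not one of the chapter’s listings; as in Listing 1.3, np.arctan2 is used for the angle to preserve quadrant information):

import numpy as np

def screw_uz(x, y, b):
    # Eq. (3): displacement parallel to the dislocation line (Z-direction)
    return b / (2 * np.pi) * np.arctan2(y, x)

def edge_ux_uy(x, y, b, nu):
    # Eq. (4): in-plane displacements for an edge dislocation
    r2 = x**2 + y**2
    ux = b / (2 * np.pi) * (np.arctan2(y, x) + x * y / (2 * (1 - nu) * r2))
    uy = -b / (2 * np.pi) * ((1 - 2 * nu) * np.log(r2) / (4 * (1 - nu))
                             + (x**2 - y**2) / (4 * (1 - nu) * r2))
    return ux, uy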
In the subsequent sections of this chapter, the prompts generated will assess the Python coding proficiency of GPT-4 in creating single crystals, specifically focusing on the FCC, BCC, and HCP structures in arbitrary orientations. Additionally,
starting from the pristine or defect-free FCC crystal, we will introduce edge and screw
dislocations. The visualization of the generated results will be facilitated through
OVITO [8], a renowned scientific software extensively employed for visualizing and
analyzing molecular and material simulation data in the domain of computational
materials science and engineering.

2 Prompt

Users have the flexibility to input a wide range of questions to obtain answers to their
queries. However, the key lies in developing the skill to pose questions in a manner
that an Artificial Intelligence (AI) engine can comprehend, leading to the production
of high-quality and reliable results. An example of such a well-constructed prompt,
achieved through prompt engineering, is presented in Fig. 4.
The prompt offers a comprehensive roadmap of the Python program’s objectives,
detailing specific instructions for generating crystal structures for FCC, BCC, and
HCP systems, with an emphasis on accurately determining and saving atom coor-
dinates. Clear function requirements are outlined such as the function name and
user inputs. Primitive and basis vectors for each crystal type are explicitly defined
to ensure precise calculations for atom positions. The desired output file name is
specified, along with the file format identified as XYZ, compatible with visualiza-
tion tools like OVITO. User interface aspects, including prompts for crystal type,
lattice constants, box length with periodicity in each direction, and orientation, are
also described. Additionally, suggestions for error-handling measures to validate
user inputs, especially for lattice constants and crystal types, are provided. To ensure
reproducibility, two key factors are emphasized: clear prompts defining crucial loop
adjustments in for loops and a reminder that primitive and basis vectors must be
rotated before generating atom positions. The prompt intentionally refrains from specifying the function for calculating the periodicity length; instead, an additional prompt (Fig. 3) is provided to calculate it separately. This decision is made to
mitigate the inherent randomness in GPT-4’s behavior, ensuring that the code remains
reproducible, at least after a few iterations. These topics will be discussed further in
detail.

Fig. 3 Prompt for calculating periodicity



3 User Input Prompt and Results of Periodicity Calculation


3.1 Prompt for Periodicity Calculation

Figure 3 shows the input prompt provided for the periodicity calculation. The Python code generated by GPT-4 is given in code Listing 1.1, and the corresponding user inputs and code output are detailed in Tables 1 and 2, respectively.

Table 1 Parameters requested from the user and corresponding inputs

Parameters requested                                          FCC      BCC      HCP
Enter crystal type (FCC, BCC, HCP):                           FCC      BCC      HCP
Enter lattice constant ‘a’ (Å):                               3.51     3.3      3.21
Enter lattice constant ‘c’ (Å):                               –        –        5.21
Enter the number of unit cells in each direction
for the simulation box (x, y, z):                             5, 5, 5  5, 5, 5  5, 5, 5

Table 2 Representation of box length considering periodicity

Description                      FCC         BCC         HCP
Periodicity lengths (x, y, z):   43, 43, 43  38, 38, 38  51, 51, 135
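As a plausibility check of these numbers: Listing 1.1 below returns, per direction, the truncated product of the number of unit cells, the lattice constant, and the norm of the corresponding primitive vector. For FCC with a = 3.51 Å and 5 unit cells per direction,

$$ |\mathbf{p}| = \tfrac{\sqrt{2}}{2}\,a \approx 2.48, \qquad \lfloor 5 \times 3.51 \times 2.48 \rfloor = 43, $$

which reproduces the FCC entry 43, 43, 43 in Table 2; the BCC and HCP entries follow analogously.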

3.2 Code Listing for Periodicity Calculation


import numpy as np

def generate_crystal_structure(crystal_type, a, c, number_of_unit_cells):
    # Validate crystal type
    if crystal_type not in ['FCC', 'BCC', 'HCP']:
        raise ValueError("Crystal type must be 'FCC', 'BCC', or 'HCP'.")

    # Validate lattice constants
    if not a > 0 or (crystal_type == 'HCP' and not c > a):
        raise ValueError("Invalid lattice constants. Ensure 'a' > 0 and for HCP, 'c' > 'a'.")

    # Define primitive vectors
    if crystal_type in ['FCC', 'BCC']:
        primitive_vectors = np.array([[0.5*a, 0.5*a, 0], [0.5*a, 0, 0.5*a],
                                      [0, 0.5*a, 0.5*a]])
    elif crystal_type == 'HCP':
        primitive_vectors = np.array([[0.5*a, -0.5*np.sqrt(3)*a, 0],
                                      [0.5*a, 0.5*np.sqrt(3)*a, 0], [0, 0, c]])

    # Calculate periodicity
    norm_primitive = np.linalg.norm(primitive_vectors, axis=1)
    if crystal_type in ['FCC', 'BCC']:
        periodicity_length = (np.array(number_of_unit_cells) * a * norm_primitive).astype(int)
    elif crystal_type == 'HCP':
        periodicity_length_xy = (np.array(number_of_unit_cells[:2]) * a * norm_primitive[:2]).astype(int)
        periodicity_length_z = (number_of_unit_cells[2] * c * norm_primitive[2]).astype(int)
        periodicity_length = np.concatenate([periodicity_length_xy, [periodicity_length_z]])

    return periodicity_length

def get_user_inputs():
    while True:
        crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").upper()
        if crystal_type in ['FCC', 'BCC', 'HCP']:
            break
        print("Invalid crystal type. Please choose from 'FCC', 'BCC', or 'HCP'.")

    a = float(input("Enter lattice constant 'a': "))

    if crystal_type == 'HCP':
        c = float(input("Enter additional lattice constant 'c' (must be greater than 'a'): "))
    else:
        c = None  # Not required for FCC and BCC

    number_of_unit_cells = input("Enter the number of unit cells in each direction for the simulation box (x, y, z): ")
    number_of_unit_cells = tuple(map(int, number_of_unit_cells.split(',')))

    return crystal_type, a, c, number_of_unit_cells

if __name__ == "__main__":
    try:
        crystal_type, a, c, number_of_unit_cells = get_user_inputs()
        periodicity_length = generate_crystal_structure(crystal_type, a, c, number_of_unit_cells)
        print("Periodicity lengths (x, y, z):", periodicity_length)
    except ValueError as e:
        print("Error:", e)
Code Listing 1.1 Code generated by GPT-4 for periodicity calculation

4 User Prompt and Results of Structure Generation with Arbitrary Orientation
4.1 Prompt for Structure Generation

Following the execution of code for periodicity calculation, the user input displayed
in Fig. 4 is provided to GPT-4 to generate crystal structures with arbitrary orientations.

Fig. 4 Prompt for crystal structure generation



Fig. 4 (continued)

4.2 Code Listing for Crystal Structure Generation

Code Listing 1.2 shows the final code snippet produced by GPT-4; the corresponding user inputs are outlined in Table 3. The outcomes of the code execution are visually depicted in Fig. 5.
#!/usr/bin/env python
# coding: utf-8

import numpy as np

# Rotation matrix calculation from Miller indices
def rotation_matrix_from_miller_indices(hkl_x, hkl_y, hkl_z):
    # Normalize input vectors
    x = np.array(hkl_x) / np.linalg.norm(hkl_x)
    y = np.array(hkl_y) / np.linalg.norm(hkl_y)
    z = np.array(hkl_z) / np.linalg.norm(hkl_z)

    # Ensure orthogonality and compute the rotation matrix
    if np.dot(x, y) == 0 and np.dot(x, z) == 0 and np.dot(y, z) == 0:
        return np.array([x, y, z]).T
    else:
        raise ValueError("Input Miller indices are not orthogonal.")

# Generate the crystal structure
def generate_crystal_structure(crystal_type, a, c, periodicity_length, orientation):
    rot_matrix = rotation_matrix_from_miller_indices(*orientation)
    positions = []

    if crystal_type == 'FCC':
        primitive_vectors = np.array([[0.5*a, 0.5*a, 0], [0.5*a, 0, 0.5*a], [0, 0.5*a, 0.5*a]])
        basis_vectors = np.array([[0, 0, 0]])
    elif crystal_type == 'BCC':
        primitive_vectors = np.array([[-0.5*a, 0.5*a, 0.5*a], [0.5*a, -0.5*a, 0.5*a], [0.5*a, 0.5*a, -0.5*a]])
        basis_vectors = np.array([[0, 0, 0]])
    elif crystal_type == 'HCP':
        primitive_vectors = np.array([[0.5*a, -0.5*(3**(1/2))*a, 0], [0.5*a, 0.5*(3**(1/2))*a, 0], [0, 0, c]])
        basis_vectors = np.array([[0.5*a, 0.5*(3**(-1/2))*a, 0.25*c], [0.5*a, -0.5*(3**(-1/2))*a, 0.75*c]])
    else:
        raise ValueError("Invalid crystal type.")

    # Apply rotation to primitive and basis vectors
    primitive_vectors = np.dot(primitive_vectors, rot_matrix)
    basis_vectors = np.dot(basis_vectors, rot_matrix)

    # Generate atom positions
    for i in range(-periodicity_length[0], periodicity_length[0] + 1):
        for j in range(-periodicity_length[1], periodicity_length[1] + 1):
            for k in range(-periodicity_length[2], periodicity_length[2] + 1):
                cell_origin = i * primitive_vectors[0] + j * primitive_vectors[1] + k * primitive_vectors[2]
                for basis in basis_vectors:
                    pos = cell_origin + basis
                    if all(0 <= pos[n] <= (periodicity_length[n] + 1) for n in range(3)):
                        positions.append(pos)

    return positions

# Save positions to XYZ file
def save_to_xyz_file(atom_positions):
    with open("Crystal_structure.xyz", "w") as file:
        file.write(f"{len(atom_positions)}\n\n")
        for pos in atom_positions:
            file.write(f"Atom {pos[0]} {pos[1]} {pos[2]}\n")

# User interface
def main():
    crystal_type, a, c, periodicity_length, orientation = get_user_input()
    positions = generate_crystal_structure(crystal_type, a, c, periodicity_length, orientation)
    save_to_xyz_file(positions)
    print("Crystal structure generated and saved to Crystal_structure.xyz.")

def get_user_input():
    # Collect and validate crystal type
    crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").strip().upper()
    while crystal_type not in ['FCC', 'BCC', 'HCP']:
        print("Invalid crystal type. Please enter FCC, BCC, or HCP.")
        crystal_type = input("Enter crystal type (FCC, BCC, HCP): ").strip().upper()

    # Collect and validate lattice constant 'a'
    a = float(input("Enter lattice constant 'a': "))

    # Initialize 'c' for HCP and validate if needed
    c = 0
    if crystal_type == 'HCP':
        c = float(input("Enter additional lattice constant 'c' (must be > a): "))
        while c <= a:
            print("Invalid 'c' value. It must be greater than 'a'.")
            c = float(input("Enter additional lattice constant 'c': "))

    # Collect and validate periodicity length (x, y, z)
    periodicity_length_str = input("Enter periodicity length (x, y, z): ")
    x, y, z = map(int, periodicity_length_str.split(','))

    # Collect and validate orientation
    print("Enter orientation as Miller indices (h, k, l) for x, y, z directions:")
    hkl_x = tuple(map(int, input("Orientation for x direction: ").split(',')))
    hkl_y = tuple(map(int, input("Orientation for y direction: ").split(',')))
    hkl_z = tuple(map(int, input("Orientation for z direction: ").split(',')))

    # Ensure orientation vectors are orthogonal
    # Note: this is a simplified check and may need refinement based on specific requirements
    if not (np.dot(hkl_x, hkl_y) == 0 and np.dot(hkl_x, hkl_z) == 0 and np.dot(hkl_y, hkl_z) == 0):
        print("Warning: The provided orientation vectors are not orthogonal. Please check your input.")

    return crystal_type, a, c, (x, y, z), (hkl_x, hkl_y, hkl_z)

if __name__ == "__main__":
    main()
Code Listing 1.2 Code generated by GPT-4 for single crystal generation
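For reference, the XYZ file written by save_to_xyz_file consists of the atom count, an empty comment line, and one coordinate record per atom. A small fragment could look like this (coordinates illustrative, here for FCC with a = 3.51 Å); such a file can be opened directly in OVITO:

4

Atom 0.0 0.0 0.0
Atom 1.755 1.755 0.0
Atom 1.755 0.0 1.755
Atom 0.0 1.755 1.755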

4.3 Results of Structure Generation

The outcomes of generating single crystals of FCC, BCC, and HCP in native orientations are illustrated in Fig. 5a–c, as induced by the corresponding input parameters listed in Table 3. The inputs for orientations other than native, whose results are shown in Fig. 6, are listed in Table 4.

Table 3 User inputs and output for crystal structure generation in native orientation

Input requested parameters:             FCC         BCC         HCP
Crystal type (FCC, BCC, HCP):           FCC         BCC         HCP
Lattice constant ‘a’:                   3.51        3.3         3.21
HCP lattice constant ‘c’ (with c > a):  –           –           5.21
Periodicity length (x, y, z):           43, 43, 43  38, 38, 38  51, 51, 135
Orientation as Miller indices (h, k, l)
For x direction:                        1, 0, 0     1, 0, 0     1, 0, 0
For y direction:                        0, 1, 0     0, 1, 0     0, 1, 0
For z direction:                        0, 0, 1     0, 0, 1     0, 0, 1
Output: Crystal structure generated and saved to Crystal_structure.xyz

Table 4 User inputs for structure generation in orientation other than native

Requested input parameters:             FCC         BCC         HCP
Crystal type (FCC, BCC, HCP):           FCC         BCC         HCP
Lattice constant ‘a’:                   3.51        3.3         3.21
HCP lattice constant ‘c’ (with c > a):  –           –           5.21
Periodicity length (x, y, z):           43, 43, 43  38, 38, 38  51, 51, 135
Orientation as Miller indices (h, k, l)
For x direction:                        1, −1, 2    1, −1, 2    1, −1, 2
For y direction:                        −1, 1, 1    −1, 1, 1    −1, 1, 1
For z direction:                        1, 1, 0     1, 1, 0     1, 1, 0
Output: Crystal structure generated and saved to Crystal_structure.xyz

(a) FCC crystal structure (b) BCC crystal structure (c) HCP crystal structure

Fig. 5 Generation of single crystals in native orientation (edges align with <100> directions) and visualization using OVITO

(a) FCC crystal structure (b) BCC crystal structure (c) HCP crystal structure

Fig. 6 Generation of single crystals in orientation other than native and visualization using OVITO

5 User Prompt and Results for Incorporating Dislocations into the Pristine Crystal

5.1 Prompt for Dislocation Insertion

Following the crystal generation process, the user prompt depicted in Fig. 7 was
utilized once more to introduce screw and edge dislocations into the pristine FCC
samples.

Fig. 7 Prompt for introducing dislocations into the crystal



Fig. 7 (continued)

5.2 Code Listing for Introducing Dislocation

Code Listing 1.3 shows the final code snippet produced by GPT-4; the corresponding user inputs are outlined in Tables 5 and 6. The outcomes of the code execution are visually depicted in Figs. 8 and 9.
import numpy as np

def calculate_edge_displacement(x1, x2, b, nu):
    theta = np.arctan2(x2, x1)
    r2 = x1**2 + x2**2  # Square of the radial distance r
    u1 = (b / (2 * np.pi)) * (theta + (x1 * x2) / (2 * (1 - nu) * r2))
    u2 = (-b / (2 * np.pi)) * ((1 - 2 * nu) * np.log(r2) / (4 * (1 - nu))
                               + (x1**2 - x2**2) / (4 * (1 - nu) * r2))
    return u1, u2

def calculate_screw_displacement(x1, x2, b):
    theta = np.arctan2(x2, x1)
    u3 = b * theta / (2 * np.pi)
    return u3

def apply_displacement(atom_positions, dislocation_position, b, nu, direction, dislocation_type):
    modified_atoms = []
    for atom in atom_positions:
        element, x, y, z = atom
        if direction == 'z':
            dx, dy = x - dislocation_position[0], y - dislocation_position[1]
        elif direction == 'x':
            dy, dz = y - dislocation_position[0], z - dislocation_position[1]
        elif direction == 'y':
            dx, dz = x - dislocation_position[0], z - dislocation_position[1]

        if dislocation_type == 'edge':
            if direction == 'z':
                u1, u2 = calculate_edge_displacement(dx, dy, b, nu)
                modified_atoms.append([element, x + u1, y + u2, z])
            elif direction == 'x':
                u1, u2 = calculate_edge_displacement(dy, dz, b, nu)
                modified_atoms.append([element, x, y + u1, z + u2])
            elif direction == 'y':
                u1, u2 = calculate_edge_displacement(dx, dz, b, nu)
                modified_atoms.append([element, x + u1, y, z + u2])
        elif dislocation_type == 'screw':
            # Use the in-plane relative coordinates matching the chosen line direction
            if direction == 'z':
                u3 = calculate_screw_displacement(dx, dy, b)
                modified_atoms.append([element, x, y, z + u3])
            elif direction == 'x':
                u3 = calculate_screw_displacement(dy, dz, b)
                modified_atoms.append([element, x + u3, y, z])
            elif direction == 'y':
                u3 = calculate_screw_displacement(dx, dz, b)
                modified_atoms.append([element, x, y + u3, z])
    return modified_atoms

def insert_dislocation(input_file_name, output_file_name, lattice_constant,
                       dislocation_position, direction, dislocation_type, nu):
    with open(input_file_name, 'r') as file:
        lines = file.readlines()

    atoms = [[line.split()[0], float(line.split()[1]), float(line.split()[2]), float(line.split()[3])]
             for line in lines[2:]]

    b = np.sqrt(2) * lattice_constant / 2  # Burgers vector for FCC

    modified_atoms = apply_displacement(atoms, dislocation_position, b, nu, direction, dislocation_type)

    with open(output_file_name, 'w') as file:
        file.write(f"{len(modified_atoms)}\n")
        file.write(f"Atoms with {dislocation_type} dislocation\n")
        for atom in modified_atoms:
            file.write(" ".join(map(str, atom)) + "\n")

# User input handling
input_file_name = input("Enter the name of the input .xyz file: ")
output_file_name = input("Enter the name of the output .xyz file: ")
lattice_constant = float(input("Enter the lattice constant: "))
nu = float(input("Enter the Poisson's ratio: "))
direction = input("Enter the direction of the dislocation line (x, y, or z): ").lower()
dislocation_type = input("Enter the type of dislocation (screw or edge): ").lower()

# Request additional coordinates based on the direction of the dislocation line
if direction == 'x':
    dislocation_y = float(input("Enter the y-coordinate of the dislocation line: "))
    dislocation_z = float(input("Enter the z-coordinate of the dislocation line: "))
    dislocation_position = (dislocation_y, dislocation_z)
elif direction == 'y':
    dislocation_x = float(input("Enter the x-coordinate of the dislocation line: "))
    dislocation_z = float(input("Enter the z-coordinate of the dislocation line: "))
    dislocation_position = (dislocation_x, dislocation_z)
elif direction == 'z':
    dislocation_x = float(input("Enter the x-coordinate of the dislocation line: "))
    dislocation_y = float(input("Enter the y-coordinate of the dislocation line: "))
    dislocation_position = (dislocation_x, dislocation_y)
else:
    raise ValueError("Invalid direction. Please enter x, y, or z.")

insert_dislocation(input_file_name, output_file_name, lattice_constant,
                   dislocation_position, direction, dislocation_type, nu)
Code Listing 1.3 Code generated by GPT-4 for introducing dislocation into FCC crystal

Table 5 User inputs for insertion of screw dislocation

Requested input parameters: File names
Enter the name of the input .xyz file:   Crystal_structure.xyz
Enter the name of the output .xyz file:  Screw_dislocation.xyz

Requested input parameters: Dislocation line             X      Y      Z
Enter the lattice constant:                              3.51   3.51   3.51
Enter the Poisson’s ratio (ν):                           0      0      0
Enter the direction of the dislocation line (x, y, z):   X      Y      Z
Enter the type of dislocation (screw or edge):           Screw  Screw  Screw
Enter the coordinate of the dislocation line:
For x direction:                                         –      10     10
For y direction:                                         10     –      10
For z direction:                                         10     10     –

Table 6 User inputs for insertion of edge dislocation

Requested input parameters: File names
Enter the name of the input .xyz file:   Crystal_structure.xyz
Enter the name of the output .xyz file:  Edge_dislocation.xyz

Requested input parameters: Dislocation line             X      Y      Z
Enter the lattice constant:                              3.51   3.51   3.51
Enter the Poisson’s ratio (ν):                           0.33   0.33   0.33
Enter the direction of the dislocation line (x, y, z):   X      Y      Z
Enter the type of dislocation (screw or edge):           Edge   Edge   Edge
Enter the coordinate of the dislocation line:
For x direction:                                         –      10     15
For y direction:                                         10     –      15
For z direction:                                         10     10     –
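Note that Listing 1.3 derives the Burgers vector magnitude internally from the lattice constant as b = √2·a/2; for the inputs above (a = 3.51 Å) this evaluates to

$$ b = \frac{\sqrt{2}}{2} \times 3.51\,\text{Å} \approx 2.48\,\text{Å}, $$

which is the magnitude of the a/2⟨110⟩ Burgers vector of the FCC lattice.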

5.3 Results of Dislocation Insertion

The outcome of the code for inserting a screw dislocation into the crystal is illustrated in Fig. 8. Note that for screw dislocations the line direction ξ aligns with the Burgers vector.

(a) ξ-direction ∥ to X-axis (b) ξ-direction ∥ to Y-axis (c) ξ-direction ∥ to Z-axis

Fig. 8 Screw dislocations in FCC crystals for different orientations. The green arrow indicates the direction of the Burgers vector, the red arrow represents the dislocation line direction. Rendering using OVITO

The outcome of the code for inserting an edge dislocation into the crystal is illustrated in Fig. 9. Note that for edge dislocations the direction ξ of the dislocation line is perpendicular to the Burgers vector.

(a) ξ ∥ to X-direction (b) ξ ∥ to Y-direction (c) ξ ∥ to Z-direction

Fig. 9 Edge dislocations in FCC crystals for different orientations. The green arrow indicates
the direction of the Burgers vector, while the blue arrows represent the dislocation line direction.
Rendering by using OVITO

6 Testing for Verification

6.1 Major Issues

In the initial formulation of the prompt, the problem objective was clearly articulated,
as depicted in Fig. 10. GPT-4 demonstrated its proficiency by generating representations of all three crystal structures (FCC, BCC, and HCP) based on this initial prompt.
However, a notable observation surfaced during the simulation box depiction, where
it became apparent that the atoms did not entirely occupy the designated space, as
illustrated in Fig. 11. This discrepancy persisted despite the explicit specification
of this condition in the user prompt. The incongruity raises a significant concern
regarding the accurate adherence to the specified conditions during the generation
process.
To address this issue in the subsequent iteration, a more generic prompt was
presented to the language model, as depicted in Fig. 12.

6.1.1 Randomness in the Model

As observed in the subsequent iteration, the introduction of a more generic prompt led to unpredictable behavior in GPT-4. Notably, the model began to deviate from
the initially provided user input, altering primitive and basis vectors. Consequently,
all three crystal structures generated by the model deviated significantly from the
expected configurations, resulting in a failure to detect any valid crystal structures in
OVITO as shown in Fig. 13. This outcome underscores the importance of precision in
formulating prompts when interacting with Large Language Models (LLMs), as they
may exhibit more random behavior in the absence of specific and explicit instructions.

Fig. 10 Initial prompt



(a) FCC crystal structure (b) BCC crystal structure (c) HCP crystal structure

Fig. 11 Simulation box containing atoms with partial occupancy

Fig. 12 Iteration 2 to overcome the issue of partial occupancy

(a) FCC crystal structure (b) BCC crystal structure (c) HCP crystal structure

Fig. 13 Consequence of a generic prompt: failure to detect all valid crystal structures

6.1.2 Short Term Memory

In subsequent iterations, efforts were made to curb the model’s random behavior by
providing more precise information. To retrieve the equations for primitive and basis
vectors from the initial prompt, GPT-4 was specifically instructed to use the primitive
vectors from the initial input. However, due to its limited short-term memory, the
model encountered difficulties in recalling all the required information from the initial
prompt shown in Fig. 10. Consequently, primitive and basis vectors were reintroduced

as input prompts to ensure accuracy. It is noteworthy that the challenges observed with GPT-3.5 persist with GPT-4, highlighting the continued need for careful and
explicit instructions to overcome limitations in model memory and enhance overall
performance.

6.1.3 Code Reproducibility

In the subsequent iteration, the model successfully generated an exact code capable
of producing all three crystal structures in arbitrary orientations. Acknowledging
the significant influence of prompt engineering on the performance of LLMs, GPT-4
was specifically tasked with generating a prompt that would enable the production of
precise code consistently. Despite incorporating a comprehensive prompt as depicted
in Fig. 14 that encapsulated all the necessary information for the precise regeneration
of the code, the model encountered challenges in reproducing identical results.
Due to its constrained domain knowledge and limited reasoning capability, the model encountered difficulties in distinguishing between the concept of the number of unit cells and the calculation of box length considering periodicity. Specifically, it struggled with prompts related to the periodicity calculation, particularly for HCP structure generation, as HCP possesses a distinct lattice constant in the z-direction. Despite numerous attempts to generate prompts, the model remained inconsistent in producing accurate code. This inconsistency can be attributed to the inherent limitations of the model’s understanding and its tendency to overlook specific aspects of the input prompts.
In addition, the model’s performance was hindered by the abundance of inputs provided, which may have overwhelmed its attention and led to confusion. When presented with an excessive amount of information, the model may struggle to prioritize and comprehend the key components necessary for generating accurate code. As a result, splitting the prompt into two distinct parts, one for calculating the box length with periodicity (Fig. 3) and the other for generating a crystal structure with arbitrary orientation (Fig. 4), was deemed necessary to enhance code reproducibility. This approach streamlines the input process and mitigates the model’s tendency to overlook essential details, ultimately improving the reliability of the generated code.
After splitting the entire task into smaller, more manageable prompts, the randomness in the model’s responses became controllable. This allowed the model to produce code for calculating periodicity more accurately, but it still required a few iterations to refine the output and obtain the exact code for structure generation. Overall, this approach proved far better than using a single prompt for generating the entire code.

Fig. 14 Final prompt generated by GPT-4



Fig. 14 (continued)

Table 7 List of frequent errors and subsequent prompts

Frequent errors                  Subsequent prompts
Validation of atom position      The condition used to check the atom positions is not precisely followed in the code.
For loop ranges                  The loop range condition needs to be strictly adhered to as specified in the input prompt.
Rotation                         The primitive and basis vectors have to be rotated initially.
Relative distance calculation    x1 and x2 should be calculated as the relative distance between the dislocation line and the atom position.
Position of dislocation          The dislocation position should be taken as input based on the dislocation direction provided by the user.

7 Minor Errors

The code generated by the model might need some adjustments, as common errors recurred during its development. Table 7 lists the frequent issues identified, along with the additional guidance provided for rectification. With the subsequent prompts applied, one can replicate the process with few problems, apart from some minor periodicity-related issues.

8 Discussion

Utilizing GPT-4 for evaluating its potential as automated programming assistance across a range of atomistic simulation tasks has generally yielded satisfactory outcomes. However, as the complexity of the tasks escalates, deviations from the user prompts become more apparent in the model’s behavior. Here, we present a comprehensive discussion outlining observations on GPT-4’s performance in handling intricate tasks.

• Limitation in Handling Complex Tasks: The model encountered difficulties in accurately generating all three basic crystal structures without precise user inputs. Challenges arose in properly populating the simulation box and validating atom positions, particularly without explicit instructions such as employing for loops and verifying positions. The HCP crystal structure posed significant hurdles due to its intricate nature, involving distinct lattice parameters (‘a’ & ‘c’) and additional basis vectors, in contrast to FCC and BCC. This underscores the model’s limitations in handling complex structures. Furthermore, when arbitrary rotations were introduced, although the model could generate the rotation matrix, it struggled to apply it accurately to the primitive vectors and basis vectors. Consequently, human feedback remained essential even with clear and concise prompts, highlighting the ongoing need for human intervention in such scenarios. Despite receiving feedback, issues pertaining to the periodicity of surface atoms persist in the final output shown in Figs. 5 and 6.
• Autonomous Decision-Making in Mathematical Computations: It is noteworthy that the model’s proficiency in handling tasks related to dislocation insertion surpasses its capability in generating single crystals with arbitrary rotations. In this scenario, it autonomously figured out the precise equations shown in (3) and (4) for calculating displacements, even in the absence of explicit user prompts. Furthermore, it opted to utilize the np.arctan2(y, x) function from NumPy instead of np.arctan(y/x), as the former accounts for quadrant distinctions, unlike the latter, which cannot differentiate between quadrants (see the short check after this list). These decisions were made without specific instructions from the user, highlighting the model’s ability to autonomously make informed choices in its computations. These capabilities suggest an advanced level of mathematical understanding and problem-solving skill in GPT-4. However, it is important to note that, while the model-generated code demonstrates competency in many areas, minor issues may still arise in the insertion of dislocations in specific orientations. Nonetheless, it has consistently showcased its capability to generate entire code segments within a few iterations, a notable feat when compared to single crystal generation.
• Sensitivity to User Prompts and Memory Constraints: Moreover, the model’s
responses are highly sensitive to even minor changes in the input prompt, resulting
in inconsistent outputs. Additionally, it has been observed that GPT-4 sometimes
struggles to maintain coherence and consistency over extended conversational con-
texts [7]. Understanding the memory mechanism of ChatGPT Plus is essential for
optimizing interactions with the AI. ChatGPT Plus, based on the GPT-4 architec-
ture, has a short-term memory capacity limited by an 8,000-token constraint [6].
As this limit is reached, the AI begins to forget the earliest parts of the conversation,
unable to recall details beyond its immediate context window. Instead, it relies on
patterns and knowledge acquired during its training on a vast collection of inter-
net texts to generate relevant responses. Recognizing this limitation is crucial for
generating effective prompts and achieving more meaningful engagements with
GPT-4 [5]. Of course, well-established software systems for Molecular Dynamics
(MD) simulations with inbuilt atomic structure generation are available, e.g.,
LAMMPS [9] or Atomsk [3]. Their functionality clearly goes far beyond the code
presented in this chapter, ranging from straightforward crystal structures to more
intricate designs such as nanowires and non-periodic structures.
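
To make the quadrant distinction concrete, the following minimal sketch (an illustration added for this discussion, not part of the GPT-generated code) contrasts the two NumPy calls for a point in the third quadrant:

import numpy as np

# Point in the third quadrant: x < 0 and y < 0
x, y = -1.0, -1.0

theta_atan2 = np.arctan2(y, x)  # -3*pi/4 rad, i.e. -135 deg: correct quadrant
theta_atan = np.arctan(y / x)   #  +pi/4 rad, i.e.  +45 deg: quadrant lost

print(np.degrees(theta_atan2), np.degrees(theta_atan))  # -135.0  45.0

Since y/x = 1 holds for both (1, 1) and (-1, -1), np.arctan cannot tell the two points apart, whereas np.arctan2 receives both signs and resolves the ambiguity.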

9 Conclusion

In conclusion, the study aimed to evaluate GPT-4’s capability in generating standalone code for constructing single crystals with arbitrary orientations, including
dislocation insertion. The assessment involved examining the code generated by
GPT-4 in response to user-defined prompts using the visualization software OVITO. The
findings indicate that, with extensive human feedback, the generated results gener-
ally met user requirements satisfactorily. Large Language Models (LLMs) such as
GPT-4 have emerged as crucial assets across various industries. However, their effec-
tiveness relies heavily on well-crafted prompts as user input, highlighting the signifi-
cance of prompt engineering. Crafting effective prompts requires human intervention
and expertise in the relevant field to navigate and harness the inherent randomness
of these models. Without deep domain understanding, generating precise and spe-
cific prompts becomes challenging. Thus, while sophisticated LLMs offer immense
potential, human knowledge remains superior, especially in tackling intricate tasks.
However, the burgeoning discipline of prompt engineering is poised to play a pivotal
role in the future, aiding humans in their respective domains.

References

1. Anderson, P.M., Hirth, J.P., Lothe, J.: Theory of Dislocations. Cambridge University Press (2017)
2. Callister, W., Rethwisch, D.: Materials Science and Engineering: An Introduction. John Wiley & Sons, New York (2007)
3. Hirel, P.: Atomsk: A tool for manipulating and converting atomic data files, pp. 212–219 (2015). https://www.sciencedirect.com/science/article/pii/S0010465515002817
4. Hull, D., Bacon, D.J.: Introduction to Dislocations, Vol. 37. Elsevier (2011)
5. OpenAI ChatGPT memory span. https://www.4fsh.com/ [Online; accessed Feb-2024]
6. Chat GPT-4 Plus Memory Limit. community.openai.com, n.d. [Online; accessed Feb-2024]
7. Ray, P.: ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems 3, 121–154 (2023). https://www.sciencedirect.com/science/article/pii/S266734522300024X
8. Stukowski, A.: Visualization and analysis of atomistic simulation data with OVITO - the Open Visualization Tool. Modelling and Simulation in Materials Science and Engineering (2010)
9. Thompson, A.P., Aktulga, H.M., Berger, R., et al.: LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales, p. 108171 (2022). https://www.sciencedirect.com/science/article/pii/S0010465521002836
Molecular Dynamics Simulation of Noble
Gases

Aagashram Neelakandan, Vishal Vijendra Badami, and Bernhard Eidel

Abstract This chapter presents a detailed exploration of a Python code generated by ChatGPT-4 for the 2D Molecular Dynamics simulation of noble gases, with a specific
focus on Argon. The chapter discusses the implementation of the velocity Verlet
algorithm for integrating Newton’s second law and the application of the Lennard-
Jones (12–6) pair potential to model the interactions between Argon atoms. The
code is rigorously verified through a series of tests, emphasizing energy conservation
within an NVE ensemble. Both simple two-atom systems and more complex multi-
atom simulations within a periodic boundary condition framework are analyzed,
demonstrating the code’s accuracy and reliability in simulating molecular dynamics
in noble gases.

1 Introduction

Molecular dynamics (MD) simulations touch on many aspects of physics and are
a valuable resource for comparing theoretical models to experimental results. MD
uses computer simulation with statistical mechanics to compute static and dynamic
properties of a classical many-body system [1–4]. In contrast to Molecular Statics
(MS), whose simulations are carried out at a temperature of 0 K, MD simulations
are carried out at a temperature greater than 0 K. The classical MD method simply
solves numerically Newton’s equations of motion for the interacting many-particle
system with pair potentials. Here, we treat atoms as classical Newtonian particles and we can
compute the acceleration of any atom. The force between atoms is still determined
via the interatomic potential as the gradient of the pair potential function. In this
project, we are using the Lennard-Jones potential (LJ potential), also known as the
12–6 potential.
This is a valid potential for chemically inert gases such as Argon (Ar). Argon is
a noble gas. It is one of the elements in Group 18 of the periodic table, which
includes other noble gases like helium, neon, krypton, xenon, and radon. These
gases are characterized by their lack of reactivity due to having a full valence electron
shell, making them very stable and unlikely to form chemical bonds under normal
conditions. Argon is the third noble gas, following helium and neon.
Temperature and time play a role in MD simulations as compared to MS simulations. There are several ensembles in MD, among them the NVE (microcanonical)
ensemble, the NVT (canonical) ensemble, and the NPT ensemble. Here, N is the
number of particles, V is the volume of the simulation box, E is the total energy of
the simulation box, T is the temperature of the simulation box, and P is the pressure
in the simulation box. In the NVE ensemble, the variables N, V, and E are kept
constant; analogously, each of the above ensembles keeps its three name-giving
variables constant. For the NVT and NPT ensembles, additional couplings are used,
namely thermostats and barostats, respectively.

1.1 Interatomic Potential

At the core of MD simulations are interactions between the individual molecules.
These interactions can be separated into two types, bonded and non-bonded. Bonded
interactions take place within molecules, between atoms which are connected in some
way. Bonded interactions might include terms to change the length of a chemical
bond, or change the angle of a bond. All other interactions between atoms are classed
as non-bonded interactions. This simulation consists entirely of Argon atoms, so only
non-bonded interactions need to be considered. These can be modeled with the LJ
potential.
The LJ potential V(r) is given by:

V(r) = 4ε [ (σ/r)¹² − (σ/r)⁶ ] ,   (1)

where V(r) is the potential energy as a function of the distance r between two
particles, ε is the depth of the potential well, which represents the strength of the
attractive interaction, here with the value 1.65 × 10⁻²¹ J, and σ is the finite distance
at which the inter-particle potential is zero, also known as the collision diameter,
here with the value 3.4 × 10⁻¹⁰ m.
The LJ potential consists of a short-range repulsive term and a (relatively) long-
range attractive term. The r⁻⁶ attractive term comes from the London dispersion
force, which is the weakest part of the Van der Waals forces. Van der Waals forces
are weak attractions between atoms that do not cause chemical bonds to be formed;
they are often called induced dipole-dipole interactions.

The short-range repulsive term models the Pauli exclusion principle by stopping
particles from getting too close. The r⁻¹² exponent does not have a rigorous physical
justification. It is computationally efficient as it is just the square of the r⁻⁶
term.
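
As a minimal illustration of Eq. (1) with the argon parameters quoted above (a sketch written for this text, with hypothetical function names, independent of the generated code in later sections):

import numpy as np

EPSILON = 1.65e-21  # depth of the potential well for argon (J)
SIGMA = 3.4e-10     # collision diameter for argon (m)

def lj_potential(r):
    """Lennard-Jones 12-6 potential V(r) of Eq. (1)."""
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPSILON * (sr6 ** 2 - sr6)

def lj_force_magnitude(r):
    """Radial force F(r) = -dV/dr derived from Eq. (1)."""
    sr6 = (SIGMA / r) ** 6
    return 24.0 * EPSILON * (2.0 * sr6 ** 2 - sr6) / r

# V vanishes at r = sigma and reaches its minimum -epsilon at r = 2^(1/6) sigma
r_min = 2.0 ** (1.0 / 6.0) * SIGMA
print(lj_potential(SIGMA), lj_potential(r_min) / EPSILON)  # ~0.0 and ~-1.0

At r_min the force changes sign: it is repulsive for r < r_min and attractive for r > r_min.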
Justification for Use in Modeling Noble Gases Noble gases like argon are char-
acterized by their closed-shell electronic configurations, meaning that they have no
permanent dipole moments and are chemically inert. The interactions between noble
gas atoms are primarily due to weak van der Waals forces. The Lennard-Jones poten-
tial is particularly well-suited to modeling these interactions for several reasons:

• Van der Waals Forces Dominance: For noble gases, the interaction is predom-
inantly due to dispersion forces, which the Lennard-Jones potential effectively
captures through its (σ/r)⁶ term.
• Simplicity and Computational Efficiency: The Lennard-Jones potential is com-
putationally simple and efficient to calculate, making it ideal for large-scale sim-
ulations of noble gases where more complex potentials might be unnecessary or
impractical.
• Empirical Fit to Experimental Data: Parameters ε and σ can be fitted to exper-
imental data, allowing the Lennard-Jones potential to accurately reproduce the
properties of noble gases such as their phase behavior (liquid-gas coexistence
curve), transport properties (viscosity, diffusion), and thermodynamic properties
(equation of state).
• Historical Success: The Lennard-Jones potential has a long history of successful
application in simulating noble gases. It was originally developed based on exper-
imental observations of argon, and it has been validated extensively through its
ability to reproduce the experimentally observed behaviors of noble gases.

Limitations and Considerations While the Lennard-Jones potential is highly
effective for noble gases, its limitations shall be briefly mentioned: (i) the lack
of directionality, since the LJ potential does not account for any directional dependence in
bonding, which is unimportant for noble gases but would be a limitation for systems
involving directional bonds (e.g., covalent bonds in molecules); (ii) the
approximation of repulsion through the r⁻¹² term, which is an approximation of the repulsive
forces chosen primarily for computational convenience; while it works well for
noble gases, more complex repulsive terms might be needed for other systems; and,
as another shortcoming, (iii) the temperature and pressure dependence: the LJ potential
cannot be universally accurate across all temperatures and pressures, since the parameters
ε and σ are usually fitted for specific conditions.

1.2 Newton’s 2nd Law of Motion and Time Integration

Newton’s second law of motion is given by the time derivative of the linear momentum
p = m v and the force F according to

F = dp/dt ,   (2)

which simplifies for constant masses, hence ṁ = 0, to

F = m a ,   (3)

where m is the mass of the particle and a its acceleration.


The positions and velocities of the particles are evaluated using the Velocity
Verlet integration method, where the predicted positions x, predicted velocities v, and
predicted accelerations a at time t + Δt are obtained from the same, given quantities
at the current time t in the following way:

x(t + Δt) = x(t) + v(t) Δt + (1/2) a(t) Δt² ,   (4)

v(t + Δt) = v(t) + [ a(t) + a(t + Δt) ] Δt / 2 ,   (5)

where Δt is the time step size over which the integration is performed to obtain the
predicted quantities. The Velocity Verlet time integration method is employed since it is
fast, requires little memory, and is easy to use for long time steps.
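
A minimal sketch of one Velocity Verlet step, Eqs. (4) and (5), written here for illustration with a generic (hypothetical) acceleration function, not the velocity_verlet routine of the generated code below:

import numpy as np

def verlet_step(x, v, a, dt, accel):
    """Advance positions and velocities by one step of Eqs. (4)-(5)."""
    x_new = x + v * dt + 0.5 * a * dt ** 2  # Eq. (4)
    a_new = accel(x_new)                    # acceleration at the new positions
    v_new = v + 0.5 * (a + a_new) * dt      # Eq. (5)
    return x_new, v_new, a_new

# Example: harmonic oscillator a(x) = -x; the total energy stays nearly constant
accel = lambda x: -x
x, v = np.array([1.0]), np.array([0.0])
a = accel(x)
for _ in range(1000):
    x, v, a = verlet_step(x, v, a, 0.01, accel)
print(0.5 * v ** 2 + 0.5 * x ** 2)  # ~0.5, the initial total energy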
In this work, boundary conditions (BC) are periodic. They mimic the behavior of
the infinite bulk surrounding the sample. In this way, surface effects are removed.
It should also be noted that if the particles go through a boundary of the simulation
box, they will appear on the other side of the box.
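
Both the wrap-around and the corresponding minimum-image convention for interatomic distances can be expressed in one line each (a sketch, assuming a box length L):

import numpy as np

L = 10e-9  # box length (m)

# Positions that left the box on either side re-enter on the opposite side
x = np.array([1.03e-8, -2e-10])
x_wrapped = x % L               # -> [3.0e-10, 9.8e-9]

# Minimum-image convention: separation vector to the nearest periodic image
r_ij = np.array([9.5e-9])
r_ij -= L * np.round(r_ij / L)  # -> [-5.0e-10]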
We will present an MD simulation of Argon particles in the gas phase, whose
velocities are given by the Maxwell-Boltzmann distribution as shown in Fig. 1.
The Maxwell-Boltzmann distribution describes the velocity of particles in ideal
gases at thermodynamic equilibrium, where particles exchange energy through brief
collisions and their velocities follow Maxwell-Boltzmann statistics based on kinetic
energy. In MD simulations we first specify the initial positions and momenta of the
particles; the latter refer to the Maxwell-Boltzmann distribution.
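
In 2D, each Cartesian velocity component of a Maxwell-Boltzmann distributed gas is a zero-mean Gaussian with variance k_B T/m, so the initialization reduces to a few lines (a sketch assuming the argon values tabulated later in Table 1):

import numpy as np

kB, T, m = 1.380649e-23, 300.0, 6.63e-26  # J/K, K, kg (argon)
N = 100

v = np.random.normal(0.0, np.sqrt(kB * T / m), size=(N, 2))
v -= v.mean(axis=0)  # remove the drift so that the net momentum is zero

# Sanity check: the mean kinetic energy per particle approaches kB*T in 2D
print(0.5 * m * (v ** 2).sum() / N, kB * T)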

1.3 Statistical Ensemble

Statistical ensembles define the conditions under which atomistic simulations are
carried out, dictating what thermodynamic variables are conserved or controlled.
Different ensembles correspond to different sets of thermodynamic variables that
are held constant during the simulation. Below is a description of the most com-
mon ensembles: the microcanonical ensemble NVE, the canonical ensemble NVT,
and the isothermal-isobaric ensemble NPT, where the acronyms reflect the variables

Fig. 1 Probability density functions of the Maxwell-Boltzmann distribution for the velocities of noble
gases like Helium, Neon, Argon and Xenon at a temperature of 298 K

involved, the number of particles N, the particle volume V, the energy E, the temperature T, and the pressure P.
Here we use the NVE ensemble, which keeps N, V, and the internal energy E
constant, the latter following from the characteristic of the system being isolated. It
meets the requirements of the present goals of simulating systems with no heat
exchange with the surroundings, analogous to an isolated system in thermodynamics.
It provides a natural way to observe the dynamical evolution of a system without
external influences, often used for studying the intrinsic properties of a system.

2 Prompt

The parameters and constants relevant to the simulation are given in Table 1.
Figure 2 shows the initial positions of the particles for the simulation. Then we
evolve the system according to Newton’s second law of motion, for which we let the
particles interact through an LJ potential using an NVE ensemble. Finally, we measure
physical quantities as functions of particle positions and momenta.

Table 1 Problem 1: Settings for the MD simulation of the many-particle system

Parameters                    Values/Types                      Units
Boltzmann constant k_B        1.380649 × 10⁻²³                  J/K
Atomic mass (Ar)              6.63 × 10⁻²⁶                      kg
Interatomic potential (Ar)    LJ potential
  with ε                      1.65 × 10⁻²¹                      J
  and σ                       3.4 × 10⁻¹⁰                       m
Type of ensemble              NVE (microcanonical)
Temperature T                 300                               K
Simulation box                2D
Size                          10 × 10                           nm
BC                            Periodic
Number of particles           100
Time integrator               Velocity Verlet
Time step size Δt             2 × 10⁻¹⁵                         s
Number of time steps          5000
Output                        Particle distribution animation

Fig. 2 Initial particle positions in the 2D MD simulation for argon gas



The algorithm for the 2D MD simulation is given as pseudocode in the Algorithm box 1.
Algorithm 1: Molecular Dynamics Simulation for 2D System
Input: Number of particles N, initial positions x_i(0), initial velocities v_i(0), time step Δt,
number of time steps N_p, force field parameters
Output: Particle trajectories x_i(t), velocities v_i(t), and plot kinetic energy E_kin(t), potential
energy E_pot(t), and total energy E_tot(t) over time step t
1 Initialization:
2 Initialize the velocities of the particles such that they follow a Maxwell-Boltzmann
distribution corresponding to a temperature of 300 K. Ensure the system has zero net
momentum by adjusting the velocities;
3 Set initial positions x_i(0) and velocities v_i(0) for all particles i = 1, …, N;
4 Compute initial forces F_i(0) on each particle using the interatomic potential;
5 for t = 0 to T with step Δt do
6   Velocity Verlet Integration:
7   foreach particle i do
8     Update positions:
        x_i(t + Δt) = x_i(t) + v_i(t) Δt + (F_i(t)/m_i) (Δt²/2)
9     Compute intermediate velocities:
        v_i(t + Δt/2) = v_i(t) + (F_i(t)/m_i) (Δt/2)
10  end
11  Force Calculation:
12  Recalculate forces F_i(t + Δt) on each particle due to interactions using the interatomic
    potential;
13  foreach particle i do
14    Update velocities:
        v_i(t + Δt) = v_i(t + Δt/2) + (F_i(t + Δt)/m_i) (Δt/2)
15  end
16  Apply Boundary Conditions:
17  Apply boundary conditions (e.g., periodic boundaries, reflective walls) to updated
    positions and velocities. Here periodic boundary conditions are to be applied;
18  Calculate System Properties:
19  Compute kinetic energy E_kin(t), potential energy E_pot(t), and total energy E_tot(t);
20 end
21 Output:
22 Return particle trajectories x_i(t), velocities v_i(t), and plot E_kin(t), E_pot(t), and E_tot(t) over
time step t;
The final prompt which worked is given in Fig. 3; it was used for generating the Code Listing 2.1.

Generate a Python code for the molecular dynamics simulation for argon
gas in a 2D rectangular domain. The simulation box has dimensions of 10
nanometers by 10 nanometers. Consider the FCC lattice and make sure
to initialize the positions of atoms in the FCC lattice itself to avoid the
overlapping of the atoms.

Initialization: Initialize the positions of 100 particles randomly within the
simulation box ensuring that no two particles overlap. Initialize the velocities
of the particles such that they follow a Maxwell-Boltzmann distribution
corresponding to a temperature of 300 Kelvin. Ensure the system has zero
net momentum by adjusting the velocities.

Integration Method: Use the robust Velocity Verlet integration method to
update the positions and velocities of particles based on the forces acting on
them.

Periodic Boundary Conditions: Apply periodic boundary conditions directly
after updating the positions to maintain the particles within the simulation
box.

Interatomic Potential: Use the Lennard-Jones potential for calculating
interatomic forces, with an appropriate small epsilon value to avoid
singularities in the denominator.

Simulation Parameters: Set the time step for the simulation to 2 femtoseconds and simulate 5000 time steps.

Ensemble and Conservation: Use the NVE (microcanonical) ensemble. As
atoms interact with each other stored potential energy is converted to kinetic
energy and keep in mind since it’s an NVE ensemble the total energy of the
system should remain constant.

Output: Plot the kinetic energy, potential energy, and total energy to the
number of time steps NOT Time.
Print the values of kinetic energy, potential energy and total energy at the
end of the simulation.
Create a snapshot in the form of an image to visualize particles and save the
resulting snapshot output in the user’s desktop path.

Create separate functions for: Initializing positions and velocities of the
Fig. 3 Prompt for generating a 2D MD code in Python for argon gas


Molecular Dynamics Simulation of Noble Gases 41

Fig. 3 (continued)
42 A. Neelakandan et al.

Fig. 3 (continued)

3 Code Listing

The Code Listing 2.1 displays the Python code generated by ChatGPT-4o for the prompt given in Fig. 3.
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib.animation import PillowWriter

def get_desktop_path():
    home = os.path.expanduser("~")  # Gets the home directory
    desktop = os.path.join(home, 'Desktop')  # Appends Desktop to the home path
    return desktop

# Constants
kB = 1.38064852e-23   # Boltzmann constant (J/K)
epsilon = 1.65e-21    # Lennard-Jones potential well depth (J)
sigma = 3.4e-10       # Lennard-Jones potential distance parameter (m)
mass = 6.63e-26       # Mass of argon atom (kg)
T = 300               # Temperature (K)
L = 10e-9             # Length of the simulation box (m)
num_particles = 100   # Number of particles
dt = 2e-15            # Time step (s)
num_steps = 5000      # Number of simulation steps

def initialize_positions(num_particles, L):
    """Initialize positions of particles in an FCC lattice"""
    positions = np.zeros((num_particles, 2))
    num_cells_per_side = int(np.ceil(np.sqrt(num_particles)))
    cell_size = L / num_cells_per_side
    particle_idx = 0

    for i in range(num_cells_per_side):
        for j in range(num_cells_per_side):
            if particle_idx >= num_particles:
                break
            positions[particle_idx] = np.array([i * cell_size, j * cell_size])
            particle_idx += 1

    # Ensure particles are not on the edge of the box
    positions += cell_size / 2
    return positions

def initialize_velocities(num_particles, T, mass):
    """Initialize velocities of particles to follow Maxwell-Boltzmann distribution"""
    velocities = np.random.normal(0, np.sqrt(kB * T / mass), (num_particles, 2))
    velocities -= np.mean(velocities, axis=0)  # Ensure zero net momentum
    return velocities

def compute_forces(positions):
    """Compute forces and potential energy using Lennard-Jones potential"""
    forces = np.zeros_like(positions)
    potential_energy = 0.0

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r_ij = positions[i] - positions[j]
            r_ij -= L * np.round(r_ij / L)  # Apply periodic boundary conditions
            r2 = np.dot(r_ij, r_ij)
            if r2 < (3 * sigma) ** 2:  # Apply a cutoff to save computation
                r2_inv = sigma ** 2 / r2
                r6_inv = r2_inv ** 3
                r12_inv = r6_inv ** 2
                force_magnitude = 24 * epsilon * (2 * r12_inv - r6_inv) / r2
                forces[i] += force_magnitude * r_ij
                forces[j] -= force_magnitude * r_ij
                potential_energy += 4 * epsilon * (r12_inv - r6_inv)

    return forces, potential_energy

def velocity_verlet(positions, velocities, forces, dt):
    """Perform a single step of Velocity Verlet integration"""
    positions += velocities * dt + 0.5 * forces * dt ** 2 / mass
    positions = positions % L  # Apply periodic boundary conditions
    new_forces, potential_energy = compute_forces(positions)
    velocities += 0.5 * (forces + new_forces) * dt / mass
    return positions, velocities, new_forces, potential_energy

def run_simulation(num_particles, L, T, mass, dt, num_steps):
    positions = initialize_positions(num_particles, L)
    velocities = initialize_velocities(num_particles, T, mass)
    forces, potential_energy = compute_forces(positions)
    kinetic_energy = 0.5 * mass * np.sum(velocities ** 2)

    kinetic_energies = []
    potential_energies = []
    total_energies = []

    for step in range(num_steps):
        positions, velocities, forces, potential_energy = velocity_verlet(positions, velocities, forces, dt)
        kinetic_energy = 0.5 * mass * np.sum(velocities ** 2)
        total_energy = kinetic_energy + potential_energy

        kinetic_energies.append(kinetic_energy)
        potential_energies.append(potential_energy)
        total_energies.append(total_energy)

    return positions, kinetic_energies, potential_energies, total_energies

def plot_energies(kinetic_energies, potential_energies, total_energies, filename):
    plt.figure()
    plt.plot(kinetic_energies, label='Kinetic Energy')
    plt.plot(potential_energies, label='Potential Energy')
    plt.plot(total_energies, label='Total Energy')
    plt.xlabel('Time Step')
    plt.ylabel('Energy (J)')
    plt.legend()
    plt.savefig(filename)
    plt.show()

def save_snapshot(positions, filename):
    plt.figure()
    plt.scatter(positions[:, 0], positions[:, 1])
    plt.xlim(0, L)
    plt.ylim(0, L)
    plt.xlabel('X Position (m)')
    plt.ylabel('Y Position (m)')
    plt.title('Particle Positions')
    plt.savefig(filename)
    plt.show()

# Test 1: Periodicity Test with Animation
def test_periodicity(L):
    """Test for periodicity by checking if a single atom reenters the box with animation"""
    positions = np.array([[0, L / 2]])   # Start the particle at the left edge
    velocities = np.array([[1e4, 0]])    # Increased velocity for faster movement
    fig, ax = plt.subplots()
    ax.set_xlim(0, L)
    ax.set_ylim(0, L)
    ax.set_xlabel('X Position (m)')
    ax.set_ylabel('Y Position (m)')
    particle, = ax.plot([], [], 'ro', markersize=12)

    def init():
        particle.set_data([], [])
        return particle,

    def update(frame):
        nonlocal positions, velocities
        forces = np.zeros_like(positions)  # No other particles, so no forces
        positions, velocities, _, _ = velocity_verlet(positions, velocities, forces, dt)
        particle.set_data([positions[0, 0]], [positions[0, 1]])  # Pass as lists
        return particle,

    num_frames = int(L / (velocities[0, 0] * dt)) + 1  # Ensure enough frames to cover the entire path
    ani = animation.FuncAnimation(fig, update, frames=num_frames, init_func=init, blit=True)
    gif_path = os.path.join(get_desktop_path(), 'periodicity_test.gif')
    ani.save(gif_path, writer=PillowWriter(fps=30))
    plt.close(fig)
    print(f"Periodicity Test animation saved as {gif_path}")

# Test 2: Energy Conservation Test with Animation
def test_energy_conservation():
    """Test for energy conservation with two particles colliding elastically with animation"""
    v_initial = 1e5  # Initial velocity of the atoms (m/s)

    # Initial conditions
    x1 = L / 4       # Initial position of atom 1 (m)
    x2 = 3 * L / 4   # Initial position of atom 2 (m)
    v1 = v_initial   # Initial velocity of atom 1 (m/s)
    v2 = -v_initial  # Initial velocity of atom 2 (m/s)

    # Lists to store positions and energies
    positions1 = []
    positions2 = []
    kinetic_energies = []

    # Simulation loop
    for step in range(num_steps):
        # Update positions
        x1 += v1 * dt
        x2 += v2 * dt

        # Check for collision and update velocities
        if x1 >= x2:
            v1, v2 = v2, v1

        # Save positions and energies
        positions1.append(x1)
        positions2.append(x2)
        kinetic_energy = 0.5 * mass * (v1 ** 2 + v2 ** 2)
        kinetic_energies.append(kinetic_energy)

        # Reflect atoms at the boundaries (elastic collision with the wall)
        if x1 < 0 or x1 > L:
            v1 = -v1
        if x2 < 0 or x2 > L:
            v2 = -v2

    # Create animation
    fig, ax = plt.subplots()
    ax.set_xlim(0, L)
    ax.set_ylim(0, L)
    ax.set_xlabel('X Position (m)')
    ax.set_ylabel('Y Position (m)')

    line1, = ax.plot([], [], 'ro', label='Atom 1')
    line2, = ax.plot([], [], 'bo', label='Atom 2')

    def init():
        line1.set_data([], [])
        line2.set_data([], [])
        return line1, line2

    def update(frame):
        # Update with sequences instead of single values
        line1.set_data([positions1[frame]], [L / 2])
        line2.set_data([positions2[frame]], [L / 2])
        return line1, line2

    ani = animation.FuncAnimation(fig, update, frames=num_steps, init_func=init, blit=True)

    # Save the animation as a GIF using PillowWriter
    desktop_path = os.path.join(os.path.expanduser("~"), "Desktop")
    gif_path = os.path.join(desktop_path, "atomic_collision_simulation.gif")
    ani.save(gif_path, writer='pillow', fps=60)

    plt.show()

    print(f"Simulation complete. GIF saved to: {gif_path}")

# Main simulation loop
positions = initialize_positions(num_particles, L)
velocities = initialize_velocities(num_particles, T, mass)
forces, potential_energy = compute_forces(positions)

# Save initial snapshot
save_snapshot(positions, os.path.join(get_desktop_path(), 'initial_snapshot.png'))

kinetic_energies = []
potential_energies = []
total_energies = []

for step in range(num_steps):
    positions, velocities, forces, potential_energy = velocity_verlet(positions, velocities, forces, dt)
    kinetic_energy = 0.5 * mass * np.sum(velocities ** 2)
    total_energy = kinetic_energy + potential_energy

    kinetic_energies.append(kinetic_energy)
    potential_energies.append(potential_energy)
    total_energies.append(total_energy)

# Plot energies
plot_energies(kinetic_energies, potential_energies, total_energies, os.path.join(get_desktop_path(), 'energy_plot.png'))

# Save final snapshot
save_snapshot(positions, os.path.join(get_desktop_path(), 'final_snapshot.png'))

# Print final energies
print(f"Final Kinetic Energy: {kinetic_energies[-1]}")
print(f"Final Potential Energy: {potential_energies[-1]}")
print(f"Final Total Energy: {total_energies[-1]}")

# Run the additional tests with animations
test_periodicity(L)
test_energy_conservation()
Code Listing 2.1 Code generated by ChatGPT-4o for solving 2D MD simulation of argon gas

The output of ChatGPT-4o does not guarantee that the generated code will
work without any flaws. Thorough tests are indispensable for that reason.

4 Tests for Verification

In the following, two tests of rather basic nature are carried out. They are based on
a one-atom system to check the periodic BC, and a two-atom system which checks
quantitatively the energy conservation.

4.1 Test for Periodic BC

Figure 4 shows the trail of one atom moving from left to right (snapshots
at constant time increments Δt) in the left simulation box; once it leaves that box
through its periodic boundary at the right edge, it simultaneously re-enters the simulation box as displayed in the 2nd window. The plot at the bottom of Fig. 4 displays
the corresponding kinetic energy for the entire process; the constant value over the
full trail of a length twice the box size verifies (kinetic) energy conservation and,
implicitly, the velocity being constant.
In conclusion, the test verifies the proper functionality of the periodic boundary
condition for the MD simulation.

Fig. 4 Periodic BC test for an MD simulation shows the constant kinetic energy as the atom moves
across the periodic boundary, from left to right

4.2 Two-Atom-Collision—Interplay of Potential and Kinetic Energies

Figure 5 shows the setup for the energy conservation test. In the simulation box
with the size of L = 10 nm there are only two atoms (initial positions at t₀ = 0:
x₁(t₀) = L/6, x₂(t₀) = 5L/6 with |x₂(t₀) − x₁(t₀)| ≥ r_cutoff), which fly with the
same speed on the same line against each other, thus having velocity vectors
with opposite signs (initial velocities v₁(t₀) = −v₂(t₀), |v₁(t₀)| = 10⁴ m/s). The kinetic
energy is maximal in the left and right images of Fig. 5, because the two particles
move with some speed due to the attraction and repulsion between them, whereas
the potential energy is maximal in the center image of Fig. 5. However, in order to make

Fig. 5 Quantitative test for energy conservation consisting of two atoms traveling towards each
other on the same path with an initial speed of 10⁴ m/s. When they come closer, the atoms enter the
cut-off radius of the LJ potential, resulting in an increase of the potential energy and a decrease
in kinetic energy because of their repulsion. This makes the atoms move away from each
other, resulting in a decrease in potential energy and an increase in kinetic energy, until they experience
the same near the boundary because of the periodic boundary condition

this effect happen within the compatible range of Δt = 2 × 10⁻¹⁵ s, the parameter σ
in the LJ potential has been increased by one order of magnitude to 3.4 × 10⁻⁹ m compared to the
argon value tabulated in Table 1.
For this setting, we test the proper calculation of initial energies as well; for
.|x 2 (t0 ) − x 1 (t0 )| ≥ r cutoff , the potential energy at .t0 is zero, . E pot (t0 ) = 0. The initial
kinetic energy is calculated with .v0 = |v 1 (t0 )| = |v 2 (t0 )| = 104 m/s and the mass of
argon atoms (see Table 1) according to

1
.
calc
E kin (t0 ) = 2 · m v02 = 6.6299999999999994 × 10−18 J = E tot (t0 ) . (6)
2
calc
The figure of . E kin (t0 ) from manual calculation exhibits a minor deviation from the
value computed by the code . E kin (t0 ) = 6.630605061255307 × 10−18 J.
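
The hand calculation in Eq. (6) can be reproduced in two lines with the values from Table 1:

m, v0 = 6.63e-26, 1.0e4       # argon mass (kg) and initial speed (m/s)
print(2 * 0.5 * m * v0 ** 2)  # ~6.63e-18 J, cf. Eq. (6)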

4.3 Energy Conservation in a Many-Atom System

Figure 6 displays the energies of the 100-atom system during the first 5000 time
steps. While the kinetic energy and the potential energy show some scatter,
they sum up to a constant total energy (4.0592186387714405 × 10⁻¹⁹ J), indicating
energy conservation.
Note that the system evolves from a regular geometric setting where atoms reside
on the nodes of a grid with square cells as displayed in Fig. 2. It is the heterogeneous
velocity distribution at the simulation start which drives the system to evolve into a
heterogeneous atom distribution after 5000 time steps as displayed in Fig. 7.

5 Discussion

Some important aspects that are observed while working on prompts and generating
codes are:
• Completeness of the program: Generally, chatbots (ChatGPT-4 and ChatGPT-4o)
account for all the details provided in the prompt, but sometimes they overlook
some parts of it; typically, some variables are not defined initially in the
code. A solution to this problem is to mark those parts of the prompt in bold
characters and to insist that the chatbot does not forget them.
• Short-term memory capacity: Recent chatbots like ChatGPT-4o tend to have a
longer memory, since they produce the expected code output over a long
conversation. But after a certain point, chatbots tend to miss some parts of the code
and generate output which is not even mentioned in the prompt. Hence, the prompt
has to be very specific about the equations to be used, constants, initial variables,
time integration method, and output plots.

Fig. 6 Energies of the 100-atom system versus the number of time steps indicates the conservation
of the total energy while the kinetic and the potential energies show fluctuations

• Reproducibility: If we insert the same prompt from this report into the chatbots,
the output code will probably be similar, but not identical to the one given in this
report, because chatbots are not consistent in their results. The basic algorithm of
the code will be the same, but there will be some minor changes. Recent versions
of chatbots like ChatGPT-4o will generate code very similar to that given in this
report.
• Reliability: The final code generated by the chatbots is reliable in its results
only after optimizing the prompts. The codes generated by recent versions like
GPT-4o are more reliable compared to older versions like ChatGPT-4.
• Hallucinations: Hallucinations in chatbots refer to instances where the chatbots
generate responses that are factually incorrect, nonsensical, or completely fabricated,
despite being delivered with apparent confidence. In some cases, we have
encountered minor hallucinations, which can be easily rectified. So we would say
that minor or low-level hallucinations occur in chatbots.
• Learned Lessons: The main lesson that we have learned is that the prompt should
be highly optimized if we need exact and consistent results that match
research articles. Even minor information must be added to the prompt so that
it works perfectly without any illogical errors. The quality and correctness of the
output depend mainly on the prompt. So the prompt should be particular, refined,
and optimized.

Fig. 7 Heterogeneous atom distribution after time step 5000 in the MD simulation of argon gas

6 Conclusion

The main aim of this chapter was to construct a prompt for ChatGPT-4o to generate
a Python code for the 2D MD simulation of noble gases such as argon. To put
things into perspective, building blocks of MD simulations were presented such as
Newton’s equation of motion, the Lennard-Jones (12–6) pair potential, statistical
ensembles, and the velocity Verlet algorithm for integrating Newton’s second law.
The code structure was illustrated by a descriptive pseudocode. The resultant Python
code was successfully assessed through a series of tests; the functionality of periodic
boundary conditions was verified and for the collision of two atoms the conservation
of energy along with its dynamic decomposition into kinetic and potential parts was
observed. In conclusion, with some minor manual intervention, ChatGPT-4o has
demonstrated its ability to generate an accurate and reliable code for the molecular
dynamics simulations of noble gases.

References

1. D. Frenkel, B. Smit, “Understanding molecular simulation: from algorithms to applications”, Elsevier, 2023. https://doi.org/10.1016/B978-0-12-267351-1.X5000-7.
2. E.B. Tadmor, R.E. Miller, “Modeling materials: continuum, atomistic and multiscale tech-
niques”, Cambridge University Press, 2011. https://doi.org/10.1017/CBO9781139003582.
3. J.M. Haile, “Molecular dynamics simulation: elementary methods”, John Wiley & Sons, Inc.,
1992. https://doi.org/10.1016/0166-1280(93)87060-q.
4. R. LeSar, “Introduction to computational materials science: fundamentals to applications”, Cam-
bridge University Press, 2013. https://doi.org/10.1017/cbo9781139033398.
Phase Field Modeling of Grain Growth

Rahul Narkhede and Bernhard Eidel

Abstract In the current chapter the assignment for GPT-4 on ChatGPT Plus is
to generate a Python code for grain growth simulation by the phase field method.
Specifically, the non-conserved Allen-Cahn equation with a suitable free energy
functional is solved in 2D using the finite difference method and the explicit Euler
forward time-stepping scheme. By virtue of a specific prompt design for GPT-4 the
resulting Python code allows solving the phase field equations for any generalized
initialization. To this end, two variants of initial grain structures are considered; (i)
a spherical grain embedded in a larger grain, and (ii) a Voronoi tessellation-based
initial structure. Key aspects such as prompt design, code verification and testing of
the outcome are discussed.

1 Introduction

Grains of polycrystalline solids constitute the fundamental influencing factor for
most physical properties such as corrosion resistance, thermal and electrical conductivity, and mainly for mechanical properties such as strength, ductility and toughness.
Usually, the mechanical properties depend on the mean grain size and the grain size
distribution. Thus, studying grain evolution becomes a key part of computational
materials science.
The local energy at the interface of grains, or grain boundaries, is higher than
the corresponding energy in the bulk of the grain. This extra energy at the interface
provides the thermodynamic driving force for moving the grain boundary in order to
minimize the total free energy. This movement of the grain boundary leads to the
growth of some grains, which implies the shrinking of others.


At the mesoscale, different computational models, such as the Monte Carlo Potts
model [1], Surface Evolver [2], the front-tracking method [3], and cellular automata
[4], have been applied. As the number of grains increases, the computational costs
become quite demanding due to the need to track individual grain boundaries and
apply specific constitutive relations to their evolution. Therefore, phase field (PF)
modeling is a potential approach to reduce this computational cost by describing
interfaces as continuous instead of sharp. Several PF models have been proposed
for grain growth kinetics. Here we present the use of PF modeling for grain growth
simulation based on the example from [5].
In this study, we incorporate the grain growth model of Fan and Chen [6]. In this
model, each grain is described by an order parameter η_i, which takes the value of
one for a designated grain and zero for all other grains. The evolution of the order
parameters is described by the non-conserved Allen-Cahn equation in the form of

∂η_i/∂t = −L_i δF/δη_i   (1)

for i = 1, 2, …, N grains, where L_i is the mobility coefficient, δ the variational
symbol, and F the free energy functional given by

F = ∫_V [ f(η_1, η_2, …, η_N) + Σ_i^N (κ_i/2) |∇η_i|² ] dv ,   (2)

where f is the local free energy density and κ_i are the gradient energy coefficients.
The specific form of the orientation-independent f is given in the Fan and Chen
model [6] as

f(η_1, η_2, …, η_N) = Σ_i^N ( −(A/2) η_i² + (B/4) η_i⁴ ) + Σ_i^N Σ_{j≠i}^N η_i² η_j² ,   (3)

in which A and B are positive constants.
The evolution equation governing the numerical implementation is

∂η_i/∂t = −L_i [ −A η_i + B η_i³ + 2 η_i Σ_{j≠i}^N η_j² − κ_i ∇²η_i ] .   (4)

The Laplacian operator in (4) is approximated using the five-point stencil of the
finite difference (FD) method given by

(∇²η)_{i,j} = ( η_{i+1,j} + η_{i−1,j} + η_{i,j+1} + η_{i,j−1} − 4 η_{i,j} ) / h²   (5)

for a grid spacing h = h_x = h_y. The time integration is carried out using the explicit
Euler time stepping scheme and thus we obtain the following discretized expression
∂η_i/∂t ≈ (η_i^{n+1} − η_i^n)/Δt = −L [ −A η_i^n + B (η_i^n)³ + 2 η_i^n Σ_{j≠i}^N (η_j^n)² − κ_i ∇²η_i^n ]   (6)

for i = 1, 2, …, N grains with η_i^{n+1} = η_i(t_{n+1}), η_i^n = η_i(t_n), and Δt = t_{n+1} − t_n.
In the equations describing the model, η is dimensionless, L_i has the dimensions
[L² T⁻¹], κ_i has the dimensions [M L² T⁻²], F has the dimensions of energy
[M L² T⁻²], and f has the dimensions of energy density [M L⁻¹ T⁻²]. All variables
of the model are treated in a non-dimensional form in the simulation.
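
As a minimal sketch of one explicit Euler update of Eq. (6) on a periodic grid (an illustration written here for clarity with hypothetical function names, not the GPT-generated code of Sect. 3):

import numpy as np

def laplacian(eta, h):
    """Five-point stencil of Eq. (5) with periodic boundaries via np.roll."""
    return (np.roll(eta, 1, axis=0) + np.roll(eta, -1, axis=0)
            + np.roll(eta, 1, axis=1) + np.roll(eta, -1, axis=1)
            - 4.0 * eta) / h ** 2

def allen_cahn_step(etas, L, A, B, kappa, dt, h):
    """One explicit Euler step of Eq. (6) for all order parameters.

    etas has shape (Nx, Ny, N); the coupling term for grain i sums the
    squared order parameters of all grains j != i."""
    sum_sq = (etas ** 2).sum(axis=2)
    new = np.empty_like(etas)
    for i in range(etas.shape[2]):
        eta_i = etas[..., i]
        dfdeta = (-A * eta_i + B * eta_i ** 3
                  + 2.0 * eta_i * (sum_sq - eta_i ** 2))
        new[..., i] = eta_i - dt * L * (dfdeta - kappa * laplacian(eta_i, h))
    return np.clip(new, 0.0001, 0.999)

The clipping to [0.0001, 0.999] anticipates the bound enforcement of Algorithm 2 below.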

2 Prompt

The numerical implementation of the PF method based on the above equations is
done using a Python code generated by GPT-4 on GPT Plus (GPT-4 in the following,
for convenience). Prompt engineering practices are utilized by breaking
down the problem into sub-tasks with clear instructions and parameters. The overall
prompt is broken into four major steps:

• System
• Context
• Specific instructions for numerical implementation
• Initialization cases

– Case I: Ideal grain growth
– Case II: Voronoi tessellation-based initialization

2.1 System

The system prompt mainly aims at having the chatbot adopt a persona that has a contextual understanding of the broad field and maintains consistency in all responses.
Further, the system prompt can also be used to obtain a specified style of programming. We also urge the chatbot to provide error-free code by providing such a
system prompt.
In response to the prompt shown in Fig. 1, ChatGPT 4 mentions the applications
of the PF method. It further elaborates by providing some mathematical aspects of
the PF method such as its dependence on differential equations, commonly used
numerical methods, initialization of random grain structures, use of Matplotlib and
possible ways of evaluating results. It also gives a general example task and briefly
explains the steps involved. This response indicates that the large language model
acknowledges the persona and has an understanding of the PF method.

Prompt 1
Your role: You are an expert in computational materials science. You can
understand complex mathematical model of physical phenomena in mate-
rials, which are usually described in terms of differential equations. Your
particular research focus is the modeling of grain growth by the phase-field
method. You understand the phase-field method in depth and know existing
research in its implementation on grain growth modeling. You can judge the
results of a numerical method based on its outputs, specifically plots. Further,
you also know the best ways to initialize random grain structures in 2D. Your
current role requires you to generate Python programs that are error-free
(logically, semantically and numerically). You know the best practices of
using the library Matplotlib to generate publication-ready plots.

Fig. 1 System prompt to provide ChatGPT 4 chatbot a persona that is an expert in the domain of
our problem, understands the numerical methods and enforces error-free code generation

2.2 Context

The model description for our problem setting is provided in the context prompt.
It includes specific equations as mentioned in the introduction. It is observed that
providing the equations of the PF model (1), (2) and (3) in LaTeX format ensures
the mathematical correctness of the code implementation.
ChatGPT 4 provides a clear understanding of the terms of the equations in its
response to the prompt in Fig. 2. It mentions some applications of the Allen-Cahn
equation first. Then it explains the individual terms, i.e. the order parameter.ηi , mobil-
ity coefficient . L i , free energy . F, gradient energy coefficients .κi and the local free
energy density . f . It reproduces the equations provided and then provides the sim-
ulation approach briefly with suggestions for numerical methods and their possible
limitations.

2.3 Specific Instructions for Numerical Implementation

Now, specific instructions are provided in a concise prompt shown in Fig. 3. To obtain
a Python code that can be used for any type of initialized domain, we first consider
a rectangular domain with circular grains at random locations with radii within a
specified range. The motive is to use this generated phase field method code for other
modified initializations such as Voronoi tessellation.

Fig. 2 Context prompt providing the details of the PF model used in the problem with specific
equations written in LaTeX format

Care has been taken to make the instructions clear and precise to keep ChatGPT 4
from making any assumptions on its own. An emphasis on the desired dimensions of the
outputs has been made in the instructions. For example, the initialized order parameters
should be stored in an array of dimensions (Nx, Ny, ngrains), where (Nx, Ny)
specifies the grid size and ngrains is the number of grains. The discretized equation
for time-stepping (4) is also provided in LaTeX format to ensure mathematical
correctness in its code implementation (Fig. 3).

Fig. 3 Prompt with specific instructions for numerical implementation provided as a one-thrust
prompt

Fig. 4 Prompt to generate pseudocode for the grain growth simulation based on the specific instruc-
tions provided

ChatGPT 4 takes a relatively long time to process the response and indicates that
it is analyzing its code. After completion, it generates a plot of area fractions versus
time steps as the output of its verification. This shows that GPT-4 analyzes its generated
code prior to responding and also adds the generated code output plot to its response.
The generated Python code is shown in Code Listing 3.1.
The update of the order parameter, and thus the grain growth over the time steps, is
conducted by the operations mentioned in Algorithm 2, generated by the prompt
shown in Fig. 4.

Algorithm 2: Grain Growth Simulation using Phase-Field Method and Euler Forward Method
Data: Initial phase fields η_i(x, y, t = 0), i = 1, …, N;
Parameters: A = 1, B = 1, κ_i for i = 1, …, N;
Grid size Nx, Ny; Grid spacing Δx, Δy;
Time step Δt; Total simulation time T = nsteps × Δt;
Number of grains ngrains = 3.
Result: Phase field evolution η_i(x, y, t) for t ∈ [0, T], Volume fractions over time, Grain status.
1 Initialization: Set t ← 0;
2 Create a grid of size Nx × Ny with spacing Δx and Δy;
3 Initialize ngrains and for each grain i, set the order parameter η_i(x, y) = 1.0 inside the
grain and 0.0 outside;
4 while t < T do
5   for each grid point (x, y) do
6     for each grain i = 1, …, N do
7       Compute the Laplacian ∇²η_i using finite differences:
8         ∇²η_i = [ η_i(x + Δx, y) + η_i(x − Δx, y) + η_i(x, y + Δy) + η_i(x, y − Δy) − 4 η_i(x, y) ] / Δx²
9       Update the order parameter using the Explicit Euler method:
          η_i^{n+1} = η_i^n − Δt L ( −A η_i^n + B (η_i^n)³ + 2 η_i^n Σ_{j≠i}^N (η_j^n)² − κ_i ∇²η_i^n )
10      Enforce the bounds on η_i^{n+1} to keep it within [0.0001, 0.999];
11    end
12  end
13  for each grain i = 1, …, N do
14    Compute the volume fraction as the sum of η_i^{n+1} over all grid points divided by the
      area Nx × Ny;
15    if volume fraction < 0.001 then
16      Mark grain i as extinct in the grain_status list;
17    end
18  end
19  Store η_i^{n+1} for all grains in eta_hist;
20  Apply boundary conditions;
21  Update time: t ← t + Δt;
22 end
23 Post-processing: Analyze the final phase field distributions η_i(x, y, t = T);
24 Output: eta_hist, Volume fractions over time, Grain status.

2.4 Initialization Cases

The Python code obtained from the prompts so far is used along with two different
initializations. The first case of ideal grain growth is considered for verification of the
Python code for the PF method. The second case is based on Voronoi tessellations
as initial domains and represents a general application in materials science.

2.4.1 Case I: Ideal Grain Growth

Ideal grain growth is a special case of normal grain growth, where the grain boundary
motion is driven only by the local curvature of the grain boundary. Here, a spherical
grain embedded in a large second grain is considered (Fig. 5).

2.4.2 Case II: Voronoi Tessellation

For a more generalized application of the generated PF method code, the domain is
initialized by a Voronoi tessellation. Initially, a general random-seed-based Voronoi
tessellation is generated and the order parameters are initialized. Next, it is modified
for a Voronoi tessellation with a gradient in the cell size. Specific instructions are
provided to initialize the order parameters to eliminate any discrepancies (a minimal
initialization sketch is given after Fig. 6).

Prompt 4
Provide a Python function based on following instructions. Initialize the
order parameters etas using following instructions. Rectangular grid of size
Nx = Ny = 64 with grid spacing dx = dy = 0.5. We want to initialize a
circular grain embedded in another large grain of the size of the rectangular
grid. The circular grain of radius 14 is centered at the center of the grid.
Grain 1 is the larger grain and grain 2 is the embedded circular grain. The
order parameters for each grain are initialized by an array of shape (Nx, Ny,
ngrains), where ngrains = 2 is the number of grains. Each ith 2D sub-array
of shape (Nx, Ny) specifies the order parameter at all grid points for the ith
grain.
Initialization of grain 1: The order parameters over the rectangular grid for
grain 1 are initialized to 1. But all points falling within the radius of the
embedded circular grain have order parameter 0.
Initialization of grain 2: The order parameters over the rectangular grid for
grain 0 are initialized to 0. But all points falling within the radius of the
embedded circular grain have order parameter 1.

Fig. 5 Prompt for ideal grain growth initialization



Prompt 5
Create a Voronoi tessellation in a domain of size 32x32. Consider 25 grains.
Ensure periodic boundary conditions. Extend any Voronoi cells near the
domain boundary till the domain boundary such that we have the domain
completely filled with the Voronoi cells. Plot the Voronoi tessellation.
After creating this Voronoi tessellation, initialize the order parameters for
the PF method in the grains that lie in the trimmed Voronoi tessellation
domain. Follow these steps.
1. Input: The Voronoi tessellation and its data from generated Voronoi tes-
sellation. Create a rectangular grid of size 64 x 64 with grid spacing of
0.5.
2. Initialize Order Parameter: Create an array to store the order parameter
values for each grain at each grid point. Create an array of dimensions
(64, 64, ngrains), where ’ngrains’ is the number of grains, i.e. the number
of Voronoi cells.
3. For each grain, loop through all grid points. For each grid point (i, j)
in the 64x64 grid, determine the corresponding Voronoi cell using the
tessellation.
4. If the point is inside the Voronoi cell, set the order parameter for that point
to 1.0. If the grid point is outside the Voronoi cell, set the order parameter
for that point to 0.0.
5. Repeat for all grains in the Voronoi tessellation. Provide the final array
of order parameters of dimensions (64, 64, ngrains).

Prompt 6
Modify the previous Voronoi tessellation and initialization of order param-
eters with a strongly decreasing Poisson point process. The gradient in the
grain size should be easily noticeable. Ensure that the final array of etas has
the same number of grains taken as an input.

Fig. 6 Prompts for Voronoi tessellation based initialization

3 Generated Code Listing and Results

The Python codes generated in response to the prompts mentioned in Sect. 2 are
provided in this section with their output plots showing grain growth evolution.

3.1 Rectangular Domain with Randomly Located Circular Grains

The code generated after prompt 3 is displayed in Code Listing 3.1. Therein, the
number of steps was adapted manually to the target time period for the grain evolution.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Parameters
Nx, Ny = 64, 64        # Grid size
dx, dy = 0.5, 0.5      # Grid spacing
ngrains = 3            # Number of grains
L = 5.0                # Mobility coefficient
A = B = 1              # Free energy parameters
kappa = 0.1            # Gradient energy coefficient
dt = 0.005             # Time step
nsteps = 1000          # Number of steps
threshold = 0.5        # Threshold for area fraction
radius_range = (7, 14) # Range of radii for the grains
volume_fraction_threshold = 0.001  # Threshold for grain existence

# Initialize the domain with circular grains
def initialize_grains(Nx, Ny, ngrains, radius_range):
    eta = np.zeros((Nx, Ny, ngrains))
    centers = np.random.rand(ngrains, 2) * np.array([Nx, Ny])
    radii = np.random.randint(radius_range[0], radius_range[1], ngrains)

    for i in range(ngrains):
        for x in range(Nx):
            for y in range(Ny):
                if (x - centers[i, 0])**2 + (y - centers[i, 1])**2 <= radii[i]**2:
                    eta[x, y, i] = 1.0
    return eta

# Function to approximate the Laplacian using a five-point stencil
def laplacian(eta, dx, dy):
    laplace_eta = (np.roll(eta, -1, axis=0) + np.roll(eta, 1, axis=0) - 2 * eta) / dx**2 + \
                  (np.roll(eta, -1, axis=1) + np.roll(eta, 1, axis=1) - 2 * eta) / dy**2
    return laplace_eta

# Phase field model evolution function
def phase_field_evolve(Nx, Ny, ngrains, dt, nsteps, eta_initial):
    eta_hist = [eta_initial]
    volume_fractions = np.zeros((nsteps, ngrains))
    grain_status = np.ones((nsteps, ngrains))

    for step in range(nsteps):
        eta_next = eta_hist[-1].copy()
        for i in range(ngrains):
            if grain_status[step - 1, i] == 0:  # Skip extinct grains
                continue
            eta_i = eta_hist[-1][:, :, i]
            sum_eta_squared = sum([eta_hist[-1][:, :, j]**2 for j in range(ngrains) if j != i])
            laplace_eta_i = laplacian(eta_i, dx, dy)

            # Discretized Allen-Cahn equation
            d_eta_dt = -L * (-A * eta_i + B * eta_i**3 + 2 * eta_i * sum_eta_squared - kappa * laplace_eta_i)
            eta_next[:, :, i] += d_eta_dt * dt

            # Enforce bounds on order parameters
            eta_next[:, :, i] = np.clip(eta_next[:, :, i], 0.0001, 0.999)

            # Calculate volume fraction
            volume_fraction = np.sum(eta_next[:, :, i]) / (Nx * Ny)
            volume_fractions[step, i] = volume_fraction
            if volume_fraction < volume_fraction_threshold:
                grain_status[step, i] = 0  # Mark grain as extinct

        eta_hist.append(eta_next)

    return eta_hist, volume_fractions, grain_status

# Function to compute the area fraction for each grain at each time step
def compute_area_fraction(eta_hist, Nx, Ny, threshold):
    area_fractions = np.zeros((len(eta_hist), ngrains))

    for step, eta in enumerate(eta_hist):
        for i in range(ngrains):
            eta_i = eta[:, :, i]
            area_fraction = np.sum(eta_i > threshold) / (Nx * Ny)
            area_fractions[step, i] = area_fraction

    return area_fractions

# Initialize grains
eta_initial = initialize_grains(Nx, Ny, ngrains, radius_range)

# Run the phase field evolution
eta_hist, volume_fractions, grain_status = phase_field_evolve(Nx, Ny, ngrains, dt, nsteps, eta_initial)

# Compute area fractions
area_fractions = compute_area_fraction(eta_hist, Nx, Ny, threshold)

# Plotting area fraction vs. time for each grain
plt.figure(figsize=(10, 6))
for i in range(ngrains):
    plt.plot(np.arange(nsteps + 1), area_fractions[:, i], label=f'Grain {i+1}')
plt.xlabel('Time step')
plt.ylabel('Area fraction')
plt.title('Area fraction vs. Time for each grain')
plt.legend()
plt.show()

# The create-animation function was generated in a separate prompt response with minor modifications.
from matplotlib.colors import Normalize

def create_animation_with_colorbar(eta_hist, interval=100, nprint=10):
    """
    Create an animation visualizing the grain growth over time with a colorbar,
    using the squared sum of eta values and ensuring safe normalization.

    Parameters:
    - eta_hist: A list of numpy arrays, each representing the state of the system at a given time step.
    - interval: Time interval between frames in milliseconds.
    - nprint: Interval of time steps to update the animation.

    Returns:
    - anim: The Matplotlib animation object.
    """
    fig, ax = plt.subplots()
    ax.set_title('Grain Growth Over Time')

    # Compute the squared sum of eta values for the initial frame and normalize
    data = np.sum(eta_hist[0]**2, axis=2)
    max_val = np.max(data) if np.max(data) != 0 else 1  # Avoid division by zero
    data_normalized = data / max_val
    im = ax.imshow(data_normalized, animated=True, cmap='viridis', norm=Normalize(vmin=0, vmax=1))

    # Create colorbar
    fig.colorbar(im, ax=ax)

    def update(frame):
        """Update the plot for the animation, normalizing the data for each frame."""
        new_data = np.sum(eta_hist[frame]**2, axis=2)
        max_val = np.max(new_data) if np.max(new_data) != 0 else 1  # Avoid division by zero
        new_data_normalized = new_data / max_val
        im.set_array(new_data_normalized)
        return (im,)

    anim = animation.FuncAnimation(fig, update, frames=range(0, len(eta_hist), nprint),
                                   interval=interval, blit=True)

    return anim

# Usage example (assuming eta_hist is already computed):
anim = create_animation_with_colorbar(eta_hist, interval=50, nprint=10)
# To display in a Jupyter Notebook:
from IPython.display import HTML
HTML(anim.to_html5_video())

# To save the animation as a file (uncomment and use in a local environment):
# anim.save('grain_growth_with_colorbar.mp4', writer='ffmpeg', dpi=300)
Code Listing 3.1 Code generated by ChatGPT 4 after prompt 3. The code includes initialization
of the domain and the numerical implementation as per steps specified in prompt 3

The output of the code is a plot of area fractions over time for all grains and an
animation of the grain growth. The function that creates the animation was modified
with minor changes to obtain an animation with greater sharpness and to add a
colorbar. The code also returns a list of arrays of dimension (Nx, Ny, ngrains),
which stores the order parameters over the grid for all grains at all time instants.
Moreover, the tracked grain status, area fractions and volume fractions are also
obtained as output.

Fig. 7 Grain growth evolution for the generalized initialization in prompt 3 with three randomly
located circular grains at the initial, intermediate (step 500) and final state. The total simulation was
conducted for 1000 time steps with a time-step size of 0.005. The domain size is 64 × 64 with a grid
spacing of 0.5

Here, the initial (t = 0), intermediate (t = nsteps/2 · dt) and final (t = nsteps · dt)
states of the grain growth evolution are shown in Fig. 7. The final state refers to the
end of the time interval considered for the simulation.
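Since eta_hist stores the complete order parameter field at every stored time instant,
further post-processing quantities can be derived from it directly. The following is a
minimal sketch, using the variable names of Code Listing 3.1, of how a grain map can be
extracted from a stored state by assigning each grid point to the grain with the largest
order parameter:

import numpy as np
import matplotlib.pyplot as plt

# eta_hist is assumed to be the output of phase_field_evolve from Code Listing 3.1
final_eta = eta_hist[-1]                  # array of shape (Nx, Ny, ngrains)
grain_map = np.argmax(final_eta, axis=2)  # index of the dominant grain per grid point

plt.imshow(grain_map, cmap='tab10', origin='lower')
plt.colorbar(label='Grain index')
plt.title('Dominant grain per grid point (final state)')
plt.show()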

3.2 Ideal Grain Growth

Case I with the ideal grain growth initialization was generated using prompt 4.
The resulting initialized order parameters were used in the PF method code listed in
Code Listing 3.1.
import numpy as np
import matplotlib.pyplot as plt

def initialize_ideal_grain_growth(Nx, Ny, dx, dy, radius_factor, ngrains=2):
    """
    Initialize the order parameters for ideal grain growth with one circular grain
    embedded in a larger grain.
    Parameters:
    - Nx, Ny: Size of the rectangular grid.
    - dx, dy: Grid spacing.
    - radius_factor: Factor to determine the radius of the circular grain.
    - ngrains: Number of grains, default is 2.
    Returns:
    - eta: Initialized order parameters for each grain in the form of an array
      of shape (Nx, Ny, ngrains).
    """
    # Initialize the order parameter array
    eta = np.zeros((Nx, Ny, ngrains))
    # Calculate the center of the grid
    center_x, center_y = Nx / 2, Ny / 2
    # Calculate the radius of the circular grain
    radius = radius_factor * dx

    for x in range(Nx):
        for y in range(Ny):
            # Calculate the distance of each point from the center
            distance = np.sqrt((x - center_x)**2 + (y - center_y)**2)
            # Initialize grain 1 (the larger grain)
            if distance <= radius:
                eta[x, y, 0] = 0  # Inside the circular grain, set grain 1 to 0
                eta[x, y, 1] = 1  # Set grain 2 to 1
            else:
                eta[x, y, 0] = 1  # Outside the circular grain, set grain 1 to 1
    return eta

# Parameters for initialization
Nx, Ny = 64, 64      # Grid size
dx, dy = 0.5, 0.5    # Grid spacing
radius_factor = 28   # Multiplier for the radius based on dx

# Initialize the order parameters
eta_ideal_gg = initialize_ideal_grain_growth(Nx, Ny, dx, dy, radius_factor)

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
titles = ['Grain 1 (Large Grain)', 'Grain 2 (Embedded Circular Grain)']
for i in range(2):
    im = axes[i].imshow(eta_ideal_gg[:, :, i], cmap='viridis', origin='lower')
    axes[i].set_title(titles[i])
    axes[i].axis('off')
    plt.colorbar(im, ax=axes[i])
plt.tight_layout()
plt.show()
Code Listing 3.2 Code generated by ChatGPT 4 after prompt 4 for initialization based on ideal
grain growth case. The domain is initialized with a spherical grain embedded in a larger grain. The
code results in an array of order parameters based on this initialization
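The initialized array is then passed to the evolution function of Code Listing 3.1.
A minimal sketch of this coupling, assuming both listings are loaded in the same
session and using the longer run of the ideal grain growth case:

# Couple the ideal grain growth initialization with the PF solver of Code Listing 3.1
nsteps = 5000   # longer time interval used for the ideal grain growth case
ngrains = 2     # also update the module-level value if compute_area_fraction is reused
eta_hist, volume_fractions, grain_status = phase_field_evolve(Nx, Ny, 2, dt, nsteps, eta_ideal_gg)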

The output of the generated PF code was an animation. The initial (t = 0),
intermediate (t = nsteps/2 · dt) and final (t = nsteps · dt) states of the ideal
grain growth case are shown in Fig. 8.

Fig. 8 Grain growth evolution for the ideal grain growth initialization as per prompt 4 with a
spherical grain embedded in a larger cell. The evolution at the initial, intermediate (step 2500) and final
state shows shrinkage of the spherical cell. The total simulation was conducted for 5000 time steps
with a time-step size of 0.005. The domain size is 64 × 64 with a grid spacing of 0.5

3.3 Voronoi Tessellation Based Initialization

Two types of Voronoi tessellations were considered for initialization. The first one
has randomly distributed seeds following a uniform distribution. The second one has
a decreasing gradient in the seed density, producing a variation in the grain size along
the vertical axis. The corresponding generated codes are provided in Code Listing 3.3
and Code Listing 3.4.
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

def generate_voronoi(domain_size, ngrains):
    """
    Generate a Voronoi tessellation for a given domain size and number of grains.

    Parameters:
    - domain_size: The size of the domain.
    - ngrains: The number of grains to generate.

    Returns:
    - Voronoi tessellation object.
    """
    # Generate random points within the domain
    points = np.random.rand(ngrains, 2) * domain_size
    # Create the Voronoi tessellation
    vor = Voronoi(points)
    return vor

def plot_voronoi(vor, domain_size):
    """
    Plot a Voronoi tessellation, extending cells to the domain boundaries.

    Parameters:
    - vor: Voronoi tessellation object.
    - domain_size: The size of the domain.
    """
    fig, ax = plt.subplots()
    voronoi_plot_2d(vor, ax=ax, show_vertices=False, show_points=False)
    ax.set_xlim(0, domain_size)
    ax.set_ylim(0, domain_size)
    ax.set_title('Voronoi Tessellation with Extended Cells')
    plt.show()

def initialize_order_parameters(vor, grid_size, dx, ngrains):
    """
    Initialize the order parameters for the phase field method based on a Voronoi tessellation.

    Parameters:
    - vor: Voronoi tessellation object.
    - grid_size: The size of the grid.
    - dx: The grid spacing.
    - ngrains: The number of grains.

    Returns:
    - Initialized order parameters array.
    """
    eta = np.zeros((grid_size, grid_size, ngrains))
    for i in range(grid_size):
        for j in range(grid_size):
            # Find the closest Voronoi cell for each grid point
            x, y = (i + 0.5) * dx, (j + 0.5) * dx
            distances = np.sqrt((vor.points[:, 0] - x)**2 + (vor.points[:, 1] - y)**2)
            closest = np.argmin(distances)
            eta[i, j, closest] = 1.0
    return eta

# Parameters
domain_size = 32
ngrains = 25
grid_size = 64
dx = 0.5

# Generate and plot the Voronoi tessellation
vor = generate_voronoi(domain_size, ngrains)
plot_voronoi(vor, domain_size)

# Initialize order parameters based on the Voronoi tessellation
eta = initialize_order_parameters(vor, grid_size, dx, ngrains)

# eta contains the order parameters array of dimensions (64, 64, 25)
Code Listing 3.3 Code generated by ChatGPT 4 after prompt 5 for the general Voronoi tessellation
based initialization. The code results in an array of order parameters based on this initialization and
a plot of the Voronoi tessellation

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

def generate_strongly_decreasing_points(domain_size, ngrains):
    """
    Generate points for a Poisson Voronoi tessellation with a strongly decreasing density,
    ensuring a noticeable gradient in grain size across the domain.
    """
    points = []
    while len(points) < ngrains:
        x = np.random.uniform(0, domain_size)
        y = np.random.uniform(0, domain_size)
        # Use a non-linear decreasing function for point density
        probability = (domain_size - x)**2 / domain_size**2  # Quadratically decreasing probability
        if np.random.rand() < probability:
            points.append([x, y])
        if len(points) > ngrains:  # Reset if over-generated
            points = []

    return np.array(points[:ngrains])

def initialize_order_parameters_exactly(grid_size, dx, points, ngrains):
    """
    Initialize the order parameters for the phase field method, ensuring the specified number of grains.
    """
    eta = np.zeros((grid_size, grid_size, ngrains))

    for i in range(grid_size):
        for j in range(grid_size):
            pos = np.array([i * dx, j * dy])
            distances = np.linalg.norm(points - pos, axis=1)
            closest_point_index = np.argmin(distances)
            eta[i, j, closest_point_index] = 1.0

    # Ensure the eta array reflects exactly 'ngrains' grains
    assert eta.shape[2] == ngrains, "The number of grains in eta does not match ngrains."

    return eta

# Parameters remain the same as in the previous example
domain_size = 32
ngrains = 25
grid_size = 64
dx = dy = 0.5

# Generate points with a strongly decreasing process
points_strong_gradient = generate_strongly_decreasing_points(domain_size, ngrains)

# Create the Voronoi tessellation
voronoi_strong_gradient = Voronoi(points_strong_gradient)

# Initialize the order parameters
eta_strong_gradient = initialize_order_parameters_exactly(grid_size, dx, points_strong_gradient, ngrains)

# Plot the Voronoi tessellation with a strong gradient
fig, ax = plt.subplots()
voronoi_plot_2d(voronoi_strong_gradient, ax=ax, show_vertices=False)
ax.set_xlim(0, domain_size)
ax.set_ylim(0, domain_size)
ax.set_title("Poisson Voronoi Tessellation with Strong Gradient")
plt.show()
Code Listing 3.4 Code generated by ChatGPT 4 after prompt 6 for the initialization with Voronoi
tessellation having a decreasing gradient. The code results in an array of order parameters based on
this initialization and a plot of the Voronoi tessellation. ChatGPT 4 assumes a suitable non-linear
function for the point density on which the seeds of the Voronoi tessellation are created

The initial (t = 0), intermediate (t = nsteps/2 · dt) and final (t = nsteps · dt)
states of the Voronoi tessellation based grain growth in the two cases are shown in
Figs. 9 and 12, respectively.

Fig. 9 Grain growth evolution for the general Voronoi tessellation based initialization as described
in prompt 5. The evolution at the initial, intermediate (step 500) and final state is shown in the plots.
The total simulation was conducted for 1000 time steps with a time-step size of 0.005. The domain
size is 64 × 64 with a grid spacing of 0.5

4 Tests for Verification

In the current problem setting, errors that prevent the code from running stem from
index errors and array dimension errors. Using ChatGPT 4 and adhering to the
prompt engineering practices, it is observed that the code generated from the
prompts runs without such errors. However, the logical and mathematical correct-
ness of the numerical implementation needs to be thoroughly verified. We discuss these
verification aspects in three stages:

• Initialization
• Numerical implementation
• Verification with expected physical observations

Minor errors in the specifics of plotting and visualization are easily rectified by
instructing the chatbot in subsequent prompts. For example, the function that cre-
ates the animation, create_animation_with_colorbar, was modified with
minor changes to obtain an animation with greater sharpness and to add a colorbar.

4.1 Initialization

The two initialization cases considered are checked first. First, the dimensions
of the output array of initialized order parameters are checked. These should be
(Nx, Ny, ngrains), where (Nx, Ny) signifies the grid size of the domain and
ngrains is the number of grains, both provided as input. Next, individual heat-
maps of some of the initialized grains are plotted and verified visually. Figure 10
shows the initialization for the ideal grain growth case as per the prompt in Sect. 2.4.1.
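These checks require only a few lines of code. The following is a minimal sketch,
using the array names of Code Listings 3.1 and 3.2:

import numpy as np
import matplotlib.pyplot as plt

# Shape check: the initialized array must have dimensions (Nx, Ny, ngrains)
assert eta_ideal_gg.shape == (Nx, Ny, 2), "unexpected shape of the order parameter array"

# For this sharp-interface initialization the order parameters sum to one at every point
assert np.allclose(eta_ideal_gg.sum(axis=2), 1.0)

# Visual check: heat-map of an individual grain
plt.imshow(eta_ideal_gg[:, :, 1], cmap='viridis', origin='lower')
plt.colorbar()
plt.title('Initial order parameter of grain 2')
plt.show()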

4.2 Numerical Implementation

Even though the codes run without errors, they can still contain logical and mathematical
errors in the numerical implementation. These are checked by the following steps:

• Verifying that the initialized arrays that store updated variables have the correct
dimensions.
• Checking the correct implementation of the five-point stencil used to approximate
the Laplacian operator (see the sketch after this list). In some trials, it was observed
that the division by the square of the step size was missing in the generated code.
• Checking the mathematically correct implementation of the discretized time-
stepping equation (4). Specifically, the signs in front of each individual term need
to be checked, and it must be ensured that the sum in the interaction term, i.e.
Σ_{j≠i}^N (η_j^n)², has been correctly evaluated. There were instances where the
condition j ≠ i in the summation operator was not strictly followed, and instead the
overall sum of all η_j² resulted. Such issues were overcome by providing the
discretized time-stepping equation in LaTeX format in the prompt.
• Checking that the volume fraction is correctly computed and the condition for
extinction of grains is followed.
• Checking that the bounds on the order parameters are enforced.
• Checking that any specified boundary conditions are imposed. Surprisingly, Chat-
GPT 4 omitted the line in prompt 3, instruction 2 (see Fig. 4), which instructs
imposing Neumann boundary conditions on the right edge of the domain. How-
ever, this can be easily rectified by instructing ChatGPT 4 in a subsequent prompt.
• Checking the computation of the area fractions.
• Verifying that the parameters mentioned in the prompt are correctly assigned before
running the specific functions of the code.
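As an illustration of the stencil check mentioned in the list above, the stencil can be
applied to a function whose Laplacian is known analytically; a missing division by the
squared step size then becomes immediately visible. A minimal sketch, assuming the
laplacian function of Code Listing 3.1 is in scope:

import numpy as np

# laplacian() from Code Listing 3.1 uses np.roll, i.e. periodic boundaries,
# so a plane wave that is periodic on the grid is a suitable test function.
Nx, dx = 64, 0.5
x = np.arange(Nx) * dx
k = 2 * np.pi / (Nx * dx)                      # one full wavelength across the box
f = np.sin(k * x)[:, None] * np.ones((1, Nx))  # field varying along axis 0 only

num = laplacian(f, dx, dx)
exact = -k**2 * f                              # continuous Laplacian of sin(kx)
# Error is small (second-order accurate); orders of magnitude larger if 1/dx**2 is missing
print(np.max(np.abs(num - exact)))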

4.3 Verification with Expected Physical Observations

The code generated by ChatGPT 4 for ideal grain growth can be used for verification
against physical observations. For the shrinking spherical grain embedded in a large
second grain, it is known that the change in radius of the shrinking grain can be
approximated as

D² − D₀² = kt    (7)

Fig. 10 Verifying initialization for ideal grain growth case by plotting heat-maps of individual
grains from the initialized order parameter array. The heat-maps show the desired initialization of
the spherical grain 2 embedded in the larger grain 1

Fig. 11 Area fraction versus time step plot for ideal grain growth case

where D and D₀ are the current and the initial grain radius, respectively, and k is a
temperature-dependent constant. A similar behavior can be observed in the plot of
area fractions versus time steps computed by the generated code. This is shown in
Fig. 11, where the spherical grain has a radius of 28 dx, with dx the grid spacing,
and the evolution is carried out with a time-step size of 0.005 for 5000 steps.
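This relation can also be checked quantitatively from the computed data: since the
grain area A scales with D², A(t) of the shrinking grain should decrease linearly in
time. A minimal sketch, assuming the area_fractions array of Code Listing 3.1 for
the two-grain ideal case, with grain 2 (index 1) being the shrinking grain:

import numpy as np

# Physical area of the shrinking grain over time; A ~ D^2, so A(t) should be linear in t
A = area_fractions[:, 1] * (Nx * dx) * (Ny * dy)
t = np.arange(len(A)) * dt

mask = A > 0.05 * A[0]          # fit only before the grain has (nearly) vanished
slope, intercept = np.polyfit(t[mask], A[mask], 1)
print(f"dA/dt = {slope:.4f}")   # approximately constant, consistent with D^2 - D0^2 = kt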
Similarly, for the Voronoi tessellation based initialization, it is observed that the
evolution follows the physical observation that larger grains grow while smaller ones
eventually disappear. This can be observed in Fig. 12. The same is also reflected
in the area fraction versus time step plot shown in Fig. 13. The area fractions
of the two small grains with indices 3 and 6 decrease to zero, as shown by the red
and pink plot lines, respectively.

5 Discussion

In the context of code generation, ChatGPT 4 performs surprisingly well with
minimal errors. However, as the complexity of tasks increases, the chances
of running into errors, specifically logical errors, increase. In such cases, the large
language model (LLM) produces hallucinations, which are outputs that do not match
the intent of the prompt. To drive the LLM to a desired output for a complex task,
a suitably tailored prompt has to be created. In this regard, a suitable prompt design

Fig. 12 Grain growth evolution for the Voronoi tessellation with a decreasing gradient as described
in prompt 6. As indicated by the red ellipse, the two small grains shrink and eventually vanish

Fig. 13 Area fraction versus time step plot for the first 10 grains in the Voronoi tessellation with
a decreasing gradient as described in prompt 6. The area fractions of the two small grains with
indices 3 (red) and 6 (pink) decrease and reach zero, indicating that these grains have disappeared.
The larger grains grow with varying growth rates
is important. Here, we discuss the key implications of prompt engineering tactics in
the context of the current problem setting. The following prompt engineering tactics
were applied in the prompt design:

• Write clear instructions: Clarity in instructions leaves the LLM limited room to
make its own assumptions and, in the worst case, produce hallucinations.
However, including too many unnecessary details is not suitable either. This was noticed
when details of the Voronoi tessellation algorithm and the specific programming
steps were given in a detailed prompt to create the Voronoi tessellation based
initialization. Such details caused programming errors such as those related to
indexing and broadcasting. When only limited instructions were given with certain
restrictions, such as periodic boundary conditions and a completely filled domain,
as shown in prompt 5 in Fig. 6, the LLM produced error-free code with the
desired outputs. This shows that prompts should include specific instructions
that build up the logic of solving the problem, but the specifics of programming
these instructions into code need not be included.
• Split complex tasks into simpler subtasks: Breaking complex tasks into sub-
steps is a good approach to provide the LLM with the rigorous logic of solving the
problem. Instead of sequential prompting, one can also concatenate all the steps
concisely, as done in prompt 3 shown in Fig. 3. Existing functions generated in
response to previous prompts in one chat can be modified for different cases, as
done with prompt 6, which modifies the function generated from prompt 5 shown
in Fig. 6. However, recalling code or information from prompt responses at an
initial stage of a long chat can cause errors due to the limited short-term memory
of the model. This can be remedied by restating the code or information from such
a previous response in the new prompt. This also helps in maintaining the
consistency of code variables and functions as one proceeds with sequential subtasks.
• Ask the model to adopt a persona: It is observed that providing a system prompt,
as shown in Fig. 1, improves the generation of relevant responses and maintains
consistency in the style of code generation. Moreover, specific instructions to
obtain a certain style of response can also be added to a system prompt if needed.
• Providing necessary equations: It is important to provide the context of the
problem under consideration through the required equations. This enables the LLM to
develop an understanding of the problem to be solved and improves the relevance of
the responses. In our case, we provide all the equations of the model description in
prompt 2 shown in Fig. 2, since the free energy function as defined in
(2) is not a standard, well-known differential equation like the Laplace equation in heat
transfer or Burgers' equation in fluid dynamics. Without specifying
the equations in the model description, ChatGPT 4 assumed equations which
did not exactly resemble those describing the PF model.
ChatGPT 4 could also perform the relatively complex mathematical operation
of differentiating the free energy function (2) and the local free energy density (3)
and using the result in the non-conserved Allen-Cahn equation (1). Further, it implemented
the specified numerical methods to obtain the discretized Eq. (4) for the time-
stepping of the order parameters. Despite correct mathematical operations, it sometimes
made errors in maintaining mathematical correctness during code generation,
as mentioned in the verification of the numerical implementation in Sect. 4.2. It is also
important to note that providing equations in LaTeX format ensured their correct
interpretation and mathematical understanding by the LLM (a sketch of such a
prompt fragment is given after this list).
• To do or not to do?: It is observed that prompts with statements that say "what
not to do" result in more erroneous responses compared to prompts that avoid
them. This has also been followed in the prompts used for the current problem.
For example, in prompt 5 shown in Fig. 6, instead of stating "Do not let the
domain be empty at any location.", the requirement is phrased constructively with
a remedy: "Extend any Voronoi cells near the domain boundary till
the domain boundary such that we have the domain completely filled with Voronoi
cells."
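To illustrate the tactic of providing equations in LaTeX format, the fragment below
shows how a discretized update rule might be embedded in a prompt. The formula is not
a quotation of Eq. (4); it is reconstructed from the update rule actually implemented
in Code Listing 3.1:

% Discretized Allen-Cahn update as it could be pasted into a prompt
\eta_i^{n+1} = \eta_i^{n} - L\,\Delta t \left( -A\,\eta_i^{n} + B\,(\eta_i^{n})^{3}
    + 2\,\eta_i^{n} \sum_{j \neq i}^{N} (\eta_j^{n})^{2}
    - \kappa\,\nabla^2 \eta_i^{n} \right)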

Designing a prompt that adheres to these tactics is an iterative process and may
require some trials. It is usually not prudent to continue within one chat for multiple
variations of the same prompt. Furthermore, rectifying errors generated by a sub-
optimal prompt can help in cases with few errors. However, if the generated code
contains several syntax, semantic or logical errors, it is advisable to redraft
a better prompt and use it in a new chat. This iterative process of prompt design
also signifies the importance of human intervention when using ChatGPT 4 for code
generation.

6 Conclusion

ChatGPT 4 has been used to generate a PF code for grain growth evolution. The
Fan and Chen [6] model has been applied and the equations describing the model
have been solved numerically using the explicit Euler forward method along with
the five-point stencil of the finite difference method for approximating the Laplace
operator. The complex problem has been broken into steps that instruct ChatGPT 4
to generate the code. To this end, specific prompt engineering tactics have been used
to design prompts that generate a PF method code which can be applied to a gener-
alized initialization of the order parameters. Specific applications of this code are
shown by developing two initialization cases: (1) ideal grain growth with a spherical
grain embedded in a larger grain, and (2) Voronoi tessellations. The initialization has
also been carried out by code generated by ChatGPT 4 using suitable prompts. The
resulting code generated by ChatGPT 4 is free of syntax, semantic and logical errors.
Mathematical correctness of the numerical implementation has been ensured by
rigorous checking of the generated code. The final results have been verified against
expected results based on general physical observations of grain growth evolution.
Details of the prompt engineering tactics applied in the design of the prompts have
been explained.

As an extension of this work, newer prompt engineering methods like zero-shot chain-
of-thought prompting [8] can be used, which allow the LLM to generate its own
reasoning for solving the problem. This reasoning can serve as an initial point
for a suitable prompt design. Another approach to solving complex code generation
tasks is to provide the steps to solve the task as a code template with
comments. The aim is to give the LLM some direction for solving the problem as well
as a template to structure the code, and to allow it to develop detailed solutions based
on this limited direction. In the case of a complex initialization, such as a specific
grain microstructure, its image can be provided as input in the prompt, together with
the required conditions, and ChatGPT 4 can be asked to generate a code that provides a
similarly initialized microstructure. With the presence of LLMs like ChatGPT 4 and their
continuous development, prompt engineering based code generation will become
an indispensable approach for computational materials scientists to tackle complex
problems.

References

1. P. Blikstein and A. P. Tschiptschin. Monte Carlo simulation of grain growth. Materials Research,
2:133–137, 1999.
2. F. Wakai, N. Enomoto, and H. Ogawa. Three-dimensional microstructural evolution in ideal
grain growth-general statistics. Acta Materialia, 48(6):1297–1311, 2000.
3. H. J. Frost and C. V. Thompson. Computer simulation of grain growth. Current Opinion in Solid
State and Materials Science, 1(3):361–368, 1996.
4. Y. Liu, T. Baudin, and R. Penelle. Simulation of normal grain growth by cellular automata.
Scripta Materialia, 34(11), 1996.
5. S. Biner. Solving phase-field models with finite difference algorithms. In: Programming Phase-
Field Modeling, pp. 17–97, 2017.
6. D. Fan and L.-Q. Chen. Computer simulation of grain growth using a continuum field model.
Acta Materialia, 45(2):611–622, 1997.
7. D. Weygand, Y. Brechet, and J. Lepinoux. A vertex dynamics simulation of grain growth in two
dimensions. Philosophical Magazine B, 78(4):329–352, 1998.
8. T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa. Large language models are zero-shot
reasoners, 2023.
Modeling Corrosion Using a Cellular Automaton

Mehdi Bakhshi Zadeh and Bernhard Eidel

Abstract This study develops a 2D cellular automaton model to simulate corro-
sion processes using Fick's second law of diffusion and reaction kinetics. The
model accurately predicts corrosion initiation and propagation under various envi-
ronmental conditions, providing valuable insights for designing effective corrosion
ronmental conditions, providing valuable insights for designing effective corrosion
prevention strategies. The verification process confirmed the accurate implementa-
tion of theoretical principles, while simulation results highlighted the progression
of corrosion and the impact of protective layers and different metal corrosion rates.
Furthermore, the model emphasizes the potential of AI-assisted tools in advancing
research methodologies. Continuous refinement and expert oversight are essential
to enhance the reliability of these tools. This work demonstrates the significant role
of Cellular Automata (CA) in understanding complex corrosion mechanisms and
underscores the necessity for ongoing improvements in simulation techniques. By
integrating theoretical and practical aspects, this study offers a robust framework for
future research and applications in corrosion management.

1 Introduction

Corrosion is a natural, inevitable process that results in the gradual deterioration of
materials, particularly metals, through chemical or electrochemical reactions with
their environment. This phenomenon is driven by the tendency of materials to return
to their more stable, lower-energy states, often leading to the formation of oxides
or other compounds. The consequences of corrosion are significant, affecting the
structural integrity, safety, and longevity of infrastructure, machinery, and industrial

components. In the absence of effective management strategies, corrosion can lead
to catastrophic failures, causing economic losses and safety hazards.
Corrosion can occur uniformly across a material’s surface or in localized forms,
such as pitting and crevice corrosion, which lead to concentrated damage in specific
areas. Pitting corrosion, in particular, is a dangerous form as it can lead to the rapid
penetration of materials with minimal overall mass loss. Understanding the mecha-
nisms behind various corrosion types is vital for developing effective prevention and
mitigation strategies, which can significantly reduce the risk of structural failures.
Recent advancements in computational modeling, such as Cellular Automata
(CA), have provided powerful tools to simulate and understand the complex, dynamic
nature of corrosion processes. CA provides a discrete modeling framework, enabling
the simulation of corrosion at various scales, from the initiation of corrosion pits to
the evolution of surface roughness in materials. This study focuses on applying CA
models, particularly through Fick’s second law and reaction kinetics, to simulate and
predict corrosion processes in a 2D environment.
Studies show that CA models are effective in capturing the stochastic nature of
localized corrosion, such as pitting, in various materials including low-carbon steels
[4]. Additionally, CA has been employed to model oxidation mechanisms in steel,
accurately simulating the diffusion of oxygen and the formation of oxide layers
in high-temperature environments [5]. These applications highlight the versatility
of CA as a tool for studying a wide range of corrosion scenarios across different
materials and environmental conditions.

1.1 Theoretical Backbone

Cellular Automata (CA) are computational models that utilize discrete grid systems
to simulate the evolution of complex systems over time. Each cell within the grid
can exist in a finite number of states and interact with its neighboring cells based on
predefined rules. The classification of cells in the corrosion process is illustrated in
Fig. 1. This figure shows how each cell transitions during the corrosion process, high-
lighting the different states and interactions that are critical to the simulation model.
The diagram helps to visualize the cellular automaton framework and underscores

Fig. 1 The cell classification model and how each cell changes in the corrosion process

the importance of accurate state classification in predicting the progression of corro-
sion. Key ingredients of CA include the grid structure, state set, neighborhood, and
transition rules. This method is particularly effective in modeling local interactions
and emergent global behaviors, making it suitable for simulating corrosion processes
where localized interactions can lead to significant material degradation. The utility
of this approach in corrosion management has been well documented. For instance,
Wang et al. (2019) demonstrated the effectiveness of CA models in simulating the
high-temperature corrosion of Ni-based alloys in chloride molten salts, showing how
these models can predict the growth of corrosion layers and the migration of elements
under various conditions [1].
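To make these ingredients concrete, the following minimal sketch, which is not part of
the corrosion model itself, shows a grid of binary cell states updated by a simple
Moore-neighborhood transition rule; the corrosion model in Sect. 3 follows the same
pattern with a richer state set and probabilistic rules:

import numpy as np

# Minimal cellular automaton: a cell becomes active (1) if any Moore neighbor is active
def ca_step(state):
    new_state = state.copy()
    for i in range(1, state.shape[0] - 1):
        for j in range(1, state.shape[1] - 1):
            if state[i, j] == 0 and np.any(state[i-1:i+2, j-1:j+2] == 1):
                new_state[i, j] = 1
    return new_state

grid = np.zeros((9, 9), dtype=int)
grid[4, 4] = 1            # a single seed cell
for _ in range(3):        # three transitions grow a square front around the seed
    grid = ca_step(grid)
print(grid)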
From an electrochemical perspective, corrosion involves anodic reactions where
metal atoms lose electrons and form metal ions (e.g., Fe → Fe²⁺ + 2e⁻), and
cathodic reactions where electrons are consumed, typically by a reduction process
(e.g., O₂ + 4H⁺ + 4e⁻ → 2H₂O). The presence of an electrolyte is crucial as it
facilitates ion and electron movement, and the electrochemical cells created by these
reactions drive the corrosion process through the potential difference they gener-
ate.
ate. The approach further highlights the ability of CA models to capture intricate
interactions between electrochemical reactions, material properties, and environ-
mental factors, providing valuable insights into corrosion mechanisms and aiding in
the development of predictive maintenance strategies [2]. Additionally, CA models
allow for the incorporation of stochastic elements, which are essential in capturing
the inherent randomness and complexity of corrosion processes. This makes CA an
invaluable tool for researchers and engineers who aim to develop predictive models
and design effective corrosion prevention measures.
Further studies have emphasized the application of CA models in simulating
different types of corrosion phenomena. For example, Chen and Wen [7] utilized
CA models to simulate the uniform corrosion damage evolution of steel structures
exposed to acid rain, demonstrating how varying concentrations of corrosive agents
impact the corrosion depth over time. Similarly, Xiao et al. [8] applied CA models to
simulate pitting corrosion in Ni-based alloys, showing the influence of chloride ion
concentration and solution pH on the evolution of pits. These studies underscore the
versatility of CA in capturing both uniform and localized corrosion processes, offer-
ing detailed insights that are crucial for predicting material degradation in various
environments.
The model implemented in this study employs two critical theoretical components:
Fick’s Second Law (FSL) of Diffusion and reaction kinetics. FSL describes how the
concentration of a substance changes over time due to diffusion, which is fundamental
in modeling the transport of corrosive species in materials. The equation is given by:

∂C/∂t = D ∇²C ,    (1)

where C is the concentration, t is time, and D is the diffusion coefficient. This
equation is discretized and applied within the CA framework to simulate the diffusion
of corrosive agents. The implementation of FSL in CA involves discretizing the
spatial domain into a grid where each cell represents a concentration of the diffusing
species and updating the concentration based on the diffusion coefficient and the
concentration gradient. This approach allows for a detailed representation of the
spatial and temporal evolution of the concentration field, capturing the effects of
diffusion on the corrosion process.
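A minimal sketch of this discretization, forward Euler in time and central differences
in space, is given below; it is a generic illustration rather than the full model of
Sect. 3:

import numpy as np

def ftcs_diffusion_step(C, D, dt, dx, dy):
    # One explicit (FTCS) update of Fick's second law on the interior grid points
    C_new = C.copy()
    C_new[1:-1, 1:-1] += D * dt * (
        (C[2:, 1:-1] - 2 * C[1:-1, 1:-1] + C[:-2, 1:-1]) / dx**2
        + (C[1:-1, 2:] - 2 * C[1:-1, 1:-1] + C[1:-1, :-2]) / dy**2)
    return C_new

# Stability of the explicit scheme requires D * dt * (1/dx**2 + 1/dy**2) <= 1/2
C = np.zeros((50, 50))
C[25, 25] = 1.0               # point source of the diffusing species
for _ in range(100):
    C = ftcs_diffusion_step(C, D=0.1, dt=0.01, dx=1.0, dy=1.0)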
Reaction kinetics (RK) are incorporated to model the chemical reactions occurring
at the material’s surface, which contribute to the corrosion process. The rate of these
reactions is governed by factors such as temperature, concentration of reactants, and
the presence of catalytic agents. In our CA model, these reactions are represented
through state transitions of the cells, where the probability of transition depends on
the local concentration of reactive species. For example, the interaction between the
diffusing oxygen and metal atoms to form oxides can be modeled as a probabilistic
state transition [3]. This probabilistic approach ensures that the model can capture
the inherent variability and stochastic nature of corrosion reactions.
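In code, such a probabilistic transition can be expressed by comparing a uniform random
draw against a concentration-dependent transition probability. A minimal sketch follows;
the rate law used here is illustrative, not the one prescribed later in the prompt:

import numpy as np

def probabilistic_oxidation(state, C, k, dt):
    # Metal cells (state 0) switch to oxide (state 1) with a probability
    # proportional to the local concentration C of the oxidizing species
    p = np.clip(k * C * dt, 0.0, 1.0)           # transition probability per time step
    draws = np.random.rand(*state.shape)
    new_state = state.copy()
    new_state[(state == 0) & (draws < p)] = 1   # metal + oxygen -> oxide
    return new_state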
In the context of high-temperature corrosion, the diffusion of oxygen and chlo-
ride ions through molten salt layers and their reaction with metal surfaces is critical.
Wang et al. [1] provides a detailed framework for modeling these interactions, show-
ing that the formation of protective layers and the degradation of the metal substrate
can be effectively simulated using CA. This detailed framework is essential for under-
standing how different environmental conditions and material properties influence
the corrosion process, enabling the development of targeted strategies for corrosion
mitigation.

1.2 Problem Statement

This study aims to address the challenge of predicting and managing corrosion in
industrial applications by utilizing a 2D CA model. The specific problem involves
simulating the diffusion of corrosive species and the resultant chemical reactions
using FSL and reaction kinetics. The parameters used in the corrosion simulation are
summarized in Table 1.
In the following, we provide the prompt used to generate the algorithm for the
corrosion modeling with ChatGPT-4. This prompt specifically instructs the
AI to create an algorithm based on FSL of diffusion and reaction kinetics within a
2D cellular automaton framework.
The algorithm as LaTeX pseudocode, generated by the prompt in Fig. 2,
is given in Algorithm 3.

Table 1 Parameters used in the corrosion simulation study

Parameters                                  Values/Types           Dimensions
Domain size (Nx, Ny)                        200                    [L]
Protection layer thickness                  2                      [L]
Initial damage (crack) indices              List of tuples         [L] × [L]
Diffusion coefficient (top, Dtop)           0.1                    [L² T⁻¹]
Diffusion coefficient (bottom, Dbottom)     0.05                   [L² T⁻¹]
Time step size (Δt)                         0.01                   [T]
Spatial step size (Δx, Δy)                  1                      [L]
Random factor (random_factor)               Uniform distribution   [1]
Output                                      Corrosion animation    –

Fig. 2 Prompt for generating the LaTeX pseudocode of an algorithm for corrosion simulation along
with CA

Algorithm 3: Corrosion Modeling in 2D using Fick's 2nd Law and Cellular Automata
Input: Grid size (Nx, Ny), time steps T, time step Δt, grid spacing Δx, Δy, diffusion coefficient D,
       initial concentration C(x, 0), reaction rates k_a, k_c, threshold potential V_th
Output: Concentration field C(x, t) and corrosion state over time
1  Initialization:
2  Set the grid with size (Nx, Ny);
3  Initialize the concentration field C(x, 0);
4  Initialize the corrosion state S(x, 0) (1 for active corrosion sites, 0 otherwise);
5  Set diffusion coefficient D and reaction rates k_a, k_c;
6  for t = 0 to T with step Δt do
7      Diffusion Step (using Fick's 2nd Law):
8      foreach grid point (i, j) do
9          Calculate the concentration C(i, j)^{n+1} at the next time step using the central finite
           difference scheme:
           C(i, j)^{n+1} = C(i, j)^n + D Δt [ (C(i+1, j)^n − 2 C(i, j)^n + C(i−1, j)^n) / Δx²
                                            + (C(i, j+1)^n − 2 C(i, j)^n + C(i, j−1)^n) / Δy² ]
10     end
11     Corrosion Reaction Kinetics:
12     foreach grid point (i, j) do
13         Calculate the local potential V(i, j)^n based on the concentration C(i, j)^n and other
           factors (e.g., electrolyte, material properties);
14         if V(i, j)^n > V_th and S(i, j)^n = 1 then
15             Update concentration due to anodic reaction: C(i, j)^{n+1} = C(i, j)^{n+1} − k_a Δt
16         end
17         else if V(i, j)^n < V_th and S(i, j)^n = 1 then
18             Update concentration due to cathodic reaction: C(i, j)^{n+1} = C(i, j)^{n+1} + k_c Δt
19         end
20     end
21     Cellular Automaton Update:
22     foreach grid point (i, j) do
23         Update the corrosion state S(i, j)^{n+1} based on the local concentration C(i, j)^{n+1}
           and neighboring states:
24         if C(i, j)^{n+1} > C_crit or neighboring sites S(i+1, j), S(i−1, j), S(i, j+1), S(i, j−1)
           are corroded then
25             S(i, j)^{n+1} = 1;
               // Activate corrosion if the concentration exceeds the threshold or a
               // neighboring site is corroded
26         end
27     end
28     Apply Boundary Conditions:
29     Apply appropriate boundary conditions (e.g., Dirichlet, Neumann) to the concentration and
       corrosion state fields;
30 end
31 Output:
32 Return the concentration field C(x, t) and corrosion state S(x, t) over time;

2 Prompt

In the following, the sketch of the simulation domain in Fig. 3 and its features are
introduced and explained. The coordinate system, labeled with x and y axes,
intersects at the origin (0, 0), defining the framework of the simulation.
The domain consists of a 2D rectangular grid with dimensions Nx × Ny and grid
spacing Δx and Δy. Initial damage sites, highlighted in red at the top and bottom,
indicate entry points for corrosive substances. Surrounding the grid, dashed blue
lines represent a protective layer of specified thickness. The upper half of the grid,
referred to as the Top Metal, is shaded light grey, while the lower half, the Bottom
Metal, is shaded dark grey. Green arrows illustrate the diffusion of corrosive species
through the grid. Orange patches and arrows depict the state transitions of cells from
uncorroded to corroded states. Additionally, the "Time-stepping Loop" annotation
illustrates the iterative update process of the simulation.
The initial prompt for generating the code is given in Fig. 4.

Fig. 3 Detailed sketch illustrating the 2D corrosion simulation process on metal surfaces using a
Cellular Automaton, highlighting critical parts, diffusion paths, and state transitions

Fig. 4 Prompt to simulate corrosion in metals with CA



3 Code Listing

# To find a better animation, please use the saved gif file
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib
from matplotlib.colors import ListedColormap

# Set the backend to 'TkAgg' for interactive plotting with Tkinter
matplotlib.use('TkAgg')

# Define the domain and initialize parameters with specific initial damage and corroded cells
def initialize_domain(N_x, N_y, protection_layer_thickness):
    C = np.zeros((N_x, N_y))
    # Set initial damage cells for the top crack
    top_crack_indices = [(slice(0, 2), slice(95, 105)),
                         (slice(3, 5), slice(96, 104)),
                         (slice(6, 8), slice(97, 103)),
                         (slice(9, 11), slice(99, 101)),
                         (slice(12, 14), slice(95, 101)),
                         (slice(15, 17), slice(94, 100))]
    # Set initial damage cells for the bottom crack
    bottom_crack_indices = [(slice(198, 200), slice(95, 105)),
                            (slice(195, 197), slice(96, 104)),
                            (slice(192, 194), slice(97, 103)),
                            (slice(189, 191), slice(96, 102)),
                            (slice(186, 188), slice(95, 101)),
                            (slice(183, 185), slice(94, 100))]
    for idx in top_crack_indices + bottom_crack_indices:
        C[idx] = -0.75  # Initial damage (black)
    # Add protective layer
    C[:protection_layer_thickness, :] = -1
    C[-protection_layer_thickness:, :] = -1
    C[:, :protection_layer_thickness] = -1
    C[:, -protection_layer_thickness:] = -1
    return C

# Implement Fick's second law for diffusion with different coefficients for each metal
def diffuse(C, D_top, D_bottom, delta_t, delta_x, delta_y):
    C_new = C.copy()
    random_factor = np.random.uniform(0.9, 1.1, size=C.shape)
    # Apply different diffusion coefficients based on the metal type
    for i in range(1, C.shape[0] - 1):
        for j in range(1, C.shape[1] - 1):
            if C[i, j] != -1 and C[i, j] != -0.75:  # Skip protective layer and initial damage
                if i < C.shape[0] // 2:
                    D = D_top
                else:
                    D = D_bottom
                C_new[i, j] += D * delta_t * ((C[i+1, j] - 2 * C[i, j] + C[i-1, j]) / delta_x ** 2
                                              + (C[i, j+1] - 2 * C[i, j] + C[i, j-1]) / delta_y ** 2) * random_factor[i, j]
    return C_new

# Incorporate reaction kinetics with different corrosion rates for different metals
def apply_reaction_kinetics(C, state, reaction_threshold, corrosion_probability_top, corrosion_probability_low, step):
    new_state = state.copy()
    if step < 50:
        return new_state  # No corrosion until step 50

    corroded = (C >= reaction_threshold) & (state == 0)
    # Determine the metal type for each cell
    is_top_metal = np.arange(state.shape[0]).reshape(-1, 1) < state.shape[0] // 2
    is_low_metal = ~is_top_metal
    # Apply corrosion probabilities
    corrosion_probability = np.where(is_top_metal, corrosion_probability_top, corrosion_probability_low)
    probabilistic_corrosion = np.random.rand(*C.shape) < corrosion_probability
    # Only corrode if at least one neighbor is corroded
    for i in range(1, state.shape[0] - 1):
        for j in range(1, state.shape[1] - 1):
            if corroded[i, j] and np.any(state[i-1:i+2, j-1:j+2] == 1):
                if probabilistic_corrosion[i, j]:
                    new_state[i, j] = 1  # New cells become corroded with specified probability
    return new_state

# Simulation function
def simulate_corrosion(N_x, N_y, D_top, D_bottom, time_steps, delta_x, delta_y, delta_t,
                       protection_layer_thickness, reaction_threshold,
                       corrosion_probability_top, corrosion_probability_low, filename):
    C = initialize_domain(N_x, N_y, protection_layer_thickness)
    state = np.zeros_like(C)
    # Set initial corroded cells in the state array
    initial_corroded_indices = [(3, N_y // 2), (N_x - 4, N_y // 2)]
    for idx in initial_corroded_indices:
        state[idx] = 1  # Initial corroded cells
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 10))  # Keep the figure size the same
    corroded_counts = []
    top_metal_counts = []
    bottom_metal_counts = []
    protective_layer_counts = []

    def update(t):
        nonlocal C, state
        if t >= 50:
            C = diffuse(C, D_top, D_bottom, delta_t, delta_x, delta_y)
            state = apply_reaction_kinetics(C, state, reaction_threshold,
                                            corrosion_probability_top, corrosion_probability_low, t)
        # Display state: corroded cells are marked with 1, initial damage with -0.75,
        # protective layer with -1, and different colors for different metals
        display_state = np.zeros_like(state, dtype=float)
        display_state[(state == 0) & (np.arange(state.shape[0]).reshape(-1, 1) < state.shape[0] // 2)] = -0.75  # Upper metal
        display_state[(state == 0) & (np.arange(state.shape[0]).reshape(-1, 1) >= state.shape[0] // 2)] = -0.25  # Lower metal
        display_state[C == -1] = -1       # Protective layer
        display_state[C == -0.75] = 0.75  # Initial damage
        display_state[state == 1] = 1     # Corroded cells
        ax1.clear()
        im = ax1.imshow(display_state, animated=True, cmap='inferno', vmin=-1, vmax=1)
        ax1.set_title(f'Time Step: {t + 1}')
        ax1.set_xlabel('X Coordinate')
        ax1.set_ylabel('Y Coordinate')
        # Custom legend
        colors = ['yellow', 'gold', 'midnightblue', 'darkviolet', 'black']
        labels = ['Corroded Cells', 'Initial Damage', 'Top Metal', 'Bottom Metal', 'Protective Layer']
        handles = [plt.Line2D([0], [0], marker='o', color='w', markerfacecolor=c, markersize=10) for c in colors]
        ax1.legend(handles, labels, bbox_to_anchor=(1.28, 1.05), loc='upper left', borderaxespad=0.)
        # Count the number of each type of cell
        corroded_count = np.sum(state == 1)
        top_metal_count = np.sum((state == 0) & (np.arange(state.shape[0]).reshape(-1, 1) < state.shape[0] // 2))
        bottom_metal_count = np.sum((state == 0) & (np.arange(state.shape[0]).reshape(-1, 1) >= state.shape[0] // 2))
        protective_layer_count = np.sum(C == -1)
        corroded_counts.append(corroded_count)
        top_metal_counts.append(top_metal_count)
        bottom_metal_counts.append(bottom_metal_count)
        protective_layer_counts.append(protective_layer_count)
        ax2.clear()
        ax2.plot(corroded_counts, color='yellow', label='Corroded Cells')
        ax2.plot(top_metal_counts, color='midnightblue', label='Top Metal')
        ax2.plot(bottom_metal_counts, color='darkviolet', label='Bottom Metal')
        ax2.plot(protective_layer_counts, color='black', label='Protective Layer')
        ax2.set_xlabel('Time Step')
        ax2.set_ylabel('Number of Cells')
        ax2.legend()
        return im,

    ani = animation.FuncAnimation(fig, update, frames=time_steps, repeat=False, interval=50, blit=True)
    # Set the colorbar
    cbar = plt.colorbar(ax1.imshow(np.zeros((N_x, N_y)), animated=True, cmap='inferno', vmin=-1, vmax=1),
                        ax=ax1, orientation='vertical', pad=0.1)
    cbar.ax.set_ylabel('Corrosion Status')
    plt.subplots_adjust(left=0.2, right=0.85, top=0.85, bottom=0.2)  # Adjust plot size
    plt.tight_layout()  # Adjust layout to make space for legend
    # Save the animation as a GIF using Pillow
    ani.save(filename, writer='pillow')
    plt.show()
    return ani

# Example usage with specified initial conditions
N_x, N_y = 200, 200  # Dimensions of the plate
D_top = 1.4          # Diffusion coefficient for the top metal
D_bottom = 1.35      # Diffusion coefficient for the bottom metal
137 time_steps = 500 # Number of time steps to ensure full corrosion
138 delta_x = delta_y = 1 # Grid spacing
139 delta_t = 0.2 # Increased time step to ensure full corrosion
140 protection_layer_thickness = 2 # Thickness of the protective
→ layer
Modeling Corrosion Using a Cellular Automaton 95

141 reaction_threshold = 0.1 # Threshold concentration to trigger


→ corrosion
142 # Corrosion probabilities
143 corrosion_probability_top = 0.35 # 35% chance of each cell being
→ corroded in the top metal
144 corrosion_probability_low = 0.33 # 33% chance of each cell being
→ corroded in the low metal
145 # Call the simulation function and save the animation
146 filename = ’corrosion_simulation.gif’
147 ani = simulate_corrosion(N_x, N_y, D_top, D_bottom, time_steps,
→ delta_x, delta_y, delta_t, protection_layer_thickness,
148 reaction_threshold,
→ corrosion_probability_top, corrosion_probability_low,
→ filename)
Code Listing 4.1 Output by ChatGPT-4o for solving the 2D corrosion modeling problem using a CA

4 Verification

This section verifies the implementation of FSL in the corrosion simulation code.
Since the numerical solution of the Initial Boundary Value Problem (IBVP) is carried
out by an explicit finite difference (FD) method, the verification comprises a test of
the FD scheme, a check of the stability conditions required by the explicit character
of the scheme, and unit tests.
• Finite Difference Scheme Verification. Fick's second law in two dimensions, as
given in

$$\frac{\partial C(x, y, t)}{\partial t} = D \left( \frac{\partial^2 C(x, y, t)}{\partial x^2} + \frac{\partial^2 C(x, y, t)}{\partial y^2} \right), \qquad (2)$$

is approximated in the simulation code using a central finite difference scheme.
In this scheme, $C^n[i, j]$ represents the concentration at grid point $(i, j)$ at time
$t_n$, and $C^{n+1}[i, j]$ represents the concentration at time $t_{n+1}$. The terms $C[i+1, j]$,
$C[i-1, j]$, $C[i, j+1]$, and $C[i, j-1]$ correspond to the concentrations at the
neighboring grid points. Additionally, $\Delta x$ and $\Delta y$ are the grid spacings in the $x$- and
$y$-directions, respectively, $\Delta t$ is the time step, and $D$ is the diffusion coefficient.

• Stability Condition. The stability condition for an explicit finite difference
method in two dimensions is

$$\Delta t \le \frac{\Delta x^2}{2D}. \qquad (3)$$

Given the parameters $\Delta x = \Delta y = 1$ and the diffusion coefficients $D_\mathrm{top} = 1.4$
and $D_\mathrm{bottom} = 1.35$, the stability limits are obtained as $\Delta t \le \frac{1^2}{2 \cdot 1.4} \approx 0.357$ and
$\Delta t \le \frac{1^2}{2 \cdot 1.35} \approx 0.370$ for the top and bottom metal, respectively. With $\Delta t$ set
to 0.2 in the simulations, the stability condition is fulfilled (a programmatic
cross-check of these bounds is sketched after this list).
• Unit Tests Based on a Specific Diffusion Function: The verification code, as
provided in Code Listing 4.2, is generated using ChatGPT-4o based on the algorithm
outlined in Algorithm Box 4. In these tests, a simplified version of the diffusion
function is used, excluding the random factor for clarity.
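As a quick programmatic cross-check of the stability bounds discussed in the list
above, the following minimal sketch (reusing the parameter values from Code
Listing 4.1; it is not part of the generated code) recomputes the limits for both metals:

# Check the explicit-FD stability condition delta_t <= delta_x**2 / (2 * D)
# for both metals, using the parameter values from Code Listing 4.1
delta_x = 1.0
delta_t = 0.2
for name, D in [("top", 1.4), ("bottom", 1.35)]:
    dt_max = delta_x ** 2 / (2 * D)
    status = "fulfilled" if delta_t <= dt_max else "violated"
    print(f"{name} metal: dt_max = {dt_max:.3f} -> stability {status}")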

Algorithm 4: Calculation of Diffusion to Verify the Simulation

Input:
• C: 2D array representing the initial concentration field, with shape (N_x, N_y)
• D: diffusion coefficient (scalar)
• Δt: time step size (scalar)
• Δx, Δy: grid spacing in the x- and y-directions, respectively (scalars)

Output:
• C^{n+1}: 2D array representing the updated concentration field after one time step

Steps:
1. Initialize C^{n+1} ← C (create a copy of the initial concentration array).
2. For each interior grid point (i, j) where 1 ≤ i ≤ N_x − 2 and 1 ≤ j ≤ N_y − 2:
   a. If C[i, j] ≠ −1 and C[i, j] ≠ −0.75, then:
      i. Update C^{n+1}[i, j] using the following formula:

$$C(i, j)^{n+1} = C(i, j)^n + D \Delta t \left[ \frac{C(i+1, j)^n - 2C(i, j)^n + C(i-1, j)^n}{\Delta x^2} + \frac{C(i, j+1)^n - 2C(i, j)^n + C(i, j-1)^n}{\Delta y^2} \right]$$

   b. End If.
3. End For.
4. Return C^{n+1} (the updated concentration field).

import numpy as np

def diffuse_test(C, D, delta_t, delta_x, delta_y):
    C_n_plus_1 = C.copy()
    for i in range(1, C.shape[0] - 1):
        for j in range(1, C.shape[1] - 1):
            if C[i, j] != -1 and C[i, j] != -0.75:
                C_n_plus_1[i, j] += D * delta_t * ((C[i+1, j] - 2 * C[i, j] + C[i-1, j]) / delta_x ** 2
                                                   + (C[i, j+1] - 2 * C[i, j] + C[i, j-1]) / delta_y ** 2)
    return C_n_plus_1

def test_diffusion():
    C = np.zeros((5, 5))
    C[2, 2] = 1  # Initial concentration at the center
    D = 1.0
    delta_t = 0.2  # Time step for stability
    delta_x = delta_y = 1  # Grid spacing
    C_n_plus_1 = diffuse_test(C, D, delta_t, delta_x, delta_y)
    expected_center = 1 - 4 * D * delta_t / delta_x ** 2
    expected_adjacent = D * delta_t / delta_x ** 2
    assert np.isclose(C_n_plus_1[2, 2], expected_center), f"Central value should decrease to {expected_center}"
    assert np.isclose(C_n_plus_1[1, 2], expected_adjacent), f"Adjacent cells should increase to {expected_adjacent}"

test_diffusion()
print("Test passed!")
Code Listing 4.2 This code is used to verify the implementation of FSL

In conclusion, the verification confirms that the diffusion implementation in the


corrosion simulation code adheres to FSL. The finite difference scheme is correctly
applied, the stability condition is met, and unit tests validate the expected diffu-
sion behavior. Therefore, the diffusion component of the simulation is accurately
implemented.
Reaction Kinetics
This section verifies that the implemented RK in the corrosion simulation exhibits the
expected behavior for the given parameters, ensuring that the corrosion process follows
the reaction threshold and corrosion probabilities for the different metal types.
The code used for verification, as detailed in Code Listing 4.3, is generated by
ChatGPT-4o following the algorithm presented in Algorithm Box 5. A test function
initializes a grid representing a metal plate with specific initial conditions for both
corrosion and concentration. The reaction kinetics function is then applied to verify
whether the cells corrode as expected. To ensure consistent results, the random seed is
fixed, and specific cells are deliberately corroded.

Algorithm 5: Calculation of Reaction Kinetics to Verify the Corrosion Modeling

Input:
• C: 2D array representing the concentration field, with shape (N_x, N_y)
• state: 2D array representing the current corrosion state, with shape (N_x, N_y)
• reaction_threshold: scalar value representing the concentration threshold for corrosion
• corrosion_probability_top: probability of corrosion for the top metal layer
• corrosion_probability_low: probability of corrosion for the bottom metal layer
• step: integer representing the current simulation step

Output:
• new_state: 2D array representing the updated corrosion state

Procedure:
1. Initialize new_state ← state.
2. If step < 50, then return new_state.
3. Identify cells where C ≥ reaction_threshold and state == 0.
4. Create a boolean mask is_top_metal to identify top metal regions.
5. Assign corrosion_probability based on is_top_metal.
6. Set a fixed random seed for reproducibility.
7. Generate a random array probabilistic_corrosion.
8. Manually set probabilistic_corrosion[4, 5] and probabilistic_corrosion[6, 5] to 0.
9. For each interior grid point (i, j):
   a. If corroded[i, j] is True:
      i. Extract the neighbors of state[i, j].
      ii. If any neighbor is corroded:
         A. If probabilistic_corrosion[i, j] < corrosion_probability[i, j], then set
            new_state[i, j] = 1.
      iii. Else, print a message indicating that the cell has no corroded neighbors.
   b. Else, print a message indicating that the cell is not above the reaction threshold.
10. Return new_state.

import numpy as np

# Corrected apply_reaction_kinetics function with deterministic outcome for testing
def apply_reaction_kinetics(C, state, reaction_threshold, corrosion_probability_top,
                            corrosion_probability_low, step):
    new_state = state.copy()
    if step < 50:
        print(f"Step {step}: No corrosion applied (initial delay).")
        return new_state  # No corrosion until step 50

    corroded = (C >= reaction_threshold) & (state == 0)
    is_top_metal = np.zeros_like(C, dtype=bool)
    is_top_metal[:C.shape[0] // 2, :] = True  # First half rows are top metal
    corrosion_probability = np.where(is_top_metal, corrosion_probability_top, corrosion_probability_low)

    # Set a fixed random seed for reproducibility
    np.random.seed(0)
    probabilistic_corrosion = np.random.rand(*C.shape)
    # Force the cells [4, 5] and [6, 5] to corrode for testing purposes
    probabilistic_corrosion[4, 5] = 0
    probabilistic_corrosion[6, 5] = 0

    for i in range(1, state.shape[0] - 1):
        for j in range(1, state.shape[1] - 1):
            if corroded[i, j]:
                neighbors = state[i - 1:i + 2, j - 1:j + 2]
                if np.any(neighbors == 1):
                    if probabilistic_corrosion[i, j] < corrosion_probability[i, j]:
                        new_state[i, j] = 1  # New cells become corroded with specified probability
                        print(f"Cell ({i}, {j}) corroded. Neighbors: {neighbors.flatten()}, "
                              f"Prob: {corrosion_probability[i, j]}, Rand: {probabilistic_corrosion[i, j]}")
                    else:
                        print(f"Cell ({i}, {j}) not corroded. Neighbors: {neighbors.flatten()}, "
                              f"Prob: {corrosion_probability[i, j]}, Rand: {probabilistic_corrosion[i, j]}")
                else:
                    print(f"Cell ({i}, {j}) has no corroded neighbors.")
            else:
                print(f"Cell ({i}, {j}) not above reaction threshold.")
    return new_state

# Test function for reaction kinetics
# The test function initializes the grid and verifies the corrosion process.
def test_reaction_kinetics():
    N_x, N_y = 10, 10  # Small grid for testing
    C = np.zeros((N_x, N_y))
    state = np.zeros((N_x, N_y))
    # Set initial conditions
    reaction_threshold = 0.1
    corrosion_probability_top = 0.5
    corrosion_probability_low = 0.3
    # Set specific cells to meet the corrosion threshold
    C[4, 5] = 0.2  # Above threshold in top metal
    C[6, 5] = 0.2  # Above threshold in bottom metal
    state[4, 4] = 1  # Initial corroded cell near top threshold cell
    state[6, 6] = 1  # Initial corroded cell near bottom threshold cell

    # Print initial state and concentration
    print("Initial State:")
    print(state)
    print("Initial Concentration (C):")
    print(C)

    # Apply reaction kinetics
    step = 51  # Step beyond the initial delay
    new_state = apply_reaction_kinetics(C, state, reaction_threshold,
                                        corrosion_probability_top,
                                        corrosion_probability_low, step)

    # Print new state after applying reaction kinetics
    print("New State After Reaction Kinetics:")
    print(new_state)

    # Check if the specific cells have corroded based on their neighbors and probabilities
    assert new_state[4, 5] == 1, "Top metal cell should be corroded."
    assert new_state[6, 5] == 1, "Bottom metal cell should be corroded."

# Run the test
test_reaction_kinetics()
Code Listing 4.3 Code to test the correct implementation of reaction kinetics

4.1 Results

Initial conditions are set with specific cells at the corrosion threshold. The reaction
kinetics function is applied, with debug statements providing insights into the process.
As expected, the cells at [4, 5] and [6, 5] corrode, confirming the function's behavior
under controlled conditions, as shown in Fig. 5.
The verification process demonstrates that the RK implementation exhibits the
expected behavior for the given parameters. By using a fixed random seed and
forcing specific cells to corrode, deterministic outcomes are ensured, confirming
the implementation's correctness. This approach can be extended to validate other
aspects of the corrosion simulation.
The code provided in the repository [6] is used to specify the initial and boundary
conditions of the problem. The feasibility of the research is further supported by
verifying that the obtained results closely match the reference results with a high
degree of accuracy.

Fig. 5 Comparison of initial and final concentration matrices in the RK test: (a) initial
concentration matrix with cells [4, 5] and [6, 5] meeting the corrosion threshold; (b) final
matrix after corrosion, with cells [4, 5] and [6, 5] corroded

5 Discussion

Repeatability as a Reliability Signature. The overall form of the chatbot's responses
remains consistent, with similar prompts producing similar responses. However,
there are definite differences in the details.

5.1 Errors, Omissions

The shortcomings can be summarized as follows:

• Rule Violations: The code generated by ChatGPT-4o sometimes violates basic


rules. For example, corroded parts of the material are allowed to revert to their
original state, which is unrealistic.
• Inability to Fix Visualization Errors: Some visualization errors in the generated
code required manual intervention despite attempts to resolve them with secondary
prompts.
• Improper Plot Labeling: Some labels in the legend section are incorrectly printed
on top of each other, resolved by adjusting the plot size.
• Incorrect Implementation of Fick’s second law: The initial simulation did not
correctly follow FSL. This is resolved with a secondary prompt using the appro-
priate function.
• Inadequate Initial Conditions: The generated code displays initial damage incor-
rectly and contains errors, which are resolved by manual code editing. In this
revision, the coordinates of the cells that suffered initial damage are entered
accurately.

• Error in Updating Cells: Initially, cells outside the damage area did not corrode.
A new prompt for the reaction kinetics function corrected this, ensuring the entire
metal could corrode.
• Neglect of Defined Boundary Conditions: Corrosion is intended to occur only
if a neighboring cell is corroded, but the initial code allows corrosion anywhere.
This problem is fixed by placing a new if statement and specifying its condition.
• Incorrect Display of Primary Damage: The chatbot struggles even with this basic
problem, which is solved by adjusting the order in which the cell types are defined.

5.2 Learned Lessons

In conclusion, ChatGPT-4o is a powerful research assistant capable of significantly


streamlining the research process by providing comprehensive information, data
analysis, and insights. However, the results generated by ChatGPT-4o require expert
review to correct any minor issues or inaccuracies, ensuring the reliability and accu-
racy of the information. The chatbot’s effectiveness is highly dependent on proper
utilization by the user, who must understand its strengths and limitations to maximize
response quality. When used correctly, ChatGPT can deliver highly valuable results,
contributing meaningfully to research endeavors.
The project highlights the substantial improvements in the latest iteration of
ChatGPT, specifically the difference between versions 3.5 and 4o. ChatGPT-4o
exhibits a marked enhancement in contextual understanding, information retrieval,
and response coherence, making it a more effective and reliable research tool. These
advancements enable users to tackle more complex tasks, saving time and improv-
ing output quality. While expert review remains crucial, the enhanced capabilities of
ChatGPT-4o significantly boost its utility as an intelligent and dependable research
assistant.

5.3 Refined Corrosion Simulation Prompt

The prompt shown in Fig. 6 is used to edit the code generated by the initial prompt
shown in Fig. 4. The final code is shown in Code Listing 4.1.

Fig. 6 Prompt used for the final editing of the code



5.4 Visual Results

This section presents the visual documentation of the corrosion simulation process
at different stages, highlighting in Fig. 7(a) the initial, in (b) an intermediate, in (c) an
advanced, and in (d) the final stage of the corrosion progress. The faster progress in the
upper part of the simulation domain, caused by the larger diffusion coefficient (compare
the values in Table 1), is clearly visible.

6 Conclusion

The objective of this study was to develop a robust 2D CA model for simulating
corrosion processes, incorporating FSL of diffusion and RK. The simulation tool
serves to predict the initiation and propagation of corrosion under various
environmental conditions, with the ultimate goal of enabling effective corrosion
prevention and control strategies.
The main results of the study are summarized as follows:
Model Development: A 2D CA model was successfully developed to simulate
corrosion. The model accurately captures the transport of corrosive species and the
subsequent chemical reactions at the material’s surface.
Diffusion Verification: The implementation of FSL was rigorously tested. The
verification process confirmed that the model correctly simulates the diffusion pro-
cess, adhering to theoretical expectations and stability conditions.
Reaction Kinetics Verification: The reaction kinetics were verified through con-
trolled testing. The results show that the model accurately simulates the corrosion
process, taking into account the different corrosion rates for various metals and the
influence of initial conditions.
Simulation Results: The simulation effectively demonstrated the progression of
corrosion, showcasing the protective layer’s impact and the different corrosion rates
of the metals. The refined prompts and code adjustments ensured that the simulation
adhered to the specified conditions.

Fig. 7 Different stages (a)–(d) of the corrosion process, namely (a) the initial stage, (b) an
intermediate stage, (c) an advanced stage, and (d) the final stage, along with (e) the temporal
evolution of the different cell types

6.1 Discussion and Future Work

During the development of the 2D CA model for simulating corrosion, several chal-
lenges were encountered with the initial code generated by ChatGPT. Major issues
include improper rule enforcement, such as allowing corroded cells to revert to their
original state, and visualization challenges that require manual fixes. Minor issues
involve plot labeling errors resolved by adjusting the plot size. The initial prompt
lacks sufficient details, leading to issues like incorrect implementation of FSL and
inadequate initial conditions. These issues are addressed through iterative prompts
and manual edits.
Despite these challenges, the process highlights the importance of precise and
detailed prompts to ensure the accuracy of the generated code. While the structure
of the code is consistent, variations in finer details emphasize the need for careful
review and expert oversight. ChatGPT proves to be a powerful research assistant,
significantly aiding the development process. However, its effectiveness depends on
the clarity and specificity of the prompts, as well as the expertise applied in reviewing
and refining its outputs.
Future research can focus on extending the model to include more complex envi-
ronmental conditions and different types of materials, and incorporating machine
learning techniques to enhance the model’s predictive capabilities.
In conclusion, the developed 2D CA model for simulating corrosion is a robust tool
for predicting and managing corrosion processes. The iterative process of refining
prompts and verifying the model ensured that the final simulation met the desired
standards of accuracy and reliability. This study underscores the potential of AI-
assisted tools like ChatGPT in scientific research, while emphasizing the need for
continuous refinement and expert oversight.

References

1. Wang, Y., Guan, B., Wei, X., Lu, J., Ding, J. High-temperature corrosion of Ni-based alloys in
chloride molten salts: A cellular automaton model. Corrosion Science. 150, pp. 17-28 (2019).
https://doi.org/10.1016/j.solmat.2019.110170.
2. Reinoso-Burrows, J.C., Toro, N., Cortés, M. Cellular automata modeling as a tool in corrosion
management. Materials. 16(6051) (2023). https://doi.org/10.3390/ma16176051.
3. Stafiej, J., di Caprio, D., Bartosik, L. Corrosion-passivation processes in a cellular automata-
based simulation study. Journal of Supercomputing. 65, pp. 697-709 (2013). https://doi.org/10.
1007/s11227-013-0933-8.
4. Valor, A., Caleyo, F., Hallen, J.M. Pitting corrosion modeling in low-carbon steels using cellular
automata. Corrosion. 66(2), pp. 025004-1–025004-8 (2010).
5. Chen, H., Chen, Y., Zhang, J. Cellular automaton modeling on the corrosion/oxidation mecha-
nism of steel in a liquid metal environment. Progress in Nuclear Energy. 50(5-6), pp. 587-593
(2008). Elsevier. https://doi.org/10.1016/j.pnucene.2007.11.044.
6. Gooo, T. Cellular Automata Corrosion. (2021). Available at: https://github.com/tracygooo/
cellular-automata-corrosion. Code posted: 21 June 2021, Accessed: 5 May 2024.

7. Chen, M.C., Wen, Q.Q. Simulation of corrosion process for structure with the cellular automata
method. IOP Conference Series: Materials Science and Engineering. 216, 012012 (2017). IOP
Publishing. https://doi.org/10.1088/1757-899X/216/1/012012.
8. Xiao, Z., Jun, H., Yuqi, W., Maosheng, Z., Zaoxiao, Z. Simulation of pitting corrosion for Ni-
based alloy using a cellular automata model. Rare Metal Materials and Engineering. 44(10),
pp. 2347-2352 (2015). Elsevier. https://doi.org/10.1016/S1875-5372(16)30018-2.
Instationary Heat Conduction
on Rectangular Domains with Arbitrary
Circular Holes

Aagashram Neelakandan and Bernhard Eidel

Abstract This chapter presents the generation and evaluation of Python code for
simulating two-dimensional, unsteady heat conduction in rectangular domains with
multiple holes using ChatGPT-4. The focus is on solving the Fourier heat conduc-
tion equation through the Forward Time Centered Space (FTCS) scheme, a finite
difference method. The chapter begins with a detailed derivation of the algorithmic
approach, transitioning from the governing partial differential equation to the discrete
FTCS scheme for temperature updates. The accuracy and reliability of the generated
code are rigorously tested, including a comparison with analytical solutions, demon-
strating the effectiveness of AI-assisted coding in computational materials science
and mechanics.

1 Introduction

The study focuses on understanding the performance and usage of ChatGPT-4 in
generating code for a scientific problem, namely the implementation of the 2D
Fourier transient heat conduction problem using the Forward Time Centered Space
(FTCS) scheme [5].
The heat conduction equation describes how heat is transferred within a material
over time. In its general form, the 2D heat conduction equation is given by
 
$$\frac{\partial T}{\partial t} = \alpha \left( \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} \right), \qquad (1)$$

A. Neelakandan (B) · B. Eidel


M5, IMFD, TUBAF, Freiberg, Germany
e-mail: [email protected]
B. Eidel
e-mail: [email protected]


where $T$ represents the temperature, $t$ is time, $x$ and $y$ are the spatial coordinates,
and $\alpha$ is the thermal diffusivity. This equation represents the conservation of energy
and relates the temporal change in temperature to the spatial variations and accounts
for the thermal diffusivity of the material.
The FTCS scheme is a numerical method used to approximate the solutions of
partial differential equations. It is based on the Finite Difference Method (FDM),
with discretization of both the spatial and temporal quantities. The FTCS approximates
the spatial derivatives using the central difference scheme and the time derivative using
the forward Euler method.
To apply the FTCS scheme, we discretize the spatial and temporal domains. Let
$\Delta x$ and $\Delta y$ be the spatial step sizes in the $x$ and $y$ directions, respectively, and let
$\Delta t$ be the time step. Define the grid points as $x_i = i\Delta x$ and $y_j = j\Delta y$ for integers
$i$ and $j$, and $t_n = n\Delta t$ for integer $n$. The temperature at a grid point $(x_i, y_j)$ at time
$t_n$ is denoted as $T^n_{i,j}$.
The finite difference approximations for the first and second derivatives are given
by

$$\frac{\partial T}{\partial t} \approx \frac{T^{n+1}_{i,j} - T^n_{i,j}}{\Delta t}, \qquad (2)$$

$$\frac{\partial^2 T}{\partial x^2} \approx \frac{T^n_{i+1,j} - 2T^n_{i,j} + T^n_{i-1,j}}{\Delta x^2}, \qquad (3)$$

$$\frac{\partial^2 T}{\partial y^2} \approx \frac{T^n_{i,j+1} - 2T^n_{i,j} + T^n_{i,j-1}}{\Delta y^2}. \qquad (4)$$

Substituting equations (2), (3), and (4) into the heat conduction equation (1), we
obtain the FTCS scheme

$$T^{n+1}_{i,j} = T^n_{i,j} + \alpha \Delta t \left( \frac{T^n_{i+1,j} - 2T^n_{i,j} + T^n_{i-1,j}}{\Delta x^2} + \frac{T^n_{i,j+1} - 2T^n_{i,j} + T^n_{i,j-1}}{\Delta y^2} \right), \qquad (5)$$

where $T^{n+1}_{i,j}$ is the temperature at time $t_{n+1}$ calculated from the known temperature
$T^n_{i,j}$ at time $t_n$, and $\Delta t = t_{n+1} - t_n$ is the discretized time step, with $\Delta x$ and $\Delta y$
being the spatial step sizes, as shown in Fig. 1.
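For illustration, the update rule (5) maps directly onto NumPy array slicing. The
following minimal sketch (an illustration, not the code generated in this chapter)
performs one FTCS step on all interior grid points at once:

import numpy as np

def ftcs_step(T, alpha, dt, dx, dy):
    # One FTCS update of the interior points of the temperature field T,
    # implementing Eq. (5) with central differences in x and y
    T_new = T.copy()
    T_new[1:-1, 1:-1] = T[1:-1, 1:-1] + alpha * dt * (
        (T[2:, 1:-1] - 2 * T[1:-1, 1:-1] + T[:-2, 1:-1]) / dx ** 2
        + (T[1:-1, 2:] - 2 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dy ** 2
    )
    return T_new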
The algorithm for instationary heat conduction is given in Algorithm Box 1, and the
prompt used for generating it is shown in Fig. 2.

Fig. 1 2D representation of the FTCS scheme based on FDM, indicating the spatial points
($T^n_{i,j}$, $T^n_{i+1,j}$, $T^n_{i-1,j}$, $T^n_{i,j+1}$, $T^n_{i,j-1}$) used for calculating the next temporal point ($T^{n+1}_{i,j}$)

Generate the LaTeX code for an algorithm that solves a 2D transient heat con-
duction problem using the Forward Time Centered Space (FTCS) scheme.
The algorithm should be formatted using the algorithm environment in La-
TeX.

Fig. 2 Prompt for generating the algorithm in LaTeX pseudocode for instationary heat conduction
using an FTCS scheme in 2D

Algorithm 1: Instationary Heat Conduction using FTCS Scheme in 2D

Input: Grid size $(N_x, N_y)$, time steps $T$, time step $\Delta t$, grid spacing $\Delta x$, $\Delta y$, thermal
diffusivity $\alpha$, initial temperature field $T(x, 0)$, boundary conditions
Output: Temperature field $T(x, t)$ over time
1 Initialization:
2 Set the grid with size $(N_x, N_y)$;
3 Initialize the temperature field $T(x, 0)$;
4 Set thermal diffusivity $\alpha$;
5 for $t = 0$ to $T$ with step $\Delta t$ do
6   Update Temperature Field:
7   foreach interior grid point $(i, j)$ do
8     Calculate the temperature $T(i, j)^{n+1}$ at the next time step using the FTCS scheme:

$$T(i, j)^{n+1} = T(i, j)^n + \alpha \Delta t \left( \frac{T(i+1, j)^n - 2T(i, j)^n + T(i-1, j)^n}{\Delta x^2} + \frac{T(i, j+1)^n - 2T(i, j)^n + T(i, j-1)^n}{\Delta y^2} \right)$$

9   end
10  Apply Boundary Conditions:
11  Apply the specified boundary conditions to the temperature field $T(x, t)$;
12 end
13 Output:
14 Return the temperature field $T(x, t)$ over time;

2 Prompt

The prompt aims to generate code using ChatGPT-4 to numerically compute the time-
dependent temperature field in a rectangular 2D domain of arbitrary dimensions,
subject to an arbitrary number of different circular holes in the domain, by applying
the FTCS scheme introduced above for solving the heat conduction equation.
For formulating a prompt for ChatGPT-4 to generate a 2D transient heat conduc-
tion simulation in Python using the FTCS method, we have to mention every critical
parameter. These parameters encompass plate dimensions, thermal diffusivity, spa-
tial step, and maximum iteration time. The prompt shown in Fig. 3 outlines precise boundary
conditions, incorporating Dirichlet at the top and Neumann on the other sides while
describing hole configurations with coordinates, radius, type, and value. The prompt
also includes defining functions for grid initialization, hole condition application,
temperature distribution computation, and high-resolution visual output generation.
Experiments with prompts reveal that some functions need heightened emphasis,
such as hole evolution, visualization, and verbose logging. The purpose of verbose
logging is to provide extensive and detailed information about the program's status,
often used for debugging or monitoring purposes. The prompt also calls for library
availability checks, prioritizing OpenCV for video creation and seamlessly falling
back to Pillow if required. The anticipated outcome is a meticulously documented,
self-contained script, poised to deliver precise simulations with explicit hole evolution
visibility, archived as both a video and a GIF (Fig. 3).

Fig. 3 Prompt for solving the 2D Fourier transient heat conduction problem

3 Generation Results

The reason that ChatGPT-4 cannot consistently generate the same results is due to
its non-deterministic nature, influenced by random sampling during generation and
contextual dependencies. The model’s training on a diverse dataset contributes to
response variability. To alleviate these deficiencies and speed up programming, one
needs to control randomness by refining the input through elaborate prompt engineering
and by regenerating the answers, and the user should understand that complete
generation is not guaranteed in a single prompt. The following code in Code Listing 5.1
is generated by ChatGPT-4 with the given prompt, without any additional inputs; one of
the output plots is shown in Fig. 4.

Fig. 4 Contour plot of the plate domain (quantities on the abscissa $x$ and ordinate $y$ in mm) with
two circular holes at time step $t = 749$, with a time step size of $\Delta t = 0.125$ s. The top boundary
is maintained at a temperature of 100 K, while the other boundaries are kept at 0 K. The first hole,
centered at (10, 35), has a temperature of 0 K, and the second hole, centered at (30, 27), has a
temperature of 100 K

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from PIL import Image
import os

# Check for OpenCV availability
try:
    import cv2
    OPENCV_AVAILABLE = True
except ImportError:
    OPENCV_AVAILABLE = False

# Fixed Numerical Parameters
plate_length = 50.0
max_iter_time = 750
alpha = 2.0
delta_x = 1.0
delta_t = (delta_x ** 2) / (4 * alpha)  # Stability criterion
gamma = alpha * delta_t / (delta_x ** 2)

# Boundary Conditions
boundaries = {'top': 100.0, 'bottom': 0.0, 'left': 0.0, 'right': 0.0}

# Hole Configuration
holes = [{'center': (25, 25), 'radius': 5, 'type': 'D', 'value': 0.0}]

def initialize_grid():
    """Initializes the temperature grid with boundary conditions."""
    grid = np.zeros((int(plate_length), int(plate_length)))
    grid[0, :] = boundaries['top']
    # Neumann conditions on other sides are implicitly set to zero
    return grid

def apply_hole_conditions(grid):
    """Applies hole configurations to the grid."""
    for hole in holes:
        x_center, y_center = hole['center']
        radius = hole['radius']
        for i in range(grid.shape[0]):
            for j in range(grid.shape[1]):
                if (i - x_center) ** 2 + (j - y_center) ** 2 < radius ** 2:
                    grid[i, j] = hole['value']
    return grid

def FTCS(grid):
    """Performs a single time step of the FTCS method."""
    new_grid = np.copy(grid)
    for i in range(1, grid.shape[0] - 1):
        for j in range(1, grid.shape[1] - 1):
            new_grid[i, j] = grid[i, j] + gamma * (grid[i+1, j] + grid[i-1, j]
                                                   + grid[i, j+1] + grid[i, j-1] - 4 * grid[i, j])
    return apply_hole_conditions(new_grid)

def save_frame(frame, iter_time, frames_dir):
    """Saves a single frame of the simulation."""
    plt.imshow(frame, cmap='hot', interpolation='nearest')
    plt.title(f'Time Step: {iter_time}')
    plt.colorbar()
    plt.savefig(f"{frames_dir}/frame_{iter_time:04d}.png")
    plt.close()

def compile_video(frames_dir, output_file):
    """Compiles frames into a video."""
    if not OPENCV_AVAILABLE:
        return
    images = [img for img in os.listdir(frames_dir) if img.endswith(".png")]
    frame = cv2.imread(os.path.join(frames_dir, images[0]))
    height, width, layers = frame.shape
    video = cv2.VideoWriter(output_file, cv2.VideoWriter_fourcc(*'DIVX'), 15, (width, height))
    for image in images:
        video.write(cv2.imread(os.path.join(frames_dir, image)))
    cv2.destroyAllWindows()
    video.release()

def compile_gif(frames_dir, output_file):
    """Compiles frames into a GIF."""
    images = [Image.open(os.path.join(frames_dir, img)) for img in os.listdir(frames_dir) if img.endswith(".png")]
    images[0].save(output_file, save_all=True, append_images=images[1:], optimize=False, duration=40, loop=0)

def run_simulation():
    """Runs the entire simulation."""
    grid = initialize_grid()
    grid = apply_hole_conditions(grid)
    frames_dir = 'frames'
    os.makedirs(frames_dir, exist_ok=True)

    for iter_time in range(max_iter_time):
        print(f"Iteration {iter_time}/{max_iter_time}")
        save_frame(grid, iter_time, frames_dir)
        grid = FTCS(grid)

    if OPENCV_AVAILABLE:
        compile_video(frames_dir, 'heat_conduction_simulation.avi')
    else:
        compile_gif(frames_dir, 'heat_conduction_simulation.gif')

if __name__ == "__main__":
    run_simulation()
Code Listing 5.1 Output by ChatGPT for solving 2D Fourier transient heat conduction problem

4 Testing and Verification

To assess ChatGPT-4’s code generation accuracy, we employ intrinsic verification [3].


This involves validating the output by comparing it to established benchmarks, such
as the one-dimensional heat conduction problem. For this validation, we set the left
boundary of the plate (.x = 0 mm) at a temperature .T1 = 1000 K and the right side
(.x = 1000 mm) at a lower temperature of .T2 = 0 K, creating a temperature gradient
in the.x-direction. This reduces the problem to one-dimensional, for which the closed
form could be obtained by solving (1) in 1D, which yields (6) expressed in terms of
the error function erf

Fig. 5 MSE, plotted on a logarithmic scale, between the FTCS and analytical solutions over time
steps along the middle of the plate

 
$$T(x, t) = T_1 - (T_1 - T_2)\,\mathrm{erf}\!\left( \frac{x}{2\sqrt{\alpha t}} \right), \qquad (6)$$

$$\text{where } \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-y^2}\, dy. \qquad (7)$$

The analytical solution serves to verify the code-generated numerical solution,
where the deviation is measured by the Mean Square Error (MSE), defined as

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( T_{\mathrm{analytical},i} - T_{\mathrm{FTCS},i} \right)^2, \qquad (8)$$

where $n$ is the number of data points along the plate's midsection, $T_{\mathrm{analytical},i}$ is the
temperature at point $i$ computed using the analytical solution, and $T_{\mathrm{FTCS},i}$ is the
temperature at point $i$ computed using the FTCS scheme.
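In code, this comparison takes a compact form. The following sketch (with
illustrative variable names; it assumes SciPy's erf for the closed-form profile (6))
evaluates the analytical solution along the midsection and the MSE (8) against a
numerical temperature slice:

import numpy as np
from scipy.special import erf

def analytical_1d(x, t, T1=1000.0, T2=0.0, alpha=2.0):
    # Closed-form 1D solution (6) for the given boundary temperatures
    return T1 - (T1 - T2) * erf(x / (2.0 * np.sqrt(alpha * t)))

def mse(T_analytical, T_ftcs):
    # Mean square error (8) between analytical and FTCS temperatures
    return np.mean((T_analytical - T_ftcs) ** 2)

# Illustrative usage at one time instant along the plate's midsection:
# x = np.arange(0.0, 200.0, 1.0)                       # positions in mm
# err = mse(analytical_1d(x, t=50.0), T_numerical_midline)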
The diagrams in Figs. 5 and 6 illustrate the MSE between the temperatures computed
by the analytical and the FTCS scheme along the plate's midsection. The decreasing
error over extended time intervals, coupled with the overall error bounds within 10%
for a 1000 mm plate, indicates that using the code generated by ChatGPT-4 directly,
with or without minor adjustments, is a viable option for solving a scientific task.

Fig. 6 Contour plot of the logarithm of the MSE between the FTCS and the analytical solutions over
the spatial distance $x$ in the middle of the plate at $y = 500$ mm, displayed up to a distance of
$x \le 200$ mm. The contour levels are colored to indicate the time steps of the simulation

5 Discussion

The output of the code generated by ChatGPT-4 and its verification lay a strong
foundation for using Large Language Models (LLMs) as a critical tool in solving
scientific problems. LLMs have issues with the faithfulness of the generated content,
caused by the sources of data, training, and inference and known as hallucinations [2],
which may result in outputs that are factually incorrect or flawed in reasoning due to
memorized data; however, various methods have been developed to reduce their
effects. In the back-end, the model assigns higher probability to statements it has
seen during training than to the ones it is generating, and high uncertainty errors are
associated with tokens that are not present in the user-given prompt. While the
standard user cannot control the effects of hallucinations arising from data and
training, inference-caused errors can be mitigated by Faithfulness Enhanced
Decoding, which emphasizes the user instruction and its alignment with the generated
content [1]. The authors of [4] develop a method that prioritizes the inputs/tokens
from the user context when there is a high discrepancy or uncertainty between the
output of the LLM and the user's prompt. Thus, if the given input is sufficiently
detailed and does not rely on external facts, the output is less likely to contain a large
share of hallucinated content; this is one of the reasons why detailed user prompts
are important for obtaining low-error output from ChatGPT when solving scientific
problems. However, the task of generating an extremely detailed prompt is not
always trivial, as there is no natural end point in providing details. The amount of
information and how it is structured can also influence the output of ChatGPT.
During many initial tests, even though ChatGPT-4 had a higher likelihood of
generating code that works straight out of the box without regeneration, the given
prompt yielded complete code with ChatGPT-3.5 rather than with ChatGPT-4, since
the latter assumed that the user was asking for the structure of the code and provided
only the overall framework rather than the actual implementation. A possible
mitigation is either using a chain of commands or adjusting the prompt to force
ChatGPT to provide complete code every time; this study employs the latter method.
A useful strategy in scientific computing is to use LaTeX-generated pseudocode as
an intermediate step before code generation. This method offers experienced devel-
opers greater control by first generating a LaTeX algorithm via ChatGPT, reviewing
and refining it, and then using it to produce code. This approach ensures a clear, logi-
cal structure before implementation, reducing potential errors. While LaTeX provides
precision and consistency, this method introduces challenges, such as the model's
need to decode LaTeX, which can complicate longer prompts or complex modifi-
cations. Additionally, this approach may not scale well for complex problems and
might be less accessible for users unfamiliar with pseudocode. For proficient coders,
the main advantage of ChatGPT may be time-saving rather than enhanced under-
standing, making this algorithmic approach not universally applicable.

6 Conclusion

The central purpose of this chapter was to explore the performance of ChatGPT-4 in
generating code for complex scientific problems, which, in this case, focuses on the
implementation of the 2D Fourier transient heat conduction problem using the FTCS
scheme for a rectangular plate with an arbitrary number of holes. The prompt was
formulated in a way such that it encapsulates every parameter used for the simulation
along with the precise boundary condition, and it also highlights the use of libraries
required to produce the plots requested by the user. The code generated by ChatGPT-
4 was tested and verified with the analytical solution, which demonstrated that the
results are in agreement with the analytical solution. Remaining errors are minor
and can be traced back to the numerical method (FTCS) realized by a flawless code.
The discussion elucidated the general problem of hallucination intimately related
to LLMs, for which we suggested effective countermeasures, such as providing
detailed instructions in the prompt. Overall, the results of this study
strongly suggest that LLMs will play a major role in the future as an assistant in
the development and iteration of faster solutions, thus pushing the boundaries of
scientific computing and problem-solving.

References

1. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X.,
Qin, B. & Others. A survey on hallucination in large language models: Principles, taxonomy,
challenges, and open questions. ArXiv Preprint arXiv:2311.05232 (2023)
2. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A. & Fung,
P. Survey of hallucination in natural language generation. ACM Computing Surveys. 55, 1-38
(2023)
3. Oliveira, J., Nascimento, J., Ribeiro, S., Oliveira, G., Costa, G., Guimarães, G. & Santos Jr, J.
Analytical solution of a 2d transient heat conduction problem using green's functions. Revista
De Engenharia Térmica. 19, 66-71 (2020)
4. Poel, L., Cotterell, R. & Meister, C. Mutual information alleviates hallucinations in abstractive
summarization. Proceedings Of The 2022 Conference On Empirical Methods In Natural Lan-
guage Processing. pp. 5956-5965 (2022), https://aclanthology.org/2022.emnlp-main.399
5. Suárez-Carreño, F. & Rosales-Romero, L. Convergency and stability of explicit and implicit
schemes in the simulation of the heat equation. Applied Sciences. 11, 4468 (2021)
Transfer Learning for Alloy
Classification Based on Microstructure
Images

Aditya Deshmukh and Bernhard Eidel

Abstract This study investigates the performance of large language models (LLMs),
specifically GPT-4, in developing a deep learning (DL) model to predict alloy types
based on scanning electron microscopy (SEM) images of steel microstructures. The
approach utilizes transfer learning (TL) and an ensemble of two pre-trained models,
ResNet-50 and DenseNet-121, fine-tuned on SEM scans of 33 types of steels. The
two models achieve validation accuracies of 97.6% and 98.4%, respectively, with the
ensemble model reaching a test accuracy of 99.2%. The results underscore the poten-
tial of LLM-assisted coding in computer vision tasks, such as image classification,
within computational materials science. The limitations are also discussed.

1 Introduction

The study evaluates the ability of GPT-4 to code a transfer-learning-based model
that predicts an alloy given a Scanning Electron Microscope (SEM) scan of a
microstructure. This section describes the data and approaches relevant to this task.
Section 2 describes the prompt-engineering approach and the prompt. Section 3 lists
the code generated by this prompt. Section 4 reports the performance of the code.
Section 5 discusses the coding errors GPT-4 made and the measures taken. Finally,
in Sect. 6, we summarize the findings.

A. Deshmukh (B) · B. Eidel


M5, IMFD, TUBAF, Freiberg, Germany
e-mail: [email protected]
B. Eidel
e-mail: [email protected]


1.1 Data

We use raw SEM scans of microstructures of 33 types of 9% Cr steels [4]. The data
is divided into three folders, one for each alloy (CPJ, HR, P92), each consisting of
subfolders of specific alloys (e.g., CPJ7J), wherein the microstructure images of that
specific alloy are stored. These three families of alloys are closely related in terms
of their functions. They are heat-resistant and exhibit good creep resistance. In total,
there are 837 images and 33 alloys, with a varying number of images for each alloy.
Figure 1 shows an example of the SEM scan of one microstructure.
Remark. In the context of commercial alloys, the abbreviation HR typically
stands for Heat-Resistant. This designation is often used to identify alloys that
are specifically designed to withstand high temperatures while maintaining their
mechanical properties and corrosion resistance. The abbreviation CPJ stands for
Copper-Phosphorus-Joint alloys. These are specialized brazing alloys used for join-
ing copper and copper-based materials. The term P92 refers to a specific type of
alloy used in high-temperature applications, particularly in the power generation and
petrochemical industries. P92 alloy is a type of creep-resistant, ferritic-martensitic
steel, known for its excellent high-temperature strength, oxidation resistance, and
corrosion resistance. It is commonly used in the manufacturing of components like
pipes, tubes, and fittings in high-temperature steam and pressure environments.

Fig. 1 Sample SEM scan of a steel microstructure. It exhibits a resolution of 630 × 592 px, while
the scale bar extends to 100 µm

1.2 Deep Learning

The introduction of data science, especially deep learning (DL), to the discipline
of materials science has been revolutionary [1]. DL is a subset of machine learning
(ML) that uses artificial neural networks (NNs), structures of repeating computational
layers that perform linear and non-linear operations to transform input data into a
new, useful representation and finally extract knowledge. It is used to solve several
types of problems, such as computer vision and natural language processing, in
various disciplines, such as medicine, economics, earth sciences, and, of course,
computational materials science. In computational materials science, deep learning
is primarily used to accelerate the process of materials optimization and discovery,
which includes problems related to forward simulation and inverse design, to better
understand or exploit the property-structure-process linkages [5].
In theory, sufficiently deep DL models can approximate any function. They are
therefore quite powerful. However, the goal is not only to fit the input data, called
the training data. A model is considered usable only if it is generalizable, that is, if it
also performs well on out-of-distribution data. Therefore, models are tested on test
data, and the performance metrics associated with this test data indicate how well the
model may perform in the wild. Training DL models also requires a third data set,
called the validation set, to tune the hyperparameters associated with the model, such
as the number of epochs, which is the number of times the model processes the whole
dataset to learn its parameters or weights, or the learning rate, an optimization
parameter that governs the step size the optimization algorithm takes. In supervised
deep learning, at every optimization step, the deep learning model outputs its
predictions. These are compared against the ground truth via the loss function;
minimization of the loss indicates a good fit. An optimization step is taken in the
direction that reduces the loss. After several epochs and consequent weight updates,
the model finds the optimal weights that minimize the loss function, and the problem
is considered solved.
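As a generic illustration of this supervised loop (names are illustrative; this is not
the code generated for this chapter), one training epoch in PyTorch has the following
shape:

import torch

def train_one_epoch(model, loader, loss_fn, optimizer, device):
    # One pass over the training data: predict, compare to ground truth, update weights
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()            # reset gradients from the previous step
        outputs = model(images)          # model predictions
        loss = loss_fn(outputs, labels)  # compare predictions against the ground truth
        loss.backward()                  # backpropagate the loss
        optimizer.step()                 # take an optimization step that reduces the loss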

1.3 Transfer Learning

Although the amount of data in materials science is on the rise, this increase mainly
corresponds to certain subsets of materials classes and properties. Materials experi-
ments are expensive.
Most of the big data available originates from computations; e.g., DFT databases,
finite element analysis. Materials data is typically not big data for many interesting
problems. This prevents materials scientists from utilizing the full potential of DL.

For instance, DL cannot be used for several materials problems where the available
data is scarce because DL is typically data-hungry. However, some DL models are
transferable. For instance, the shallower layers of a convolutional neural network
(CNN) learn more general concepts about an image. In the deeper convolutions,
the kernels reflect more specific knowledge. When two problems are comparable,
a DL model trained for the first problem can be used to fine-tune a model for the
second problem. This is called Transfer Learning (TL), and it is an immensely useful
approach when one has big data for one problem and small data for the problem at
hand—a common scenario in materials science. Refer to [2] for a thorough discussion
on TL.
Because the microstructures of steels look quite similar within the specific class
of steel, predicting with small data alone is challenging—hence, we use TL. We
use ResNet-50 and DenseNet-121, both trained on millions of images from the
ImageNet data, as the two pre-trained models of the ensemble [3, 6]. ResNet-50’s
architecture includes residual connections that help mitigate the vanishing gradient
problem, thereby improving learning in deep networks. DenseNet-121, on the other
hand, features dense connectivity, where each layer is connected to every other layer,
which enhances feature propagation and reduces the number of parameters, leading
to improved performance. Owing to the architectural differences, the two models
may produce different outputs. By combining them in an ensemble, we combine the
strengths of both models.
ResNet-50 and DenseNet models are a good choice because our problem is essen-
tially an image classification problem, and these models, having their weights trained
on ImageNet, should be able to recognize the general image patterns and features.
When doing TL, the pre-trained model is stacked with new layers, or, at minimum,
stacked with only one layer, to match the output dimensions of the second task. Here,
only the final fully connected layer is replaced by a layer that outputs a tensor of
size matching the number of alloy types. Because ImageNet likely does not contain
many images closely resembling alloy microstructures, we fine-tune the weights of
the output layer and the inner layers. To do this, we keep the default setting for
freezing, which is unfrozen. The overall model, with weights imported from the base
pre-trained models and fine-tuned on our small data set, solves our problem (Fig. 2).
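A minimal PyTorch sketch of this final-layer replacement (assuming 33 alloy classes
and a recent torchvision version; an illustration, not the verbatim generated code)
looks as follows:

import torch.nn as nn
from torchvision import models

num_classes = 33  # number of alloy types in the data set

# ResNet-50 pre-trained on ImageNet; replace the final fully connected layer
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = nn.Linear(resnet.fc.in_features, num_classes)

# DenseNet-121 pre-trained on ImageNet; its output layer is named 'classifier'
densenet = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
densenet.classifier = nn.Linear(densenet.classifier.in_features, num_classes)

# All parameters remain unfrozen (the default), so the inner layers are fine-tuned as well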

Fig. 2 Transfer Learning. A model trained for Task A on big data is used to build another model
for another task, Task B, which is fine-tuned on the smaller data set. In addition to the previous
model, the new model at minimum has an output layer corresponding to the output dimensions for
Task B

2 Formulation of Coding Task

2.1 Prompt Engineering Approach

Before we describe the prompt, we provide a brief discussion of the prompt-
engineering approach. Although a concise problem statement may be sufficient, the
resulting code can have problems. At best, GPT-4 leaves certain programming
choices to the user and tells them when which choice is relevant and how to implement
it. At worst, the user has to troubleshoot and re-prompt to fix errors. GPT-4 can be
made more aware, thereby reducing such errors, by describing all user expectations,
including implementation details.
If the user does not know the implementation details, such as the data splitting ratio
for training, validation, and test sets, which optimization method to use, or whether
techniques like data augmentation should be implemented, they should either source
this information from prevalent research or, better yet, use GPT-4. Give it the problem
statement and query it about the implementation details. For our problem, GPT-4 gave
correct suggestions, such as data augmentation and ensemble techniques, along with
what models to use. Include this information in the main prompt. A sufficiently
detailed prompt is necessary to avoid numerous iterations of supplemental prompts
and code updates, where consistency problems like changes in variable names or
implementation logic may occur.

2.2 Prompt

We now provide the prompt given to GPT-4. It includes a detailed description of the
data and the model. We also provide the following implementation details: instructions
on data splitting, the DL framework of choice (PyTorch), the optimization method,
the learning rate scheduler (here, ReduceLROnPlateau is used; it decreases the
learning rate by a given factor if the validation loss plateaus for a given number of
epochs), the GPU device (MPS is explicitly mentioned because this program is initially
built for a Mac), and visualization instructions. We also include explicit instructions
to apply data augmentation and ensembling. Data augmentation is a technique wherein
a random slice of the training data is transformed in simple ways, such as flipping,
rotation, or color adjustment. The increased variance of the data helps prevent the
model from overfitting to the training set by encouraging it to learn more general
features. Different models have different strengths and weaknesses, and ensembling
methods combine the outcomes of these models, thereby combining their strengths and
mitigating their weaknesses.
We aim to generate fully functional code through only one main prompt. However,
a few supplemental prompts were necessary. These are discussed in Sect. 5.

3 Code Listing

Based on the prompt described in Fig. 3, GPT-4 generated the code listed here.
# Import necessary libraries
import os
import copy
import pandas as pd
import random
import numpy as np
from sklearn.model_selection import train_test_split
import torch
from torchvision import transforms, models
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
from PIL import Image
from sklearn.preprocessing import LabelEncoder

# Set random seeds for reproducibility
random_seed = 99
torch.manual_seed(random_seed)
torch.cuda.manual_seed(random_seed)
np.random.seed(random_seed)
random.seed(random_seed)

# Data Preparation

# Directory path
base_path = '/Users/adityadeshmukh/Desktop/RJSC'

# Alloy folders
alloy_folders = {
    'hr_alloys20220208': 'hr',
    'CPJ_alloys': 'cpj',
    'P92 OTHER': 'p92'
}

# Creating a dataframe
data = {'alloy': [], 'image': [], 'alloy type': []}

# Looping through each main folder and its subfolders
for folder, alloy in alloy_folders.items():
    folder_path = os.path.join(base_path, folder)
    for subfolder in os.listdir(folder_path):
        subfolder_path = os.path.join(folder_path, subfolder)
        if os.path.isdir(subfolder_path):
            for file in os.listdir(subfolder_path):
                if file.endswith('.bmp'):
                    file_path = os.path.join(subfolder_path, file)
                    data['alloy'].append(alloy)
                    data['image'].append(file_path)
                    data['alloy type'].append(subfolder)

# Converting to pandas DataFrame
df = pd.DataFrame(data)
print(df.head())

# Label Encoding: Convert string labels to numerical labels (which is what the model expects)
label_encoder = LabelEncoder()
df['alloy type'] = label_encoder.fit_transform(df['alloy type'])

# Splitting data
# the first line splits data into training:test sets in the ratio 70:30.
# the second line splits the test set into two halves: validation set and test set.
# so that ultimately we have train:valid:test = 70:15:15
train_df, test_df = train_test_split(df, test_size=0.3,
                                     stratify=df['alloy type'], random_state=random_seed)
val_df, test_df = train_test_split(test_df, test_size=0.5,
                                   stratify=test_df['alloy type'], random_state=random_seed)

# Data augmentation and normalization.
# Two different sets of transformations are necessary because data augmentation transformations
# only apply to the training set, not test and validation.
data_transforms = {
    'train': transforms.Compose([
        transforms.ToTensor(),
        # Crop the top and bottom 25 pixels to remove miscellaneous SEM annotations
        transforms.Lambda(lambda x: x[:, 25:-25, :]),
        # Randomly crop and resize images to 224x224 (required for pre-trained models)
        transforms.RandomResizedCrop(224),
        # Randomly apply horizontal flipping and rotation for data augmentation:
        # Each time an image is loaded during training, the DataLoader applies the transformations randomly.
        # Essentially, this means the model sees slightly different versions of the training images throughout the
        # training process, which helps it generalize better by learning from a more diverse set of data representations.
        transforms.RandomHorizontalFlip(),
        transforms.RandomRotation(15),
        # Normalize images based on pre-defined mean and standard deviation of ImageNet dataset
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x[:, 25:-25, :]),
        transforms.Resize((224, 224)),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# Custom dataset class to load images and apply transformations
class SteelDataset(torch.utils.data.Dataset):
    def __init__(self, dataframe, transform=None):
        self.dataframe = dataframe
        self.transform = transform

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        img_name = self.dataframe.iloc[idx, 1]
        image = Image.open(img_name)
        label = self.dataframe.iloc[idx, 2]

        if self.transform:
            image = self.transform(image)

        return image, label


# Datasets
train_dataset = SteelDataset(train_df, transform=data_transforms['train'])
val_dataset = SteelDataset(val_df, transform=data_transforms['val'])
test_dataset = SteelDataset(test_df, transform=data_transforms['val'])

# DataLoaders in PyTorch are iterators that enable efficient loading of data during the training, validation,
# and testing phases of a machine learning model.
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True,
                          worker_init_fn=lambda _: np.random.seed(random_seed))
val_loader = DataLoader(val_dataset, batch_size=batch_size,
                        worker_init_fn=lambda _: np.random.seed(random_seed))
test_loader = DataLoader(test_dataset, batch_size=batch_size,
                         worker_init_fn=lambda _: np.random.seed(random_seed))

# Device configuration
# if your device has a gpu, it needs to be declared explicitly. For macbooks, the gpu is mps.
# if you use windows or linux, you should try cuda instead.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Load and modify pretrained models
def load_model(model_name, num_classes):
    if model_name == "resnet":
        # When pretrained=True, it means that the model is initialized with weights that have already been learned.
        model = models.resnet50(pretrained=True)
        # This line extracts the number of input features to the last fully connected layer (fc)
        num_ftrs = model.fc.in_features
        # This line replaces the last fc with a new fc tailored for our task by setting its output to number of classes,
        # or number of steel alloys.
        model.fc = nn.Linear(num_ftrs, num_classes)
    elif model_name == "densenet":
        model = models.densenet121(pretrained=True)
        num_ftrs = model.classifier.in_features
        model.classifier = nn.Linear(num_ftrs, num_classes)
    return model


# Number of classes
num_classes = len(df['alloy type'].unique())

# Load models
resnet = load_model("resnet", num_classes).to(device)
densenet = load_model("densenet", num_classes).to(device)
torch.manual_seed(random_seed)
torch.cuda.manual_seed(random_seed)

# Loss and optimizer
# The loss quantifies how well the model's predictions match labels in the training data.
# The optimizer, here stochastic gradient descent, is an algorithm that adjusts the parameters (weights and biases)
# of the neural network during training to minimize the loss.
criterion = nn.CrossEntropyLoss()
# Momentum is generally set to 0.9. LR, the learning rate, will be tuned.
optimizer_resnet = optim.SGD(resnet.parameters(), lr=0.001, momentum=0.9)
optimizer_densenet = optim.SGD(densenet.parameters(), lr=0.001, momentum=0.9)

# Learning rate scheduler.
# LR is tuned in this way: if the loss plateaus for 5 epochs, LR is reduced by a factor of 0.1
scheduler_resnet = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer_resnet, 'min', factor=0.1, patience=5, verbose=True)
scheduler_densenet = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer_densenet, 'min', factor=0.1, patience=5, verbose=True)

# Training function
# in machine learning, data is fed to the model as dataloaders for a number of iterations, called epochs.
# In each training epoch, the model tries to learn optimum weights that minimize the training loss. This model is evaluated
# on validation data in a validation epoch. Hyperparameters are tuned in order to obtain lower validation loss.
# Accuracies are also stored for each epoch. The model associated with the best validation
# accuracy is returned, along with the entire history of training loss, training accuracy, validation loss, and
# validation accuracy.
def train_model(model, dataloaders, criterion, optimizer, scheduler, num_epochs=25):
    best_acc = 0.0
    # Initialize arrays to store metrics
    train_losses, val_losses, train_accs, val_accs = [], [], [], []

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode. Here weights are learned.
            else:
                model.eval()  # Set model to evaluate mode. Here we only use the learned weights to make predictions.

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero the parameter gradients
                optimizer.zero_grad()

                # Forward
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # Backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.float() / len(dataloaders[phase].dataset)

            # Record metrics
            if phase == 'train':
                train_losses.append(epoch_loss)
                train_accs.append(epoch_acc.item())
            else:
                val_losses.append(epoch_loss)
                val_accs.append(epoch_acc.item())
                scheduler.step(epoch_loss)  # Adjust learning rate based on validation loss

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # Deep copy the best-performing model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    print('Best val Acc: {:4f}'.format(best_acc))

    # Load best model weights
    model.load_state_dict(best_model_wts)

    return model, train_losses, val_losses, train_accs, val_accs

# this function plots train_losses, val_losses and train_accs, val_accs in two graphs.
def plot_performance(train_losses, val_losses, train_accs, val_accs, file_name):
    np.random.seed(random_seed)
    plt.figure(figsize=(12, 5))

    plt.subplot(1, 2, 1)
    plt.plot(train_losses, label='Training loss')
    plt.plot(val_losses, label='Validation loss')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(train_accs, label='Training accuracy')
    plt.plot(val_accs, label='Validation accuracy')
    plt.title('Training and Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.savefig(file_name)


# Train models
dataloaders = {'train': train_loader, 'val': val_loader}
print('\n RESNET-50 ________________________________\n')
resnet, resnet_train_loss, resnet_val_loss, resnet_train_acc, resnet_val_acc = train_model(
    resnet, dataloaders, criterion, optimizer_resnet, scheduler_resnet, num_epochs=500)
print('\n DENSENET-121 ______________________________\n')
densenet, densenet_train_loss, densenet_val_loss, densenet_train_acc, densenet_val_acc = train_model(
    densenet, dataloaders, criterion, optimizer_densenet, scheduler_densenet, num_epochs=500)

torch.save(resnet.state_dict(), 'resnet_model.pth')
torch.save(densenet.state_dict(), 'densenet_model.pth')

# Ensemble
# we average the outputs of our two models and use this averaged output to make predictions.
class AveragingEnsemble(nn.Module):
    def __init__(self, modelA, modelB):
        super(AveragingEnsemble, self).__init__()
        self.modelA = modelA
        self.modelB = modelB

    def forward(self, x):
        # Get predictions from both models
        outputA = self.modelA(x)
        outputB = self.modelB(x)

        # Average the predictions
        average_output = (outputA + outputB) / 2
        return average_output

ensemble_model = AveragingEnsemble(resnet, densenet)


# Evaluate Ensemble
def evaluate_model(model, dataloader):
    model.eval()
    all_preds = []
    all_labels = []
    with torch.no_grad():
        for inputs, labels in dataloader:
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            all_preds.extend(preds.tolist())
            all_labels.extend(labels.tolist())

    accuracy = accuracy_score(all_labels, all_preds)
    return accuracy


test_accuracy = evaluate_model(ensemble_model, test_loader)
print('Test Accuracy of Ensemble: {:.4f}%'.format(test_accuracy * 100))

# Plotting functions
plot_performance(resnet_train_loss, resnet_val_loss,
                 resnet_train_acc, resnet_val_acc, "res1.png")
plot_performance(densenet_train_loss, densenet_val_loss,
                 densenet_train_acc, densenet_val_acc, "res2.png")
Code Listing 6.1 Program by ChatGPT-4 for classifying microstructure

I require a Python program for classifying steel microstructure images according
to their alloy type using transfer learning. A description of the data
as well as the model is provided below.
Data
I have three folders of steel microstructure images in the directory
/Users/adityadeshmukh/Desktop/RJSC. Each folder is described here as follows:
• The folder 'hr_alloys20220208' contains the initial microstructure for HR
alloys.
– It contains 11 subfolders with subfolder titles corresponding to a specific
HR alloy.
– The '.bmp' files in each subfolder are the raw microstructure images
for that specific alloy.
• The folder 'CPJ_alloys' contains the initial microstructure for CPJ alloys.
– It contains 18 subfolders with subfolder titles corresponding to a specific
CPJ alloy.
– The '.bmp' files in each subfolder are the raw microstructure images
for that specific alloy.
• The folder 'P92 OTHER' contains the initial microstructure for P92 alloys.
– It contains 4 subfolders with subfolder titles corresponding to a specific
P92 alloy.
– The '.bmp' files in each subfolder are the raw microstructure images
for that specific alloy.
Note that all three folders and subfolders have differently named and different
numbers of items. You'd first make a pandas dataframe with three columns:
'alloy' (cpj, hr, or p92), 'image' (path to the .bmp image file), 'alloy type'
(name of the subfolder wherein the image exists, which is also the specific
alloy).
Model
• Transfer learning requires a pre-trained model. Comment on the feasibility
of ResNet-50 and DenseNet-121 for this task. Fine-tune on all two and
create an ensemble.
• Use PyTorch framework.
• Set GPU device as ‘mps’.
• While fine-tuning, ensure that there is no freezing.

Fig. 3 Prompt for classifying microstructures



• Split the data into training (70%), validation (15%), and test (15%) sets
stratified according to the label 'alloy type.' Use train_test_split twice to
achieve this.
• Do data augmentation.
• Use the validation loss to find the optimum number of epochs. Also, use
ReduceLROnPlateau scheduler to tune the learning rate. Set the number
of epochs to 500 for each model.
• Record training loss, training accuracy, validation loss, validation accu-
racy for each epoch for each model. Ultimately, record test accuracy for
the ensemble. Write code to make relevant graphs.

Do not give me only the code skeleton; I want the whole program in one
code block.


4 Performance

Figures 4 and 5 show the performance of the fine-tuned ResNet-50 model and the
fine-tuned DenseNet-121 model respectively. For both models, with an increasing
number of epochs, the training loss and validation loss decrease, and consequently
the training accuracy and validation accuracy increase. At the last epoch, we observe
a validation accuracy of 97.6190% and 98.4127% for ResNet-50 and DenseNet-
121 respectively, as shown in Table 1, with the test accuracy of the ensemble being
99.2063%.

Fig. 4 Performance of ResNet50 Model. (Left:) Training and validation loss. (Right:) Training and
validation accuracy

Fig. 5 Performance of DenseNet121 Model. (Left:) Training and validation loss. (Right:) Training
and validation accuracy

Table 1 Model performance comparison

                 ResNet-50   DenseNet-121   Ensemble
Validation Acc   97.6190%    98.4127%       -
Test Acc         -           -              99.2063%

Figure 6 displays a random sampling of 12 images from the test set. The ground
truths and the predicted labels are in good agreement. Note that the images appear
different from the sample image shown in Fig. 1 because of ImageNet normalization;
this is a necessary pre-processing step when using pre-trained models like
ResNet-50.
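To display such images with their original appearance, the normalization can be inverted; a minimal helper for this (our illustration, assuming channel-first NumPy arrays, not part of the generated code):

import numpy as np

# ImageNet statistics used by the 'val' transform in the listing above
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

def denormalize(img_chw):
    """Map a normalized (3, H, W) array back to a displayable (H, W, 3) image."""
    img = img_chw.transpose(1, 2, 0)  # channels last for matplotlib
    img = img * std + mean            # undo (x - mean) / std
    return np.clip(img, 0.0, 1.0)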
The observation that the validation loss is almost consistently smaller than the
training loss on the left of Figs. 4 and 5 can have different reasons. Augmented
samples may be harder for the model to learn, and because the augmentation
transformations are applied only to the training set and not at all to the validation
set, the validation results remain unaffected. Furthermore, smaller datasets
(validation) have smaller intrinsic variance than larger datasets (training); recall
their ratio of 70:15. Or it may simply be a coincidence, in that this particular data
split favors such behavior.

Fig. 6 A random sampling of 12 images from the test set with their ground truths and predicted
labels

5 Discussion

The results show that GPT-4 is successful in generating working code for the problem
of image classification on SEM scans of steel microstructures using TL. While almost
the entirety of the code is generated via the main prompt, some minor tweaking
through supplemental prompts was necessary to arrive at fully functional code.

5.1 Errors

The errors are described below in three categories: major errors, i.e. errors which
require a correction for the code to run; minor errors, i.e. errors with which the code
will run but perform sub-optimally; and ignored tasks.

5.1.1 Major Errors

• Import statements necessary for certain functions are not always written by default
(e.g., from PIL import Image).
• While creating the dataframe, traversing the relevant directories requires exception
handling to ignore the Not a directory error caused by the .DS_Store
files present on macOS. Note that prior handling of such an error is a foresight
that should not be expected of LLMs because this is an exception, not a norm; a
minimal guard is sketched after this list. Therefore, this is not necessarily a drawback.
• Label encoding is required because the original labels are strings, whereas PyTorch
requires numbers. The error AttributeError: 'tuple' object has no
attribute 'to' is encountered when executing labels = labels.to
(device). This suggests that labels is a tuple where a PyTorch tensor is expected.
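A minimal guard against such stray files (our illustration under the same folder layout; not the exact fix applied in the listing above):

import os

def collect_bmp_files(folder_path):
    """Collect .bmp paths while skipping non-directories such as .DS_Store."""
    file_paths = []
    for subfolder in os.listdir(folder_path):
        subfolder_path = os.path.join(folder_path, subfolder)
        if not os.path.isdir(subfolder_path):
            continue  # ignore stray files so that listing a 'subfolder' cannot fail
        for file in os.listdir(subfolder_path):
            if file.endswith('.bmp'):
                file_paths.append(os.path.join(subfolder_path, file))
    return file_paths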

5.1.2 Minor Errors

• One more transformation is necessary in image pre-processing to crop out the
top and bottom segments of the image because those strips contain miscellaneous
SEM annotations. GPT-4 cannot, and is not expected to, fix such specific errors
that it is not even aware of. However, the fix, described later, is straightforward.
• The learning rate scheduler step is initially taken on the training loss instead of
the validation loss.
• The initial ensemble model uses a neural network with a linear layer and ReLU
that combines the individual outputs of the two models, ResNet-50 and DenseNet-121.
It performs quite poorly and is replaced with an averaging operation upon request.
Transfer Learning for Alloy Classification Based on Microstructure Images 143

• Default values for the parameters of the LR scheduler are used. We test several
models to tune these values. GPT-4 does not provide unique instructions for
choosing these values; such optimization requires scanning the hyperparameter
space with repeated trials (a sketch follows below), which obviously falls beyond
what LLMs can do on their own.
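The sketch below illustrates such a scan over the scheduler parameters (our illustration; train_and_validate is a hypothetical stand-in for one full training run, and the tiny nn.Linear model is a placeholder for the fine-tuned network):

import itertools
import torch.nn as nn
import torch.optim as optim

def train_and_validate(model, optimizer, scheduler):
    # hypothetical placeholder: run training and return the best validation accuracy
    return 0.0

results = {}
for factor, patience in itertools.product([0.1, 0.5], [3, 5, 10]):
    model = nn.Linear(10, 3)  # placeholder network
    optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, 'min', factor=factor, patience=patience)
    results[(factor, patience)] = train_and_validate(model, optimizer, scheduler)

best_factor, best_patience = max(results, key=results.get)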

5.1.3 Ignored Tasks

Training loss, training accuracy, validation loss, and validation accuracy for each
epoch are not recorded. The visualization task is fully ignored.

5.2 Fixes

The above errors have easy fixes. GPT-4 is good at troubleshooting; one only needs
to copy and paste the error reported by the Python IDE into GPT-4, which then gives
one or more potential solutions. Some errors are quite elusive. GPT-4 does not figure
out the case for label encoding right away. It first examines the SteelDataset class's
__getitem__ method and the unpacking of labels inside the training loop. When we
explicitly state that both of these are correct, it lists more possibilities. One of these
is label transformation, which reminds us that the labels are originally strings,
whereas PyTorch requires labels to be numerically encoded. GPT-4 finally suggests
using label encoding. This back-and-forth could have been avoided by pointing out
the string labels right in the main prompt. When fixing other problems, such as
replacing the ensemble function or obtaining code for a previously ignored task, a simple
instruction is usually sufficient. Table 2 lists these supplemental prompts.
In addition to these fixes, the code is slightly modified to make the results reproducible:
as long as the random seed, whose purpose is to keep the random numbers
generated in the program consistent, is left unchanged, one gets the same results
every time. Please note that we train the models for 500 epochs to observe whether any
learning happens over the long run. However, similar results are achievable with
only 50 epochs.

Table 2 Modifications and supplemental prompts

Modification             Supplemental prompt
New ensemble function    "Give me an ensemble that simply gives the average"
LR step correction       "Fix this to take scheduler step based on validation loss"
Visualization function   "I need to store arrays of validation loss, validation accuracy,
                         training loss, training accuracy for each epoch so that I can print
                         them later. Also implement that and give me code to visualize the
                         performance"

6 Conclusion

We obtain working code through GPT-4 to predict an alloy from an SEM scan of
its microstructure. The resulting model is an ensemble of two pre-trained models,
ResNet-50 and DenseNet-121, fine-tuned on a training set that includes SEM scans
of microstructures of 33 different types of 9% Cr steels. The model achieves an
accuracy of 99.20% on the test set, indicating that it performs quite well.
GPT-4 is good at unpacking a prompt, understanding it, writing code, and explaining
the code both in a broad sense and line by line when asked to; the generated
code is also well documented. It helps to have a sufficiently detailed prompt, rich
with information on not only the nature of the problem but also the implementation
details. Troubleshooting is easy: simply provide GPT-4 with the traceback of
the errors encountered. Code that needs to be modified or rewritten can be edited
with additional simple prompts.
By solving this problem, we demonstrate that, with human intervention, GPT-4
is capable of generating code for common computer vision tasks such as image
classification in computational materials science.

References

1. Agrawal, A. & Choudhary, A. Perspective: Materials informatics and big data: Realization of
the "fourth paradigm" of science in materials science. APL Materials. 4 (2016)
2. Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H. & He, Q. A Comprehensive
Survey on Transfer Learning. Proceedings of the IEEE. 109, 43-76 (2021)
3. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770-778 (2016)
4. Rozman, K., Doğan, Ö., Chinn, R., Jablonski, P., Detrois, M. & Gao, M. Dataset for machine
learning of microstructures for 9% Cr steels. Data in Brief. 45 pp. 108714 (2022)
5. Choudhary, K. et al. Recent Advances and Applications of Deep Learning Methods in
Materials Science. npj Computational Materials. 8, 1-26 (2022). https://doi.org/10.1038/s41524-022-00734-6
6. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Densely Connected Convolutional
Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).
https://doi.org/10.1109/cvpr.2017.243
Transfer Learning for Microstructure
Image Segmentation

Rahul Narkhede and Bernhard Eidel

Abstract Microstructure segmentation is a crucial task in materials science which
facilitates detailed material characterization and establishing processing-structure-property
linkages. Image segmentation is a classical technique in computer vision for
which several deep learning models exist. Furthermore, transfer learning can leverage
the performance of these deep learning models on new datasets by initializing them
with parameters pre-trained on large datasets. In this chapter, transfer learning is
applied for the segmentation of nickel-based superalloy microstructure images using a
model pre-trained on a large dataset of microscopy images called MicroNet. GPT-4
on ChatGPT Plus is instructed to generate Python code for performing this task.
With a suitably designed sequence of prompts, GPT-4 provides promising results in
implementing the task. Aspects of prompt design, handling errors and testing the
outcome are considered as well.

1 Introduction

Image segmentation is a cornerstone technique in visual data analysis, enabling the
differentiation of objects within an image based on distinct characteristics. This technique
is particularly crucial in materials science for the quantification and analysis of
microstructures, which is fundamental for understanding the properties and behaviors
of materials. Traditionally, segmentation has been performed manually or through
semi-automated methods, which can be labor-intensive and prone to error. With
the advent of computer vision and machine learning, more sophisticated and automated
methods have been developed, significantly enhancing accuracy and efficiency
[1, 2].
Transfer learning is a powerful strategy in machine learning where a model devel-
oped for one task is repurposed for a second, related task. This approach is espe-
cially valuable in environments where data are scarce or where training a new model
from scratch is computationally prohibitive. By utilizing models pre-trained on large
datasets, transfer learning not only conserves resources but also enhances model
accuracy, making it a vital tool in domains where data acquisition is challenging [3].
Among the several deep learning architectures available, the ResNet50 [4] architecture
is known for its depth and its use of residual connections, which enabled
training significantly deeper networks than previously feasible. This architecture is
adept at extracting detailed features that are crucial for accurate segmentation, and
is hence often employed as an encoder in segmentation tasks. The encoder transforms
input images into rich, feature-dense representations that allow the model to capture
the image context. These representations are then translated back into meaningful
segmentation maps using decoder architectures like UNet++. UNet++ [5] is a
refinement of the classic UNet architecture and shows improved segmentation
accuracy. The overall framework of the microstructure segmentation task is
shown in Fig. 1.
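As a minimal sketch of this encoder-decoder pairing (our illustration; it is initialized here with generic 'imagenet' encoder weights, whereas the chapter later loads MicroNet-pretrained weights from a URL):

import segmentation_models_pytorch as smp

# UNet++ decoder on top of a ResNet50 encoder; three output channels for the
# matrix, secondary and tertiary phases. The encoder weights are a placeholder.
model = smp.UnetPlusPlus(
    encoder_name='resnet50',
    encoder_weights='imagenet',   # MicroNet weights are swapped in later
    in_channels=3,
    classes=3,
    activation=None,              # raw logits; sigmoid/softmax applied in the loss
)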
The task in this chapter is to apply transfer learning for the segmentation of microstructure
images of nickel-based superalloys into matrix, secondary and tertiary phases.
In doing so, it uses the ResNet50 architecture as the encoder and UNet++ as the
decoder. The encoder is pre-trained on a large microscopy dataset named MicroNet.
The application of a pre-trained encoder thus implies the use of transfer learning for the
segmentation of the nickel-based superalloy dataset. The work in this chapter is largely
based on the example notebook for image segmentation named multiclass_segmentation
_example.ipynb provided as supplementary material with [6]. This segmentation
task encompasses the following key steps:

• Data Preparation: Automated extraction and association of images with corre-


sponding masks to establish ground truths for supervised learning.
• Data Augmentation: Application of transformations such as rotations and flips,
and adjustments in brightness and contrast to enhance model robustness.
• Model Setup: Building a model based on the ResNet50 encoder pre-trained on
the MicroNet dataset and a UNet++ decoder.
• Loss function: Integration of Dice Loss and Binary Cross Entropy (BCE) Loss to
optimize both the segmentation overlap and pixel-wise classification accuracy.
• Performance Metric: Adoption of Intersection over Union (IoU) to quantify
model performance, a standard metric in segmentation model evaluation.
• Model Training and Early Stopping: Implementation of early stopping based
on IoU improvement to prevent overfitting.
• Testing and Visualization: Evaluation of the model on unseen data and visual-
ization of the results to assess performance qualitatively.

Fig. 1 Schematic showing the use of encoder pre-trained on the MicroNet dataset for microstructure
image segmentation via transfer learning. The encoder (top), like ResNet50, is a convolutional neural
network followed by a dense neural network classifier (shown in the gray rectangular box on the
top), which is trained to classify microscopy images into different material classes. The encoder
is pre-trained on the MicroNet dataset containing more than 100,000 microstructure images. By
transfer learning, the pre-trained encoder is then used in the encoder-decoder segmentation model
(shown in the bottom), which is then trained on a small dataset of microstructure images to segment
the microstructure into separate classes

2 Prompt

The Python code for the segmentation of microstructure images is generated by
GPT-4. This task requires pre-processing of data, building the deep learning model
with pre-trained weights, implementation of a training routine with multiple intricacies,
evaluating model performance and visualizing the predictions. It also involves
implementing methods from multiple Python libraries. Considering these aspects, it
is decided to instruct GPT-4 in multiple sub-tasks which include the details of all
the mentioned steps. Moreover, it is also observed that providing all the details
of this segmentation task as a one-thrust prompt resulted in the generation of incomplete
Python code that also overlooked some instructions. Using fundamental aspects
of prompt engineering, and at times using GPT-4 to summarize its steps in a refined
prompt, a well-instructed and detailed sequence of prompts is drafted. This process
involved multiple iterations and resolving several errors, which are also discussed in
the next sections.
The sequence of prompts is based on the steps involved in a typical transfer
learning and image segmentation task:
• System prompt with general instructions for all sub-tasks
• Loading and pre-processing data

– Importing microstructure images and assigning classes to the masks.


– Augmenting the dataset by randomized image transformations.
– Loading the augmented data for the model training and inference.

• Model setup

– Building the model with pre-trained weights.


– Creating loss function and metric to measure model performance.

• Training and testing model

– Setting up a training and validation loop.


– Evaluating model performance and visualizing prediction accuracy.

Considering these steps, six prompts have been used to generate the desired Python
code for the segmentation task:

• Prompt 1: System prompt


• Prompt 2: Dataset configuration
• Prompt 3: Data augmentation and data loading
• Prompt 4: Model setup, loss function and metric
• Prompt 5: Training loop setup
• Prompt 6: Model testing and visualizing predictions.

2.1 System Prompt

The system prompt orients the chatbot to the task. In this system prompt, ChatGPT
4 is instructed to adopt a persona that has the requisite library expertise and background
knowledge to execute the segmentation task. Through such a system prompt, the chatbot
is also given specific instructions to mitigate persistently observed errors in the generated
code.
In response to the prompt shown in Fig. 2, ChatGPT 4 acknowledges that it is prepared
to help the user with the task and that it anticipates further instructions.

Prompt 1
You are an expert computer vision and machine learning engineer. You
have the fundamental knowledge of deep learning, the methods and their
working, popular deep learning architectures for image classification and
segmentation. You are specifically well versed in PyTorch and allied libraries
like albumenations, segmentation models pytorch, DataLoader and so on.
You can load pre-trained models from urls and fine-tune them for the specific
task at hand. You write an error-free code. Specifically, your code encounters
no issues in the dimensions of the various tensors and arrays in the deep
learning process. You analyze code for any errors before presenting it. You
generate complete code with documentation. You write code for the part that
you are asked for, no further steps. Only what’s specifically instructed. Do not
show sample usage until asked for. You track matrix dimensions throughout
the code and handle any dimension errors. You also ensure that any torch
tensor is being moved to the CPU before it is converted to a NumPy array.
Any array dimension errors will leave a poor score on your performance.
The task I want to achieve is complex and I shall tell you the instructions in
a step-wise fashion.

Fig. 2 System prompt provided to ChatGPT 4 before specific instructions for the segmentation
task

2.2 Loading and Pre-processing Data

As the first and one of the most crucial steps of the segmentation task, ChatGPT 4 is
provided with all the details to load the data in a suitable format for further processing
and for applying the deep learning model for segmentation.
The prompt begins with a brief description of the overall segmentation task with
details about the model architecture. This short description acts as context for the
chatbot in its subsequent responses. It is also asked to add randomization seeds to
ensure that the results are reproducible.
The location of the training, validation and testing data is specified in the prompt.
The dimensions of the images and masks and their file formats are also specified.
Specific code instructions to read the images and masks, and then to assign classes to
the masks, are given. These detailed instructions are important to maintain the correct
assignment of annotated masks to the images.
Pre-processing the images involves data augmentation and normalizing the image
data. Data augmentation in image segmentation involves applying transformations
like flipping, rotation, and adjustments to contrast and brightness to existing images,
enhancing dataset diversity and model generalization without increasing the number
of original images. This method helps prevent overfitting, especially with smaller or
less diverse datasets. After augmentation, the images are normalized using the mean
and standard deviation of the pre-trained model, in this case, the ‘ResNet50’ model
pre-trained with the ‘imagenet’ dataset.

Fig. 3 Prompt with instructions to get the data

The pre-processed data is then passed to a dataloader. Dataloaders efficiently


manage the flow of data during training by batching, shuffling, and preparing data
for input into the model. They handle loading data from the dataset into memory,
applying transformations such as data augmentation on-the-fly, and ensuring that data
is supplied to the model in suitable format for efficient processing, thus optimizing
the use of computational resources.
The specific instructions for pre-processing the data and then creating dataloaders
for training, validation and testing are provided to ChatGPT 4 in the next prompt.
The detailed transformations for pre-processing the data (Fig. 4) and the instructions
to load the final pre-processed data for model training, testing and validation (Fig. 5)
are provided in prompt 3.

Fig. 4 Part one of prompt 3 which provides instructions for pre-processing the data

Fig. 5 Part two of prompt 3 which provides instructions for loading the data and visualizing some
images with their masks

2.3 Model Setup

For the microstructure image segmentation, the 'resnet50' encoder is used with the
'UNet++' segmentation model. The pre-trained encoder weights are downloaded
from a URL in the model setup. The function get_pretrained_microscopynet
_url that generates the URL is used as is from the source [6]. In prompt 4 (Fig. 6),
these details about the architecture and the source URL for the pre-trained encoder
weights are provided, along with further details about the inputs for the model setup,
the activation function of the model output, the usage of a GPU if available, and the
expected output.
To track the loss while training the model, a custom loss function that combines the
Dice and binary cross-entropy (BCE) losses, as used in the example notebook
multiclass_segmentation_example.ipynb, is then created. The model
performance is checked using the intersection over union (IoU) metric, which can
be obtained from the 'segmentation_models_pytorch' library. However, it
is noticed that ChatGPT 4 is not aware of the latest version of the IoU function from
this library. To overcome the persistent error of incorrect syntax in using the IoU
metric, an example usage based on the syntax of the latest version is provided in the
loss functions and metrics part of prompt 4, shown in Fig. 7.
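For orientation, the following is a minimal sketch of such a combined Dice-BCE loss and of the newer IoU syntax (our illustration under stated assumptions, not the code generated from the prompts; the equal weighting of the two loss terms is an assumption):

import torch
import torch.nn as nn
import segmentation_models_pytorch as smp

# Combined Dice + BCE loss, a common pattern for multilabel segmentation
dice_loss = smp.losses.DiceLoss(mode='multilabel', from_logits=True)
bce_loss = nn.BCEWithLogitsLoss()

def criterion(logits, targets):
    targets = targets.float()
    return 0.5 * dice_loss(logits, targets) + 0.5 * bce_loss(logits, targets)

# IoU with the newer segmentation_models_pytorch metrics API: binarize the
# predictions, accumulate confusion-matrix statistics, then reduce to a score
def iou_metric(logits, targets):
    probs = torch.sigmoid(logits)
    tp, fp, fn, tn = smp.metrics.get_stats(probs, targets.long(),
                                           mode='multilabel', threshold=0.5)
    return smp.metrics.iou_score(tp, fp, fn, tn, reduction='micro')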

Fig. 6 Prompt to set up the model



Fig. 7 Prompt to create the loss function and the model evaluation metric

2.4 Training the Model

After setting up the training, validation and test data, the model, the loss function
and the metric, a function to train the model is required. This function puts the
individual components together and creates the training loop. As shown in the prompt
in Fig. 8, ChatGPT 4 is instructed to set up an optimizer, use the losses as the model
criterion, and then set up the training loop. It is also given further details, such as
the use of early stopping by evaluating the IoU metric over the validation data to avoid
overfitting, the use of mixed precision training to reduce memory usage, and saving a
model checkpoint regularly after a given number of epochs. Specific instructions are
also provided to ensure that certain commonly observed errors are eliminated.
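A minimal sketch of such a training loop (our illustration, not the generated code; the helper names and the checkpointing interval are assumptions):

import torch

def train(model, train_loader, val_loader, criterion, iou_metric,
          optimizer, device, max_epochs=100, patience=10, ckpt_every=10):
    use_amp = (device.type == 'cuda')
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    best_iou, epochs_without_improvement = 0.0, 0
    for epoch in range(max_epochs):
        model.train()
        for images, masks in train_loader:
            images, masks = images.to(device), masks.to(device).float()
            optimizer.zero_grad()
            with torch.autocast(device_type='cuda', enabled=use_amp):
                loss = criterion(model(images), masks)  # mixed precision forward
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
        # Validation IoU drives early stopping
        model.eval()
        with torch.no_grad():
            ious = [iou_metric(model(x.to(device)), y.to(device))
                    for x, y in val_loader]
        val_iou = float(sum(ious) / len(ious))
        if val_iou > best_iou:
            best_iou, epochs_without_improvement = val_iou, 0
            torch.save(model.state_dict(), 'best_model.pth')
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping on stagnating validation IoU
        if (epoch + 1) % ckpt_every == 0:
            torch.save(model.state_dict(), f'checkpoint_epoch{epoch + 1}.pth')
    return best_iou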

Fig. 8 Prompt to set up model training



2.5 Evaluating Model Performance on Test Data

The performance of the trained model is evaluated on the test data, again with the IoU
metric, but also with the help of visualizations. For this purpose, a function is created
that runs one forward pass on the test data by instantiating the trained model. In the
forward pass, the loss and the IoU metric over the test samples are calculated and
their average values are returned. To ensure that this is executed without errors, the
prompt includes a breakdown of the steps involved in a forward pass.
The predictions are visualized in two ways: (1) by comparing the true masks of
the test samples with the predicted ones, and (2) by overlaying the predicted masks
over the true masks and marking the true positive, true negative, false positive and
false negative pixels with different colors. The functions for both these visualizations
are generated with the instructions in the prompt shown in Fig. 9.
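As a minimal sketch of the second, overlay-style visualization (our illustration; the particular color coding is an assumption):

import numpy as np

def confusion_overlay(true_mask, pred_mask):
    """Color-code agreement of binary (H, W) ground-truth and predicted masks."""
    overlay = np.zeros((*true_mask.shape, 3), dtype=np.uint8)
    overlay[(true_mask == 1) & (pred_mask == 1)] = (0, 255, 0)  # true positive: green
    overlay[(true_mask == 0) & (pred_mask == 1)] = (255, 0, 0)  # false positive: red
    overlay[(true_mask == 1) & (pred_mask == 0)] = (0, 0, 255)  # false negative: blue
    return overlay  # true negatives remain black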

3 Generated Code and Discussion

The code generated based on the prompts provided for the microstructure image
segmentation is presented in this section. Along with the code listings, the outputs
generated and some remarks on the generated code are also mentioned.

3.1 Loading and Pre-processing Data

From the prompts in Figs. 3, 4 and 5, the following Code Listing 1 was generated.
ChatGPT 4 generates a Python script that imports the required libraries, creates a
function that sets randomization seeds, and then defines a class named
MicrostructureDataset which extracts images and masks according to the
specified details, applies the transformations, and returns the two variables images
and masks containing the extracted data.
In response to the prompt mentioned in Figs. 4 and 5, ChatGPT 4 first creates
the function get_training_augmentation_pmm, which places all the transformations
mentioned in the prompt into the pipeline in the specified sequence. It also
uses the appropriate function get_preprocessing_params to normalize the image
data with the mean and standard deviation of the 'resnet50' model architecture
pre-trained on the 'imagenet' dataset. Similarly, the function get_validation
_augmentation is created, which only normalizes the image data.
After the data augmentation, the chatbot uses the DataLoader class from
PyTorch, along with the previously created MicrostructureDataset, to create
dataloaders for training, validation and testing. Finally, a visualization function is
also created to plot the augmented images and their corresponding masks.
Fig. 9 Prompt to test the trained model and visualize the predictions

import os
import numpy as np
import cv2
import torch
import random
from torch.utils.data import Dataset, DataLoader
from albumentations import Compose, Normalize, Resize
from albumentations.pytorch import ToTensorV2

# Set random seeds for reproducibility
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Load Data from directory

# Paths to dataset
base_directory = '/kaggle/input/ni-super-alloy/Super1'
train_images_dir = os.path.join(base_directory, 'train')
train_masks_dir = os.path.join(base_directory, 'train_annot')
val_images_dir = os.path.join(base_directory, 'val')
val_masks_dir = os.path.join(base_directory, 'val_annot')
test_images_dir = os.path.join(base_directory, 'test')
test_masks_dir = os.path.join(base_directory, 'test_annot')

class MicrostructureDataset(Dataset):
    def __init__(self, images_dir, masks_dir, transform=None):
        self.images_dir = images_dir
        self.masks_dir = masks_dir
        self.transform = transform
        self.images = sorted(os.listdir(images_dir))
        self.masks = sorted(os.listdir(masks_dir))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path = os.path.join(self.images_dir, self.images[idx])
        mask_path = os.path.join(self.masks_dir, self.masks[idx])

        # Read image
        image = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)

        # Read mask
        mask = cv2.imread(mask_path, 1)

        # Extract binary masks for each class
        matrix_mask = np.all(mask == [0, 0, 0], axis=-1).astype(np.uint8)
        secondary_mask = np.all(mask == [255, 0, 0], axis=-1).astype(np.uint8)
        tertiary_mask = np.all(mask == [0, 0, 255], axis=-1).astype(np.uint8)
        masks = [matrix_mask, secondary_mask, tertiary_mask]
        if mask.shape[2] > 1:
            masks[0] = ~np.any(masks[1:], axis=0)

        # Stack masks
        mask = np.stack(masks, axis=-1)

        if self.transform:
            augmented = self.transform(image=image, mask=mask)
            image = augmented['image']
            mask = augmented['mask']

        return image, mask

# Data Augmentation

import albumentations as A
from albumentations.pytorch import ToTensorV2
from segmentation_models_pytorch.encoders import get_preprocessing_params

# Get normalization parameters from segmentation_models_pytorch
preprocessing_params = get_preprocessing_params('resnet50')
mean = preprocessing_params['mean']
std = preprocessing_params['std']

# Define training augmentations
def get_training_augmentation_pmm():
    return A.Compose([
        A.HorizontalFlip(p=0.75),
        A.RandomRotate90(p=1),
        A.GaussNoise(p=0.5),
        A.OneOf([
            A.CLAHE(),
            A.RandomBrightnessContrast(brightness_limit=0.25, contrast_limit=0.25),
            A.RandomGamma()
        ], p=0.5),
        A.OneOf([
            A.Sharpen(),
            A.Blur(blur_limit=3)
        ], p=0.5),
        A.OneOf([
            A.RandomBrightnessContrast(),
            A.HueSaturationValue()
        ], p=0.5),
        A.Normalize(mean=mean, std=std),
        ToTensorV2()
    ])

# Define validation augmentations
def get_validation_augmentation():
    return A.Compose([
        A.Normalize(mean=mean, std=std),
        ToTensorV2()
    ])


# Create DataLoaders and visualize some samples

import matplotlib.pyplot as plt

# Custom dataset class already defined above

# Define training, validation, and test datasets with augmentations
train_dataset = MicrostructureDataset(train_images_dir, train_masks_dir,
                                      transform=get_training_augmentation_pmm())
val_dataset = MicrostructureDataset(val_images_dir, val_masks_dir,
                                    transform=get_validation_augmentation())
test_dataset = MicrostructureDataset(test_images_dir, test_masks_dir,
                                     transform=get_validation_augmentation())

# Define DataLoaders
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=0)
val_loader = DataLoader(val_dataset, batch_size=4, shuffle=False, num_workers=0)
test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False, num_workers=0)

# Function to visualize augmented images and their masks
def visualize_augmentations(dataset, num_samples=4):
    fig, axs = plt.subplots(num_samples, 4, figsize=(20, 5 * num_samples))

    for i in range(num_samples):
        idx = random.randint(0, len(dataset) - 1)
        image, mask = dataset[idx]

        # Convert tensor to numpy array for visualization
        image_np = image.permute(1, 2, 0).cpu().numpy()
        mask_np = mask.cpu().numpy()

        # Plot image and masks
        axs[i, 0].imshow(image_np)
        axs[i, 0].set_title("Image")
        axs[i, 1].imshow(mask_np[..., 0], cmap='gray')
        axs[i, 1].set_title("Matrix Mask")
        axs[i, 2].imshow(mask_np[..., 1], cmap='gray')
        axs[i, 2].set_title("Secondary Mask")
        axs[i, 3].imshow(mask_np[..., 2], cmap='gray')
        axs[i, 3].set_title("Tertiary Mask")

        for j in range(4):
            axs[i, j].axis('off')

    plt.tight_layout()
    plt.show()

# Visualize augmentations
visualize_augmentations(train_dataset, num_samples=4)

Code Listing 1 Loading and pre-processing data

It is important for the user to visualize the augmented images and their masks.
The augmented images should encapsulate some of the transformations in the augmentation
function and should differ from each other in terms of these transformations.
Moreover, the masks 'matrix', 'secondary' and 'tertiary' should be consistent
for all images. A common error observed was overlooking the instruction to update
the 'matrix' mask with all the remaining unassigned pixels after extracting the 'secondary'
and 'tertiary' masks. Errors in extracting the masks and assigning them to
the corresponding pixels in the image will hinder the model from training well; the
major issue is the incorrect assignment of labels to the pixels, which leads to a high
loss despite multiple epochs of training. Figure 10 shows one of the augmented
images and its corresponding masks.

Fig. 10 Sample of augmented image and its corresponding masks

3.2 Model Setup

Using the model setup prompts in Fig. 6, the functions to set up the segmentation model from the pre-trained weights, to compute the combined Dice and BCE loss, and to compute the IoU metric were created. The function get_pretrained_microscopynet_url that generates the URL for downloading the pre-trained weights is also included here. It is taken as is from the GitHub repository of reference [6] and was not generated using ChatGPT 4.
import torch
import torch.utils.model_zoo as model_zoo
from segmentation_models_pytorch import UnetPlusPlus
from segmentation_models_pytorch.encoders import get_preprocessing_params

# Assuming get_pretrained_microscopynet_url function is already provided
# from custom_utils import get_pretrained_microscopynet_url

def get_pretrained_microscopynet_url(encoder, encoder_weights, version=1.1,
                                     self_supervision=''):
    """Get the url to download the specified pretrained encoder.

    Args:
        encoder (str): pretrained encoder model name (e.g. resnet50)
        encoder_weights (str): pretraining dataset, either 'micronet' or
            'imagenet-micronet' with the latter indicating the encoder
            was first pretrained on imagenet and then finetuned on microscopynet
        version (float): model version to use, defaults to latest.
            Current options are 1.0 or 1.1.
        self_supervision (str): self-supervision method used. If self-supervision
            was not used set to '' (which is default).

    Returns:
        str: url to download the pretrained model
    """

    # there is an error with the name for resnext101_32x8d so catch and return
    # (currently there is only version 1.0 for this model so don't need to check version.)
    if encoder == 'resnext101_32x8d':
        return 'https://nasa-public-data.s3.amazonaws.com/microscopy_segmentation_models/resnext101_pretrained_microscopynet_v1.0.pth.tar'

    # only resnet50/micronet has version 1.1 so I'm not going to overcomplicate this right now.
    if encoder != 'resnet50' or encoder_weights != 'micronet':
        version = 1.0

    # setup self-supervision
    if self_supervision != '':
        version = 1.0
        self_supervision = '_' + self_supervision

    # correct for name change for URL
    if encoder_weights == 'micronet':
        encoder_weights = 'microscopynet'
    elif encoder_weights == 'image-micronet':
        encoder_weights = 'imagenet-microscopynet'
    else:
        raise ValueError("encoder_weights must be 'micronet' or 'image-micronet'")

    # get url
    url_base = 'https://nasa-public-data.s3.amazonaws.com/microscopy_segmentation_models/'
    url_end = '_v%s.pth.tar' % str(version)
    return url_base + f'{encoder}{self_supervision}_pretrained_{encoder_weights}' + url_end

# Model Setup

def setup_segmentation_model(encoder_name='resnet50', class_values=None, encoder_weights='micronet'):
    if class_values is None:
        raise ValueError("class_values must be provided and should not be None")

    # Determine number of classes
    num_classes = len(class_values)

    # Define activation function based on number of classes
    activation = 'softmax2d' if num_classes > 1 else 'sigmoid'

    # Initialize U-Net++ model
    model = UnetPlusPlus(
        encoder_name=encoder_name,
        encoder_weights=None,  # Skip default weight loading
        in_channels=3,
        classes=num_classes,
        activation=activation
    )

    # Determine device
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    map_location = 'cuda' if torch.cuda.is_available() else 'cpu'

    # Load custom weights
    url = get_pretrained_microscopynet_url(encoder_name, encoder_weights)
    state_dict = model_zoo.load_url(url, map_location=map_location)
    model.encoder.load_state_dict(state_dict)

    # Move model to the appropriate device
    model = model.to(device)

    return model, device

# Loss and IoU metric

import torch
import torch.nn.functional as F
import numpy as np
import segmentation_models_pytorch as smp

# Combined Dice and BCE loss function
def dice_bce_loss(inputs, targets, bce_weight=0.5):
    # Apply sigmoid to inputs
    inputs = torch.sigmoid(inputs)

    # Flatten inputs and targets
    inputs = inputs.reshape(-1)  # manual
    targets = targets.reshape(-1)  # manual

    # Compute Dice loss
    intersection = (inputs * targets).sum()
    dice_loss = 1 - (2. * intersection + 1) / (inputs.sum() + targets.sum() + 1)

    # Compute BCE loss using logits
    bce_loss = F.binary_cross_entropy_with_logits(inputs, targets.float())

    # Combine Dice and BCE losses
    combined_loss = dice_loss + bce_weight * bce_loss
    return combined_loss

# IoU metric function using smp
def compute_iou(output, target):
    # Get statistics for IoU calculation
    tp, fp, fn, tn = smp.metrics.get_stats(output, target, mode='multilabel', threshold=0.5)

    # Compute IoU score
    iou_score = smp.metrics.iou_score(tp, fp, fn, tn, reduction="micro")
    return iou_score
Code Listing 2 Model setup, loss function and evaluation metric function

As shown in Code Listing 2, a function for computing the combined Dice and BCE loss is created. In some trials, ChatGPT 4 overlooked the specific instruction to use the torch.reshape method to change the dimensions of the outputs and targets in the loss function. It used the torch.Tensor.view method instead, which serves the same purpose of changing tensor dimensions but works only on contiguous tensors and is therefore not generally applicable. The IoU is computed according to the example provided in the prompt.
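A minimal sketch of the difference on dummy tensors (not part of the generated code):

import torch

x = torch.randn(2, 3, 4)
y = x.permute(0, 2, 1)   # permute returns a non-contiguous view

flat = y.reshape(-1)     # works: reshape copies the data if necessary
try:
    flat = y.view(-1)    # fails: view requires contiguous memory
except RuntimeError as err:
    print(f"view failed: {err}")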

3.3 Training the Model

Putting all the previous functions to use, the model is trained by creating a train_model function according to the prompt in Fig. 8. An appropriate train_model function is created following the instructions in the prompt. However, the instructions for ensuring that errors are avoided are not always followed. The train_model function can then be used by asking the chatbot for an example usage. The number of epochs, the patience parameter for early stopping, the learning rate and the weighting of the loss terms need to be adjusted to train the model towards a higher IoU metric value. Additionally, the user can prompt ChatGPT 4 to make a plot showing the decrease in the loss values over the number of epochs.
import os
import torch
import torch.optim as optim
from torch.cuda.amp import GradScaler, autocast
from torch.utils.data import DataLoader
import numpy as np
from tqdm import tqdm

# Define the training and validation loop
def train_model(model, train_loader, val_loader, criterion, optimizer, num_epochs, patience, device, checkpoint_interval):
    best_model_wts = None
    best_iou = 0.0
    epochs_no_improve = 0

    # Store losses and IoUs for plotting
    train_losses = []
    val_losses = []
    train_ious = []
    val_ious = []

    scaler = GradScaler()

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
                data_loader = train_loader
            else:
                model.eval()  # Set model to evaluate mode
                data_loader = val_loader

            running_loss = 0.0
            running_iou = 0.0

            # Iterate over data with progress bar
            with tqdm(total=len(data_loader), desc=f'{phase.capitalize()} Epoch {epoch + 1}', unit='batch') as pbar:
                for inputs, masks in data_loader:
                    inputs = inputs.to(device)
                    masks = masks.to(device)
                    masks = masks.permute(0, 3, 1, 2)  # Correcting mask shape

                    # Check mask and input shapes for compatibility
                    if masks.shape != inputs.shape:
                        raise ValueError(f"Mask shape {masks.shape} and input shape {inputs.shape} are incompatible for Dice loss and IoU computation.")

                    # Zero the parameter gradients
                    optimizer.zero_grad()

                    # Forward pass
                    with autocast(enabled=True):
                        outputs = model(inputs)
                        if outputs.shape[1] == 1:
                            outputs = torch.sigmoid(outputs)
                        else:
                            outputs = torch.softmax(outputs, dim=1)

                        loss = criterion(outputs, masks)

                    if phase == 'train':
                        # Backward pass and optimization
                        scaler.scale(loss).backward()
                        scaler.step(optimizer)
                        scaler.update()

                    # Statistics
                    running_loss += loss.item() * inputs.size(0)
                    iou = compute_iou(outputs, masks)
                    running_iou += iou * inputs.size(0)

                    # Update progress bar
                    pbar.update(1)
                    pbar.set_postfix(loss=loss.item(), iou=iou.item())

            epoch_loss = running_loss / len(data_loader.dataset)
            epoch_iou = running_iou / len(data_loader.dataset)

            if phase == 'train':
                train_losses.append(epoch_loss)
                train_ious.append(epoch_iou)
            else:
                val_losses.append(epoch_loss)
                val_ious.append(epoch_iou)

            print(f'{phase} Loss: {epoch_loss:.4f} IoU: {epoch_iou:.4f}')

            # Early stopping
            if phase == 'val':
                if epoch_iou > best_iou:
                    best_iou = epoch_iou
                    best_model_wts = model.state_dict()
                    epochs_no_improve = 0
                    torch.save(model.state_dict(), 'best_model.pth')
                else:
                    epochs_no_improve += 1

        # Checkpoint the model every `checkpoint_interval` epochs
        if (epoch + 1) % checkpoint_interval == 0:
            checkpoint_path = f'model_checkpoint_epoch_{epoch + 1}.pth'
            torch.save(model.state_dict(), checkpoint_path)
            print(f'Model checkpoint saved at {checkpoint_path}')

        if epochs_no_improve >= patience:
            print('Early stopping triggered')
            break

    # Load best model weights
    if best_model_wts:
        model.load_state_dict(best_model_wts)

    # Save losses and ious for plotting, ensuring tensors are moved to CPU
    np.save('train_losses.npy', np.array(train_losses))
    np.save('val_losses.npy', np.array(val_losses))
    np.save('train_ious.npy', np.array([iou.cpu().numpy() for iou in train_ious]))
    np.save('val_ious.npy', np.array([iou.cpu().numpy() for iou in val_ious]))

    return model

# Usage

class_values = [0, 1, 2]  # Example class values
model, device = setup_segmentation_model(encoder_name='resnet50', class_values=class_values, encoder_weights='micronet')

optimizer = optim.Adam(model.parameters(), lr=2e-4)
criterion = lambda outputs, masks: dice_bce_loss(outputs, masks, bce_weight=0.7)

# Train model
num_epochs = 250
patience = 30
checkpoint_interval = 10  # Save model every 10 epochs
model = train_model(model, train_loader, val_loader, criterion, optimizer, num_epochs, patience, device, checkpoint_interval)
Code Listing 3 Function to train the model
The following items must be checked with a small number of epochs (e.g. 50 epochs with a patience value of 5) before running the train_model function with a higher number of epochs and a high patience value for early stopping:

• Check that the shapes of the masks and images are correct. Ideally, the function should raise an error if the shapes of the image and masks are inappropriate for loss and IoU metric computation.
• Check that the trained model is being saved regularly after a given interval of epochs.
• Ensure that the training and validation losses and IoU values are being stored after each epoch.
• Look for a relatively large and quick decrease in the training loss in the first few epochs, and a consistent increase in the validation IoU value as the losses decrease.
• Check that early stopping is triggered if the validation IoU value does not exceed the last best validation IoU value within the patience window.

Using the function in Code Listing 3, the model was trained for 250 epochs with a patience of 30 epochs. The plot in Fig. 11 shows the change in training and validation loss over the epochs.

Fig. 11 Training and validation loss over epochs

3.4 Evaluating Model Performance on Test Data

Functions for model evaluation and the required visualizations are generated by ChatGPT 4 as per the prompt in Fig. 9. At times, the dimensions of the masks might not be handled correctly; in such a scenario, minor human intervention is required.
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# Function to evaluate the model on the test set
def test_model(model, test_loader, criterion):
    model.eval()
    test_loss, test_iou = [], []

    with torch.no_grad():
        for images, masks in test_loader:
            images, masks = images.to(device), masks.to(device)

            # Convert masks to the shape [batch_size, num_classes, height, width]
            if masks.ndim == 4 and masks.shape[-1] == 3:
                masks = masks.permute(0, 3, 1, 2)  # From [B, H, W, C] to [B, C, H, W]
            masks = masks.float()  # Ensure masks are in the correct format

            outputs = model(images)
            if outputs.shape[1] > 1:  # Multiclass segmentation
                outputs = torch.softmax(outputs, dim=1)
            else:  # Binary segmentation
                outputs = outputs  # Use logits directly

            loss = criterion(outputs, masks)
            masks_int = masks.long()  # Convert masks to integer type for IoU calculation
            iou = compute_iou(outputs, masks_int)

            test_loss.append(loss.item())
            test_iou.append(iou.cpu().numpy())  # Move IoU to CPU for numpy operations

    avg_test_loss = np.mean(test_loss)
    avg_test_iou = np.mean(test_iou)

    print(f'Test Loss: {avg_test_loss:.4f}, Test IoU: {avg_test_iou:.4f}')
    return avg_test_loss, avg_test_iou

# Function to visualize predictions and true masks
def visualize_predictions(model, test_dataset, device, num_samples=4):
    model.eval()
    fig, axs = plt.subplots(2 * num_samples, 4, figsize=(20, 10 * num_samples))

    for i in range(num_samples):
        idx = random.randint(0, len(test_dataset) - 1)
        image, true_mask = test_dataset[idx]
        image = image.to(device).unsqueeze(0)

        with torch.no_grad():
            output = model(image)
            if output.shape[1] == 1:
                output = torch.sigmoid(output)
            else:
                output = torch.softmax(output, dim=1)
            output = output.squeeze().cpu().numpy()

        image = image.squeeze().permute(1, 2, 0).cpu().numpy()
        true_mask = true_mask.cpu().numpy()  # Ensure correct shape

        # Plot original image
        axs[2 * i, 0].imshow(image)
        axs[2 * i, 0].set_title("Image")

        # Plot true masks
        axs[2 * i, 1].imshow(true_mask[..., 0], cmap='gray')
        axs[2 * i, 1].set_title("True Matrix Mask")
        axs[2 * i, 2].imshow(true_mask[..., 1], cmap='gray')
        axs[2 * i, 2].set_title("True Secondary Mask")
        axs[2 * i, 3].imshow(true_mask[..., 2], cmap='gray')
        axs[2 * i, 3].set_title("True Tertiary Mask")

        # Plot predicted masks
        axs[2 * i + 1, 0].imshow(image)
        axs[2 * i + 1, 0].set_title("Image")
        axs[2 * i + 1, 1].imshow(output[0], cmap='gray')
        axs[2 * i + 1, 1].set_title("Predicted Matrix Mask")
        axs[2 * i + 1, 2].imshow(output[1], cmap='gray')
        axs[2 * i + 1, 2].set_title("Predicted Secondary Mask")
        axs[2 * i + 1, 3].imshow(output[2], cmap='gray')
        axs[2 * i + 1, 3].set_title("Predicted Tertiary Mask")

        for j in range(4):
            axs[2 * i, j].axis('off')
            axs[2 * i + 1, j].axis('off')

    plt.tight_layout()
    plt.show()

# Function to overlay predicted and true masks, highlighting errors
def visualize_prediction_accuracy(model, test_dataset, device, num_samples=4):
    model.eval()
    fig, axs = plt.subplots(num_samples, 4, figsize=(20, 5 * num_samples))

    for i in range(num_samples):
        idx = random.randint(0, len(test_dataset) - 1)
        image, true_mask = test_dataset[idx]
        image = image.to(device).unsqueeze(0)

        with torch.no_grad():
            output = model(image)
            if output.shape[1] == 1:
                output = torch.sigmoid(output)
            else:
                output = torch.softmax(output, dim=1)
            output = output.squeeze().permute(1, 2, 0).cpu().numpy()

        image = image.squeeze().permute(1, 2, 0).cpu().numpy()
        true_mask = true_mask.cpu().numpy()

        # print("output: ", output.shape)
        # print("image: ", image.shape)
        # print("true mask: ", true_mask.shape)
        if output.shape != true_mask.shape:
            raise ValueError(f"Shape mismatch: Predicted mask shape {output.shape} and true mask shape {true_mask.shape} are incompatible for visualization.")

        axs[i, 0].imshow(image)
        axs[i, 0].set_title("Image")

        for j in range(3):
            pred_mask = output[:, :, j] > 0.5
            true_class_mask = true_mask[:, :, j]

            overlay = np.zeros((*true_class_mask.shape, 3), dtype=np.uint8)
            # print("pred_mask :", pred_mask.shape)
            # print("true_class_mask: ", true_class_mask.shape)

            true_positives = (pred_mask == 1) & (true_class_mask == 1)
            true_negatives = (pred_mask == 0) & (true_class_mask == 0)
            false_positives = (pred_mask == 1) & (true_class_mask == 0)
            false_negatives = (pred_mask == 0) & (true_class_mask == 1)

            overlay[true_positives] = [255, 255, 255]   # White
            overlay[true_negatives] = [0, 0, 0]         # Black
            overlay[false_positives] = [0, 0, 255]      # Blue
            overlay[false_negatives] = [255, 105, 180]  # Pink

            axs[i, j + 1].imshow(overlay)
            axs[i, j + 1].set_title(f"Mask {j+1} Overlay")

        for j in range(4):
            axs[i, j].axis('off')

    white_patch = mpatches.Patch(color='white', label='True Positive')
    black_patch = mpatches.Patch(color='black', label='True Negative')
    blue_patch = mpatches.Patch(color='blue', label='False Positive')
    pink_patch = mpatches.Patch(color='pink', label='False Negative')
    plt.legend(handles=[white_patch, black_patch, blue_patch, pink_patch], loc='upper right')
    plt.tight_layout()
    plt.show()

# Usage

# 250 epochs, patience = 30, never reached early stopping
model.load_state_dict(torch.load('model_checkpoint_epoch_250.pth'))

# Evaluate the model on the test set
test_loss, test_iou = test_model(model, test_loader, criterion)

# Visualize predictions and accuracy
# visualize_predictions(model, test_dataset, device, num_samples=4)
visualize_prediction_accuracy(model, test_dataset, device, num_samples=4)
Code Listing 4 Functions for evaluating model performance on test data and visualizing
predictions

The test_model function runs a forward pass using the trained model on the test dataset and computes the loss and the IoU metric. While this function posed no errors in trials, it is still suggested to check whether the test loss and IoU value are close to the ones observed for the epoch where the best model was saved. If there is a significant deviation, the first step is to check whether the test_model function has been implemented correctly. If the deviation persists despite a correctly implemented function, then there are possible issues in the training of the model. However, such errors did not arise during any of the trials.
Besides checking the loss and IoU values, a visual inspection can also provide a good idea of whether the model is performing well. Here, the visualization functions are useful. In Fig. 12a and b, the predicted masks and the true masks for two models, one trained for 50 epochs and the other for 250 epochs, can be observed. The prediction accuracy can also be checked by overlaying the predicted masks over the true masks and marking the false positive and false negative pixels. Figures 13a and b show these overlays for the model trained for 50 epochs and the one trained for 250 epochs.

(a) 50 epochs

(b) 250 epochs

Fig. 12 True and predicted masks for a model trained for different numbers of epochs. The first row in a and b displays the true masks, and the second row shows the predicted masks for the same image

4 Discussion

Microstructure image segmentation using transfer learning is a relatively complex task, primarily due to the involvement of multiple sub-steps that each require the usage of different Python libraries and functions. Moreover, since most machine learning tasks do not have a definite solution, there is large room for possible solution approaches. This intrinsic freedom of machine learning approaches in such complex tasks demands multiple iterations and variations, which can be hugely aided by LLMs like ChatGPT 4. However, if the solution approach is already decided, then the LLM should be instructed with details and required specifics to expect desirable outcomes. Here, the solution approach follows the example notebook multiclass_segmentation_example.ipynb provided with [6], and thus any room for assumptions must be eliminated by providing detailed instructions. With appropriate prompts, it can be observed that ChatGPT 4 provides suitable Python code for executing the task. The generated code had rare instances of syntax errors and a few logical errors. In this section, these errors are discussed, together with the changes made to the prompts and directly to the generated code to avoid them. The errors are further categorized as 'major errors' and 'minor errors'.

(a) 50 epochs

(b) 250 epochs

Fig. 13 Prediction accuracy visualization for the model trained for different numbers of epochs. False negative pixels are colored pink and false positive pixels are colored blue

4.1 Major Errors

• Extraction of images: If specific instructions on the commands for loading the images from the directories are not provided, ChatGPT 4 changes the image dimensions to the ones appropriate for common deep learning architectures like 'ResNet50'. It also augments the images with random transformations and normalizes them with inappropriate mean and standard deviation values. To ensure that the correct transformations are applied, the user must specify the required steps in the augmentation pipeline and their parameters, along with the correct mean and standard deviation values to normalize the images. The prompts in Figs. 3 and 4 ensure that the images are loaded with the desired augmentations and normalization.
• Assigning masks to pixels: In the original data, the pixels are annotated in three different colors to indicate the three phases in the microstructure. Without specific instructions for extracting the masks, ChatGPT 4 converts all the masks to grayscale and then loads them into the data for performing the segmentation task. Furthermore, ChatGPT 4 often overlooked the instruction to update the matrix mask by assigning all the remaining pixels to it after the secondary and tertiary masks had been assigned; the result was a matrix phase consisting only of the pixels that were black. These errors produce incorrect labels for the sample images, which eventually hinders the model training. As a fix to this issue, very specific instructions are provided in the prompt of Fig. 3 to extract masks from the directory and to assign them to the pixels. In case the specific instruction for updating the matrix mask after assigning all other masks was overlooked, the same prompt was provided to the chatbot again; usually, all the specific details are then considered in the code generation. A minimal sketch of the intended mask extraction is shown after this list.
• Sticking to old versions of imported library methods: For some specific cases, e.g. for using the IoU metric from the segmentation_models_pytorch library, ChatGPT 4 used the syntax from older versions. Until the latest syntax of the functions was provided explicitly through an example usage from the library documentation, ChatGPT 4 continued to use the older version. This is possibly due to the fact that LLMs such as ChatGPT 4 are not trained frequently, and the current version might have been trained while older versions of these libraries were current. By just providing an example usage in the prompt of Fig. 6, ChatGPT 4 could adapt its code response to the latest syntax of the IoU metric.
• Model checkpointing: Despite specific instructions for model checkpointing in the prompt of Fig. 8, ChatGPT 4 missed including regular model checkpointing after a given number of epochs. This step is crucial if the training takes place over a high number of epochs and requires long training periods; a small sketch of saving and resuming a checkpoint follows after this list.
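As referenced in the bullet on assigning masks to pixels, the crucial step is rebuilding the matrix mask from all pixels left unassigned. A minimal sketch; the annotation colors and the placeholder mask_rgb array are hypothetical, not values from the dataset:

import numpy as np

# Hypothetical annotation colors; the actual RGB values depend on the dataset
secondary_color = np.array([255, 0, 0], dtype=np.uint8)
tertiary_color = np.array([0, 0, 255], dtype=np.uint8)

mask_rgb = np.zeros((256, 256, 3), dtype=np.uint8)  # placeholder annotated mask

secondary = np.all(mask_rgb == secondary_color, axis=-1)
tertiary = np.all(mask_rgb == tertiary_color, axis=-1)
# The matrix mask must collect ALL remaining unassigned pixels,
# the step that ChatGPT 4 frequently overlooked
matrix = ~(secondary | tertiary)
mask = np.stack([matrix, secondary, tertiary], axis=-1).astype(np.float32)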
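For the checkpointing issue, saving and resuming reduces the cost of interrupted long runs. A minimal self-contained sketch; a stand-in linear model replaces the segmentation model and the file name is hypothetical:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the actual segmentation model
torch.save(model.state_dict(), 'model_checkpoint_epoch_100.pth')  # periodic save

# Resuming later: reload the saved weights; map_location keeps it device-agnostic
state = torch.load('model_checkpoint_epoch_100.pth', map_location='cpu')
model.load_state_dict(state)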

4.2 Minor Errors

• Dimension errors: ChatGPT 4 produces syntactically correct code. However, it is quite frequent to get dimension errors, mainly in the dimensions of the masks. The underlying reason might be that ChatGPT 4 is not effective in tracking the changes in the dimensions of the masks when they are passed through multiple functions throughout the data loading, pre-processing and training process. An attempt to mitigate this error was to explicitly request ChatGPT 4 to ensure that there are no dimension errors in the code it generates, as can be observed in the system prompt of Fig. 2. However, despite such an instruction, the code had some cases of dimension errors. Another important step to reduce the number of dimension errors is to specify the shape of the image and mask files in the prompt; this reduces the frequency of dimension errors, especially in loading the data. Finally, ChatGPT 4 is instructed to raise errors if the dimensions are inappropriate, e.g. in the train_model function, before passing the masks and the outputs to the loss functions; a minimal shape-check sketch follows after this list.
• Moving variables to CPU before typecasting to NumPy arrays: Since the model is allowed to work on the GPU for training, the PyTorch tensors and the model are all mapped to the GPU. However, when certain functions or visualizations require the variables as NumPy arrays, the PyTorch tensors are not always moved to the CPU before converting them to NumPy arrays. By providing specific instructions in the prompts, e.g. in the prompts of Figs. 2, 8 and 9, this error could be eliminated; the required pattern is shown in the second sketch after this list.
• Reshaping PyTorch tensor dimensions: Two methods are commonly used to reshape PyTorch tensors: torch.reshape and torch.Tensor.view. The latter is not applicable to non-contiguous tensors, i.e. it is not generally applicable. This posed errors in the loss function for the combined Dice and BCE losses, since ChatGPT 4 often overlooked the instruction to use torch.reshape.
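As referenced in the dimension-errors bullet above, failing fast with an explicit shape check keeps such errors close to their origin. A minimal sketch; the helper name check_shapes is ours, not from the generated code:

import torch

def check_shapes(outputs: torch.Tensor, masks: torch.Tensor) -> None:
    """Raise a clear error before loss/IoU computation if shapes disagree."""
    if outputs.shape != masks.shape:
        raise ValueError(
            f"Shape mismatch: outputs {tuple(outputs.shape)} "
            f"vs masks {tuple(masks.shape)}"
        )

# Example with dummy tensors of shape [batch, classes, height, width]
check_shapes(torch.zeros(4, 3, 256, 256), torch.zeros(4, 3, 256, 256))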
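For the CPU/NumPy issue, the required pattern in a minimal sketch:

import torch

t = torch.randn(3, 3, device='cuda' if torch.cuda.is_available() else 'cpu')

arr = t.detach().cpu().numpy()  # always detach and move to CPU first
# arr = t.numpy()               # would raise TypeError for CUDA tensors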

5 Conclusion

Segmentation of microstructure images into the classes 'matrix', 'secondary' and 'tertiary' using a ResNet50 encoder pre-trained on the MicroNet dataset and a UNet++ decoder was conducted using ChatGPT 4. The key steps in accomplishing this were discussed, and their corresponding codes were generated by providing ChatGPT 4 with carefully drafted prompts. ChatGPT 4 provided promising results in generating the code for such a complex task. Certain major and minor errors were observed in the process, and their fixes were discussed likewise. An iterative process was required to obtain the desired output since machine learning tasks have no definite solution, and thus the chatbot has large room for making its own assumptions if details are not provided. Here, the errors encountered in these iterations could be tackled by adding specific instructions to the prompt. It was also observed that GPT-4 overlooked some specific instructions of the prompts; the missed instructions were taken into consideration by ChatGPT 4 after it was provided with the same prompt again. Similar prompts as presented for this task can be applied to binary or multi-class segmentation depending on the specific microstructures. Furthermore, using transfer learning makes this approach robust for few-shot learning, i.e. for training the model with a very small number of labeled microstructure images. As a result, the predicted masks can be used to estimate with high accuracy the amount of phases present in a given microstructure image and to support further analyses.

References

1. Akers, S., Kautz, E., Trevino-Gavito, A., Olszta, M., Matthews, B., Wang, L., Du, Y., Spurgeon, S.: Rapid and flexible segmentation of electron microscopy data using few-shot machine learning. npj Computational Materials 7, 187 (2021). https://doi.org/10.1038/s41524-021-00652-z
2. Stan, T., Thompson, Z., Voorhees, P.: Building towards a universal neural network to segment large materials science imaging datasets. Developments in X-Ray Tomography XII 11113, 111131G (2019). https://doi.org/10.1117/12.2525290
3. Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., He, Q.: A Comprehensive Survey on Transfer Learning (2020)
4. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition (2015)
5. Zhou, Z., Siddiquee, M., Tajbakhsh, N., Liang, J.: UNet++: A Nested U-Net Architecture for Medical Image Segmentation (2018)
6. Stuckner, J., Harder, B., Smith, T.: Microstructure segmentation with deep learning encoders pre-trained on a large microscopy dataset. npj Computational Materials 8, 200 (2022). https://doi.org/10.1038/s41524-022-00878-5
Elastic Wave Propagation

Muhammad Saad Qureshi and Bernhard Eidel

Abstract This chapter investigates the ability of ChatGPT to generate code in the
field of Computational Materials Science, specifically using GPT-4. The focus of this
study is on solving the wave equation in 2D, considering elastic wave propagation,
and visualizing the results using Python with a finite difference time domain (FDTD)
scheme. The wave equation is fundamental in understanding how waves, such as
sound or seismic waves, travel through different media. The FDTD method is a
numerical technique used to model wave propagation by discretizing both time and
space. This involves setting up a computational grid, applying initial and boundary
conditions, and iteratively solving the wave equation to simulate the wave’s behavior
over time. Visualization of the results helps in interpreting the physical phenomena
and verifying the accuracy of the computational model.

1 Introduction

Artificial intelligence, particularly in the form of Large Language Models (LLMs) like ChatGPT, has shown significant potential in various fields including code generation; for a pioneering work focusing on numerical methods see [1]. This chapter explores ChatGPT's capabilities for generating Python code that solves the problem of 2D elastic wave propagation based on the Finite Difference Time Domain (FDTD) method. In particular, this work aims at verifying the code and assessing its accuracy and efficiency in simulating and visualizing the behavior of wave propagation. Overall, we highlight both the strengths and limitations of GPT-4 in handling complex scientific problems.


1.1 Wave Equation in 2D

The wave equation is a second-order linear partial differential equation (PDE) describing the behavior of waves. In a two-dimensional $(x, y)$ coordinate system, it is defined as [4]

$$\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right). \qquad (1)$$

Considering elastic wave propagation in a square domain, $x$ and $y$ represent the axes having equal length $N$, $u$ represents the displacement vector from the rest position, and $c$ represents the wave speed, calculated by $c = \sqrt{E/\rho}$, where $E$ is the Young's modulus and $\rho$ is the mass density.

1.2 Finite Difference Time Domain (FDTD)

Finite difference methods are numerical techniques used to approximate derivatives using finite differences. The FDTD method was introduced by Kane S. Yee for numerically solving time-dependent Maxwell's equations [2]. The method involves discretizing the spatial and temporal partial derivatives based on central-difference approximations [5]. Applying FDTD approximations to (1), one obtains, exemplarily for the derivatives of the displacement component $u_x$,

$$\frac{\partial^2 u_x}{\partial x^2} \approx \frac{u_x(x+h, y; t) - 2u_x(x, y; t) + u_x(x-h, y; t)}{h^2}, \qquad (2)$$

$$\frac{\partial^2 u_x}{\partial y^2} \approx \frac{u_x(x, y+h; t) - 2u_x(x, y; t) + u_x(x, y-h; t)}{h^2}, \qquad (3)$$

$$\frac{\partial^2 u_x}{\partial t^2} \approx \frac{u_x(x, y; t+\Delta t) - 2u_x(x, y; t) + u_x(x, y; t-\Delta t)}{\Delta t^2}. \qquad (4)$$

Without loss of generality, we consider a square domain with a uniform spatial grid of size $h$ both in $x$ and $y$ directions and a uniform temporal discretization with $\Delta t$ as the constant time step size. Using (2), (3), (4) in (1) along with the notations $u(x, y; t) =: u^n_{x,y}$, $u(x, y; t \pm \Delta t) =: u^{n\pm1}_{x,y}$, $u(x \pm h, y; t) =: u^n_{x\pm1,y}$, $u(x, y \pm h; t) =: u^n_{x,y\pm1}$, where $n$ refers to the current time $t_n$ and $n \pm 1$ to time $t_n \pm \Delta t$, one obtains

$$\frac{1}{\Delta t^2} \left( u^{n+1}_{x,y} - 2u^n_{x,y} + u^{n-1}_{x,y} \right) = \frac{c^2}{h^2} \left( u^n_{x+1,y} + u^n_{x-1,y} + u^n_{x,y+1} + u^n_{x,y-1} - 4u^n_{x,y} \right). \qquad (5)$$

Solving for $u^{n+1}_{x,y}$ based on $t_n$-data yields the final form of the 2D wave equation discretized by FDTD:

$$u^{n+1}_{x,y} = \frac{c^2 \Delta t^2}{h^2} \left( u^n_{x+1,y} + u^n_{x-1,y} + u^n_{x,y+1} + u^n_{x,y-1} - 4u^n_{x,y} \right) + 2u^n_{x,y} - u^{n-1}_{x,y}. \qquad (6)$$

Furthermore, initial conditions (ICs) and boundary conditions (BCs) have to be defined for a full-fledged Initial Boundary Value Problem (IBVP); the numerical solution scheme requires the choice of the discretization parameters $\Delta x$, $\Delta y$ and $\Delta t$.
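As an illustration of update (6), a minimal vectorized NumPy sketch on the grid interior; the generated codes in Sect. 3 use explicit loops instead, and the boundary treatment follows in Sect. 1.4:

import numpy as np

def fdtd_step(u, u_prev, c, dt, h):
    """One FDTD update of Eq. (6) on the interior of a square grid."""
    u_next = np.zeros_like(u)
    lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
           - 4.0 * u[1:-1, 1:-1])
    u_next[1:-1, 1:-1] = ((c * dt / h) ** 2 * lap
                          + 2.0 * u[1:-1, 1:-1] - u_prev[1:-1, 1:-1])
    return u_next

# Example: one step on a 200 x 200 grid after a point excitation
u = np.zeros((200, 200)); u_prev = np.zeros_like(u)
u[100, 100] = 1.0
u_next = fdtd_step(u, u_prev, c=0.2, dt=1.0, h=1.0)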

1.3 Initial Conditions

Two types of excitation functions are considered for defining ICs in the domain.

• Sinusoidal function

$$u(t) = A \sin(2\pi f t + \phi), \qquad (7)$$

where $A$ is the amplitude, $f$ is the frequency and $\phi$ is the phase angle, see Fig. 1 (bottom).

• Gaussian function

$$u(t) = a \exp\left( -\frac{(t - b)^2}{2c^2} \right), \qquad (8)$$

where $a$ is the height of the curve's peak, $b$ is the center of the peak, and $c$ is the standard deviation, see Fig. 1 (top).

Fig. 1 Gaussian function with $a = 1$, $b = 0$, $c = 1$, and sinusoidal function with $A = 1$, $f = \frac{1}{2\pi}$, $\phi = 0$
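Both excitation functions translate directly into Python; a short sketch with the parameter values of Fig. 1 as defaults:

import numpy as np

def sinusoidal(t, A=1.0, f=1.0 / (2.0 * np.pi), phi=0.0):
    """Sinusoidal excitation, Eq. (7)."""
    return A * np.sin(2.0 * np.pi * f * t + phi)

def gaussian(t, a=1.0, b=0.0, c=1.0):
    """Gaussian excitation, Eq. (8)."""
    return a * np.exp(-((t - b) ** 2) / (2.0 * c ** 2))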

1.4 Boundary Conditions

Two types of BCs are considered:

• Dirichlet or reflecting BC: no displacement at the boundaries, total reflection of the waves,

$$u_{0,y} = u_{x,0} = u_{N,y} = u_{x,N} = 0. \qquad (9)$$

• Mur or absorbing BC, first introduced for time-dependent Maxwell equations [3]: total absorption at the boundaries, no reflection of the waves. In the discretized setting, the Mur BCs are [7]

$$u^{n+1}_{0,y} = u^n_{1,y} - \frac{h - c\Delta t}{h + c\Delta t} \left( u^{n+1}_{1,y} - u^n_{0,y} \right), \qquad (10)$$

$$u^{n+1}_{N,y} = u^n_{N-1,y} + \frac{h - c\Delta t}{h + c\Delta t} \left( u^n_{N,y} - u^{n+1}_{N-1,y} \right), \qquad (11)$$

$$u^{n+1}_{x,0} = u^n_{x,1} - \frac{h - c\Delta t}{h + c\Delta t} \left( u^{n+1}_{x,1} - u^n_{x,0} \right), \qquad (12)$$

$$u^{n+1}_{x,N} = u^n_{x,N-1} + \frac{h - c\Delta t}{h + c\Delta t} \left( u^n_{x,N} - u^{n+1}_{x,N-1} \right). \qquad (13)$$

Here, (10) refers to waves travelling in the $-x$ direction, i.e. to the left side of the square domain, (11) to waves travelling in the $+x$ direction, i.e. to the right side, (12) to waves travelling in the $-y$ direction, i.e. to the bottom side, and (13) to waves travelling in the $+y$ direction, i.e. to the top side of the square domain.
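For illustration, a minimal sketch of condition (10) for the left edge, vectorized over the edge nodes; the other edges follow (11)-(13) analogously:

import numpy as np

def apply_mur_left(u_next, u, c, dt, h):
    """Discretized Mur condition (10) on the left edge x = 0 (sketch)."""
    coef = (h - c * dt) / (h + c * dt)
    u_next[0, :] = u[1, :] - coef * (u_next[1, :] - u[0, :])
    return u_next

# Example usage on dummy fields at t_n (u) and t_{n+1} (u_next)
u = np.zeros((200, 200)); u_next = np.zeros_like(u)
u_next = apply_mur_left(u_next, u, c=0.2, dt=1.0, h=1.0)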

1.5 Courant-Friedrichs-Lewy Condition

For numerical solutions of PDEs involving explicit time integration, the Courant-Friedrichs-Lewy (CFL) condition is a necessary condition for convergence. As a consequence of the explicit time integrator FDTD, the time step must be less than a certain upper bound, given a fixed spatial increment; here, with $c_x$ and $c_y$ the wave velocities in $x$ and $y$ directions, the CFL condition for a uniform spatial grid size $h$ can be given according to [6]

$$\frac{c_x \Delta t}{h} + \frac{c_y \Delta t}{h} \le C_{\max}, \qquad (14)$$

which is met by a value for $C_{\max}$ of 1. The equality condition in (14) is exploited to calculate the critical time step $\Delta t_{\max}$. To satisfy the CFL condition, the time step $\Delta t$ must be equal to or smaller than $\Delta t_{\max}$.
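A one-line consequence of the equality case of (14): with the chromium wave speed of Problem 2 and $h = 1$, as used in the generated code, this reproduces the value $\Delta t_{\max} \approx 7.96 \times 10^{-5}$ listed in Table 2:

import numpy as np

def critical_time_step(c_x, c_y, h, C_max=1.0):
    """Critical time step from the equality case of the CFL condition (14)."""
    return C_max * h / (c_x + c_y)

c = np.sqrt(280e9 / 7100)  # chromium: c = sqrt(E/rho), roughly 6280 m/s
print(critical_time_step(c, c, h=1.0))  # about 7.96e-5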

1.6 Strains and Stresses

Once the updated displacement field is computed for the entire domain, elastic strains are calculated using the gradient of the displacement field and, based on that, stresses using Hooke's law. Isotropic, linear elastic material behavior is assumed to hold:

$$\varepsilon_x = \frac{\partial u_x}{\partial x}, \quad \varepsilon_y = \frac{\partial u_y}{\partial y}, \quad \gamma_{xy} = \frac{\partial u_y}{\partial x} + \frac{\partial u_x}{\partial y}. \qquad (15)$$

Here, $\varepsilon_x$ and $\varepsilon_y$ are the normal strain components, and $\gamma_{xy}$ is the shear strain. For plane stress,

$$\sigma_{xx} = \frac{E}{1 - \nu^2} \left( \varepsilon_x + \nu \varepsilon_y \right), \qquad (16)$$

$$\sigma_{yy} = \frac{E}{1 - \nu^2} \left( \varepsilon_y + \nu \varepsilon_x \right), \qquad (17)$$

$$\sigma_{xy} = \frac{E}{2(1 + \nu)} \, \gamma_{xy}. \qquad (18)$$

Here, $\sigma_{xx}$ and $\sigma_{yy}$ are the normal stresses and $\sigma_{xy}$ is the shear stress; $E$ is the Young's modulus and $\nu$ is the Poisson's ratio of the material. From the normal stresses and the shear stress, principal stresses are calculated according to

$$\sigma_1, \sigma_2 = \frac{\sigma_{xx} + \sigma_{yy}}{2} \pm \sqrt{ \left( \frac{\sigma_{xx} - \sigma_{yy}}{2} \right)^2 + \sigma_{xy}^2 }. \qquad (19)$$
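A compact sketch of (15)-(19) using np.gradient, assuming axis 0 corresponds to $x$ and axis 1 to $y$ (the convention of the generated code in Sect. 3):

import numpy as np

def stresses_from_displacements(u_x, u_y, E, nu):
    """Strains (15), plane-stress Hooke's law (16)-(18), principal stresses (19)."""
    eps_x = np.gradient(u_x, axis=0)
    eps_y = np.gradient(u_y, axis=1)
    gam_xy = np.gradient(u_y, axis=0) + np.gradient(u_x, axis=1)

    sig_xx = E / (1.0 - nu**2) * (eps_x + nu * eps_y)
    sig_yy = E / (1.0 - nu**2) * (eps_y + nu * eps_x)
    sig_xy = E / (2.0 * (1.0 + nu)) * gam_xy

    sig_avg = 0.5 * (sig_xx + sig_yy)
    radius = np.sqrt((0.5 * (sig_xx - sig_yy)) ** 2 + sig_xy**2)
    return sig_avg + radius, sig_avg - radius  # sigma_1, sigma_2

# Example on dummy displacement fields
u_x = np.zeros((200, 200)); u_y = np.zeros((200, 200))
s1, s2 = stresses_from_displacements(u_x, u_y, E=70e9, nu=0.33)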

The algorithm for FDTD simulations of elastic wave propagation in 2D is given in Algorithm Box 1.

Algorithm 1: Elastic Wave Propagation and Stress Calculation using FDTD

Input: Grid size $N$, grid spacing $h$, material properties $(E_1, \nu_1, \rho_1)$ for first half, $(E_2, \nu_2, \rho_2)$ for second half
Output: Stress distribution over time

1  Initialize grid size and time step based on CFL condition;
2  Initialize displacement fields $u^n_{x,y}$, $u^{n-1}_{x,y}$, $u^{n+1}_{x,y}$ to zero;
3  Initialize stress fields $\sigma_{xx}$, $\sigma_{yy}$, $\sigma_{xy}$ to zero;
4  Define sources with positions, frequencies, and amplitudes;
5  for each time step $\Delta t$ do
6      for each grid point $(x, y)$ except boundaries do
7          Get material properties $(E, \nu, c)$ based on position;
8          Update displacement $u^{n+1}_{x,y}$ using the FDTD equation;
9      Apply source excitation at the source location;
10     if boundary condition is 'Mur' then
11         Apply Mur boundary conditions considering wave speed;
12     else if boundary condition is 'Dirichlet' then
13         Apply Dirichlet boundary conditions (zero displacement);
14     Update previous and current displacement fields;
15     Compute strains $\varepsilon_x$, $\varepsilon_y$, $\gamma_{xy}$ as gradients of the displacement field;
16     for each grid point $(x, y)$ except boundaries do
17         Calculate stresses $\sigma_{xx}$, $\sigma_{yy}$, $\sigma_{xy}$ using material properties and strains;
18     if calculate principal stresses then
19         Calculate principal stresses $\sigma_1$, $\sigma_2$;
20 for each time step $\Delta t$ do
21     if calculate principal stresses then
22         Update the plots with $\sigma_1$, $\sigma_2$;
23     else
24         Update the plots with $\sigma_{xx}$, $\sigma_{yy}$, $\sigma_{xy}$;
25     Render the animation frame;

1.7 Problems

The proper functionality of the generated code for the wave equation model and the FDTD solution scheme shall be tested in two problems. The problem defined in Table 1 has a corresponding reference code available on the internet [7]; hence all parameters and values are selected to match the reference code. Here, the time step size is significantly lower than the critical time step. Since this problem merely aims at a quantitative, numerical analysis of wave propagation, we can safely restrict the parameters to their dimensions instead of explicit units. The output from GPT-4's code and the reference code is compared and discussed in Sect. 4.

Table 1 Problem 1: Settings for the displacement study

| Parameters | Values/Types | Dimensions |
|---|---|---|
| Domain size ($N$) | 200 | [L] |
| Domain type | Homogeneous | |
| Wave speed ($c$) | 0.2 | [L T⁻¹] |
| Spatial step size ($h$) | 1 | [L] |
| Critical time step size ($\Delta t_{\max}$) | 2.5 | [T] |
| Time step size ($\Delta t$) | 1 | [T] |
| BC | Dirichlet and Mur | |
| Type of excitation function | Sinusoidal | |
| Amplitude ($A$) | 80 | [L] |
| Frequency | 1/50 | [T⁻¹] |
| Source point | (100, 100) | [L] |
| Output | Displacement animation | |

The problem in Table 2 adds further complexity to the previous problem in terms
of a heterogeneous domain, two types of excitation functions, strain and stress cal-
culations and two types of stress plots. Here, the time step size is equal to the critical
time step.

Table 2 Problem 2: Settings for the stress study

| Parameters | Values/Types | Units |
|---|---|---|
| Domain size ($N$) | 200 | mm |
| Domain type | Heterogeneous | |
| Upper half material | Aluminum | |
| Lower half material | Chromium | |
| Young's modulus of chromium | 280 | GPa |
| Poisson's ratio of chromium | 0.2 | |
| Density of chromium | 7100 | kg/m³ |
| Young's modulus of aluminum | 70 | GPa |
| Poisson's ratio of aluminum | 0.33 | |
| Density of aluminum | 2700 | kg/m³ |
| Spatial step size ($h$) | 1 | mm |
| Critical time step size ($\Delta t_{\max}$) | 7.96e-5 | s |
| Time step size ($\Delta t$) | Equal to $\Delta t_{\max}$ | s |
| BC | Dirichlet and Mur | |
| Type of excitation function | Sinusoidal or Gaussian | |
| Amplitude ($A$) | 80 | mm |
| Frequency | 1/50 | Hz |
| Source point | (100, 100) | mm |
| Output | Stress animation | |

2 Prompt

The final prompt is given in Fig. 2. It is used for generating Code Listing 1. The proper labelling of the plots had to be done manually.
For Problem 2, the stress study, the level of flexibility aimed for could not be captured in a single prompt. Code Listing 2 was generated iteratively. In addition, manual intervention was carried out to label the plots.

Fig. 2 Prompt for Problem 1: displacement study



Fig. 2 (continued)

3 Code Listing

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Simulation parameters
size = 200     # domain size
dx = dy = 1.0  # spatial discretization
dt = 1.0       # time step
c = 0.2        # wave speed

# Create a spatial grid
x = np.linspace(0, size-1, size)
y = np.linspace(0, size-1, size)
X, Y = np.meshgrid(x, y)

# Initialize fields
u = np.zeros((size, size))       # current wave field
u_prev = np.zeros((size, size))  # previous wave field
u_next = np.zeros((size, size))  # next wave field

# Source parameters
source_amplitude = 80
source_frequency = 0.04
source_position = (100, 100)

def source(t):
    """ Returns the source value at time t """
    return source_amplitude * np.sin(2 * np.pi * source_frequency * t)

def update(u, u_prev, t, boundary_condition='Mur'):
    """ Update wave field for the next time step using the FDTD method """
    # Core update (finite difference)
    for i in range(1, size-1):
        for j in range(1, size-1):
            u_next[i, j] = (c**2 * dt**2 / dx**2) * (u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1] - 4*u[i, j]) + 2*u[i, j] - u_prev[i, j]

    # Apply source
    u_next[source_position] += source(t)

    # Apply boundary conditions
    if boundary_condition == 'Mur':  # Absorbing boundary
        for j in range(size):
            # Left and right boundaries
            u_next[0, j] = u[1, j] - (dx-c*dt)/(dx+c*dt) * (u_next[1, j] - u[0, j])
            u_next[-1, j] = u[-2, j] - (dx-c*dt)/(dx+c*dt) * (u_next[-2, j] - u[-1, j])
            # Top and bottom boundaries
            u_next[j, 0] = u[j, 1] - (dx-c*dt)/(dx+c*dt) * (u_next[j, 1] - u[j, 0])
            u_next[j, -1] = u[j, -2] - (dx-c*dt)/(dx+c*dt) * (u_next[j, -2] - u[j, -1])
    elif boundary_condition == 'Dirichlet':  # Reflecting boundary
        u_next[0, :] = u_next[-1, :] = u_next[:, 0] = u_next[:, -1] = 0

    # Update arrays for next iteration
    u_prev[:, :] = u[:, :]
    u[:, :] = u_next[:, :]

def animate(i):
    """ Animation function """
    update(u, u_prev, i, boundary_condition='Mur')  # Toggle 'Mur' or 'Dirichlet'
    img.set_data(u)
    img.set_clim(-source_amplitude, source_amplitude)
    return img,

# Setup plot for animation
fig, ax = plt.subplots()
img = ax.imshow(u, origin='lower', cmap='viridis',
                interpolation='nearest', animated=True, vmin=-source_amplitude, vmax=source_amplitude)
ax.set_title("Wave Propagation in homogeneous medium", fontsize=12)
ax.set_xlabel("x-axis", fontsize=12)
ax.set_ylabel("y-axis", fontsize=12)
cbar = fig.colorbar(img, ax=ax)
cbar.set_label('Wave Amplitude')

# Create animation
ani = FuncAnimation(fig, animate, frames=1000, interval=50, blit=True)

# Optionally save the animation
# ani.save('wave_propagation2.mp4', writer='ffmpeg')
plt.show()
Code Listing 1 Output by ChatGPT for solving Problem 1: Displacement Study

# -*- coding: utf-8 -*-
"""
Created on Fri Jul 12 19:15:36 2024

@author: Saad Qureshi
"""

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation, rc

# Constants and Material Properties
E1 = 280e9   # Young's modulus in Pascals for first half (Chromium)
nu1 = 0.2    # Poisson's ratio for first half
rho1 = 7100  # Density in kg/m^3 for first half
E2 = 70e9    # Young's modulus in Pascals for second half (Aluminum)
nu2 = 0.33   # Poisson's ratio for second half
rho2 = 2700  # Density in kg/m^3 for second half
c1 = np.sqrt(E1 / rho1)  # Wave speed for first half
c2 = np.sqrt(E2 / rho2)  # Wave speed for second half

domain_size = 200
dx = 1.0
dy = 1.0
dt = dx / (max(c1, c2)*(2))  # CFL condition for stability

calculate_principal_stresses = True
boundary_type = 'Mur'  # Can be 'Mur' or 'Dirichlet'
excitation_type = 'gaussian'  # Can be 'sinusoidal' or 'gaussian'

# Initialize fields
x = np.linspace(0, domain_size-1, domain_size)
y = np.linspace(0, domain_size-1, domain_size)
u = np.zeros((domain_size, domain_size))
u_prev = np.zeros((domain_size, domain_size))
u_next = np.zeros((domain_size, domain_size))
stress_xx = np.zeros((domain_size, domain_size))
stress_yy = np.zeros((domain_size, domain_size))
stress_xy = np.zeros((domain_size, domain_size))

# Source definitions
sources = [{"position": (100, 100), "frequency": 0.04, "amplitude": 80}]

def source(t, frequency, amplitude, excitation_type):
    if excitation_type == 'sinusoidal':
        omega = 2 * np.pi * frequency
        return amplitude * np.sin(omega * t)
    elif excitation_type == 'gaussian':
        return amplitude * np.exp(-((t - 5) ** 2) / (2 * (0.5 ** 2)))

# Function to get properties based on location
def get_properties(i):
    if i < domain_size // 2:
        return E1, nu1, c1
    else:
        return E2, nu2, c2

# Update function with Mur boundary conditions adjusted per wave speed
def update(t):
    global u, u_prev, u_next, stress_xx, stress_yy, stress_xy
    # Apply dynamic updates within the domain
    for i in range(1, domain_size-1):
        for j in range(1, domain_size-1):
            E, nu, c = get_properties(i)
            u_next[i, j] = (2 * u[i, j] - u_prev[i, j] +
                            c**2 * dt**2 / dx**2 * (u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1] - 4 * u[i, j]))

    # Apply sources
    for source_info in sources:
        src_x, src_y = source_info["position"]
        u_next[src_x, src_y] += source(t, source_info["frequency"], source_info["amplitude"], excitation_type)

    # Apply Mur boundary conditions correctly for wave speed
    if boundary_type == 'Mur':
        for j in range(1, domain_size-1):
            c_left = get_properties(1)[2]
            c_right = get_properties(domain_size-2)[2]
            u_next[0, j] = u_prev[1, j] + (c_left * dt - dx) / (c_left * dt + dx) * (u_next[1, j] - u_prev[0, j])
            u_next[-1, j] = u_prev[-2, j] + (c_right * dt - dx) / (c_right * dt + dx) * (u_next[-2, j] - u_prev[-1, j])

        for i in range(1, domain_size-1):
            c_top = get_properties(i)[2]
            c_bottom = get_properties(i)[2]
            u_next[i, 0] = u_prev[i, 1] + (c_bottom * dt - dy) / (c_bottom * dt + dy) * (u_next[i, 1] - u_prev[i, 0])
            u_next[i, -1] = u_prev[i, -2] + (c_top * dt - dy) / (c_top * dt + dy) * (u_next[i, -2] - u_prev[i, -1])

    elif boundary_type == 'Dirichlet':
        u_next[0, :] = 0
        u_next[-1, :] = 0
        u_next[:, 0] = 0
        u_next[:, -1] = 0

    u_prev, u = u, u_next.copy()

    # Recalculate stresses
    grad_u_x = np.gradient(u, axis=0)  # Gradient along x-axis
    grad_u_y = np.gradient(u, axis=1)  # Gradient along y-axis
    for i in range(1, domain_size-1):
        E, nu, _ = get_properties(i)
        stress_xx[i, 1:-1] = E / (1 - nu**2) * (grad_u_x[i, 1:-1] + nu * grad_u_y[i, 1:-1])
        stress_yy[i, 1:-1] = E / (1 - nu**2) * (grad_u_y[i, 1:-1] + nu * grad_u_x[i, 1:-1])
        stress_xy[i, 1:-1] = E / (2 * (1 + nu)) * (grad_u_x[i, 1:-1] + grad_u_y[i, 1:-1])

    if calculate_principal_stresses:
        # Calculate principal stresses
        sigma_avg = (stress_xx + stress_yy) / 2
        sigma_diff = (stress_xx - stress_yy) / 2
        R = np.sqrt(sigma_diff**2 + stress_xy**2)
        principal_stress_1 = sigma_avg + R
        principal_stress_2 = sigma_avg - R
        return principal_stress_1, principal_stress_2
    else:
        return stress_xx, stress_yy, stress_xy

# Define the animation function and plot setup
fig, ax = plt.subplots(1, 3, figsize=(18, 6))

# Initial settings for stress_xx plot
im1 = ax[0].imshow(stress_xx, origin='lower', extent=[0, domain_size * dx, 0, domain_size * dy], cmap='viridis', vmin=np.min(stress_xx), vmax=np.max(stress_xx))
cbar1 = fig.colorbar(im1, ax=ax[0])
cbar1.set_label('Stress')
ax[0].set_title('Stress 1')
ax[0].set_xlabel('x axis (mm)')
ax[0].set_ylabel('y axis (mm)')

# Initial settings for stress_yy plot
im2 = ax[1].imshow(stress_yy, origin='lower', extent=[0, domain_size * dx, 0, domain_size * dy], cmap='viridis', vmin=np.min(stress_yy), vmax=np.max(stress_yy))
cbar2 = fig.colorbar(im2, ax=ax[1])
cbar2.set_label('Stress')
ax[1].set_title('Stress 2')
ax[1].set_xlabel('x axis (mm)')
ax[1].set_ylabel('y axis (mm)')

# Initial settings for stress_xy plot
im3 = ax[2].imshow(stress_xy, origin='lower', extent=[0, domain_size * dx, 0, domain_size * dy], cmap='viridis', vmin=np.min(stress_xy), vmax=np.max(stress_xy))
cbar3 = fig.colorbar(im3, ax=ax[2])
cbar3.set_label('Stress')
ax[2].set_title('Stress 3')
ax[2].set_xlabel('x axis (mm)')
ax[2].set_ylabel('y axis (mm)')

# Main title for the figure
fig.suptitle('Stress Distribution Over Time', fontsize=16)

# Function to update plots
def animate(t):
    if calculate_principal_stresses:
        stress_1, stress_2 = update(t)
        im1.set_data(stress_1)
        im2.set_data(stress_2)
        im3.set_data(np.zeros_like(stress_1))  # Placeholder if only two plots needed
    else:
        stress_xx, stress_yy, stress_xy = update(t)
        im1.set_data(stress_xx)
        im2.set_data(stress_yy)
        im3.set_data(stress_xy)
    return im1, im2, im3

ani = animation.FuncAnimation(fig, animate, frames=200, interval=50, blit=True)
plt.tight_layout()
ani.save('stress_ani2.mp4', writer='ffmpeg')
plt.show()
Code Listing 2 Output by ChatGPT for solving Problem 2: Stress Study

4 Tests for Verification

For the displacement study in Problem 1, a reference code [7], equally based on the FDTD method, is used to test the output of GPT-4's code. The problem parameters as listed in Table 1 are chosen exactly the same for a proper comparison.

4.1 Test 1

The output is compared for both types of BCs.


Figure 3 indicates total reflection for reflecting BCs. Vice versa, Fig. 4 indicates total absorption for absorbing BCs. Furthermore, the same patterns emerge in both cases. Hence, it is concluded that the ICs, BCs and the FDTD method are working as intended.

Fig. 3 Snapshot of the amplitude animation in case of reflecting BCs: (a) Reference code, (b) GPT-4 code

Fig. 4 Snapshot of the amplitude animation in case of absorbing BCs: (a) Reference code, (b) GPT-4 code

4.2 Test 2

For the stress study in Problem 2, the solution shall be tested for plausibility for a layered domain in terms of stresses and strains. Note that there are no units associated with the colormap. In the code, the minimum and maximum of the colormap are defined as the minimum and maximum of the respective stress, but due to the dynamic nature of the plot, the program is unable to determine a useful range for the colormap. From Code Listing 2, the correct calculation of strains and stresses is easily verified. Furthermore, Fig. 5 indicates that the stress field travels faster in the lower half (chromium) of the domain than in the upper half (aluminum), which is consistent with chromium having a higher wave speed, 6280 m/s, than aluminum, 5090 m/s. These conclusions can further be applied to the other excitation function and BC.

(a) Normal stresses

(b) Principal stresses 1 and 2

Fig. 5 Stress contour plots for Gaussian excitation along with Mur BCs and a higher speed of sound in the lower half of the simulation domain

5 Discussion

The methodology to get the desired output is:

• Start out with a detailed prompt, covering all the features required in the code.
• Test the code. In case of errors, convey the errors to GPT-4 together with what the correct output should be. Keep iterating until code with correct output is generated.
• Finally, modify the prompt with the knowledge gained from the iterations. Ask GPT-4 to focus especially on areas where the most frequent errors occurred. Briefly explain the structure of the required code in terms of the different functions etc., using the knowledge of the code with correct output.

The methodology mentioned above works for the displacement study of Problem 1, but does not work properly for the stress study in a heterogeneous domain referred to as Problem 2; the reason is that the multitude of requested features seemingly cannot be captured successfully in one prompt in a way that the chatbot can process.
For that reason, Problem 2 calls for a prompt engineering strategy in which all the required features of the code are communicated over several prompts. All attempts to generalize these successive prompts have failed, since they resulted in a large variety of different code outputs generated by GPT-4 in each try. It is this kind of instability, generating different code of unguaranteed quality for the same prompt, which forbids presenting a single prompt for Problem 2 in this chapter. Instead, we restrict ourselves to the generated code that finally solved the problem.
Some of the important aspects noticed while working on the prompts, generating code and testing are:

• GPT-4 frequently uses variables that were not defined in the code. Hence, a
NameError shows up while running the code.
• When a detailed prompt is provided to GPT-4, it sometimes ignores certain fea-
tures completely. Sometimes it defines a certain function only as a comment line
describing its usage, and the implementation part is left empty.
• Another common error of GPT-4 is not to define the origin of the domain correctly.
The .x-axis of the plot would start from the bottom left corner of the square and
the . y-axis from the top left corner of the square.
• GPT-4 makes a lot of errors regarding the animation part of the code. A lot of
debugging and iterations are required to make that part of the code function as
intended.
• Even if an expression is explicitly defined, GPT-4 would sometimes use a different
one. For example, CFL formula to calculate .t would be explained in the prompt,
but GPT-4 would use a different formula.
• A recurring issue is GPT-4 coming to a halt while generating code. The speed of
the text generation is very slow.
• Reproducibility remains a severe issue for GPT-4. While testing the prompt mentioned
in Code Listing 8.1, the correct output is generated most of the time, but
not always. In case there are some errors in the output code, regenerating the code
usually results in removal of those errors.
• GPT-4 makes a lot of errors while implementing absorbing BC, which would result
in reflection of waves at the boundaries. To fix this, the discretized formula (10)
is included in the prompt as a reference, which resolved the issue.

6 Conclusion

GPT-4 was used to numerically solve the 2D wave equation using finite difference
time domain scheme. As it turned out, GPT-4 exhibits enough domain expertise both
for the wave problem and the numerical method to generate a correct code solution
for a wave propagation displacement study along with absorbing as well as reflecting
boundary conditions. In a problem of wave propagation in a heterogeneous medium
along with stress computation, hence a task of increased complexity, GPT-4 revealed
deficits which called for manual/human intervention. Despite its demonstrated per-
formance in code generation, GPT-4 showed its current limitations in this project;
it suffers (i) from unstable code generation, where exactly the same prompt results
in different outputs, and (ii) from the inability to account for all the different tasks
in a prompt for complex problems. These disadvantages make formulating single
prompts for complex problems difficult, and make successive prompts and manual
intervention unavoidable.

Electromagnetic Wave Propagation
in Dielectric Media

Tejas Viresh Anvekar and Bernhard Eidel

Abstract This study leverages the Generative Pre-trained Transformer, specifically
ChatGPT-4, to develop a Python-based application for analyzing the
propagation of electromagnetic waves in dielectric media using the two-dimensional
Finite-Difference Time-Domain (FDTD) approach. The primary aim is to examine
how electromagnetic waves interact with various dielectric environments, focusing
on their reflection, transmission, and absorption properties. This work aims to exploit
the capabilities of ChatGPT-4 to craft an accurate simulation tool. The performance
of the generated FDTD simulations by ChatGPT-4 is evaluated. The findings suggest
that ChatGPT-4 successfully creates FDTD program codes that conform to expected
physical outcomes, albeit with slight variances. This research underscores the poten-
tial and accuracy of AI in handling sophisticated electromagnetic simulation tasks.

1 Introduction

The study of electromagnetic wave propagation through various media is essential for
developments in telecommunications, radar, and electronic systems engineering. The
Finite-Difference Time-Domain (FDTD) method, pioneered by Yee in 1966, offers a
dynamic numerical technique for solving Maxwell’s equations in the time-domain.
This method is highly regarded for its direct algorithmic form, which is capable of
managing complex boundary conditions (BC) and varying media properties.
With the advancement of computational methods, FDTD has been widely applied
in conjunction with fluid dynamics concepts to enhance the study of wave propa-
gation in various media. Fluid dynamics principles, such as those encapsulated by
the Boltzmann equation, allow for the modeling of wave interactions in complex


media. By coupling fluid dynamic behavior (through the distribution of particles or


fields) with electromagnetic wave propagation, we gain a deeper understanding of
the interaction of waves with different materials. The Boltzmann-FDTD method,
which bridges the dynamics of particles with wave behavior, captures essential char-
acteristics such as reflection, refraction, and absorption. This approach, grounded in
fluid dynamics, enhances the accuracy of simulations by incorporating the underly-
ing physical properties of the medium, offering insights into energy transfer, wave
dynamics, and material interactions.
With the advent of artificial intelligence (AI), particularly advancements in
machine learning and algorithm generation, new possibilities have emerged to
enhance and automate numerical simulations. The Generative Pre-trained
Transformer, ChatGPT-4, by OpenAI, is a testament to these advancements,
providing sophisticated language comprehension and code generation capabilities.
This work employs ChatGPT-4 to devise a Python program that simulates electro-
magnetic wave interactions in dielectric media using the FDTD method.
The primary goal of this research is to create a comprehensive simulation platform
to study how electromagnetic waves, specifically Gaussian pulses and plane waves,
interact within different dielectric environments. These wave types are chosen for
their relevance in practical applications and their distinct propagation characteris-
tics, which are crucial for a detailed study of wave phenomena including reflection,
transmission, absorption, and scattering.
One of the key challenges in simulating electromagnetic wave propagation is
choosing the appropriate dimensionality of the model. While 3D simulations offer
the most detailed analysis, they are computationally expensive and time-consuming.
To strike a balance between accuracy and computational efficiency, this work tran-
sitions from a 3D model to a 2D cross-sectional simulation. The 2D model captures
essential phenomena such as reflection, refraction, and absorption while significantly
reducing the computational burden. This approach is illustrated in Fig. 1, where the
3D simulation box with a spherical dielectric medium is simplified to a 2D cross-
sectional view.
The dielectric medium used in this study is central to the investigation of electro-
magnetic wave behavior. The medium is characterized by a specific relative permittiv-
ity and conductivity, which are chosen to allow detailed observation of critical wave
phenomena. The dielectric cylinder placed at the center of the simulation domain,
as shown in Fig. 1, serves as a test object to visualize reflection, refraction, and
absorption of electromagnetic waves. These phenomena are not only fundamental to
understanding wave-material interactions but are also crucial for the development of
practical devices in communications and radar systems.
This work also explores various BC, particularly the Perfectly Matched Layer
(PML), to minimize reflections at the boundaries of the computational domain, which
is vital for accurate simulation of open-region electromagnetic problems. By integrat-
ing ChatGPT-4 in the development process, this study seeks to merge computational
electromagnetics with AI, enabling rapid prototyping and validation of electromag-
netic models. The simulations produced by ChatGPT-4 will be rigorously compared
with analytical models and established FDTD software outputs to assess their accu-

(a) 3D Simulation Box (b) 2D Cross-Sectional View

Fig. 1 Transition from 3D to 2D simulation for efficient analysis. The left image (a) shows the 3D
simulation setup, while the right image (b) illustrates the 2D cross-sectional view used for the
simulation, with a cylindrical dielectric medium of diameter d = 20 cm. The dielectric cylinder is
positioned at the center of the domain. The Gaussian source emits in all directions

racy and effectiveness. This investigation not only evaluates the reliability of AI in
generating viable scientific computing algorithms but also examines its potential to
simplify complex simulation tasks [1, 6, 7].

1.1 Electro-Magnetic Theory

Maxwell’s equations form the foundation of classical electromagnetism, describing


how electric and magnetic fields propagate and interact with matter. In a dielectric
medium, these equations can be expressed in a differential form as follows [1]:

∇ · D = ρ, (1)
∇ · B = 0, (2)
∇ × E = −∂B/∂t, (3)
∇ × H = J + ∂D/∂t, (4)

where E is the electric field, H is the magnetic field, D is the electric flux density, B is
the magnetic flux density, ρ is the charge density, and J is the current density. These
equations can be discretized for computational purposes using the FDTD method.

Fig. 2 Illustration of a Transverse Magnetic (TM) wave. The magnetic field is confined to the x-y
plane, while the electric field has a component in the z direction

In a 2D FDTD simulation, we consider TM (Transverse Magnetic) waves,
where the electric field has only a z-component (E_z), and the magnetic fields have
x and y components (H_x, H_y). In this mode, the electric field E_z is oriented along
the z-axis, while the magnetic field components H_x and H_y are oriented in the x and
y directions, respectively, and are transverse to the direction of wave propagation, as
visualized in Fig. 2. Maxwell's curl equations for this mode are

∂H_x/∂t = −(1/μ) ∂E_z/∂y, (5)
∂H_y/∂t = (1/μ) ∂E_z/∂x, (6)
∂E_z/∂t = (1/ε) (∂H_y/∂x − ∂H_x/∂y), (7)

where ε is the permittivity and μ is the permeability of the medium. These equations
are discretized in both time and space, allowing for the fields to be updated iteratively
at each time step.

1.2 Dimensionality Reduction

To address computational complexity and memory requirements in FDTD simulations,
it is standard practice to implement these simulations on a 2-D lattice. In
this work, we assume that the field components are independent of the z coordinate
(∂/∂z = 0). Under this assumption, the plane electromagnetic field can be decomposed
into transverse electric (TE) and transverse magnetic (TM) waves. These wave
modes are characterized by the following relations [3]:

TE: E_z = 0, H_x = H_y = 0, (8)
TM: H_z = 0, E_x = E_y = 0. (9)

Considering the TM wave, the update equation for the electric field component
E_z is

E_z^{n+1}(i, j) = E_z^n(i, j) + (Δt/ε) [ (H_y^n(i, j) − H_y^n(i−1, j))/Δx − (H_x^n(i, j) − H_x^n(i, j−1))/Δy ]. (10)

For the magnetic field components H_x and H_y the update equations are

H_x^{n+1/2}(i, j) = H_x^{n−1/2}(i, j) − (Δt/μ) (E_z^n(i, j+1) − E_z^n(i, j))/Δy, (11)

H_y^{n+1/2}(i, j) = H_y^{n−1/2}(i, j) + (Δt/μ) (E_z^n(i+1, j) − E_z^n(i, j))/Δx. (12)

Here, Δx and Δy represent the spatial steps in the x and y directions, respectively,
while Δt denotes the time step size. The indices i and j refer to the grid points
in space, and n indicates the time step. The magnetic field components H_x and H_y
are staggered with respect to the electric field component E_z. This staggering is
crucial for accurately calculating the fields at each time step, following the Yee cell
structure, which is a standard approach in FDTD simulations. The interleaving of
these fields, as depicted in Fig. 3, ensures that the electromagnetic fields propagate
correctly throughout the computational domain. These update equations are applied
iteratively to achieve this propagation.
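
To make the staggered update concrete, the following minimal NumPy sketch implements Eqs. (10)-(12) in vectorized form; the array and parameter names (Ez, Hx, Hy, eps, mu, dt, dx, dy) are illustrative choices, not prescribed by the text.

import numpy as np

def tm_update(Ez, Hx, Hy, eps, mu, dt, dx, dy):
    # Eq. (11): advance Hx half a time step (staggered in y)
    Hx[:, :-1] -= (dt / (mu * dy)) * (Ez[:, 1:] - Ez[:, :-1])
    # Eq. (12): advance Hy half a time step (staggered in x)
    Hy[:-1, :] += (dt / (mu * dx)) * (Ez[1:, :] - Ez[:-1, :])
    # Eq. (10): advance Ez on the interior nodes from the curl of H
    Ez[1:-1, 1:-1] += (dt / eps) * (
        (Hy[1:-1, 1:-1] - Hy[:-2, 1:-1]) / dx
        - (Hx[1:-1, 1:-1] - Hx[1:-1, :-2]) / dy)
    return Ez, Hx, Hy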

Fig. 3 Interleaving of the E and H fields for the two-dimensional TM formulation, sketch adopted
from [4]

1.3 Absorbing Boundary Conditions (BC)

In FDTD simulations, accurate boundary conditions (BC) are essential to minimize


reflections from the edges of the computational domain, which can interfere with
the results. One of the most effective methods for absorbing outgoing waves is the
perfectly matched layer (PML) BC.
The PML is designed to absorb electromagnetic waves at the boundary of the
simulation domain, preventing them from reflecting back into the interior. This is
achieved by gradually increasing the material’s conductivity in the PML region,
which attenuates the wave as it propagates through the layer. The key to the effec-
tiveness of PML is that it matches the impedance of the interior domain, ensuring
minimal reflection.

1.3.1 Updating with PML

The implementation of PML in a 2D FDTD simulation involves modifying the update


equations for the electric and magnetic fields in the PML region. The electric and
magnetic field components are split into two auxiliary variables, which are updated
separately. Maxwell’s equations can be updated as follows [5]

D_z^{n+1/2}(i, j) = gi3(i) gj3(j) D_z^{n−1/2}(i, j) + 0.5 gi2(i) gj2(j)
× [ H_y^n(i+1/2, j) − H_y^n(i−1/2, j) − H_x^n(i, j+1/2) + H_x^n(i, j−1/2) ]. (13)

As illustrated in Fig. 4, the parameter distribution within the PML ensures that outgo-
ing waves are effectively absorbed. The overlapping regions at the corners indicate
where both sets of parameters intersect, which is crucial for minimizing reflections.
The parameters gi2, gi3, gj2, gj3 are given by

gi2 = gj2 = 1 / (1 + σΔt/(2ε₀)), (14)

gi3 = gj3 = (1 − σΔt/(2ε₀)) / (1 + σΔt/(2ε₀)). (15)

Fig. 4 This figure illustrates the parameter distribution within the PML (Perfectly Matched Layer)
region of a 2D FDTD simulation. The overlapping regions at the corners indicate where both sets
of parameters intersect, ensuring effective absorption of outgoing waves

The curl of the electric field can be given by

∇ × E = E_z^{n+1/2}(i+1, j) − E_z^{n+1/2}(i, j). (16)

The incident magnetic field in the y-direction can be given by

I_Hy^{n+1/2}(i+1/2, j) = I_Hy^{n−1/2}(i+1/2, j) + ∇ × E. (17)

The total magnetic field in the y-direction can be evaluated recursively as follows

H_y^{n+1/2}(i+1/2, j) = gi3(i+1/2) H_y^n(i+1/2, j)
− 0.5 gi2(i+1/2) [ ∇ × E − gi1(j) I_Hy^n(i+1/2, j) ], (18)

where gi1 is given by

gi1 = σΔt/(2ε₀). (19)

The incident magnetic field in the x-direction is given by

I_Hx^{n+1/2}(i, j+1/2) = I_Hx^{n−1/2}(i, j+1/2) + ∇ × E. (20)

The total magnetic field in the x-direction can be obtained by

H_x^{n+1}(i, j+1/2) = gj3(j+1/2) H_x^n(i, j+1/2)
+ 0.5 gj2(j+1/2) [ ∇ × E + gj1(j) I_Hx^{n+1/2}(i, j+1/2) ]. (21)

1.3.2 Parameters for PML

The effectiveness of the PML depends on the choice of parameters such as the con-
ductivity profile. A common approach is to use a polynomial grading of conductivity,
which gradually increases from zero at the interface to a maximum value at the outer
edge of the PML. The conductivity .σ in the PML can be expressed as [4]
σ(x) = σmax (x/dPML)^m, (22)

where dPML is the thickness of the PML, σmax is the maximum conductivity, and m
is the grading order. By carefully selecting these parameters, the PML can effectively
absorb outgoing waves over a broad range of angles and frequencies, ensuring
minimal reflection and accurate simulation results.
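
As a minimal sketch of Eq. (22) together with the coefficients of Eqs. (14)-(15), assuming a one-dimensional cell index across a PML of thickness d_pml (all names and the placeholder dt value are illustrative):

import numpy as np

eps0 = 8.854e-12   # vacuum permittivity (F/m)
dt = 2.1e-11       # time step (s); placeholder value for illustration
d_pml, sigma_max, m = 20, 0.1, 3   # thickness (cells), max conductivity, grading order

depth = np.arange(1, d_pml + 1)              # cell depth measured from the interface
sigma_pml = sigma_max * (depth / d_pml)**m   # Eq. (22): polynomial grading

x = sigma_pml * dt / (2 * eps0)
g2 = 1.0 / (1.0 + x)                         # Eq. (14): gi2 = gj2
g3 = (1.0 - x) / (1.0 + x)                   # Eq. (15): gi3 = gj3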

1.4 Gaussian Sinusoidal Pulse

In FDTD simulations, a Gaussian sinusoidal pulse is often used as a source due


to its localized and broadband nature. The pulse is defined by a Gaussian envelope
modulating a sinusoidal carrier wave, which can be mathematically represented
as [4]

E(t) = E₀ exp( −(t − t₀)² / (2σ²) ) sin(2π f_c t), (23)

where E₀ is the peak amplitude, t₀ is the time delay, σ is the pulse width, and f_c is
the central frequency of the sinusoidal wave. The Gaussian envelope ensures that the
pulse is localized in time, while the sinusoidal component determines its frequency
content.
In the context of FDTD, the source can be introduced into the simulation domain
by updating the electric field component at a specific grid point. This allows the study
of the pulse’s interaction with various media and boundaries.
In terms of energy, changes in the energy density within a volume are due to the
net flux of energy across the volume's boundaries. The stabilization of the total energy
over time (see Fig. 6) indicates conservation of internal energy after accounting for
boundary absorption [2].
In this work, a circular dielectric medium is introduced into the 2D FDTD computational
domain to investigate the interaction of electromagnetic waves with a
dielectric object, as illustrated in Fig. 1. The dielectric cylinder is characterized by a
relative permittivity (εr) of 30 and a conductivity (σ) of 0.3 S m⁻¹. The diameter of the
cylinder is 20 cm, representing both lossless and lossy linear isotropic homogeneous
media (LIHMs).
A lossless linear isotropic homogeneous medium (LIHM) is one where the material
has zero conductivity (σ = 0 S m⁻¹). In such a medium, electromagnetic waves
can propagate without any energy loss. The material's properties, like relative permittivity
(εr), affect the speed and direction of the wave, but not its amplitude.
In contrast, a lossy LIHM is a material with non-zero conductivity (σ > 0 S m⁻¹).
In this case, the medium absorbs some of the electromagnetic wave's energy, leading
to attenuation of the wave as it propagates. The dielectric cylinder in this work,
characterized by σ = 0.3 S m⁻¹, is an example of a lossy medium, which allows the
study of both idealized (lossless) and realistic (lossy) conditions.

The presence of the dielectric cylinder affects the propagation of the electromag-
netic waves, causing phenomena such as reflection, refraction, and scattering. These
effects are crucial for understanding the behavior of waves in real-world applications,
such as radar and wireless communications.
The update equations for the FDTD simulation in the presence of a dielectric
medium are modified to account for the material properties. For the electric field
component E_z, the update equation reads

E_z^{n+1}(i, j) = [ (1 − σΔt/(2ε)) / (1 + σΔt/(2ε)) ] E_z^n(i, j)
+ [ (Δt/ε) / (1 + σΔt/(2ε)) ] × [ (H_y^n(i, j) − H_y^n(i−1, j))/Δx − (H_x^n(i, j) − H_x^n(i, j−1))/Δy ], (24)

where ε is the permittivity of the dielectric medium, Δt is the time step, and Δx and
Δy are the spatial steps in the x and y directions, respectively.
For the magnetic field components H_x and H_y, the update equations remain the
same as in the free space case, as the magnetic permeability is assumed to be that of
free space (μ = μ₀):

H_x^{n+1/2}(i, j) = H_x^{n−1/2}(i, j) − (Δt/μ₀) (E_z^n(i, j+1) − E_z^n(i, j))/Δy, (25)

H_y^{n+1/2}(i, j) = H_y^{n−1/2}(i, j) + (Δt/μ₀) (E_z^n(i+1, j) − E_z^n(i, j))/Δx. (26)
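
A minimal vectorized sketch of the lossy-medium update, Eq. (24), could look as follows, with eps and sig understood as per-cell material arrays (names are illustrative):

import numpy as np

def ez_update_lossy(Ez, Hx, Hy, eps, sig, dt, dx, dy):
    a = sig * dt / (2 * eps)      # dimensionless loss parameter per cell
    ca = (1 - a) / (1 + a)        # factor multiplying the old Ez, Eq. (24)
    cb = (dt / eps) / (1 + a)     # factor multiplying the curl of H, Eq. (24)
    curl_h = np.zeros_like(Ez)
    curl_h[1:-1, 1:-1] = ((Hy[1:-1, 1:-1] - Hy[:-2, 1:-1]) / dx
                          - (Hx[1:-1, 1:-1] - Hx[1:-1, :-2]) / dy)
    return ca * Ez + cb * curl_h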

By integrating the circular dielectric medium into the FDTD simulation, the study
aims to provide insights into the complex interactions of electromagnetic waves with
dielectric materials. Understanding these interactions is essential for the design and
analysis of various electromagnetic systems, such as radar and wireless commu-
nication systems. The simulation helps to visualize and quantify the effects of the
dielectric medium on wave propagation, enabling better prediction and optimization
of system performance [4].
The algorithm for simulations of electromagnetic wave propagation in dielectric
media is given in the Algorithm Box 1.

Algorithm 1: 2D FDTD Electromagnetic Wave Simulation with PML and
Dielectric Medium
Data: Simulation parameters: Δx, Δy, Δt, Lx, Ly, T
Result: Animation and plots of electromagnetic wave propagation
1 Initialization:
2 – Define constants: c0, ε₀, μ₀
3 – Compute grid size Nx, Ny, and number of time steps Nt
4 – Initialize permittivity matrix εr and conductivity matrix σ
5 – Set up circular dielectric medium in εr and σ
6 – Initialize field arrays E_z, H_x, H_y
7 – Define source parameters: position, pulse width, frequency f0
8 – Apply PML boundary conditions to σ
9 Main Simulation Steps:
10 for each time step n from 0 to Nt − 1 do
11   – Calculate current time t = n × Δt
12   – Compute source pulse for time t and add it to E_z at the source position
13   – Update the electric field E_z using
       E_z^{n+1}(i, j) = [ (1 − σΔt/(2ε₀)) / (1 + σΔt/(2ε₀)) ] E_z^n(i, j) + [ (Δt/ε) / (1 + σΔt/(2ε₀)) ]
       × [ (H_y^n(i, j) − H_y^n(i−1, j))/Δx − (H_x^n(i, j) − H_x^n(i, j−1))/Δy ]
14   – Update magnetic fields H_x, H_y using
       H_x^{n+1/2}(i, j) = H_x^{n−1/2}(i, j) − (Δt/μ₀) (E_z^n(i, j+1) − E_z^n(i, j))/Δy
       H_y^{n+1/2}(i, j) = H_y^{n−1/2}(i, j) + (Δt/μ₀) (E_z^n(i+1, j) − E_z^n(i, j))/Δx
15   – Apply PML conditions and update fields using
       D_z^{n+1/2}(i, j) = gi3(i) gj3(j) D_z^{n−1/2}(i, j) + 0.5 gi2(i) gj2(j)
       × [ H_y^n(i+1/2, j) − H_y^n(i−1/2, j) − H_x^n(i, j+1/2) + H_x^n(i, j−1/2) ]
16   – if n mod 10 == 0 then
17     – Store current E_z field for animation
18     – Compute and store total energy in the system
19   end
20 end
21 Visualization:
22 – Create 2D animation of E_z over time
23 – Generate 3D surface plots of E_z at selected time steps
24 – Plot total electromagnetic energy as a function of time

1.5 Problem Description

The work is based on simulating 2D FDTD electromagnetic wave propagation in a


dielectric medium. The simulation will incorporate advanced features and visualiza-
tions of reflection, refraction, and absorption. The simulation setup includes various
physical constants, simulation parameters, material properties, field initialization,
source configuration, and PML boundary conditions (BC). The parameters chosen
are given in Table 1.
In this simulation, vacuum space is considered alongside a dielectric medium to
study the behavior of electromagnetic waves under ideal conditions. The inclusion of
vacuum allows us to observe the wave’s natural behavior, minimizing external influ-
ences such as dispersion and loss. This provides a clear baseline for understanding
wave phenomena such as reflection and refraction.

Table 1 Summary of constants, material properties, source configuration, and PML BCs

Parameters | Values/Types | Units

Constants:
Speed of light (c0) | 3 × 10⁸ | m s⁻¹
Vacuum permittivity (ε₀) | 8.854 × 10⁻¹² | F m⁻¹
Vacuum permeability (μ₀) | 4π × 10⁻⁷ | H m⁻¹

Material properties:
Relative permittivity (εr) | 30 | –
Conductivity (σ) | 0.30 | S m⁻¹
Diameter of dielectric cylinder | 0.20 | m

Simulation parameters:
Spatial step in x (Δx) | 0.01 | m
Spatial step in y (Δy) | 0.01 | m
Time step (Δt) | 0.9 × Δx/(c0 √2) | s
Domain length in x (Lx) | 2.0 | m
Domain length in y (Ly) | 2.0 | m
Position of dielectric center | (1.0, 1.0) | m
Total simulation time (T) | 1 × 10⁻⁷ | s

Source configuration:
Position (Nx/4, Ny/2) | (0.5, 1.0) | m
Pulse width | 1 × 10⁻¹⁰ | s
Frequency (f0) | 5 × 10⁶ | Hz

PML BC:
Thickness | 20 | cells
Max conductivity (σmax) | 0.1 | S m⁻¹

The dielectric medium, characterized by its relative permittivity and conductivity,


is included within the same simulation environment to observe how these material
properties affect wave propagation. By studying the interaction of waves with the
dielectric medium, we can analyze how it alters wave behavior in terms of reflection,
refraction, and absorption. This comprehensive approach ensures that the effects of
the dielectric medium on wave behavior are clearly understood within a controlled
simulation environment.

2 Prompt

To effectively utilize ChatGPT for formulating and solving a 2D FDTD electromag-


netic wave propagation simulation in a dielectric medium, the prompt must be struc-
tured with specific instructions and provide comprehensive and precise inputs. The
goal is to ensure that the problem is clearly articulated and all necessary parameters
are included to achieve a precise and correct solution. The task involves defining the
physical constants, simulation parameters, material properties, field initialization,
source configuration, PML BC, and analysis requirements perfectly. The prompt
must specify the following details:
• Physical Constants: Specify the speed of light, vacuum permittivity, and vacuum
permeability.
• Simulation Parameters: Describe the spatial steps, time step, domain size, and
simulation time.
• Material Properties: Define the relative permittivity and conductivity of the
dielectric medium, as well as its geometric placement in the domain.
• Field Initialization: Specify the initialization of the electric and magnetic fields.
• Source Configuration: Describe the source type, position, pulse width, and frequency.
• PML BC: Mention the thickness and maximum conductivity of the PML bound-
aries.
• Units: Ensure all inputs are given in consistent units to avoid errors in the simu-
lation.
The prompt must also include a set of instructions or guidelines to ensure that
ChatGPT understands the task clearly and displays the results as required. The fol-
lowing instructions are provided for this particular field of study:

• FDTD Setup: Instructions to define the spatial and time steps, initialize the field
arrays, and apply the FDTD update equations for both electric and magnetic fields.
Instructions to incorporate the material properties and PML BC into the simulation.
• Source Implementation: Instructions to define the source configuration, including
its position, pulse width, and frequency.
• Post-Processing: Instructions to compute the outputs like electric and magnetic
field distributions, visualize the fields over time, and analyze the wave interactions

with the dielectric medium. Wherever applicable, formulas or pseudo-code can be
provided to minimize assumptions made by ChatGPT.
• Visualization: The format of the output is mentioned in the prompt. Instructions
are provided to generate plots or animations of the field distributions with proper
legends, labels, and color maps for better understanding.
• Instruction to ask ChatGPT for comments in the code: Request comments
within the code to explain the steps and calculations performed.

As a computational physicist, generate a complete Python program for a 2D
Finite-Difference Time-Domain (FDTD) electromagnetic wave propagation
in dielectric medium simulation. The program must include the following
components and features, with the same function names as mentioned in this
script.
1. Import necessary libraries:
• numpy for numerical computations,
• matplotlib for plotting and animation,
• cm from matplotlib.
2. Define physical constants:
• Speed of light in vacuum (c0 = 3 × 10⁸ m s⁻¹),
• Permittivity of free space (ε₀ = 8.854 × 10⁻¹² F m⁻¹),
• Permeability of free space (μ₀ = 4π × 10⁻⁷ H m⁻¹).
3. Set up simulation parameters:
• Spatial steps (Δx, Δy) = (0.01, 0.01) m,
• Time step (Δt) calculated for stability = 0.9 × Δx/(c0 √2),
• Domain dimensions (Lx, Ly) = (2, 2) m,
• Total simulation time (T) = 1 × 10⁻⁷ s,
• Calculate grid sizes (Nx, Ny) and number of time steps (Nt).
4. Create material properties:
• Define a circular dielectric medium in the center of the domain: εr, σ,
• Center: center_x = 0.5 × Nx, center_y = 0.5 × Ny,
• Radius: radius = 0.20/Δx,
• Set relative permittivity (εr) = 30 and conductivity (σ) = 0.30 S m⁻¹
for the medium.
5. Initialize field arrays:
• Electric field (E_z),
• Magnetic fields (H_x, H_y).
6. Set up source parameters:
• Source position: (Nx//4, Ny//2),
• Pulse width: 1 × 10⁻¹⁰ s,
• Frequency: 5 × 10⁶ Hz.
7. Set up PML Boundary Conditions (BC):
• PML thickness: 20 cells,
• Maximum conductivity (σmax): 0.1 S m⁻¹.
Create a function apply_pml() using a for loop in the range of
pml_thickness. This function applies PML BC to the conductivity
array sigma in a 2D grid, which is used in FDTD simulations to absorb
outgoing waves at the boundaries and prevent reflections.
8. Create update functions for the electromagnetic fields:
• Includes updating the magnetic fields H_x and H_y, updating the electric
field E_z in a single function, then applying conductivity for absorption.
9. Simulation functions:
• Create a function run_simulation(): This function runs a simulation
of electromagnetic wave propagation and captures snapshots
of the electric field at regular intervals. The function should propagate
a Gaussian-modulated sinusoidal pulse wave in a medium characterized
by its relative permittivity (εr) and electrical conductivity (σ). It
captures the electric field (E_z) at every 10th time step and stores these
snapshots as 'frames' in a list for subsequent analysis or visualization.
• Create a function animate(frames, title): Create an animation
of the electric field from a series of frames and save it as a GIF
or MP4 file.
• Create a function run_simulation_3D_plot(): Capture snapshots
of the electric field (E_z) at specific time frames (100, 500, 1000, 1200)
for 3D plotting.
• Create a function plot_3d_surface(Ez, title): Plot a 3D
surface of the electric field component E_z.
• Create a function calculate_energy(Ez, Hx, Hy, eps_r, dx, dy):
Calculate the total electromagnetic energy in the system.
• Create a function run_simulation_with_energy(): Run a
simulation and track the energy of the system at regular intervals.
10. Run the simulation and plot the results:
• Call the function energies = run_simulation_with_energy()
and plot the results with appropriate labels and titles.
• Call the function frames = run_simulation().
• Call the function frames_1 = run_simulation_3D_plot().
• Call the function animate(frames, '2D FDTD Simulation of Gaussian Pulse').
• Plot the 3D surface plots for specific time intervals.

3 Code Listing

# Instructions to run the code:
# * The two commented-out ani.save(...) calls in animate() let the user save
#   this visualization as a GIF or as an MP4 video.
# * Saving as a GIF is recommended, since MP4 processing may take a while.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

# Constants
c0 = 3e8                 # Speed of light in vacuum
eps0 = 8.854e-12         # Permittivity of free space
mu0 = 4 * np.pi * 1e-7   # Permeability of free space

# Simulation parameters
dx = 0.01  # Spatial step (m)
dy = 0.01  # Spatial step (m)
dt = 0.9 * dx / (c0 * np.sqrt(2))  # Time step (s)
Lx = 2.0   # Domain length in x (m)
Ly = 2.0   # Domain length in y (m)
T = 1e-7   # Total simulation time (s)

Nx = int(Lx / dx)  # Number of spatial steps in x
Ny = int(Ly / dy)  # Number of spatial steps in y
Nt = int(T / dt)   # Number of time steps

# Material properties for circular dielectric medium
eps_r = np.ones((Nx, Ny))
sigma = np.zeros((Nx, Ny))
center_x = int(0.5 * Nx)
center_y = int(0.5 * Ny)
radius = int(0.20 / dx)

for i in range(Nx):
    for j in range(Ny):
        if (i - center_x)**2 + (j - center_y)**2 <= radius**2:
            eps_r[i, j] = 30
            sigma[i, j] = 0.3

# Field arrays
Ez = np.zeros((Nx, Ny))
Hx = np.zeros((Nx, Ny))
Hy = np.zeros((Nx, Ny))

# Source parameters
source_position = (Nx // 4, Ny // 2)
pulse_width = 1e-10
f0 = 5e6  # Frequency (Hz)

# PML boundary conditions
pml_thickness = 20  # Number of cells in PML
sigma_max = 0.1     # Maximum conductivity in PML

def apply_pml(sigma, pml_thickness, sigma_max):
    # Linearly graded conductivity on all four edges of the domain
    for i in range(pml_thickness):
        sigma[i, :] = sigma_max * (pml_thickness - i) / pml_thickness
        sigma[-i-1, :] = sigma_max * (pml_thickness - i) / pml_thickness
        sigma[:, i] = sigma_max * (pml_thickness - i) / pml_thickness
        sigma[:, -i-1] = sigma_max * (pml_thickness - i) / pml_thickness

apply_pml(sigma, pml_thickness, sigma_max)

def update_fields(Ez, Hx, Hy, eps_r, sigma):
    # Update magnetic fields Hx and Hy
    Hx[:, :-1] -= (dt / mu0 / dy) * (Ez[:, 1:] - Ez[:, :-1])
    Hy[:-1, :] += (dt / mu0 / dx) * (Ez[1:, :] - Ez[:-1, :])

    # Update electric field Ez from the curl of H, cf. Eq. (10)
    Ez[1:-1, 1:-1] += (dt / eps0 / eps_r[1:-1, 1:-1]) * (
        (Hy[1:-1, 1:-1] - Hy[:-2, 1:-1]) / dx -
        (Hx[1:-1, 1:-1] - Hx[1:-1, :-2]) / dy
    )

    # Apply conductivity for absorption
    Ez *= np.exp(-sigma * dt / eps0)

def run_simulation():
    frames = []
    for n in range(Nt):
        t = n * dt
        pulse = np.exp(-((t - 4 * pulse_width) ** 2) / (pulse_width ** 2)) * np.cos(2 * np.pi * f0 * t)
        Ez[source_position] += pulse

        update_fields(Ez, Hx, Hy, eps_r, sigma)

        if n % 10 == 0:
            frames.append(np.copy(Ez))

    return frames

def animate(frames, title):
    fig, ax = plt.subplots()

    def update(frame):
        ax.clear()
        im = ax.imshow(frame.T, cmap=cm.viridis, vmin=-0.01, vmax=0.01, animated=True)
        ax.set_title(title)
        return [im]

    ani = animation.FuncAnimation(fig, update, frames=frames, interval=50, blit=True)
    # Save as GIF using Pillow
    # ani.save("2D_FDTD_Simulation.gif", writer='pillow')
    # Save as MP4 video using ffmpeg
    # ani.save("2D_FDTD_Simulation.mp4", writer='ffmpeg')
    plt.show()
    return ani

def run_simulation_3D_plot():
    frames_1 = []
    time_frames = [100, 500, 1000, 1200]  # Time frames for capturing the results
    for n in range(Nt):
        t = n * dt
        pulse = np.exp(-((t - 4 * pulse_width) ** 2) / (pulse_width ** 2)) * np.cos(2 * np.pi * f0 * t)
        Ez[source_position] += pulse

        update_fields(Ez, Hx, Hy, eps_r, sigma)

        if n in time_frames:
            frames_1.append(np.copy(Ez))

    return frames_1

def plot_3d_surface(Ez, title):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    X, Y = np.meshgrid(np.arange(Ez.shape[0]), np.arange(Ez.shape[1]))
    surf = ax.plot_surface(X, Y, Ez.T, cmap=cm.viridis, edgecolor='black')
    ax.set_title(title)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Electric Field (Vm$^{-1}$)')

    plt.show()

def calculate_energy(Ez, Hx, Hy, eps_r, dx, dy):
    electric_energy = 0.5 * eps0 * np.sum(eps_r * Ez**2) * dx * dy
    magnetic_energy = 0.5 * np.sum((Hx**2 + Hy**2) / mu0) * dx * dy
    total_energy = electric_energy + magnetic_energy
    return total_energy

def run_simulation_with_energy():
    energies = []
    for n in range(Nt):
        t = n * dt
        pulse = np.exp(-((t - 4 * pulse_width) ** 2) / (pulse_width ** 2)) * np.cos(2 * np.pi * f0 * t)
        Ez[source_position] += pulse

        update_fields(Ez, Hx, Hy, eps_r, sigma)

        if n % 10 == 0:
            energy = calculate_energy(Ez, Hx, Hy, eps_r, dx, dy)
            energies.append(energy)
    return energies

# Run the simulations and plotting
energies = run_simulation_with_energy()
plt.figure()
plt.plot(np.arange(0, Nt, 10) * dt, energies)
plt.xlabel('Time (s)')
plt.ylabel('Total Electromagnetic Energy (J)')
plt.title('Energy Conservation in 2D FDTD Simulation')
plt.grid(True)
plt.show()

# Run simulation and create animation, create 3D surface plots
frames = run_simulation()
frames_1 = run_simulation_3D_plot()

time_labels = [100, 500, 1000, 1200]
animate(frames, '2D FDTD Simulation of Gaussian Pulse')
for frame_1, label in zip(frames_1, time_labels):
    plot_3d_surface(frame_1, f'2D FDTD Simulation at T={label} s')

Listing 1 Output by ChatGPT for solving electromagnetic wave propagation in a dielectric medium

4 Tests for Verification

The most important part of the work is to verify the program generated by ChatGPT-4
and ensure that the results conform with the solution of the problem stated in the
problem description.
This work’s verification is done by comparing the FDTD simulation results with
those presented in other technical papers and through self-verification. The FDTD
program generated by ChatGPT is run on an Integrated Development Environment
(IDE). It generates visualizations of the electromagnetic wave propagation, including
reflection, refraction, and absorption, which are compared to the expected results.
The comparison involves examining the accuracy of the field distributions and energy
conservation over time.
Next, we discuss the verification performed by matching similar graphs from research
papers. The results from the FDTD simulation are compared to those from established
research to ensure the accuracy and validity of the ChatGPT-generated code.

4.1 Verification: Graphically

The result from the FDTD simulation at time step t = 1200 aligns well with
the outcomes observed in similar studies, as discussed in [5]. The symmetry, wave
interaction pattern, and boundary conditions all reflect the accuracy of the simulation
generated by the FDTD program.
The image from [5] shows a simulation of a plane wave impinging on a dielectric
cylinder at time step t = 75, while Fig. 5 shows the FDTD simulation of a
Gaussian wave at time step t = 1200. Despite different waveforms and simulation
parameters, both results demonstrate consistent interactions with the dielectric
media, enabling a comparative analysis of wave reflection and refraction patterns.
• Wave Interaction Pattern: Both visualizations show two distinct peaks with a
valley in between, indicating similar wave interaction patterns with the medium.
• Symmetry: The wave propagation appears symmetric in both images, suggesting
correct modeling of the medium’s properties and boundaries.
• Boundary Conditions: Both simulations use absorbing boundary conditions
(PML), as evident from the absence of reflected waves at the edges.
These similarities validate the FDTD simulation approach employed and confirm its
reliability in modeling electromagnetic wave interactions in dielectric media.

4.1.1 Verification: Energy Conservation Analysis

The prompt includes its own means of validating the generated program, namely the
energy-conservation analysis discussed below.

The graph in Fig. 6 shows the total electromagnetic energy as a function of time.
The corresponding snapshots in Fig. 7 visually depict the wave’s interaction with the
dielectric medium at different time steps. Key observations include:

Fig. 5 3D visualization of the electric field component E_z generated using a 2D Finite-Difference
Time-Domain (FDTD) simulation at time step t = 1200. The plot illustrates the propagation of
the electromagnetic wave within a dielectric medium, showcasing the complex interactions of the
wave as it reflects and refracts within the medium

Fig. 6 Total electromagnetic energy as a function of time in a 2D FDTD simulation. The plot
demonstrates energy conservation within the computational domain and validates the accuracy of
the FDTD simulation over time

Fig. 7 Time-lapse visualizations of the electric field at (a) t = 1.06 × 10⁻⁹ s, (b) t = 2.12 × 10⁻⁹ s,
(c) t = 3.18 × 10⁻⁹ s, (d) t = 3.71 × 10⁻⁹ s, (e) t = 4.24 × 10⁻⁹ s, and (f) t = 5.30 × 10⁻⁹ s. They
illustrate the reflection, refraction, and absorption of the electromagnetic wave as it interacts with
the dielectric medium and exits the simulation domain. The progression from the initial pulse
propagation to wave dissipation provides insights into energy conservation and wave dynamics
within the medium

• Initial Energy Peak: The energy starts at a peak value due to the initial source
input, as seen in Fig. 7a, where the wave is centered within the dielectric medium.
• Energy Decay: Rapid decay in the initial phase, likely due to absorption by the
PML boundaries, can be correlated with Fig. 7c and d, where the wave begins to
exit the medium and is partially absorbed by the boundaries.
• Energy Conservation: The energy drops down to zero after the initial decay,
indicating the wave has left the simulation box, such that a steady state is restored
where internal energy is conserved. This is visually supported by Fig. 7f, where the
wave has exited the simulation domain, leaving behind minimal residual energy.
According to the Poynting theorem, the energy density u and the Poynting vector
S satisfy

∂u/∂t + ∇ · S = −J · E. (27)

For a lossless medium with no free currents (J = 0), this simplifies to

∂u/∂t + ∇ · S = 0. (28)

5 Discussion

After ChatGPT-4 has generated the program, there are several aspects to verify
before running it in any Integrated Development Environment (IDE). These aspects
are summarized as follows:
• Completeness of the program: You must confirm that ChatGPT-4 has supplied
the entire program for this kind of work, where outcomes need to be demonstrated
rather than just numbers generated. The main problem is that each time you enter the
prompt, a slightly different version of the code is generated, making it difficult
to anticipate the outcome.
• In Steps: Occasionally, ChatGPT-4 produces code in a step-by-step fashion rather
than as a single, integrated program. If this happens, request the whole program.
• Short-term memory capacity: There is a limit up to which ChatGPT-4's output
follows the prompt precisely; beyond that, you must prompt again to refocus it on
the details provided in the first prompt. The most recent version, ChatGPT-4o, is
claimed to read the prompt in its entirety and, given an accurate user prompt, to
produce a program that is as exact as possible.
• Updating and frames: For the visualization, it is mandatory to check that the
electric-field update captures frames at every interval, so that the animation of the
simulation runs smoothly.

• Learned Lessons: ChatGPT-4 is advanced at solving complex tasks compared
to other existing models. The more references and resources you provide,
the better the accuracy. Comparing the two models, ChatGPT-4 and ChatGPT-4o,
the newer model delivers precisely, and often more than, what you expect, and it
can handle a larger amount of data than ChatGPT-4. If you expect complex
mathematical reasoning, version 4 is worth considering.

6 Conclusion

The work aimed to generate a Python program using ChatGPT-4 for simulating elec-
tromagnetic wave propagation in a dielectric medium using the 2D Finite-Difference
Time-Domain (FDTD) method. The objective was to examine wave interactions such
as reflection, transmission, and absorption, within different dielectric environments.
It was observed that ChatGPT-4 requires comprehensive and precise inputs to
perform accurately, as discussed in the prompt section. Detailed task descriptions
and explicit instructions are necessary to ensure that ChatGPT-4 does not make any
assumptions about key factors in the FDTD simulation setup.
Comparing the FDTD program generated by ChatGPT-4 with theoretical models
and established methods confirmed that the solutions were correctly formulated.
The results of the simulations, including the electric and magnetic field distributions,
were consistent with theoretical predictions and comparable to those produced by
other established methods.
Graphical verification indicated that the FDTD results converged with
increasing grid resolution. The simulation of the Gaussian sinusoidal pulse and its
interaction with a cylindrical dielectric medium provided accurate visualizations of
wave propagation phenomena. The implementation of PML BC effectively mini-
mized reflections at the boundaries, further validating the simulation’s accuracy.
The energy conservation analysis confirmed the physical accuracy of the generated
code, with the total energy stabilizing over time after initial absorption by the PML
boundaries. The overall performance of the ChatGPT-4 generated FDTD simulation
demonstrated high accuracy and reliability.
In summary, this study validates ChatGPT-4’s ability to produce FDTD simula-
tion programs that are accurate and dependable, greatly simplifying and improving
the process of resolving challenging electromagnetic wave propagation issues. This
effective application of AI in computational electromagnetics shows how broadly
applicable it may be in scientific computing and provides a viable method for
automating the creation of intricate simulation tools.

References

1. D. Sarkar, “FDTD Analysis of Guided Electromagnetic Wave Interaction with Time-Modulated


Dielectric Medium,” Springer Nature Singapore, 2022. ISBN: 978-981-19-1630-4, https://doi.
org/10.1007/978-981-19-1630-4.
2. J. Chen and Y. Kougong, “Energy and momentum of electromagnetic waves in media,” Journal
of Optics, 2023, https://doi.org/10.1007/s12596-023-01183-0.
3. N. Faruk and U. M. Gana, “FDTD Modelling of Electromagnetic waves in Stratified Medium,”
Global Journal of Engineering Research, 2013, https://doi.org/10.4314/gjer.v12i1.1.
4. D. M. Sullivan, “Electromagnetic Simulation Using the FDTD Method,” Wiley-IEEE Press,
2013. ISBN: 9781118646700, https://doi.org/10.1002/9781118646700.
5. E. Balti, “Finite-Difference Time-Domain Simulations,” engrxiv, 2021, https://doi.org/10.
31224/osf.io/yjcs8.
6. K. Yee, “Numerical solution of initial boundary value problems involving Maxwell’s equations in
isotropic media,” IEEE Transactions on Antennas and Propagation, vol. 14, no. 3, pp. 302–307,
1966, https://doi.org/10.1109/TAP.1966.1138693.
7. J. X. Liu, Z. K. Yang, L. Ju, et al. “Boltzmann Finite-Difference Time-Domain Method Research
Electromagnetic Wave Oblique Incidence into Plasma.” Plasmonics 13, 1699–1704, 2018,
https://doi.org/10.1007/s11468-017-0681-3.
Flow Around an Obstacle Using
the Lattice Boltzmann Method

Fenil Lathiya and Bernhard Eidel

Abstract This chapter delves into the generation of Python code by ChatGPT-4o
for 2D simulations of fluid flow around obstacles of circular or square shape. The
numerical solution is achieved using the Lattice Boltzmann Method (LBM) with
the D2Q9 model. A significant portion of the chapter is dedicated to the intricacies
of prompt engineering for this specific task, as well as the thorough verification
of the generated code. The performance of the code is rigorously tested against
results obtained from a commercial solver, focusing on velocity fields, pressure
fields, and pressure distribution. The remarkable quantitative agreement across all
criteria demonstrates the effectiveness of GPT-4o in producing accurate and reliable
code for complex fluid dynamics simulations.

1 Introduction

The study of fluid dynamics involves understanding the behavior of fluid flow around
obstacles, a task that requires advanced computational methods. The Lattice Boltzmann
Method (LBM) is such a method in Computational Fluid Dynamics (CFD),
based on a lattice-grid approach. This method models fluid behavior at a microscopic
level using particle distribution functions and kinetic theory, which can then
be translated into macroscopic fluid properties. For a comprehensive overview of
the Lattice Boltzmann Method and its applications, we refer to [1], and for a sound
introduction to the LBM in fluid flows, to [2].
This chapter is organized into several sections to comprehensively discuss the AI-Assisted
Coding Project. After this introduction, the next section details the Formulation
of the Coding Task, explaining the problem setup and the approach taken.


Following this, the Code Listing section presents the AI-generated Python code for
the LBM simulation. The Code Testing/Performance section evaluates the perfor-
mance and accuracy of the code through tests and comparisons with ANSYS Fluent.
The Discussion section provides an analysis of the results and insights gained from
the project. Finally, the Conclusions section summarizes the findings and implica-
tions of the study. Additional details and data are included in the Appendix.

1.1 Lattice Boltzmann Method: Theory and Modeling

The main idea behind the LBM is to represent the fluid using particles that move
and collide each other on a discrete lattice grid. The key part of this method lies in
the Boltzmann equation, which describes the evolution of the particle distribution
function. A critical component of this equation is the collision step [7], which is
given by  
f_out = f_in − ω ( f_in − f_eq ), (1)

where f_in is the incoming particle distribution function, f_out is the outgoing particle
distribution function after the collision, ω the relaxation parameter, and f_eq the
equilibrium distribution function.
LBM is advantageous due to its simplicity in handling complex boundary con-
ditions (BC) and its flexibility in simulating various types of flows, which makes it
ideal for studying fluid flow around obstacles. The method translates the microscopic
particle dynamics into macroscopic flow properties, such as velocity and pressure
fields, by averaging them.
In LBM, the fluid is modeled as a collection of particles that reside on the nodes of
a discrete lattice. These particles propagate to neighboring nodes and collide, redis-
tributing their velocities according to predefined rules. The macroscopic properties
of the fluid, such as density and velocity, are obtained by taking moments of the
particle distribution functions.
One of the significant advantages of LBM is its ability to handle complex geome-
tries and BC with ease. Traditional CFD methods often require sophisticated meshing
techniques and BC formulations. In contrast, LBM uses simple bounce-back rules to
impose no-slip BC on solid surfaces, making it more straightforward to implement
for problems involving obstacles.
The standard lattice model used in LBM is the D2Q9 model (see Sect. 1.2) for two-
dimensional flows. In this model, each node in the lattice has nine possible velocity
vectors, including one stationary and eight moving in different directions. The particle
distribution function at each node evolves according to the LBM equation [3]

f_i(x + e_i, t + 1) = f_i(x, t) + Ω_i, (2)

where f_i is the particle distribution function in the direction e_i, and Ω_i represents the
collision operator that models the redistribution of particles due to collisions. The
collision operator Ω_i is often represented by the Bhatnagar-Gross-Krook (BGK)
approximation, which simplifies to [7]

Ω_i = −(1/τ) ( f_i(x, t) − f_i^eq(x, t) ). (3)

Here, τ is the relaxation time parameter, and f_i^eq is the local equilibrium distribution
function. The BGK approximation ensures that the system relaxes towards
equilibrium over time, with the rate of relaxation governed by τ. The collision operator
Ω_i simplifies the collision term to a relaxation towards a local equilibrium
distribution f_i^eq. This relaxation process occurs over a characteristic time scale τ,
ensuring that the system gradually approaches equilibrium.
The equilibrium distribution function f^eq for the D2Q9 model is given by [4]

f_eq,i = w_i ρ [ 1 + (e_i · u)/c_s² + (e_i · u)²/(2c_s⁴) − u²/(2c_s²) ], (4)

where w_i are the weights associated with each direction, ρ is the fluid density, u is
the macroscopic velocity, and c_s is the speed of sound in the lattice.
The macroscopic fluid properties, such as density ρ and velocity u, are obtained
by taking moments of the particle distribution function [4]

ρ = Σᵢ f_i, (5)

ρ u = Σᵢ f_i e_i. (6)

1.2 The D2Q9 Model

The D2Q9 model, an abbreviation for “Two-Dimensional, Nine-Directional”, is a


prevalent lattice model employed within the LBM for simulating fluid flows in two
dimensions. This model is crucial in CFD for its efficiency in simulating complex
fluid behaviors at the macroscopic level.
Structure and velocity vectors: The D2Q9 model is constructed around a square
lattice, where each node is linked to its nearest and next-nearest neighbors, enabling a
comprehensive representation of fluid flow directions. The connectivity and possible
movement directions at each node are depicted in Fig. 1:

• e0 = (0, 0): Represents the rest particle.
• e1 = (1, 0), e2 = (0, 1), e3 = (−1, 0), e4 = (0, −1): Movement to the nearest
neighbors.
• e5 = (1, 1), e6 = (−1, 1), e7 = (−1, −1), e8 = (1, −1): Movement to the next-nearest
neighbors.

Fig. 1 Velocity vectors in the D2Q9 model

Weights and distribution functions: The weights w_i associated with these velocity
vectors through (4) play a fundamental role in the particle distribution calculations
within the model [6]:

w0 = 4/9 for the rest particle,
w1,2,3,4 = 1/9 for particles moving toward nearest neighbors, (7)
w5,6,7,8 = 1/36 for particles moving toward next-nearest neighbors.

These weights help define the equilibrium distribution function . f eq , which inte-
grates these weights with macroscopic variables like density and velocity. This func-
tion ensures compliance with the macroscopic equations of mass and momentum
conservation, thereby providing accurate fluid dynamics simulations under various
conditions.
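
As a minimal NumPy sketch, the velocity set, the weights of Eq. (7), the moments of Eqs. (5)-(6), and the equilibrium distribution of Eq. (4) can be coded as follows, assuming populations f of shape (9, Nx, Ny) and lattice units with c_s² = 1/3 (all names are illustrative):

import numpy as np

# D2Q9 velocity vectors e_i and weights w_i, cf. Fig. 1 and Eq. (7)
e = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def macroscopic(f):
    # Eqs. (5)-(6): density and velocity as moments of the populations
    rho = f.sum(axis=0)
    u = np.einsum('iab,id->dab', f, e) / rho   # shape (2, Nx, Ny)
    return rho, u

def equilibrium(rho, u):
    # Eq. (4) with c_s^2 = 1/3, i.e. coefficients 3, 9/2 and 3/2
    eu = np.einsum('id,dab->iab', e, u)        # e_i . u per direction and cell
    usq = u[0]**2 + u[1]**2
    return w[:, None, None] * rho * (1 + 3*eu + 4.5*eu**2 - 1.5*usq)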

1.3 Collision and Streaming Steps

In LBM, the simulation proceeds through two main steps: collision and streaming.

1.3.1 Collision Step

During this step, particles at each lattice node collide and redistribute their velocities
according to the collision operator. This operator is typically based on the Bhatnagar-Gross-Krook
(BGK) approximation, which simplifies the collision process by assuming
a single relaxation parameter ω. The post-collision distribution function is given
by [3]

f_i*(x, t) = f_i(x, t) + ω ( f_eq,i(x, t) − f_i(x, t) ). (8)

1.3.2 Streaming Step

In this step, the particles move to neighboring lattice nodes based on their velocities.
The distribution function is updated as follows [3]

f_i(x + e_i Δt, t + Δt) = f_i*(x, t). (9)
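
Building on the sketch above, one complete BGK collision-and-streaming cycle, Eqs. (8)-(9), might read as follows (np.roll yields periodic wrap-around, which the boundary rules of the next section then override):

def collide_and_stream(f, omega):
    rho, u = macroscopic(f)
    feq = equilibrium(rho, u)
    f_post = f + omega * (feq - f)   # collision, Eq. (8)
    for i in range(9):               # streaming, Eq. (9)
        f_post[i] = np.roll(f_post[i], shift=(e[i, 0], e[i, 1]), axis=(0, 1))
    return f_post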

1.4 Boundary Conditions

Applying appropriate BC is crucial for accurately simulating fluid flow around obsta-
cles. In LBM, several BC can be implemented easily [8]:
• No-slip BC: This condition is applied at solid boundaries (such as the surface of an
obstacle) using the bounce-back rule. Particles that hit the boundary are reflected
back along their incoming direction, ensuring that the velocity at the boundary is
zero; a minimal sketch is given after this list.
• Inlet and outlet BC: These conditions are used to specify the fluid flow at the
boundaries of the simulation domain. Common approaches include specifying a
constant velocity profile at the inlet and a zero-gradient condition at the outlet.
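The bounce-back rule for the no-slip condition listed above can be sketched as follows, assuming a boolean obstacle mask and an index table opposite[i] of reversed directions for the velocity ordering used in the snippets above (names illustrative; the generated code in Sect. 3 realizes the same reflection via index arithmetic):

import numpy as np

# Direction i reversed, for the ordering e0 (rest), e1..e4 (axes), e5..e8 (diagonals)
opposite = np.array([0, 3, 4, 1, 2, 7, 8, 5, 6])

def bounce_back(f_post, f_pre, obstacle):
    # Populations hitting a solid node are reflected into the opposite direction.
    for i in range(9):
        f_post[i, obstacle] = f_pre[opposite[i], obstacle]
    return f_post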

1.5 Simulation Setup

The simulation of fluid flow around a square/circular obstacle involves defining a discrete lattice grid, initializing the particle distribution functions, and iteratively applying the collision and streaming steps. The BC comprise the no-slip condition on the obstacle, implemented using simple bounce-back rules, as well as the inlet and outlet conditions. The simulation runs for a specified number of iterations, and the macroscopic properties, such as velocity and pressure fields, are computed and visualized.
The Python code generated for this project uses the D2Q9 model to simulate the
fluid flow around a square/circular obstacle. The code initializes the lattice, sets up
the BC, and iteratively updates the particle distribution functions. The results are
visualized using velocity magnitude and pressure fields, providing insights into the
flow behavior around the obstacle.

The algorithm for LBM simulations of 2D fluid flow is given in the Algorithm Box 1.

Algorithm 1: Lattice Boltzmann Method for 2D Fluid Flow
Input: Lattice size $(Nx, Ny)$, time steps $T$, Reynolds number $Re$, initial distribution function $f_i(\mathbf{x}, t = 0)$, boundary conditions, obstacle type (square/circle), obstacle parameters (center $(cx, cy)$ and size)
Output: Velocity field $\mathbf{u}(\mathbf{x}, T)$, density field $\rho(\mathbf{x}, T)$
1  Initialization:
2  Initialize the lattice grid with size $(Nx, Ny)$ and set the obstacle shape and position (square or circle);
3  Calculate the relaxation parameter $\omega$ based on $Re$;
4  Set initial macroscopic variables: density $\rho(\mathbf{x}, 0)$ and velocity $\mathbf{u}(\mathbf{x}, 0)$;
5  Initialize the distribution function $f_i(\mathbf{x}, 0)$ according to the equilibrium distribution function $f_i^{eq}(\rho, \mathbf{u})$;
6  for t = 0 to T do
7    Collision Step:
8    foreach lattice node $\mathbf{x} \in (Nx, Ny)$ do
9      Calculate macroscopic variables $\rho(\mathbf{x}, t)$ and $\mathbf{u}(\mathbf{x}, t)$ from $f_i(\mathbf{x}, t)$;
10     Compute the equilibrium distribution $f_i^{eq}(\rho, \mathbf{u})$;
11     Update the distribution function using the BGK approximation:
       $f_i(\mathbf{x}, t + \Delta t) = f_i(\mathbf{x}, t) - \omega \left( f_i(\mathbf{x}, t) - f_i^{eq}(\rho, \mathbf{u}) \right)$
12   end
13   Streaming Step:
14   foreach lattice direction $i$ do
15     Move the distribution function to the neighboring nodes:
       $f_i(\mathbf{x} + \mathbf{c}_i \Delta t,\ t + \Delta t) = f_i(\mathbf{x}, t + \Delta t)$
16   end
17   Apply Boundary Conditions:
18   Apply appropriate boundary conditions (e.g., bounce-back for walls, periodic, outflow conditions, etc.) on $f_i(\mathbf{x}, t + \Delta t)$;
19   Update Macroscopic Quantities:
20   Calculate the macroscopic quantities $\rho(\mathbf{x}, t + \Delta t)$ and $\mathbf{u}(\mathbf{x}, t + \Delta t)$;
21 end
22 Output:
23 Return the final velocity field $\mathbf{u}(\mathbf{x}, T)$ and density field $\rho(\mathbf{x}, T)$;

1.6 Overview of the Python Code

Ideally, the program generated by AI for this project should use the D2Q9 model to
simulate fluid flow around a square or circular obstacle and consist of the following
key components: [5]
1. Initialization: The lattice grid should be defined, and the particle distribution
functions should be initialized.
2. Collision and streaming: The code should iteratively apply the collision and
streaming steps to update the particle distribution functions.
3. Boundary conditions: No-slip BC should be applied using the bounce-back rule.
Inlet and outlet conditions should be specified to maintain the flow.
4. Visualization: The macroscopic properties, such as velocity magnitude and pres-
sure fields, should be computed and visualized to analyze the flow behavior.
The AI-generated code demonstrates the potential of artificial intelligence in
assisting with complex computational tasks. By providing a detailed prompt, the
AI was able to generate a functional code for simulating fluid flow using the LBM.
This project uses AI, specifically GPT-4o, to assist in coding the LBM for simulating fluid flow around an obstacle. The goal is to assess the capability of AI in generating Python code for this complex task. The subsequent sections present the specific problem to be solved, the engineered prompt, the resulting code, the outcomes, their verification, and some remarks about GPT-4o.
By examining the intersection of AI and CFD, this chapter aims to highlight the strengths and limits of AI-assisted coding in scientific research.

2 Prompt

2.1 Problem Statement Definition

The simulation involves modeling fluid flow around an obstacle within a 2D rectan-
gular domain using the LBM. The domain is defined with the following parameters:
• Domain size: The computational domain has a width $nx$ and a height $ny$. One lattice spacing equals 1 mm, i.e., there are $nx$ lattice nodes in the $x$ direction for a width of $nx$ mm and $ny$ lattice nodes in the $y$ direction for a height of $ny$ mm.
• Coordinate system: The origin (0, 0) is located at the bottom-left corner of the domain. The $x$-axis extends horizontally to the right, and the $y$-axis extends vertically upwards.
• Obstacle: The obstacle can be either a square or a circle, positioned within the domain based on user input.
  – For a square obstacle, the user specifies the $x$ and $y$ coordinates of the center ($cx$, $cy$) and the side length.

Fig. 2 Computational domain with (top) a circular and (bottom) a square obstacle

  – For a circular obstacle, the user specifies the $x$ and $y$ coordinates of the center ($cx$, $cy$) and the radius.

• Discretization: The lattice nodes are evenly spaced, forming a grid over the domain. The flow simulation uses these discrete points to model fluid dynamics.

Figure 2 sketches the simulation domain with exemplary obstacles.


For definiteness in the coding assignment, the parameters are chosen as listed in Table 1.

Table 1 Parameters chosen for the fluid simulation study with obstacles

Domain width ($nx$): 400 mm
Domain height ($ny$): 100 mm
Obstacle type: Circular/Square
Obstacle center ($cx$, $cy$): (100, 50) (mm, mm)
Radius of circle: 25 mm
Length of square: 40 mm
LBM cell size: 1 mm
Time step size $\Delta t$: 1 s
Total time steps: 4000
Reynolds number ($Re$): 50
Upstream velocity $u$: 0.04 mm/s
Model type: Bounce-back
Obstacle BC: No-slip
Output: Velocity, pressure and streamline plots

The final, working prompt is given in Fig. 3. It is used for generating the Code Listing 1.

Fig. 3 Prompt for simulating fluid flow around an obstacle



3 Code Listing

The final code version generated by GPT-4o is given in the Code Listing 1.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import pandas as pd

# Function to get obstacle parameters from the user
def get_obstacle_parameters():
    obstacle_type = input("Enter the type of obstacle (square/circle): ").strip().lower()
    if obstacle_type == "square":
        return get_square_parameters()
    elif obstacle_type == "circle":
        return get_circle_parameters()
    else:
        raise ValueError("Invalid obstacle type. Please enter 'square' or 'circle'.")

# Function to get parameters for a square obstacle from the user
def get_square_parameters():
    while True:
        try:
            cx = int(input("Enter the x-coordinate of the square center (cx): "))
            cy = int(input("Enter the y-coordinate of the square center (cy): "))
            side = int(input("Enter the side length of the square: "))
            # Ensure the square is within the simulation domain
            if cx - side // 2 < 0 or cx + side // 2 >= nx or cy - side // 2 < 0 or cy + side // 2 >= ny:
                raise ValueError("Square parameters are outside the region or not feasible.")
            return "square", cx, cy, side
        except ValueError as e:
            print(e)

# Function to get parameters for a circular obstacle from the user
def get_circle_parameters():
    while True:
        try:
            cx = int(input("Enter the x-coordinate of the circle center (cx): "))
            cy = int(input("Enter the y-coordinate of the circle center (cy): "))
            r = int(input("Enter the radius of the circle (r): "))
            # Ensure the circle is within the simulation domain
            if cx - r < 0 or cx + r >= nx or cy - r < 0 or cy + r >= ny:
                raise ValueError("Circle parameters are outside the region or not feasible.")
            return "circle", cx, cy, r
        except ValueError as e:
            print(e)

###### Flow definition #########################################################
maxIter = 4000      # Total number of time iterations.
nx, ny = 400, 100   # Number of lattice nodes in the x and y directions.
ly = ny - 1         # Height of the domain in lattice units.
Re = 50.0           # Reynolds number.
uLB = 0.04          # Velocity in lattice units.

# Get obstacle parameters from the user
obstacle_type, cx, cy, size = get_obstacle_parameters()
# Calculate viscosity in lattice units based on Reynolds number and characteristic length (size)
nulb = uLB * size / Re
# Calculate relaxation parameter for the Lattice Boltzmann method
omega = 1. / (3. * nulb + 0.5)

###### Lattice Constants #######################################################
# Define the discrete velocity set for D2Q9 model (9 velocities)
v = np.array([[1, 1], [1, 0], [1, -1], [0, 1], [0, 0], [0, -1], [-1, 1], [-1, 0], [-1, -1]])
# Define the weights for each velocity direction
t = np.array([1./36., 1./9., 1./36., 1./9., 4./9., 1./9., 1./36., 1./9., 1./36.])

# Define columns for streaming step (used for shifting the distribution functions)
col1 = np.array([0, 1, 2])
col2 = np.array([3, 4, 5])
col3 = np.array([6, 7, 8])

###### Function Definitions ####################################################
# Function to compute macroscopic variables (density and velocity) from distribution functions
def macroscopic(fin):
    """
    Compute macroscopic variables (density rho and velocity u)
    from the distribution functions (fin).
    Args:
        fin (ndarray): Distribution functions.
    Returns:
        rho (ndarray): Density field.
        u (ndarray): Velocity field.
    """
    # Compute density as the sum of distribution functions
    rho = np.sum(fin, axis=0)
    # Initialize velocity array with zeros
    u = np.zeros((2, nx, ny))
    # Compute velocity as the weighted sum of distribution functions
    for i in range(9):
        u[0, :, :] += v[i, 0] * fin[i, :, :]
        u[1, :, :] += v[i, 1] * fin[i, :, :]
    # Normalize velocity by density
    u = u / rho
    return rho, u

# Function to compute the equilibrium distribution function
def equilibrium(rho, u):
    """
    Compute the equilibrium distribution function based on density and velocity.
    Args:
        rho (ndarray): Density field.
        u (ndarray): Velocity field.
    Returns:
        feq (ndarray): Equilibrium distribution functions.
    """
    # Compute the square of the velocity magnitude
    usqr = 3. / 2. * (u[0]**2 + u[1]**2)
    # Initialize equilibrium distribution function array
    feq = np.zeros((9, nx, ny))
    # Compute equilibrium distribution function for each velocity direction
    for i in range(9):
        cu = 3. * (v[i, 0] * u[0, :, :] + v[i, 1] * u[1, :, :])
        feq[i, :, :] = rho * t[i] * (1. + cu + 0.5 * cu**2 - usqr)
    return feq

###### Setup: obstacle and velocity inlet with perturbation ####################
# Function to create a mask for a square obstacle
def obstacle_fun_square(x, y):
    return np.logical_and(np.abs(x - cx) <= size // 2, np.abs(y - cy) <= size // 2)

# Function to create a mask for a circular obstacle
def obstacle_fun_circle(x, y):
    return (x - cx)**2 + (y - cy)**2 <= size**2

# Create the obstacle mask based on user input
if obstacle_type == "square":
    obstacle = np.fromfunction(obstacle_fun_square, (nx, ny))
else:
    obstacle = np.fromfunction(obstacle_fun_circle, (nx, ny))

# Function to initialize the velocity field with a slight perturbation
def inivel(d, x, y):
    """
    Initialize the velocity field with a slight perturbation to trigger instabilities.
    Args:
        d (int): Dimension index (0 for x-direction, 1 for y-direction).
        x (ndarray): x-coordinates of the grid points.
        y (ndarray): y-coordinates of the grid points.
    Returns:
        velocity (ndarray): Initial velocity field.
    """
    return (1. - d) * uLB * (1. + 1.e-4 * np.sin(y / ly * 2. * np.pi))

# Initialize the velocity field with the perturbation
vel = np.fromfunction(inivel, (2, nx, ny))

# Initialize the distribution functions at equilibrium with the initial velocity
fin = equilibrium(1., vel)

###### Main time loop ##########################################################
for time in range(maxIter):
    # Right wall: outflow condition (copy the values from the second last column to the last column)
    fin[col3, -1, :] = fin[col3, -2, :]

    # Compute macroscopic variables (density and velocity)
    rho, u = macroscopic(fin)

    # Left wall: inflow condition (impose velocity and compute density)
    u[:, 0, :] = vel[:, 0, :]
    rho[0, :] = 1. / (1. - u[0, 0, :]) * (np.sum(fin[col2, 0, :], axis=0) + 2. * np.sum(fin[col3, 0, :], axis=0))

    # Compute equilibrium distribution function based on updated macroscopic variables
    feq = equilibrium(rho, u)
    fin[[0, 1, 2], 0, :] = feq[[0, 1, 2], 0, :] + fin[[8, 7, 6], 0, :] - feq[[8, 7, 6], 0, :]

    # Collision step: relaxation towards equilibrium
    fout = fin - omega * (fin - feq)

    # Bounce-back condition for obstacle: reflect distribution functions
    for i in range(9):
        fout[i, obstacle] = fin[8 - i, obstacle]

    # Streaming step: propagate the distribution functions to neighboring nodes
    for i in range(9):
        fin[i, :, :] = np.roll(np.roll(fout[i, :, :], v[i, 0], axis=0), v[i, 1], axis=1)

    # Visualization every 100 iterations
    if time % 100 == 0:
        plt.clf()
        # Plot velocity magnitude
        plt.imshow(np.sqrt(u[0]**2 + u[1]**2).transpose(), cmap=cm.Reds)
        plt.colorbar()
        plt.title(f'Velocity magnitude at time step {time}')
        plt.pause(0.1)

###### Post-Processing #########################################################
# Compute final macroscopic variables (density and velocity) from distribution functions
rho, u = macroscopic(fin)

# Set velocity to zero inside the obstacle (no flow inside the obstacle)
u[0, obstacle] = 0
u[1, obstacle] = 0

# Pressure field calculation (assuming ideal gas law: p = rho * RT, with RT = 1)
pressure = rho
pressure[obstacle] = np.nan  # Mask the pressure inside the obstacle

###### Plotting ################################################################
# Compute velocity magnitude field
velocity_magnitude = np.sqrt(u[0]**2 + u[1]**2)
velocity_magnitude[obstacle] = np.nan  # Mask the velocity magnitude inside the obstacle

plt.figure(figsize=(12, 6))
plt.subplot(121)
# Plot velocity magnitude
plt.imshow(velocity_magnitude.transpose(), cmap=cm.viridis)
cbar = plt.colorbar(location='bottom')
cbar.ax.tick_params(labelsize=13)
plt.title('Velocity Magnitude (m/s)', size='16')
plt.xticks(fontsize=13)
plt.yticks(fontsize=13)

plt.subplot(122)
# Plot pressure field
plt.imshow(pressure.transpose(), cmap=cm.viridis)
cbar = plt.colorbar(location='bottom')
cbar.ax.tick_params(labelsize=13)
plt.title('Pressure (Pa)', size='16')
plt.xticks(fontsize=13)
plt.yticks(fontsize=13)

plt.tight_layout()
plt.show()

###### Streamline Plot #########################################################
# Plot streamlines to visualize the flow direction
plt.figure(figsize=(12, 6))
plt.streamplot(np.arange(nx), np.arange(ny), u[0].transpose(), u[1].transpose(),
               color=velocity_magnitude.transpose(), density=2, cmap=cm.viridis)
cbar = plt.colorbar(location='bottom')
cbar.ax.tick_params(labelsize=18)
plt.title('Streamlines plot', size='23')
plt.xticks(fontsize=18)
plt.yticks(fontsize=18)
plt.gca().set_aspect('equal')  # Set aspect ratio to be equal
plt.tight_layout()
plt.show()
Listing 1 Output by GPT-4o for solving fluid flow around an obstacle in 2D

The output of GPT-4o does not guarantee that the generated code will work without any flaws. A more detailed discussion of the behavior of GPT-4o is given in Sect. 5.

4 Test for Verification

Verification is crucial to ensure the accuracy and reliability of computational simulations. It involves comparing the results of a simulation with analytical solutions or experimental data to confirm that the model behaves as expected.
To verify the results obtained from the Python code, a similar simulation is per-
formed using ANSYS Fluent. The ANSYS simulation setup involves defining the
same computational domain and obstacle dimensions, applying appropriate BC, and
running the simulation as in the Python LBM code.
The fluid flow is modeled using the Navier-Stokes equations, solved with the
Finite Volume Method (FVM). The domain is discretized using a structured grid,
where the grid cells are organized in a regular pattern. A higher mesh density is used to capture the boundary layer effects around the obstacle accurately, while the grid resolution in the interior of the channel is kept coarser, thus balancing efficiency with accuracy.

The element size implemented in the simulation is $h = 2.0616 \times 10^{-5}$ mm. Similar to the LBM counterpart in this work, a constant time-stepping method is employed with a fixed time step size of $\Delta t = 0.04$ s.
The BCs include a uniform inflow with a slight perturbation, an outflow condition at the domain's exit, and no-slip conditions on the obstacle surface to simulate the interaction between the fluid and the solid boundary.
The results from ANSYS Fluent provide a benchmark to compare and validate
the results obtained from the AI-generated Python code. The simulation results,
including velocity and pressure fields, are compared against the Python-based Lattice
Boltzmann Method (LBM) simulation in Sect. 4.1.

4.1 Comparison

To verify the results, we compare the velocity, streamline, and pressure profiles (Figs. 4 and 5) as well as the profiles along the midline of the geometry ($ny = 50$ mm) for both the Python code and the ANSYS simulations (Fig. 6). Velocity data is extracted from both simulations for this purpose. The comparison in Fig. 6a and b for the $x$-velocity and $y$-velocity profiles indicates excellent agreement of the present code with the results of the commercial solver.
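As an illustration of how such a midline comparison can be scripted, the following sketch plots the LBM x-velocity along the midline against reference data; the file name 'ansys_midline.csv' and its assumed column layout (x, u_x) are hypothetical and serve only as an example of the workflow:

import numpy as np
import matplotlib.pyplot as plt

def compare_midline(u, ref_csv='ansys_midline.csv', j_mid=50):
    # Plot the LBM x-velocity along the channel midline (y = j_mid)
    # against a profile exported from the commercial solver.
    ref = np.loadtxt(ref_csv, delimiter=',', skiprows=1)  # assumed columns: x, u_x
    x = np.arange(u.shape[1])                             # lattice nodes, 1 node = 1 mm
    plt.plot(x, u[0, :, j_mid], label='LBM (Python)')
    plt.plot(ref[:, 0], ref[:, 1], '--', label='ANSYS Fluent')
    plt.xlabel('x [mm]')
    plt.ylabel('x-velocity')
    plt.legend()
    plt.show()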
The profiles obtained from the Python code and ANSYS Fluent simulations show
a high degree of agreement. The streamline patterns and velocity magnitude plots
exhibit similar flow characteristics around the obstacle. Specifically:

• Streamline plots: The results of both simulation frameworks (Fig. 4a and c) exhibit the expected recirculation zone behind the obstacle, indicating that the flow separation and recirculation regions are captured accurately.
• Velocity contour plots: The present simulation results agree with those of the commercial solver with respect to the velocity distributions and the maxima (Fig. 4a and c).
• Velocity component profiles: The $x$-velocity and $y$-velocity profiles at the midline of the geometry (Fig. 6) are in excellent agreement between the present Python code and the commercial solver, which further strongly suggests the correctness of the GPT-4o-generated code.

Despite minor discrepancies, the overall agreement between the two sets of results
demonstrates that the GPT-4o-generated code accurately simulates the fluid flow
around the obstacle. The close match between the streamline patterns, velocity mag-
nitude distributions, and velocity profiles indicates that the Python code of this work
performs on par with the sophisticated commercial software ANSYS Fluent.

(a) Velocity (top) and Streamline (bottom) plot for the circular obstacle

(b) Pressure distribution plot for the circular obstacle

(c) Velocity (top) and streamline (bottom) plots from the ANSYS Fluent
simulation

(d) Pressure distribution plot for the circular obstacle from the ANSYS
Fluent simulation

Fig. 4 Simulation results from the Python code a, b with a comparison to a reference solution c, d from a commercial solver for the circular obstacle

(a) Velocity (top) and Streamline (bottom) plot for the square obstacle

(b) Pressure distribution plot for the square obstacle

Fig. 5 Simulation results from the Python code a, b for the square obstacle

4.2 Discussion of Flow Phenomena at Reynolds Number 50 Versus 300

Beyond the comparison carried out above, the characteristics of the flow around the circular obstacle at a Reynolds number of 50 are briefly analyzed and compared with additional results for Re = 300, which reveals interesting fluid dynamics phenomena. These can be categorized into three distinct regions:

4.2.1 Flow in Front of the Obstacle (Upstream)

• Flow Deceleration and Stagnation Point: As the fluid approaches the circular
obstacle, it begins to decelerate due to the obstruction. Directly in front of the
obstacle, a stagnation point forms where the flow velocity reduces to zero. At this
point, the fluid is diverted around the obstacle.
• Pressure Increase: The deceleration of fluid near the stagnation point causes a
rise in pressure in front of the obstacle, following Bernoulli’s principle. The flow

(a) $x$-velocity profiles

(b) $y$-velocity profiles

Fig. 6 Comparison of velocity profiles at the midline ($ny = 50$ mm) for the circular obstacle between the present Python simulation result and a reference solution from the commercial solver ANSYS

(a) Velocity contour plot (top) and streamline plot (bottom)

(b) Pressure distribution contour plot

Fig. 7 Simulation results for the circular obstacle at Re = 300

lines are closer together, indicating a pressure build-up. Note that for Re .= 50 the
pressure maximum is in some distance to the obstacle, Fig. 4b, for Re .= 300 right
at the obstacle, Fig. 7b.

4.2.2 Flow Passing the Obstacle (Along the Sides)

• Boundary Layer Development: As the fluid moves around the obstacle, a bound-
ary layer forms along the surface of the obstacle. This boundary layer is initially
laminar due to the low Reynolds number but starts thickening as it progresses
along the sides.
• Flow Separation: Given that the Reynolds number is around 50, flow separation
occurs on the sides of the obstacle. The flow cannot remain attached to the surface
of the obstacle as it curves around the sides. This results in the boundary layer
separating from the surface at some point, usually on the rear half of the obstacle.
• Symmetry in Separation: At Re = 50, the separation is symmetric in that the separation points on both sides of the obstacle occur at roughly the same position. For Re = 300, the separation shows minor asymmetry, leading to the formation of alternating vortices in the wake.

4.2.3 Flow Behind the Obstacle (Downstream)

• Recirculation Zone: Behind the obstacle, a recirculation zone forms due to the flow separation. In this region, the fluid moves in the opposite direction to the main flow, creating vortices. For Re = 50, these vortices, known as recirculating eddies or vortex pairs, show high symmetry. The length of the recirculation zone is relatively short at this Reynolds number. For Re = 300, in contrast, the recirculation zone exhibits alternating vortices. The flow enters a periodic vortex shedding regime, leading to the formation of a von Kármán vortex street. This phenomenon is clearly visible in the vorticity and streamline plots, where the alternating vortices on either side of the wake are prominent.
• Vortex Shedding Onset: At a Reynolds number of around 50, the flow is in a transitional regime where vortex shedding might start to occur; in the present case, however, there is no clear vortex shedding. For Re = 300, the wake behind the obstacle is characterized by alternating vortices forming on either side of the wake.
• Wake Symmetry: At Re = 50, the wake is symmetric, meaning that the vortices are of similar size and strength on both sides of the obstacle. Further downstream, at some distance from the obstacle, the flow returns to laminar. For Re = 300, the wake is inherently asymmetric due to the unsteady vortex shedding. The vortices alternate in strength and size, which is characteristic of the vortex shedding process in this Reynolds number range.
• Transition to Turbulence: The flow exhibits characteristics of a transition towards turbulence in the wake region. The alternating vortices begin to interact with each other, and their dynamics contribute to the development of complex, unsteady flow patterns further downstream.

These observations highlight the significant impact of increasing the Reynolds number on the flow behavior around the obstacle. The flow becomes more unsteady, with prominent alternating vortex shedding and complex wake dynamics, indicating the growing dominance of inertial forces over viscous forces compared to Re = 50.

5 Discussion

This section discusses various aspects of the project, including errors and omissions,
completeness, reproducibility, and the lessons learned during the process.

5.1 Errors and Omissions

During the code generation process, various errors and omissions were encountered, which can be categorized into major and minor issues.

• Major Issues:

– Index mismatching errors: These errors were frequent and challenging to rec-
ognize, often leading to the code failing to execute properly.
– Non-reproducibility of code: The generated code varies significantly with each
attempt, even with the same prompt. This inconsistency makes it difficult to
achieve reliable results.
– Incomplete outputs: On several occasions, the code runs without errors but fails
to produce any output, resulting in blank profiles.

• Minor Issues:

– Missing details: Despite providing detailed prompts, GPT-4o sometimes misses implementing simple yet crucial details.

5.2 Completeness and Short Term Memory Capacity

GPT-4o demonstrates a good capacity to implement details as specified in the prompt. However, it is noted that while detailed prompts are necessary, extremely detailed prompts can cause GPT-4o to forget some of the details, leading to incomplete implementation. It is crucial to strike a balance in the level of detail provided in the prompts to ensure the best results.

5.3 Reproducibility as a Signature of Reliability

Unfortunately, the reproducibility of the code using the same prompt is not assured.
Several observations are noted during the generation of the code and the formulation
of the prompt:

• GPT-4o generates different codes for each trial, with many instances resulting in
compilation errors such as array dimension mismatches.
• Assigning GPT-4o a persona of an expert in the respective domain is beneficial to a large extent. Additionally, when dealing with specialized subjects, providing GPT-4o with useful principles and equations enhances the accuracy of the results.

• The previous history of the session fed to GPT-4o plays a critical role in generating further results more accurately according to the need. Therefore, the same assignment, even with the same formulation, may not produce identical outputs due to the influence of the session history.

5.4 Learned Lessons

Several valuable lessons were learned during the project:


• The importance of providing clear, detailed, and structured prompts to GPT-4o to
improve the quality and accuracy of the generated code.
• Understanding the limitations of GPT-4o in terms of reproducibility and the influ-
ence of session history on the generated outputs.
• Recognizing the necessity of human intervention to refine and validate the results,
ensuring the accuracy and reliability of the simulations.
• Appreciating the potential of AI-assisted coding in generating complex algorithms,
while also acknowledging the need for human expertise to refine and verify the
results.

Overall, this project highlights both the strengths and limitations of using AI
tools like GPT-4o for CFD simulations. The experience underscores the importance
of combining AI capabilities with human expertise to achieve reliable and accurate
results.

6 Conclusion

This chapter demonstrated the capability of GPT-4o to generate functional Python code for simulating fluid flow around an obstacle using the LBM. The generated code
was verified in a benchmark simulation against the results of the commercial solver
ANSYS Fluent; the comparison indicates a high degree of agreement and confirms
the accuracy of the AI-assisted coding approach.
While GPT-4o proved to be a powerful tool in generating complex CFD algorithms, human intervention was crucial for refining, debugging, and verifying the results in the present work. In particular, we pointed to challenges related to reproducibility and error handling, emphasizing the need for clear, balanced prompts and
human oversight. Despite these challenges, the verification against ANSYS Fluent
demonstrated that GPT-4o-generated code can perform on par with advanced com-
mercial software, underscoring the viability of GPT-4o-assisted coding for scientific
research and engineering applications.
Overall, while GPT-4o cannot completely replace human expertise, it can signifi-
cantly speed up the coding process, making it a valuable tool for enhancing efficiency
and innovation in computational simulations.

References

1. T. Krüger, H. Kusumaatmaja, A. Kuzmin, O. Shardt, G. Silva, and E. M. Viggen, "The Lattice Boltzmann Method: Principles and Practice," Springer, 2017. https://doi.org/10.1007/978-3-319-44649-3.
2. S. Chen and G. D. Doolen, “Lattice Boltzmann Method for Fluid Flows,” Annual Review of
Fluid Mechanics, vol. 30, pp. 329–364, 1998. https://doi.org/10.1146/annurev.fluid.30.1.329.
3. J. Yojina, W. Ngamsaad, N. Nuttavut, D. Triampo, Y. Lenbury, P. Kanthang, S. Sriyab, and
W. Triampo, “Investigating flow patterns in a channel with complex obstacles using the lattice
Boltzmann method,” Journal of Mechanical Science and Technology, vol. 24, no. 10, pp. 2025–
2034, 2010.
4. W. Abassi, F. Aloui, S. Ben Nasrallah, and J. Legrand, “Lattice Boltzmann Method Used to
Simulate an Unsteady Flow Around an Obstacle in Laminar Regime,” in ASME/JSME/KSME
2011 Joint Fluids Engineering Conference, Hamamatsu, Japan, July 2011.
5. P. Mocz, “latticeboltzmann-python: Lattice Boltzmann simulation in Python,” GitHub repos-
itory, 2023. [Online]. Available: https://github.com/pmocz/latticeboltzmann-python. [Online
accessed: June-2024].
6. A. Majumder, V. Kariwala, S. Ansumali, and A. Rajendran, "Lattice Boltzmann method for population balance equations with simultaneous growth, nucleation, aggregation and breakage," Chemical Engineering Science, vol. 69, no. 1, pp. 316–328, 2012. https://doi.org/10.1016/j.ces.2011.10.051.
7. D. Heubes, A. Bartel, and M. Ehrhardt, An Introduction to the Lattice Boltzmann Method for
Coupled Problems, Preprint BUW-IMACM 11/28, Institute of Mathematical Modelling, Anal-
ysis and Computational Mathematics (IMACM), Bergische Universität Wuppertal, December
2011.
8. Z. Guo and C. Shu, Lattice Boltzmann Method and its Applications in Engineering. World
Scientific Publishing Co. Pte. Ltd., 2013.
Learned Lessons—Recommendations

Bernhard Eidel, Rahul Narkhede, and Aagashram Neelakandan

Abstract This chapter condenses the lessons learned in different chapters of this
book into recommendations for prompt engineering of Python programming tasks
for the chatbot GPT-4/4o in the areas of computational materials and mechanics.
Beyond, detailed recommendations for code verification are provided.

1 Responsible Usage of GPT-4 in Coding Tasks

Expert knowledge in the areas in which the chatbot is used as a programming assis-
tant is absolutely necessary for responsible use. This refers to the domain of model-
ing problems in nature and the sciences by differential equations and to numerical
methods employed for their solution. It no less refers to the field of professional
programming.
Hence, this expert knowledge must be readily available before the prompt is shaped, and it will certainly be of great help in crafting the prompt. This expert knowledge is equally required in rigorous tests of the code, comprising verification tests as well as validation tests. In this book we restricted ourselves to verification, which is, briefly, the
assessment of the accuracy of the code and its solution to a computational model by
comparison with known solutions. Validation is the assessment of the accuracy of a
computational simulation by comparison with experimental data.
While GPT-4 proved to be a powerful tool in generating complex algorithms,
human intervention was throughout necessary in all chapters of this book.

B. Eidel (B) · R. Narkhede · A. Neelakandan


M5, IMFD, TUBAF, Freiberg, Germany
e-mail: [email protected]
R. Narkhede
e-mail: [email protected]
A. Neelakandan
e-mail: [email protected]


2 Tangible Advice for Prompt Engineering

In working with GPT-4 to generate Python code for computational problems, par-
ticularly in computational materials science, solid mechanics, wave propagation,
and fluid mechanics, we have distilled several key lessons and strategies that can
enhance the effectiveness of AI interactions. The following are detailed recommendations for prompt engineering that have proven valuable in eliciting high-quality responses from GPT-4. We include links to the general characteristics of LLMs and their performance in chatbots.
In a book context, where the interaction with GPT-4 is limited to one or a few well-
constructed prompts, it becomes essential to design each prompt carefully to ensure
it leads to high-quality code generation. Multiple iterative prompts or feedback loops
are often impractical, so the focus should be on refining each prompt to be as effective
as possible from the outset.
1. Persona Assignment: Generating Context
One effective technique is the strategic use of persona assignment, where GPT-4
is attributed a specific role or context. By assigning a persona as an expert with
knowledge in, e.g., computational materials science with particular expertise in
grain growth, the generated solutions are more aligned with user expectations.
This helps GPT-4 frame its answers appropriately, adding a layer of sophistication.
This approach provides context, ensuring that the output is relevant both in terms
of solution and explanation. The persona assignment can be also used to tailor
the responses generated by the chatbot to follow a certain format and take certain
general instructions into consideration.
2. Be Explicit in What You Need: Specify Language, Output Format, or Tech-
niques
Clarity in prompts leads to clarity in responses. It is essential to specify the coding
language, the methods or libraries to be used, and the desired output format. Some
measures to ensure clarity in the prompts are listed here:

• Provide equations or relations in LaTeX format to ensure accurate interpretation by GPT.
• Mention the datatype of the inputs and expected outputs, with examples. If the
inputs are images or any other form of data, mention the file format, e.g. ‘.jpg’
or ‘.tif’.
• Clearly mention the expected dimensions of matrices and vectors.
• If a term has the possibility of being interpreted differently by GPT, then briefly
specify the desired meaning in the prompt.

This ensures the solution is tailored precisely to the problem's requirements, avoiding irrelevant approaches.
3. Iterative Refinement Process
From our experience in almost all topical chapters, the first iteration of a prompt
was very rarely the final one yielding the best code output. In cases of complex problems requiring decomposition into multiple steps, it is suggested to start a new iteration in a new chat.
One can also use GPT-4 to create a prompt using the solution from the previous
iteration or briefly explain the desired output. This may act as a good starting
point for subsequent prompt refinement.
By refining the prompt before the final submission, the output will be more
complete, reducing the need for further iterations.
4. Problem Decomposition
When addressing complex tasks, it is beneficial to break the problem into smaller,
manageable components. Instead of requesting a complete solution at once,
decomposing the task improves both clarity and output quality. Ideally, prob-
lems should be broken down into reasonable and logical parts, e.g., in machine
learning, the problem can be split into the steps of the workflow, i.e., model inputs,
model creation, training, and testing. GPT-4 can also be prompted to tackle a com-
plex problem in a step-wise manner by adding the line “Let’s think step-by-step”
to invoke the chain of thought technique of prompt engineering. If the prompt-
response chat becomes too long, it is recommended to provide direct references
to previous responses or statements by simply adding them to the current prompt.
Thus, the issues rendered by a limited memory can be overcome.
This step-by-step approach structures the interaction, guiding GPT-4 to handle
complex problems methodically.
5. Embedded Explanation and Commentary: Chain of Thought in a Single
Prompt To enhance the clarity of the generated code, instruct GPT-4 to include
explanations and commentary within the code. This mimics the chain of thought
technique, ensuring the AI provides reasoning alongside its solution.
By embedding explanations within the code, the model generates both a solution
and a tutorial-like guide for the reader.
6. Self Reflection Invocation: Questioning the Correctness When dealing with a
complex topic or new areas, a prompt with very little information about the theory
might lead GPT-4 to misunderstand which domain of knowledge the prompt
refers to. One way of addressing this is to ask follow-up questions on the topic of
interest to fully understand GPT-4’s position. Another method is to include self-
reflection comments or critique-style questions, where GPT-4 questions whether
the information provided is correct or not as a form of self-reflection. This typically
improves the quality of data retrieved from a file, but can also be applied to general
prompts [3].
By incorporating critiquing with self-reflection, the accuracy of text retrievals
from external files can be improved, and hallucinations in responses to general
prompts can be reduced.
7. Addressing Hallucinations in Prompt Engineering Hallucinations, a factually
incorrect or fabricated content generated by language models, pose a major chal-
lenge for reliable use of AI, especially in scientific domains where accuracy is
paramount [2]. Hallucinations arise from the model’s training data, data sources,
and inference methods, leading to errors when the model produces outputs based
on memorized or low-confidence data. While end-users have limited control
over data and training-related hallucinations, errors at the inference level can
be minimized by employing strategies that prioritize faithfulness to the user’s
prompt.
For instance, Faithfulness Enhanced Decoding techniques adjust the model’s
probability weighting to prioritize user-provided context and instruction align-
ment, effectively reducing hallucinatory outputs by focusing on the most relevant
tokens [1]. Moreover, detailed prompts with clear and precise context reduce
ambiguity, helping the model generate responses that are more likely to adhere
closely to the input and reduce error. When the input prompt is sufficiently
detailed, specific and clear, the likelihood of hallucinations decreases, enhancing
response accuracy.
Therefore, for high-stakes scientific applications, detailed, well-structured
prompts are critical. This includes iterative refinement, prompt adjustments,
and, if needed, employing structured query patterns like chaining commands to
guide GPT-4 in generating fully operational response.

3 Tests for Code Verification, Inbuilt and Separate

Tests for code verification are indispensable. We can distinguish between testing actions we already ask the chatbot to carry out, inbuilt in the prompt, and tests we devise and design ourselves.
1. Incorporate Testing and Verification: Request Edge Case Handling
Given the importance of verification, prompt GPT-4 to include testing cases, especially for edge conditions. This promotes robust code generation, essential in computational mechanics (a minimal example test is sketched after this list).
2. Check Initialization of Problems
Several problems require an initialization of variables over a domain or an initialization of the parameters of a model. If the initialization is incorrect, the solution procedure, despite being correct, can produce wrong results. Thus, problem initialization should be checked before moving on to the verification of the rest of the solution provided by GPT-4.
3. Verify Boundary Conditions
Despite clear instructions, GPT-4 may omit some boundary conditions of the problem. A simple check of the values of the variables at these boundaries can be made to ensure that they are included correctly in the problem (see the sketch after this list).
4. Ensure Data Inputs are Correct
For problems requiring external sources of data to be imported into the model, the correct processing of the data should be ensured. In image segmentation, for example, if the masks are not assigned correctly, the models will not produce the correct segmentation.
5. Mathematical Accuracy of Numerical Implementation
Numerical methods applied for solving differential equations can be implemented with minor inaccuracies which can cause major deviations from the expected outputs. For example, in the summation $\sum_{i \neq j}^{N} (\eta_i^n)^2$, the generated code can miss out on enforcing $i \neq j$, resulting in an erroneous summation. Such details of numerical implementation should be checked (see the sketch after this list).
6. Comparison with Analytical Solutions in Particular Settings
Some complex problems can be reduced to a form where an analytical solution
exists. Providing this information within the prompts helps GPT-4 generate the
corresponding analytical equations, which can then be coded to verify the results.
However, if the generated results are unsatisfactory or incorrect, providing the
explicit analytical form in the form of equations will also work.
7. Comparison with (Commercial) Software Systems When verifying codes
against commercial software, there are several steps and considerations to ensure
a meaningful and robust comparison; (i) define the scope and objectives clearly
(choice of a problem that can be modeled equivalently in the generated code
and the commercial software; choice of comparison metrics; preference of rather
simple benchmark problems), (ii) ensure model consistency (geometry, material
properties, boundary conditions, loading, mesh, etc.), (iii) match solver settings (for, e.g., finite elements: element formulation, numerical integration, solver algorithm, tolerances), (iv) handle nonlinearities carefully.
Observed differences should be interpreted cautiously, since they can arise for hidden reasons. As an example, commercial software often uses highly optimized and proprietary algorithms that may differ in subtle ways, and slight differences in floating-point arithmetic can accumulate over iterative calculations. Finally, we recommend treating commercial software as a "gold standard" cautiously; its results are not inherently infallible.
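The following minimal sketches illustrate three of the checks above; all function and variable names are illustrative assumptions, not taken from any specific chapter of this book.

An edge-case style test (item 1) can, for instance, assert mass conservation of a periodic LBM streaming step:

import numpy as np

def test_streaming_conserves_mass():
    # A periodic streaming step must neither create nor destroy mass
    # (the zeroth moment of the distributions).
    rng = np.random.default_rng(0)
    e = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
                  [1, 1], [-1, 1], [-1, -1], [1, -1]])
    f = rng.random((9, 8, 8))
    mass_before = f.sum()
    for i in range(9):
        f[i] = np.roll(np.roll(f[i], e[i, 0], axis=0), e[i, 1], axis=1)
    assert np.isclose(f.sum(), mass_before)

A boundary-condition check (item 3) can verify that the velocity indeed vanishes on solid nodes:

def check_no_slip(u, solid_mask, tol=1e-12):
    # u has shape (2, nx, ny); solid_mask is a boolean array of shape (nx, ny).
    assert np.all(np.abs(u[:, solid_mask]) < tol), "no-slip BC violated"

And the summation detail of item 5 can be made explicit and testable:

def sum_excluding_j(eta, j, n=2):
    # Sum of eta_i**n over all i != j; subtracting the j-th term is the
    # detail a generated code can silently omit.
    eta = np.asarray(eta, dtype=float)
    return (eta**n).sum() - eta[j]**n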

4 Current Limitations–Where, Why and How GPT-4 Fails and How to Overcome It

1. Long, Complex Coding Tasks
Even though GPT-4 has a large context window of 128,000 tokens [4], it is not sufficient for long tasks. This can be mitigated in multiple ways, for example:
• Explicitly informing GPT-4 that the prompt is long and to wait for the complete code.
• Breaking the code down into sections (instead of splitting at the maximum token count) helps GPT-4 to better understand the context, rather than stopping at a random point.
Using self-reflection and chains of thought improves the response quality.
2. Task Skipping
When structuring highly complex prompts with multiple tasks that reference each
other, such as “Task 1. Task 2. Task 3 refers to Task 2. Task 4 now refers to Task
3...”, GPT-4 might forget some tasks, leading to their complete omission.

In cases where task skipping is evident, it is always recommended to restructure the prompts in a clearer and more straightforward fashion.
3. Sticking to Older Library Versions
For Python libraries that have been updated to newer versions recently, GPT-4
may provide code with the syntax and usage according to older versions. This may
cause incompatibility with other libraries at times. To remedy this issue, a prompt
containing the current usage of library methods from the original documentation
can be added to update GPT-4 generated code.
4. ‘r’s in Strawberry
One of the well-known examples of GPT-4’s tokenization issue is asking, “How
many ‘r’s are in the word ‘Strawberry’?”. Despite the obvious answer being 3,
GPT-4 often provides different answers, a result of the tokenization process. GPT-4 typically reads tokens based on attention, meaning that when it processes the question, information about the word might appear before or after the prediction, and the weight of the last token might not influence the result.
Self-reflection, by prompting the same question again or questioning the correct-
ness of the answer, might resolve the issue. Another possible approach is to add
the instruction “Compute with code.” to the prompt. This allows GPT-4 to com-
pute the answer by generating a program that verifies and provides the correct
response. However, in a recent version before book publication, this issue has
been patched.

References

1. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B. & Liu, T.: A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. ACM Transactions on Information Systems (2024)
2. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A. & Fung, P.: Survey of hallucination in natural language generation. ACM Computing Surveys 55, 1–38 (2023)
3. Asai, A., Wu, Z., Wang, Y., Sil, A. & Hajishirzi, H.: Self-RAG: Learning to retrieve, generate, and critique through self-reflection. arXiv preprint arXiv:2310.11511 (2023)
4. OpenAI, "OpenAI Platform - Models Documentation," 2024. [Online]. Available: https://platform.openai.com/docs/models/. [Accessed: 26 Sep 2024]
Index

A
Absorbing BC, 184
Acronyms, xi
Allen-Cahn equation, 56
ANSYS Fluent, 243

B
Basis, 3
Bernoulli's principle, 245
Bhatnagar-Gross-Krook (BGK) approximation, 229, 230
Body Centered Cubic (BCC), 5, 24
Bounce-back rule, 231
Boundary Conditions (BCs), 183
Boundary layer, 247
Bravais lattice, 3
Burgers vector, 4, 6, 22

C
Cellular Automata (CA), 84
Charge density, 203
Contributors, ix
Convolutional neural network, 128
Courant-Friedrichs-Lewy (CFL) condition, 185
Crystal, 3, 5
Crystal lattice, 5
Current density, 203

D
D2Q9 model, 228, 229
Data augmentation, 130
Data splitting, 129
Deep learning, 127
DenseNet-121, 128
Dielectric, 210
Dirichlet BC, 184
Dislocation, 4, 17
Dislocation line, 4

E
Edge dislocation, 4
Edge displacement, 17
Electric field, 203
Electric flux density, 203
Electromagnetic energy, 221
Energy conservation, 50, 220
Ensembling, 130
Equilibrium distribution, 229
Error function (erf), 117
Excitation functions, 183

F
Face Centered Cubic (FCC), 5, 24
Fick's Second Law (FSL), 85
Finite difference method, 56
Finite Difference Time Domain (FDTD), 182, 210
Finite Volume Method, 242
Flow separation, 243
Fluent, 242
FTCS scheme, 109

G
Gaussian function, 183
Gaussian sinusoidal pulse, 209
Generalizability, 127
Graphics Processing Unit (GPU), 130

H
Hallucinations, 77, 119, 120
Heat conduction, 109
Hexagonal Close Packed (HCP), 5, 24
High-temperature corrosion, 85
Human expertise, 250
Hyperparameters, 127

I
ImageNet, 128
Initial Boundary Values Problem (IBVP), 183
Initial Conditions (ICs), 183
Inlet and outlet BC, 231
Integrated Development Environment (IDE), 223

L
Laminar, 247
Large Language Models (LLMs), 23, 26, 119, 181
Lattice Boltzmann Method (LBM), 228
Lattice constant, 4, 8
Lattice grid, 232
Lattice parameter, 4
LBM collision, 230
LBM-grid, 229
LBM streaming, 230
Learning rate, 127, 142
Lennard-Jones potential, 34

M
Magnetic field, 203
Magnetic flux density, 203
Maxwell-Boltzmann distribution, 36
Maxwell equations, 203
Mean Square Error (MSE), 118
Mesh density, 242
MicroNet, 146
Microstructure, 142
Mur BC, 184

N
Native orientation, 16
Navier-Stokes equations, 242
Non-Bravais lattice, 3
No-slip BC, 231

O
Optimization, 130
Overfitting, 130
OVITO, 16, 23

P
Parameters, 127
Partial Differential Equation (PDE), 182
Perfectly Matched Layer (PML), 206, 208
Periodicity, 9
Permeability, 204
Permittivity, 204, 210
Pitting corrosion, 85
Poisson's ratio, 7
Pristine crystal, 17
PyTorch, 130

R
Reaction Kinetics (RK), 86
Recirculation, 248
Reflecting BC, 184
Reflection, 210
Refraction, 210
ReLU, 142
Reproducibility, 26, 249
ResNet50, 128, 146
Rotation matrix, 27

S
Scanning Electron Microscope (SEM), 125, 142
Scattering, 210
Screw dislocation, 4
Screw displacement, 18
Segmentation, 145
Short-term memory capacity, 30
Short term memory, 25
Single crystal, 16
Sinusoidal function, 183
Stagnation point, 245
Statistical ensembles, 36
Structure generation, 15
Superalloy, 146

T
Test, 129
Training, 129
Training data, 127
Transfer learning, 128, 142, 146
Transverse electric, 205
Transverse magnetic, 205

U
UNet, 146
UNet++, 146
Unit cell, 8

V
Validation, 127, 129
Van der Waals forces, 34
Velocity Verlet integration, 36
Voronoi tessellation, 57, 58, 63, 71
Vortex, 248
Vorticity, 248

W
Wave equation, 182
Weights, 127
