
Available online at www.sciencedirect.com
ScienceDirect

Comput. Methods Appl. Mech. Engrg. 385 (2021) 114008

www.elsevier.com/locate/cma

A machine learning framework for accelerating the design process using CAE simulations: An application to finite element analysis in structural crashworthiness

Christopher P. Koharᵃ, Lars Greveᵇ, Tom K. Ellerᵇ, Daniel S. Connollyᵃ, Kaan Inalᵃ,*

ᵃ Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1
ᵇ Volkswagen AG, Group Innovation, 38436 Wolfsburg, Germany

Received 18 December 2020; received in revised form 17 May 2021; accepted 15 June 2021
Available online 13 July 2021

Abstract
This paper presents a novel framework for predicting computer-aided engineering (CAE) simulation results using machine
learning (ML). The framework is applied to finite element (FE) simulations of dynamic axial crushing of rectangular crush
tubes that are typically used in vehicle crashworthiness applications. A virtual design of experiments that varies the size and
wall thickness of the FE model is performed to generate the necessary training data. This process generates designs with
varying numbers of nodes and elements that are handled by the ML system. However, the explicit design parameters and
meshing techniques that were used to generate the training data remain unknown to the ML system. Instead, 3D convolutional
neural networks (CNN) autoencoders are used to process the initial FE model data (i.e., nodes, elements, thickness, etc.)
to automatically determine these features in an unsupervised manner. A voxelization strategy that operates on the mass of
individual nodes is proposed to handle the unstructured nature of the nodes and elements while capturing variations in the
wall thickness of the FE models. The flattened latent space generated by the 3D-CNN-autoencoder is then used as input into
long-short term memory neural networks (LSTM-NN) to predict the force–displacement response as well as the deformation
of the mesh. The training process of both the 3D-CNN-autoencoders and LSTM-NN is systematically studied to highlight the
robustness of the framework. The proposed ML system utilizes only 16% of the simulations generated in the virtual design
of experiments to achieve good predictive capability. Once trained, the proposed framework can predict the deformation of
the mesh and resulting force–displacement response of a new design up to ∼330 and ∼2,960,000 times faster, respectively,
than the conventional FE approach with good accuracy. This computational speed up offers design engineers and scientists a
potential tool for accelerating the design exploration process with CAE simulation tools, such as FE analysis.
© 2021 Elsevier B.V. All rights reserved.

Keywords: Machine learning; CNN model; Autoencoder; LSTM model; Finite element simulation; Crashworthiness

1. Introduction
Automakers rely on computer-aided engineering (CAE) software and simulation tools, such as commercial
finite element (FE) analysis packages, to predict the performance of lightweight structures before manufacturing
∗ Corresponding author.
E-mail address: [email protected] (K. Inal).

https://doi.org/10.1016/j.cma.2021.114008
0045-7825/© 2021 Elsevier B.V. All rights reserved.

and production. These FE tools allow design engineers an opportunity to explore various features within a
component or assembly without physical prototyping. In addition, design engineers can perform parametric studies
by defining explicit sets of parameters that explore the design space to predict component performance. Engineers
can manipulate the meshes and geometry of the FE model to investigate the geometric impact on performance.
Several new challenges have emerged as a result of designing vehicles in a virtual environment. The continuous
enhancement of vehicle safety [1,2] and emission requirements [3–5] (through lightweighting methods) requires
an increasingly complex CAE development process that leads to a large number of simulations, which generates
a substantial amount of data. The complexity of these models also leads to substantial computational and time
requirements despite recent improvements in computing technologies for FE calculations [6].
At the present moment, the global economy is amidst the latest industrial revolution that has been sparked by
data science and big data across all economic sectors, including the automotive and transportation sectors. Machine
learning (ML) within the field of artificial intelligence (AI) has emerged as a solution to cope with the challenges
that have arisen during this data science revolution. ML has made substantial developments in the area of human–
computer interactions through natural language processing [7,8], financial forecasting [9], and diagnostic tools for
medical analysis [10,11]. ML and AI applications in the automotive sector have primarily focused on advanced
driver assistance systems, vehicle controls, and autonomous driving systems [12,13]. This creates an opportunity for ML to exploit the pre-existing big data sets from CAE simulations that currently sit largely unused on data systems, and to apply them to future vehicle development.
ML has been applied to aid various aspects of the development process of CAE simulations in vehicle lightweighting, such as enhancing material modeling calculations [14–18] and speeding up CAE model calculations [19–23].
In many of these applications, artificial neural networks (NN) are used as a metamodel to generate a relationship
between a set of desired input to output functions through regression modeling in a supervised manner [24]. Multiple
layers of neurons can add complexity to the model to extract higher-level knowledge from the data set during
training; this is known as deep learning (DL). In structural lightweighting with FE simulations, virtual design of
experiment studies are performed to generate the necessary data for training these ML metamodels. Once these
ML metamodels are identified, optimization algorithms can be leveraged to navigate the design domain to enhance
various performance metrics [25–31]. Traditionally, ML metamodels used in structural lightweighting relate explicit
design parameters to multiple scalar quantities (i.e., mass, absorbed energy, peak force, etc.). Instead, the time-series
response of an output response (i.e., force–time) can be captured in a metamodel to provide additional insight for
a design engineer to explain the observed response and performance.
ML techniques have been used to predict time-series responses for desired output functions in structural
lightweighting applications with FE simulations. Lanzi et al. [32] used an artificial NN to predict force–time responses obtained from PAMCRASH simulations of dynamic crushing of rivetted assemblies, honeycomb structures,
and a helicopter subfloor assembly. The input to their artificial NN was dependent on a set of explicit parameters
that were used to create their virtual design of experiments for training. The current instance of time was also
used as an input to their NN. However, no history-dependent state or sequence, such as the previous values in
the force–time response, was utilized in their predictions. It is important to note that these ML metamodels are
generating approximate solutions to a system of non-linear partial differential equations that are governed by the
mechanics and physics of solids and are time-dependent.
Temporal effects can be incorporated into ML metamodels through deep learning with recurrent neural networks
(RNN). RNNs are a network architecture that predicts the next sequence of output and internal states based on the
current input and previous history of the input, output, and internal states. Various types of RNN have been used
as ML metamodels for time-series predictions in structural lightweighting applications with FE simulations. Omar
et al. [33] used an RNN to predict the deceleration response of a simple box crush tube subjected to different initial
impact speeds. They also used an RNN to predict the average deceleration response of a full-vehicle FE model
subjected to different initial speeds [33]. Recently, Kohar et al. [34] used a long-short term memory neural network
(LSTM-NN), a specific RNN architecture, to predict the deceleration response of a full-vehicle FE model of a
pickup truck. Their ML metamodel mapped the material properties (i.e., yield strength, ultimate tensile strength,
hardening) and the gauge thickness throughout the vehicle to the time-series response of the occupant and passenger
deceleration. Van de Weg et al. [35] utilized an LSTM-NN to predict the bifurcation and fracture behavior of FE
simulated tensile coupons, which are typically used in the characterization of automotive steels in lightweighting
applications. Their model mapped a single known geometric parameter through the surrogate model to predict when
fracturing would occur.

However, there are substantial barriers to applying these ML methods to exploit the already existing data sets
that exist within these automaker companies. In the studies mentioned above, the ML metamodel relies on a
common set of parameters that have been explicitly defined at the beginning of the data generation stage. It is
challenging to explicitly define a common set of parameters across various vehicle platforms that have fundamental
differences in design and shapes. This is compounded by the fact that FE models can have significantly different
mesh topologies and discretization schemes (i.e., number of nodes, elements) across varying design configurations
to maintain standard practices for FE analysis. These variations introduce an unstructured nature into the dataset that
needs to be handled by the ML metamodel. Next, the studies mentioned above are often limited to predicting only
a few time-dependent scalar quantities (i.e., accelerations, force–displacement, energy absorption). To the authors' knowledge, few if any studies attempt to predict and visualize a deformed structure or mesh from FE simulations, let alone without using the set of explicit design parameters [36].
Recent advances in convolutional neural networks have shown promise in feature extraction from 3-dimensional images and data sets, such as FE mesh data. A convolutional neural network (CNN) is a specific
machine learning architecture that mimics the recognition process of animals for identifying features and objects in
images. CNNs use pixel or voxel maps of an image set or array as the input vector for feature identification. CNNs
were initially developed for 2D image processing and handwriting recognition [37,38]. However, CNNs have been
extended to perform analysis on 3D image sets, such as processing magnetic resonance imaging (MRI) scans for illness and disease detection [39,40]. Recently, deep 3D-CNNs have been applied to analyze features
in computer-aided design (CAD) components that are used in CAE simulations [41]. Zhang et al. [42] used deep 3D-CNNs to identify a series of standard CAD features, including size and shape, that are commonly
used in design. Yet, this again requires classification and labeling of metadata within these datasets to facilitate a
training algorithm. This is a challenge when presented with large data sets where the classification process of the data
is performed manually or does not exist. Autoencoders [43,44] are a technique that utilizes unsupervised learning
to transform high-dimensionality data (i.e., images, voxel maps) into the low-dimensionality latent space without
explicit feature definition. This low-dimensionality latent space consists of features identified by the autoencoder to
differentiate between training samples but is abstracted from human comprehension. When combined with CNNs,
autoencoders have the potential to self-learn a latent space of features from a data set consisting of 3-dimensional
bodies (i.e., FE mesh data) that can then be used to predict some time-series behavior using RNNs that has physical
meaning (i.e., force–displacements, deformation response).
This work presents a novel framework for using machine learning to predict results from CAE simulations. The
framework is demonstrated using an academic example that is based on FE simulations presented in Kohar et al. [45]
for dynamic axial crushing of various rectangular crush tubes. A virtual design of experiments is performed to
generate the necessary training data for this study. Each FE simulation has variations in structure (i.e., size, shape,
thickness) and mesh (i.e., number of nodes, number of elements, etc.) that are unknown to the ML system. The ML
system consists of a deep network consisting of a 3D-CNN-autoencoder and an LSTM-NN to capture the response
of the FE models. 3D-CNN-autoencoders are used to process the initial FE model data (i.e., nodes, elements,
elemental thicknesses, etc.) to automatically determine these features in an unsupervised manner. A voxelization
strategy is proposed to handle the unstructured nature of the FE model data for use in the 3D-CNN-autoencoder. The
architecture of the 3D-CNN-autoencoder is systematically studied to understand the robustness of the approach and
the feature space that is created. These features generated by the 3D-CNN-autoencoder are then used as the inputs
into LSTM-NN to predict the dynamic axial crushing response of these rectangular crush tubes. These predictions
include the force–displacement and resulting deformation response of the structures. Similarly, architecture studies
for the LSTM-NN are presented to evaluate the robustness of this framework. More importantly, the amount of
data required to train the LSTM-NN is also presented to identify the minimum number of simulations required to
calibrate the framework.

2. Machine learning-based models


2.1. Feed-forward neural networks

The feed-forward neural network is a standard artificial NN architecture used in machine learning that is based
on the connection of neurons to mimic the structuring of the human brain. A single neuron can receive multiple input signals, and the neuron weighs these input signals to determine whether or not to activate. Fig. 1 presents a schematic of a single neuron.

Fig. 1. Schematic of a single neuron.
A single neuron, i, takes the values of an input vector, x_j, and scales them by corresponding weights, w_ij. The products of this operation are then summed and shifted by a bias value, b_i. Each weight and bias is unique to each input and neuron, respectively. As such, the output of the neuron, ỹ_i, is governed by an activation function, f, such that

ỹ_i = f( Σ_j w_ij x_j + b_i )    (1)
Activation functions were originally designed to mimic the action potential of a neuron, but were then extended
to other mathematical functions to improve model performance for certain problems. These activation functions
include a simple linear function, rectified linear unit (ReLU), hyperbolic tangent, and sigmoidal function.
Various numbers of neurons can be connected to the same input vector with their own unique set of weights.
Furthermore, the outputs of neurons can be passed as the input into another neuron to form dense layers of a
network. This produces a “feed-forward” architecture where the result of each scaled input propagates forward
towards the output dense layer without feedback. Fig. 2 presents a schematic of a feed-forward NN. Each dense
layer, l, contains its own neurons with corresponding connections to the next layer. Each connection is associated with a unique weight, w_ij^(l), and each neuron with a bias, b_i^(l). These weights and biases are determined through a learning algorithm that attempts to minimize a loss function between the predicted output of the neural network, ỹ_o, and the actual output, y_o, from a set of training data. Common learning algorithms calculate the gradients of the loss function with respect to the weights and biases in each layer through backpropagation [46,47] and minimize the loss using optimizers, such as the scaled conjugate gradient [24], Levenberg–Marquardt [48,49], and Adam [50] methods. The
structuring of these input–outputs forms a massively parallel architecture that can be vectorized for computational
efficiency. Multiple output relationships, ỹo , can be established through a single network, which can further
accelerate computation.
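To make Eq. (1) and the feed-forward structure concrete, the following minimal sketch implements a single neuron in NumPy and assembles a small dense network in Keras (the library used for training later in this paper); the layer sizes and activations here are illustrative assumptions, not the architectures studied in this work.

```python
import numpy as np
import tensorflow as tf

def neuron(x, w, b, f=np.tanh):
    """Single neuron of Eq. (1): y_i = f(sum_j w_ij * x_j + b_i)."""
    return f(np.dot(w, x) + b)

# A small feed-forward NN: dense layers propagate the scaled inputs
# forward towards the output layer without feedback.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="linear"),  # multiple outputs y_o
])
model.compile(optimizer="adam", loss="mse")  # minimized via backpropagation
```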

2.2. Recurrent neural networks

A recurrent neural network (RNN) is a network architecture that can be used to capture time-dependent data
or sequences. RNNs utilize the previous evaluation of the output, along with the temporal evolution of the input
vector of the network, as part of the input for the next evaluation in time. However, the evolution of error during
backpropagation over many network layers (as commonly observed in RNNs) tends to lead to exploding or vanishing
gradients [51]. The long-short term memory neural network (LSTM-NN) [51] is a particular RNN architecture that
is effective in overcoming the exploding or vanishing gradient problem by strategically introducing internal memory
cells that control the error propagation [52].

2.2.1. Long-short term memory neural networks


Fig. 2. Schematic of a feed-forward multi-layer NN.

Fig. 3. Schematic of an LSTM cell [34].

Fig. 3 presents a schematic of a single LSTM cell. Each LSTM cell maps an input state of size n at an instance of time, x_t ∈ R^n, into the cell and outputs the memory cell state, c_t ∈ R^h, and an intermediate hidden state, h_t ∈ R^h, that is decoded to the desired output. Additionally, the cell receives the previous state of the cell memory, c_{t−1}, and the previous intermediate hidden state, h_{t−1}. The input gate, i, forget gate, f, cell candidate gate, g, and output gate, o, control the temporal evolution of the backpropagated error response. The formulation of the associated gates is
f = σ( R_f h_{t−1} + W_f x_t + b_f ),  f ∈ R^h
g = σ( R_g h_{t−1} + W_g x_t + b_g ),  g ∈ R^h    (2)
i = θ( R_i h_{t−1} + W_i x_t + b_i ),  i ∈ R^h
o = σ( R_o h_{t−1} + W_o x_t + b_o ),  o ∈ R^h

where σ and θ denote the sigmoidal and hyperbolic tangent activation functions, and R = [R_f, R_g, R_i, R_o], W = [W_f, W_g, W_i, W_o], and b = [b_f, b_g, b_i, b_o] are the weights and biases associated with each gate that need to be identified. Accordingly, the sizes of the weight and bias matrices are R_j ∈ R^{h×h}, W_j ∈ R^{h×n}, and b_j ∈ R^h. The cell memory and intermediate hidden states are updated according to

c_t = f ⊙ c_{t−1} + g ⊙ i    (3)
h_t = o ⊙ θ(c_t)    (4)

where ⊙ is the Hadamard product (element-wise multiplication) operator. Afterward, the intermediate hidden state is decoded using a time-distributed feed-forward NN to form an LSTM-NN that obtains the desired output time-series response.

Fig. 4. Schematic of a 3D-CNN-autoencoder architecture.
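As a concrete illustration, the NumPy sketch below advances one LSTM cell by a single time step following Eqs. (2)–(4); the dictionary-based containers for the weights R, W and biases b are an assumed organization for readability, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, R, W, b):
    """One LSTM cell step, Eqs. (2)-(4). Shapes follow the text:
    R[j] in R^{h x h}, W[j] in R^{h x n}, b[j] in R^h for j in {f, g, i, o}.
    Gate activations follow Eq. (2) as printed in the text."""
    f = sigmoid(R["f"] @ h_prev + W["f"] @ x_t + b["f"])  # forget gate
    g = sigmoid(R["g"] @ h_prev + W["g"] @ x_t + b["g"])  # cell candidate gate
    i = np.tanh(R["i"] @ h_prev + W["i"] @ x_t + b["i"])  # input gate
    o = sigmoid(R["o"] @ h_prev + W["o"] @ x_t + b["o"])  # output gate
    c_t = f * c_prev + g * i      # Eq. (3): Hadamard products
    h_t = o * np.tanh(c_t)        # Eq. (4)
    return h_t, c_t
```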

2.3. 3D-Convolutional Neural Networks (CNN) autoencoders

Fig. 4 presents a schematic of a 3D-CNN-autoencoder architecture. 3D-CNNs use voxel maps of an image set or array of size N_x^(0) × N_y^(0) × N_z^(0) as the input vector for feature identification. Subsets of the image are passed
into a layer of neurons, known as convolutional filters, which map these values to a reduced volumetric space. The
convolution operation is achieved by overlapping multiple filters across the entire volume of the image or voxel
maps. The stored weights of the filter act as an “encoder” for the input voxel map to transform the data into a useful
format for the neural network. Multiple layers of CNN filters operate as hierarchical decomposition filters to extract
higher-order feature information. This operation continues to a vector space of lower dimensionality, known as the
latent space, that has size n_f. In traditional 3D-CNN applications, the latent space is then mapped through a classifier
module into a small number of categories (i.e., edges, corners, etc.). However, this process typically requires the
classification of these small categories within these image datasets to facilitate the learning algorithm. In other words,
the data set requires labeling of the explicit features in each image for training the CNN in a supervised manner.
This is a challenge when presented with large data sets where the classification process of the data is performed
manually or does not exist. Autoencoders [43,44] are a technique that utilizes unsupervised learning to transform
high-dimensionality data (i.e., images, voxel maps) into the low-dimensionality latent space without explicit feature
definition. The architecture utilizes CNN layers as an “encoder” to generate the latent space from the input voxel
map, x_o. After generating the latent space, a de-convolution (or inverse or transpose convolution) structure acts as a “decoder” to expand and transform the latent space to generate a recovered voxel map, x̃_o. The “decoder” structure
follows a symmetric structuring to the “encoder” filters during the expansion into the volumetric space. As such,
the weights within the CNN-autoencoder are identified using common learning algorithms where the loss function
is defined as some difference between the original and reproduced voxel map. Although the values within the latent
space contain no physical meaning, the reduced dimensionality allows for a unique classification of features within
the voxel map. This, in turn, can be efficiently translated for use in additional networks.
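For concreteness, the sketch below assembles such a 3D-CNN-autoencoder in Keras with a flattened latent space of size n_f; the filter counts, kernel sizes, strides, and the (20, 20, 100, 1) input voxel map are illustrative assumptions and do not reproduce the specific architectures studied later in this paper.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

n_f = 32  # size of the flattened latent space

inp = layers.Input(shape=(20, 20, 100, 1))  # voxel map, single mass channel
# "Encoder": strided 3D convolutions reduce the volumetric space.
x = layers.Conv3D(8, 3, strides=2, padding="same", activation="relu")(inp)
x = layers.Conv3D(16, 3, strides=2, padding="same", activation="relu")(x)
enc_shape = tuple(x.shape[1:])              # (5, 5, 25, 16)
latent = layers.Dense(n_f, name="latent")(layers.Flatten()(x))
# "Decoder": symmetric transpose convolutions recover the voxel map.
x = layers.Dense(int(np.prod(enc_shape)), activation="relu")(latent)
x = layers.Reshape(enc_shape)(x)
x = layers.Conv3DTranspose(8, 3, strides=2, padding="same", activation="relu")(x)
out = layers.Conv3DTranspose(1, 3, strides=2, padding="same",
                             activation="sigmoid")(x)  # voxel values in [0, 1]

autoencoder = tf.keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")  # MSE between v and its reconstruction
```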

3. Proposed framework
The machine learning framework utilizes results obtained from finite element (FE) simulations as training
information. A virtual design of experiments (DoE) study is performed with the FE model to generate the necessary
data for calibrating the recurrent neural network model.

Fig. 5. Schematic of deep neural network architecture for predicting the force–displacement and deformation response of a structure.

Although an explicit set of design parameters will be required to create the virtual DoE, this ML framework fundamentally assumes that these explicit design parameters
are not known at any stage of the training process. As such, the framework must self-learn a latent space to
differentiate different structural configurations. A flattened latent space is generated by a 3D-CNN-autoencoder
of the FE mesh using a voxelization strategy. This flattened latent space is then used as input into an LSTM-NN
to predict force–displacement response as well as the deformation of the mesh. Following a similar methodology
outlined in Kohar et al. [34], the procedure of the framework is summarized as follows:
1. Problem definition: Generate a FE model of the problem of interest. Define the desired output response that
will be measured for use in training. This framework will be demonstrated using an academic example that
is based on FE simulations presented in Kohar et al. [45] for dynamic axial crushing of various rectangular
crush tubes. The objective is to predict the dynamic axial crushing response of these rectangular crush tubes
using only the FE mesh data (i.e., nodes, elements, thickness, etc.) as input. The predicted outputs will include
the force–displacement and resulting deformation response of the structure.
2. Sampling of problem domain: Vary the FE geometry such that there is a statistical representation of the
entire domain of interest and simulate their response. The variation of the FE mesh will be parameterized for
generating the input training information for the ML system. The generation of these FE models can produce
structures that may have a different number of nodes and elements. It should be reiterated that the proposed
ML system will have no knowledge of the parameters used to generate these meshes.
3. Define ML system: Create an ML architecture that maps the input geometry to the desired output
response. Fig. 5 presents schematics of the proposed ML frameworks for predicting the deformation and
force–displacement response of the structure. The following are the detailed operations in the proposed
framework.

a. Voxelization of FE Mesh. The proposed methodology utilizes a voxelization approach that converts a Cartesian-based FE geometry into a discrete three-dimensional space of dimensionality N_x^(0) × N_y^(0) × N_z^(0) [53,54]. Before voxelization, the limits of the Cartesian space that is occupied by all FE models are determined as ([x_min, x_max], [y_min, y_max], [z_min, z_max]).

Each voxel is represented by an index, v(I, J, K), and the indices are determined by

I = (N_x^(0) − 1) (x_0^(n) − x_min) / (x_max − x_min), I ∈ ℕ
J = (N_y^(0) − 1) (y_0^(n) − y_min) / (y_max − y_min), J ∈ ℕ    (5)
K = (N_z^(0) − 1) (z_0^(n) − z_min) / (z_max − z_min), K ∈ ℕ

where (x_0^(n), y_0^(n), z_0^(n)) represents the location of an FE node, n, in Cartesian space, and I, J, K are elements of the natural numbers. This voxelization approach uses a binary scheme where a voxel that is occupied by a node receives a value of 1; otherwise, an unoccupied voxel has a value of 0.
This approach allows for flexible FE mesh topologies, where the number of nodes and elements can
vary in each FE simulation. In essence, the three-dimensional voxel space provides data structuring to
unstructured FE meshes. However, the binary approach of nodal occupation results in a loss of density
and mass distribution. As such, the value of a voxel is computed as the normalized mass to enrich the
voxelization map. The mass of a node is computed as the lumped mass by apportioning the total mass
of an element equally among the nodes. For a four-node quadrilateral element, the mass of a node, M, is defined as

M = (1/4) ρ A₀ T₀    (6)
where ρ is the density of the material, and A₀ and T₀ are the initial area and thickness of an element,
respectively. The method of superposition is used to account for the connectivity of different elements
to various nodes. Therefore, the mass of a voxel for a particular FE model can be defined as m(I, J, K), and the voxel value takes the form of the normalized mass defined as

v(I, J, K) = m(I, J, K) / max(m)    (7)
where max(m) is the largest mass of a voxel across all FE models. This allows the framework to capture differences between FE models that have the same structure but variations in thickness and mass distribution. This means that a voxel takes on a single scalar value (or a single channel) in the voxel map. Hence, the input voxel map has a dimension of N_x^(0) × N_y^(0) × N_z^(0) × 1. This assumes that a constant material is used throughout the structure; however, increasing the channel dimension and performing a similar operation can accommodate variations in material properties. A sketch of this voxelization follows below.
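The following NumPy sketch is a minimal implementation of this mass voxelization, Eqs. (5)–(7); the function name, the rounding used to obtain integer indices, and the example grid dimensions are assumptions for illustration.

```python
import numpy as np

def voxelize(nodes, node_mass, bounds, dims=(20, 20, 100)):
    """Accumulate lumped nodal masses (Eq. (6)) into a voxel grid.

    nodes: (N, 3) array of initial nodal coordinates.
    node_mass: (N,) array of lumped nodal masses.
    bounds: ((xmin, xmax), (ymin, ymax), (zmin, zmax)) over ALL FE models.
    """
    lo = np.array([b[0] for b in bounds])
    span = np.array([b[1] - b[0] for b in bounds])
    # Eq. (5): scale coordinates to voxel indices (rounding is an assumption).
    idx = np.rint((np.array(dims) - 1) * (nodes - lo) / span).astype(int)
    m = np.zeros(dims)
    for (I, J, K), mass in zip(idx, node_mass):
        m[I, J, K] += mass  # superposition of masses from connected elements
    return m

# Eq. (7): v = m / max(m), normalized by the largest voxel mass
# over all FE models in the design of experiments.
```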
b. Latent Space Generation using CNN-autoencoder. The initial mesh and mass voxelization are reduced to a flattened latent space of lower dimensionality, 1 × 1 × 1 × n_f, through a series of 3D-CNN-autoencoders. The values of the lower dimensionality are determined through the autoencoder methodology. The training operation of the 3D-CNN-autoencoder is independent of the training procedure for determining the time response of the system with the LSTM-NN. The weights within the CNN-autoencoder are identified by minimizing the mean squared error of the mass voxelization map, MSE_CNN:

MSE_CNN = (1 / (N_x^(0) × N_y^(0) × N_z^(0))) Σ_I Σ_J Σ_K ( ṽ(I, J, K) − v(I, J, K) )²    (8)
where ṽ (I, J, K ) is the output voxel value from the autoencoder. The weights of the autoencoder are
determined using the Adam [50] method for parameter identification.
c. Structuring of Time-Dependent Input Data for LSTM-NN. All input parameters are normalized between
−1 and 1 to avoid any bias associated with the different scales of the parameters. For each design, d,
the normalized input parameter, x̄_{i,t}^(d), at an instance of time, t, is determined as

x̄_{i,t}^(d) = 2 ( x_{i,t}^(d) − min(x_i) ) / ( max(x_i) − min(x_i) ) − 1    (9)

where the parameter is normalized by the maximum and minimum value for all designs. Although
a similar network architecture can be used, predicting the deformation and the force–displacement
response of the structure requires unique input architectures.
i. Predicting the force–displacement response of the structure requires only the instance in time, t, and the flattened latent space, n_f.
ii. Predicting the deformation response of the structure begins with predicting the response of a single node. Predicting the displacement of a single node is the fundamental case that can be repeated for all nodes to accomplish the task of predicting the deformation of the structure. This allows the network the flexibility to predict the deformation of meshes with varying numbers of nodes. The input vector for predicting the displacement of a single node utilizes the instance in time, t, the flattened latent space, n_f, and the initial coordinate, (x_0^(n), y_0^(n), z_0^(n)), of a node, n.
It is important to note that the normalized time signal is the only direct input that is varying. This
normalized time signal, in conjunction with the recurrent nature of the LSTM-NN, provides the unique
mapping for the current state of prediction at time, t, while the other inputs act as a state selection for the specific feature to be predicted by the LSTM-NN. This can also allow the flexibility to capture varying time steps or end-time conditions across various FE simulations. However, it should be noted that each LSTM-NN uses a constant time step, ∆t_LSTM, to predict its respective time-series response in this current study.
d. LSTM-NN Prediction of Crush Force/Deformation Response. The time-dependent input data is then
passed into the LSTM-NN model. The intermediate hidden state from the LSTM cell is then decoded
using a two-layer feed-forward NN. The first layer has the same number of neurons as the intermediate
hidden states, h. The second layer has the same number of neurons as the number of outputs of the
network. Similarly, the weights of each network are determined using an optimizer for parameter
identification. Similar to the inputs for each LSTM-NN, the output of each network requires a unique
strategy for determining the weights of the network and a unique loss function for training.
i. Output normalization is used for predicting the force–displacement response to account for the difference in magnitudes and units of force and displacement:

F̄_{z,t}^(d) = 2 ( F_{z,t}^(d) − min(F_z) ) / ( max(F_z) − min(F_z) ) − 1,  ū_{z,t}^(d) = 2 ( u_{z,t}^(d) − min(u_z) ) / ( max(u_z) − min(u_z) ) − 1    (10)
It should be noted that the outputs of the reaction force and the resulting displacement of the
mass are coupled together through Newton’s second law. Through the simultaneous connection
of the outputs to the same hidden layers, known as multitask learning, these relationships can
be learned and shared by using the training from one output to be used by another output.
This can allow better training performance than training each output as its own separate network [55]. As such, the loss function pertaining to the network prediction of the force–displacement response of the structure will be known as MSE_FvD^LSTM (a sketch of this multitask network is given after this procedure). The loss for an individual design is
MSE_FvD^LSTM(d) = (1 / 2T) Σ_{t=1}^{T} [ ( F̃_{z,t}^(d) − F̄_{z,t}^(d) )² + ( ũ_{z,t}^(d) − ū_{z,t}^(d) )² ]    (11)

ii. Predicting the deformation response of an individual design requires the output of the network to be the displacement of the node, ( u_{x,t}^(d,n), u_{y,t}^(d,n), u_{z,t}^(d,n) ), in time, t. No normalization of the output is used during training on the displacement vector because the scale and magnitude are common for all points. As such, the position of the node at a time, t, is computed as

x_t^(d,n) = x_0^(d,n) + u_{x,t}^(d,n),  y_t^(d,n) = y_0^(d,n) + u_{y,t}^(d,n),  z_t^(d,n) = z_0^(d,n) + u_{z,t}^(d,n)    (12)

A mean squared error of the distance for an individual node in the structure, known as MSE_Mesh^LSTM(n), is used as the loss function and is defined as

MSE_Mesh^LSTM(d,n) = (1 / 3T) Σ_{t=1}^{T} [ ( ũ_{x,t}^(d,n) − u_{x,t}^(d,n) )² + ( ũ_{y,t}^(d,n) − u_{y,t}^(d,n) )² + ( ũ_{z,t}^(d,n) − u_{z,t}^(d,n) )² ]    (13)
and is the target loss function to be minimized. The total MSE loss pertaining to the prediction of an individual mesh will be known as MSE_Mesh^LSTM and is defined by

MSE_Mesh^LSTM(d) = (1 / N) Σ_{n=1}^{N} MSE_Mesh^LSTM(d,n)    (14)
for all the nodes in the mesh of an individual design.

4. Training of ML system: Select data from the sample set to calibrate the ML system.
5. Prediction of ML system: Use remaining data for prediction with the ML system. This serves as a blind
test and evaluates the ability of the ML system to generalize the problem.
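The Keras sketch below illustrates the LSTM-NN of step 3(d) for the force–displacement task: the normalized time signal and the (repeated) flattened latent features form the per-step input, an LSTM layer produces the intermediate hidden states, and a two-layer time-distributed decoder emits the two coupled outputs of Eq. (10), which are trained jointly against Eq. (11) (multitask learning). The layer sizes here are assumptions, not the architectures studied in Section 5.

```python
import tensorflow as tf
from tensorflow.keras import layers

n_f, T, h = 32, 150, 64   # latent size, time steps, hidden units (assumed)

# Per step: normalized time t plus the latent vector (constant over the sequence).
inp = layers.Input(shape=(T, n_f + 1))
x = layers.LSTM(h, return_sequences=True)(inp)
# Two-layer time-distributed decoder: first layer sized like the hidden state,
# second layer sized like the number of outputs.
x = layers.TimeDistributed(layers.Dense(h, activation="tanh"))(x)
out = layers.TimeDistributed(layers.Dense(2))(x)  # normalized F_z and u_z

lstm_nn = tf.keras.Model(inp, out)
lstm_nn.compile(optimizer="adam", loss="mse")     # corresponds to Eq. (11)
```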

4. Problem formulation
4.1. Construction of finite element model

The finite element (FE) model follows a similar construction for the dynamic axial crush of a rectangular tube
that was outlined in Kohar et al. [45], which follows the experimental setup in Williams et al. [56]. The goal was
to develop an FE simulation that has a balance of accuracy in capturing the deformation behavior of the crush rail
while minimizing the computational time. All crush simulations were analyzed using a non-linear explicit dynamic
formulation of the commercial FE software LS-DYNA – R9.3.0 [57] with double precision utilizing 16 × 2.7 GHz
Intel Xeon E5-2680 processors and parallel processing techniques.
Fig. 6 presents a schematic of the FE construction that is used in this study. The crush tube has a constant
rectangular cross-section with a base (B), width (W), and height (H) that was varied in each simulation. The
dimensions presented in Fig. 6a define the mid-plane surface of the crush tube. Unlike the FE models presented in
Kohar et al. [45], no corner fillets were used in the crush tube to simplify the geometry for this study. All simulations
utilized two symmetric crush initiators (one on each of two opposite sides), each 5 mm deep × 40 mm wide and offset 50 mm from the bottom of the crush tube.
The crush tube was constrained between two 120 mm × 120 mm rigid steel plates using a rigid material model
(*MAT 020) with an elastic modulus and Poisson ratio of 205 GPa and 0.3, respectively. Nodes within 25 mm of
the top and the bottom of the crush tube are translationally constrained to approximate the boss fixture according
to

u_x(t) = u_y(t) = 0 along z ≤ 25 mm
u_x(t) = u_y(t) = 0 along z ≥ H − 25 mm    (15)
where t is time. The nodes of the rigid bottom plate are fully constrained in all translational directions, such that
u x (t) = u y (t) = u z (t) = 0. In a similar manner, the nodes of the rigid top plate are planarly constrained according
to u x (t) = u y (t) = 0. However, the top plate was given an initial velocity of u̇ z (0) = −15.56 m/s. The rigid top
plate was assigned a mass of 560 kg and allowed to decelerate naturally. The total simulation time was 15.0 ms. A
constant time step of 0.1 µs, which is lower than the Courant–Friedrichs–Lewy stability criterion [58], was used in
these FE calculations.
Eight-node constant stress solid elements (ELFORM = 1 for *ELEMENT SOLID) with a mesh size of 5.0 mm
× 5.0 mm × 5.0 mm were used to model the rigid steel plates. Although Kohar et al. [45] showed that hexahedral
elements had higher fidelity, the crush tube was meshed using fully-integrated shell elements (ELFORM = 16 for
*ELEMENT SHELL) to provide a balance between accuracy and computational speed. Various mesh sizes and the
number of through-thickness integration points were studied to determine a computationally efficient model for data
generation. The material thickness (T) of the crush tube was also varied in each simulation. As such, the material
thickness, cross-section base, width, and height will be known as the explicit parameters that can be used to identify each geometry.

Fig. 6. (a) Isometric view of FE crush tube, (b) Isometric view of FE crush tube with loading plates and (c) Front view of FE model with boundary conditions (present study).

Contact treatment was utilized between all the plates and the crush tube with static and dynamic
coefficients of friction of 1.04 and 1.00, respectively. Self-contact treatment was used for the crush tube with static
and dynamic coefficients of friction of 0.45 and 0.40, respectively [59,60]. The force–displacement response was
measured at the contact interface between the rigid top plate and the crush tube. The force–displacement signal was
sampled at a time interval of 0.1 ms for a total of 150 samples. The deformation response was sampled at a time interval of 0.5 ms for a total of 30 samples.

4.1.1. Definition of crush tube material properties


The material properties of the crush tube used in this study were assumed to be commercially available aluminum
alloy AA5754-O, which were collected from Williams et al. [56,61] and Smerd et al. [62] and used in Kohar
et al. [45]. The material properties of AA5754-O are summarized in Table 1. The deformation behavior of the
material is assumed to follow an elasto-plastic constitutive model that is governed by an isotropic von Mises
hardening law. The total deformation rate tensor, ε̇, is assumed to be composed of an elastic strain rate tensor, ε̇*, and a plastic strain rate tensor, ε̇^P, such that

ε̇ = ε̇* + ε̇^P    (16)

The rate of Cauchy stress, Σ̇, is related to the elastic strain rate through the 4th-order isotropic elasticity tensor, L^el, defined by

Σ̇ = L^el : ε̇*    (17)
which can be described by the elastic modulus, E, and the Poisson ratio, ν. The deformation is considered to be
completely elastic if the Cauchy stress tensor, Σ, satisfies the von Mises yield criterion

Φ = 3J₂ − Σ_y² ≤ 0  ⟹  ε̇^P = 0    (18)
where Σ_y is the plastic flow stress required to generate yielding. J₂ is defined as the second tensor invariant of the deviatoric Cauchy stress tensor, s, which is given as

s = Σ − (I₁ / 3) I⁽²⁾    (19)

where I₁ = Σ₁₁ + Σ₂₂ + Σ₃₃ is the first tensor invariant of the stress tensor, and I⁽²⁾ is the 2nd-order identity tensor.

Table 1
Summary of material properties for AA5754-O [56,61,62].

ρ [g mm⁻³]   E [GPa]   ν      Σ₀ [MPa]   Σ_uts [MPa]   D     C [s⁻¹]       s
2.7 × 10⁻³   70.0      0.34   210        315           9.2   9.39 × 10¹⁰   10.55
If yielding occurs, the plastic strain rate tensor is governed by an associative flow rule defined as

ε̇^P = ε̄̇^P (∂Φ/∂Σ),  ε̄̇^P ≥ 0    (20)

where ε̄̇^P is the effective plastic strain rate, and the accumulated total plastic strain follows ε̄^P = ∫₀ᵗ ε̄̇^P dt. The consistency criterion during plastic flow governs that

Φ̇ = 0    (21)

This forms a series of simultaneous non-linear equations that can be numerically solved to determine ε̄̇^P. The plastic flow stress model is assumed to follow a Voce-type plasticity law [63] defined by

Σ_y,q = Σ_uts − (Σ_uts − Σ₀) exp(−D ε̄^P)    (22)
where Σ₀ is the material quasi-static yield stress, Σ_uts is the saturation stress, and D is the Voce hardening saturation coefficient. Strain-rate dependence is captured by scaling the plastic flow stress using a Cowper–Symonds model [64], such that

Σ_y = Σ_y,q ( 1 + (ε̄̇^P / C)^(1/s) )    (23)

This material model was utilized as a rate-dependent formulation with piece-wise linear plasticity (*MAT 024) with
the parameterization of the Voce plasticity law as the hardening curve for the crush tube. A simple element deletion
criterion of ε̄^P ≥ 1.40 is used to simulate failure within the material.
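For illustration, the flow stress of Eqs. (22)–(23) can be evaluated directly, as in the sketch below, which uses the AA5754-O constants of Table 1; the function and parameter names are chosen for readability and are not from the paper.

```python
import math

def flow_stress(eps_p, eps_p_rate, sig0=210.0, sig_uts=315.0,
                D=9.2, C=9.39e10, s=10.55):
    """Rate-scaled plastic flow stress [MPa] for AA5754-O (Table 1)."""
    sig_q = sig_uts - (sig_uts - sig0) * math.exp(-D * eps_p)  # Voce, Eq. (22)
    return sig_q * (1.0 + (eps_p_rate / C) ** (1.0 / s))       # Cowper-Symonds, Eq. (23)

# Example: flow stress at 5% effective plastic strain and a 1000 /s strain rate.
print(flow_stress(0.05, 1000.0))
```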

4.1.2. Comparison of FE model construction with experiments


A baseline analysis is conducted to verify the fidelity of the FE model construction and to select mesh
configurations for use in data generation. An FE model with dimensions of B = W = 73.6 mm, H = 400 mm, and
T = 3.0 mm is simulated and compared to the experimental response reported in Williams et al. [61]. Two different
mesh sizes are explored in this study: (i) 1.5 mm × 1.5 mm, (ii) 5.0 mm × 5.0 mm. For each of these mesh sizes,
three different through-thickness integration schemes, n_p = [3, 5, 7], are studied. The fully integrated (FI) model
from Kohar et al. [45] is also re-simulated using the present FE model parameters for reference. This re-simulated
model used 1.5 mm × 1.5 mm elements with seven through-thickness integration points and incorporated the radii.
Fig. 7 presents the predicted deformation for all simulations after 15.0 ms. The experimental deformation pattern
that was taken from Williams et al. [61] is presented for comparison. The degree of consolidation within the plastic
hinges varies between the simulations. This difference is caused by the resolution of the FE mesh to resolve the local
bending behavior, which will have an implication on the predicted crush force and energy absorption. Nevertheless,
all simulations can capture the progressive symmetric crush modes of the structure. Fig. 8 presents a comparison of
the FE models with the experimental force–displacement and energy absorption response obtained from Williams
et al. [56]. For comparison, the mean crush force, peak crush force, and energy absorption are calculated for each
simulation and are presented in Table 2. Energy absorption, E_abs, is calculated as

E_abs(z) = ∫₀ᶻ F(z) dz    (24)
where z is the displacement in time. The mean crush force, F_mean, is calculated as the energy absorption normalized at 200 mm, such that

F_mean = (1/200) ∫₀²⁰⁰ F(z) dz    (25)

Fig. 7. (a) Experimental [56] and predicted plastic strain contours of deformed structure for the (b) Re-simulated response of Kohar et al. [45],
(c) 1.5 mm × 1.5 mm w/7 Int. Points, and (d) 5.0 mm × 5.0 mm w/3 Int. Points FE models.

Table 2
Comparisons between simulated and experimental [56] results for mean crush force, peak crush force, energy absorption, and computational time for baseline FE models.

Model                            Mesh size [mm × mm]   # Int. Points   F_mean [kN]   F_peak [kN]   E_abs [kJ]   ∆ε F_mean [%]   ∆ε F_peak [%]   ∆ε E_abs [%]   CPU [s]
Kohar et al. [45]: Re-simulated  1.5 × 1.5             7               62.2          192.0         12.4         −4.5            −14.4           −4.6           2055
Present model                    1.5 × 1.5             3               67.7          200.8         13.6         4.0             10.5            4.3            2124
Present model                    1.5 × 1.5             5               69.5          201.0         13.9         6.7             10.4            6.8            3161
Present model                    1.5 × 1.5             7               70.2          201.2         14.0         7.7             10.3            7.8            4230
Present model                    5.0 × 5.0             3               73.9          195.7         14.7         13.5            12.7            13.2           187
Present model                    5.0 × 5.0             5               75.4          195.6         15.0         15.7            12.8            15.4           240
Present model                    5.0 × 5.0             7               75.8          195.5         15.1         16.4            12.8            16.1           299
Experiment                       –                     –               65.1          224.3         13.0         –               –               –              –


Fig. 8. Comparison of re-simulated FE model of Kohar et al. [45], present models, and experimental [56] force–displacement and energy-
displacement response for a mesh size of (a) 1.5 mm × 1.5 mm and (b) 5.0 mm × 5.0 mm with various number of integration
points.

Finally, the peak crush force, F_peak, is calculated as

F_peak = max( F(z) )    (26)
These energy absorption metrics for each simulation are compared to the measurements from the experiments and are
also presented in Table 2. The present FE model construction can capture the general force–displacement response
that was observed in the experiments. The present FE models can adequately predict the peak crush force within
12%. The predicted mean crush force and energy absorption of the 5.0 mm × 5.0 mm FE models tend to be higher
by 13%–16% than the experiment. However, the predicted mean crush force and energy absorption of the 1.5 mm
× 1.5 mm FE models were only 4%–8% higher than the experiment. This variation depends on the mesh size and the number of through-thickness integration points, which affect the plastic hinges that are generated during folding. The coarser mesh produces larger folds that stiffen the local bending resistance and result
in higher energy absorption. For comparison, the re-simulated FE model from Kohar et al. [45] was closer to the
experimental response (4%) and had a 10% lower mean force and energy absorption. The difference between the
present and re-simulated FE model is attributed to the simplification of the radius in the geometry, which increases
the cross-sectional area and the local bending resistance.
Table 2 also presents the average computational time required for each simulation. As expected, the computational
time generally scales with the number of elements and integration points. The re-simulated Kohar et al. [45] model

had lower time than the 1.50 mm × 1.50 mm mesh in the present study; this is a result of the high plastic deformation
that is generated in the sharp corners that requires more calculations to perform. The FE model with 5.0 mm ×
5.0 mm with three integration points had the lowest computational time of 187 s (2992 core-seconds), which is
9.5% of the computational time required by the re-simulated FE model from Kohar et al. [45]. As such, a mesh
size with 5.0 mm × 5.0 mm with three integration points will be used for the data generation because the FE model
is sufficient to capture the physics of the crush response while reducing the computational resource requirements
to perform this study.
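As an aside, the crush metrics of Eqs. (24)–(26) are straightforward to compute from a sampled force–displacement signal, as in the sketch below; the trapezoidal integration and array names are assumptions for illustration.

```python
import numpy as np

def crush_metrics(z, F):
    """Crush metrics from displacement z [mm] and force F [kN] samples."""
    E_abs = np.trapz(F, z)                       # Eq. (24): energy [kN mm = J]
    mask = z <= 200.0
    F_mean = np.trapz(F[mask], z[mask]) / 200.0  # Eq. (25): mean crush force [kN]
    F_peak = np.max(F)                           # Eq. (26): peak crush force [kN]
    return E_abs, F_mean, F_peak
```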

4.2. Sampling of geometric properties for data generation

Various combinations of the structural dimensions and thickness are used to create a virtual DoE for the axial
crush simulation based on the 5.0 mm × 5.0 mm with three integration points meshing scheme. This data will be
used for training the ML system. Intelligent sampling techniques for high dimensionality data sets have been a topic
of interest within the scientific and engineering community [65–68]. Common techniques include pseudo-random
Monte-Carlo sampling, Latin hypercube sampling (LHS) [69], and orthogonal array sampling (OAS) [70]. These
sampling techniques can significantly reduce the amount of analysis time required to generate the training data [71].
However, a factorial DoE was used to ensure that the design space was thoroughly sampled for this problem to
evaluate the true potential of the proposed framework. Once this data set has been created, further and future analysis
can be performed to evaluate different sampling algorithms and network architectures. The sampled factorial DoE
used in this study was
375 mm ≤ H ≤ 500 mm, ∆H = 25 mm
60 mm ≤ B ≤ 90 mm, ∆B = 5 mm
60 mm ≤ W ≤ 90 mm, ∆W = 5 mm    (27)
1.50 mm ≤ T ≤ 3.00 mm, ∆T = 0.25 mm
This produced a total of 2058 samples for simulation. These FE models were generated by a custom program that
was written in MATLAB using a workstation with 4 × 2.8 GHz Intel Core i7 processors. The total time to generate
these FE models was 372 s (or 0.72 core seconds/model). Using 16 × 2.7 GHz Intel Xeon E5-2680 processors, the
average computational time to perform each FE simulation was 207 s/model. It is important to note that a uniform
average mesh size was applied throughout all the simulations with these varying dimensions. This means that each
simulation potentially had a different number of nodes and elements (along with their ordering) that had to be handled by the proposed framework.
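The factorial DoE of Eq. (27) can be enumerated directly; the short sketch below reproduces the 6 × 7 × 7 × 7 = 2058 design combinations (the paper's actual mesh-generation program was written in MATLAB, so this Python sketch is only illustrative).

```python
import itertools
import numpy as np

H = np.arange(375.0, 500.0 + 1e-6, 25.0)   # 6 heights [mm]
B = np.arange(60.0, 90.0 + 1e-6, 5.0)      # 7 bases [mm]
W = np.arange(60.0, 90.0 + 1e-6, 5.0)      # 7 widths [mm]
T = np.arange(1.50, 3.00 + 1e-6, 0.25)     # 7 thicknesses [mm]

designs = list(itertools.product(H, B, W, T))  # all (H, B, W, T) combinations
assert len(designs) == 2058
```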

5. Results and discussion


This section presents an analysis of the feature extraction method using the CNN-autoencoder for the FE
simulations of the dynamic crush response of the rectangular tubes. This section also presents the analysis of the
LSTM-NN prediction of the deformation and the crush-force response of the rectangular crush tubes. All models
were trained using an Nvidia DGX Workstation with a 20 × 2.2 GHz Intel Xeon E5-2698 processor, 256 GB of
RAM, and 4 × 1.2 GHz Nvidia Tesla V100 (4 × 32 GB HBM2) GPUs with an NVLink for multi-GPU interfacing.
The Keras/TensorFlow [72] libraries with Nvidia CUDA 10.1 optimized implementations in Python were used for training
the various ML architectures.

5.1. Analysis of the feature extraction using CNN-autoencoder

Different combinations of CNN filtering architectures, numbers of latent features, n_f, and voxelization maps (N_x^(0), N_y^(0), N_z^(0), 1) of the input geometry are presented in this study to identify a suitable autoencoder configuration for use in the LSTM-NN architecture. Table 3 presents the various 3D-CNN-autoencoder architectures that were utilized in this study. These architectures were designed to conveniently produce a flattened latent space vector of size n_f. The size of the convolutional filter was selected to evenly reduce the number of successive inputs by the number of layers. The voxel map configurations for (N_x^(0), N_y^(0), N_z^(0), 1) were (5, 5, 25, 1), (10, 10, 50, 1), and (20, 20, 100, 1), which generated six different architectures that produced even filtering layers. Seven different latent feature counts, n_f = 2^i with 0 ≤ i ≤ 6, were studied for each architecture. Appendix A presents the total number of weights in each CNN-autoencoder architecture.

Table 3
CNN-autoencoder architectures.
All models used a rectified linear unit (ReLU) as the activation function in the CNN-autoencoder, except for the final layer in the de-convolutional filter, which used a sigmoidal function to capture the bounded nature of the voxel values between [0, 1]. A training split of 0.8 (80%), where 1646
designs were randomly selected to be part of the training set, was used for calibrating the CNN-autoencoder. This
means that 0.2 (20%) was used as a validation data set. Although 1646 designs appear to be a large data set for
a relatively simple study, the computational cost associated with generating FE meshes is orders of magnitude
lower than the cost associated with performing the FE calculations for predicting the response of the structure.
The training operation was performed 12 times (allowing for 3 repetitions on each GPU) for each configuration
with different designs being selected in the training set to analyze the variance in sample selection. The Adam [50]
method was used for parameter identification of the weights in the 3D-CNN-autoencoder to minimize the MSECNN
loss function presented in Eq. (8). The initial weights, known as the seeding of the weights, were randomly varied
in each repetition. Each configuration was trained to 1000 epochs using default hyperparameters for the Adam
optimizer (α_t = 0.001, β₁ = 0.9, β₂ = 0.999, ϵ = 10⁻⁸) and a batch size of 2⁷. The average computational time required for training each configuration can be found in Appendix B. All configurations, except for Architecture #6 with (N_x^(0), N_y^(0), N_z^(0)) = (20, 20, 100, 1) and 64 features, used only one GPU; the remaining configuration utilized two GPUs to allow for sufficient GPU memory.
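In Keras terms, the training configuration described above corresponds roughly to the sketch below; `autoencoder` and `voxel_maps` are placeholders carried over from the illustrative autoencoder sketch in Section 2.3, not the paper's code.

```python
import tensorflow as tf

# Adam with the default hyperparameters stated above.
opt = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9,
                               beta_2=0.999, epsilon=1e-8)
autoencoder.compile(optimizer=opt, loss="mse")     # MSE_CNN of Eq. (8)
history = autoencoder.fit(voxel_maps, voxel_maps,  # input reproduces itself
                          batch_size=2**7, epochs=1000,
                          validation_split=0.2)    # 80/20 train/validation
```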
Fig. 9 presents the training and validation convergence for the median response of each CNN-autoencoder
configuration. The significance of the validation set is that it serves as an unbiased evaluation of the system. Both
the training and validation convergence plots for each CNN-autoencoder configuration show similar trends and
variations. This is an indicator that there is sufficient representation of the features within both data sets. All learning
curves exhibit some degree of a converging solution at Epoch = 1000; however, further training improvement could
be achieved. As such, Fig. 10 presents a box and whisker plot of the various CNN-autoencoders configurations
evaluated at Epoch = 1000 of the MSECNN validation loss function for all repeats. Table 4 lists the median validation
loss of each CNN-autoencoder configuration at Epoch = 1000. Each subfigure highlights the convergence of each
architecture and voxel map configuration with respect to the number of features.
The MSE_CNN validation losses have similar orders of magnitude across the different voxel map resolutions. In particular, the average MSE_CNN validation loss for configurations with only one feature was 15.06 × 10⁻³, 21.76 × 10⁻³, and 28.81 × 10⁻³ for voxel map configurations of (5, 5, 25, 1), (10, 10, 50, 1), and (20, 20, 100, 1) for Architectures #1–3, respectively. It should be noted that the low-resolution voxel maps are still able to generate unique voxel maps (corroborated by a low MSE_CNN validation loss) through the mass voxelization method. This result is possible because of the relatively large increments of the design variables in the data generation phase of the training data.

Fig. 9. Median training and validation convergence plots for the various architectures and input voxel maps.


Fig. 10. Box and whisker plot of the variation in validation loss for 12 repetitions at Epoch = 1000 for the various architectures and input
voxel maps.


Table 4
Median validation loss (×10⁻³) for each CNN-autoencoder architecture.

Architecture       #1             #2              #3               #4              #5               #6
Input voxel map    (5, 5, 25, 1)  (10, 10, 50, 1) (20, 20, 100, 1) (10, 10, 50, 1) (20, 20, 100, 1) (20, 20, 100, 1)
# of features
1                  15.0625        21.7648         24.8172          21.9664         30.3110          29.7417
2                  11.7959        14.8797         19.6840          10.8991         27.6253          18.0072
4                  3.3247         9.9518          13.3732          3.9174          2.2615           3.6221
8                  0.5955         0.8190          1.1632           3.0210          1.1729           0.8167
16                 0.1458         0.2538          0.1625           0.1193          34.8291          34.9011
32                 0.0611         0.0524          0.0312           0.0871          34.6875          34.6342
64                 0.0510         0.0272          0.0398           44.6733         34.9205          35.0216

of the training data. However, this may not necessarily be the case for studies with a smaller range of design parameters, where the coarse voxelization approach does not generate unique voxel maps for different designs.
Architectures #1–3 showed minimal variance in the validation error at Epoch = 1000 for each configuration.
However, Architectures #4–6, which had multiple filter operations, had more variation in the final MSECNN
validation loss at Epoch = 1000. Also, the training algorithm had difficulties overcoming the initial local minimum
that was generated at the beginning of training for Architectures #4–6 for 16 features or more. However, in some
cases, Architectures #4–6 produced better MSECNN validation loss for the same number of features. For example,
the lowest MSECNN validation loss for Architecture #3 with a voxel map of (20, 20, 100, 1) and eight features
was 5.89 × 10−4 , while the lowest loss for the same voxel map and features was 7.21 × 10−5 with Architecture
#5. This highlights that some level of filtering can provide some benefit in obtaining better feature identification.
This improvement was achieved with only a minor increase in the number of weights (640,009 vs. 738,761), but
approximately 10 times more computational time (276 s vs. 3240 s) and higher variation in training convergence.
This can be attributed to the added depth of the network and the dying-ReLU problem during the learning process [73].
Due to the minimal variance and good computational training time given the pixel resolution, Architecture #3 with a voxel map of (20, 20, 100, 1) was selected for continued analysis. A one-way analysis of variance (ANOVA) test was performed using the Statistics and Machine Learning Toolbox in MATLAB [74] for the number of features in Architecture #3 to identify statistically similar architectures with respect to the MSECNN loss. Now, the convergence of the training and validation loss followed an exponential decay. As such, the ANOVA study was also performed on log10(MSECNN) to identify the statistical differences in magnitude. The results of the ANOVA analysis are presented in Appendix C. First, CNN-autoencoder configurations with four or fewer features presented a significant statistical difference with respect to MSECNN relative to all other configurations. Eight features or more were not statistically different from each other. However, the ANOVA study that was performed on log10(MSECNN) showed that only 32 and 64 features were not statistically different from each other. Furthermore, 32 features had
the lowest median MSE for Architecture #3 with a voxel map of (20, 20, 100, 1). It should be reiterated that explicit
knowledge of the features may not be available to a design engineer for larger and more complex data sets of FE
models. Yet, the 3D-CNN-autoencoder may be capturing higher-level features within the geometry (i.e., meshing, trigger) that require higher dimensionality.
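For readers who prefer an open-source route, a comparable test can be sketched with SciPy; the study itself used MATLAB's toolbox. Here, losses_by_feature is a hypothetical dictionary mapping each feature count to its twelve repeated validation losses.

```python
# Hedged sketch of a one-way ANOVA over the repeated validation losses.
# losses_by_feature is a hypothetical dict: feature count -> 12 repeated losses.
import numpy as np
from scipy import stats

def anova_on_losses(losses_by_feature):
    groups = [np.asarray(v, dtype=float) for v in losses_by_feature.values()]
    _, p_raw = stats.f_oneway(*groups)                         # test on MSE_CNN
    _, p_log = stats.f_oneway(*[np.log10(g) for g in groups])  # test on log10(MSE_CNN)
    return p_raw, p_log
```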
As such, Architecture #3 with a voxel map of (20, 20, 100, 1) and 32 features was selected as a candidate for
the remainder of this analysis. The training process that was outlined above was repeated using Epoch = 100,000.
All twelve repetitions showed similar training and validation convergence responses for this architecture. Fig. 11
presents a representative training and validation convergence plot for this analysis. The MSECNN validation loss
for this configuration improved to 4.47×10−6 at Epoch = 100,000, which is approximately an order of magnitude improvement compared to Epoch = 1000, and required approximately 10 h of training time on the GPU to achieve this performance.
Now, to ensure that a single sample did not contribute the majority of the MSECNN training or
validation loss, a histogram of the MSECNN for each design is presented in Fig. 12 to identify significant outliers.
Fig. 13 presents the original and reconstructed voxel maps with the lowest, median, and highest MSECNN loss from
the validation set to demonstrate the significance of this MSECNN value. It should be noted that these plots are
Fig. 11. (a) Training and (b) Validation convergence for Architecture #3 with a 20 × 20 × 100 pixel map and 32 features.
Fig. 12. Histogram of MSECNN distribution for (a) Training and (b) Validation sets for Architecture #3 with a 20 × 20 × 100 pixel map
and 32 features.
normalized by the largest mass of a voxel, max(m), for all FE models, which was described in Eq. (7). Fig. 13 also presents the distribution of the individual voxel error defined by (ṽ(I, J, K) − v(I, J, K))². The highest individual MSECNN in the validation set was 2.93×10−5, which is still within an order of magnitude of the total MSECNN. Nevertheless, the reconstruction with the highest MSECNN still well-captured details regarding the geometry's structure (including the crush initiator detail) and mass distribution.
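The per-voxel error shown in Fig. 13 can be expressed compactly; the sketch below assumes the original and reconstructed voxel maps are available as NumPy arrays that have already been normalized by max(m) as in Eq. (7).

```python
# Sketch of the individual voxel error and its mean, assuming v and v_tilde are
# the normalized original and reconstructed voxel maps as numpy arrays.
import numpy as np

def voxel_error_map(v, v_tilde):
    per_voxel = (v_tilde - v) ** 2        # (v~(I,J,K) - v(I,J,K))^2 per voxel
    return per_voxel, per_voxel.mean()    # error map and the design's MSE_CNN
```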
Again, it should be reiterated that the objective of the autoencoder is to self-learn the feature space without
the knowledge of the explicit features that were used to construct the FE mesh. Nevertheless, this demonstration
allows an opportunity to evaluate the quality of the autoencoder with respect to the explicit features. Fig. 14 presents
MSECNN response surfaces with respect to the explicit parameters for the CNN-autoencoder. These response surfaces
were generated by fitting a cubic spline surface that maps the explicit parameters to the MSECNN loss. The error surfaces
highlight a peak in MSECNN occurring at B ∈ [60, 65] and W = 75. Analyzing the data set showed that 75%–85%
of all designs that were bounded by this domain were present in the training set. The intensity of this error tends
to increase with thickness, as well as with respect to the remaining design domain. However, this result shows
that the mass voxelization approach is sufficient in capturing the mass effect. As such, potential further training
Fig. 13. (a) Original, (b) Reconstructed, and (c) Individual MSE voxel map with the (i) Highest, (ii) Median, and (iii) Lowest MSECNN
loss from the validation set for Architecture #3 with a 20 × 20 × 100 pixel map and 32 features.
Fig. 14. MSECNN loss response surfaces with respect to the explicit parameters for Architecture #3 with a 20 × 20 × 100 pixel map and
32 features.
Fig. 15. MSELSTM FvD loss response surfaces with respect to the number of hidden states and training split for (a) 1 LSTM Cell, (b) 2 LSTM Cells, and (c) 3 LSTM Cells for predicting the force–displacement response of the structure.
time, enhanced sampling, and learning algorithms could provide additional improvement and possibly reduce the
magnitude of these peaks. Nevertheless, the MSECNN within the peak is still low. Overall, these response surfaces
show no direct correlation between MSECNN and an individual explicit parameter. This is an indicator that the
CNN-autoencoder can self-learn the feature space and can be used in predicting the time responses.
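Under the assumption that the loss of each design is stored alongside its explicit parameters, a response surface of this kind could be generated as sketched below, here over a pair of parameters; the array names are illustrative only.

```python
# Hedged sketch of fitting an error response surface over two explicit
# parameters by cubic interpolation of scattered (B, W, loss) samples.
import numpy as np
from scipy.interpolate import griddata

def response_surface(B, W, loss, n=50):
    bg, wg = np.meshgrid(np.linspace(B.min(), B.max(), n),
                         np.linspace(W.min(), W.max(), n))
    surf = griddata((B, W), loss, (bg, wg), method="cubic")
    return bg, wg, surf   # grids and interpolated loss surface for plotting
```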
5.2. Analysis of LSTM-NN in predicting force–displacement response
This study presents different configurations of the LSTM-NN that utilizes the flattened latent space from the
CNN-autoencoder with an input voxel map of (20, 20, 100, 1), Architecture #3 and 32 features for predicting the
150 increments (∆tLSTM FvD = 0.1 ms) of the force–displacement response. The latent space and the output force and
displacement were normalized between [−1,1] for the training process. Up to three LSTM cells were stacked in
series with the hidden state vector of the last cell being used as input into the next LSTM cell. The vector size of the
hidden state of the LSTM cells was varied according to h = 2^i with 0 ≤ i ≤ 10. The dense layer after the last LSTM
cell (as shown in Fig. 5) utilized a tanh activation function, while the output layer utilized a simple linear function.
The influence of the training split was also studied to identify a sufficient number of FE simulations that would
be necessary to achieve reasonable accuracy in predicting the force–displacement response. This will accomplish
the objective of reducing the cost and accelerating the development process for design engineers by reducing the
number of required FE simulations. The training split was varied according to 0.01 × 2^j with 0 ≤ j ≤ 6, and the
training sets were randomly sampled. Similar to the methodology outlined in Section 5.1, the Adam [50] method
with default hyperparameters was used to minimize MSELSTM FvD for 1000 epochs with an initial learning rate of 0.001 and a batch size of 2^5. Each training operation was performed 12 times for each configuration.
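A sketch of the resulting network topology in Keras/TensorFlow is given below. The layer sizes and activations follow the description above; how the flattened latent vector is presented to the first LSTM cell at each of the 150 increments is an assumption for illustration.

```python
# Hedged sketch of the stacked LSTM-NN described above. The latent vector is
# assumed to be repeated at each of the 150 time increments; layer sizes and
# activations follow the text, the exact input layout is an assumption.
import tensorflow as tf

def build_lstm_fvd(latent_dim=32, n_steps=150, n_cells=1, hidden=256):
    inputs = tf.keras.Input(shape=(n_steps, latent_dim))
    x = inputs
    for _ in range(n_cells):
        # return_sequences=True passes the hidden state at every increment on
        x = tf.keras.layers.LSTM(hidden, return_sequences=True)(x)
    x = tf.keras.layers.Dense(hidden, activation="tanh")(x)     # dense tanh layer
    outputs = tf.keras.layers.Dense(2, activation="linear")(x)  # force, displacement
    return tf.keras.Model(inputs, outputs)
```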
Fig. 15 presents the error response surfaces with respect to the number of hidden states, training split, and the
number of LSTM cells for predicting the force–displacement response. First, a three-way ANOVA study (with
respect to # of hidden states, training split, and # of cells) showed little to no statistical difference in reducing the
MSELSTM FvD with respect to the number of LSTM cells for this study (p-value = 0.1173) with no interaction effects.
Fig. 16. (a) Training and (b) Validation convergence using one LSTM cell, a training split of 0.16 and 256 hidden states for predicting the
force–displacement response of the structure.
Therefore, attention was paid to analyzing the single-cell LSTM-NN for the remainder of this study. The MSELSTM FvD
loss response converged to a value of 17–19 × 10−3 after a training split of 0.04 (i.e. 4% of the total dataset)
using only one hidden state. The MSELSTM FvD loss exhibited convergence behavior that is approximately linear in log-space for higher training splits. However, the MSELSTM FvD loss response further improved when increasing the
size of the hidden state and saturated for hidden states ≥ 128 units with higher training splits. Now, it is desired
to use a minimum amount of data while still achieving good predictive capabilities. As such, a training split of
0.16 (corresponding to 329 out of 2058 simulations that are used for training) was selected for further analysis in
predicting the force–displacement response. The configuration with 256 hidden states had the lowest MSELSTM FvD loss (2.32 × 10−3) at Epoch = 1000 within the training split of 0.16 and, therefore, was selected for further analysis.
The training process that was outlined above was also repeated using Epoch = 100,000 for twelve repetitions.
Again, the training data was randomly sampled with a different seed in each trial. Similar convergence histories
were obtained in each trial. Fig. 16 presents representative training and validation convergence plots for predicting
the force–displacement response. Although the MSELSTM FvD validation loss improved to 2.04 × 10−3, this occurred at Epoch = 1525 with a MSELSTM FvD training loss of 4.29 × 10−4. After Epoch = 1525, the MSELSTM FvD validation loss slightly increased before saturating. Meanwhile, the MSELSTM FvD training loss generally continued to improve with each iteration, reaching a MSELSTM FvD training loss of 3.66 × 10−7 at Epoch = 100,000; this is an indicator of potential
overfitting with time. Nevertheless, the network at Epoch = 100,000 was utilized for evaluating the suitability in
predicting the force–displacement response. Once trained, the LSTM-NN requires an average computational time of
2.7 × 10−4 s to predict the time-dependent force–displacement response using a workstation with 4 × 2.8 GHz Intel Core i7 processors. This corresponds to a 2.96 × 10^6 times computational speedup compared to using
the conventional FE method, which is beneficial for design engineers and scientists. Fig. 17 presents comparisons
between the force–displacement responses from the FE simulation and ML predictions. The results with the lowest,
median, and highest MSELSTM FvD loss from the validation set are shown in the figure. Table 5 presents a comparison of the mean crush force, peak crush force, and energy absorption from each simulation, along with the corresponding MSELSTM FvD loss. Overall, the trends are well-captured by the ML prediction. However, there appears to be a phase shift in the case with the highest MSELSTM FvD loss. Nevertheless, the predictions of the energy absorption characteristics
are within 10%, with an average error between 3% and 5%. Now, the ML prediction appears to be attempting to
capture the small oscillations in the force–displacement response due to noise in the signal. Filtering the original
force–displacement response from the FE simulation using standard automotive criteria, such as the SAE CFC
standard [75], could improve training performance and overall generalizability by smoothing the signal.
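As an indication of what such smoothing could look like, the sketch below applies a zero-phase Butterworth low-pass filter with SciPy as a stand-in for a standardized CFC filter; the cutoff frequency and filter order are illustrative assumptions, not values taken from SAE J211.

```python
# Hedged sketch of smoothing the FE force signal before training. This uses a
# zero-phase Butterworth low-pass as a stand-in for an SAE CFC filter; the
# cutoff and order are illustrative assumptions.
from scipy.signal import butter, filtfilt

def smooth_force(force, dt=1e-4, cutoff_hz=1000.0):
    fs = 1.0 / dt                                # sampling frequency from dt
    b, a = butter(2, cutoff_hz / (0.5 * fs))     # 2nd-order low-pass design
    return filtfilt(b, a, force)                 # forward-backward, zero phase
```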
Fig. 18 presents the MSELSTM FvD loss response with respect to the explicit parameters used in the data generation. It should be reiterated that the prediction of the force–displacement response is solely based on the latent space obtained by the autoencoder and that this is an opportunity to evaluate the quality of the complete framework.
Fig. 17. FE simulation and ML predictions of the force–displacement responses with the (a) Lowest, (b) Median, and (c) Highest MSELSTM FvD loss from the validation set.
Table 5
Comparison of energy absorption characteristics between the FE simulation and ML predictions for the lowest, median, and highest MSELSTM FvD validation loss.

| Model | F_mean^FE [kN] | F_peak^FE [kN] | E_abs^FE [kJ] | F_mean^AI [kN] | F_peak^AI [kN] | E_abs^AI [kJ] | ∆ϵ, Fmean [%] | ∆ϵ, Fpeak [%] | ∆ϵ, Eabs [%] | MSELSTM FvD |
| Highest | 75.7 | 217.5 | 16.4 | 83.0 | 234.3 | 17.7 | 9.7 | 7.7 | 8.1 | 0.0195 |
| Median | 54.9 | 173.2 | 12.1 | 53.1 | 175.8 | 11.8 | −3.2 | 1.5 | 2.9 | 0.0016 |
| Lowest | 23.9 | 86.7 | 5.5 | 22.9 | 83.3 | 5.3 | −4.3 | −4.0 | −4.0 | 0.0002 |
Similarly, these response surfaces were generated by fitting a cubic spline surface that maps the explicit parameters to the MSELSTM FvD loss. The MSELSTM FvD loss shows some increase with respect to the thickness of the FE mesh. There is
also a noticeable band of higher loss at H = 400 mm, W = 80 mm, T = 3.0 mm, B ∈ [75, 90], which is a result of
low sampling within this region. Overall, this framework has good performance in predicting the force–displacement
response.
5.3. Analysis of LSTM-NN in predicting mesh deformation response
This study presents different configurations of the LSTM-NN for predicting the deformation response of the
rectangular crush tubes over time. A total of 30 time increments (∆t = ∆tLSTM Mesh = 0.5 ms) are used to train
the LSTM-NN and capture the evolution of the mesh. Similarly, this LSTM-NN utilizes the CNN-autoencoder
from Section 5.1 using [−1,1] normalization of the values. Similar to the methodology outlined in Section 5.2, the
dense layer on the output of the LSTM cell utilized a tanh activation function, while the output layer utilized a
simple linear function. It should be reiterated that no normalization of the displacements was used during training
because the scale and magnitude are common for all nodes. As such, the loss function, MSELSTM Mesh, for this network
represents the average distance error in predicting a node. The Adam [50] method with default hyperparameters
was used to minimize the MSELSTM Mesh loss. Only up to two LSTM cells were stacked in series with the hidden state
vector of the last cell being used as input into the next LSTM cell due to computational requirements for this
study. The vector size of the hidden state of the LSTM cells was varied according to h = 4^i with 1 ≤ i ≤ 5.
The influence of the training split was also studied to identify a sufficient number of FE simulations that would be
necessary to achieve reasonable accuracy in predicting the deformation of the mesh. Unlike the previous study, the
prediction of the displacement of the node requires the initial coordinate as an input into the LSTM-NN for the
given FE structure of interest. It should be emphasized that the partitioning in the training and validation sets was
performed on a simulation by simulation level. As such, only the nodes within the training set of simulations are
used for identifying the weights in the LSTM-NN. This ensures that no nodes are used from the validation set of
Fig. 18. MSELSTM FvD loss surfaces with respect to the explicit parameters for one LSTM cell, a training split of 0.16 and 256 hidden states for predicting the force–displacement response of the structure.
Fig. 19. MSELSTM Mesh validation loss response surfaces with respect to the number of hidden states and training split for (a) 1 LSTM Cell and (b) 2 LSTM Cells for predicting the deformation of the structure.
simulations. The training split was varied according to 0.01 × 2^j with 0 ≤ j ≤ 4; an upper limit of four was used
to follow the reduction in simulations from Section 5.2. Again, each training operation was performed 12 times
for each configuration. The simulations were selected using a random sampling algorithm with random seeding. It
should be noted that there was a small difference in the total number of nodes in each trial (<3%) because each FE
simulation had a different number of nodes and because of the random selection algorithm used to generate the training/validation sets. However, this fact had a negligible impact on the training convergence time and performance. It should also
be noted that the displacements of 11,656,512 nodes are required to predict the 2058 simulations. As such, a batch
size of 2^12 and 100 epochs were used during training.
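To make the per-node setup concrete, the sketch below assembles one training sample per node as described above: the design's latent vector plus the node's initial coordinate as input, and its displacement history over the 30 increments as the target. Array names and shapes are assumptions for illustration.

```python
# Hedged sketch of assembling per-node LSTM samples: the design's latent vector
# plus the initial nodal coordinate as input, the displacement history as target.
import numpy as np

def node_samples(latent, coords0, disp_history, n_steps=30):
    # latent: (32,); coords0: (n_nodes, 3); disp_history: (n_nodes, n_steps, 3)
    n_nodes = coords0.shape[0]
    static = np.hstack([np.tile(latent, (n_nodes, 1)), coords0])  # (n_nodes, 35)
    x = np.repeat(static[:, None, :], n_steps, axis=1)  # repeat over increments
    return x, disp_history
```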
Fig. 19 presents the error response surfaces with respect to the number of hidden states, training split, and
the number of LSTM cells for predicting the deformation of the mesh. Both the single and double cell LSTM-NN
models showed a general trend of simultaneous improvement for increasing the number of hidden states and training
split. Yet, the MSELSTM Mesh validation loss approached a linear behavior with respect to training split for an increase in the size of the hidden state. Although the surface responses appear similar, a three-way ANOVA study (# of hidden states, training split, # of LSTM cells) showed a statistically significant difference in reducing the MSELSTM Mesh validation loss with respect to the number of LSTM cells for this study (p-value = 8.08 × 10−11). Following the 0.16 training split that was used in the prediction of the force–displacement response, a double cell LSTM-NN with 1024 hidden states was selected for further analysis because it had the lowest MSELSTM Mesh, which will demonstrate the level of
fidelity of this framework.
Fig. 20 presents representative training and validation convergence plots for predicting the deformation response
of the mesh. The MSELSTM Mesh training loss continues to improve, while the validation loss saturates at approximately
6.58 × 10−1 . Once trained, the LSTM-NN requires an average computational time of 2.4 s to predict the time-
dependent response of the deformed mesh using a workstation with 4 × 2.8 GHz Intel Core i7 processors. This
corresponds to 332.7 times computational speedup compared to using the conventional FE method. Again, this
Fig. 20. Training and validation convergence using two LSTM cells, a training split of 0.16 and 1024 hidden states for predicting the
deformation response of the structure.
computational acceleration can offer design engineers and scientists a substantial speedup in the design exploration
process with CAE simulation tools, such as FE analysis. Fig. 21 presents a comparison between the time-dependent
deformation response of the result from the FE simulation and the predicted response from the ML system. The
deformation response was taken from a sample with the median model error in the validation set. The elemental
error is used to compare the error between the FE simulation and the ML predicted response. Since the FE models utilized four-node quadrilateral elements, the average elemental error, Err_t^(el), is computed as the average distance error of the four connected nodes defined as
$$\overline{\mathrm{Err}}_t = \frac{1}{E}\sum_{el=1}^{E} \mathrm{Err}_t^{(el)}, \qquad \mathrm{Err}_t^{(el)} = \frac{1}{4}\sum_{n=1}^{4} \mathrm{Err}_t^{(el(n))},$$
$$\mathrm{Err}_t^{(n)} = \sqrt{\left(\tilde{u}_{x,t}^{(n)} - u_{x,t}^{(n)}\right)^2 + \left(\tilde{u}_{y,t}^{(n)} - u_{y,t}^{(n)}\right)^2 + \left(\tilde{u}_{z,t}^{(n)} - u_{z,t}^{(n)}\right)^2} \qquad (28)$$
where el(n) governs the connectivity of the element, and E is the total number of elements in the FE model. Fig. 21
also presents the elemental error distribution at various instances of time. The ML system can identify the nodes
that are translationally constrained at the top and the bottom of the crush tube. As time progresses, the elemental
error increases with each sequential fold in time; the error was the highest at the end of time. The average elemental
error at the end of time for predicting an element was 1.87 mm, which corresponded to a MSELSTM Mesh loss of 0.66 mm². However, the majority of the error results from a translational error of approximately 5.0 mm in predicting the
last fold instead of localized errors throughout the mesh. As such, the ML system can capture the sequencing and
progressive folding pattern. For additional comparison, Fig. 22 presents results with the lowest, median, and highest
MSELSTM Mesh loss from the validation set at the end of time. The average elemental error corresponding to the deformed
meshes with the lowest, median, and highest MSELSTM Mesh loss was 0.37 mm, 1.87 mm, and 6.96 mm, respectively.
Similar observations in identifying the constrained nodes and general trends in the distribution of the error are
observed in these cases. In a similar manner, Fig. 23 presents MSELSTM Mesh loss surfaces with respect to the explicit
parameters for the FE model that were generated using cubic spline surfaces. This result shows that there appears to
be no correlation between an explicit parameter and the MSELSTM Mesh loss. Overall, the ML system has good predictive
capabilities in capturing the deformation response predicted by the FE simulation (see Fig. 23).
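For reference, Eq. (28) translates into a few lines of NumPy; the sketch below assumes quad connectivity elems of shape (E, 4) holding node indices, and predicted and true nodal displacements u_tilde and u of shape (n_nodes, 3) at one instant in time.

```python
# Hedged sketch of the elemental error of Eq. (28) at a single time increment.
# elems: (E, 4) node indices; u_tilde, u: (n_nodes, 3) predicted/true displacements.
import numpy as np

def elemental_error(u_tilde, u, elems):
    node_err = np.linalg.norm(u_tilde - u, axis=1)  # distance error per node
    elem_err = node_err[elems].mean(axis=1)         # mean of the 4 corner nodes
    return elem_err, elem_err.mean()                # per-element and mesh average
```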
5.4. Discussion and remarks on the proposed framework
The proposed ML framework demonstrated the ability to self-learn a latent space for use in an LSTM-NN to
predict the response of FE simulations through the resulting force–displacement and deformed mesh. The latent
Fig. 21. (a) FE Simulation, (b) ML Predicted, and (c) Elemental error distribution in predicting the mesh deformation at various instances
in time from the model with median model error in the validation set.
Fig. 22. Comparison of the FE simulation, ML predicted, and elemental error distribution for predicting the mesh deformation at the end
of the simulation (t = 15 ms) from models with the (a) highest, (b) median, and (c) lowest model error from the validation set.
space was derived from a mass voxelization strategy that was based solely on the initial FE model data with
3D-CNN-autoencoders. The proposed mass voxelization strategy showed no particular difficulty in capturing the
variation in the shape or thickness of the FE meshes. Although the framework demonstrated good predictive
capabilities, it is important to note that the fidelity of these results, including the predictions generated by the
Fig. 23. MSELSTM Mesh loss surfaces with respect to the explicit parameters for two LSTM cells, a training split of 0.16 and 1024 hidden states.
LSTM-NNs, is contingent on the ability of the 3D-CNN-autoencoder method to grasp the latent space of the FE
model through the voxelization method. First, it should be pointed out that the current voxelization strategy contains
a sparse matrix with relatively few non-zero elements. This can lead to poor memory efficiency for relatively large structures that contain large volumes of empty space. Additional strategies for managing voxelization should
be explored. Next, components within automotive structures can have complex and varying profiles [76], wall-thicknesses [77,78], and material properties [79] to achieve the desired performance. Yet, the FE model of the crush tube used in this study had a constant rectangular cross-section profile with a uniform wall thickness and a single
material. The proposed framework of capturing the mass of a node as a voxel should be capable of managing these
variations in the geometry. However, the voxelization map would require enrichment by incorporating additional
channels to capture variations in material. This would allow the framework to capture tailored material properties
throughout an automotive structure and component [80].
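To make the mass voxelization idea concrete, the sketch below bins nodal masses into a fixed voxel grid with NumPy. It only illustrates the concept; the grid bounds, the normalization over all FE models, and the details of Eq. (7) are simplified assumptions.

```python
# Hedged sketch of mass voxelization: nodal masses are accumulated into a
# fixed (20, 20, 100) grid over the model's bounding box. Grid bounds and the
# normalization (here: by this model's own maximum) are simplifying assumptions.
import numpy as np

def voxelize_mass(coords, masses, shape=(20, 20, 100)):
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    edges = [np.linspace(lo[d], hi[d], shape[d] + 1) for d in range(3)]
    vox, _ = np.histogramdd(coords, bins=edges, weights=masses)
    return vox / vox.max()   # normalized mass voxel map
```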
The present study explored different configurations of 3D-CNN-autoencoder architectures with the size of the
latent space. Architectures #1–3 involved a single CNN filter to the flattened latent space, while Architectures
#4–6 utilized multiple CNN filtering layers. Effectively, Architectures #1–3 are acting like a feedforward NN layer
that maps the input voxel map to the latent space. Utilizing multiple CNN filters did show some improvement in
reducing the training and validation loss when compared to a single CNN filter for the same flattened latent space
size. This result means that the multi-CNN filters are extracting higher-order information from the voxelization map.
However, architectures with multiple CNN filters generally had higher variation in the training and validation loss
with an increase in the size of the latent space compared to a single CNN filter. Also, the training algorithm had
difficulties overcoming the initial local minimum that was generated at the beginning of training for Architectures
#4–6 for 16 features or more; this contributed substantially to the variability in the training and validation loss.
Better training and intelligent early termination criteria can be used to prevent such an event. Ultimately, the single
CNN filter was selected for the remainder of the analysis. However, this result could be drastically different with
a more complex geometry that cannot be decomposed down to a simple 2D cross-section and an extrusion height.
More complex geometries can be used in future studies to highlight the necessity and efficiency of multi-layered
CNN-autoencoders for use in encoding FE model data.
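One simple realization of such a termination criterion, sketched here with a Keras callback, would stop runs that stall on the initial plateau instead of training them to a fixed epoch count; the thresholds are illustrative assumptions, not values from this study.

```python
# Hedged sketch of an early-termination criterion using a Keras callback.
# The patience and min_delta thresholds are illustrative assumptions.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=200,               # epochs without improvement before stopping
    min_delta=1e-6,             # smallest change counted as an improvement
    restore_best_weights=True)  # roll back to the best validation loss

# Passing callbacks=[early_stop] to model.fit(...) would then abandon runs that
# remain trapped near the initial local minimum.
```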
Still, the CNN-autoencoder method compressed the high-dimensionality nature of the FE model down to a
flattened latent space of approximately 32 parameters. Although four explicit features were used to define the
domain for constructing the FE models, the autoencoder may be capturing higher-level features within the geometry (i.e., meshing, trigger) that require higher dimensionality. This compression of the FE model into a flattened latent
space is a critical component to facilitate the training of the LSTM-NN for predicting the deformation of the mesh
and the force–displacement response. This flattened latent space through the FE voxelization strategy offers design
engineers an opportunity to couple the framework with optimization strategies for future design synthesis. However,
the present study utilized a general autoencoder strategy that produces a latent space that can have no physical
meaning. The classification of the different designs into the latent space by the autoencoder method can also produce
large discontinuous regions where sampling the latent space generates no useful designs. As a result, it is difficult to
sample within the current latent space and decoder for use as a generative model for future designs and will require
further enhancements. Recent developments in variational autoencoders [81,82] can allow for the latent space to
be constrained to a specific region that is continuous. This can allow for a smooth traversal through the latent
space that can enable conventional optimization algorithms to navigate the domain with ease. In addition, new and advanced methods in beta-variational autoencoders [83,84] can allow for the latent space to be decomposed into
independent factors without supervision. This can be used with the proposed framework to evaluate the size of the
latent space and minimize the dimensionality of the domain to unique parameters.
Finally, the capability of the framework relies on the ability of the LSTM-NN to predict the resulting force–
displacement and deformed mesh. The usage of LSTM-NNs to predict the deformed response of FE models is a
natural progression of the recent developments by Kohar et al. [34] and Van de Weg et al. [35]. These LSTM-NNs
are generating approximate solutions to a system of time-dependent non-linear partial differential equations that are
governed by the mechanics and physics of solids. Although the load case is constant (i.e., axial crashworthiness
of a rectangular crush member), the complex crushing behavior of a structure is a strain-path and time-dependent
problem. As such, the movement of a node and the change in force–displacement is time and history-dependent,
which is suitable for recurrent-type neural networks, such as LSTM-NNs. This study creates the foundation for
future exploration of various fundamental deep neural network architectures (i.e., with or without recurrence), other
time-dependent cells, such as gated recurrent unit [85], Fourier recurrent units [86], or Legendre memory units [87],
or multiple load cases and boundary conditions.
6. Conclusions
This work presented a novel framework for using machine learning to predict simulation results obtained
from CAE simulations. In particular, the framework was demonstrated using FE simulations of dynamic axial
crushing of various rectangular tubes. A virtual design of experiments was used to generate the necessary training
data by varying the base, width, height, and wall thickness of the FE model; a total of four explicit parameters were
used to generate 2058 FE models and simulations. A different number of nodes and elements was produced in each
FE model by maintaining a constant mesh size; this introduced some unstructured nature to the data set. Both the
explicit feature definitions and meshing techniques that were used to create the FE models were unknown to the
ML system.
The ML system was developed using the Keras/Tensorflow [72] libraries in Python and trained using a multi-GPU
accelerated workstation. First, 3D-CNN-autoencoders were used to process the initial FE model data (i.e., nodes,
elements, thickness, etc.) to self-learn the feature space, known as the latent space, in an unsupervised manner. A
voxelization strategy that operated on the mass of individual nodes was proposed to handle the unstructured nature of
the nodes and elements while capturing variations in the wall thickness of the FE models. The 3D-CNN-autoencoder
was trained using the Adam [50] method for minimizing the mean squared error, MSECNN , of the mass voxelization
map. Different voxel resolutions, latent features, and CNN filtering schemes were systematically studied to identify
a suitable latent space for characterizing the FE mesh. ANOVA studies were used to identify configurations of the
3D-CNN-autoencoder that were statistically different while providing good predictive capability in reconstructing
the geometry. The ANOVA study revealed that models with 32 features or more were not statistically different
from each other for all configurations. Although there are four explicit features in the geometry, the autoencoder may be identifying higher-level features within the geometry (i.e., meshing, trigger) that require higher dimensionality.
A single CNN filter produced results that had the least amount of variance when the training operation was
repeated. Although adding multiple CNN-filtering layers could achieve higher accuracy in some instances through
better feature recognition, it came at a substantial computational cost (∼10 times) and higher training variability.
In addition, the Adam [50] training algorithm with the prescribed default configuration had difficulties overcoming the local minimum that was generated at initial seeding for the higher filtering architectures. A single
layer 3D-CNN-autoencoder with a voxel map of (20, 20, 100, 1) and 32 features was selected, retrained for a
more extended period, and used for the remainder of the analysis. The original and reconstructed voxel maps
produced by the 3D-CNN-autoencoder were presented, showing that even the reconstructed voxel map with the highest error still well captured the FE geometry's structure, features, and mass distribution. An error response
surface of the 3D-CNN-autoencoder validation loss was created with respect to the explicit parameters to evaluate
the quality of the latent space. Overall, the 3D-CNN-autoencoder grasped the feature space with no stand-out
region of poor performance. However, enhancements to the sampling and training could still further improve
performance.
The latent space that was generated by the 3D-CNN-autoencoder of the FE mesh was then used as the input into
LSTM-NNs to predict the deformation and force–displacement response during dynamic axial crushing. Studies
were performed varying the number of LSTM cells and the number of hidden states in each network. More
importantly, the amount of data used in training was also varied under each condition. A single cell LSTM-NN
with 256 hidden states and a training split of 0.16 showed good capabilities in predicting the force–displacement
response. This means that the proposed framework could predict, with good accuracy, the remaining 84% of the available data that was unseen during training. The LSTM-NN could predict the energy absorption characteristics with an average error between 3% and 5%. Similarly, a double cell LSTM-NN with 1024 hidden states and a training split
of 0.16 could predict the location of a node within 1.87 mm on average for an entire mesh. The LSTM-NN could
identify the nodes that were translationally constrained and well-capture the deformation response of the structure with
respect to time. Once trained, the LSTM-NN for predicting the force–displacement and the deformed mesh was
approximately 2,960,000 and 330 times faster, respectively, than the FE approach with good accuracy. It should be mentioned that this study is one of the first, if not the first, to predict and visualize a deformed structural response or FE mesh in crashworthiness without knowing the explicit design parameters of the geometry. This framework
offers design engineers and scientists a potential tool to significantly accelerate the development process of CAE
simulation tools, such as FE analysis. Although this framework was demonstrated for a simple example of dynamic
axial crushing of various rectangular tubes, it can be deployed to exploit already existing big data sets
from FE simulations of previous vehicle development programs that remain largely ignored on data systems.
Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential
competing interests: This work was supported by Volkswagen AG under Contract No. AZ-79483123. The authors
have no additional conflicts of interest to declare.
Acknowledgments
The authors would like to thank Dr. Michael Andres and Mr. Bram van de Weg from Volkswagen AG for
the valuable discussions and feedback. Sincere thanks are due to Dr. Pit Schwanitz, and Mr. David Kracker from
Porsche AG for their valuable contributions and ideas as well as for providing a realistic data set that was used for
the development and testing of the framework.
Appendix A. Number of parameters in each CNN-autoencoder architecture
| # of Features | Arch. #1 (5, 5, 25, 1) | Arch. #2 (10, 10, 50, 1) | Arch. #3 (20, 20, 100, 1) | Arch. #4 (10, 10, 50, 1) | Arch. #5 (20, 20, 100, 1) | Arch. #6 (20, 20, 100, 1) |
| 1 | 1252 | 10,002 | 80,002 | 3126 | 22,346 | 6874 |
| 2 | 2503 | 20,003 | 160,003 | 8751 | 64,691 | 23,735 |
| 4 | 5005 | 40,005 | 320,005 | 27,501 | 209,381 | 87,421 |
| 8 | 10,009 | 80,009 | 640,009 | 95,001 | 738,761 | 334,649 |
| 16 | 20,017 | 160,017 | 1,280,017 | 350,001 | 2,757,521 | 1,308,529 |
| 32 | 40,033 | 320,033 | 2,560,033 | 1,340,001 | 10,635,041 | 5,173,985 |
| 64 | 80,065 | 640,065 | 5,120,065 | 5,240,001 | 41,750,081 | 20,575,681 |
Appendix B. Average computational time (in seconds) for training each CNN-autoencoder architecture
| # of Features | Arch. #1 (5, 5, 25, 1) | Arch. #2 (10, 10, 50, 1) | Arch. #3 (20, 20, 100, 1) | Arch. #4 (10, 10, 50, 1) | Arch. #5 (20, 20, 100, 1) | Arch. #6 (20, 20, 100, 1) |
| 1 | 6.40 × 10^1 | 8.60 × 10^1 | 2.74 × 10^2 | 1.43 × 10^2 | 3.21 × 10^3 | 2.56 × 10^3 |
| 2 | 6.10 × 10^1 | 8.40 × 10^1 | 2.76 × 10^2 | 1.43 × 10^2 | 3.22 × 10^3 | 3.29 × 10^3 |
| 4 | 6.10 × 10^1 | 8.40 × 10^1 | 2.76 × 10^2 | 1.43 × 10^2 | 3.23 × 10^3 | 4.64 × 10^3 |
| 8 | 6.00 × 10^1 | 8.50 × 10^1 | 2.76 × 10^2 | 1.43 × 10^2 | 3.24 × 10^3 | 7.10 × 10^3 |
| 16 | 6.10 × 10^1 | 8.50 × 10^1 | 2.77 × 10^2 | 1.45 × 10^2 | 3.27 × 10^3 | 1.24 × 10^4 |
| 32 | 6.00 × 10^1 | 8.60 × 10^1 | 2.74 × 10^2 | 1.43 × 10^2 | 3.21 × 10^3 | 2.56 × 10^3 |
| 64 | 6.10 × 10^1 | 8.40 × 10^1 | 2.76 × 10^2 | 1.43 × 10^2 | 3.22 × 10^3 | 3.29 × 10^3 |
Appendix C. P-values from ANOVA analysis of Architecture #3 for various numbers of features
| # of Features | # of Features | P-value, MSECNN at Epoch = 1000 | P-value, log10(MSECNN) at Epoch = 1000 |
| 1 | 2 | 1.42 × 10^−7 | 7.15 × 10^−1 |
| 1 | 4 | 3.71 × 10^−8 | 1.38 × 10^−4 |
| 1 | 8 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 1 | 16 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 1 | 32 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 1 | 64 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 2 | 4 | 3.71 × 10^−8 | 2.54 × 10^−1 |
| 2 | 8 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 2 | 16 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 2 | 32 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 2 | 64 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 4 | 8 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 4 | 16 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 4 | 32 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 4 | 64 | 3.71 × 10^−8 | 3.71 × 10^−8 |
| 8 | 16 | 7.23 × 10^−1 | 3.71 × 10^−8 |
| 8 | 32 | 6.56 × 10^−1 | 3.71 × 10^−8 |
| 8 | 64 | 6.57 × 10^−1 | 3.71 × 10^−8 |
| 16 | 32 | 1.00 × 10^0 | 4.18 × 10^−8 |
| 16 | 64 | 1.00 × 10^0 | 6.49 × 10^−8 |
| 32 | 64 | 1.00 × 10^0 | 1.00 × 10^0 |
References
[1] N.H.T.S.A., National Highway Traffic Safety Administration Laboratory Test Procedure for FMVSS 208, Occupant Crash Protection,
U.S. Department of Transportation, Washington, 2008.
[2] Euro-NCAP, Rating Review 2018: Report from the Ratings Group, European New Car Assessment Programme, Leuven, Belgium,
2020.
[3] U.S.E.P.A., Draft Technical Assessment Report: Midterm Evaluation of Light-Duty Vehicle Greenhouse Gas Emission Standards and
Corporate Average Fuel Economy Standards for Model Years 2022-2025, U.S. Environmental Protection Agency, Washington, 2016.
[4] A. Elgowainy, J. Han, J. Ward, F. Joseck, D. Gohlke, A. Lindauer, T. Ramsden, M. Biddy, M. Alexander, S. Barnhart, I. Sutherland,
Current and future United States light-duty vehicle pathways: Cradle-to-grave lifecycle greenhouse gas emissions and economic
assessment, Environ. Sci. Tech. 52 (4) (2018) 2392–2399.
[5] H. Kim, G. Keoleian, S. Skerlos, Economic assessment of greenhouse gas emissions reduction by vehicle lightweighting using aluminum
and high-strength steel, J. Ind. Ecol. 15 (1) (2011) 64–80.
[6] N. Meng, M. Strassmaier, J. Erwin, LS-DYNA® Performance on Intel® Scalable Solutions, in: 15th International LS-DYNA Users
Conference, Dearborn, MI, 2018.
[7] C. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. Bethard, D. McClosky, The Stanford CoreNLP natural language processing toolkit,
in: Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations, 2014.
[8] K. Chowdhary, Natural language processing, in: Fundamentals of Artificial Intelligence, Springer, New Delhi, 2020, pp. 603–649.
[9] A. Chaboud, B. Chiquoine, E. Hjalmarsson, C. Vega, Rise of the machines: Algorithmic trading in the foreign exchange market, J.
Financ. 69 (5) (2014) 2045–2084.
[10] H. Vu, H. Kim, J. Lee, 3D Convolutional neural network for feature extraction and classification of fMRI volumes, in: International
Workshop on Pattern Recognition in Neuroimaging, Vol. 104, 2018.
[11] A. Birenbaum, H. Greenspan, Multi-view longitudinal CNN for multiple sclerosis lesion segmentation, Eng. Appl. Artif. Intell. 65
(2017) 111–118.
[12] M. Hengstler, E. Enkel, S. Duelli, Applied artificial intelligence and trust—The case of autonomous vehicles and medical assistance
devices, Technol. Forecast. Soc. 105 (2016) 105–120.
[13] G. Meiring, H. Myburgh, A review of intelligent driving style analysis systems and related artificial intelligence algorithms, Sensors
15 (12) (2015) 30653–30682.
35
C.P. Kohar, L. Greve, T.K. Eller et al. Computer Methods in Applied Mechanics and Engineering 385 (2021) 114008

[14] X. Li, C. Roth, D. Mohr, Machine-learning based temperature-and rate-dependent plasticity model: application to analysis of fracture
experiments on DP steel, Int. J. Plast. 118 (2019) 320–344.
[15] K. Pandya, C. Roth, D. Mohr, Strain rate and temperature dependent fracture of aluminum alloy 7075: Experiments and neural network
modeling, Int. J. Plast. 135 (2020) 102788.
[16] B. Jordan, M. Gorji, D. Mohr, Neural network model describing the temperature-and rate-dependent stress–strain response of
polypropylene, Int. J. Plast. 135 (2020) 102811.
[17] D. Mohr, On the potential of recurrent neural networks for modeling path dependent plasticity, J. Mech. Phys. Solid (2020) 103972.
[18] C. Wang, L. Xu, J. Fan, A general deep learning framework for history-dependent response prediction based on UA-seq2seq model,
Comput. Method Appl. Mech. 372 (2020) 113357.
[19] P. Theocaris, P. Panagiotopoulos, Neural networks for computing in fracture mechanics, methods and prospects of applications, Comput.
Method Appl. Mech. 106 (1–2) (1993) 213–228.
[20] L. Greve, B. Schneider, T. Eller, M. Andres, J. Martinez, B. van de Weg, Necking-induced fracture prediction using an artificial neural
network trained on virtual test data, Eng. Fract. Mech. 219 (2019) 106642.
[21] J. Jung, K. Yoon, P. Lee, Deep learned finite elements, Comput. Methods Appl. Mech. 372 (2020) 113401.
[22] F. Meister, T. Passerini, V. Mihalef, A. Tuysuzoglu, A. Maier, T. Mansi, Deep learning acceleration of total Lagrangian explicit
dynamics for soft tissue mechanics, Comput. Method Appl. Mech. 358 (2020) 112628.
[23] S. Saha, Z. Gan, L. Cheng, J. Gao, O. Kafka, X. Xie, H. Li, M. Tajdari, H. Kim, W. Liu, Hierarchical deep learning neural network
(hidenn): An artificial intelligence (AI) framework for computational science and engineering, Comput. Method Appl. Mech. 373 (2021)
113452.
[24] M. Møller, A scaled conjugate gradient algorithm for fast supervised learning, Neural Netw. 6 (4) (1993) 525–533.
[25] W. Roux, N. Stander, R. Haftka, Response surface approximations for structural optimization, Internat. J. Numer. Methods Engrg. 42
(1998) 517–534.
[26] N. Stander, W. Roux, M. Giger, M. Redhe, N. Fedorova, J. Haarhoff, Crashworthiness optimization in LS-OPT: Case studies in
metamodeling and random search techniques, in: 4th European LS-DYNA Users Conferece, 2003, pp. J–I–11–26.
[27] H. Park, X. Dang, Structural optimization based on CAD–CAE integration and metamodeling techniques, Comput. Aided Des. 42 (10)
(2010) 889–902.
[28] J. Marzbanrad, M. Ebrahimi, Multi-objective optimization of aluminum hollow tubes for vehicle crash energy absorption using a genetic
algorithm and neural networks, Thin-Walled Struct. 49 (2011) 1605–1615.
[29] C. Kohar, A. Zhumagulov, A. Brahme, M. Worswick, R. Mishra, K. Inal, Development of high crush efficient, extrudable aluminium
front rails for vehicle lightweighting, Int. J. Impact Eng. 95 (2016) 17–34.
[30] C. Kohar, A. Brahme, J. Imbert, R. Mishra, K. Inal, Int. J. Solid Struct. 128 (2017) 174–198.
[31] F. Xiong, D. Wang, S. Chen, Q. Gao, S. Tian, Multi-objective lightweight and crashworthiness optimization for the side structure of
an automobile body, Struct. Multidiscip. Optim. 58 (4) (2018) 1823–1843.
[32] L. Lanzi, C. Bisagni, S. Ricci, Neural network systems to reproduce crash behavior of structural components, Comput. Struct. 82 (1)
(2004) 93–108.
[33] T. Omar, A. Eskandarian, N. Bedewi, Vehicle crash modelling using recurrent neural networks, Math. Comput. Modelling 28 (9) (1998)
31–42.
[34] C. Kohar, D. Connolly, T. Luisko, K. Inal, Using Artificial Intelligence to Aid Vehicle Lightweighting in Crashworthiness with
Aluminum, in: 17th International Conference on Aluminium Alloys, Grenoble, France, 2020.
[35] B. van de Weg, L. Greve, M. Andres, T. Eller, B. Rosic, Neural network-based surrogate model for a bifurcating structural fracture
response, Eng. Fract. Mech. 241 (2020) 107424.
[36] P. Baque, E. Remelli, F. Fleuret, P. Fua, Geodesic convolutional shape optimization, 2018, arXiv preprint arXiv:1802.04016.
[37] Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, L. Jackel, Backpropagation applied to handwritten zip code
recognition, Neural Comput. 1 (4) (1989) 541–551.
[38] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (11) (1998)
2278–2324.
[39] K. Kamnitsas, C. Ledig, V. Newcombe, J. Simpson, A. Kane, D. Menon, D. Rueckert, B. Glocker, Efficient multi-scale 3D CNN with
fully connected CRF for accurate brain lesion segmentation, Med. Image Anal. 36 (2017) 61–78.
[40] F. Milletari, N. Navab, S. Ahmadi, V-net: Fully convolutional neural networks for volumetric medical image segmentation, in: Fourth
International Conference on 3D Vision, 2016, pp. 565–571.
[41] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, J. Xiao, 3d shapenets: A deep representation for volumetric shapes, in:
Proceedings of the IEEE conference on computer vision and pattern recognition, 2015.
[42] Z. Zhang, P. Jaiswal, R. Rai, Featurenet: machining feature recognition based on 3D convolution neural network, Comput.-Aided Des.
101 (2018) 12–22.
[43] G. Hinton, R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (2006) 504–507.
[44] P. Vincent, H. Larochelle, Y. Bengio, P. Manzagol, Extracting and composing robust features with denoising autoencoders, in:
Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 1096-1103.
[45] C. Kohar, M. Mohammadi, R. Mishra, K. Inal, Effects of elastic–plastic behaviour on the axial crush response of square tubes, Thin
Wall Struct. 93 (2015) 64–87.
[46] D. Rumelhart, G. Hinton, R. Williams, Learning representations of back-propagation errors, Nature 323 (1986) 533–536.
[47] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, (Vol. 1, No. 2), MIT press, Cambridge, 2016.
[48] K. Levenberg, A method for the solution of certain non-linear problems in least squares, Q. Appl. Math. 2 (2) (1944) 164–168.
36
C.P. Kohar, L. Greve, T.K. Eller et al. Computer Methods in Applied Mechanics and Engineering 385 (2021) 114008

[49] D. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, J. Soc. Ind. Appl. Math. 11 (2) (1963) 431–441.
[50] D. Kingma, J. Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
[51] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780.
[52] K. Greff, R. Srivastava, J. Koutník, B. Steunebrink, J. Schmidhuber, LSTM: A search space odyssey, IEEE Trans. Neural Netw. Learn.
28 (10) (2016) 2222–2232.
[53] E. Karabassi, G. Papaioannou, T. Theoharis, A fast depth-buffer-based voxelization algorithm, J. Graph. Tools 4 (4) (1999) 5–10.
[54] T. Hinks, H. Carr, L. Truong-Hong, D. Laefer, Point cloud data conversion into solid models via point-based voxelization, J. Surv.
Eng. 139 (2) (2013) 72–83.
[55] R. Caruana, Multitask learning, Mach. Learn. 28 (1) (1997) 41–75.
[56] B. Williams, C. Simha, N. Abedrabbo, R. Mayer, M. Worswick, Effect of anisotropy, kinematic hardening, and strain-rate sensitivity
on the predicted axial crush of hydroformed aluminum alloy tubes, Int. J. Impact Eng. 37 (2010) 652–661.
[57] J. Hallquist, LS-DYNA Theory Manual, Livermore Software Technology Corporation, Livermore, California, 2006.
[58] T. Belytschko, W. Liu, B. Moran, K. Elkhodary, Nonlinear Finite Elements for Continua and Structures, John Wiley & Sons, 2013.
[59] R. Norton, Design of Machinery: An Introduction to the Synthesis and Analysis of Mechanisms and Machines, McGraw-Hill, New
York, 2004.
[60] E. Avallone, T. Baumeister, A. Sadegh, Mark’s Standard Handbook for Mechanical Engineers, McGraw-Hill, New York, 2007.
[61] B. Williams, M. Worswick, G. D’Amours, A. Rahem, R. Mayer, Influence of forming effects on the axial crush response of hydroformed
aluminum alloy tubes, Int. J. Impact Eng. 37 (2010) 1008–1020.
[62] R. Smerd, S. Winkler, C. Salisbury, M. Worswick, D. Lloyd, M. Finn, High strain rate tensile testing of automotive alloy sheet, Int.
J. Impact Eng. 32 (2005) 541–560.
[63] E. Voce, A practical strain-hardening function, Metallurgia 51 (307) (1955) 219–226.
[64] G. Cowper, P. Symonds, Strain Hardening and Strain-Rate Effects in the Impact Loading of Cantilevered Beams, Providence, Rhode
Island, 1957.
[65] M. Pedergnana, S. García, Smart sampling and incremental function learning for very large high dimensional data, Neural Netw. 78
(2016) 75–87.
[66] G. Albuquerque, T. Lowe, M. Magnor, Synthetic generation of high-dimensional datasets, IEEE Trans. Vis. Comput. Graphics 17 (12)
(2011) 2317–2324.
[67] T. White, Sampling generative networks, 2016, arXiv preprint arXiv:1609.04468.
[68] S. Garud, I. Karimi, M. Kraft, Design of computer experiments: A review, Comput. Chem. Eng. 106 (2017) 71–95.
[69] M. McKay, R. Beckman, W. Conover, Comparison of three methods for selecting values of input variables in the analysis of output
from a computer code, Technometrics 21 (2) (1979) 239–245.
[70] A. Giunta, S. Wojtkiewicz, M. Eldred, Overview of modern design of experiments methods for computational simulations, in: 41st
Aerospace Sciences Meeting and Exhibit, 2003.
[71] H. Linshu, Nearly-orthogonal sampling and neural network metamodel driven conceptual design of multistage space launch vehicle,
Comput.-Aided Des. 38 (6) (2006) 595–607.
[72] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, Tensorflow:
Large-scale machine learning on heterogeneous distributed systems, 2016, arXiv preprint arXiv:1603.04467.
[73] L. Lu, Y. Shin, Y. Su, G. Karniadakis, Dying ReLU and initialization: Theory and numerical examples, 2019, arXiv preprint arXiv:1903.06733.
[74] MathWorks, Statistics and machine learning toolbox in MATLAB, MathWorks (2018).
[75] Society of Automotive Engineers, Instrumentation for Impact Test—Part 1—Electronic Instrumentation, SAE Standard J211-1, 2007.
[76] X. Zhang, H. Zhang, Z. Wen, Axial crushing of tapered circular tubes with graded thickness, Int. J. Mech. Sci. 92 (2015) 12–23.
[77] G. Sun, T. Pang, C. Xu, G. Zheng, J. Song, Energy absorption mechanics for variable thickness thin-walled structures, Thin Wall
Struct. 118 (2017) 214–228.
[78] X. Zhang, Z. Wen, H. Zhang, Axial crushing and optimal design of square tubes with graded thickness, Thin Wall Struct. 84 (2014)
263–274.
[79] Y. Lin, J. Min, Y. Li, J. Lin, A thin-walled structure with tailored properties for axial crushing, Int. J. Mech. Sci. 157 (2019) 119–135.
[80] T. Eller, L. Greve, M. Andres, M. Medricky, A. Hatscher, V. Meinders, A. van den Boogaard, Plasticity and fracture modeling of
quench-hardenable boron steel with tailored properties, J. Mater. Process. Tech. 214 (6) (2014) 1211–1227.
[81] D. Kingma, M. Welling, Auto-encoding variational bayes, 2013, arXiv preprint arXiv:1312.6114.
[82] D. Rezende, S. Mohamed, D. Wierstra, Stochastic backpropagation and approximate inference in deep generative models, 2014, arXiv
preprint arXiv:1401.4082.
[83] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, A. Lerchner, Beta-vae: Learning basic visual
concepts with a constrained variational framework, 2016.
[84] C. Burgess, I. Higgins, A. Pal, L. Matthey, N. Watters, G. Desjardins, A. Lerchner, Understanding disentangling in beta-VAE, 2018,
arXiv preprint arXiv:1804.03599.
[85] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, Learning phrase representations using
RNN encoder–decoder for statistical machine translation, 2014, arXiv preprint arXiv:1406.1078.
[86] J. Zhang, Y. Lin, Z. Song, I. Dhillon, Learning long term dependencies via fourier recurrent units, in: International Conference on
Machine Learning, 2018, pp. 5815–5823.
[87] A. Voelker, I. Kajić, C. Eliasmith, Legendre Memory units: Continuous-time representation in recurrent neural networks, in: NeurIPS
2019, 2019.