The IMA Volumes in Mathematics and Its Applications: Avner Friedman Willard Miller, JR
Volume 58
Series Editors
Avner Friedman Willard Miller, Jr.
Institute for Mathematics and
its Applications
IMA
The Institute for Mathematics and its Applications was established by a grant from the
National Science Foundation to the University of Minnesota in 1982. The IMA seeks to encourage
the development and study of fresh mathematical concepts and questions of concern to the other
sciences by bringing together mathematicians and scientists from diverse fields in an atmosphere
that will stimulate discussion and collaboration.
The IMA Volumes are intended to involve the broader scientific community in this process.
Avner Friedman, Director
Willard Miller, Jr., Associate Director
**********
IMA ANNUAL PROGRAMS
1982-1983 Statistical and Continuum Approaches to Phase Transition
1983-1984 Mathematical Models for the Economics of Decentralized
Resource Allocation
1984-1985 Continuum Physics and Partial Differential Equations
1985-1986 Stochastic Differential Equations and Their Applications
1986-1987 Scientific Computation
1987-1988 Applied Combinatorics
1988-1989 Nonlinear Waves
1989-1990 Dynamical Systems and Their Applications
1990-1991 Phase Transitions and Free Boundaries
1991-1992 Applied Linear Algebra
1992-1993 Control Theory and its Applications
1993-1994 Emerging Applications of Probability
1994-1995 Waves and Scattering
1995-1996 Mathematical Methods in Material Science
IMA SUMMER PROGRAMS
1987 Robotics
1988 Signal Processing
1989 Robustness, Diagnostics, Computing and Graphics in Statistics
1990 Radar and Sonar (June 18 - June 29)
New Directions in Time Series Analysis (July 2 - July 27)
1991 Semiconductors
1992 Environmental Studies: Mathematical, Computational, and Statistical Analysis
1993 Modeling, Mesh Generation, and Adaptive Numerical Methods
for Partial Differential Equations
1994 Molecular Biology
**********
SPRINGER LECTURE NOTES FROM THE IMA:
The Mathematics and Physics of Disordered Media
Editors: Barry Hughes and Barry Ninham
(Lecture Notes in Math., Volume 1035, 1983)
Orienting Polymers
Editor: J.L. Ericksen
(Lecture Notes in Math., Volume 1063, 1984)
New Perspectives in Thermodynamics
Editor: James Serrin
(Springer-Verlag, 1986)
Models of Economic Dynamics
Editor: Hugo Sonnenschein
(Lecture Notes in Econ., Volume 264, 1986)
W.M. Coughran, Jr. Julian Cole
Peter Lloyd Jacob K. White
Editors
Semiconductors
Part I
With 55 Illustrations
Springer-Verlag
New York Berlin Heidelberg London Paris
Tokyo Hong Kong Barcelona Budapest
W.M. Coughran, Jr.
AT&T Bell Laboratories
600 Mountain Ave., Rm. 2T-502
Murray Hill, NJ 07974-0636 USA

Julian Cole
Department of Mathematical Sciences
Rensselaer Polytechnic Institute
Troy, NY 12180 USA

Peter Lloyd
AT&T Bell Laboratories
Technology CAD
1247 S. Cedar Crest Blvd.
Allentown, PA 18103-6265 USA

Jacob K. White
Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
50 Vassar St., Rm. 36-880
Cambridge, MA 02139 USA
Series Editors:
Avner Friedman
Willard Miller, Jr.
Institute for Mathematics and its
Applications
University of Minnesota
Minneapolis, MN 55455 USA
Mathematics Subject Classifications (1991): 35-XX, 60-XX, 76-XX, 76P05, 81UXX, 82DXX,
35K57, 47N70, 00A71, 00A72, 81T80, 93A30, 82B40, 82C40, 82C70, 65L60, 65M60, 94CXX,
34D15, 35B25
Current Volumes:
Volume 1: Homogenization and Effective Moduli of Materials and Media
Editors: Jerry Ericksen, David Kinderlehrer, Robert Kohn, J.-L. Lions
Volume 10: Stochastic Differential Systems, Stochastic Control Theory and Applications
Editors: Wendell Fleming and Pierre-Louis Lions
Volume 20: Coding Theory and Design Theory Part I: Coding Theory
Editor: Dijen Ray-Chaudhuri
Volume 21: Coding Theory and Design Theory Part II: Design Theory
Editor: Dijen Ray-Chaudhuri
Volume 42: Partial Differential Equations with Minimal Smoothness and Applications
Editors: B. Dahlberg, E. Fabes, R. Fefferman, D. Jerison, C. Kenig and
J. Pipher
Forthcoming Volumes:
Phase Transitions and Free Boundaries
Free Boundaries in Viscous Flows
Applied Linear Algebra
Linear Algebra for Signal Processing
Linear Algebra for Control Theory
Summer Program Environmental Studies
Environmental Studies
Control Theory
Robust Control Theory
Control Design for Advanced Engineering Systems: Complexity, Uncertainty,
Information and Organization
Control and Optimal Design of Distributed Parameter Systems
Flow Control
Robotics
Nonsmooth Analysis & Geometric Methods in Deterministic Optimal Control
Systems & Control Theory for Power Systems
Adaptive Control, Filtering and Signal Processing
Discrete Event Systems, Manufacturing Systems, and Communication Networks
Mathematical Finance
FOREWORD
SEMICONDUCTORS, PART I
Avner Friedman
Foreword
Preface
SEMICONDUCTORS, PART I
Process Modeling
Circuit Simulation
Device Modeling
Werner Liniger
Thomas J. Watson Research Center
LIST OF PARTICIPANTS
Abstract. This paper gives an overview of predictive Technology CAD tools for simulating
and modeling the fabrication and electrical behavior of integrated circuits. Recent trends in the
integration of process, device, and circuit simulation tools, and the current emergence of UNIX-
based computing environments of networked workstations make possible user-friendly, task-based
CAD systems for technology optimization, characterization, and cell design.
1. Introduction.
In the electronics industry, CAE and CAD tools and systems have played a critical
role in reducing nonrecurring engineering (NRE) costs, improving product quality
and shortening time-to-market intervals. The productivity and quality gains in the
semiconductor electronics industry stem from both shorter design intervals and more
robust design verification keyed to process capability, and have depended extensively
on simulation. At the lowest level of the CAD tool hierarchy are the circuit, device
and process simulators, which link circuit design to fabrication. Detailed process
and device simulation can play a key role in generating data for modeling of circuit
performance prior to fabrication. Predictive capabilities enable early delivery of
accurate compact models which are critical to high performance cell and detailed
sub-system design. These Technology CAD tools for modeling the fabrication and
electrical behavior of integrated circuits are rapidly gaining in maturity. Smooth
interfacing and integration of the various modeling tools has been a recent trend
particularly in industries where Technology CAD has become an integral part of IC
development and is given organizational focus [1].
Initially, predictive Technology CAD (TCAD) tools are generally a substitute
for physical experimentation to save time, effort and money, and to provide addi-
tional insight. Later, tools are integrated into a TCAD system and an optimization
capability is added to aid in the evaluation of competing technology alternatives.
Furthermore, it is recognized that the manufacturing process has inherent variabil-
ity as do the operating and physical environments in which the products have to
work. These variations cause product behavior to deviate from the nominal design
resulting in a reduction of yield. Traditionally, circuits have been designed by us-
ing a worst-case approach, often sacrificing either performance or yield. Technology
CAD tools can be used to make the production process and device and circuit design
less sensitive to the inherent variations. Figure 1.1 illustrates various components
in a TCAD system and their relationship in the context of IC manufacturing and
design.
The evolution of IC technologies is primarily driven by the need for small, light,
fast, low power, reliable electronic circuits in military and commercial systems and
*Presented 7/15/91 at the Summer Program on Semiconductors, Institute for Mathematics and
Its Applications, University of Minnesota.
** AT&T Bell Laboratories, Allentown, Pennsylvania 18103.
[Figure 1.1. Components of a TCAD system in the context of IC manufacturing and design. Legible block labels: Device Characterization (simulate device electrical behavior); Interconnect and Reliability Modeling (hot electrons, capacitance studies); Compact Modeling and Parameter Fits (fit compact models to device characteristics); Circuit Simulation (simulate DC, AC and transient circuit behavior); Performance Prediction and Optimization (cells and circuits).]
TCAD is maturing and several TCAD vendors have entered the market place.
With support from universities, the CAD Framework Initiative (CFI) and a number
of semiconductor companies, framework standards for TCAD are emerging. The
advances in TCAD framework will help streamline TCAD applications. Neverthe-
less, there are many challenging issues yet to be addressed, for example, efficient
simulation on massively parallel compute environments.
This paper provides an overview of TCAD tools and integration of these tools
into TCAD systems for deployment and development. These tools include those
for process, device and circuit simulation, parameter extraction and optimization.
TCAD tools can be used independently or coupled together to form tasks. Appli-
cations of these tools and systems at AT&T to technology development and circuit
design for optimization, characterization, and verification will be illustrated.
2. Process simulation.
Accurate simulation of an IC device structure begins with an accurate repre-
sentation of the geometry and material properties of the structure. To obtain this
representation a simulation of the individual process steps involved in fabricating
the device is performed. A typical silicon IC process consists of several sequences of
patterning the Si wafer, locally implanting impurities (acceptors or donors) into
the exposed regions, followed by high temperature activation to get the impurities
into the regions desired at the proper concentrations. Key device characteristics
such as switching speed and threshold voltage depend on precise control of the
doping profiles in the device. To prevent numerous iterations of processing wafers
to achieve the desired device characteristics, simulation of the incorporation and
diffusion of the impurities is necessary. Once the profile of acceptors and donors in
the device is simulated, this profile can be used as input into a device simulator to
predict device characteristics before fabrication.
Ion implantation is the most common method of locally incorporating impurities
into a silicon wafer. High energy ions of the species of interest (typically boron,
arsenic, or phosphorus) are accelerated and bombarded into the silicon surface.
The resultant distribution of ions in the wafer is usually described by a Pearson
IV distribution in the vertical direction, and an error function distribution in the
lateral directions where vertical mask edges exist. The moments of the distributions
are determined by the kinetic energy and the mass of the ion. Ion implantation also
results in a distribution of interstitial point defects and longer range defects which
must also be accounted for.
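The shape of an as-implanted profile near a mask edge can be sketched in a few lines. The sketch below is illustrative only: it substitutes a Gaussian for the Pearson IV vertical distribution named above (a Gaussian uses only the first two moments), keeps the error-function lateral roll-off, and ignores point defects. The function name `implant_concentration` and all numerical values (dose, projected range, straggles) are hypothetical and are not taken from the text.

```python
import math

def implant_concentration(x_um, depth_um, dose_cm2, rp_um, drp_um, dlat_um):
    """Implanted concentration (cm^-3) near a vertical mask edge at x = 0.

    The open window is x < 0; the mask covers x > 0.
    Vertical shape: Gaussian with projected range rp_um and straggle drp_um
    (a stand-in for the Pearson IV distribution, which also uses higher moments).
    Lateral shape: complementary error function with lateral straggle dlat_um.
    """
    um_to_cm = 1.0e-4
    # Peak of a Gaussian that integrates to the dose over depth.
    peak = dose_cm2 / (math.sqrt(2.0 * math.pi) * drp_um * um_to_cm)
    vertical = peak * math.exp(-0.5 * ((depth_um - rp_um) / drp_um) ** 2)
    # Tends to 1 deep inside the window and to 0 far under the mask.
    lateral = 0.5 * math.erfc(x_um / (math.sqrt(2.0) * dlat_um))
    return vertical * lateral

# Hypothetical implant: dose in cm^-2, lengths in micrometers.
DOSE, RP, DRP, DLAT = 1.0e13, 0.10, 0.05, 0.04
for x in (-0.5, 0.0, 0.1):
    print(x, implant_concentration(x, RP, DOSE, RP, DRP, DLAT))
```

At the mask edge the concentration is half of its value far inside the window, which is the behavior the error-function description implies.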
After the impurities are incorporated into the wafer, they must be activated, or
moved onto lattice sites. This is done with a high temperature anneal process which
not only activates the impurities, but causes diffusion of the impurities as well. The
rate of diffusion is dependent on the silicon temperature, the gradient of the impurity
concentration, the local electric field, and the local point defect concentration by
which the impurities may move. Often the silicon surface is exposed to an oxidizing
ambient during diffusion either to grow a thin oxide for the gate of the device,
a thick oxide to isolate devices, or an oxide to be used as the mask layer for a
future process step. This oxidation results in a movement of the surface boundary,
segregation of the impurities at the interface of the growing oxide, and generation
of interstitial defects which diffuse into the silicon. All of these phenomena must be
taken into consideration to accurately calculate the resultant impurity profile. The
diffusion equations are solved numerically on an appropriate multi-dimensional grid
and with appropriate time steps so as not to add numerical errors to the solution in
addition to errors introduced by assumptions on which the formulations are based.
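The role of the grid and the time step can be illustrated with the simplest possible case. The sketch below assumes a constant diffusivity, one space dimension, a fixed grid, and zero-flux boundaries; none of the couplings listed above (electric field, point defects, moving oxide boundary, segregation) are included. It uses an explicit finite-difference scheme, chosen here for brevity and not attributed to any particular process simulator, and picks the time step from the stability limit dt <= dx^2 / (2 D). The function name and all numbers are hypothetical.

```python
import math

def diffuse_1d(conc, d_coef, dx, total_time, safety=0.4):
    """Explicit (forward-time, centered-space) diffusion with constant d_coef.

    conc       : list of concentrations on a uniform grid (any consistent unit)
    d_coef     : diffusivity (cm^2/s), assumed constant
    dx         : grid spacing (cm)
    total_time : anneal time (s)
    safety     : fraction of dx^2/D used per step; must stay below 0.5 for stability
    """
    dt_max = safety * dx * dx / d_coef
    n_steps = max(1, math.ceil(total_time / dt_max))
    dt = total_time / n_steps
    r = d_coef * dt / (dx * dx)          # r <= safety < 0.5
    c = list(conc)
    last = len(c) - 1
    for _ in range(n_steps):
        new = c[:]
        for i in range(len(c)):
            # Zero-flux boundaries: the ghost value equals the edge value.
            left = c[i - 1] if i > 0 else c[i]
            right = c[i + 1] if i < last else c[i]
            new[i] = c[i] + r * (left - 2.0 * c[i] + right)
        c = new
    return c

# Hypothetical example: a narrow spike annealed for 30 minutes.
initial = [0.0] * 50 + [1.0e18] + [0.0] * 50
final = diffuse_1d(initial, d_coef=1.0e-14, dx=1.0e-6, total_time=1800.0)
print(max(initial), max(final))
```

With these boundaries the scheme conserves the total dose exactly (up to rounding), which gives a simple check that the discretization itself is not adding or removing impurities.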
In a typical CMOS process, n- and p-channel devices are created on the same
wafer. The impurity profile of the initial p-type substrate can be used to control the
threshold voltage of the n-channel device, but a deep tub of n-type impurities must
be created for the p-channel device. The wafer is covered with photoresist, then the
n-type tub region is exposed, implanted and diffused. Next, a new mask is placed
on the surface which protects the regions where the devices will be formed, but
exposes the regions between. A thick oxide to electrically isolate devices is grown
at a high temperature. The mask over the active regions is stripped, and the thin
gate oxide is grown. This is protected immediately with a layer of polysilicon for
the gate material which serves as the mask in the next step for implanting a high
concentration of acceptors into regions on either side of the p-channel gate to create
the source and drain regions of the device. The same is then done for the n-channel
device with a high concentration of donor ions. The wafer is annealed to activate
these source and drain regions, after which the impurity profiles in the devices are
essentially formed. Once simulated in a process simulator [2], as shown in Figure
2.1, these profiles are stored in disk files and can be used by a device simulator to
calculate current-voltage and charge-voltage characteristics.
3. Device modeling.
As semiconductor devices continue to shrink in size, and as new technologies
emerge, device structures become more complicated and the need for physically-based
numerical device simulation grows. Simulation tools are needed for the design of
devices and in order to gain insight into new physical effects. At present, device
modeling has become a necessary and integral element in any new process or tech-
nology development effort.
Device modeling is accomplished by solving the basic equations governing the
behavior of semiconductor devices, namely Maxwell's equations of electromagnetism
and the Boltzmann Transport Equation (BTE).
A direct approach to solving the BTE is the Monte Carlo method. This technique
simulates, at a microscopic level, the transport process of mobile carriers. The Monte
Carlo approach has proven to be successful in simulating transport effects. However,
its primary drawback is the enormous cost associated with the long CPU time
required, particularly when coupled with Poisson's equation. The Hydrodynamic
model or the Momentum and Energy Balance equations are alternative approaches
to solving the BTE. However, the simplest form of the transport equation is that of
the Drift and Diffusion model. This model, which can be derived from the
hydrodynamic model, comprises electron and hole current continuity equations coupled
with Poisson's equation.
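For reference, one common textbook statement of the drift-diffusion system is sketched below. The notation is an assumption of this sketch, since the text does not define symbols: $\psi$ is the electrostatic potential, $n$ and $p$ the electron and hole densities, $N_D$ and $N_A$ the ionized donor and acceptor densities, $\mu_{n,p}$ and $D_{n,p}$ the mobilities and diffusivities, $R$ the net recombination rate, $\varepsilon$ the permittivity and $q$ the elementary charge.

$$
\begin{aligned}
\nabla \cdot (\varepsilon \nabla \psi) &= -q\,(p - n + N_D - N_A), \\
\frac{\partial n}{\partial t} &= \frac{1}{q}\,\nabla \cdot \mathbf{J}_n - R, \qquad
\mathbf{J}_n = -q \mu_n n \nabla \psi + q D_n \nabla n, \\
\frac{\partial p}{\partial t} &= -\frac{1}{q}\,\nabla \cdot \mathbf{J}_p - R, \qquad
\mathbf{J}_p = -q \mu_p p \nabla \psi - q D_p \nabla p.
\end{aligned}
$$

The first line is Poisson's equation; the second and third are the electron and hole current continuity equations, each with a drift term proportional to the field $-\nabla\psi$ and a diffusion term proportional to the density gradient.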
The inputs to device simulation tools are typically a description of impurity dop-
ing profiles obtained from process simulation, as discussed in the previous section,
and device geometry as well as bias conditions. The output will be the electrical
responses e.g. steady-state, transient or small-signal waveforms of currents and
voltages at terminals and/or carrier density, electric field or potential distributions
inside the device.
General two-dimensional (2D) or three-dimensional (3D) device simulation pro-
grams, such as PADRE [3], as well as application-specific tools, such as MEDUSA
[4], can be used to solve a wide range of problems.
There are many examples of device simulations applied in both device optimiza-
tion and reliability improvement. In device optimization, various devices are simu-
lated to quantify the effects of short channel length and narrow channel widths using
2D and 3D device simulators. As far as reliability improvement is concerned, de-
vice simulations have been used to refine CMOS device structure to prevent latchup
problems [3] and gain insight into the effects of hot carriers and velocity overshoot.
As dimensions of electronic devices decrease, resulting in faster switching speed,
delay caused by parasitic capacitance and resistance of interconnections becomes
more significant. The RESCAL program [5] has been developed to solve Laplace's
equation in two dimensions to provide fast and accurate values of distributed
capacitance and resistance. RESCAL also produces plots of equipotential and flux
lines to represent visually the distribution of the electric field. Figure 3.1 shows an
example of a contour plot for a structure in which there are two layers of different
dielectric constants, and two trapezoid-shaped conductors in the lower dielectric
layer over a conducting substrate.
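The idea of extracting a capacitance from a solution of Laplace's equation can be sketched on the simplest geometry. The sketch below is not the RESCAL method (reference [5] describes a boundary technique); it swaps in a plain finite-difference Gauss-Seidel iteration on a square grid, with a single dielectric between two horizontal plates and mirror (zero normal field) side walls. The function names, grid sizes and dimensions are hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def solve_laplace(nx, ny, v_top=1.0, v_bottom=0.0, tol=1e-10, max_iter=20000):
    """Gauss-Seidel solution of Laplace's equation on an ny-by-nx square grid.

    Row 0 is the bottom conductor, row ny-1 the top conductor (fixed potentials).
    The left and right edges are mirror boundaries. Requires nx >= 2, ny >= 3.
    """
    v = [[0.0] * nx for _ in range(ny)]
    v[0] = [v_bottom] * nx
    v[ny - 1] = [v_top] * nx
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(1, ny - 1):
            for i in range(nx):
                left = v[j][i - 1] if i > 0 else v[j][1]
                right = v[j][i + 1] if i < nx - 1 else v[j][nx - 2]
                new = 0.25 * (v[j - 1][i] + v[j + 1][i] + left + right)
                max_change = max(max_change, abs(new - v[j][i]))
                v[j][i] = new
        if max_change < tol:
            break
    return v

def capacitance_per_length(v, dx, dy, eps_r):
    """Capacitance per unit length (F/m) from the normal field at the bottom plate."""
    nx = len(v[0])
    flux = 0.0
    for i in range(nx - 1):
        e_left = (v[1][i] - v[0][i]) / dy
        e_right = (v[1][i + 1] - v[0][i + 1]) / dy
        flux += 0.5 * (e_left + e_right) * dx   # trapezoidal rule along the plate
    delta_v = v[-1][0] - v[0][0]
    return EPS0 * eps_r * flux / delta_v

# Hypothetical geometry: 1 um wide, 2 um thick oxide-like dielectric (eps_r = 3.9).
H = 0.1e-6
potential = solve_laplace(nx=11, ny=21)
print(capacitance_per_length(potential, H, H, eps_r=3.9))
```

For this geometry the field is uniform, so the result can be compared with the parallel-plate value (permittivity times width divided by thickness). Structures like the one in the contour plot, with shaped conductors and two dielectric layers, would need interface conditions that this sketch omits.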
4. Compact models.
Circuit simulations are used to verify IC designs based upon compact device
models, so the models must be able to accurately represent characteristics of de-
vices being manufactured. Also, the device characteristics generated from compact
models provide a reference to which the manufacturing process should be controlled
such that the device characteristics will resemble the reference. Compact models
are thus an important link between IC design and manufacturing.
Aggressive IC design places strong demands on compact device models. The
models must accurately represent the DC, AC and transient behavior of
circuits, and often also the circuit noise performance, distortion level and sensitiv-
ity to variations in manufacturing and operating conditions. In addition, compact
models must be computationally efficient, must be simple enough for robust param-
eter extraction, and must model device behavior over a wide range of bias, geometry
and operating temperature. Meeting all these challenges is often difficult; for example,
MOSFETs exhibit an exponential variation of current with applied bias in the
subthreshold region of operation and a polynomial variation of current with applied
bias above threshold. It is also desirable for compact models to have a good basis
in device physics, so that physical understanding can be used to guide model devel-
opment and ongoing model improvements, and so that process variations measured
as changes in test wafer measurements can be mapped, at least to a first order, into
changes in model parameters.
ASIM3, an enhanced version of the model described in [6], is the most advanced
MOSFET model available in the AT&T circuit simulator, ADVICE [7]. ASIM
includes subthreshold conduction, models short and narrow channel effects, and is
based on an advanced mobility model that accounts for mobility reduction due to
gate and backgate fields and due to velocity saturation. ASIM is charge-based, and
accurately models both the partitioning of charge between the source and drain and
the variation of overlap capacitance with bias. Both the geometry and temperature
dependence of MOSFET behavior are modeled by ASIM. ASIM includes models
for noise and substrate injection current. The current and charge models of ASIM
are continuous in function value and derivatives with respect to the applied biases
across all operating regions.
Figure 4.1 shows the output and subthreshold characteristics of ASIM, com-
pared with data from the MEDUSA device simulator. ASIM accurately models the
DC current, output conductance and transconductance of MOSFETs.
[Figure 4.1. Output and subthreshold characteristics of ASIM compared with data from the MEDUSA device simulator. Plot panels are not recoverable from the source; the horizontal axes are bias voltages in volts.]
Major deficiencies have existed in previous MOSFET models. First, most MOS-
FET models are formulated as regional models, that is, different modeling equations
are used in the subthreshold, triode and saturation regions of operation. Regional
models have limited continuity, and display kinks and glitches at the region
boundaries. This causes problems for parameter extraction and DC convergence, limits
the accuracy of distortion analyses, makes some advanced techniques such as ho-
motopy [8] inapplicable to MOS circuits, limits the order of integration that can be
used for transient analyses, and leads to inefficient transient analyses as it causes
small time-steps to be used. Second, most MOSFET models are formulated with
the source node as the reference. This easily causes the model to display asym-
metries with respect to the source and drain, even though MOSFETs usually are
symmetric devices.
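The contrast between a regional model and a single continuous expression can be sketched as follows. This is not the ASIM formulation, which is not given in the text; it is a generic smoothing of the gate overdrive with a softplus function, applied to a saturation-only square-law current. It reproduces the exponential subthreshold behavior and the polynomial above-threshold behavior mentioned earlier with one expression that is smooth in the gate bias. The function name and the parameter values (`vth`, `k`, `n`, `vt`) are hypothetical.

```python
import math

def drain_current_smooth(vgs, vth=0.7, k=1.0e-4, n=1.5, vt=0.02585):
    """Saturation drain current (A) from one expression valid for all gate biases.

    vgs : gate-source voltage (V)
    vth : threshold voltage (V), hypothetical
    k   : transconductance parameter (A/V^2), hypothetical
    n   : subthreshold slope factor, hypothetical
    vt  : thermal voltage (V), about 25.85 mV near room temperature
    """
    x = (vgs - vth) / (2.0 * n * vt)
    # Numerically stable softplus ln(1 + exp(x)).
    if x > 0.0:
        soft = x + math.log1p(math.exp(-x))
    else:
        soft = math.log1p(math.exp(x))
    v_eff = 2.0 * n * vt * soft      # tends to vgs - vth well above threshold
    return 0.5 * k * v_eff * v_eff

for vgs in (0.2, 0.5, 0.7, 1.0, 2.7):
    print(vgs, drain_current_smooth(vgs))
```

Well below threshold the current grows as exp((vgs - vth) / (n vt)); well above threshold it approaches 0.5 k (vgs - vth)^2. Because there is no switch between equations at a region boundary, the current and its derivatives with respect to the gate bias have no kinks, which is the property the regional models lack.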
5. Circuit simulation.
Small feature size and mixed analog and digital components in today's VLSI IC
technologies demand more accurate technology modeling in circuit simulation than
in the past. The emergence of networked workstation environments demands
flexibility in task-oriented procedural simulation and robustness in design centering.
A circuit simulation system normally consists of six components: front-end for
user interactions, macro interface for procedural threads, design centering and opti-
mization, analysis engine, model interface for device models, and graphics display.
- The Front-End
The front-end provides a graphic user interface between the user and the
circuit simulator. In addition, it interacts with a schematic capture program
and also provides remote execution capabilities across the network. User-
friendliness is determined by not only the set of commands but also its look
and feel.
exchange between TCAD simulation tools. Data exchange between TCAD tools is
the first hurdle that must be overcome before tools can be integrated. Integration,
however, is only the first stage of deployment. Other issues such as capability,
accuracy, robustness, ease of use, and user-friendliness all play an important role in
gaining the user's acceptance.
TCAD frameworks can be viewed from two distinct points of view:
• As a framework for the integration of TCAD tools. Examples of such
a framework are the AT&T Mecca system [15] and Intel's Ease system [16],
which integrate process/device and circuit simulation with analysis tools
such as optimization and parameter extraction as shown in Figure 6.1.
• As an environment, and associated set of support tools, for the development
of TCAD tools as shown in Figure 6.1. We are not aware of any significant
previous work in this area.
The integration framework is of immediate benefit to all TCAD practitioners, as
well as TCAD customer organizations: technology development, technology charac-
terization, and manufacturing. The development environment, on the other hand,
is of importance primarily to universities and industry R&D groups engaged in
TCAD tool development.
The next two subsections expand on the above definitions.
[Figure residue. Legible labels: Sensitivity Analysis, Worst-Case Analysis, Optimization, User Interface.]
6.1 Summary. We have discussed two views of TCAD frameworks. These two
views are not exclusive. In fact, achieving a standard for TCAD tool development
would result in increased uniformity across the tools, which would greatly ease the
integration of such tools into a common TCAD system.
In addition to the traditional uses of TCAD in technology development and
characterization, opportunities are being pursued in:
[Figure residue. Legible labels: Problem Definition, User Interface, CENTER, (Approximate), Worst-Case.]
[Figure 7.3 plot residue. Legible labels: "FAST" Prediction, "SLOW" Prediction; horizontal axis: Normal Probability (%), with ticks at 1, 10, 50, 90, 100.]
Figure 7.3. Predicted and Measured Ring Oscillator Frequencies
9. Acknowledgements.
I thank past and present members of the Technology CAD Department at AT&T
Bell Laboratories for their contributions to the work described here. In addition, I
thank Heinz Dirks, Sally Liu, Jim Prendergast and Kishore Singhal for their help
in preparing this paper.
REFERENCES
[1] P. LLOYD, H.K. DIRKS, E.J. PRENDERGAST, AND K. SINGHAL, Technology CAD for Com-
petitive Products, IEEE Trans. Computer-Aided Design, vol. CAD-9, Nov. 1990.
[2] B.R. PENUMALLI, A Comprehensive Two-Dimensional VLSI Process Simulation Program,
BICEPS, IEEE Trans. Electron Devices, vol. ED-30, Sept. 1983.
[3] M.R. PINTO, W.M. COUGHRAN, JR., C.S. RAFFERTY, AND E. SANGIORGI, Device Simulation
for Silicon ULSI, Computational Electronics, Ed. K. Hess, J.P. Leburton, and U. Ravaioli,
Kluwer Academic Publishers, 1991.
[4] W.L. ENGL, R. LAUR, AND H.K. DIRKS, MEDUSA-A Simulator for Modular Circuits, IEEE
Trans. Computer-Aided Design, vol. CAD-1, April 1982.
[5] B.R. CHAWLA AND H.K. GUMMEL, A Boundary Technique for Calculation of Distributed
Resistance, IEEE Trans. Electron Devices, vol. ED-17, Oct. 1970.
[6] S.W. LEE AND R.C. RENNICK, A Compact IGFET Model-ASIM, IEEE Trans. Computer-Aided
Design, vol. CAD-7, Sept. 1988.
[7] L.W. NAGEL, ADVICE for Circuit Simulation, Proc. ISCAS, Houston, 1980.
[8] L. TRAJKOVIC, R.C. MELVILLE, S.-C. FANG, Improving DC Convergence in a Circuit Simu-
lator Using a Homotopy Method, IEEE Custom Integrated Circuits Conference - CICC-91,
San Diego, CA, May 1991.
[9] M.S. TOTH, MakCal: An Application Generator for ADVICE, AT&T Technical Journal,
vol. 70, Jan./Feb. 1991.
[10] S. LIU AND K. SINGHAL, A Statistical Model for MOSFETS, IEEE International Conference
on Computer-Aided Design - ICCAD-85, Santa Clara, CA, Nov. 1985.
[11] K. SINGHAL, C.C. McANDREW, S.R. NASSIF, AND V. VISVANATHAN, The CENTER Design
Optimization System, AT&T Technical Journal, vol. 68, May/June 1989.
[12] G.M. KULL, L.W. NAGEL, S.-W. LEE, P. LLOYD, E.J. PRENDERGAST, AND H.K. DIRKS,
A Unified Circuit Model for Bipolar Transistors including Quasi-Saturation Effects, IEEE
Trans. Electron Devices, vol. ED-32, June 1985.
[13] S. LIU, K.C. HSU, AND P. SUBRAMANIAM, ADMIT-ADVICE Modeling Interface Tool, Proc.
1988 Custom Integrated Circuits Conference, Rochester, 1988.
[14] S.G. DUVALL, An Interchange Format for Process and Device Simulation, IEEE Trans.
Computer-Aided Design, vol. CAD-7, July 1988.
[15] E.J. PRENDERGAST, An Integrated Approach to Modeling, Proc. NASCODE IV, Dublin,
June 1985.
[16] J. MAR, K. BHARGAVAN, S.G. DUVALL, R. FIRESTONE, D.J. LUCEY, S.N. NANGAONKAR, S.
WU, K.-S. YU, AND F. ZARBAKHSH, EASE-An Application-Based CAD System for Process
Design, IEEE Trans. Computer-Aided Design, vol. CAD-6, Nov. 1987.
THE BOLTZMANN-POISSON SYSTEM
IN WEAKLY COLLISIONAL SHEATHS
with the electric field in the z direction. For the sake of brevity, the sheath is assumed
to be composed purely of ions; we define the presheath/sheath boundary as the point
beyond which electrons are significantly depleted. Since, in most planar discharges,
the neutral density $n_g$ is much higher than the ion density $n$, we take into account
only two-body ion-neutral collisions. We also assume that the ions and the neutrals
have equal mass, and that the neutrals are uniformly distributed and cold, i.e., the
velocity distribution of the neutrals is given by $F(\mathbf{v}) = n_g \delta(\mathbf{v})$.
The ion distribution function $f(\mathbf{v}, z)$ is then governed by the following
nondimensionalized Boltzmann-Poisson system:
(1) Uz
af -
a..r
d,p af- d
- dr -a = -,- u
.. Uz /lmfp
{)4
J(- feu ,
u'
U
-, U
()-iff! -
U!o!al
- }
feu, () ,
(2)
where $u = |\mathbf{u}|$ and $u' = |\mathbf{u}'|$. Here we have used the normalizations
f
(3)
v uWpid, z (d,
d2n U!o!al. Xd d. l•
(4) U .. = g;-SlD2" X 0/,
where $\chi$ is the polar scattering angle measured in the laboratory system, and $\psi$ is the
azimuthal scattering angle about the $\mathbf{u}'$ direction. Note that $u = u' \cos\chi$.
We now solve Eqs. (1) and (2) in the limit of weak collisionality, $\epsilon = d/\lambda_{\mathrm{mfp}} < 1$.
For the sake of simplicity, the ions are assumed to enter the sheath with a fixed
velocity $v_B$ as a beam, so the boundary conditions for $f$ are given by
for $u_z \ge 0$,
(5)
$f(\mathbf{u}, \zeta = 1) = 0$ for $u_z < 0$.
Here $u_B = v_B/(\omega_{pi} d) > 0$, and $u_\perp$ is the magnitude of the component of $\mathbf{u}$ perpendicular
to the $z$ direction. It is known that the initial ion stream velocity $v_B$ is typically given
by the ion sound speed $v_B = (k_B T_e/m)^{1/2}$, where $k_B$ is the Boltzmann constant, $T_e$
is the electron temperature of the bulk plasma and $m$ is the ion mass (this is the
Bohm sheath criterion; see [8]). The boundary values for the potential are given by
$\phi = 0$ and $d\phi/d\zeta = E_I$ at $\zeta = 0$, where $E_I$ denotes the (normalized) magnitude of
the electric field at the presheath/sheath boundary.
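As a quick numerical illustration of the ion sound speed $(k_B T_e/m)^{1/2}$, the sketch below evaluates it for an assumed argon discharge with an electron temperature of 3 eV. Both the gas and the temperature are hypothetical choices, not values from the text, and the function name `ion_sound_speed` is introduced here for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge (C); 1 eV = E_CHARGE joules
AMU = 1.66053906660e-27      # atomic mass unit (kg)

def ion_sound_speed(te_ev, ion_mass_amu):
    """Ion sound speed (m/s) with k_B * T_e supplied in electron-volts."""
    return math.sqrt(te_ev * E_CHARGE / (ion_mass_amu * AMU))

# Hypothetical plasma: argon ions (39.948 amu), electron temperature 3 eV.
v_b = ion_sound_speed(3.0, 39.948)
print(v_b)
```

For these assumed values the speed is a few kilometers per second. Converting it to the normalized $u_B$ would additionally require the ion plasma frequency and the sheath thickness $d$, which are not specified here.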
Assuming that the dependence of $f$ and $\phi$ on $\epsilon$ is analytic, we expand the ion
distribution function $f$ and the potential $\phi$ in terms of the small parameter $\epsilon$ in the
form $f = f_0 + \epsilon f_1 + \cdots$ and $\phi = \phi_0 + \epsilon \phi_1 + \cdots$. To the lowest order, we obtain from
Eqs. (1) and (2) the following equations for a collisionless sheath:
(6)    $u_z \dfrac{\partial f_0}{\partial \zeta} - \dfrac{d\phi_0}{d\zeta} \dfrac{\partial f_0}{\partial u_z} = 0,$
(7)
(8)
Eq. (6) becomes $u_z\, \partial f_0(u_\perp, \mathcal{E}, \zeta)/\partial \zeta = 0$ and its solution with the boundary condition
(5) is given by
(9)
_=
fa
{~~,,~)8( /U - UB) if U z > 0 and & 2: 0,
o otherwise.
The lowest-order potential $\phi_0$ may be calculated through substitution of Eq. (9)
into Eq. (7), i.e.,
(10)    $\dfrac{d^2 \phi_0}{d\zeta^2} = \dfrac{-u_B}{\sqrt{u_B^2 - 2\phi_0}}.$
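Equation (10) can also be integrated numerically, which gives a simple check on any closed-form expression. The sketch below uses a classical fourth-order Runge-Kutta step, an assumption of this sketch and not a method described in the text. The initial slope is taken as nonpositive so that the potential decreases into the sheath; the value of $u_B$ is hypothetical, and the slope magnitude $2.8 \times 10^{-4}$ is used only as an example. Multiplying Eq. (10) by $d\phi_0/d\zeta$ and integrating once shows that $\tfrac{1}{2}(d\phi_0/d\zeta)^2 - u_B\sqrt{u_B^2 - 2\phi_0}$ is constant along the solution, and the code uses this invariant to monitor accuracy.

```python
import math

def integrate_sheath(u_b, slope0, zeta_max=1.0, n_steps=4000):
    """RK4 integration of phi'' = -u_b / sqrt(u_b**2 - 2*phi), phi(0) = 0.

    u_b    : normalized initial ion velocity (hypothetical value in the example)
    slope0 : d(phi)/d(zeta) at zeta = 0; assumed <= 0 so that phi stays <= 0
             and the square root stays real
    Returns lists of zeta, phi and d(phi)/d(zeta).
    """
    def accel(phi):
        return -u_b / math.sqrt(u_b * u_b - 2.0 * phi)

    h = zeta_max / n_steps
    zeta, phi, dphi = 0.0, 0.0, slope0
    zs, phis, dphis = [zeta], [phi], [dphi]
    for _ in range(n_steps):
        k1p, k1v = dphi, accel(phi)
        k2p, k2v = dphi + 0.5 * h * k1v, accel(phi + 0.5 * h * k1p)
        k3p, k3v = dphi + 0.5 * h * k2v, accel(phi + 0.5 * h * k2p)
        k4p, k4v = dphi + h * k3v, accel(phi + h * k3p)
        phi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        dphi += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        zeta += h
        zs.append(zeta)
        phis.append(phi)
        dphis.append(dphi)
    return zs, phis, dphis

def first_integral(u_b, phi, dphi):
    """Quantity conserved by the equation above: 0.5*phi'^2 - u_b*sqrt(u_b^2 - 2*phi)."""
    return 0.5 * dphi * dphi - u_b * math.sqrt(u_b * u_b - 2.0 * phi)

zs, phis, dphis = integrate_sheath(u_b=0.05, slope0=-2.8e-4)
print(phis[-1], first_integral(0.05, phis[0], dphis[0]), first_integral(0.05, phis[-1], dphis[-1]))
```

The two printed values of the invariant should agree closely, and the computed potential is non-positive and decreasing over the whole interval.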
The exact, closed-form solution of Eq. (10) is derived in [10], where it is shown that
$\phi_0$ is a non-positive ($\phi_0 \le 0$) monotonically decreasing function for all $\zeta \ge 0$. It is
known that in the limit $u_B^2 \ll -2\phi_0$, the solution of Eq. (10) gives the collisionless
Child-Langmuir law [9] [10]:
(11)
(12)
where
B+ = uJ (~)41o(u/)~<fn
--1
u Utotal
or
...!,o( Vii - UB)
al~ = { "U
z
if U z > 0 and h ~ 0,
(15) a(
o otherwise,
(19)    $y = -2\phi_0(\zeta)$.
Evidently $y \ge 0$ and $y \ge -2\mathcal{E}$. Then Eq. (15) becomes
aft _ O(-./h-UB)
(20)
8y - 2IPo7r(2& + y)
21
if $u_z > 0$ and $h \ge 0$. Otherwise $\partial f_1/\partial y = 0$. Here $\phi_0' = d\phi_0/d\zeta$ is evaluated at
$\zeta = \zeta(y) = \phi_0^{-1}(-y/2)$. The function $h$ may be written in terms of $\mathcal{E}$, $u_\perp$, and $y$ as
$h = 2(u_\perp^2 + \mathcal{E}) + \dfrac{u_\perp^4}{2\mathcal{E} + y},$
which is a monotonically decreasing function of $y$ ($> -2\mathcal{E}$). In integrating Eq. (15),
we find
(21)
if
(22)
(23) and
fe(O, z)
(25)
where
(26)
In this section, we are concerned only with angular distributions for $\theta > 0$ and do not
count ballistic ions ($\theta = 0$) whose distribution function is given by the $\delta$ function.
In order to carry out the integration of Eq. (25), we need to determine the range
of the integration variable $u$ for which $f_1$ is given by Eq. (21). From the inequality
(22) and Eq. (26), we obtain the condition
(27)
Substituting Eq. (26) into the inequality (23) yields the condition
(28)
The discriminant of this quadratic equation for $u^2$ is given by
(29)
(30)
(31)    $\dfrac{u_B^2}{4y} \ge \dfrac{\sin^2\theta}{\cos^4\theta} \quad (\Longleftrightarrow D \ge 0)$
(32) and
It is easy to show that $(u_B^2 + y)\cos^2\theta \ge y + (u_B^2/2)\cos^2\theta + \sqrt{D}/2$ for all $y \ge 0$ and $\theta$.
It should be noted that the term $u_B^2/y = m v_B^2/2q|\Phi_0|$ denotes the ratio of the initial
ion kinetic energy to the zeroth-order potential energy $\Phi_0$ at $z = \zeta d$ and typically
takes a small value.
In the case $u_B^2/4y = m v_B^2/8q|\Phi_0| < \sin^2\theta/\cos^4\theta$ where the inequality (30) holds,
therefore, the angular distribution of the ion flux is given by
ro(O,z) =
(33)
where the relation $u_B^2 - 2(u_\perp^2 + \mathcal{E}) = (u_B^2 + y) - u^2(1 + \sin^2\theta)$ is used. For small
angles $\theta$ satisfying $u_B^2/4y \ge \sin^2\theta/\cos^4\theta$, the range of integration of Eq. (33) needs
to be changed according to the inequality (32).
If the electric field is constant, then the term $-\phi_0' = E_I$ may be taken outside of
the integration and we can in fact carry out the integration:
Equation (35) gives the profile of the ion flux angular distribution.
If the initial velocity $v_B$ is sufficiently small, so that the condition $u_B^2/4y \ll 1$ is
satisfied, then the inequality $u_B^2/4y = m v_B^2/8q|\Phi_0| < \sin^2\theta/\cos^4\theta$ holds for most of
$\theta > 0$, i.e., $u_B/2\sqrt{y} \lesssim \theta \le \pi/2$. In this case, Eq. (34) may be further simplified with
the use of $u_B^2 \ll -\phi_0 = E_I \zeta$ and given in dimensional form by
(36) ro(O,z)
roz-90(0).
= -,
/lmfp
where the boundary condition d</>o/ d( = EI is used. Carrying out this integration
and substituting cPo = -Ye/2 yields
(37)
where f{ = EJ/2u1- 1 (> -1). It is shown in [10] that the potential </>0 is a weak
function of f{ for realistic values of I{ (-1 < f{ < ~). The function Ye given in
Eq. (24) satisfies
(38)
Equations (37) and (38) give an expression for the dependence of cP~( (e) on u and O.
Introducing
(39) ~=--,
U
U max
a=--
UB
U max
with U max = Ju1 + y,
(40)
From Eq. (33), the angular distribution of the ion flux is then given by
(41)
Here the range of integration lea) is given as follows: from the inequality (30),
if (¢=} u1<sin
-
4y
8)
2
- 4-
cos 8 '
Although u1j4y is generally small and the angular distribution for most values of 8
is given by the integration over lea) above, an accurate account of the small-angle
distribution must be given by a different integration range lea). From the inequality
(32),
if
Q> 2sin8
- (1 + sin2 8)
(44) then
Here iJ = a 2 (a 2 cos 4 8 - 4(1 - ( 2 ) sin 2 8). We note that the function gee depends
on I< through the term aI< in the function ga(~) of Eq. (40). Since the value Q is
typically small and the dependence of the potential 4>0 on the parameter I< is known
to be weak (10), the parameter dependence of gee on I< is also weak.
Figure 1 shows a comparison of the theoretically predicted angular distributions
and Monte Carlo simulation results in the case $d/\lambda_{\rm mfp} = 0.14$. The electric field used
in the Monte Carlo simulation and the self-consistent-field distribution (Eq. (42), the
solid line) is the solution to Eq. (10), subject to the boundary conditions $\phi_0(0) = 0$
and $E_I = d\phi_0(0)/d\zeta = 2.8 \times 10^{-4}$. The constant-field approximation (Eq. (35)) is
given by the dashed line. The theoretical distribution for the ballistic ion component,
which is a delta function at $\theta = 0$, is not shown here and the Monte Carlo ballistic
ion component, represented by the first bin at $\theta = 0$, is truncated by the frame of
the figure; all curves are normalized so as to enclose unit area. Good agreement
between the analytic distributions and the simulation results is evident in Fig. 1.
where $\eta$ denotes the ratio of the ion kinetic energy to the kinetic energy of the ballistic
ions at $z$, i.e.,
(46) $\eta = \dfrac{\frac{1}{2} m v^2}{\frac{1}{2} m v_B^2 - q\Phi_0} = \dfrac{u^2}{u_B^2 + y}$.
For scattered ions ($\eta < 1$), we have from Eq. (45)
(47)
[Plot omitted; horizontal axis: $\theta/(\pi/2)$, from 0.0 to 1.0.]
FIG. 1. The angular distributions of the ion flux in the case of a self-consistent electric field obtained
from the Monte Carlo simulations (histogram) and Eq. (42) (the solid curve). For comparison, the
formula for the constant-field approximation (Eq. (35)) is also presented as a dashed line. The
dimensionless parameters used here are $d/\lambda_{\rm mfp} = 0.14$, $u_B = 1.0 \times 10^{-2}$, $E_I = 5.1 \times 10^{-3}$, and
$\zeta = 1$.
Here we have used the relation $u_B^2 - 2(u_\perp^2 + \mathcal{E}) = (u_B^2 + y)(1 - (1 + \sin^2\theta)\eta)$.
The range of integration $J(\alpha)$ for $\theta$ is obtained from the conditions (27) and (28):
If $0 \le \eta \le 1 - \alpha^2$, then
(48)
(49)
(50)
where
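(51) $Q_m(\eta; \alpha) = \begin{cases} -\log(1 - \eta) & (0 \le \eta \le 1 - \alpha^2), \\ -\log\alpha^2 & (1 - \alpha^2 < \eta < 1), \end{cases}$
which holds for any $\alpha$ ($0 < \alpha < 1$). We note that the distribution is constant for
$1 - \alpha^2 < \eta < 1$.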
In the case of a self-consistent electric field, the potential $\phi_0$ is obtained by solving
Eq. (10). In this case, the electric field $\phi_0'$ also becomes a function of $\eta$ and $\theta$. As
shown in Eqs. (37) and (38), we may write -<p~('c) = ..j2;;u max gE(O), where
(53)
where
[Plot omitted; horizontal axis: ion energy $\eta$, from 0.0 to 1.2.]
FIG. 2. The energy distributions of the ion flux in the case of a self-consistent electric field obtained
from the Monte Carlo simulations (histogram) and Eq. (54) (the solid curve). For comparison, the
formula for the constant-field approximation (Eq. (51)) is also presented as a dashed line. The
dimensionless parameters used here are the same as those for Fig. 1.
(55)
(56) cf2¢
d(2 =- Jf( -T, (, U z )du.,
(57)
namely, the ions are assumed to be injected into the sheath as a beam in the $z$ direction
with velocity $u_B$. Since the sheath is collisionless, the perpendicular component of the
velocity vanishes, i.e., $u_\perp = 0$, and we may consider that the distribution function
$f$ is already integrated over $u_\perp$. The normalizations used here are thus somewhat
T wt, v./wd,
(58)
f /nJ/(wd), e Wpi/W ,
where ( = z/d and q, = ¢>qn[li?/eo are the same as those in Eq. (3), and Wpi
q.jnJ/meo denotes the ion plasma frequency. The boundary conditions for the po-
tential ¢>(t, z) are given by
(59)
where V(T) = eoV(wt)/(qn[li?) denotes the normalized cathode voltage. Here V(wt)
is given as a time-periodic function in $t$ with period $2\pi/\omega$.
The time average of a function h( T) is defined as
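If the function $h(\tau)$ is $2\pi$-periodic in $\tau$ and satisfies $\langle h \rangle = 0$, then the function $\mathcal{L}h$
is also $2\pi$-periodic in $\tau$ and satisfies $\langle \mathcal{L}h \rangle = 0$. Defining $\phi_0(\zeta) = \langle\phi\rangle$ and $\phi_1(\tau,\zeta) =
\phi - \langle\phi\rangle$, we obtain the characteristic equations for Eq. (55):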
(61)
(62)
We now seek the time-periodic solution of the system (55) and (56) satisfying
in the regime of high frequency, i.e., $\epsilon \ll 1$. When $\epsilon \ll 1$, the solution to the
characteristic equations at $z$ is given [11], up to order $\epsilon^2$, by
(63)
where $\phi_1'(\tau,\zeta) = \partial\phi_1/\partial\zeta$. The derivation of Eq. (63) is based on a two-time-scale
asymptotic expansion in small $\epsilon$. Since $df/d\tau = 0$, or $f$ = constant along the
characteristics, the standard method of characteristics yields the velocity distribution
function at $\zeta$:
(65) (fe) = -2
w
1f
1 0
21</'"
fedt =- nj
2
1fmw
d
1 f(r,(,u z) dr.
0
21<-
(67)
and the discrete phases epi = epi(t:) denote all the distinct solutions ep of the equation
t: = t:",(ep). In the case of sinusoidal time dependence i.e. E(t, z) = -8iP/8z =
Eo(z) + E1(z) coswt, the general expression (66) reduces to
(68)
where
qEI(d)
v± =vo/±--.
mw
Numerical calculations of the energy distribution (f e), based on Eq. (68) and Monte
Carlo simulations, are found in [11].
Equation (64) gives the ion velocity distribution for any given electric field
potential $\phi$. However, in order to obtain a self-consistent electric field profile, one must
solve the Poisson equation (56), using Eq. (64). Carrying out the integration of
Eq. (56), we obtain
(69)
It is easy to see that Eq. (69) may be split into the following two equations:
(70) $\dfrac{d^2\phi_0}{d\zeta^2} = -\dfrac{u_B}{\sqrt{u_B^2 - 2\epsilon^2\phi_0(\zeta)}}$,
(71) $\dfrac{d^2\phi_1}{d\zeta^2} = 0$.
Suppose that the (normalized) time-dependent cathode voltage is given by $V(\tau) =
V_0 + V_1\cos\tau$. Then the boundary conditions for Eqs. (70) and (71) become
(72)
and
(73)
Equation (70) subject to the boundary conditions (72) has a form similar to Eq. (10),
giving a collisionless DC-sheath potential. Equation (71) with the boundary
conditions (73) gives a uniform (i.e., $z$-independent) oscillation field, i.e., $\partial\phi_1/\partial\zeta =
V_1\cos\tau$.
(74)
where $\lambda_d = (\epsilon_0 k_B T_e/n_I q^2)^{1/2}$ denotes the Debye length. The self-consistent solution
from Eqs. (64) and (74) is beyond our present scope and will be discussed elsewhere.
Discussion of the effects of a high-frequency motion of the sharp presheath/sheath
boundary on the ion distribution may be found in [12].
REFERENCES
[1] S. M. Sze, VLSI Technology, McGraw-Hill, New York (1988).
[2] B. Chapman, Glow Discharge Processes, John Wiley & Sons, New York (1980).
[3] J. W. Coburn and E. Kay, J. Appl. Phys. 43, 4965 (1972).
[4] W. M. Holber and J. Forster, J. Vac. Sci. Technol. A 8, 3720 (1990).
[5] R. T. Farouki, S. Hamaguchi, and M. Dalvie, Phys. Rev. A 44, 2664 (1991).
[6] S. Hamaguchi, R. T. Farouki, and M. Dalvie, Phys. Rev. A 44, 3804 (1991).
[7] See, for example, L. D. Landau and E. M. Lifshitz, Mechanics, Pergamon Press, Oxford, 1960.
[8] W. P. Allis, "Motion of Ions and Electrons," in Handbuch der Physik, Vol. 21, Springer-Verlag,
Berlin, 1956, p. 383.
[9] C. D. Child, Phys. Rev. 32, 492 (1911); I. Langmuir, Phys. Rev. (Ser. II) 2, 450 (1913).
[10] R. T. Farouki, M. Dalvie, and L. F. Pavarino, J. Appl. Phys. 68, 6106 (1990).
[11] S. Hamaguchi, R. T. Farouki, and M. Dalvie, Phys. Rev. Lett. 68, 44 (1992).
[12] R. T. Farouki, S. Hamaguchi, and M. Dalvie, Phys. Rev. A 45, 5913 (1992).
AN INTERFACE METHOD FOR SEMICONDUCTOR
PROCESS SIMULATION
MICHAEL J. JOHNSON* AND CARL L. GARDNER**
Abstract. The diffusion of dopants in silicon at high temperatures is modeled by a nonlin-
ear parabolic system of partial differential equations on a two-dimensional region with a moving
boundary. A numerical solution using the L-stable TRBDF2 time integration method and a "box
method" spatial discretization is described.
Details are given of the methods used to specify and manipulate curves, and to define arbitrary
simply connected regions by their boundary curves. Numerical experiments are presented com-
paring the divided difference and TR/TR methods for dynamically adjusting the timestep, and
comparing Newton and Newton-Richardson iteration.
1. Introduction.
Semiconductor process simulation¹ models the nonlinear diffusion of dopant
atoms during the thermal annealing of silicon or other semiconductor wafers which
have been doped by ion implantation. When the temperature is raised, the dopant
atoms diffuse in the sample. The diffusivities of the dopant atoms depend on local
dopant concentration and change with time due to a number of transient effects.
During the anneal, portions of the surface of the wafer are allowed to oxidize.
The rate of oxide growth depends in part on local dopant concentration along the
oxide/silicon boundary, so that the mathematical model of this process becomes a
free boundary problem.
The nonlinear diffusion process is described by a set of conservation laws for
impurities
(1) $\dfrac{\partial C_\alpha}{\partial t} = \sum_{\beta} \nabla\cdot\left(D_{\alpha\beta}\nabla C_\beta\right)$
in a simply connected region $\Omega(t)$, where $C_\alpha(x, t)$ is the concentration of the $\alpha$th
species of dopant, $D$ is a matrix of phenomenological diffusion coefficients, and $\alpha$,
$\beta = 1, \ldots, N$ label the types of impurities. Note that $D = D(C_1, \ldots, C_N)$
includes the effects of the coupling of the impurity ions to the electric field.
The boundary $\partial\Omega(t)$ of $\Omega(t)$ represents the union of a silicon/mask interface
and a silicon/oxide interface. Boundary conditions of homogeneous Neumann type
are imposed by the physical constraints that no dopant ions may leave the silicon
region $\Omega(t)$ unless consumed by the growing oxide regions and that no migration
occurs across the oxide/silicon interface:
(2) $(\hat{n}\cdot\nabla C_\alpha)|_{\partial\Omega(t)} = 0$.
The code which positions the moving boundary is distinct from the code that
calculates dopant diffusion. This paper addresses the efficient numerical solution of
the diffusion problem only, using a prescribed boundary $\Omega(t)$.
*IBM Corporation, Endicott, NY, 13760.
**Department of Computer Science, Duke University, Durham, NC 27706. Research supported
in part by the National Science Foundation under grant DMS-8905872.
1 See Ref. [1] for a review.
2. Numerical methods.
We use the composite TRBDF2 method [2] to integrate the solution in time.
To integrate Eq. (1) from $t = t_n$ to $t_{n+1} = t_n + \Delta t_n$, we first apply the trapezoidal
rule (TR) to advance the solution from $t_n$ to $t_{n+\gamma} = t_n + \gamma\Delta t_n$:
(3)
and then use the second-order backward differentiation formula (BDF2) to advance
the solution from $t_{n+\gamma}$ to $t_{n+1}$:
(4)
This composite one-step method is second-order accurate and L-stable [2]. The
importance of L-stability for diffusion is illustrated for a 1D computation in Figure
1. After a single timestep with $\Delta t = 50\,\Delta t_{\rm Euler} = 50\,\Delta y^2/2D_{\max}$, the TR method,
which is A-stable but not L-stable, exhibits severe unphysical oscillations near the
maximum of $C$.
[Figure 1 (plot omitted): $C$ versus $y$ after one time step; curves labeled TR and TRBDF.]
We linearize $F^{n+1}$ in Eq. (4) (and similarly $F^{n+\gamma}$ in Eq. (3)) by approximating
(5)
where $k = 0, 1, \ldots$ labels the Newton iterations, and the Fréchet derivative
(6)
(7) $C^{n+1}_{(k+1)} = C^{n+1}_{(k)} + \lambda\,\delta C_{(k)}, \qquad C^{n+1}_{(0)} = C^{n+\gamma}$,
where $\lambda$ is a damping factor [3] between 0 and 1, chosen to ensure that the norm of
the residual for Eq. (3) or (4) decreases monotonically. At each TR or BDF2 partial
step, we iterate until the Newton method converges.
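The role of the damping factor can be illustrated on a scalar equation. The sketch below is an assumption-laden stand-in: it uses simple step halving to enforce a decreasing residual, not the Bank and Rose formula of [3], and the function name `damped_newton` is hypothetical.

```python
import math
from typing import Callable


def damped_newton(
    f: Callable[[float], float],
    df: Callable[[float], float],
    x0: float,
    tol: float = 1e-12,
    max_iter: int = 50,
) -> float:
    """Newton iteration with a damping factor lam in (0, 1].

    lam is halved until |f| decreases, so the residual norm is monotone.
    This halving rule is a stand-in for the damping formula of Ref. [3].
    """
    x = x0
    for _ in range(max_iter):
        r = f(x)
        if abs(r) < tol:
            return x
        step = -r / df(x)  # full Newton update
        lam = 1.0
        while abs(f(x + lam * step)) >= abs(r) and lam > 1e-10:
            lam *= 0.5
        x += lam * step
    return x


if __name__ == "__main__":
    # Undamped Newton diverges for atan(x) = 0 from x0 = 3; damping recovers it.
    root = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), 3.0)
    print(root)
```

For a system of equations the same loop applies with the absolute value replaced by a vector norm and the division by a linear solve.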
The Newton equation for the TR partial step is
(8)
(9)
$-\left(C^{n+1}_{(k)} - \dfrac{1}{\gamma(2-\gamma)}C^{n+\gamma} + \dfrac{(1-\gamma)^2}{\gamma(2-\gamma)}C^{n}\right) + \dfrac{1-\gamma}{2-\gamma}\,\Delta t_n F^{n+1}_{(k)} = -G_{\rm BDF2}$
(10)
$(\nabla\cdot f)_{\rm average,\ interior\ of\ box} = \dfrac{\oint_{\rm boundary\ of\ box} f\cdot n\,ds}{\rm area\ of\ box}$.
(11) $(\nabla u)\cdot n \approx \dfrac{u(b) - u(a)}{x(b) - x(a)}$,
and
All spatial operators employed in the Newton method described above are of this
type.
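As an illustration of how these two formulas combine, the following minimal sketch evaluates the box-averaged divergence of $\nabla u$ at an interior node of a uniform rectangular grid. The function name, the nested-list storage `u[i][j]`, and the uniform spacings `hx`, `hy` are assumptions made for the example and are not taken from the paper.

```python
from typing import List


def box_laplacian(u: List[List[float]], i: int, j: int, hx: float, hy: float) -> float:
    """Average of div(grad u) over the box centred on interior node (i, j)."""
    # Outward normal gradient on each box face by a central difference, as in Eq. (11).
    flux_east = (u[i + 1][j] - u[i][j]) / hx
    flux_west = (u[i - 1][j] - u[i][j]) / hx
    flux_north = (u[i][j + 1] - u[i][j]) / hy
    flux_south = (u[i][j - 1] - u[i][j]) / hy
    # Boundary integral: east and west faces have length hy, north and south have length hx.
    boundary_integral = (flux_east + flux_west) * hy + (flux_north + flux_south) * hx
    # Divide by the area of the box to obtain the average divergence.
    return boundary_integral / (hx * hy)


if __name__ == "__main__":
    hx, hy = 0.5, 0.25
    grid = [[(i * hx) ** 2 + (j * hy) ** 2 for j in range(4)] for i in range(4)]
    print(box_laplacian(grid, 1, 1, hx, hy))  # Laplacian of x^2 + y^2
```

Each node is coupled only to its four nearest neighbours, which is the origin of the sparsity of the discretized operator.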
To define the box at the boundary $\partial\Omega(t)$, consider, for a moment, point a in
Figure 2 as the origin of a coordinate system, and the dashed box as a unit square.
In our implementation of the box method, the smallest incremental area is one
octant of this unit square. To enable correct identification of interior octants, the
list of boundary points adheres to an orientation convention such that the previous
boundary point defines a starting octant, and the next boundary point defines a
stopping octant, with the interior octants identified by counterclockwise rotation
from previous to next, as shown in Figure 3.
[Diagram omitted; labeled points a and b.]
Figure 2: Box method in the interior of a rectangular grid.
[Diagram omitted; labels: next, previous, oxide, silicon, point a.]
Figure 3: Box method on the boundary of a region.
The box method with central differences couples the solution at one point to
the solution at nearby neighbors; there is no coupling between distant points. As a
result, the matrix representing the spatially discretized operator (and consequently,
the matrix to be solved at each timestep) is sparse. The discretized linear systems
are solved using the sparse matrix package of Bank [2].
The timestep size $\Delta t$ is adjusted dynamically within a window $[\Delta t_{\min}, \Delta t_{\max}]$
by monitoring a divided-difference estimate of the local truncation error $\tau$ [2]:
(15) $k = \dfrac{-3\gamma^2 + 4\gamma - 2}{12(2 - \gamma)}$
The three values of $F$ employed in Eq. (14) have already been calculated in the
most recent TRBDF2 timestep.
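The formula of Eq. (14) does not survive in this copy of the text. The sketch below therefore assumes the divided-difference form commonly quoted for TRBDF2, $\tau^{n+1} \approx 2k\,\Delta t_n\,[F^n/\gamma - F^{n+\gamma}/(\gamma(1-\gamma)) + F^{n+1}/(1-\gamma)]$, together with a cube-root step-size rule. Both the rule and the names `lte_estimate` and `next_timestep` are assumptions for illustration.

```python
import math
from typing import List, Sequence

GAMMA = 2.0 - math.sqrt(2.0)
# Error constant k of Eq. (15).
K = (-3.0 * GAMMA**2 + 4.0 * GAMMA - 2.0) / (12.0 * (2.0 - GAMMA))


def lte_estimate(
    f_n: Sequence[float],
    f_gamma: Sequence[float],
    f_np1: Sequence[float],
    dt: float,
) -> List[float]:
    """Divided-difference estimate of the local truncation error (assumed form)."""
    return [
        2.0 * K * dt * (a / GAMMA - b / (GAMMA * (1.0 - GAMMA)) + c / (1.0 - GAMMA))
        for a, b, c in zip(f_n, f_gamma, f_np1)
    ]


def next_timestep(dt: float, tau_norm: float, tol: float, dt_min: float, dt_max: float) -> float:
    """Rescale dt so the estimated error matches tol, clamped to [dt_min, dt_max]."""
    if tau_norm == 0.0:
        return dt_max
    # The local error of a second-order method scales like dt**3.
    dt_new = dt * (tol / tau_norm) ** (1.0 / 3.0)
    return min(dt_max, max(dt_min, dt_new))


if __name__ == "__main__":
    h = 0.5
    # F sampled from F(t) = t**2 at t = 0, gamma*h and h.
    tau = lte_estimate([0.0], [(GAMMA * h) ** 2], [h * h], h)
    print(tau[0], 2.0 * K * h**3)
    print(next_timestep(1.0, 8.0, 1.0, 0.1, 10.0))
```

Because the three values of $F$ are reused from the step just taken, the estimate costs no additional function evaluations.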
An alternative approximation for $\tau^{n+1}$ involves re-taking the most recent partial
timestep (from $t_{n+\gamma}$ to $t_{n+1}$) using TR instead of BDF2 [6]. (We will refer to the
resulting value of $C$ as $C^{n+1}_{TR/TR}$.) The TR/TR step yields the approximation
for $\gamma = 2 - \sqrt{2}$. Very few Newton iterations are necessary in taking the second TR
step, since $C^{n+1}$ is, in fact, already known. The performances of the TR/TR and
divided difference error estimators are compared in Section 4.
Computational information (such as the solution) is kept only at grid points, the
points of intersection of grid lines. The division of grid blocks (the area bounded by
neighboring grid lines) into many smaller subblocks is not computationally useful
unless information can be associated with the subblocks, which would effectively
yield a refined grid. In other words, if resolution finer than a grid block is required,
then the grid should be refined. (At present, this must be done manually.) For
this reason, multiple crossings of any grid block by a curve are eliminated in an
operation called pruning, which we now describe.
In the following discussion, it will be useful to label as block $(i, j)$ that grid block
which encompasses the area between $x_i$ and $x_{i+1}$ and between $y_j$ and $y_{j+1}$. To prune
a meshed curve, we move along the meshed curve one chord (two consecutive curve
points) at a time. If the chord crosses grid block $(i, j)$, then we increment a counter
box(i,j). If box(i,j) exceeds one, we move backwards along the curve, removing a
point at a time from the curve structure, until box(i,j) = 1. Figure 7 shows the
curve of Figure 6 after pruning.
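One possible reading of the pruning step is sketched below. It is an interpretation, not the authors' implementation: it assumes a uniform grid with spacings `hx` and `hy`, assumes that curve points lie on grid lines so that each chord lies in a single block, and identifies that block from the chord midpoint. The names `block_of_chord` and `prune` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Block = Tuple[int, int]


def block_of_chord(p: Point, q: Point, hx: float, hy: float) -> Block:
    """Grid block containing the chord midpoint (uniform spacing assumed)."""
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    return (math.floor(mx / hx), math.floor(my / hy))


def prune(points: List[Point], hx: float = 1.0, hy: float = 1.0) -> List[Point]:
    """Remove curve points so that no grid block is crossed more than once."""
    kept: List[Point] = [points[0]]
    chord_blocks: List[Block] = []  # chord_blocks[k] is the block of chord kept[k] -> kept[k+1]
    box: Dict[Block, int] = {}
    for p in points[1:]:
        b = block_of_chord(kept[-1], p, hx, hy)
        box[b] = box.get(b, 0) + 1
        # Second crossing of block b: walk backwards, removing one point
        # (and the chord ending at it) at a time, until box[b] is back to 1.
        while box[b] > 1:
            kept.pop()
            removed = chord_blocks.pop()
            box[removed] -= 1
        kept.append(p)
        chord_blocks.append(b)
    return kept


if __name__ == "__main__":
    hook = [(0.0, 0.5), (1.0, 0.5), (1.5, 1.0), (1.0, 1.5), (0.5, 1.0), (0.5, 0.0)]
    print(prune(hook))
```

In this reading the excursion between the first and second crossing of a block is cut out, and the curve passes through that block once.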
modeled here does not exhibit instabilities experimentally, however, and the pro-
jection method is quite appropriate.
To avoid accumulation of the positional error associated with projection, the
projected curve is never propagated. Instead, at each timestep, the exact boundary
curve is calculated and projected. The process of moving a boundary curve itself is
thus straightforward, but after the movement, algorithms of greater complexity are
required to deal with the consequences of the move, as we now describe.
Figure 8: Projected curve.
curves are reassigned after moving the boundary. These areas are called the tran-
sition regions. Figure 10 shows the transition regions (labeled I-IV) formed when
the boundary curve of Figure 8 is moved to the position marked "new."
Boundary curves are always set up to encircle the regions they define in a
clockwise manner. This convention is used to ensure a consistent definition of the
concepts of moving "forward" or "backward" along a curve, and it is also employed
when discretizing via the box method, to decide locally which octants are in the
region.
The algorithm used to define each transition region is as follows:
• Move forward along the old boundary until a point is reached which is not
in the new boundary.
• Move backward along the old boundary one point. Call this point A. Copy
point A into the transition boundary as its first point.
• Move forward along the old boundary, copying each point into the transition
boundary, until another point is reached which is in the new boundary. Call
this point B. Copy point B into the transition boundary.
• Move backward along the new boundary, copying each point into the tran-
sition boundary, until point A is reached. Copy point A into the transition
boundary as its last point.
As can be seen in Figure 10, a single boundary movement can result in many
transition regions. Therefore, the transition region algorithm must be applied again
and again, starting from point B, etc., until the last point in the old boundary is
reached. Surprisingly, this still does not yield all the transition regions. The remain-
der must be found by reversing the roles of new and old above; that is, searching for
points in the new boundary which are not in the old boundary. Transition region
II in Figure 10 is of this type.
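The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: each boundary is a closed curve stored as a list of distinct grid points in a consistent orientation, and the first point of the boundary being walked also lies on the other boundary. The name `transition_regions` is hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def transition_regions(old: List[Point], new: List[Point]) -> List[List[Point]]:
    """Closed transition boundaries found by walking forward along `old`."""
    new_index: Dict[Point, int] = {p: k for k, p in enumerate(new)}
    if old[0] not in new_index:
        raise ValueError("sketch assumes old[0] also lies on the new boundary")
    n, m = len(old), len(new)
    regions: List[List[Point]] = []
    i = 1
    while i < n:
        if old[i] in new_index:
            i += 1
            continue
        a = old[i - 1]  # point A: last shared point before leaving the new boundary
        region = [a]
        j = i
        while old[j % n] not in new_index:  # forward along old until point B
            region.append(old[j % n])
            j += 1
        b = old[j % n]
        region.append(b)
        k = (new_index[b] - 1) % m
        while new[k] != a:  # backward along new until point A
            region.append(new[k])
            k = (k - 1) % m
        region.append(a)  # close the transition boundary at A
        regions.append(region)
        i = j  # continue the search from point B
    return regions


if __name__ == "__main__":
    old = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
    new = [(0, 0), (0, 1), (1, 2), (2, 1), (2, 0)]
    print(transition_regions(old, new))
```

Calling `transition_regions(new, old)` performs the reversed search described in the text. A region detected by both passes appears twice with opposite orientation, so a full implementation would need to discard such duplicates; that step is not shown here.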
4. Numerical experiments.
The numerical experiments described here have been summarized in the form
of tables and graphs below. In the tables, in the column labeled "Case," M is a
medium (40 X 20) grid and F is a fine (80 X 40) grid; SB refers to a stationary
boundary and MB to a moving boundary; the final digit is the number of dopant
species. The column labeled "MN" is the total number of unknowns. (M is the
number of spatial points and N is the number of dopant species.)
The implant (initial data) is the following Gaussian, which yields a total dose
of approximately $10.5 \times 10^{20}$ atoms/cm per species:
In cases of more than one species, the initial profiles are the same except that the
concentration of the second species is 0.9 times that of the first. In all cases, the
simulated annealing time is $T = 30$ min.
For the single species cases, the diffusivity was modeled as $D = aC + b$, with
$a = 5 \times 10^{-33}$ cm$^5$/sec and $b = 0.1 \times 10^{-13}$ cm$^2$/sec. For the dual species cases, the
diffusivity matrix was
                      TR/TR                 DD            $\Delta t_{TR/TR}/\Delta t_{DD}$
Case      MN      CPU     time-       CPU     time-
                  sec.    steps       sec.    steps        min      max
M-SB-1     800     105     12           89     12          1.00     1.09
M-MB-1     800     129     13          115     15          1.04     1.86
M-SB-2    1600     515     11          454     12          1.04     1.08
M-MB-2    1600     615     13          571     15          1.04     1.24
F-SB-1    3200     721     11          644     12          1.05     1.08
F-MB-1    3200     889     14          948     19          1.05     2.26
F-SB-2    6400    5558     11         4846     12          1.05     1.08
F-MB-2    6400    5458     13         6635     19          1.05     2.26
For stationary boundary problems, the divided difference error estimate gives
superior performance. For large moving boundary cases, the TR/TR estimate is
preferable, for the following reason. At the beginning of a timestep the boundary
is moved to a place where $\nabla C$ was formerly nonzero. In calculating $F^n$ (at the
beginning of the timestep), we force $\hat{n}\cdot\nabla C = 0$ on the boundary; but only a short
distance away from the boundary, the initial gradient of $C$ (which was inherited
from the preceding timestep) may be fairly steep. Since $F$ is calculated from second
spatial derivatives, we may expect the initial $F^n$ to contain some error associated
with this effect near the boundary. The other terms in Eq. (14), $F^{n+\gamma}$ and $F^{n+1}$, are
calculated from $C$'s that have undergone diffusion since the boundary was moved,
so they do not contain this error. Since $\tau^{n+1}$ is calculated from differences in these
values, a given relative error in $F^n$ will induce a relatively larger error in $\tau^{n+1}$ and,
hence, in the calculated timestep size.
[Plot omitted; vertical axis: TR/TR CPU / DD CPU, from 0.2 to 1.4; open diamonds: stationary boundary; filled circles: moving boundary.]
Figure 11: Relative CPU usage for TR/TR and divided difference timestep adjusters.
(c) If the damping factor $\lambda_{k+1}$ is less than 1.0, the Jacobian is re-factored. (The
damping factor is calculated by the formula of Bank and Rose [3].)
The performance of this Newton-Richardson method was compared experimen-
tally with that of ordinary Newton iteration. The results are summarized in Table
2. The Newton-Richardson method usually gives a modest performance improve-
ment over the Newton method, and occasionally only a slight degradation, so that
Newton-Richardson may be considered as the method of choice for most nonlinear
diffusion problems.
                            Newton-Richardson            Newton
Case      MN     time-      CPU      total            CPU      total
                 steps      sec.     iterations       sec.     iterations
M-SB-1     800    12          87      51                89      46
M-MB-1     800    15         103      58               115      57
M-SB-2    1600    12         466      49               454      45
M-MB-2    1600    15         521      58               571      57
F-SB-1    3200    12         670      51               644      46
F-MB-1    3200    19         926      74               948      69
F-SB-2    6400    12        4793      59              4846      58
F-MB-2    6400    19        6007      74              6635      74
5. Conclusion.
We have demonstrated an efficient set of algorithms appropriate for modeling
stable interfaces in two spatial dimensions, and we have applied these algorithms
to the solution of a set of nonlinear diffusion equations on a region with a mov-
ing boundary, from the field of semiconductor process modeling. We have shown
that for problems with a stationary boundary, a divided difference error estimator
gives optimal performance, while a TR/TR scheme is preferable with a moving
boundary. We have also demonstrated that in most nonlinear diffusion problems,
Newton-Richardson iteration yields a modest performance improvement over New-
ton iteration.
REFERENCES
[1] R.B. FAIR, C.L. GARDNER, M.J. JOHNSON, S.W. KENKEL, D.J. ROSE, J.E. ROSE, AND
R. SUBRAHMANYAN, Two dimensional process simulation using verified phenomenological
models, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 10 (1991), pp. 643-651.
[2] R.E. BANK, W.M. COUGHRAN, W. FICHTNER, E.H. GROSSE, D.J. ROSE, AND R.K. SMITH,
Transient simulation of silicon devices and circuits, IEEE Transactions on Computer-Aided
Design, vol. CAD-4 (1985), pp. 436-451.
[3] R.E. BANK AND D.J. ROSE, Global approximate Newton methods, Numerische Mathematik,
vol. 37 (1981), pp. 279-295.
[4] R.E. BANK, D.J. ROSE, AND W. FICHTNER, Numerical methods for semiconductor device
simulation, SIAM Journal on Scientific and Statistical Computing, vol. 4 (1983), pp. 416-435.
[5] R.S. VARGA, Matrix Iterative Analysis, Prentice-Hall, 1962.
[6] H.R. YEAGER AND R.W. DUTTON, An approach to solving multiparticle diffusion exhibiting
nonlinear stiff coupling, IEEE Transactions on Electron Devices, vol. ED-32 (1985), pp.
1964-1976.
[7] I.L. CHERN, J. GLIMM, O. McBRYAN, B. PLOHR, AND S. YANIV, Front tracking for gas
dynamics, Journal of Computational Physics, vol. 62 (1986), pp. 83-110.
[8] D.H. SELIM, Tensor product grid implementation for PREDICT2, Master's thesis, Duke
University, 1990.
[9] Microelectronics Center of North Carolina, Research Triangle Park, NC, PREDICT Users
Manual, 1986.
[10] M.J. JOHNSON, Numerical Methods for Semiconductor Process Simulation in Two Spatial
Dimensions: a Nonlinear Diffusion Problem with a Free Boundary, PhD thesis, Duke
University, 1991.
[11] W.M. COUGHRAN, E.H. GROSSE, AND D.J. ROSE, Aspects of computational circuit analysis,
in VLSI CAD Tools and Applications (W. Fichtner and M. Morf, eds.), pp. 105-127, Kluwer
Publishers, Boston, 1986.
ASYMPTOTIC ANALYSIS OF A MODEL FOR THE DIFFUSION
OF DOPANT-DEFECT PAIRS
J.R. KING*
Abstract. Asymptotic methods are applied to a model describing the diffusion through silicon
of a dopant which pairs with both vacancies and self-interstitials. Several different asymptotic
limits are discussed for problems in both one and higher dimensions.
cvcj = CVCj ,
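(1.3) $\dfrac{\partial C_{dI}}{\partial t} = D_{dI}\dfrac{\partial^2 C_{dI}}{\partial x^2} + F_3 - R_3 + R_5 - F_5 + R_6 - F_6$,
(1.4) $\dfrac{\partial C_V}{\partial t} = D_V\dfrac{\partial^2 C_V}{\partial x^2} + R_1 - F_1 + R_2 - F_2 + R_5 - F_5$,
(1.5) $\dfrac{\partial C_I}{\partial t} = D_I\dfrac{\partial^2 C_I}{\partial x^2} + R_1 - F_1 + R_3 - F_3 + R_4 - F_4$,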
where the D's are constant diffusivities and it is assumed that the dopant is unable
to diffuse in its unpaired state. Before proceeding further we make the simplifying
assumption that K2 and K3 are sufficiently large that the relations
(1.6)
may be assumed valid. The governing system is then made up of the algebraic
equations (1.6) together with the following:
(1.9) $\dfrac{\partial}{\partial t}(C_I + C_{dI}) = \dfrac{\partial^2}{\partial x^2}(D_I C_I + D_{dI}C_{dI}) + R_1 - F_1 + R_4 - F_4 + R_5 - F_5 + R_6 - F_6$,
which are obtained by taking suitable combinations of (1.1) - (1.5). We note that
conditions such as (1.6) must hold in order for the governing equation to be the
linear diffusion equation
$\dfrac{\partial C_d}{\partial t} = D_i\dfrac{\partial^2 C_d}{\partial x^2}$
when Cd is small everywhere and Cv '" cv, CI '" cj everywhere, in this case the
intrinsic diffusivity Di being given by
boundary layers will occur in which (1.6) is not valid. Additional timescales are
also necessary to describe initial transients when (1.6) does not hold at t = o.
Using (1.6) we may rewrite (1.8) and (1.9) as
(1.10)
(1.11)
t=Tt ,
where cd is a representative dopant concentration and T is a representative timescale;
we note that V and I now denote dimensionless concentrations. We then obtain
(dropping overbars)
C2.1) Pv = uV, PI = uI ,
u =u s V = 1,1= 1,
rx~o
,
at t = 0 u =0, V = 1,1'= 1,
at t = 0 u = u(x) , V = 1, 1 = 1,
Q=jU(x)dX
o
finite. The initial conditions on V and 1 neglect implantation damage effects, but
these initial conditions will in any case play little role in the subsequent analysis.
It is convenient to write
at x = 0 u=u.(t)j
Wv, WI ~ 10- 3 ,
Morehead and Lever [9] do not include terms corresponding to $\kappa_0$ or $\kappa_2$ and they
assume that $\kappa_1$ is sufficiently large that the generation-recombination term dominates
giving
The parameters used by Richardson and Mulvaney [10] also imply that $r_V$, $r_I$, $r$, $\omega_V$
and $\omega_I$ are small and that $g_V$ and $g_I$ are large.
We shall always therefore consider the limits
In this case the solution has a two region asymptotic structure made up of an
inner region ($x = O(1)$) on the length scale of dopant diffusion and an outer region
($x = O(g^{1/2})$ where $g = (g_V g_I)^{1/2}$) on the much longer length scale of defect diffusion.
In this limit, equations (2.3) and (2.4) imply that for $x = O(1)$ we have at
leading order
(2.7)
which corresponds to the well-established assumption of flux balance (see, for ex-
ample, [11], [8]).
In each of cases (a) and (b) above, equation (2.7) implies that
and the leading order problem in x = 0(1) is governed by (2.8) together with
(2.9) $\dfrac{\partial u}{\partial t} = \dfrac{\partial^2}{\partial x^2}\left((f_V V + f_I I)u\right)$,
In the outer region we write $x = g^{1/2}y$, and for $y = O(1)$, $g \gg 1$ the dopant
concentration $u$ is exponentially small with respect to $g$ and the dominant balance
is given by
oV 02V
at = 'Y oy2 + P (1 -
KO
(2.11) IV) ,
ol 1 o2l
(2.12) at =:y oy2 + KOP (1 - IV) ,
where $\gamma = (g_V/g_I)^{1/2}$ and $\rho = (r_V/r_I)^{1/2}$. Matching the two regions together implies
that the conditions
oV
(2.13) as x -+ +00 u-+O, ox-+O
where V and I denote leading order solutions to the inner problem, then (2.11) and
(2.12) are governed by
at t = 0 V=l, l= 1.
3. Negligible generation-recombination.
so that for x = 0(1) the generation-recombination terms in (2.3) and (2.4) may be
neglected. This implies that at leading order
so that
(3.3)
where
We note that the contributions of vacancy and interstitial mechanisms to the effec-
tive diffusivity Dl are in this case additive. It follows from (3.1) and (3.2) that in
(2.14) we have
The corresponding profile for U will not, however, exhibit at high concentrations
a plateau of the kind that is observed for phosphorus diffusion in silicon (see [3]).
This follows because Dl decreases with increasing u.
We note from (3.6) that if $gr \ll 1$ then the defect supersaturation is very large.
If it is sufficiently large then the number of pairs becomes comparable to the number
of unpaired dopant atoms and the dominant balance of terms changes. This occurs
when $gr = O(\omega)$.
! (Tl
and
+wIu)I) = ::2 (gI+/Ju)I) .
We restrict attention here to the surface source case (2.5), in which case the solution
is self-similar with u == u(x/d), V == V(x/d), I == I(x/d). Guided by (3.6), for
T = O(w/g) we introduce the rescalings
hV=gVTV/W,
! (fu/v)
02
aX2 (hI
~ ~
+ /Ju)I) ;
V'" u./u ,
4. Dominant generation-recombination.
hold, so that for u = 0(1) equations (2.3) and (2.4) imply that at leading order
(4.1) IV=l.
For r = O(l/g) equation (2.8) holds, and it then follows from (4.1) that
(4.4)
with
(4.5)
$D_2(u, u_s) = \dfrac{(g_V r_V V + g_I r_I I)(f_V V + f_I I) + 4 f_V f_I u}{(g_V r_V V + g_I r_I I) + (f_V V + f_I I)u}$
V= 1, 1= 1,
so that
{)u {)2u
&t = {)x 2 •
Of more interest is the case in which $gr \ll 1$ when three different scales for $u$ must
be considered. We assume that $f_I > f_V$; if $f_V > f_I$ the roles of vacancies and
interstitials are interchanged.
We note that in this case the terms $f_V V u$ and $f_I I u$ on the right-hand side of (2.9)
make equal contributions, whatever the value of $f_V/f_I$ (provided that $f_V$, $f_I \ne 0$)
I . . . (h - Iv )u. / hu ,
and
(4.9)
(4.10)
The three region structure given by (i), (ii) and (iii) is reminiscent of that used
by Fair and Tsai [3] to describe phosphorus diffusion in silicon. The diffusivity is
largest in the tail region (iii), drops to a minimum in the intermediate region (ii)
and increases again in the high concentration region (i). It is evident from (4.6)
that in order to obtain a plateau effect due to a sufficiently large diffusivity at
high concentrations, we require that neither $f_I$ nor $f_V$ be too small. To obtain a
significant tail diffusivity (4.9) we require that $f_I$ and $f_V$ not be too close in value.
We note from (4.8) that
as u-O
so that the interstitial supersaturation becomes very large as $g_I r_I$ becomes small; if
it is sufficiently large then the number of interstitial-dopant pairs can be comparable
to the number of unpaired dopant atoms and a different balance of terms is again
needed. We now discuss this high supersaturation case.
$g_I r_I = h_I\omega$,
with $h_I = O(1)$, $\omega \ll 1$. The five regions describing the behaviour of the dopant are
as follows.
Writing
u = uo(z,t)+ 0(1) as W - 0
we have
&0 8 (
-=41r1v- Uo (0)
i -8z
(4.11)
8t 8z
((/1 - Iv )2u~ + 411 Ivu~ )
(4.13) -SOUO
. * = (4!I Iv Uo + hI(fIII-u(;2
Iv )u.) -
QUo
,
(!I - Iv)u. QZ
and it is matching with this that leads to the condition (4.12). From (4.13) we
obtain
(4.14)
. 4IIIv uo hI(!I - Iv)u.
-SoZ =
(II - Iv)u. 2II u,(/
where the arbitrary function of t which arises on integrating (4.13) has been set
to zero. This may be achieved by appropriate specification of the O(wi) term in
s( t; w); completing the determination of s to this order requires the matching of
further terms in the expansion for u, however.
It follows from (4.14) that
as Z -+ +00
We write
z
x = s(t;w) + In(l/w) ;
these scalings are necessary to match into region (4). At leading order we have
(4.16) as Z -+ +00
We now have
u = w! In-!(l/w)u t , x = s + 0(1) ,
with
o=~ ( U ot-28U~)
8x '
8x
so that matching with (4.16) yields
(4.17)
u=wu, x = x/w t
and obtain at leading order
(4.18) 8 ~
at (Iouo/v) = az ((hI + /ruo)Io~) ,
8:£2
Yo = l/fo .
The required solution to (4.18) satisfies
(4.19)
as x -+ 0+ Uo <'V (hI(JI - fV)Ust/fr)
l.
2 Ix IntCl/x), Yo <'V /ruo/(JI - fv)u s,
The discussion at the end of section 4.1 considered asymptotics on the diffusivity
D2 but not on the corresponding profile of u. If the latter were considered, the
regions (1) - (4) would be essentially as in this section, while region (5) would
simplify to give the diffusivity (4.9) (this arises if $h_I \gg 1$ in (4.18)). We note that
(4.18) also corresponds to a limit in which flux-balance does not occur.
(5.1) \7 2 ((gVrV + fvu)V) = \7 2 ((gIrl + /ru)I) = - (lI:or + II:IU + 11:: u 2 )(1- IV) ,
(5.2) $\dfrac{\partial u}{\partial t} = \nabla^2\big((f_V V + f_I I)\,u\big)$
and we may immediately note the following. Schaake [11] claims that in any number
of dimensions there is a flux-balance condition, which for (5.1) would read
(5.4) IV=1.
$u \to 0$ as $x \to +\infty$ or as $y \to -\infty$
and that the behaviour is one-dimensional in the limit $y \to +\infty$, and can therefore
be described by the analysis given earlier. Such conditions are applicable to the
important problem of diffusion under a mask edge.
It then follows that as $x \to +\infty$ with $y/x \to +\infty$ we have
(5.5)
where for (5.3) $V_\infty$ and $I_\infty$ are given by (3.6), while for (5.4) we have
(5.6)
1.
leo = { (gIrl - gVrv + (II - IV)u s )2 + 4glrl gv rv ) 2
(5.7)
(5.8) as y -+-00 v -+ 1,
Writing $x = r\cos\theta$, $y = r\sin\theta$, the far-field defect behaviour as $x \to +\infty$ with
$y/x = O(1)$ may now be obtained. Since $u \to 0$ in this limit it follows from (5.1),
(5.7) and (5.8) that
(5.10) 1
V '" 1 + '2SY 2e)
( 1 + -;- Us,
I", I*( e, t)
where
(5.11)
$R = \delta r$.
We now define
(5.12) 8u
at '"
1 (18 (8 ) 1 8
Ii 8R 8R (,pu) + R2 8B2 (,pu)
52
-
2
)
.
Since $\delta$ is an artificial small parameter, the required solution takes the form
with
(5.14) 8G
lit = -tP(8, t) ((8G)2
88 + 4G2) .
(5.15) as 8--+::
2
o
G(8, t) = H(8)/t
with
6. Discussion. The paper has outlined the results of applying singular per-
turbation methods to a simple model for dopant-defect pair diffusion. Much of the
analysis carries over to more realistic models which allow for the electric charges
carried by the various species.
Some of the reduced problems given here are not new, but have been obtained
before by physical reasoning; see [8] and [9] in particular. We are, however, able to
state precise conditions on the governing parameters in order for such reductions
to be valid. In particular, the reduced problems which follow from flux-balance,
namely (3.3) - (3.4) and (4.2) - (4.5) (see also [8] and [9]), require that, for example,
$g_I r_I = O(1)$ or in dimensional terms that
The higher concentration problems (see sections 3.2 and 4.2), which are new, are
appropriate when, for example, $g_I r_I = O(\omega)$ so that
REFERENCES
[1] P.M. FAHEY, P.B. GRIFFIN AND J.D. PLUMMER, Point defects and dopant diffusion in silicon,
Rev. Mod. Phys., 61 (1989), pp. 289-384.
[2] R.B. FAIR, C.L. GARDNER, M.J. JOHNSON, S.W. KENKEL, D.J. ROSE, J.E. ROSE AND
R. SUBRAHMANYAN, Two-dimensional process simulation using verified phenomenological
models, IEEE Trans. Comp.-Aided Des., 10 (1991), pp. 643-650.
[3] R.B. FAIR AND J.C.C. TSAI, A quantitative model for the diffusion of phosphorus in silicon
and the emitter dip effect, J. Electrochem. Soc., 124 (1977), pp. 1107-1118.
[4] J .R. KING, Asymptotic analysis of an impurity-defect pair diffusion model, Q.J. Mech. Appl.
Math., 44 (1991), pp. 369-412.
[5] J.R. KING, Surface-concentration dependent nonlinear diffusion, Euro. J. Appl. Math (to
appear).
[6] F. LAU AND U. GÖSELE, Two-dimensional phosphorus diffusion for soft drains in silicon
MOS transistors, Appl. Phys. A, 40 (1986), pp. 101-107.
[7] J.W. MOORE AND R.G. PEARSON, Kinetics and mechanism, John Wiley, New York,
(1981).
[8] F.F. MOREHEAD AND R.F. LEVER, Enhanced "tail" diffusion of phosphorus and boron in
silicon: self-interstitial phenomena, Appl. Phys. Lett., 48 (1986), pp. 151-153.
[9] F.F. MOREHEAD AND R.F. LEVER, The steady-state model for coupled defect-impurity dif-
fusion in silicon, J. Appl. Phys., 66 (1989), pp. 5349-5352.
[10] W.B. RICHARDSON AND B.J. MULVANEY, Plateau and kink in P profiles diffused into Si: a
result of strong bimolecular recombination?, Appl. Phys. Lett., 53 (1988), pp. 1917-1919.
[11] H.F. SCHAAKE, The diffusion of phosphorus in silicon from high surface concentrations, J.
Appl. Phys., 55 (1984), pp. 1208-1211.
A REACTION-DIFFUSION SYSTEM MODELING
PHOSPHORUS DIFFUSION
Abstract. At very high concentrations phosphorus diffusion in silicon exhibits marked
nonlinearities. The hierarchy of physical models that attempt to explain this anomalous
diffusion is reviewed. An eight-species kinetic model is derived that yields a quasilinear,
partly-dissipative system of reaction-diffusion partial differential equations. The numerical
method of lines is used to solve the system for a simplified five-species model in three
dimensions. The linear system in the Newton iteration is solved using several matrix-free
methods. In all cases the dimension of the Krylov subspace must be quite large to ensure
convergence. This suggests that preconditioning will be more important for efficiency than
the choice of an accelerator.
dopants such as phosphorus, arsenic, and boron into a silicon wafer in order to form
n-p junctions, as well as to anneal damage to the lattice caused by ion implantation.
When the impurity concentration C is less than the intrinsic electron concentration
$n_i$, approximately $10^{18}$ cm$^{-3}$ at 1000°C, the heat equation represents dopant diffusion
well. For predeposition it is solved with a Dirichlet boundary condition at the top
surface of the wafer
and $D^0$, $D^+$, $D^-$, and $D^=$ are the intrinsic diffusivities due to neutral, donor, acceptor,
and double acceptor vacancies, respectively. If more than one impurity is present,
there would be additional equations for the other impurities, and in the drift term
the concentration C is replaced by the net electrically active concentration N. Using
(2) gives better results for extrinsic (high concentration) diffusion than (1), but still
fails to explain the pronounced nonlinearities that occur for a high-concentration
phosphorus source.
In addition to vacancies, interstitials - silicon atoms not residing on a lattice
site - are known to exist in numbers roughly equal to vacancies. Experiments such
as oxidation enhanced diffusion suggest that these species also aid in the diffusion
process. In 1974, Hu proposed that "P diffuses in Si via a dual mechanism, i.e., a
mixture of vacancy and interstitialcy mechanisms," by means of the reactions
$V + P_i \rightleftharpoons P_s + S$
$S + P_i \rightleftharpoons P_s + I$
where S denotes a Si lattice atom, $P_s$ ($P_i$) substitutional (interstitial) phosphorus,
and V (I) a vacancy (interstitial), respectively. Over the next fifteen years a number
of models were developed that refined this idea, including those of Mathiot-Pfister,
Morehead-Lever, Law-Dutton, and Mulvaney-Richardson. They solved for concentrations
of as many as three of the chemical species, using various assumptions about
equilibrium to simplify the model.
+ kneu {
+
[P ) +
n;
[e-]
with three analogous equations for interstitial species. Reaction R1 represents bimolecular
generation-recombination with rate constant $k_{bi}$; $\langle 0 \rangle$ denotes the result
of an interstitial silicon atom occupying a lattice site and annihilating the vacancy.
For reactions R2-R5, the forward and reverse rate constants are denoted by $k^f$ and
$k^r$ respectively. In the equation for electron concentration (denoted both by $[e^-]$ and
$n$), the last term dynamically enforces charge neutrality.
The single species model with a nonlinear diffusivity has been exchanged for a
large system each equation of which is quasilinear with constant diffusivity. The
kinetic model attempts to more accurately embody the physics and gives data about
species whose concentrations cannot be measured directly. This is at the expense of
more equations and a stiffer system. Boundary conditions on the upper surface are
(5)
a[p+v-] = 0
a[
(6) (2:) . [V-j<q [P+] = C*
ni ~
with, of course, analogous conditions for the interstitial species. In the Dirichlet
condition for P+, C* represents the concentration of phosphorus in the ambient gas.
Reflecting Neumann conditions are enforced for all species deep in the bulk. Initial
conditions are zero for all species except the electrons and defects, which are set to
their equilibrium levels.
For vacancies it is possible to estimate the diffusivities from first principles using
thermodynamics. For interstitials and pairs the data is nonexistent or of doubtful
accuracy, and experimental data of Mathiot-Pfister was used to give the following
values
For the reaction $A + B \rightleftharpoons C$, a simple kinetic argument due to Debye gives an estimate
for the forward rate constant for a diffusion-limited reaction of
(7)
where R is the encounter distance for A-B interaction. The rate constant for the
reverse reaction is approximated by
(8)
where n"
is the concentration of lattice sites and Eb is the binding energy of the
A - B pair. For reactions R4-R5 involving electrons this is precisely analogous to the
Shockley-Read-Hall theory of recombination-generation in which
(9) $k_r = v_{th}\,\sigma_n\,n_i \exp\big((E_{V^-} - E_i)/kT\big)$
where $v_{th} \approx 10^7$ cm/sec is the thermal velocity of an electron, $\sigma_n \approx 10^{-15}$ cm$^2$
is the capture cross-section, and $E_{V^-}$ is the energy level of the acceptor vacancy.
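As a rough illustration of how such kinetic estimates are evaluated, the sketch below uses the textbook diffusion-limited form $k^f = 4\pi R (D_A + D_B)$ and a detailed-balance reverse rate $k^r = k^f n_H \exp(-E_b/kT)$. Equations (7) and (8) are not legible here, so both formulas, the symbol `n_H` for the lattice-site concentration, the function names and every numerical input are assumptions made for illustration, not values taken from the paper.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K


def forward_rate(radius_cm, d_a, d_b):
    """Assumed diffusion-limited forward rate constant (cm^3/s) for A + B -> C.

    radius_cm is the encounter distance R; d_a and d_b are diffusivities in cm^2/s.
    """
    return 4.0 * math.pi * radius_cm * (d_a + d_b)


def reverse_rate(k_forward, n_h, binding_ev, temp_k):
    """Assumed reverse rate constant (1/s) for C -> A + B.

    n_h is the lattice-site concentration (cm^-3) and binding_ev the pair
    binding energy E_b in eV.
    """
    return k_forward * n_h * math.exp(-binding_ev / (K_B_EV * temp_k))


# Hypothetical inputs, chosen only to exercise the functions.
k_f = forward_rate(radius_cm=5.0e-8, d_a=1.0e-8, d_b=0.0)
k_r = reverse_rate(k_f, n_h=5.0e22, binding_ev=1.5, temp_k=1173.15)
print(f"k_f = {k_f:.3e} cm^3/s, k_r = {k_r:.3e} 1/s")
```

A larger binding energy lowers the reverse rate exponentially, which is one way the forward and reverse constants of a single reaction can differ by many orders of magnitude.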
Reaction R2 R3 R4 R5
Forward (cm3s 1 ) 3.0 x 10 14 4.4 x 10 14 1.0 x 1O-1S 1.0 x 1O-1S
Reverse ( S-1 ) 1.4 x 10· 8.8 X 10 1 5.6 X lOw 1.7 X 1011
[Figure 1 plot: concentration (logarithmic axis) versus distance (μm).]
Figure 1: Simulation of a phosphorus predeposition at 900°C for 10 minutes using the
eight-species nonequilibrium model in one dimension. This includes the additional
terms from reactions R6-R9. Crosses are experimental data of Yoshida et al., J. Appl.
Phys. p. 1498(1974).
Given a surface phosphorus concentration of $3 \times 10^{20}$, $[P^+V^-] \approx 10^6\,[V^0]$ for the
region close to the surface and
For the reaction of $P^+V^-$ and $I^0$, orientation effects would be important as compared
with the direct recombination mechanism. Still, supposing that kl ::::: kbi' k~f1 would
be 106 times the value predicted from (7).
Numerical modeling using kinetic estimates for all rate constants ($k_{bi} = 10^{-14}$
cm$^3$ s$^{-1}$) and including reactions R6-R9 gives Figure 1 [3]. Note the pronounced
plateau, kink, and tail in the profile. Comparison between this and the five-species
model presented below suggests that the effect of charge is secondary. The primary
reason for the observed anomalies is the interaction between the two types of defects
and pairs via reactions R1, R6-R9.
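(10)
$\partial I/\partial t = D_I \nabla^2 I - k^f_F\,P I + k^r_F\,F - k_{bi}\,(V I - V^{eq} I^{eq})$
$\partial E/\partial t = D_E \nabla^2 E + k^f_E\,P V - k^r_E\,E$
$\partial F/\partial t = D_F \nabla^2 F + k^f_F\,P I - k^r_F\,F$
$\partial P/\partial t = -k^f_E\,P V + k^r_E\,E - k^f_F\,P I + k^r_F\,F$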
The SIOM algorithm [7] has very modest storage requirements and is "matrix free",
requiring only the matrix-vector product $Av$ for a given $v$. To solve $Ax = b$ it takes
a shifted Krylov subspace $K_m$ of $\mathbb{R}^n$ and seeks an approximate solution $x^{(m)}$ which
belongs to $K_m$ and such that the residual $r^{(m)}$ at $x^{(m)}$ is orthogonal to $K_m$. An Arnoldi
method recursively builds an orthonormal basis $\{v_1, \ldots, v_m\}$ for $K_m$; incomplete refers
to the fact that $v_k$ is only orthogonalized against the previous $\{v_{k-1}, \ldots, v_{k-j}\}$. Saad
Figure 2: Geometries for the three dimensional test problems. B possesses symmetry
along one axis and can be compared with output from a 2-D simulator. C represents
a problem that would be difficult to model using a sequence of 2-D simulations.
has shown that when A is symmetric, SIOM reduces to conjugate gradient and in
general is equivalent to ORTHORES and ORTHOMIN.
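The projection step described above can be sketched in a few lines of plain Python. This is only an illustration of an incomplete orthogonalization method, not the SIOM routine in LSODP: scaling, restarts and convergence tests are omitted, the initial guess is taken as zero, and the names `iom_solve`, `window` and `solve_dense` are hypothetical. Only the product $Av$ is needed, supplied through the callable `matvec`.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_dense(h, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:n] + [rhs[i]] for i, row in enumerate(h[:n])]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= f * a[c][j]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][j] * y[j] for j in range(r + 1, n))
        y[r] = s / a[r][r]
    return y


def iom_solve(matvec, b, m, window):
    """Approximate A x = b in a Krylov subspace of dimension at most m.

    Each new basis vector is orthogonalized only against the previous
    `window` vectors; window >= m gives full Arnoldi orthogonalization.
    """
    beta = math.sqrt(dot(b, b))
    if beta == 0.0:
        return [0.0] * len(b)
    basis = [[bi / beta for bi in b]]
    h = [[0.0] * m for _ in range(m)]
    size = m
    for k in range(m):
        w = matvec(basis[k])
        for i in range(max(0, k - window + 1), k + 1):
            h[i][k] = dot(w, basis[i])
            w = [wj - h[i][k] * vj for wj, vj in zip(w, basis[i])]
        norm_w = math.sqrt(dot(w, w))
        if k + 1 == m or norm_w < 1e-12 * beta:
            size = k + 1  # subspace is full or invariant: stop extending
            break
        h[k + 1][k] = norm_w
        basis.append([wj / norm_w for wj in w])
    # Galerkin condition: the residual is orthogonal to the basis when H y = beta e_1.
    rhs = [beta] + [0.0] * (size - 1)
    y = solve_dense([row[:size] for row in h[:size]], rhs)
    return [sum(y[k] * basis[k][j] for k in range(size)) for j in range(len(b))]


# Small nonsymmetric example (hypothetical data).
A = [[4.0, 1.0, 0.0, 0.0], [2.0, 5.0, 1.0, 0.0], [0.0, 1.0, 6.0, 2.0], [1.0, 0.0, 1.0, 7.0]]
rhs_vec = [1.0, 2.0, 3.0, 4.0]
x = iom_solve(lambda v: [dot(row, v) for row in A], rhs_vec, m=4, window=4)
```

With a small `window` the storage per step is fixed, which is the source of the modest memory requirement. The price is that the basis is only locally orthogonal for nonsymmetric $A$, so a larger subspace may be needed before the projected solution is accurate.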
Good results are achieved with the standard model with grids of up to 27,000
nodes. The kinetic model results in a much stiffer system, and to get convergence on
even the simple Problem A, the maximum dimension of the Krylov subspace must
be taken much larger than its default value of m = 5. Figure 3 shows the effect on
vacancy and interstitial profiles of changing the maximum Krylov dimension. The
integrator halted due to nonconvergence with m less than eight. For m = 10, a
solution is obtained, but it is unphysical and radically different from a 1-D PEPPER
simulation where a direct solver is used. Refining the grid does not improve the
situation: in Figure 4 a grid of 80 points is used in the z direction and m must be 30
before relatively flat profiles are obtained. Even if the Krylov subspace is large enough for
convergence, it may still be insufficient to produce a physically meaningful solution.
It is important to solve the linear system accurately at each timestep, and this cannot
be done simply by reducing the tolerance in the outer loop; one must enlarge the
subspace or cut back on the stepsize severely. For the SIOM algorithm in LSODP
the quantity Av's/Calls is the average Krylov dimension and represents a measure of
the efficiency of the method. Figure 5 shows Kavg for both models on a grid of 1000
points. As time increases the asymmetric Jacobian makes a larger contribution to
the linear system, effectively requiring a larger Krylov subspace to obtain an accurate
solution.
To compare SIOM with other accelerators, the calling sequence in LSODP is
modified so that routines from the NonSymmetric Preconditioned Conjugate Gradient
[Figure 3 plot: log concentration versus distance (μm), with curves labelled m = 10 and m = 15.]
Figure 3: The effect on vacancy and interstitial profiles of increasing the Krylov
subspace dimension. This is Problem A for five species with 20 gridpoints in the z
direction. The lower curve of each pair is the vacancy, the upper the interstitial.
[Figure 4 plot: log concentration versus distance (μm), with a curve labelled 30.]
Figure 4: Same problem as in Figure 3, but with 80 gridpoints in the z direction.
Dim(Km ) must be as large as 30 to achieve the expected flat defect profiles.
[Figure 5 plot: average Krylov dimension versus time (sec).]
Figure 5: Average Krylov dimension $K_{avg}$ as a function of time. The number $K_{avg}$
is a measure of the relative effectiveness of the SIOM algorithm in solving the linear
system.
package [8] can be called. It uses various accelerator techniques such as Chebyshev
and generalized conjugate gradient. Four accelerators were chosen for comparison,
ORTHOMIN, GMRES, BCGS, and Minimum Error. Table III gives the runtimes
in seconds on a Sparcstation for the standard model (2) on a grid of one thousand
points, where F(u)'s is the number of evaluations of the RHS in (10), Av's is the
number of times the matrix-vector product is formed, and Calls is the number of
calls made to the iterative solver. All four algorithms performed well, with roughly
equivalent runtimes, although GMRES and ORTHOMIN were slightly faster.
TABLE III
Prob  Method     Time (s)  F(u)'s  Av's  Calls
A     BCGS         132.7     314    244     69
A     GMRES        109.3     253    183     69
A     ME           135.2     314    244     69
A     ORTHOMIN     107.7     253    183     69
B     BCGS        1452.0    3489   2784    704
B     GMRES       1087.4    2589   1935    653
B     ME          1448.3    3489   2784    704
B     ORTHOMIN    1167.3    2793   2088    704
C     BCGS        1898.4    4544   3628    915
C     GMRES       1555.7    3661   2739    921
C     ME          1899.9    4544   3628    915
C     ORTHOMIN    1537.6    3637   2721    915
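The ratio Av's/Calls, introduced earlier as the average Krylov dimension for SIOM, can be formed for each row of Table III as a rough per-call work indicator. The short sketch below does this; applying the ratio to the NSPCG accelerators is an assumption made here for comparison, and the names `TABLE_III`, `average_krylov_dimension` and `summarize` are illustrative only.

```python
# Rows of Table III: (problem, method, runtime_s, rhs_evals, matvecs, solver_calls)
TABLE_III = [
    ("A", "BCGS", 132.7, 314, 244, 69),
    ("A", "GMRES", 109.3, 253, 183, 69),
    ("A", "ME", 135.2, 314, 244, 69),
    ("A", "ORTHOMIN", 107.7, 253, 183, 69),
    ("B", "BCGS", 1452.0, 3489, 2784, 704),
    ("B", "GMRES", 1087.4, 2589, 1935, 653),
    ("B", "ME", 1448.3, 3489, 2784, 704),
    ("B", "ORTHOMIN", 1167.3, 2793, 2088, 704),
    ("C", "BCGS", 1898.4, 4544, 3628, 915),
    ("C", "GMRES", 1555.7, 3661, 2739, 921),
    ("C", "ME", 1899.9, 4544, 3628, 915),
    ("C", "ORTHOMIN", 1537.6, 3637, 2721, 915),
]


def average_krylov_dimension(matvecs, calls):
    """Av's / Calls: matrix-vector products per call to the iterative solver."""
    return matvecs / calls


def summarize(rows):
    """Map (problem, method) to Av's/Calls rounded to two decimals."""
    return {
        (prob, method): round(average_krylov_dimension(av, calls), 2)
        for prob, method, _time, _rhs, av, calls in rows
    }


if __name__ == "__main__":
    for (prob, method), k_avg in summarize(TABLE_III).items():
        print(f"{prob} {method:9s} Av/Calls = {k_avg:.2f}")
```

For example, GMRES on Problem A gives 183/69, or about 2.65 products per call, which is small for the standard model on this grid.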
REFERENCES
[1] W. B. Richardson and B. J. Mulvaney, Plateau and kink in P profiles diffused into Si
- A Result of Strong Bimolecular Recombination?, Appl. Phys. Lett. 53(1988), pp.
1917-1919.
[6] P. N. Brown and A. C. Hindmarsh, SIAM J. Numer. Anal., 24(1987), pp. 610.
[7] Y. Saad, Krylov Subspace Methods for Solving Large Unsymmetric Linear Systems,
Math. of Comp., vol. 37,105(1981).
[8] T. C. Oppe, W. D. Joubert, and D. R. Kincaid, Center for Numerical Analysis Report
CNA-216, University of Texas.
[9] J. R. King, On the diffusion of point defects in silicon, SIAM J. Appl. Math. 49(14),
1989, pp. 1018-1101.
1. Introduction.
The deviation from stoichiometry was found to greatly change the material
properties. In the case of GaAs, the heat-treatment experiment under the applied
arsenic pressure showed the existence of the exact stoichiometric vapor pressure,
and it was found that interstitial arsenic atoms and arsenic vacancies were dominant
point defects governing the deviation from stoichiometry [1-3]. On the other hand,
the liquid phase and melt growth experiments were made with applied arsenic vapor
pressure upon the solution [3-8], and it was found that the deviation from the
stoichiometry was controlled by the applied vapor pressure and the stoichiometric
crystals were segregated at just the same arsenic vapor pressure as that of the
heat-treatment. The grown crystals were highly perfect. This growth method was called
the temperature difference method under controlled vapor pressure (TDM-CVP).
Figure 1 illustrates the experimental methods of the heat-treatment and TDM-CVP
growth.
The stoichiometry-controlled crystal growth in TDM-CVP was found to be due
to the increased saturation solubility of the solution under applied vapor pressure
[6,7] and it was found that the equality of the chemical potentials of arsenic holds
between the three phases. The details of the experiments and the chemical potential
approach are given in Reference [7].
Atomic diffusion in GaAs is also thought to be greatly influenced by the
deviation from stoichiometry. However, most discussions so far have paid little
attention to this point. In this paper, we discuss atomic diffusion
on the basis of the chemical potential approach developed for TDM-CVP. We
deal with diffusion of sulfur, self diffusion of arsenic and gallium, and also diffusion
of silicon. As we show in the following sections, these are typical of interstitial
diffusion, diffusion via arsenic vacancies, and diffusion with site transfer. In these
discussions we assume that the corresponding point defects are in thermal equilibrium
under the applied vapor pressure. In the final section, however, we deal with atomic
diffusion at a hetero-interface, which is an example in which equality of the chemical
potentials does not hold, and which causes characteristic interface mixing phenomena.
[Figure 1 schematic: ampoule containing an As source, source crystals, solution and GaAs substrate, with temperatures T1 and T2 marked along the temperature axis.]
There are four kinds of point defects which can cause the deviation from
stoichiometry, apart from antilattice defects: the arsenic interstitial atom, $I_{As}$, the
arsenic vacancy, $V_{As}$, the gallium interstitial, $I_{Ga}$, and the gallium vacancy, $V_{Ga}$. However,
we have shown experimentally that $I_{As}$ and $V_{As}$ are dominant, while $I_{Ga}$ and $V_{Ga}$
are present in much lower concentrations. In such a case, the As$_4$ vapor pressure which gives
the exact stoichiometry (which we call the optimum vapor pressure in TDM-CVP)
is determined by the equality of the concentrations of $I_{As}$ and $V_{As}$. Figure 2 shows
the experimental stoichiometric vapor pressures in TDM-CVP and in the
heat-treatment, together with the calculation based on $I_{As}$ and $V_{As}$. These three curves
are in very good agreement. Table 1 gives the free energies of formation for $I_{As}$
and $V_{As}$ adopted for the calculation, which were determined from the change in the
lattice parameter. Recently, photocapacitance studies have shown [9-11] that the
formation energy of the interstitial arsenic atoms is $\Delta H^F_{I_{As}} = 1.1$ eV, which is just
the same as that adopted in the calculation. From these results, the formation energies
of $I_{As}$ and $V_{As}$ listed in Table 1 are thought to be reliable enough.
*Reference [10]
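[Figure 2 plot: stoichiometric As$_4$ vapor pressure versus temperature (T, °C: 1200, 1000, 800, 600), with data for heat-treatment and crystal growth, a fitted line with prefactor $2.6 \times 10^6$, and the calculated curve $1.64 \times 10^6 \exp(-1.04\,\mathrm{eV}/kT)$.]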
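We consider the following reactions for the formation of $I_{As}$ and $V_{As}$.
$\Delta G^F_{I_{As}}$ and $\Delta G^F_{V_{As}}$ are the free energy differences per molecule, and are described
as
(2.5)
$\Delta G^F_{I_{As}} = \Delta H^F_{I_{As}} - T\Delta S^F_{I_{As}}$
$\Delta G^F_{V_{As}} = \Delta H^F_{V_{As}} - T\Delta S^F_{V_{As}}$
where $\Delta H^F_{I_{As}}$ and $\Delta H^F_{V_{As}}$ are the enthalpies, and $\Delta S^F_{I_{As}}$ and $\Delta S^F_{V_{As}}$ are the
entropies of vibration.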
In order to refer to the applied As$_4$ vapor pressure, we need the following reaction
equation.
where $\Delta G^{sub}_{As}$ is the free energy of sublimation of arsenic element. If we write the
reactions of formation of $I_{As}$ and $V_{As}$ as,
(2.7) $\tfrac{1}{4}\mathrm{As}_4\ (\text{gas}) = I_{As};\quad \Delta G^{F'}_{I_{As}}$
(2.8) $\mathrm{As}\ (\text{lattice site}) = V_{As} + \tfrac{1}{4}\mathrm{As}_4\ (\text{gas});\quad \Delta G^{F'}_{V_{As}}$.
(2.10)
(2.11)
In these expressions, $[I_{As}]$ and $[V_{As}]$ are defined as $[I_{As}] = N_{I_{As}}/N^0_{I_{As}}$ and $[V_{As}] =
N_{V_{As}}/N^0_{V_{As}}$, where $N_{I_{As}}$ and $N_{V_{As}}$ are the concentrations of $I_{As}$ and $V_{As}$, and $N^0_{I_{As}}$
and $N^0_{V_{As}}$ are the concentrations of all available interstitial and substitutional
arsenic sites.
Assuming that $N^0_{I_{As}} = N^0_{V_{As}}$, the As$_4$ vapor pressure corresponding to the exact
stoichiometry, which we call the optimum vapor pressure $P^{opt}_{As_4}$, can be obtained from
the condition that $[I_{As}] = [V_{As}]$.
That is, from Equations (2.10) and (2.11) we have
Figure 2 shows the calculation from this equation with the parameters listed
in Table 1. As for the gallium vacancies, $V_{Ga}$, and gallium interstitials, $I_{Ga}$,
similar expressions are possible, though their concentrations are thought to be much
smaller.
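The step from the defect concentrations to the optimum pressure can be sketched as follows. This is an assumed mass-action form written from reactions (2.7) and (2.8), with the pressure measured in units of the standard state; it is offered as a plausible reading of the argument, not as a transcription of Equations (2.10) to (2.12).

```latex
\begin{align*}
[I_{As}] &= P_{As_4}^{1/4}\exp\!\left(-\frac{\Delta G^{F'}_{I_{As}}}{kT}\right), &
[V_{As}] &= P_{As_4}^{-1/4}\exp\!\left(-\frac{\Delta G^{F'}_{V_{As}}}{kT}\right),\\
[I_{As}] = [V_{As}] \;&\Longrightarrow\;
P^{opt}_{As_4} = \exp\!\left(\frac{2\,\bigl(\Delta G^{F'}_{I_{As}} - \Delta G^{F'}_{V_{As}}\bigr)}{kT}\right).
\end{align*}
```

Under this reading the ratio $[I_{As}]/[V_{As}]$ equals $(P_{As_4}/P^{opt}_{As_4})^{1/2}$, so it is unity at the optimum pressure and grows as the square root of the applied pressure.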
That is, the formation energies are defined from the following reactions,
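where $\Delta G^{sub}_{GaAs}$ is the free energy of the sublimation of the GaAs solid phase. Then, we
have
(2.16) $\mathrm{GaAs}\ (\text{solid}) = I_{Ga} + \tfrac{1}{4}\mathrm{As}_4\ (\text{gas});\quad \Delta G^{F'}_{I_{Ga}}$
(2.18)
$\Delta G^{F'}_{I_{Ga}} = \Delta G^{F}_{I_{Ga}} + \Delta G^{sub}_{GaAs}$
$\Delta G^{F'}_{V_{Ga}} = \Delta G^{F}_{V_{Ga}} - \Delta G^{sub}_{GaAs}$
Equations (2.16) and (2.17) give the arsenic vapor pressure and temperature
dependences of $[I_{Ga}]$ and $[V_{Ga}]$ as follows.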
(2.19)
(2.20)
3. Diffusion of sulfur.
There is a detailed experiment on the diffusion of sulfur in GaAs under applied
arsenic vapor pressures by Young and Pearson [12]. The diffusion coefficient D
increases as the square root of the As$_4$ vapor pressure, but it saturates at higher
vapor pressure, as shown in Figure 3. They proposed that a complex of a gallium
divacancy and a sulfur donor is responsible for the diffusion. However, as pointed
out in Sec. 2, the equilibrium concentration of gallium divacancies should be
so small that they cannot be the dominant migrating species. On the other hand,
B. Tuck assumed arsenic vacancies in his calculation of the diffusion profile [13].
We think, however, that arsenic vacancies cannot explain the vapor pressure dependence
of D, in spite of his assertion.
Instead, we propose an interstitial sulfur diffusion model. First, it
should be pointed out that the crossover point of the square root line and the
constant D line is close to the optimum vapor pressure described in Section 2,
both at T = 1000°C and 1130°C, as denoted by arrows in Figure 3. The model
must explain this fact, as well as the vapor pressure dependence.
We assume that a substitutional sulfur donor $S^+_{As}$ changes to an interstitial
sulfur, $I^+_S$, and an arsenic vacancy, $V^0_{As}$, or we assume that, in the presence of
an interstitial arsenic atom $I^0_{As}$, $S^+_{As}$ changes to an interstitial molecular complex
$(I_S \cdot I_{As})^+$ and an arsenic vacancy.
[Figure 3 plot: diffusion coefficient of sulfur versus As$_4$ vapor pressure $P_{As_4}$ (atm), with data at 1003°C, 1003°C (preannealed) and 1130°C.]
The following reaction equations should hold in the above two cases, respec-
tively.
We have assumed that the charge states of sulfur do not change in this reaction
because no strong concentration dependence such as that seen in
Zn diffusion [17] has been observed. Also, the charge states of arsenic vacancies and arsenic
interstitial atoms have been assumed to be neutral, as described in Section 2.
For the interstitial diffusion under thermal equilibrium of $V_{As}$ and $I_{As}$, the former
equation gives the $(P_{As_4})^{1/4}$ dependence of D, while the latter gives the $(P_{As_4})^{1/2}$
dependence. Because of the vapor pressure dependence, and also because the diffusion
coefficient of sulfur is much smaller than that of zinc, for which it is well
established that zinc interstitial atoms take the form of isolated atoms [14], it is
reasonable to assume that an interstitial sulfur and an interstitial arsenic atom
form a molecule-like complex, as described by Equation (3.2).
Following the diffusion equation which was established in the case of zinc diffusion,
the diffusion equation for sulfur is given by
where $N_{sub}$ and $N_{inter}$ are the concentrations of substitutional and interstitial species
and $D'_{sub}$ and $D'_{inter}$ are their intrinsic diffusion coefficients, respectively. If we can
assume that $N_{sub} \gg N_{inter}$, and the reaction given by Equation (3.2) under the
applied arsenic vapor pressure is fast enough, we get
(3.4)
which gives the following effective diffusion coefficient, $D_{sub}$, for substitutional
sulfur
(3.5) $D_{sub} = D'_{sub} + D'_{inter}\,\dfrac{\partial N_{inter}}{\partial N_{sub}}$
(3.6) $D_{sub} \approx D'_{inter}\,\dfrac{\partial N_{inter}}{\partial N_{sub}}$.
Under the arsenic chemical potential given by the applied arsenic vapor pressure,
the equilibrium for Equation (3.2) gives the following equation.
(3.7) $\dfrac{[(I_S \cdot I_{As})^+]\,[V^0_{As}]}{[S^+_{As}]\,[I^0_{As}]} = \exp\left(-\dfrac{\Delta G_{i2}}{kT}\right)$
where $[S^+_{As}]$ is the concentration of $S^+_{As}$ relative to the concentration of all
available sites for $S^+_{As}$, and the other quantities $[I^0_{As}]$, $[V^0_{As}]$ and $[(I_S \cdot I_{As})^+]$ are
likewise relative concentrations of each species.
Assuming that the densities of all available sites for $S^+_{As}$ and $(I_S \cdot I_{As})^+$
are the same, we get
(3.8) $\dfrac{\partial N_{inter}}{\partial N_{sub}} = \dfrac{[I^0_{As}]}{[V^0_{As}]}\exp\left(-\dfrac{\Delta G_{i2}}{kT}\right)$
(3.9)
If thermal equilibrium is established under the applied As$_4$ vapor pressure, $[I^0_{As}]$
and $[V^0_{As}]$ are given by Equations (2.10) and (2.11) in Section 2, respectively.
Then, $D_{sub}$ is expressed as follows using $P_{As_4}$ and the free energies of formation
for $I_{As}$, $V_{As}$ and $(I_S \cdot I_{As})^+$.
(3.10)
On the other hand, the stoichiometric vapor pressure, $P^{opt}_{As_4}$, is given by Equation
(2.12), at which $[I_{As}] = [V_{As}]$ holds. Therefore, the diffusion constant at the
stoichiometric vapor pressure, $D^{opt}_{sub}$, is given by
(3.11) $D^{opt}_{sub} = D'_{inter}\exp\left(-\dfrac{\Delta G_{i2}}{kT}\right)$.
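The square-root regime can be made concrete with a short numerical sketch. It combines (3.6), (3.8) and (3.11) with an assumed mass-action ratio $[I_{As}]/[V_{As}] = (P_{As_4}/P^{opt}_{As_4})^{1/2}$, so that $D_{sub} = D^{opt}_{sub}\,(P_{As_4}/P^{opt}_{As_4})^{1/2}$. The function names and sample numbers are hypothetical, and the saturation at high pressure is deliberately not modelled.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K


def d_opt(d_inter, dg_i2_ev, temp_k):
    """Diffusion constant at the stoichiometric pressure, following (3.11)."""
    return d_inter * math.exp(-dg_i2_ev / (K_B_EV * temp_k))


def d_sub(pressure, p_opt, d_at_opt):
    """Effective diffusion constant in the square-root regime.

    Assumes [I_As]/[V_As] = sqrt(pressure / p_opt); valid only below the
    pressure at which the measured coefficient saturates.
    """
    return d_at_opt * math.sqrt(pressure / p_opt)


# Hypothetical numbers: quadrupling the pressure doubles the coefficient.
base = d_opt(d_inter=1.0e-3, dg_i2_ev=1.0, temp_k=1273.15)
print(d_sub(4.0, 1.0, base) / base)
```

Because only the ratio of pressures enters, the pressure unit is immaterial as long as both arguments use the same one.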
$D'_{inter}$ is approximately given by the following form using the free energy of
migration of the interstitial molecule $\Delta G_{m2} = \Delta H_{m2} - T\Delta S_{m2}$
(3.12) $D'_{inter} \approx \tfrac{1}{6}a^2\nu\,\exp\left(\dfrac{\Delta S_{m2}}{k}\right)\exp\left(-\dfrac{\Delta H_{m2}}{kT}\right)$
In contrast with the present sulfur case, it was assumed in the case of Zn
diffusion that zinc interstitials are isolated atoms. If the attractive interaction of
zinc and arsenic interstitial atoms is too strong, a stable complex may not be formed,
because they tend to occupy a single interstitial cell, so that they destroy
the lattice and one of them returns to a substitutional site. We assume that the
molecular complex $(I_S \cdot I_{As})^+$ is stable because of a weaker interaction between $I_S$
and $I_{As}$, as illustrated in Figure 4.
(3.14)
where k is the reaction rate, and {2As + S} denotes the smallest aggregated con-
figuration. Then, we have the following rate equation,
(3.15)
Also, the reaction described by Equation (3.2) can be expressed by using the rate
(3.16)
In a steady state, the whole generation rate of $(I_S \cdot I_{As})^+$ equals 0, i.e.,
(3.17)
Then, we have
(3.18)
( ~NN~Dlrr)
(J Inter 0
is just the same as was given by Equation (3.8).
The diffusion coefficient in the saturation region is given by D::t = DIDter ~ ,
and the critical vapor pressure is given by
( 3.20) l= (kl)2
peritic ..
A.. k exp
{2(b..Gf~. - b..Gf:~.
kT
+ b..Gi2)} = (k2)2 opt
k PA •• ·
The experimental fact that $P^{critical}_{As_4}$ is close to the optimum vapor pressure
means that the values of $k$ and $k_2$ do not differ greatly from each other.
Comparing the two reactions,
+ +
(3.16) (I•. lA.) + VA.0 +2
k.
SA. + lAs
0
The reaction rate constants $k_2$ and $k$ are determined by the random walk areas of
$(I_S \cdot I_{As})^+$, $V^0_{As}$ and $I^0_{As}$, as far as we assume that they react immediately when they come
to nearest neighbour sites of each other. Therefore, if interstitial arsenic atoms are
isolated from each other, $k$ may be larger than $k_2$ because the diffusion constant of $I_{As}$
is thought to be the largest. However, if interstitial arsenic atoms interact with
each other at such a high concentration that aggregation takes place and the effective
diffusion constant of $I_{As}$ is reduced, then the diffusion constant of $(I_S \cdot I_{As})^+$
determines both $k_2$ and $k$, so that $k_2 \approx k$ will hold. The fact that the crossover point is
close to the optimum vapor pressure means that the above mentioned mechanism is
essentially true. However, more precisely, the crossover point is about twice
the optimum pressure and the curve bends more sharply than Equation (3.19)
predicts. We think, therefore, that actual aggregation takes place more suddenly when
the concentration of arsenic interstitial atoms exceeds some level.
Finally in this section, Figure 5 shows the calculated temperature dependence
of the diffusion coefficient of sulfur at stoichiometry and at the gallium rich liquidus
line, as well as at the crossover point, assuming $\Delta H_{m2} + \Delta H_{i2} = 2.6$ eV and $(k_2/k)^2 = 2$.
[Figure 5 plot: calculated diffusion coefficient of sulfur (logarithmic axis) versus temperature.]
The most striking result is that $D_{As}$ and $D_{Ga}$ are nearly equal. Although the
above expressions differ slightly from each other, they are equal within
the experimental uncertainty over wide ranges of vapor pressure and temperature.
It should first be pointed out that the diffusants are isotopic As and Ga (we
denote them As' and Ga') while the arsenic vapor pressure is that of ordinary natural
arsenic atoms. As a result, the chemical potential for isotopic arsenic, $\mu'_{As}$, is
not controlled, and $[I'_{As}]$ should be much smaller than $[I_{As}]$. On the other hand,
there is no discrimination between As' and As for $V_{As}$. Therefore, this kind of
experiment is not related to the diffusion of arsenic interstitial atoms.
We present a model in which both arsenic and gallium atoms diffuse assisted
by arsenic vacancies.
It has usually been assumed that $V_{As}$ and $V_{Ga}$ can migrate as shown in Figure
6 (a), that is, an arsenic atom jumps to $V_{As}$ directly from one of the next nearest
lattice sites. By comparison, in silicon and other elemental semiconductors, an
atom need only jump to the nearest neighbour lattice site, as shown in Figure
6 (b). The latter needs only stretching of the lattice bonds, as illustrated by dotted
lines, but in the former case (a) the bonds must be broken and the atom must go
through an interstitial site. This requires a much higher migration energy than in
case (b). On the other hand, if an atom in a compound semiconductor could jump
to a vacancy at the nearest neighbour lattice site as in elemental semiconductors,
then a line of antilattices would be generated, as illustrated in Figure 6 (c). From
this discussion, it is understood that a simple diffusion mechanism via vacancies as
in silicon is difficult to envisage in covalent compound semiconductors.
The experimental fact that DGa = D As should not be accidental but it implies
that Ga' (or Ga) and As' (or As) jump as a pair in the presence of an arsenic
vacancy.
Figure 7 illustrates a possible migration process. First, a Ga' (or Ga) jumps to
the nearest neighbour $V_{As}$ site and, at the same time, the As' (or As) nearest to the
Ga' jumps to the former Ga' site, so that the $V_{As}$ moves to the next nearest lattice
site and paired antilattices Ga' - As' are formed. These paired antilattices in Figure
7 (b) do not represent a stable energy state but a saddle point through which (Ga' -
As') move to the final stable state. The latter half of the step is as follows. The
paired antilattices (Ga' - As') and a neighbouring Ga, denoted
Ga" in Figure 7 (c), interchange among themselves as shown by the three arrows
in Figure 7 (c) and relax to the final state shown in Figure 7 (d). It is assumed that
the saddle point energy of the paired antilattices (Ga' - As') is high enough to
cause the movement of the third atom Ga". As for the possibility of an interchange
within (Ga' - As'), it would need a higher energy than the three body interchange.
The migration of an arsenic vacancy should be much easier in this model than
in the simpler process illustrated in Figure 6 (a). There may be a similar
process based on $V_{Ga}$, but we can assume that the concentration of $V_{Ga}$ is much
smaller than that of $V_{As}$, as explained in Section 2. Therefore, the diffusion constants
of As and Ga are the same, and are given by the following form in terms of $[V_{As}]$.
(4.1)
where $\Delta G^{*}_{pair} = \Delta H^{*}_{pair} - T \Delta S^{*}_{pair}$ is the free energy of the saddle point
corresponding to paired antilattices with $V_{As}$ at the nearest site.
Using Equation (2.10), the diffusion coefficient is expressed in terms of the applied
$As_4$ vapor pressure and temperature as follows:
(4.2) $D_{As} = D_{Ga} \approx \frac{1}{6} d^{2} \nu \, (P_{As_4})^{-1/4} \exp\left(-\frac{\Delta G^{*}_{pair} + \Delta G^{f}_{V_{As}}}{kT}\right).$
In order to compare with the experiment, we must know the expression at the
gallium rich liquidus line at which [VAs] becomes maximum and PAs. becomes
minimum.
-~
(PAs:) min is obtained from the following equation
(4.3) GaAs (lattice) = Ga (liquid) + $\frac{1}{4} As_4$; $\Delta G^{sub}_{GaAs}$
(4.8) $D^{max}_{As} = D^{max}_{Ga} \approx \frac{1}{6} d^{2} \nu \exp\left(-\frac{\Delta G^{*}_{pair} + \Delta G^{f}_{V_{As}} - \Delta G^{sub}_{GaAs}}{kT}\right).$
5. Diffusion of silicon.
The diffusion coefficients of most foreign elements in GaAs are much larger than
the self-diffusion coefficients of As and Ga, which can be interpreted by the movement
of a pair of atoms via an arsenic vacancy.
Therefore, we must consider interstitial diffusion and other mechanisms for
them. In the case of silicon in GaAs, silicon atoms can occupy both Ga and As lattice
sites, so that, in addition to the interstitial diffusion mechanism, we must consider a
diffusion mechanism based on the site transfer of silicon atoms. Site transfer
diffusion was first introduced for Si-Si pair diffusion [3,19], but in the present model
Si atoms need not be strongly paired.
Figure 9 illustrates the site transfer diffusion mechanism. As a first step, $Si_{Ga}$
(a silicon atom at a gallium site) transfers to $V_{As}$ at the nearest neighbour site,
resulting in the formation of $Si_{As}$ and $V_{Ga}$; then $V_{Ga}$ goes out of the lattice, or
recombines with an interstitial gallium, $I_{Ga}$, so that thermal equilibrium is reached.
As a second step, $Si_{As}$ transfers to $V_{Ga}$ when it comes to the nearest neighbour
site, which results in the formation of $Si_{Ga}$ and $V_{As}$, and the latter also returns
to thermal equilibrium. Each step can be described by the following reaction
equations.
Both reactions are equilibrated, and the equilibration can be described by the
following equation.
(5.3)
where t = exp (-~) . We will later discuss the equilibrium concentration ratio
$[Si^{+}_{Ga}]/[Si^{-}_{As}]$ using this equation.
Unlike the case of the self-diffusion discussed in Section 3, the state
corresponding to Figure 9 (b) is not a saddle point but a stable state, because
both $Si_{Ga}$ and $Si_{As}$ are stable. The first and the second steps occur in series, and
their probabilities of occurrence are proportional to $[V_{As}]$ and $[V_{Ga}]$, respectively.
That is, the rate-determining step should be the latter, except at very high arsenic
vapor pressures, because $[V_{Ga}]$ is usually much smaller than $[V_{As}]$. The formation
energy of $V_{Ga}$ is estimated to be very high (about 3 eV, as will be discussed later).
Therefore, the site transfer diffusion process should be dominant at higher temperatures,
while we must take into account the interstitial diffusion mechanism at lower
temperatures. Let us first discuss site transfer diffusion. The diffusion coefficients
of $Si_{Ga}$ and $Si_{As}$ are the same for this mechanism and can be described in the
following form:
(5.4)
where $\Gamma_{eff}$ is the effective jump rate, which can be described by $\Gamma_{V_{As}}$ and $\Gamma_{V_{Ga}}$,
the jump rates via $V_{As}$ and $V_{Ga}$, respectively (that is, they are proportional to
$k_1$ and $k_2$ in Equations (5.1) and (5.2)).
(5.5)
with
(5.6)
(5.7)
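where $\Delta G^{m}_{V_{As}}$ and $\Delta G^{m}_{V_{Ga}}$ are the corresponding free energies of migration. At low
and middle vapor pressures $\Gamma_{eff} \approx \Gamma_{V_{Ga}}$ holds, and we have
(5.9) $D(Si_{Ga}) = D(Si_{As}) \approx \frac{1}{6} a^{2} \nu_{Ga} \, (P_{As_4})^{1/4} \exp\left(-\frac{\Delta G^{f}_{V_{Ga}} + \Delta G^{m}_{V_{Ga}}}{kT}\right).$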
As shown in Figure 10, our experiment has shown that the diffusion depth increases
monotonically with the $As_4$ pressure at 950 ~ 1000°C, which suggests site
transfer diffusion. However, at the lower temperatures of 900 ~ 875°C, the diffusion
coefficient decreases with increasing vapor pressure in the lower pressure region,
which suggests a contribution of the interstitial diffusion mechanism, as will be
discussed later.
In the above expression $\Delta G^{f}_{V_{Ga}}$ is expressed as
(2.18)
The enthalpy of sublimation of GaAs is known to be 1.14 eV. The enthalpy of
formation of $V_{Ga}$ is estimated as follows. Vacancy formation energies for silicon
and diamond were estimated to be 2.3 eV and 4 eV, respectively. We assume that the
value for $V_{Ga}$ in GaAs lies roughly in the middle of the two, that is, $\Delta H_{V_{Ga}} \approx 3$ eV, and
so $\Delta H^{f}_{V_{Ga}} \approx 2$ eV. On the other hand, the migration enthalpy $\Delta H^{m}_{V_{Ga}}$ in Equation
(5.7) is estimated as follows. In the case of silicon crystals, $\Delta H^{m}_{V_{Si}}$ was estimated
as $\Delta H^{m}_{V_{Si}} \approx 0.33 \sim 1$ eV, while in Section 4 we obtained $\Delta H^{m} \approx 1.6$ eV for the
two-atom migration, so we roughly estimate that $\Delta H^{m}_{V_{Ga}} \approx 1$ eV.
Therefore, our estimate of the diffusion coefficients for the site transfer mechanism
is
(5.10)
(5.3)
as
(5.11)
where n and p are the electron and hole concentrations, and $N_c$ and $N_v$ are their effective
densities of states, respectively; that is, we have the relation $pn = N_c N_v \exp(-E_g/kT)$,
with $E_g$ the band gap energy.
In the experiment shown in Figure 10, the diffused region became n-type, that is,
$[Si_{Ga}] > [Si_{As}]$ holds. If we can assume $[Si_{Ga}] > [Si_{As}]$, then $n \approx N_{Ga}[Si^{+}_{Ga}]$ holds
and we have
[Plot: diffusion depth (μm) versus $P_{As_4}$ (Torr, 1 to 10^3) at temperatures from 875 to 1000 °C.]
Figure 10. Experimental diffusion depth of silicon in GaAs in
Reference [3].
Next, we discuss the interstitial diffusion mechanism, which may become dominant
at lower temperatures. If we assume an isolated Si interstitial atom, $I_{Si}$, rather
than a molecular complex, the following two reaction equations will hold:
(5.13) $Si^{+}_{Ga} = I^{+}_{Si} + V^{0}_{Ga}$; $\Delta G_i$
(5.14)
We consider the case in which $[Si^{+}_{Ga}] > [Si_{As}]$ holds; then Equation (5.13) gives
(5.16) $D(Si_{Ga}) = D_{inter} \frac{1}{[V^{0}_{Ga}]} \exp\left(-\frac{\Delta G_i}{kT}\right)$
(5.18)
which gives
(5.19)
In this case, no vapor pressure dependence is expected because $[I_{As}]$ and $[V_{Ga}]$
have the same $(P_{As_4})^{1/4}$ dependence.
Experimentally, we observe an increase of the diffusion coefficient with decreasing
vapor pressure at lower temperatures. Therefore we assume that silicon
interstitial atoms are isolated. The enthalpy parts of the free energies in Equation
(5.17) are estimated as follows. As a very crude estimate, we assume that $\Delta H_i$
is nearly equal to or less than the formation enthalpy of a Frenkel pair in a silicon crystal,
that is,
$\Delta H_i < \Delta H^{f}_{V_{Si}} + \Delta H^{f}_{I_{Si}}$.
The latter was estimated to be about 3.1 ~ 3.3 eV [15]. Therefore, we tentatively assume
that $\Delta H_i \approx 2.5$ eV. $\Delta H^{m}_{I_{Si}}$ and $\Delta H^{f}_{V_{Ga}}$ are estimated to be about 0.3 eV and 2 eV,
respectively, similar to the earlier discussions.
Then, we have the following expression for the interstitial diffusion mechanism.
(5.20)
Figure 11 shows the calculation according to Equations (5.8) and (5.20). The constant
$D_0$ was fitted to the experimental point at the highest vapor pressure and
temperature, while $D_0'$ was fitted at the lowest vapor pressure and temperature.
(6.1)
(2.16) GaAs (solid) = $I_{Ga} + \frac{1}{4} As_4$; $\Delta G_{I_{Ga}}$
However, the above equation implicitly assumes the existence
of a free surface, which is not the case at the heterostructure interface. Actually,
a large amount of $V^{0}_{Ga}$ will be generated by the reaction in Equation (6.1) because
$\Delta G_{Zn}$ should be much smaller than the formation energy of $V_{Ga}$ via Equation (2.17).
These vacancies must recombine with $I_{Ga}$ or go out to the free surface for equilibration,
but the concentration of $I_{Ga}$ is much lower than the equilibrium level, corresponding
to the sharp decrease of the gallium chemical potential. Although a part of the $V_{Ga}$'s
go out to the free surface, most of them contribute to the disordering at the
interface.
In such a case we should consider that Equation (6.1) is equilibrated in itself,
without the assistance of Equations (2.16) and (2.17). That is, we have
but the condition $[Zn^{+}_{i}] = [V^{0}_{Ga}]$ must hold because $Zn^{+}_{i}$ and $V^{0}_{Ga}$ are generated as
a pair.
Therefore, the concentration of $V_{Ga}$ equilibrated by the reaction equation (6.1)
is
(6.3) 0 12
VGa
[ = -
[Zn.l(p/Nv)2exp (!:l.Gzn)
--,;;y- .
It is understood that $[V^{0}_{Ga}]$ at the interface is not determined by the arsenic vapor
pressure, but by the zinc concentration, and its activation energy is
$\Delta H_{Zn}/2$, which should be much smaller than $\Delta H^{f}_{V_{Ga}}$, the formation enthalpy in the
homogeneously controlled crystal.
This excess $V^{0}_{Ga}$ concentration is the origin of the interface disordering. In the
case of silicon diffusion, both the interstitial silicon formation and the site transfer
reaction can generate $V_{Ga}$. At lower temperatures where the interstitial diffusion is
dominant, the reaction
(5.13)
(6.5) $\frac{[I^{+}_{Si}][V^{0}_{Ga}]}{[Si^{+}_{Ga}]} = \exp\left(-\frac{\Delta G_i}{kT}\right)$
(6.6) $[V^{0}_{Ga}] = [Si^{+}_{Ga}]^{1/2} \exp\left(-\frac{\Delta G_i}{2kT}\right)$
On the other hand, at higher temperatures where the site transfer diffusion is dominant,
the reaction equation which should be equilibrated is Equation (5.3), so
that we have the equilibrium relation (5.11). But this time, the condition
$[Si_{As}] = [V^{0}_{Ga}]$ must hold. As for $[V_{As}]$, we assume it is controlled by the external
vapor pressure, that is,
(2.11)
Therefore, we have
Both Equations (6.6) and (6.7) show that the concentration of gallium
vacancies, which cause the interface disordering, increases strongly with
silicon concentration. We give the calculated result for Equation (6.6), which is
the simplest case. As was discussed in Section 5, we assume that $\Delta H_i \approx 2.5$ eV.
Also, we simply assume $\Delta S_i = 0$. As shown in Figure 13, the concentration of
gallium vacancies, $N_{V_{Ga}}$, at the interface can be much higher than in homogeneous
crystals. In the case of thermal equilibrium, $N_{V_{Ga}}$ is expected to be of the order of
$10^{11-12}$ cm$^{-3}$ at 1000°C, if we assume $\Delta H_{V_{Ga}} = 3$ eV and $\Delta S_{V_{Ga}} = 0$ in Equations
(2.18) and (2.20). Therefore $N_{V_{Ga}}$ shown in Figure 13 is more than $10^{4}$ times
higher than in homogeneous crystals. These excess $V_{Ga}$'s in the vicinity of
the interface are rapidly occupied by aluminum interstitial atoms diffusing from
the AlAs region, driven by the steep slope of the aluminum chemical potential.
The same phenomenon also occurs in the AlAs region. These processes cause the
gallium-aluminum mixing in the interface region.
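To get a feel for how strongly the activation enthalpy controls these concentrations, the following minimal sketch evaluates the bare Boltzmann factor exp(-ΔH/kT) at 1000°C for two enthalpies taken from the text: 3 eV (the thermal-equilibrium vacancy formation enthalpy) and 1.25 eV (half of the assumed 2.5 eV, following the 2kT denominator in Equation (6.6)). The function name is hypothetical, and all prefactors (site density, vapor pressure, silicon concentration, entropy terms) are deliberately ignored, so the printed numbers are not the concentrations plotted in Figure 13.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_factor(delta_h_ev: float, temp_c: float) -> float:
    """Return exp(-dH / kT) for an enthalpy in eV and a temperature in deg C."""
    temp_k = temp_c + 273.15
    return math.exp(-delta_h_ev / (K_B_EV * temp_k))


if __name__ == "__main__":
    t_c = 1000.0
    equilibrium = boltzmann_factor(3.0, t_c)      # dH_VGa = 3 eV (from the text)
    interface = boltzmann_factor(2.5 / 2.0, t_c)  # dH_i / 2 = 1.25 eV (from the text)
    print(f"exp(-3.00 eV / kT) = {equilibrium:.3e}")
    print(f"exp(-1.25 eV / kT) = {interface:.3e}")
    print(f"ratio              = {interface / equilibrium:.3e}")
```

Only the exponential parts are compared here; a real comparison would also need the concentration prefactors appearing in the respective equations.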
REFERENCES
[7] J. NISHIZAWA, Y. OKUNO AND K. SUTO, Nearly perfect Crystal Growth in III-V and II-VI
compound semiconductors, JARECT Vol. 19, Semiconductor Technologies (1986), edited by
J. Nishizawa, OHM & North-Holland, 1986, pp. 17-80.
[8] J. NISHIZAWA, Stoichiometry Control for Growth of III-V Crystals, J. Crystal Growth, 99
(1990), pp. 1-8.
[9] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Stoichiometry-Dependent Deep Levels in n-type
GaAs, J. Appl. Phys., 67 (1990), pp. 1884-1896.
[10] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Formation Energy of Excess Arsenic atoms in
n-type GaAs, Phys. Rev. Letters, 65 (1990), pp. 2555-2558.
[11] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Stoichiometry-Dependent Deep Levels in p-GaAs
prepared by annealing under excess arsenic vapor pressure, J. Appl. Phys., 69 (1991), pp. 1446-1453.
[12] A.B.Y. YOUNG AND G.L. PEARSON, Diffusion of Sulfur in Gallium Phosphide and Gallium
Arsenide, J. Phys. Chem. Solids, 31 (1970), pp. 517-527.
[13] B. TUCK, Atomic Diffusion in III-V Semiconductors, Adam Hilger, Bristol and Philadelphia,
1988.
[14] K.K. SHIH, J.W. ALLEN AND G.L. PEARSON, Diffusion of Zinc in Gallium Arsenide under
Excess Arsenic Pressure, J. Phys. Chem. Solids, 29 (1968), p. 379.
[15] K.H. BENNEMANN, New Method for Treating Lattice Point Defects in Covalent Crystals,
Phys. Rev., 137 (1965), pp. A1497-1514.
[16] R.R. HASIGUTI, Calculation of the Properties of Vacancies and Interstitials, p. 27 (U.S.
Government Printing Office, Washington, D.C., 1966).
[17] B. GOLDSTEIN, Diffusion in Compound Semiconductors, Phys. Rev., 121 (1961), pp. 1305-131]
[18] M. KASHIWAGI, (to appear).
[19] M.E. GREINER AND J.F. GIBBONS, Diffusion of silicon in gallium arsenide using rapid thermal
processing: Experiment and model, Appl. Phys. Lett., 44 (1984), pp. 750-752.
[20] W.D. LAIDIG, N. HOLONYAK, JR., AND M.D. CAMRAS, Disorder of an AlAs-GaAs superlattice
by impurity diffusion, Appl. Phys. Lett., 38 (1981), pp. 776-778.
[21] K. MEEHAN, N. HOLONYAK, JR., J.M. BROWN, M.A. NIXON AND P. GAVRILOVIC, Disorder
of an Al_xGa_{1-x}As-GaAs superlattice by donor diffusion, Appl. Phys. Lett., 45 (1984), pp.
549-551.
THEORY OF A STOCHASTIC ALGORITHM FOR
CAPACITANCE EXTRACTION IN INTEGRATED CIRCUITS*
Abstract. We present the theory of a novel stochastic algorithm for high-speed capacitance
extraction in complex integrated circuits. The algorithm is most closely related to a statistical
procedure for solving Laplace's equation known as the floating random-walk method. Our analysis
begins with surface Green's functions for Laplace's equation on a scalable square domain. From
them, we obtain integrals for electric potential and electric field at the domain center. An electrode-
capacitance integral is next derived. This integral is expanded as an infinite sum, and probability
rules that statistically evaluate the sum are deduced. These rules define the algorithm.
* Written material in this paper has been excerpted from a larger work in draft form which will
be submitted to Solid State Electronics for future publication.
† Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute,
Troy, NY 12180-3590
where i = 1, ..., N. Above, the $V_1, \ldots, V_N$ and $q_1, \ldots, q_N$ denote electrode voltages
and their corresponding charges per unit length in z. Appropriately, $C_{ij}$ is a capacitance
per unit length in z as well.
Gauss's law permits us to write
(2)
1 Diagonal elements $C_{11}, C_{22}, \ldots$ represent what we call electrode "self-capacitances". They
serve no use in electrical modeling.
FIG. 2. Examples of initial maximal squares. Boundaries $S_{a(\xi)}$ are of edge size $a(\xi)$ and are centered
on Gaussian-boundary points (dark dots).
(4)
(5)
FIG. 3. Examples of subsequent maximal squares (initial square shaded). Each boundary $S_{a(\xi'^{\cdots})}$
can be decomposed into an electrode part and a non-electrode part. These squares
are centered on previous square-boundary points (dark dots).
qi - 10i d{ 8i(e) {
(6) + }.
Expansion (6) depends only on electrode potentials. Grounding all electrodes
except the jth, which we set to some arbitrary voltage $V_j$, reduces (1) to
(7)
for i = 1, ..., N ($i \neq j$). Since the $q_i$ of (6) are linear functions of $V_j$, $C_{ij}$ can be
written as an integral expansion independent of $V_j$. This argument is valid for any
possible j; thus, off-diagonal elements of the capacitance matrix are independent of
electrode voltages, and we have
(8) + }.
The boundary$^4$ $S^{j}_{a(\xi'^{\cdots})}$ is simply the portion of $S_{a(\xi'^{\cdots})}$ coincident with electrode j.
3. Extraction algorithm and proof. In practice, direct evaluation of (8) is
computationally prohibitive, especially for integrated-circuit geometries where N is
usually large. We resolve this difficulty with a stochastic algorithm that estimates
the $C_{ij}$:
1. Partition each integration variable in (8) into small segments of, possibly different,
size $\Delta\xi_r'^{\cdots}$ (r = 1, 2, ...). Define corresponding discrete variables at each
segment center $\xi_r'^{\cdots}$.
4 We mean here the superscript $'^{\cdots}$ to possibly include the unprimed case. That is, $\xi'^{\cdots}$ is one
of $\xi, \xi', \xi'', \ldots$.
FIG. 4. Examples of first-, second-, and third-order walks. For clarity, the walks have been drawn
to start at the same boundary-point $\xi$.
of (h.
2. Introduce new variables $N_i$ and $W_{ij}$, the meaning of which will be made clear
shortly. Initially, set $N_i = W_{ij} = 0$ for all ij (i, j = 1, ..., N). Set i = 1.
3. Randomly pick a $\xi$, say $\xi_r$, on $\mathcal{G}_i$, with discrete probability distribution
$P_\xi(\xi_r) = \int_{\Delta\xi_r} d\xi\, S_i(\xi)$.
4. Randomly pick a $\xi'$, say $\xi'_r$, on $S_{a(\xi_r)}$, with discrete probability distribution,
conditioned by $\xi_r$, $P_{\xi'}(\xi_r|\xi'_r) = \int_{\Delta\xi'_r} d\xi'\, G[a(\xi_r)|\xi']$.
5. If the last variable picked in Step 4 is not on an electrode boundary, then change
Step 4 as follows: mark every occurrence of $\xi$ with an additional prime$^5$ ',
and repeat Step 4.
6. If the last variable picked in Step 4 is on the jth electrode boundary, then replace
$N_i$ with $N_i + 1$ and $W_{ij}$ with $W_{ij} + w(\xi_r|\xi'_r)$.
7. If $N_i$ is sufficiently large, go to Step 8. Else, change Step 4 to its original form
(written here above) and go to Step 3.
8. $C_{ij} = W_{ij}/N_i$ for j = 1, ..., N. If i = N, then stop. Else, replace i with i + 1,
change Step 4 to its original form (written here above), and go to Step 3.
We will now explain why this capacitance-extraction algorithm works. For a given
starting electrode i, enumerate a possible set of $N_i$ random trajectories, or "walks",
generated by the algorithm that start on $\mathcal{G}_i$ and end on any electrode. Figure 4 gives
examples of such first-order ($\xi_r \to \xi'_r$), second-order ($\xi_r \to \xi'_r \to \xi''_r$), and third-order
($\xi_r \to \xi'_r \to \xi''_r \to \xi'''_r$) walks.
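The hop-until-absorbed structure of such walks, and the accumulate-then-average bookkeeping of Steps 2 to 8, can be illustrated with a much simpler relative of the algorithm. The sketch below is not the extraction algorithm of this paper: it is a plain floating random walk on maximal circles (rather than squares with Green's-function weights), and it estimates a potential inside the unit disk rather than a capacitance. The boundary condition u = x, the tolerance `eps`, and all function names are assumptions chosen so that the exact answer, u(x0, y0) = x0, is known.

```python
import math
import random


def boundary_potential(x: float, y: float) -> float:
    """Hypothetical boundary condition on the unit circle: u = x."""
    return x


def walk_to_boundary(x: float, y: float, eps: float, rng: random.Random) -> tuple:
    """Hop on maximal circles until the walker is within eps of the unit circle."""
    while True:
        r = math.hypot(x, y)
        d = 1.0 - r                     # radius of the largest circle centered here
        if d < eps:
            return (x / r, y / r)       # snap to the nearest boundary point
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += d * math.cos(theta)        # uniformly distributed point on that circle
        y += d * math.sin(theta)


def estimate_potential(x0: float, y0: float, n_walks: int = 5000,
                       eps: float = 1e-3, seed: int = 0) -> float:
    """Average the boundary values reached by n_walks independent walks."""
    rng = random.Random(seed)
    total = 0.0                         # running sum, analogous to W in Step 6
    for _ in range(n_walks):            # n_walks is analogous to N in Step 7
        bx, by = walk_to_boundary(x0, y0, eps, rng)
        total += boundary_potential(bx, by)
    return total / n_walks              # analogous to the ratio W / N in Step 8


if __name__ == "__main__":
    print(estimate_potential(0.3, 0.2))  # should be close to 0.3
```

The statistical error shrinks roughly as the inverse square root of the number of walks, which is why "sufficiently large" in Step 7 is a trade-off between accuracy and run time.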
(9) $C^{(1)}_{ij} \approx \frac{W^{(1)}_{ij}}{M} = \sum_{\mathcal{G}_i} P_\xi(\xi_r) \sum_{S^{j}_{a(\xi_r)}} w(\xi_r|\xi'_r)\, P_{\xi'}(\xi_r|\xi'_r).$
The sums in (9) are to be taken over discrete points $\xi_r$ and $\xi'_r$ on their respective
surfaces $\mathcal{G}_i$ and $S^{j}_{a(\xi_r)}$. In addition, we have designated first-order-walk contributions
with the superscript '(1)'. Observe that (9) and the discussion preceding it are valid
for any starting Gaussian surface and ending electrode, that is, any ij-pair ($i \neq j$).
Remember, as well, that (9) is a good approximation to $C^{(1)}_{ij}$ when M is large: large
enough to ensure that the known probability distributions $P_\xi$ and $P_{\xi'}$ adequately
represent the actual distributions in our enumeration of walks.
A connection with (8) follows upon replacing $P_\xi$ and $P_{\xi'}$ in (9) with their integral
equivalents from Steps 3 and 4 of the algorithm. We get
(10) $C^{(1)}_{ij} \approx \sum_{\mathcal{G}_i} \left[\int_{\Delta\xi_r} d\xi\, S_i(\xi)\right] \sum_{S^{j}_{a(\xi_r)}} w(\xi_r|\xi'_r) \left\{\int_{\Delta\xi'_r} d\xi'\, G[a(\xi_r)|\xi']\right\}.$
The $\Delta\xi'_r$ are assumed small enough so that $w(\xi_r|\xi'_r)$ varies little over their extent.
This permits us to change $\xi'_r$ to $\xi'$ in w and to move w within the rightmost integrand
of (10). Hence,
(11) $C^{(1)}_{ij} \approx \sum_{\mathcal{G}_i} \left[\int_{\Delta\xi_r} d\xi\, S_i(\xi)\right] \sum_{S^{j}_{a(\xi_r)}} \int_{\Delta\xi'_r} d\xi'\, w(\xi_r|\xi')\, G[a(\xi_r)|\xi'].$
(12) $C^{(1)}_{ij} \approx \sum_{\mathcal{G}_i} \left[\int_{\Delta\xi_r} d\xi\, S_i(\xi)\right] \int_{S^{j}_{a(\xi_r)}} d\xi'\, w(\xi_r|\xi')\, G[a(\xi_r)|\xi'].$
Lastly, if we assume the $\Delta\xi_r$ are small enough, so that the rightmost integral in (12)
varies little over their extent, we can change $\xi_r$ to $\xi$ in w, G, and a, and move the
integral within the leftmost integrand. Evaluating the remaining sum, as before, gives
our final result:
(13) $C^{(1)}_{ij} \approx \int_{\mathcal{G}_i} d\xi\, S_i(\xi) \int_{S^{j}_{a(\xi)}} d\xi'\, w(\xi|\xi')\, G[a(\xi)|\xi'].$
(14)
where $W^{(n)}_{ij}$ is the sum of weight-functions w for all nth-order random walks starting
on Gaussian boundary $\mathcal{G}_i$ and ending on electrode j.
REFERENCES
[1) P.M. Morse and H. Feshbach, Methods of Theoretical Physics, Part I, McGraw-Hill, New York,
1953.
(2) A.H. Zemanian, "A Finite-Difference Procedure for the Exterior Problem Inherent in Capac-
itance Computations for VLSI Interconnections", IEEE 1rans. Electron Devices, vol. 35,
pp. 985-991, 1988.
(3) P.E. Cottrell and E.M. Buturla, "VLSI Wiring Capacitance", IBM J. Res. Develop., vol. 29,
pp. 277-288, 1985.
(4) F.s. Lai, "Coupling Capacitances in VLSI Circuits Calculated by Multi-Dimensional Discrete
Fourier Series", vol. 32, pp. 141-148,1989.
(5) A.E. Ruehli and P.A. Brennan, "Efficient Capacitance Calculations for Three-Dimensional
Multiconductor Systems", IEEE 1rans. Microwave Theory Tech., vol. MTT-21, pp. 76-82,
1973.
(6) G.M. Brown, "Monte Carlo Methods" in Modern Mathematics for Engineers, E.F. Beckenbach,
editor, McGraw-Hill, New York, 1956.
(7) A. Haji-Sheikh and E.M. Sparrow, "The Solution of Heat Conduction Problems by Probability
Methods", Trans. ASME, vol. C-89, pp. 121-131, 1967. (See, in particular, the section
"Authors Closure", and references therein.)
MOMENT-MATCHING APPROXIMATIONS FOR
LINEAR(IZED) CIRCUIT ANALYSIS*
* This work was supported by the National Science Foundation under the grant MIP #9007917.
† Computer Engineering Research Center, Department of Electrical & Computer Engineering,
The University of Texas at Austin, Austin, Texas 78712
(1) $\int_0^\infty t^{j}\,[h(t) - \hat{h}(t)]\,dt = 0, \quad j = 0, 1, 2, \ldots, (m + n)$
where m and n are respectively the degrees of the numerator and denominator of
$\hat{H}(s)$. Equation (1) states that the first (m + n + 1) moments of h
and $\hat{h}$ are equal. Finally, it can be shown that Condition 3 is satisfied for sufficiently
large values of (m + n) [26].
The existence of the moments of the actual system function h(t) is ensured if h(t) is
piecewise continuous in $[0, \infty)$ and of exponential order $O[\exp(\sigma t)]$, $t \to \infty$, $\sigma < 0$ [26].
For passive, linear RLC circuits, which are asymptotically stable, the responses are
smooth and piecewise continuous in $[0, \infty)$, thus satisfying both the above requirements.
The preceding discussion developed a reduced-order model $\hat{H}(s)$ that is termed a
moment-approximant. The Padé approximants are a similar class of approximating
functions that are related to moment-approximants. A Padé approximant, denoted
[P/Q], is a rational function approximation of a transfer function H(s), analytic about
s = 0, such that the first (P + Q + 1) coefficients of the Maclaurin expansions of [P/Q]
and H(s) are equal [1,2]. In the above definition, P and Q refer to the degrees of
the numerator and denominator polynomials, respectively, in the Padé approximant.
To establish the relation between the moment- and the Pade-approximants, con-
sider the Laplace transform definition of an analytic function h(t):
(4)
exists, the nth moment $M_n$ of the function about the origin is defined as [9]
(3)
It is shown in [9, 17] that the normalized moment $M_1/M_0$ is analogous to the mean of a probability
distribution function.
(5) H(s) =
(6)
In other words, the time moments of a function h(t) are related to the coefficients
of the Maclaurin series expansion of H(s). The following theorem by Zakian [26]
explicitly defines the relation between the moment- and the Padé-approximants:
THEOREM 2.1. Let h be piecewise continuous on $[0, \infty)$ and of exponential order
$O[\exp(\sigma t)]$, $t \to \infty$, $\sigma < 0$, and let the Laplace transform $\mathcal{L}\{\hat{h}\}$ be an asymptotically
stable (m/n) rational function; then $\hat{h}$ is the (m + n) moment-approximant of h if and
only if $\mathcal{L}\{\hat{h}\}$ is the Padé approximant [m/n] of $\mathcal{L}\{h\}$.
The reference to $\mathcal{L}\{\hat{h}\}$ being asymptotically stable is an important one, and will be
addressed in more detail in a later section on stability.
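The link between time moments and series coefficients can be checked numerically on a simple case. The sketch below assumes the usual definition of the nth moment, M_n = ∫ t^n h(t) dt over [0, ∞), and uses the illustrative response h(t) = exp(-t), for which H(s) = 1/(1 + s) has Maclaurin coefficients (-1)^n. Expanding exp(-st) in the Laplace integral gives the coefficient of s^n as (-1)^n M_n / n!. The function names, the integration cutoff, and the step count are hypothetical choices for illustration.

```python
import math


def time_moment(n: int, t_max: float = 60.0, steps: int = 60000) -> float:
    """Trapezoidal estimate of M_n = integral from 0 to t_max of t^n * exp(-t) dt."""
    dt = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * (t ** n) * math.exp(-t)
    return total * dt


def maclaurin_coefficient_from_moment(n: int) -> float:
    """Coefficient of s^n in H(s), computed as (-1)^n * M_n / n!."""
    return (-1) ** n * time_moment(n) / math.factorial(n)


if __name__ == "__main__":
    for n in range(4):
        print(n, time_moment(n), maclaurin_coefficient_from_moment(n))
```

For this response the moments come out close to n! and the recovered coefficients alternate between +1 and -1, as the series of 1/(1 + s) requires.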
(7)
where $\mathbf{x}$ is the n-dimensional state vector and $\mathbf{u}$ is an m-dimensional excitation vector
of impulses. Such a circuit description can be found for most LLTI circuits. Modeling
the state variables permits the modeling of any output variable as a linear combination
of these state variables. While the following development can be applied to other
excitation forms such as step or ramp voltages, we consider only impulse excitations
since from them, all other responses can be obtained by analytical convolution and
superposition.
The Laplace transform solution of Eq.(7) is
(8)
(9)
Focusing on a specific component of X(s), say the ith, the coefficients of the series
expansion can be denoted as:
$m^{i}_{0} = [-A^{-1} b]_i$
$m^{i}_{1} = [-A^{-2} b]_i$
where $m^{i}_{j}$ denotes the j-th coefficient in the series expansion of the i-th state variable.
The efficiency of AWE lies in the recursive computation of these coefficients $m^{i}_{j}$.
As explained in [20], the explicit construction/inversion of the state matrix A is
not required. Instead, finding $A^{-1}$ from Eq.(7) is equivalent to solving for the port
voltages of the open-circuit capacitance ports and port currents of the short-circuit
inductance ports [6]. To illustrate, consider the circuit in Fig.1(a). The response
coefficients can be obtained from the circuit in Fig.1(b), where all the capacitors have
been replaced by current sources, and inductors by voltage sources.
The recursion in Eq.(10) is initiated by replacing the source V in Fig.1(b) by a
constant voltage of value 1, and setting all the capacitor and inductor sources to zero.
Solving this dc circuit for the capacitor voltages and inductor currents yields the first
coefficient $m_0$ for each state variable. This is equivalent to substituting $\mathbf{u} = 1$, $\dot{\mathbf{x}} = 0$
in Eq.(7), and solving for $\mathbf{x}$. This yields $\mathbf{x} = -A^{-1} b$, which, from Eq.(10), is the first
coefficient $m_0$.
Higher-order coefficients $m_j$ are then recursively obtained by shorting the excitation
source V, setting the capacitor-current sources equal to $-C_i m^{i}_{j-1}$, the inductor-voltage
sources equal to $-L_i m^{i}_{j-1}$, and solving for the port voltages and currents. This
is again equivalent to setting $\mathbf{u} = 0$, $\dot{\mathbf{x}} = \mathbf{x}_{j-1}$ in Eq.(7) and solving for $\mathbf{x}_j$ [13, 20].
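In state-space terms, the recursion just described amounts to one linear solve per coefficient: first A x = -b, then A x_j = x_{j-1}. The sketch below shows this for a hypothetical 2x2 state matrix, using Cramer's rule in place of a circuit solver. The matrix, the input vector, and the function names are assumptions for illustration; a real AWE implementation would reuse a single factorization of the dc circuit instead.

```python
def solve_2x2(a, rhs):
    """Solve the 2x2 linear system a @ x = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det
    return [x0, x1]


def moments(a, b, count):
    """Return [m_0, ..., m_{count-1}] with m_0 = -A^{-1} b and m_j = A^{-1} m_{j-1}."""
    m = solve_2x2(a, [-b[0], -b[1]])    # dc solve: 0 = A x + b
    out = [m]
    for _ in range(count - 1):
        m = solve_2x2(a, m)             # next coefficient: A x_j = x_{j-1}
        out.append(m)
    return out


if __name__ == "__main__":
    A = [[-3.0, 1.0], [1.0, -2.0]]      # hypothetical stable state matrix
    b = [1.0, 0.0]                      # hypothetical input vector
    for j, m in enumerate(moments(A, b, 4)):
        print(j, m)
```

Each entry m[i] of the j-th vector is the coefficient written $m^{i}_{j}$ in the text; summing m_j s^j for small s should reproduce (sI - A)^{-1} b.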
In AWE, the reduced qth-order model of the ith state variable has the form:
(11) $[\hat{X}(s)]_i = \sum_{l=1}^{q} \frac{k^{i}_{l}}{s - p^{i}_{l}}$
where the terms $p^{i}_{l}$ are the q unique dominant-pole approximations and the terms $k^{i}_{l}$
are the corresponding residues. The values of $p^{i}_{l}$ and $k^{i}_{l}$ are computed such that the
model in Eq.(11) best approximates the actual response in Eq.(9) in the sense of the
Padé approximation:
Cross-multiplying and equating the coefficients of $s^{q}, s^{q+1}, \ldots$ yields the following
set of linear equations for the denominator coefficients of Eq.(12) [13, 20]:
m1
m; ...
i
•.• m q_ 1
m~
i 1
.
.. . .. ...
(13)
(14)
are the dominant pole approximations.
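To solve for the corresponding residues $k^{i}_{l}$, the first q coefficients of the s terms in
the expansion of Eq.(11) are matched to those of Eq.(9) to obtain the system:
(15) $-\left(\frac{k^{i}_{1}}{(p^{i}_{1})^{q}} + \frac{k^{i}_{2}}{(p^{i}_{2})^{q}} + \cdots + \frac{k^{i}_{q}}{(p^{i}_{q})^{q}}\right) = m^{i}_{q-1}$
This may be rewritten in matrix form [20] as
(16) $\mathbf{k}^{i} = -V^{-1} \mathbf{m}^{i}_{l}$
where $\mathbf{m}^{i}_{l}$ is a vector of the low-order coefficients, $(m^{i}_{0}, m^{i}_{1}, \ldots, m^{i}_{q-1})^{T}$, and V is the
matrix
$V = \begin{bmatrix} (p^{i}_{1})^{-1} & (p^{i}_{2})^{-1} & \cdots & (p^{i}_{q})^{-1} \\ (p^{i}_{1})^{-2} & (p^{i}_{2})^{-2} & \cdots & (p^{i}_{q})^{-2} \\ \vdots & \vdots & & \vdots \\ (p^{i}_{1})^{-q} & (p^{i}_{2})^{-q} & \cdots & (p^{i}_{q})^{-q} \end{bmatrix}$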
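A small worked case may make the two solves concrete. The sketch below takes q = 2 and a hypothetical exact response 2/(s + 1) + 3/(s + 10), generates its first four series coefficients, recovers the poles from a 2x2 Hankel system, and then recovers the residues from the 2x2 version of the residue system above. The denominator convention 1 + a1 s + a2 s^2, the use of Cramer's rule, and all function names are assumptions made for this illustration, not necessarily the exact form of Eqs.(13) and (14).

```python
import cmath


def series_coefficients(poles, residues, count):
    """m_j = -sum_l k_l / p_l**(j+1), the series coefficients of sum_l k_l/(s - p_l)."""
    return [-sum(k / p ** (j + 1) for p, k in zip(poles, residues))
            for j in range(count)]


def poles_from_moments(m):
    """Second-order (q = 2) poles from the four coefficients m[0..3]."""
    # Solve [[m0, m1], [m1, m2]] @ [a2, a1] = -[m2, m3] by Cramer's rule,
    # where the assumed model denominator is 1 + a1*s + a2*s**2.
    det = m[0] * m[2] - m[1] * m[1]
    a2 = (-m[2] * m[2] + m[1] * m[3]) / det
    a1 = (-m[0] * m[3] + m[1] * m[2]) / det
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]


def residues_from_poles(p, m):
    """Solve -(k1/p1**(j+1) + k2/p2**(j+1)) = m[j] for j = 0, 1."""
    v = [[1.0 / p[0], 1.0 / p[1]],
         [1.0 / p[0] ** 2, 1.0 / p[1] ** 2]]
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    k1 = (-m[0] * v[1][1] + v[0][1] * m[1]) / det
    k2 = (-v[0][0] * m[1] + m[0] * v[1][0]) / det
    return [k1, k2]


if __name__ == "__main__":
    m = series_coefficients([-1.0, -10.0], [2.0, 3.0], 4)
    p = poles_from_moments(m)
    print("poles:", p)
    print("residues:", residues_from_poles(p, m))
```

Because the test response has exactly two poles, the second-order model reproduces it; for a larger circuit the same steps would return only the dominant-pole approximations.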
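(18) $H^{(0)} = \begin{bmatrix} m^{i}_{0} & m^{i}_{1} & \cdots & m^{i}_{q-1} \\ m^{i}_{1} & m^{i}_{2} & \cdots & m^{i}_{q} \\ \vdots & \vdots & & \vdots \\ m^{i}_{q-1} & m^{i}_{q} & \cdots & m^{i}_{2q-2} \end{bmatrix}$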
This is also recognized to be the matrix in Eq.(13) for the roots of the characteristic
polynomial.
From [12, 5], it is seen that the degree of a proper rational function is equal
to the rank of the Hankel matrix representing its power series expansion. In the
development of $[\hat{X}(s)]_i$, which is indeed a proper rational-function approximation,
the degree would represent the order of approximation that is sought, i.e., the degree
of $[\hat{X}(s)]_i$. Further, since, by assumption, the series expansions of the actual response
and the reduced-order model are to agree at least as far as the first 2q coefficients,
Equation (18) would represent the series expansion of $[\hat{X}(s)]_i$ as well. Hence, an upper
limit on the order of approximation would be $q \le \rho(H^{(0)})$, where $\rho$ denotes the rank.
Attempting to obtain an order of approximation higher than this limit would result
in the truncation noise appearing as unstable poles or poles with relatively
insignificant residues. (A similar approach was also suggested to us by Pak Chan [4].)
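The rank statement can be seen on a toy case. The sketch below builds Hankel matrices from the series coefficients of a hypothetical two-pole response, 2/(s + 1) + 3/(s + 10), and shows that the 2x2 matrix is nonsingular while the 3x3 matrix is singular to rounding error, so asking for a third-order model of this response would mean solving a singular system. The response and the function names are assumptions for illustration only.

```python
def two_pole_coefficients(count):
    """Series coefficients of the illustrative response 2/(s+1) + 3/(s+10)."""
    return [-(2.0 / (-1.0) ** (j + 1) + 3.0 / (-10.0) ** (j + 1))
            for j in range(count)]


def hankel(m, q):
    """q x q Hankel matrix whose entry (r, c) equals m[r + c]."""
    return [[m[r + c] for c in range(q)] for r in range(q)]


def det2(h):
    """Determinant of a 2x2 matrix."""
    return h[0][0] * h[1][1] - h[0][1] * h[1][0]


def det3(h):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
            - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
            + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0]))


if __name__ == "__main__":
    m = two_pole_coefficients(5)
    print("det of 2x2 Hankel:", det2(hankel(m, 2)))
    print("det of 3x3 Hankel:", det3(hankel(m, 3)))
```

In floating-point arithmetic the 3x3 determinant is not exactly zero, which is the small-scale version of the numerical noise discussed in the text.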
Thus, it might not be possible to fit the signal bandwidth with a set of poles
spanning that bandwidth. Rather, the order of approximation is increased until
either the bandwidth requirement is satisfied, or there is amplification of numerical
noise in the form of poles with vanishing residues.
nant poles and residues of the response at the output node are shown in Table 1 for
increasing orders of approximation. To obtain an approximation for a signal bandwidth
of, say, 5e+8 radians, a 3rd-order approximation would be sufficient, since
all the poles below that frequency have converged and do not shift appreciably at
higher orders. However, for signal frequencies much greater than 1e+11, attempting
to obtain models of order greater than 5 would yield poles with relatively insignificant
residues that do not influence the response. These poles would represent the
magnification of numerical noise and occur at random locations, as illustrated by the
"noise" poles in the 6th- and 7th-order approximations in Table 1.
TABLE 1
Poles and residues at output of 4000-node RC tree for increasing orders of approximation.
The next section introduces concerns of instability that are inherent to moment-
matching and Pade approximation techniques, in particular. These concerns play an
important part in the actual approximation process.
TABLE 2
Illustration of sensitivity of moment-matching scheme to noise in the coefficient values of response
at node 2000 of 4000-node RC tree (perturbed coefficient in box).
FIG. 3. Pole-pattern of an artificial system function.
TABLE 3
Effect of the location of system zeros on the stability of the reduced-order models of a 4-pole, 3-zero
function.

System poles: -1, -10, -15, -100

Case     Quantity            Values
Case 1   System zeros        -2       -20        -40
         O(1)-model poles    -1.66
         O(2)-model poles    -1.04    -22.31
         O(3)-model poles    -1.00    -9.23      -158.87
Case 2   System zeros        -2       -13        +6
         O(1)-model poles    -1.30
         O(2)-model poles    -1.16    +895.36
         O(3)-model poles    -1.00    -11.08     -104.44
Case 3   System zeros        +19      +8         -6
         O(1)-model poles    -0.84
         O(2)-model poles    -1.00    +8../.12
         O(3)-model poles    -1.00    +3.61      +7.4-63
However, in some cases, using the scaled coefficient values may still result in
the near-singularity of the Hankel matrix in Eq.(18). This was demonstrated by
Huang [13] who showed that the higher-order coefficients are increasingly influenced
by the minimal poles. Further, the contribution of the high-frequency poles decreases
with the ratio of the magnitudes of the low- and high-frequency poles. The effect of
scaling is not evinced here, since the ratio of the magnitudes of the poles is unaffected.
To overcome this, a method of frequency shifting, termed uniform predistortion,
was suggested in [13]. As compared to frequency scaling, where the energy-storage
elements are scaled, frequency shifting involves adding proportional resistors in parallel
or series to the energy-storage elements. This has the effect of moving the jω-axis to
the right, as shown in Fig.4, and thus increasing the ratio of low- to high-frequency
poles. With respect to Fig.4,
$\frac{p_1 + \lambda}{p_2 + \lambda} > \frac{p_1}{p_2}.$
The degree of shift, λ, of the jω-axis determines the change in the ratio of pole
magnitudes.
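As a quick numerical illustration of this inequality, the sketch below compares the ratio of pole magnitudes before and after a shift. The pole values, the shift, and the function name are hypothetical; the pole magnitudes are simply increased by the shift, as in the inequality above.

```python
def pole_ratio(low: float, high: float, shift: float = 0.0) -> float:
    """Ratio of low- to high-frequency pole magnitudes after a shift of the axis."""
    return (abs(low) + shift) / (abs(high) + shift)


if __name__ == "__main__":
    p_low, p_high = -1.0, -1000.0          # hypothetical widely separated poles
    print("unshifted ratio:", pole_ratio(p_low, p_high))
    print("shifted ratio:  ", pole_ratio(p_low, p_high, shift=10.0))
```

A larger ratio means the high-frequency pole contributes relatively more to the higher-order coefficients, which is the stated purpose of the shift.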
Another technique for overcoming numerical instability, for the special case of passive,
linear RC-interconnect circuits, is described in [11, 10]. This approach attempts
to map a set of series coefficients of the homogeneous step response to a stable dominant
pole-residue representation using constrained optimization. This is facilitated by
the a priori knowledge that the poles of passive, linear RC-circuits are real and negative.
Hence, for a stable approximation, the model-poles should be real and negative
as well. This forms a nonlinear inequality constraint that can be easily incorporated in
the form of a variable transformation, $p_j = -\exp(x_j)$, on the system in Eq.(15). The
resultant, transformed system is optimized in x-space using unconstrained techniques.
FIG. 4. Frequency shifting to improve the ratio of pole magnitudes.
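The effect of the transformation can be seen on a one-pole toy problem. The sketch below is not the optimization used in [11, 10]; it only illustrates how searching over x, with p = -exp(x), keeps the pole real and negative without any explicit sign constraint. The target coefficients come from a hypothetical response 4/(s + 2), for which m0 = 2 and m1 = -1, and the ternary search, its bracket, and the function names are assumptions for illustration.

```python
import math


def mismatch(x: float, m0: float, m1: float) -> float:
    """Squared error in m1 for a one-pole model k/(s - p) with p = -exp(x)."""
    p = -math.exp(x)            # always real and negative, whatever x is
    k = -m0 * p                 # residue chosen so that -k/p matches m0 exactly
    return (-k / (p * p) - m1) ** 2


def fit_pole(m0: float, m1: float, lo: float = -5.0, hi: float = 5.0) -> float:
    """One-dimensional ternary search over x; returns the pole p = -exp(x)."""
    for _ in range(200):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if mismatch(a, m0, m1) < mismatch(b, m0, m1):
            hi = b              # the minimum lies to the left of b
        else:
            lo = a              # the minimum lies to the right of a
    return -math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    print("fitted pole:", fit_pole(2.0, -1.0))
```

Whatever value of x the search settles on, the returned pole is negative by construction, which is the point of the transformation.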
This constrained optimization technique is employed in RICE (Rapid Interconnect
Circuit Evaluator) [23], an implementation of AWE for the analysis of interconnect
circuits. RICE uses an efficient path-tracing scheme [22], which minimizes the
introduction of numerical errors, for the computation of the moments of the circuit
responses. In addition, problem conditioning is maintained throughout through the
use of frequency scaling and the use of numerical techniques such as the singular value
decomposition [11, 10].
Table 5 shows the results of using RICE to model the step response at node 1000
of the RC-tree in Fig.2. A 3rd-order unconstrained approximation yields an unstable
model, while using the constrained optimization scheme yielded a stable model which
compares very favorably with the output of a circuit simulator [21], as shown in Fig.5.
TABLE 5
Unconstrained and constrained models of the step response at node 1000 of 4000-node RC-tree.
[Plot: voltage versus time, 0 to 5.00e-8; curves: RICE, PSPICE.]
FIG. 5. Comparison of the constrained 3rd-order AWE model of the response of a 4000-node RC-tree
versus the output of a circuit simulator.
TABLE 6
Execution time (in seconds) for various sizes of RC-interconnect circuit models. (An * indicates the
circuit was too large in terms of memory requirement.)
[Plot: voltage versus time, 0 to 8e-8; curves: PSPICE, RICE 6th order.]
FIG. 7. Comparison of the 6th-order AWE model of the step response of an RLC-interconnect circuit
versus the output of a circuit simulator.
obtained from PSPICE and a 6th-order AWE model. Note that even a 6th-order
model does not capture all of the high-frequency effects due to an ideal step input.
As the rise time of the input increases, the AWE model approximates the actual
waveform more closely, as Fig. 8 shows. The greater the rise time of the
input signal, the lower its high-frequency content and, hence, the lower the required
order of approximation.
Circuits with controlled sources and active devices may be inherently asymptotically
unstable and may possess transfer-function poles in the right half of the complex
plane. In such cases, AWE reduces to a "pure" Padé approximation, since the moments
of the response of a function with a positive pole cannot be obtained due to
the divergent nature of the response [16]. However, the problem posed here is to
determine whether a positive model pole reflects the instability problem associated
with the Padé approximation or is an approximation to an actual positive
system pole.
[Plot: voltage versus time, 0 to 8e-8; curves: PSPICE, RICE 6th order.]
FIG. 8. Comparison of the 6th-order AWE model and the output of a circuit simulator for a 10ns
input-signal rise time.
time. Alleviating these problems of instability and order-estimation will enable the
extension of AWE to even more complex and challenging tasks.
REFERENCES
[17] S. P. McCormick. Modeling and Simulation of VLSI Interconnections with Moments. PhD
thesis, Mass. Inst. Tech., June 1989.
[18] J. Pal. Stable reduced-order Padé approximants using the Routh-Hurwitz array. Electron.
Lett., 15, 1979.
[19] L. T. Pillage. Asymptotic Waveform Evaluation for Timing Analysis. PhD thesis, Carnegie
Mellon Univ., Apr 1989.
[20] L. T. Pillage and R. A. Rohrer. Asymptotic waveform evaluation for timing analysis. IEEE
Trans. Comp. Aided Design, 9, 1990.
[21] PSPICE User's Manual. Version 4.03. MicroSim Corp., Jan 1990.
[22] C. L. Ratzlaff. A fast algorithm for computing the time moments of RLC circuits. Master's
thesis, The Univ. of Texas at Austin, May 1991.
[23] C. L. Ratzlaff, N. Gopal, and L. T. Pillage. RICE: Rapid Interconnect Circuit Evaluator. In
Proc. 28th ACM/IEEE Design Auto. Conf., Jun 1991.
[24] R. H. Rosen and L. Lapidus. Minimum realization and systems modeling: Part I - Fundamental
theory and algorithms. A.I.Ch.E. J., 18, Jul 1972.
[25] Y. Shamash. Linear system reduction using Padé approximation to allow retention of dominant
modes. Int'l. J. Control, 21(2), 1975.
[26] V. Zakian. Simplification of linear time-invariant systems by moment approximants. Int'l. J.
Control, 18, 1973.
[27] J. Zinn-Justin. Strong interaction dynamics with Padé approximants. Phys. Rep., 1970.
SPECTRAL ALGORITHM FOR SIMULATION
OF INTEGRATED CIRCUITS
Abstract. Waveform relaxation improves the efficiency of transient simulation of integrated
circuits at the expense of the large memory needed for storage of coupling variables and of
complicated intersubcircuit communication requiring interpolation. A new integration method based
on the expansion of unknown variables in Chebyshev series is developed. Such a method assures a very
compact representation of waveforms, minimizing storage requirements. Solutions are provided in
continuous form, therefore no extra interpolation is needed in the iterations. The resulting algorithm
was implemented and proved to be very efficient. A short description of the spectral technique is
presented and an application of spectral analysis in computing the transient behavior of an MOS
circuit is discussed. The computation proved to be much more efficient than with other methods.
* Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721
(2.1) $c(t) = {\sum_{i=0}^{\infty}}' c_i T_i(t),$
where $T_i(t)$ denotes the Chebyshev polynomial of the first kind of degree $i$ and $c_i$ are the
constant coefficients [2]. The prime at the summation symbol denotes that the first
term in the summation is halved. If a function $\bar c(\bar t)$ is defined on a general interval
$[t_1, t_2]$, then it has to be scaled to the interval $[-1, 1]$ to obtain the scaled function $c(t)$
used in (2.1). The scaling is performed using the following operation
(2.2) $t = \dfrac{2\bar t - (t_1 + t_2)}{t_2 - t_1}.$
To simplify notation, equation (2.1) can be rewritten in the following vector form
(2.3) $\mathcal{C}\{c(t)\} = \mathbf{c},$
where $\mathcal{C}$ denotes the Chebyshev transformation and the entries of the vector $\mathbf{c}$ are
composed of the expansion coefficients of the function $c(t)$ as
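(2.4)
where
(2.6a) $C = \begin{bmatrix} \tfrac12 c_0 & c_1 & \cdots \\ \tfrac12 c_1 & \cdots & \\ \vdots & & \\ \tfrac12 c_i & \cdots & \end{bmatrix}$
and
(2.6b) $G = \begin{bmatrix} \tfrac12 g_0 & g_1 & \cdots \\ \tfrac12 g_1 & \cdots & \\ \vdots & & \\ \tfrac12 g_i & \cdots & \end{bmatrix}$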
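(2.7) $\mathbf{y} = B\,\mathbf{y}^* + 2\,\mathbf{e}\,y(-1)$
where
(2.8) $\mathbf{y}^* = \mathcal{C}\{y'(t)\},$
(2.9)
and the matrix $B$ is invariant and its first row is defined as:
(2.10) $B_{0i} = (t_2 - t_1)\cdot\begin{cases} \tfrac12, & i = 0 \\ -\tfrac14, & i = 1 \\ \dfrac{(-1)^{i+1}}{(i-1)(i+1)}, & i = 2, 3, \ldots \end{cases}$
(2.11) $B_{ij} = \begin{cases} \ldots, & j = i-1 \\ \ldots, & j = i+1 \\ 0, & \text{elsewhere.} \end{cases}$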
(3.1) $\bar c(\bar t)\,\dfrac{dy}{d\bar t} = \bar g(\bar t)\,y + \bar h(\bar t)$
(3.1a)
where the functions $\bar c(\bar t)$, $\bar g(\bar t)$ and $\bar h(\bar t)$ are defined in the time interval
(3.2)
and $y = y(\bar t)$ is the unknown function on the same interval. To simplify mathematical
operations, equation (3.1) is scaled to the interval $[-1, 1]$ as described in the previous
section. As a result, equation (3.1) is rewritten as
(3.3) $c(t)\,\dfrac{dy}{dt} = g(t)\,y + h(t)$
where $y = y(t)$ is the unknown function, and $c(t)$, $g(t)$ and $h(t)$ are known functions
of $t$ on the interval $[-1, 1]$. Using the notation introduced in the previous section,
the above equation can be rewritten in the following transformed form as a relation
between the Chebyshev expansion coefficients
Using relation (2.7) the above equation is rearranged in the following form
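As a concrete sketch of how relation (2.7) turns a differential equation into algebra, consider the constant-coefficient case c(t) = 1, g(t) = g on [-1, 1]. Substituting (2.7) into the coefficient relation y* = g y + h gives (I - gB) y* = 2 g e y(-1) + h. The code below is an illustration under stated assumptions, not the authors' implementation: the series is truncated at degree n, B is written for the interval [-1, 1] (so t2 - t1 = 2), its off-diagonal entries ±1/(2i) are taken from the standard Chebyshev integration identity, e is assumed to be [1, 0, ..., 0]^T, and all function names are hypothetical.

```python
import math


def integration_matrix(n):
    """(n+1)x(n+1) matrix B on [-1, 1] such that y = B y* + 2 e y(-1)."""
    B = [[0.0] * (n + 1) for _ in range(n + 1)]
    B[0][0] = 1.0
    if n >= 1:
        B[0][1] = -0.5
    for i in range(2, n + 1):
        B[0][i] = 2.0 * (-1) ** (i + 1) / ((i - 1) * (i + 1))
    for i in range(1, n + 1):
        B[i][i - 1] = 1.0 / (2 * i)
        if i + 1 <= n:
            B[i][i + 1] = -1.0 / (2 * i)
    return B


def solve(A, b):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def solve_linear_ode(g, h_coeffs, y_start, n):
    """Chebyshev coefficients of y for dy/dt = g*y + h(t), y(-1) = y_start."""
    B = integration_matrix(n)
    A = [[(1.0 if i == j else 0.0) - g * B[i][j] for j in range(n + 1)]
         for i in range(n + 1)]
    rhs = list(h_coeffs)
    rhs[0] += 2.0 * g * y_start          # contribution of 2 g e y(-1)
    y_star = solve(A, rhs)               # coefficients of dy/dt
    y = [sum(B[i][j] * y_star[j] for j in range(n + 1)) for i in range(n + 1)]
    y[0] += 2.0 * y_start                # relation (2.7)
    return y


def cheb_eval(coeffs, t):
    """Evaluate the series with the first term halved."""
    t_prev, t_cur = 1.0, t
    total = 0.5 * coeffs[0]
    for c in coeffs[1:]:
        total += c * t_cur
        t_prev, t_cur = t_cur, 2.0 * t * t_cur - t_prev
    return total


# dy/dt = -y, y(-1) = 1, whose exact solution is exp(-(t + 1)).
y_coeffs = solve_linear_ode(-1.0, [0.0] * 17, 1.0, 16)
```

The result is a coefficient vector, so the waveform can be evaluated at any t in [-1, 1] with `cheb_eval` and no interpolation between time points is needed.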
(4.1) $C(v(t), t)\,\dfrac{dv(t)}{dt} = q(v(t), t), \qquad v(-1) = v_0,$
where $v(t)$ is an $M$-dimensional vector of circuit variables, $v_0$ is a vector of initial
conditions, $C$ is an $M \times M$ square matrix with variable entries and $q$ is an $M$-dimensional
vector of nonlinear functions.
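Special Case
Consider the scalar case ($M = 1$) and linearize equation (4.1) around a given waveform
$v_p(t)$ to yield
(4.2) $c^p(t)\,\dfrac{dv}{dt} = g^p(t)\,v + h^p(t),$
where
(4.3) $c^p(t) = c(v_p(t), t),$
$g^p(t) = \left.\dfrac{\partial q}{\partial v}\right|_{v = v_p(t)} - \left.\dfrac{\partial c}{\partial v}\right|_{v = v_p(t)} \dfrac{dv_p}{dt},$
$h^p(t) = q[v_p(t), t] - g^p(t)\,v_p(t).$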
Equation (4.2) is solved by using the same procedure as in the case of equation (3.3).
Based on this linearization, a Newton-Kantorovich-type iteration procedure is used to
recover the solution. The process continues until the iterates satisfy a convergence
condition
(4.4)
where $\varepsilon$ is the convergence tolerance, $k$ denotes the iteration count, and $\|\cdot\|_\infty$ denotes
the maximum norm.
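The iteration can be sketched for a scalar equation dv/dt = q(v), that is, with c = 1 so that the linearization coefficients reduce to g^p = dq/dv and h^p = q(v_p) - g^p v_p. To keep the example short, each linearized equation is solved with the trapezoidal rule on a uniform grid rather than with the Chebyshev expansion used in the text; that substitution, the test equation, and the name `newton_kantorovich` are assumptions of this sketch. The stopping test uses the maximum norm of the change between successive waveforms.

```python
def newton_kantorovich(q, dq_dv, v_start, t_end, steps=200, eps=1e-10, max_iter=50):
    """Waveform Newton iteration for dv/dt = q(v), v(0) = v_start.

    Each pass linearizes around the previous waveform v_p and solves
    dv/dt = g_p(t) v + h_p(t) with the trapezoidal rule (assumed solver).
    Returns the final waveform and the number of iterations used.
    """
    h = t_end / steps
    v_p = [v_start] * (steps + 1)            # initial guess: constant waveform
    for k in range(1, max_iter + 1):
        g = [dq_dv(v) for v in v_p]                       # g^p(t)
        hp = [q(v) - gi * v for v, gi in zip(v_p, g)]     # h^p(t)
        v = [v_start]
        for n in range(steps):
            rhs = v[n] * (1.0 + 0.5 * h * g[n]) + 0.5 * h * (hp[n] + hp[n + 1])
            v.append(rhs / (1.0 - 0.5 * h * g[n + 1]))
        change = max(abs(a - b) for a, b in zip(v, v_p))  # maximum norm
        v_p = v
        if change < eps:
            return v_p, k
    return v_p, max_iter


# Test problem (assumed): dv/dt = -v**2, v(0) = 1, exact solution 1 / (1 + t).
waveform, iterations = newton_kantorovich(
    lambda v: -v * v, lambda v: -2.0 * v, v_start=1.0, t_end=1.0
)
```

Because every pass updates the whole waveform at once, the tolerance applies to the entire time window rather than to individual time points.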
General Case
(4.5)
where
(4.6b) $q_i = q_i(v, t).$
(4.7)
(4.8) $\sum_{j=1}^{M} c_{ij}(v_p, t)\,\dfrac{dv_j}{dt} = \sum_{j=1}^{M} g_{ij}(t)\,v_j + q_i(v_p, t) - \sum_{j=1}^{M} g_{ij}(t)\,v_{pj},$
where
After linearizing all subcircuits using (4.8) and (4.9), equation (4.7) is rewritten in
the form:
where $C^p(t)$, $G^p(t)$ are $M \times M$ matrices and $\frac{dv}{dt}$, $v$ and $h^p(t)$ are $M$-dimensional
vectors. Each element of the matrices $C^p(t)$, $G^p(t)$ is expanded into a Chebyshev series
of degree $2N$ and forms an $(N + 1) \times (N + 1)$ submatrix defined by (2.6a,b). The
elements of the vector $h^p(t)$ are expanded into Chebyshev series of degree $N$ and
form subvectors of order $(N + 1)$ of expansion coefficients. Using property (2.7),
the constructed equation for the expansion coefficients can be assembled and represented
in matrix form as:
(4.11)
where $\mathcal{A}^p$, $\mathcal{G}^p$, $\mathcal{B}$ are $M(N + 1) \times M(N + 1)$ square matrices and $\mu$, $\chi$ are $M(N + 1)$-dimensional
vectors. The vector $\vartheta^*$ is composed of $(N + 1)$-dimensional subvectors
$v_j^*$. A subvector $v_j^*$ contains the coefficients of the expansion of the $j$-th component of
$\frac{dv}{dt}$. The blocks of the matrices $\mathcal{A}^p$, $\mathcal{G}^p$ are determined by the submatrices created from
the expansion of $C^p(t)$ and $G^p(t)$. Matrix $\mathcal{B}$ is block diagonal, composed of
$M$ identical $(N + 1) \times (N + 1)$ square matrices defined by relations (2.10) and (2.11).
Vector $\chi$ is composed of $(N + 1)$-dimensional subvectors, each representing the expansion
coefficients of the respective element of the vector $h^p(t)$. Vector $\mu$ is composed of the
$(N + 1)$-dimensional subvectors
(4.12)
which depend on the initial conditions $v_{0k}$ for the $k$-th components of the vector $v$. The
$M(N + 1)$-dimensional vector $\vartheta$ is calculated by using the relation
5. Example.
5.1. Circuit model. Simulation of MOS circuits is based on the solution of ordinary
differential equations obtained using MNA. The circuit equations are written
in the general form (4.1). To describe the model and the details of the solution
algorithm more clearly, a specific example is given below.
Model of CMOS NAND Gate
The schematic of a NAND gate built in CMOS technology is shown in Fig. 1. The
equivalent circuit is obtained by replacing the MOS transistors by an appropriate model
[5], and the equations for the unknown nodal voltages $(V_3, V_4)$ are written in the matrix
form
(5.1)
where
$Q_1(V_3, V_4, V_1, V_2) = c_{gdP}\,\dfrac{d}{dt}V_1 + (c_{gdP} + c_{gdN})\,\dfrac{d}{dt}V_2 - I_N(V_3, V_2, V_4) - I_P(V_3, V_2, 5.0) - I_P(V_3, V_1, 5.0)$
and
(5.3)
The remaining capacitors are constant. Functions IN and Ip represent the drain
current for the n-channel and p-channel MOS devices, respectively. The current
function IN depends on the regions of operation of the MOS transistors and it is
given by the following expressions
[Fig. 1: schematic of the CMOS NAND gate; labels: 5 V supplies, V1, V2, V3.]
b) reverse region, $V_{ds} \le 0$
The p-channel device operates in the same manner as the n-channel device except
that all voltages and currents are reversed.
5.2. Results of simulation. The CMOS NAND gate described above was simulated
using the prototype software. The simulation was performed in the time
interval [0.0, 2.3] μs, which was divided into five subintervals (windows) as shown in
Fig. 2a and Fig. 3a. The Chebyshev expansion degree was set to 32 for all windows.
The results of simulation, voltage $V_3$, are shown in Fig. 2a together with the driving
signals $V_1$ and $V_2$. The output shows a logical representation of the NAND function:
$V_3$ is low when $V_1$ and $V_2$ are both high, and $V_3$ is high otherwise. The
iteration process is illustrated in Fig. 3b, where the results of some initial iteration
steps are shown. The accuracy of the solution was set to 1.0 mV in each window.
[Plots: voltages V1, V2 and V3 (0 to 5 V) versus time in μs, 0.0 to 1.9; panels a and b.]
FIG. 2. Simulation of a NAND gate obtained using SPEC simulator; a) driving signals, V1, V2, and
output, V3; b) the details of the output voltage V3.
REFERENCES
[Plots: voltage V3 versus time in μs; panel a spans 0.0 to 1.9, panel b spans 1.50 to 1.90 and shows
successive iterations up to the last iteration (solution).]
FIG. 3. Convergence process in computation of transient in the NAND, the example of output
variable, V3; a) V3 in the simulation range with marked window for which the iterations were
recorded; b) the details of iteration process in the selected window.
(3)
where $v(s)$ is the Laplace transform of $v(t)$. The above equation is, in matrix
notation,
(5)
We utilize the following theorem [9] to show that the WR iteration has a potential
problem.
THEOREM 2.1. Let $X = L^p(\mathbb{R}_+, \mathbb{C}^n)$ with $1 \le p \le \infty$ and assume that the
eigenvalues of $M$ have positive real parts. Then the spectral radius of the symbol
$K(s)$ is
(6)
Evaluating $\rho(K)$ we find
and it is clear that the maximum occurs for $\omega = 0$, where $\rho(K(0)) = 1$, which
indicates that convergence problems occur as $\omega \to 0$. Fortunately, we show that this
does not imply that there is a problem over any finite time interval, but only indicates
that convergence is non-uniform and degrades over the infinite time interval. Also,
this difficulty does not exist if other parasitic elements are connected at either node.
We simplify the problem by choosing $\alpha = \beta = 1$ without any loss of key insight.
By executing the WR iteration in the s-domain, we arrive at the solution for iteration
(k):
(8) $v_1^{(k)}(s) = \sum_{m=1}^{k} \dfrac{1}{(s+1)^{2m-1}}.$
The solution in the time domain corresponding to Eq. 8 is found from the inverse
Laplace transform as
(9) $v_1^{(k)}(t) = e^{-t} \sum_{m=0}^{k-1} \dfrac{t^{2m}}{(2m)!}.$
Taking the limit as the iteration index $k \to \infty$ we find that the limit is
(10) $\lim_{k\to\infty} v_1^{(k)}(t) = e^{-t}\cosh t = \tfrac12\left(1 + e^{-2t}\right).$
To verify that the limit is indeed the exact solution, we start by finding the s-domain
response of the circuit in Fig. 1,
(11) $v_1 = \dfrac{s + \beta}{s\,(s + (\alpha + \beta))},$
for a unit initial voltage. Converting this into its corresponding time solution, we
find
(12) $v_1(t) = \dfrac{\beta}{\alpha+\beta} + \dfrac{\alpha}{\alpha+\beta}\,e^{-(\alpha+\beta)t}.$
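The non-uniform convergence can be checked numerically with a short sketch. It assumes the reading of Eq. 9 as the partial sum e^{-t} Σ_{m=0}^{k-1} t^{2m}/(2m)! and uses the partial-fraction inverse of Eq. 11 as the exact response; the function names are hypothetical.

```python
import math


def wr_iterate(k, t):
    """k-th WR iterate for alpha = beta = 1: exp(-t) * sum of t^(2m)/(2m)!."""
    return math.exp(-t) * sum(t ** (2 * m) / math.factorial(2 * m) for m in range(k))


def exact_response(t, alpha=1.0, beta=1.0):
    """Inverse transform of (s + beta) / (s (s + alpha + beta))."""
    total = alpha + beta
    return beta / total + (alpha / total) * math.exp(-total * t)


def wr_error(k, t):
    """Absolute error of the k-th iterate at time t."""
    return abs(wr_iterate(k, t) - exact_response(t))


# For a fixed iteration count the error is small at early times and large later.
early_error = wr_error(2, 0.5)
late_error = wr_error(2, 5.0)
```

With two iterations the error at t = 0.5 is of order 1e-3 while at t = 5 it is of order 0.4, so the iterates agree with the exact response only on a time window that widens as k grows.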
[FIG. 2. CRC circuit: voltage (0.00 to 1.00) versus time (0.00 to 6.00) for the WR iterations.]
(13) $v_1(\infty) = \dfrac{\beta}{\alpha + \beta}$
TABLE 1
Window times
Iteration   Time window
1           0.74
2           1.47
3           2.21
4           2.94
5           3.68
LEMMA 3.1. Let $T \in \mathbb{R}$; then the WR sequence converges rapidly after the $k$-th
iteration for $k \ge eT/2$.
The proof is fairly straightforward using the approximate identity
$(2m)! \approx \sqrt{2\pi}\,(2m)^{2m+\frac12} e^{-2m}$. After some algebraic manipulations this leads to
(14)
In fact, we can see easily from Eq. 14 that the coefficients decrease faster than
$O(1/\sqrt{m})$ for the conditions on the time $T$ and the iteration index $k$ given in the lemma.
Fig. 2 shows how the first five iterations converge. In fact, the time window $T$ of
rapid convergence is clearly visible. We can conclude from the above condition that
a reasonable choice for a time window as a function of iteration count is $k > eT/2$,
as pictured. For a comparison, we give the values for the equality in Table 1.
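The window times can be reproduced with a few lines, assuming the condition is read as k ≥ eT/2, so that equality gives T = 2k/e. That reading is an assumption of this sketch (it is the one that matches the tabulated values), and `window_length` is a hypothetical name.

```python
import math


def window_length(k):
    """Time window T for which k = e*T/2 holds with equality (assumed reading)."""
    return 2.0 * k / math.e


# Iteration count and window time, rounded to two decimals as in Table 1.
table = [(k, round(window_length(k), 2)) for k in range(1, 6)]
```

Each additional iteration extends the usable window by 2/e, about 0.74 normalized time units.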
We do get an indication from Fig. 2 and Table 1 of how the useful time window
grows with the number of iterations. Of course, with our choices of $\alpha = \beta = 1$, this
is normalized to unit time constants. This implies, in general, that time constants
associated with this subcircuit should be sufficiently large such that the time window
is large enough that it does not constrain the number of time steps in a window too
much. "Large enough" may mean that the numerical integration needs at least 10
time points in a particular window. This condition may be guaranteed for a transition
or spike in the waveform. However, a problem with window size may exist if the WR
code has the same global time windows for the entire circuit.
REFERENCES
Abstract. Inconsistent initial conditions, which can exist in switched networks, cannot be
handled by the usual integration routines. A method based on numerical inversion of the Laplace
transform was developed. It is equivalent to a high-order integration and can handle inconsistent
initial conditions, discontinuous functions and Dirac impulses. The method was used to write
programs for analysis of switched networks.
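retain the ability of the Laplace transform to correctly handle inconsistent initial
conditions. Consider the Laplace transform equation
$$v(t) = \frac{1}{2\pi j}\int_{c-j\infty}^{c+j\infty} V(s)\,e^{st}\,ds .$$
$$e^{z} \approx R_{N,M}(z) = \frac{\sum_{i=0}^{N}(M+N-i)!\binom{N}{i}z^{i}}{\sum_{i=0}^{M}(-1)^{i}(M+N-i)!\binom{M}{i}z^{i}} .$$
$$R_{N,M}(z) = \sum_{i=1}^{M}\frac{K_i}{z - z_i},$$
where the numerical evaluation of the roots, $z_i$, and residues, $K_i$, is done only once.
The inversion formula becomes
$$\hat v(t) = \frac{1}{2\pi j\,t}\int_{c-j\infty}^{c+j\infty} V\!\left(\frac{z}{t}\right)\sum_{i=1}^{M}\frac{K_i}{z - z_i}\,dz .$$
(1) $$\hat v(t) = -\frac{1}{t}\sum_{i=1}^{M}K_i\,V\!\left(\frac{z_i}{t}\right)$$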
and consider only upper half plane poles and their residues. It is shown in [1] that
formula (1) approximates the first M + N + 1 terms of the Taylor expansion of v(t)
for any t > O.
In the case of networks we can use the system equations
where v(t) is the result from the previous step. The possibility of resetting the
initial conditions by means of initial voltages and currents makes the procedure
equivalent to a numerical integration formula, somewhat resembling the Runge-
Kutta methods, because previous results are not needed. However, evaluations are
done in complex. In the following we assume the selection M = 4, N = O.
$$V = \frac{1}{s+1}$$
Application of formula (1) gives a value which is correct to about 15 decimal digits
(on a 16-decimal-digit machine) for every $t$ in the interval from $10^{-12}$ to $10^{-3}$;
see Figure 2(a). In each case the error corresponds to a single step. The error
starts growing for larger $t$, but that is to be expected, since the formula is an
approximation. In this case the Dirac impulse does not appear at the output.
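A minimal sketch of formula (1) for the selection M = 4, N = 0 follows. For that choice the Padé denominator is z^4 - 4z^3 + 12z^2 - 24z + 24 and the numerator is 24. The rough starting values for the two upper-half-plane poles, the Newton refinement, and the function names are assumptions of the sketch; lower-half-plane poles are accounted for by doubling the real part, which presumes V has real coefficients.

```python
import math


def pade_poles_residues():
    """Upper-half-plane poles z_i and residues K_i of R_{0,4}(z) = 24 / Q(z)."""

    def q(z):
        return z**4 - 4 * z**3 + 12 * z**2 - 24 * z + 24

    def dq(z):
        return 4 * z**3 - 12 * z**2 + 24 * z - 24

    poles = []
    for guess in (1.73 + 0.89j, 0.27 + 2.50j):   # rough starting values (assumed)
        z = guess
        for _ in range(50):                      # Newton refinement
            z -= q(z) / dq(z)
        poles.append(z)
    residues = [24.0 / dq(z) for z in poles]     # K_i = 24 / Q'(z_i)
    return poles, residues


POLES, RESIDUES = pade_poles_residues()          # evaluated only once


def invert(V, t):
    """Approximate v(t) = -(1/t) * sum K_i V(z_i / t) using conjugate symmetry."""
    total = sum(K * V(z / t) for z, K in zip(POLES, RESIDUES))
    return -2.0 * total.real / t


# V(s) = 1 / (s + 1) has the time response exp(-t).
value = invert(lambda s: 1.0 / (s + 1.0), 0.01)
```

For this V(s) the formula evaluates the Padé approximant at -t, so the error behaves like t^5/120 for small steps and grows as the step is enlarged.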
[Figure 1: RC network with the output v(t) taken at two different nodes, (a) and (b).]
Figure 1.
The situation changes drastically for Figure 1(b). It is the same network but
the output is taken at a different node. In the Laplace domain the output is
(3) $$V(s) = \frac{s}{s+1} = 1 - \frac{1}{s+1}$$
[Plot: relative error (1e-16 to 1e+00) versus step size in s (1e-12 to 1e+00), logarithmic axes.]
Figure 2(a).
A Dirac impulse appears at the output. In this case formula (1) gives a large
error for a small step, $\varepsilon = 10^{-4}$ for $t = 10^{-12}$, see Figure 2(b), again for a single
step. However, the error decreases almost linearly to $\varepsilon \approx 10^{-13}$ for $t \approx 10^{-3}$. Here
the integration error is acceptable, but the solution is a poor approximation of the
initial condition at $t = 0^+$.
In order to get correct initial conditions at $t = 0^+$, even in the presence of the
Dirac impulse at $t = 0$, we propose to first make one large step forward, to get to
the minimum of the integration error. Afterwards, starting from this new point, we
make an exactly equal step backward. This backward step is essentially error-free,
since there is no impulse at the new starting point. The error in this step is the
same as in Figure 2(a). As a result of this two-step procedure we get the correct initial
condition at $t = 0^+$.
[Plot: relative error (1e-16 to 1e+00) versus step size in s (1e-12 to 1e+00), logarithmic axes.]
Figure 2(b).
$$V(s) = \frac{s}{s+1} = 1 - \frac{1}{s+1},$$
$$\hat v(t) = -\frac{1}{t}\,\mathrm{Re}\left[K_1\left(1 - \frac{t}{z_1 + t}\right) + K_2\left(1 - \frac{t}{z_2 + t}\right)\right].$$
For very small $t$ and finite arithmetic precision the fractions will be dominated by
the units. The expression effectively reduces to
$$\hat v(t) \approx -\frac{1}{t}\,\mathrm{Re}\left[K_1 + K_2\right],$$
due to the fact that the real parts of the residues are equal in magnitude but have
opposite signs. This error is eliminated by the two-step method.
(4)
The first one is the true initial condition after switching, obtained by the two-step
method explained above. The term $v_\delta$ is a multiplicative factor corresponding
to the area of the Dirac impulse. This term is always finite, can be stored in the
memory of the computer, and has zero value when no impulse has occurred at the
instant of switching. The problem is to find this coefficient.
Consider the situation in which we have reached the instant t = 0- , just before
switching. Using the two step method we can obtain the term v(O+) but we still do
not know whether the impulse has occurred at t = O. To discover its existence we can
calculate the area between t = 0- and t = 0+ by the same two-step method. Since
in the Laplace domain the integration is expressed by division by s, we evaluate
$$\int_{0^-}^{t} v(\tau)\,d\tau = -\,\mathrm{Re}\sum_{i=1}^{2}\frac{K_i}{z_i}\,V\!\left(\frac{z_i}{t}\right)$$
and do the same for the step back. The difference of the two areas is the area of
the Dirac impulse. Note that this integration is done almost for free; it represents
only four additional multiplications for our selection of M = 4 and N = 0, and for
both steps forward and back.
It is interesting to see the accuracy of this method. For the function
$$V(s) = -5 + \frac{3}{s+2},$$
whose time domain response is
$$v(t) = -5\,\delta(t) + 3e^{-2t},$$
0.005
v(t)dt = -4.98850758
0-
f
0+
v(t)dt = -0.01492525
0.005
with relative error $8.4 \times 10^{-11}$. The sum of the two integrals is $-5.00000$, with
relative error $3.7 \times 10^{-15}$. We thus have an accurate method to find out whether a
Dirac impulse did or did not appear at the instant of switching.
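For reference, the two areas have simple closed forms against which a numerical result can be compared. The sketch below assumes the time response v(t) = -5δ(t) + 3e^{-2t} implied by the V(s) above and a step of 0.005; it evaluates the analytic integrals only and does not reproduce the numerical inversion itself. The function names are hypothetical.

```python
import math


def area_forward(h):
    """Integral of v(t) = -5*delta(t) + 3*exp(-2t) from 0- to h (closed form)."""
    return -5.0 + 1.5 * (1.0 - math.exp(-2.0 * h))


def area_backward(h):
    """Integral of the same v(t) from h back to 0+; the impulse is not crossed."""
    return -1.5 * (1.0 - math.exp(-2.0 * h))


step = 0.005
impulse_area = area_forward(step) + area_backward(step)
```

The smooth part of the response cancels between the forward and the backward step, so the sum isolates the impulse area of -5.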
(5)
and an iterative method used to reduce the error to zero. A suitable method is the
Newton-Raphson iteration, based on the equations
$$J^{(k)}[v(0^-)]\,\Delta v^{(k)} = -E[v(0^-)]$$
$$v^{(k+1)} = v^{(k)} + \Delta v^{(k)}$$
where the Jacobian matrix is
The integration method explained above is used to overcome problems with switching
and possible inconsistent initial conditions. It is also used in the evaluation of
the Jacobian. Mathematical details are given in [6,7] where the efficiency of the
method is demonstrated on several practical examples.
REFERENCES
[1] J. VLACH AND K. SINGHAL, Computer Methods for Circuit Analysis and Design, Van Nostrand
Reinhold, New York, 1983.
[2] A. OPAL AND J. VLACH, Consistent initial conditions of linear switched networks, IEEE
Transactions on Circuits and Systems, CAS-37 (3), March, 1990, pp. 364-372.
[3] A. OPAL AND J. VLACH, Analysis and sensitivity of periodically switched linear networks,
IEEE Transactions on Circuits and Systems, CAS-36 (4), April, 1989, pp. 522-532.
[4] D. BEDROSIAN AND J. VLACH, Time-Domain Analysis of Networks with Internally Controlled
Switches, Vol. 39 (3), March, 1992, pp. 199-212.
[5] D. BEDROSIAN AND J. VLACH, Analysis of Switched Networks, International Journal of
Circuit Theory and Applications, Vol. 20 (3), May-June, 1992, pp. 309-325.
[6] D. BEDROSIAN AND J. VLACH, Accelerated Steady-State Method for Networks with Internally
Controlled Switches, IEEE Transactions on Circuits and Systems I: Fundamental Theory and
Applications, July 1992, Volume 39, Number 7, pp. 520-530.
[7] D. BEDROSIAN AND J. VLACH, An Accelerated Steady-State Method for Networks with
Internally Controlled Switches, IEEE International Conf. on Computer-Aided Design, Santa
Clara, California, November 11-14, 1991, pp. 24-27.