The IMA Volumes

in Mathematics
and its Applications
Volume 58

Series Editors
Avner Friedman Willard Miller, Jr.
Institute for Mathematics and
its Applications
IMA
The Institute for Mathematics and its Applications was established by a grant from the
National Science Foundation to the University of Minnesota in 1982. The IMA seeks to encourage
the development and study of fresh mathematical concepts and questions of concern to the other
sciences by bringing together mathematicians and scientists from diverse fields in an atmosphere
that will stimulate discussion and collaboration.
The IMA Volumes are intended to involve the broader scientific community in this process.
Avner Friedman, Director
Willard Miller, Jr., Associate Director
**********
IMA ANNUAL PROGRAMS
1982-1983 Statistical and Continuum Approaches to Phase Transition
1983-1984 Mathematical Models for the Economics of Decentralized
Resource Allocation
1984-1985 Continuum Physics and Partial Differential Equations
1985-1986 Stochastic Differential Equations and Their Applications
1986-1987 Scientific Computation
1987-1988 Applied Combinatorics
1988-1989 Nonlinear Waves
1989-1990 Dynamical Systems and Their Applications
1990-1991 Phase Transitions and Free Boundaries
1991-1992 Applied Linear Algebra
1992-1993 Control Theory and its Applications
1993-1994 Emerging Applications of Probability
1994-1995 Waves and Scattering
1995-1996 Mathematical Methods in Material Science
IMA SUMMER PROGRAMS
1987 Robotics
1988 Signal Processing
1989 Robustness, Diagnostics, Computing and Graphics in Statistics
1990 Radar and Sonar (June 18 - June 29)
New Directions in Time Series Analysis (July 2 - July 27)
1991 Semiconductors
1992 Environmental Studies: Mathematical, Computational, and Statistical Analysis
1993 Modeling, Mesh Generation, and Adaptive Numerical Methods
for Partial Differential Equations
1994 Molecular Biology
**********
SPRINGER LECTURE NOTES FROM THE IMA:
The Mathematics and Physics of Disordered Media
Editors: Barry Hughes and Barry Ninham
(Lecture Notes in Math., Volume 1035, 1983)
Orienting Polymers
Editor: J.L. Ericksen
(Lecture Notes in Math., Volume 1063, 1984)
New Perspectives in Thermodynamics
Editor: James Serrin
(Springer-Verlag, 1986)
Models of Economic Dynamics
Editor: Hugo Sonnenschein
(Lecture Notes in Econ., Volume 264, 1986)
W.M. Coughran, Jr. Julian Cole
Peter Lloyd Jacob K. White
Editors

Semiconductors
Part I

With 55 Illustrations

Springer-Verlag
New York Berlin Heidelberg London Paris
Tokyo Hong Kong Barcelona Budapest
W.M. Coughran, Jr.
AT&T Bell Laboratories
600 Mountain Ave., Rm. 2T-502
Murray Hill, NJ 07974-0636 USA

Julian Cole
Department of Mathematical Sciences
Rensselaer Polytechnic Institute
Troy, NY 12180 USA

Peter Lloyd
AT&T Bell Laboratories
Technology CAD
1247 S. Cedar Crest Blvd.
Allentown, PA 18103-6265 USA

Jacob K. White
Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
50 Vassar St., Rm. 36-880
Cambridge, MA 02139 USA

Series Editors:
Avner Friedman
Willard Miller, Jr.
Institute for Mathematics and its
Applications
University of Minnesota
Minneapolis, MN 55455 USA
Mathematics Subject Classifications (1991): 35-XX, 60-XX, 76-XX, 76P05, 81UXX, 82DXX,
35K57, 47N70, 00A71, 00A72, 81T80, 93A30, 82B40, 82C40, 82C70, 65L60, 65M60, 94CXX,
34D15, 35B25

Library of Congress Cataloging-in-Publication Data

Semiconductors / W.M. Coughran, Jr. ... [et al.].
p. cm. - (The IMA volumes in mathematics and its
applications; v. 58-59)
Includes bibliographical references and index.
ISBN-13: 978-1-4613-8409-0 e-ISBN-13: 978-1-4613-8407-6
DOI: 10.1007/978-1-4613-8407-6
1. Semiconductors - Mathematical models. 2. Semiconductors -
Computer simulation. 3. Computer-aided design. I. Coughran,
William Marvin. II. Series.
TK7871.85.S4693 1994
621.3815'2-dc20 93-50622
Printed on acid-free paper.

© 1994 Springer-Verlag New York, Inc.

Softcover reprint of the hardcover 1st edition 1994
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New
York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereaf-
ter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.
Permission to photocopy for internal or personal use, or the internal or personal use of specific
clients, is granted by Springer-Verlag, Inc., for libraries registered with the Copyright Clearance
Center (CCC), provided that the base fee of $5.00 per copy, plus $0.20 per page, is paid directly
to CCC, 222 Rosewood Drive, Danvers, MA 01923, USA. Special requests should be addressed
directly to Springer-Verlag New York, 175 Fifth Avenue, New York, NY 10010, USA.
Production managed by Laura Carlson; manufacturing supervised by Jacqui Ashri.
Camera-ready copy prepared by the IMA.
987654321
The IMA Volumes
in Mathematics and its Applications

Current Volumes:
Volume 1: Homogenization and Effective Moduli of Materials and Media
Editors: Jerry Ericksen, David Kinderlehrer, Robert Kohn, J.-L. Lions

Volume 2: Oscillation Theory, Computation, and Methods of Compensated Compactness

Editors: Constantine Dafermos, Jerry Ericksen,
David Kinderlehrer, Marshall Slemrod

Volume 3: Metastability and Incompletely Posed Problems

Editors: Stuart Antman, Jerry Ericksen, David Kinderlehrer, Ingo Müller

Volume 4: Dynamical Problems in Continuum Physics

Editors: Jerry Bona, Constantine Dafermos, Jerry Ericksen,
David Kinderlehrer

Volume 5: Theory and Applications of Liquid Crystals

Editors: Jerry Ericksen and David Kinderlehrer

Volume 6: Amorphous Polymers and Non-Newtonian Fluids

Editors: Constantine Dafermos, Jerry Ericksen, David Kinderlehrer

Volume 7: Random Media

Editor: George Papanicolaou

Volume 8: Percolation Theory and Ergodic Theory of Infinite Particle Systems

Editor: Harry Kesten

Volume 9: Hydrodynamic Behavior and Interacting Particle Systems

Editor: George Papanicolaou

Volume 10: Stochastic Differential Systems, Stochastic Control Theory and Applications
Editors: Wendell Fleming and Pierre-Louis Lions

Volume 11: Numerical Simulation in Oil Recovery

Editor: Mary Fanett Wheeler

Volume 12: Computational Fluid Dynamics and Reacting Gas Flows

Editors: Bjorn Engquist, M. Luskin, Andrew Majda
Volume 13: Numerical Algorithms for Parallel Computer Architectures
Editor: Martin H. Schultz

Volume 14: Mathematical Aspects of Scientific Software

Editor: J.R. Rice

Volume 15: Mathematical Frontiers in Computational Chemical Physics

Editor: D. Truhlar

Volume 16: Mathematics in Industrial Problems

by Avner Friedman

Volume 17: Applications of Combinatorics and Graph Theory to the Biological

and Social Sciences
Editor: Fred Roberts

Volume 18: q-Series and Partitions

Editor: Dennis Stanton

Volume 19: Invariant Theory and Tableaux

Editor: Dennis Stanton

Volume 20: Coding Theory and Design Theory Part I: Coding Theory
Editor: Dijen Ray-Chaudhuri

Volume 21: Coding Theory and Design Theory Part II: Design Theory
Editor: Dijen Ray-Chaudhuri

Volume 22: Signal Processing: Part I - Signal Processing Theory

Editors: L. Auslander, F.A. Grünbaum, J.W. Helton, T. Kailath,
P. Khargonekar and S. Mitter

Volume 23: Signal Processing: Part II - Control Theory and Applications

of Signal Processing
Editors: L. Auslander, F.A. Grünbaum, J.W. Helton, T. Kailath,
P. Khargonekar and S. Mitter

Volume 24: Mathematics in Industrial Problems, Part 2

by Avner Friedman

Volume 25: Solitons in Physics, Mathematics, and Nonlinear Optics

Editors: Peter J. Olver and David H. Sattinger
Volume 26: Two Phase Flows and Waves
Editors: Daniel D. Joseph and David G. Schaeffer

Volume 27: Nonlinear Evolution Equations that Change Type

Editors: Barbara Lee Keyfitz and Michael Shearer

Volume 28: Computer Aided Proofs in Analysis

Editors: Kenneth Meyer and Dieter Schmidt

Volume 29: Multidimensional Hyperbolic Problems and Computations

Editors: Andrew Majda and Jim Glimm

Volume 30: Microlocal Analysis and Nonlinear Waves

Editors: Michael Beals, R. Melrose and J. Rauch

Volume 31: Mathematics in Industrial Problems, Part 3

by Avner Friedman

Volume 32: Radar and Sonar, Part I

by Richard Blahut, Willard Miller, Jr. and Calvin Wilcox

Volume 33: Directions in Robust Statistics and Diagnostics: Part I

Editors: Werner A. Stahel and Sanford Weisberg

Volume 34: Directions in Robust Statistics and Diagnostics: Part II

Editors: Werner A. Stahel and Sanford Weisberg

Volume 35: Dynamical Issues in Combustion Theory

Editors: P. Fife, A. Liñán and F.A. Williams

Volume 36: Computing and Graphics in Statistics

Editors: Andreas Buja and Paul Tukey

Volume 37: Patterns and Dynamics in Reactive Media

Editors: Harry Swinney, Gus Aris and Don Aronson

Volume 38: Mathematics in Industrial Problems, Part 4

by Avner Friedman

Volume 39: Radar and Sonar, Part II

Editors: F. Alberto Grünbaum, Marvin Bernfeld and Richard E. Blahut
Volume 40: Nonlinear Phenomena in Atmospheric and Oceanic Sciences
Editors: George F. Carnevale and Raymond T. Pierrehumbert

Volume 41: Chaotic Processes in the Geological Sciences

Editor: David A. Yuen

Volume 42: Partial Differential Equations with Minimal Smoothness and Applications
Editors: B. Dahlberg, E. Fabes, R. Fefferman, D. Jerison, C. Kenig and
J. Pipher

Volume 43: On the Evolution of Phase Boundaries

Editors: Morton E. Gurtin and Geoffrey B. McFadden

Volume 44: Twist Mappings and Their Applications

Editors: Richard McGehee and Kenneth R. Meyer

Volume 45: New Directions in Time Series Analysis, Part I

Editors: David Brillinger, Peter Caines, John Geweke, Emanuel Parzen,
Murray Rosenblatt, and Murad S. Taqqu

Volume 46: New Directions in Time Series Analysis, Part II

Editors: David Brillinger, Peter Caines, John Geweke, Emanuel Parzen,
Murray Rosenblatt, and Murad S. Taqqu

Volume 47: Degenerate Diffusions

Editors: W.-M. Ni, L.A. Peletier, J.-L. Vazquez

Volume 48: Linear Algebra, Markov Chains and Queueing Models

Editors: Carl D. Meyer and Robert J. Plemmons

Volume 49: Mathematics in Industrial Problems, Part 5

by Avner Friedman

Volume 50: Combinatorial and Graph-Theoretic Problems in Linear Algebra

Editors: Richard Brualdi, Shmuel Friedland and Victor Klee

Volume 51: Statistical Thermodynamics and Differential Geometry of

Microstructured Materials
Editors: H. Ted Davis and Johannes C.C. Nitsche

Volume 52: Shock Induced Transitions and Phase Structures

Editors: J.E. Dunn, Roger Fosdick and Marshall Slemrod
Volume 53: Variational and Free Boundary Problems
Editors: Avner Friedman and Joel Spruck

Volume 54: Microstructure and Phase Transitions

Editors: D. Kinderlehrer, R. James and M. Luskin

Volume 55: Turbulence in Fluid Flows: A Dynamical Systems Approach

Editors: C. Foias, G.R. Sell and R. Temam

Volume 56: Graph Theory and Sparse Matrix Computation

Editors: Alan George, John R. Gilbert and Joseph W.H. Liu

Volume 57: Mathematics in Industrial Problems, Part 6

by Avner Friedman

Volume 58: Semiconductors, Part I

Editors: W.M. Coughran, Jr., Julian Cole, Peter Lloyd and Jacob White

Volume 59: Semiconductors, Part II

Editors: W.M. Coughran, Jr., Julian Cole, Peter Lloyd and Jacob White

Forthcoming Volumes:
Phase Transitions and Free Boundaries
Free Boundaries in Viscous Flows
Applied Linear Algebra
Linear Algebra for Signal Processing
Linear Algebra for Control Theory
Summer Program Environmental Studies
Environmental Studies
Control Theory
Robust Control Theory
Control Design for Advanced Engineering Systems: Complexity, Uncertainty, In-
formation and Organization
Control and Optimal Design of Distributed Parameter Systems
Flow Control
Robotics
Nonsmooth Analysis & Geometric Methods in Deterministic Optimal Control
Systems & Control Theory for Power Systems
Adaptive Control, Filtering and Signal Processing
Discrete Event Systems, Manufacturing Systems, and Communication Networks
Mathematical Finance
FOREWORD

This IMA Volume in Mathematics and its Applications

SEMICONDUCTORS, PART I

is based on the proceedings of the IMA summer program "Semiconductors." Our
goal was to foster interaction in this interdisciplinary field which involves electrical
engineers, computer scientists, semiconductor physicists and mathematicians, from
both university and industry. In particular, the program was meant to encourage the
participation of numerical and mathematical analysts with backgrounds in ordinary
and partial differential equations, to help get them involved in the mathematical as-
pects of semiconductor models and circuits. We are grateful to W.M. Coughran, Jr.,
Julian Cole, Peter Lloyd, and Jacob White for helping Farouk Odeh organize this
activity and trust that the proceedings will provide a fitting memorial to Farouk.
We also take this opportunity to thank those agencies whose financial support
made the program possible: the Air Force Office of Scientific Research, the Army
Research Office, the National Science Foundation, and the Office of Naval Research.

Avner Friedman

Willard Miller, Jr.

Preface to Part I
Semiconductor and integrated-circuit modeling are an important part of the high-
technology "chip" industry, whose high-performance, low-cost microprocessors and
high-density memory designs form the basis for supercomputers, engineering work-
stations, laptop computers, and other modern information appliances. There are a
variety of differential equation problems that must be solved to facilitate such mod-
eling.
During July 15-August 9, 1991, the Institute for Mathematics and its Applica-
tions at the University of Minnesota ran a special program on "Semiconductors." The
four weeks were broken into three major topic areas:
1. Semiconductor technology computer-aided design and process modeling dur-
ing the first week (July 15-19, 1991).
2. Semiconductor device modeling during the second and third weeks (July 22-
August 2, 1991).
3. Circuit analysis during the fourth week (August 5-9, 1991).
This organization was natural since process modeling provides the geometry and
impurity doping characteristics that are prerequisites for device modeling; device
modeling, in turn, provides static current and transient charge characteristics needed
to specify the so-called compact models employed by circuit simulators. The goal
of this program was to bring together scientists and mathematicians to discuss open
problems and algorithms to solve them, and to form bridges between the diverse
disciplines involved.
The program was championed by Farouk Odeh of the IBM T. J. Watson Research
Center. Sadly, Dr. Odeh met an untimely death. We have dedicated the proceedings
volumes to him.
In this volume, we have combined the papers from the process modeling (week
1) and circuit simulation (week 4) portions of the program.
Processing starts with a pristine wafer of material (for example, silicon). Ion
implantation and diffusion are often used to introduce conductive impurities in a
controlled way. Chemical and ion etching are used to change surface features. In
the case of silicon, oxide is grown when an insulator is necessary. A wide variety of
techniques are used to alter, shape, remove, and add various materials that form the
final complex structure. The papers on process modeling in this volume include an
overview of technology computer-aided design, TCAD, as well as papers on plasma
and diffusion processes used in integrated-circuit fabrication.
Circuit simulation starts with models for resistors, capacitors, inductors, diodes,
and transistors as a function of input voltage or current; often the diode and transistor
models are extracted from device simulations. The goal is to model accurately and
rapidly the response of an aggregation of such individual devices. Multirate integra-
tion of systems of differential-algebraic equations is an important part of large-scale
circuit simulation, of which "waveform relaxation" is a popular instance. The papers
describe techniques for dealing with a number of circuit-analysis problems.
W. M. Coughran, Jr.
Murray Hill, New Jersey
Julian Cole
Troy, New York
Peter Lloyd
Allentown, Pennsylvania
Jacob White
Cambridge, Massachusetts
CONTENTS

Foreword
Preface

SEMICONDUCTORS, PART I
Process Modeling

IC technology CAD overview
P. Lloyd
The Boltzmann-Poisson system in weakly collisional sheaths
S. Hamaguchi, R. T. Farouki, and M. Dalvie
An interface method for semiconductor process simulation
M. J. Johnson and Carl L. Gardner
Asymptotic analysis of a model for the diffusion of dopant-defect pairs
J.R. King
A reaction-diffusion system modeling phosphorus diffusion
Walter B. Richardson, Jr.
Atomic diffusion in GaAs with controlled deviation from stoichiometry
Ken Suto and Jun-Ichi Nishizawa

Circuit Simulation

Theory of a stochastic algorithm for capacitance extraction in
integrated circuits
Yannick L. Le Coz and Ralph B. Iverson
Moment-matching approximations for linear(ized) circuit analysis
Nanda Gopal, Ashok Balivada, and Lawrence T. Pillage
Spectral algorithm for simulation of integrated circuits
O.A. Palusinski, F. Szidarovszky, C. Marcjan, and M. Abdennadher
Convergence of waveform relaxation for RC circuits
Albert E. Ruehli and Charles A. Zukowski
Switched networks
J. Vlach and D. Bedrosian
SEMICONDUCTORS, PART II

Device Modeling

On the Child-Langmuir law for semiconductors
N. Ben Abdallah and P. Degond
A critical review of the fundamental semiconductor equations
G. Baccarani, F. Odeh, A. Gnudi and D. Ventura
Physics for device simulations and its verification by measurements
Herbert S. Bennett and Jeremiah R. Lowney
An industrial perspective on semiconductor technology modeling
Peter A. Blakey and Thomas E. Zirkle
Combined device-circuit simulation for advanced semiconductor
devices
J.F. Burgler, H. Dettmer, C. Riccobene,
W.M. Coughran, Jr., and W. Fichtner
Methods of the kinetic theory of gases relevant to the kinetic
models for semiconductors
Carlo Cercignani
Shock waves in the hydrodynamic model for semiconductor devices
Carl L. Gardner
Macroscopic and microscopic approach for the simulation of
short devices
A. Gnudi, D. Ventura, G. Baccarani and F. Odeh
Derivation of the high field semiconductor equations
P.S. Hagan, R.W. Cox and B.A. Wagner
Energy models for one-carrier transport in semiconductor devices
Joseph W. Jerome and Chi-Wang Shu
Some applications of asymptotic methods in semiconductor
device modeling
Leonid V. Kalachev
Discretization of three dimensional drift-diffusion equations
by numerically stable finite elements
Thomas Kerkhoven
Mathematical modeling of quantum wires in periodic
heterojunction structures
Thomas Kerkhoven
Numerical simulation of MOS transistors
Erasmus Langer
Scattering theory of high frequency quantum transport
H.C. Liu
Accelerating dynamic iteration methods with application to
semiconductor device simulation
Andrew Lumsdaine and Jacob K. White
Boundary value problems in semiconductors for the stationary
Vlasov-Maxwell-Boltzmann equations
F. Poupaud
On the treatment of the collision operator for hydrodynamic
models
Luis G. Reyna and Andres Saul
Adaptive methods for the solution of the Wigner-Poisson
system
Christian Ringhofer
The derivation of analytic device models by asymptotic
methods
Christian Schmeiser and Andreas Unterreiter
Symmetric forms of energy-momentum transport models
Michael Sever
Analysis of the Gunn effect
H. Steinrück and P. Szmolyan
Some examples of singular perturbation problems in
device modeling
Michael J. Ward, Luis Reyna and F. Odeh
Dedication

Farouk Odeh (1933 - 1992)

Farouk Odeh died unexpectedly at Yorktown Heights, New York, on May 6,
1992, at the age of 58. He was born on July 4, 1933, at Nablus, Palestine. Odeh
spent most of his distinguished career working in the Mathematical Sciences De-
partment of the Thomas J. Watson Research Center of IBM at Yorktown Heights.
He made important contributions to mathematical and numerical analysis, pure
and applied mathematics, and physics.
After finishing his undergraduate education at the University of Cairo, Egypt,
in 1955, Odeh lectured for a year at the Teachers College in Amman, Jordan. In
1956 he became a graduate student and Research Assistant of his thesis advisor,
B. Friedman of the Mathematics Department at the University of California at
Berkeley where he received his Ph.D. in Applied Mathematics in 1960. Odeh's
thesis and related work were in scattering and radiation theory. Working with J.
Keller as a young post-doctoral researcher, Odeh gave the first rigorous treatment of
the structure and properties of solutions to a class of partial differential equations
with periodic coefficients which includes the Schroedinger equation. This early
work, which was referred to often and is still being cited, provided a mathematical
foundation for the theory of Bloch waves in the multidimensional case which is at
the heart of crystal band theory.
Odeh joined IBM in 1960 and, with the exception of several temporary outside
assignments, he remained at the Watson Research Center at Yorktown Heights,
New York until his death. In 1962/63 he was a Temporary Member of the Courant
Institute of Mathematical Sciences of New York University, and from 1976 through
1978 he was a Visiting Member of that institute. In 1967/68 Odeh spent a sabbatical
year as a Visiting Professor at the Department of Mathematics of the American
University in Beirut, Lebanon. From 1985 on, Odeh was Manager of the Differential
Equations Group at the Watson Research Center. During this time, several young
researchers joined that group whose work now is devoted mainly to the theory and
numerical solution of partial differential equations.
Mechanics, Superconductivity: During his early years at IBM, Odeh made
important contributions in various areas of applied mathematics, notably in me-
chanics and superconductivity. Working with I. Tadjbakhsh, he studied existence
problems in elasticity, and visco-elasticity. In superconductivity, he proved existence
and uniqueness results for solutions to the linear differential equations of the Lon-
don model, and to the integro-differential equations governing the non-local Pippard
model. In cooperation with H. Cohen, Odeh studied the dynamics of the switching
between the superconducting and normal conducting states. Later on, he worked
on existence and bifurcation problems in the framework of the nonlinear Ginzburg-
Landau theory of superconductivity and calculated magnetic field distributions and
critical fields in superconducting films, based on a nonlocal generalization of the
Ginzburg-Landau theory.
Ordinary Differential Equations: Starting in the 1970's, Odeh began his
very fruitful work on the theory of numerical integration methods for systems of
stiff ordinary differential equations, i.e. systems which simultaneously possess very
fast and very slow modes. It was a great privilege for me to cooperate with him in
this area of numerical analysis. Early on, Odeh gave an elegant proof of linear sta-
bility of multistep integration formulas for arbitrarily large, fixed steps by applying
the powerful degree theory of analytic maps on Riemann surfaces. Around 1975,
Odeh proposed a technique, based on l2-estimates, for obtaining stability results
for A-stable formulas as applied to stiff dissipative nonlinear systems of differential
equations.
In solving stiff systems with smooth solutions, methods of higher accuracy and
less stability are both more natural and more efficient than A-stable formulas. Such
methods were in practical use for a long time but, when applied to nonlinear prob-
lems, their stability properties were not understood. In 1981, Odeh and O. Nevan-
linna introduced a new analysis technique, the multiplier theory, for measuring
simultaneously the reduced stability of a method and the degree of smoothness of
the stiff system to which the method is applied. Integration methods are charac-
terized by properties of their multipliers, which are special l1-sequences associated
with such methods. The multiplier determines the adequacy of a method in the
nonlinear regime. By combining this theory with Liapunov and functional analysis
techniques, Odeh and Nevanlinna proved convergence of methods with appropriate
multipliers. They also gave strategies for selecting steps and adaptively choosing
methods during the course of integration to ensure convergence of the numerical
solution and to optimize efficiency.
One of the by-products of the multiplier theory is a numerical "uncertainty
principle" which quantifies the incompatibility of extreme accuracy and stability
requirements for multistep methods. The theory has application to the waveform
relaxation method for circuit simulation discussed in the subsequent paragraph, to
the discretization of hydrodynamic flows of low Reynolds numbers, and to stability
aspects of multivariable control systems.
From a mathematical viewpoint, the introduction of the multiplier theory is
considered, by experts in the field, as one of the most sophisticated achievements of
modern numerical analysis. For this contribution Odeh received an IBM Outstand-
ing Innovation Award in 1985.
Waveform relaxation: The advent of very large scale integrated circuits dur-
ing the 1980's led to a need for efficient numerical integration methods for very large,
special systems of ordinary differential equations. The waveform relaxation method,
which was proposed by engineers for the computer simulation of circuits and de-
vices, answered this need. Working with A. Ruehli, J. White and O. Nevanlinna,
Odeh made important contributions to the theory and practice of the waveform
relaxation method. Among the theoretical contributions was a convergence proof
for the discrete version of this method. This proof made use of the multiplier theory
mentioned above and was valid under physically realistic assumptions. Odeh also
contributed to a deeper understanding of the waveform relaxation method and its
robustness, and thereby to improvements in its practical implementation. His work
on this method also led to the first A-stability theorem for multirate integration
methods for ordinary differential equations of order greater than one.
Semiconductors: Odeh's work on semiconductor device simulation, which be-
gan during the 1980's, led to some of the most significant achievements of his scien-
tific career. At the center of these are his contributions to the hydrodynamic model,
a generalization of the popular drift-diffusion model which, until recently, was used
successfully for semiconductor device simulation. The hydrodynamic model takes
into account non-local effects, such as velocity overshoots in response to rapid vari-
ation of the electric field, and hot electron effects, such as impact ionization. These
effects became important to the understanding of devices because of advances in
VLSI technology and progressive device miniaturization. Device simulation rests
upon the Boltzmann transport equation. The drift-diffusion model is based upon
a first moment approximation to that equation. The more general hydrodynamic
model, on the other hand, takes into consideration the first two moments of the
Boltzmann equation, yielding the charge density equation, the momentum balance
equation, and the energy balance equation.
Odeh, together with M. Rudan of the University of Bologna, was the first to
seriously analyze the equations of the hydrodynamic model which had been pro-
posed during the 1970's. Odeh and his collaborator proposed the first convergent
multi-dimensional discretization of the full hydrodynamic equations and moreover,
this discretization could easily be incorporated in existing drift-diffusion codes. In
collaboration with M. Rudan and A. Gnudi of the University of Bologna, he was
also the first to simulate actual one- and two-dimensional devices using the full
set of hydrodynamic equations and to achieve good agreement with experimental
results. More recently, Odeh with G. Baccarani, M. Rudan, and A. Gnudi of the
University of Bologna, has proposed, analyzed, and implemented a new alternative
spherical harmonics approximation to the Boltzmann equations whose range of ap-
plicability extends beyond that of the hydrodynamic model. This is a fundamental
and difficult extension which is expected to have considerable impact in the field of
semiconductor device simulation.
In addition to the practical analyses and implementations mentioned briefly
above, Odeh led an extensive effort to develop a mathematical foundation and the-
ory for semiconductor design and simulation. There are several aspects to this part
of his work. With M. Ward, D. Cohen, and L. Reyna, he successfully applied asymp-
totic techniques to obtain such a theory for the current/voltage relationships within
one-dimensional and quasi-one dimensional devices. This provided both a mathe-
matical basis for existing engineering analyses and extended that theory. With E.
Thomann, he made important contributions to the question of the well-posedness
of the hydrodynamic equations, determining appropriate boundary conditions for
the hydrodynamic model in both the steady state and time dependent cases. With
H. Steinrueck, using Wigner functions, he also made important contributions to
the modelling and simulation of devices where quantum effects are important. His
knowledge of both the physics and the engineering aspects of semiconductor de-
vices, combined with his extensive mathematical knowledge and talents, made him
unique within the mathematical semiconductor community.
The work of Odeh and his collaborators on the hydrodynamic model has gener-
ated an enormous literature in mathematics and in device physics and simulation.
Because of his contributions, Odeh is sometimes referred to, by researchers in this
field, as "the father of the hydrodynamic model". The hydrodynamic model has
been incorporated in FIELDAY, the official IBM software tool for device simula-
tion. Odeh's contributions to this field have spanned mathematical and numerical
analysis, electrical engineering and quantum theory of devices and were honored by
an IBM Outstanding Innovation Award in 1991.
Some of the information used in this obituary was taken from personal commu-
nications by W. Coughran, Jr., G. Dahlquist, C. Gardner, and J. Keller.
For his colleagues and many friends, the passing of Farouk Odeh is a great loss.
He was an extremely talented and creative mathematician and scientist and a very
sensitive human being. All of us were enriched by our association with Farouk's
warm friendship and collegiality and with his great scientific knowledge and insights.
We will sorely miss him.

Werner Liniger
Thomas J. Watson Research Center
LIST OF PARTICIPANTS

Aarden, J. University of Nijmegen

Baccarani, Giorgio University of Bologna
Bennett, Herbert NIST
Biswas, Rana Iowa State University
Blakey, Peter Motorola Corporation
Borucki, Leonard Motorola Corporation
Buergler, Josef ETH Zurich
Casey, Michael University of Pittsburgh
Cercignani, Carlo Politecnico di Milano
Cole, Dan IBM GPD
Cole, Julian Rensselaer Polytechnic Institute
Coughran, Jr., William AT&T Bell Labs
Cox, Paul Texas Instruments
Degond, Pierre Ecole Polytechnique
Gaal, Steven University of Minnesota
Gardner, Carl Duke University
Gartland, Chuck Kent State University
Gerber, Dean IBM
Giles, Martin University of Michigan
Glodjo, Arman University of Manitoba
Gnudi, Antonio Universita Degli Studi Di Bologna
Grubin, Harold Scientific Research Associates
Hagan, Patrick Los Alamos National Lab
Hamaguchi, Satoshi IBM
Henderson, Mike IBM
Jerome, Joseph W. Northwestern University
Johnson, Michael IBM
Kalachev, Leonid Moscow State University
Kerkhoven, Thomas University of Illinois, Urbana
King, John University of Nottingham
Kundert, Ken Cadence Design Systems
Langer, Erasmus Technical U. Vienna
Law, Mark University of Florida
Leimkuhler, Ben University of Kansas
Liniger, W. IBM
Liu, H.C. National Research Council, Ottawa
Liu, Sally AT&T Bell Labs
Liu, Xu-Dong UCLA
Lloyd, Peter AT&T Bell Labs
Lojek, Robert Motorola
Lumsdaine, Andrew MIT
Makohon, Richard University of Portland
Meinerzhagen, Berndt Technischen Hochschule Aachen
Melville, Robert AT&T Bell Labs
O'Malley, Robert E. Rensselaer Polytechnic Institute
Odeh, Farouk IBM
Palusinski, O. University of Arizona
Perline, Ron Drexel University
Petzold, Linda R. University of Minnesota
Pidatella, Rosa Maria Citta' Universitaria, Italy
Pillage, Larry University of Texas
Please, Colin Southampton University
Poupaud, Frederic University of Nice
Reyna, Luis IBM
Richardson, Walter University of Texas at San Antonio
Ringhofer, Christian Arizona State University
Rose, Donald J. Duke University
Rudan, Massimo University of Bologna
Ruehli, Albert IBM
Schmeiser, Christian TU-Wien-Austria
Seidman, Tom U.of Maryland-Baltimore County
Sever, Michael Hebrew University
Singhal, K. AT&T Bell Labs
So, Wasin IMA
Souissi, Kamel IBM
Strojwas, Andre Carnegie Mellon University
Suto, Ken Tohoku University
Szmolyan, Peter TU-Wien-Austria
Tang, Henry IBM
Thomann, Enrique Oregon State University
Venturino, Ezio University of Iowa
Vlach, Jeri University of Waterloo
Ward, Michael Stanford University
White, Jacob MIT
Wrzosek, Darek University of Warsaw
Young, Richard A. University of Portland
IC TECHNOLOGY CAD OVERVIEW*
P. LLOYD**

Abstract. This paper gives an overview of predictive Technology CAD tools for simulating
and modeling the fabrication and electrical behavior of integrated circuits. Recent trends in the
integration of process, device, and circuit simulation tools, and the current emergence of UNIX-
based computing environments of networked workstations make possible user-friendly, task-based
CAD systems for technology optimization, characterization, and cell design.

1. Introduction.
In the electronics industry, CAE and CAD tools/systems have played a critical
role in reducing non-recurring engineering (NRE) costs, improving product quality
and shortening time-to-market intervals. The productivity and quality gain in the
semiconductor electronics industry stems from both shorter design intervals and more
robust design verification keyed to process capability and has depended extensively
on simulation. At the lowest level of the CAD tool hierarchy are the circuit, device
and process simulators which link circuit design to fabrication. Detailed process
and device simulation can play a key role in generating data for modeling of circuit
performance prior to fabrication. Predictive capabilities enable early delivery of
accurate compact models which are critical to high performance cell and detailed
sub-system design. These Technology CAD tools for modeling the fabrication and
electrical behavior of integrated circuits are rapidly gaining in maturity. Smooth
interfacing and integration of the various modeling tools has been a recent trend
particularly in industries where Technology CAD has become an integral part of IC
development and is given organizational focus [1].
Initially, predictive Technology CAD (TCAD) tools are generally a substitute
for physical experimentation to save time, effort and money, and to provide addi-
tional insight. Later, tools are integrated into a TCAD system and an optimization
capability is added to aid in the evaluation of competing technology alternatives.
Furthermore, it is recognized that the manufacturing process has inherent variabil-
ity as do the operating and physical environments in which the products have to
work. These variations cause product behavior to deviate from the nominal design
resulting in a reduction of yield. Traditionally, circuits have been designed by us-
ing a worst-case approach, often sacrificing either performance or yield. Technology
CAD tools can be used to make the production process and device and circuit design
less sensitive to the inherent variations. Figure 1.1 illustrates various components
in a TCAD system and their relationship in the context of IC manufacturing and
design.
The evolution of IC technologies is primarily driven by the need for small, light,
fast, low power, reliable electronic circuits in military and commercial systems and

*Presented 7/15/91 at the Summer Program on Semiconductors, Institute for Mathematics and
Its Applications, University of Minnesota.
** AT&T- Bell Laboratories, Allentown, Pennsylvania 18103.

information processing systems in particular. Currently the IC market is dominated
by bipolar and MOS products. As of 1991, the minimum feature length in
MOS ICs is 0.5-0.8 μm. It is expected to shrink to 0.35 μm in 1995, 0.25 μm in
1998 and 0.18 μm in 2001. While TCAD tools have been successfully applied to current
IC technologies, TCAD tools need to be extensively enhanced to better model the
characteristics of future fine-line technologies. Device simulation tools have been
supported by extensive research and development during the last two decades in the
areas of physics and numerics, and the best tools have paced technology evolution.
On the other hand, process simulation capabilities have followed behind cutting-
edge process development due to the number of effects to be modeled, the high
degree of interaction between physical mechanisms and the challenge of establishing
appropriate numerical methods. Device models and circuit simulation need to retain
their accuracy, robustness and speed to sustain the design of the very large circuits
which become feasible as dimensions are reduced.
[Figure: block diagram of the tool flow, with blocks for Process Modeling (simulate fabrication: oxidation, implantation, diffusion), Device and Interconnect Modeling (simulate electrical behavior), Compact Modeling (fit compact models to characteristics) and Circuit Simulation (DC, AC and transient circuit performance of cells and circuits)]

Figure 1.1. Tool Sets in Technology CAD

TCAD is maturing and several TCAD vendors have entered the market place.
With support from universities, the CAD Framework Initiative (CFI) and a number
of semiconductor companies, framework standards for TCAD are emerging. The
advances in TCAD framework will help streamline TCAD applications. Neverthe-
less, there are many challenging issues yet to be addressed, for example, efficient
simulation on massively parallel compute environments.
This paper provides an overview of TCAD tools and integration of these tools
into TCAD systems for deployment and development. These tools include those
for process, device and circuit simulation, parameter extraction and optimization.
TCAD tools can be used independently or coupled together to form tasks. Appli-
cations of these tools and systems at AT&T to technology development and circuit
design for optimization, characterization, and verification will be illustrated.

2. Process simulation.
Accurate simulation of an IC device structure begins with an accurate repre-
sentation of the geometry and material properties of the structure. To obtain this
representation a simulation of the individual process steps involved in fabricating
the device is performed. A typical silicon IC process consists of several sequences of
patterning the Si wafer, locally implanting impurities (acceptors or donors) into
the exposed regions, followed by high temperature activation to get the impurities
into the regions desired at the proper concentrations. Key device characteristics
such as switching speed and threshold voltage depend on precise control of the
doping profiles in the device. To prevent numerous iterations of processing wafers
to achieve the desired device characteristics, simulation of the incorporation and
diffusion of the impurities is necessary. Once the profile of acceptors and donors in
the device is simulated, this profile can be used as input into a device simulator to
predict device characteristics before fabrication.
Ion implantation is the most common method of locally incorporating impurities
into a silicon wafer. High energy ions of the species of interest (typically boron,
arsenic, or phosphorus) are accelerated and bombarded into the silicon surface.
The resultant distribution of ions in the wafer is usually described by a Pearson
IV distribution in the vertical direction, and an error function distribution in the
lateral directions where vertical mask edges exist. The moments of the distributions
are determined by the kinetic energy and the mass of the ion. Ion implantation also
results in a distribution of interstitial point defects and longer range defects which
must also be accounted for.
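
As an illustration of the distributions described above, the following Python sketch evaluates a simplified implant profile. It is a sketch under stated assumptions only: a Gaussian (two-moment) profile stands in for the Pearson IV distribution, which also requires skewness and kurtosis, and the dose, projected range and straggle values are hypothetical rather than taken from any implant table. The function and parameter names are illustrative and do not come from BICEPS or any other simulator.

```python
import math


def implant_profile(depth_um, dose_cm2, rp_um, drp_um):
    """Gaussian approximation of the implanted concentration (cm^-3) at a depth.

    rp_um is the projected range and drp_um the vertical straggle, both in
    micrometres; dose_cm2 is the implanted dose in ions per cm^2.
    """
    drp_cm = drp_um * 1e-4  # micrometres to centimetres
    peak = dose_cm2 / (math.sqrt(2.0 * math.pi) * drp_cm)
    return peak * math.exp(-0.5 * ((depth_um - rp_um) / drp_um) ** 2)


def lateral_factor(x_um, mask_edge_um, lateral_straggle_um):
    """Error-function roll-off near a vertical mask edge.

    The implant window is assumed open for x > mask_edge_um. The result is the
    fraction of the unmasked concentration found at lateral position x_um.
    """
    arg = (mask_edge_um - x_um) / (math.sqrt(2.0) * lateral_straggle_um)
    return 0.5 * math.erfc(arg)


# Hypothetical values: dose 1e13 cm^-2, range 0.1 um, straggle 0.02 um.
peak_concentration = implant_profile(0.1, 1.0e13, 0.1, 0.02)
at_mask_edge = peak_concentration * lateral_factor(1.0, 1.0, 0.015)
```

Directly under the mask edge the lateral factor is one half, so the concentration there is half of its value far inside the open window.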
After the impurities are incorporated into the wafer, they must be activated, or
moved onto lattice sites. This is done with a high temperature anneal process which
not only activates the impurities, but causes diffusion of the impurities as well. The
rate of diffusion is dependent on the silicon temperature, the gradient of the impurity
concentration, the local electric field, and the local point defect concentration by
which the impurities may move. Often the silicon surface is exposed to an oxidizing
ambient during diffusion either to grow a thin oxide for the gate of the device,
a thick oxide to isolate devices, or an oxide to be used as the mask layer for a
future process step. This oxidation results in a movement of the surface boundary,
segregation of the impurities at the interface of the growing oxide, and generation
of interstitial defects which diffuse into the silicon. All of these phenomena must be
taken into consideration to accurately calculate the resultant impurity profile. The
diffusion equations are solved numerically on an appropriate multi-dimensional grid
and with appropriate time steps so as not to add numerical errors to the solution in
addition to errors introduced by assumptions on which the formulations are based.
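
The role of grid spacing and time step can be seen in a minimal one-dimensional sketch. The code below is an assumption-laden illustration, not the scheme used by any particular process simulator: it takes a constant diffusivity, ignores electric-field, point-defect and moving-boundary effects, and uses a simple explicit finite-difference update with zero-flux boundaries. The function name and arguments are hypothetical.

```python
def diffuse_1d(conc, diffusivity, dx, dt, steps):
    """Explicit finite-difference solution of dC/dt = D * d2C/dx2.

    conc is a list of concentrations on a uniform grid of spacing dx.
    Zero-flux (reflecting) boundaries are applied at both ends, so the
    total dose on the grid is conserved.
    """
    r = diffusivity * dt / dx ** 2
    if r > 0.5:
        # The explicit scheme is unstable beyond this limit.
        raise ValueError("time step too large: need D*dt/dx^2 <= 0.5")
    c = list(conc)
    n = len(c)
    for _ in range(steps):
        new = c[:]
        for i in range(n):
            left = c[i - 1] if i > 0 else c[i]        # mirror at the left edge
            right = c[i + 1] if i < n - 1 else c[i]   # mirror at the right edge
            new[i] = c[i] + r * (left - 2.0 * c[i] + right)
        c = new
    return c


# A narrow initial spike spreads out while the total dose stays constant.
initial = [0.0] * 41
initial[20] = 1.0
final = diffuse_1d(initial, diffusivity=1.0e-4, dx=0.01, dt=0.4, steps=50)
```

The stability check makes the trade-off explicit: halving the grid spacing requires a time step four times smaller in this explicit scheme, which is one reason implicit time integration is often preferred in practice.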
In a typical CMOS process, n- and p-channel devices are created on the same
wafer. The impurity profile of the initial p-type substrate can be used to control the
threshold voltage of the n-channel device, but a deep tub of n-type impurities must
be created for the p-channel device. The wafer is covered with photoresist, then the
n-type tub region is exposed, implanted and diffused. Next, a new mask is placed
on the surface which protects the regions where the devices will be formed, but
exposes the regions between. A thick oxide to electrically isolate devices is grown
at a high temperature. The mask over the active regions is stripped, and the thin
gate oxide is grown. This is protected immediately with a layer of polysilicon for
the gate material which serves as the mask in the next step for implanting a high
concentration of acceptors into regions on either side of the p-channel gate to create
the source and drain regions of the device. The same is then done for the n-channel
device with a high concentration of donor ions. The wafer is annealed to activate
these source and drain regions, after which the impurity profiles in the devices are
essentially formed. Once simulated in a process simulator [2], as shown in Figure
2.1, these profiles are stored in disk files and can be used by a device simulator to
calculate current-voltage and charge-voltage characteristics.

Figure 2.1. Simulated Profiles of a CMOS Device Structure by BICEPS
As can be seen from the figure, lateral dimensions of the gate and spacer regions
of the devices have shrunk below 1 μm, and vertical dimensions of source and
drain junctions are less than 0.5 μm. Therefore, the physical details of the transient
diffusion that occurs in the very short time scales which are used to achieve such
dimensions are increasingly important in modeling the submicron devices. Also,
the exact structure of masking and oxide layers, as a result of deposition and etching
processes above the silicon, plays a significant role in determining the lateral
profile of the devices. Simulation of these processes is becoming essential. Additionally,
devices are shrinking in all three dimensions, requiring three-dimensional
simulation with appropriate visualization.

3. Device modeling.
As semiconductor devices continue to shrink in size, and as new technologies emerge,
device structures become more complicated and the need for physically-based
numerical device simulation grows. Simulation tools are needed for the design of
devices and in order to gain insight into new physical effects. At present, device
modeling has become a necessary and integral element in any new process or tech-
nology development effort.
Device modeling is accomplished by solving the basic equations governing the
behavior of semiconductor devices: Maxwell's equations of electromagnetism
and the Boltzmann Transport Equation (BTE).
A direct approach to solving the BTE is the Monte Carlo method. This technique
simulates, at a microscopic level, the transport process of mobile carriers. The Monte
Carlo approach has proven to be successful in simulating transport effects. However,
its primary drawback is the enormous cost associated with the long CPU time
required, particularly when coupled with Poisson's equation. The Hydrodynamic
model or the Momentum and Energy Balance equations are alternative approaches
to solving the BTE. However, the simplest form of the transport equation is that of
the Drift and Diffusion model. This model, which can be derived from the hydrodynamic
model, comprises electron and hole current continuity equations coupled
with Poisson's equation.
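
For reference, the drift-diffusion system can be written in its commonly quoted textbook form. The notation below is an assumption introduced here, since the text itself defines no symbols: $\psi$ is the electrostatic potential, $n$ and $p$ the electron and hole densities, $N_D$ and $N_A$ the ionized donor and acceptor densities, $\mathbf{J}_n$ and $\mathbf{J}_p$ the current densities, $R$ the net recombination rate, $\mu$ the mobilities and $D$ the diffusivities.

```latex
\begin{aligned}
\nabla \cdot (\varepsilon \nabla \psi) &= -q\,(p - n + N_D - N_A), \\
\frac{\partial n}{\partial t} &= \frac{1}{q}\,\nabla \cdot \mathbf{J}_n - R, \\
\frac{\partial p}{\partial t} &= -\frac{1}{q}\,\nabla \cdot \mathbf{J}_p - R, \\
\mathbf{J}_n &= -q\,\mu_n\, n\, \nabla \psi + q\, D_n \nabla n, \\
\mathbf{J}_p &= -q\,\mu_p\, p\, \nabla \psi - q\, D_p \nabla p.
\end{aligned}
```

The first line is Poisson's equation, the next two are the electron and hole continuity equations, and the last two express each current as a drift term driven by the field $-\nabla\psi$ plus a diffusion term driven by the density gradient.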
The inputs to device simulation tools are typically a description of impurity dop-
ing profiles obtained from process simulation, as discussed in the previous section,
and device geometry as well as bias conditions. The output will be the electrical
responses e.g. steady-state, transient or small-signal waveforms of currents and
voltages at terminals and/or carrier density, electric field or potential distributions
inside the device.
General two-dimensional (2D) or three-dimensional (3D) device simulation pro-
grams, such as PADRE [3], as well as application-specific tools, such as MEDUSA
[4], can be used to solve a wide range of problems.
There are many examples of device simulations applied in both device optimiza-
tion and reliability improvement. In device optimization, various devices are simu-
lated to quantify the effects of short channel length and narrow channel widths using
2D and 3D device simulators. As far as reliability improvement is concerned, de-
vice simulations have been used to refine CMOS device structure to prevent latchup
problems [3] and gain insight into the effects of hot carriers and velocity overshoot.
As dimensions of electronic devices decrease, resulting in faster switching speed,
delay caused by parasitic capacitance and resistance of interconnections becomes
more significant. The RESCAL program [5] has been developed to solve Laplace's
equation in two dimensions to provide fast and accurate values of distributed capacitance
and resistance. RESCAL also produces plots of equipotential and flux
lines to represent visually the distribution of the electric field. Figure 3.1 shows an
example of a contour plot for a structure in which there are two layers of different
dielectric constants, and two trapezoid-shaped conductors in the lower dielectric
layer over a conducting substrate.

Figure 3.1. Contour Plot of the Electrostatic Field for two Conductors over a Ground Plane and with two Dielectrics
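
To make the idea of extracting capacitance from a Laplace solution concrete, here is a small Python sketch. It deliberately swaps in a plain finite-difference grid with Gauss-Seidel iteration, which is not the boundary technique of [5] used by RESCAL, and it treats the simplest possible geometry: a conductor at 1 V directly above a grounded plane with a single dielectric and zero-flux side walls. All names and the relative permittivity of 3.9 (a typical value for silicon dioxide) are assumptions for illustration.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def solve_laplace(nx, ny, v_top=1.0, tol=1e-10, max_sweeps=20000):
    """Gauss-Seidel solution of Laplace's equation on an nx-by-ny node grid.

    Row 0 is the grounded plane, row ny-1 is the conductor held at v_top,
    and the left and right edges are zero-flux (mirror) boundaries.
    """
    v = [[0.0] * nx for _ in range(ny)]
    v[ny - 1] = [v_top] * nx
    for _ in range(max_sweeps):
        biggest = 0.0
        for j in range(1, ny - 1):
            for i in range(nx):
                left = v[j][i - 1] if i > 0 else v[j][i + 1]
                right = v[j][i + 1] if i < nx - 1 else v[j][i - 1]
                new = 0.25 * (left + right + v[j - 1][i] + v[j + 1][i])
                biggest = max(biggest, abs(new - v[j][i]))
                v[j][i] = new
        if biggest < tol:
            break
    return v


def capacitance_per_length(v, eps_r, v_top=1.0):
    """Capacitance per unit length (F/m) from the flux entering the ground plane."""
    nx = len(v[0])
    flux = 0.0
    for i in range(nx):
        weight = 0.5 if i in (0, nx - 1) else 1.0  # trapezoidal end weights
        flux += weight * (v[1][i] - v[0][i])       # grid spacing cancels out
    return eps_r * EPS0 * flux / v_top


potential = solve_laplace(11, 21)
c_per_m = capacitance_per_length(potential, eps_r=3.9)
```

For this geometry the result can be checked against the parallel-plate formula, permittivity times width over separation; realistic interconnect cross-sections with several dielectrics and shaped conductors need a more capable solver.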

4. Compact models.
Circuit simulations are used to verify IC designs based upon compact device
models, so the models must be able to accurately represent characteristics of de-
vices being manufactured. Also, the device characteristics generated from compact
models provide a reference to which the manufacturing process should be controlled
such that the device characteristics will resemble the reference. Compact models
are thus an important link between IC design and manufacturing.
Aggressive IC design places strong demands on compact device models. The
models must accurately represent the DC, AC and transient behavior of
circuits, and often also the circuit noise performance, distortion level and sensitivity
to variations in manufacturing and operating conditions. In addition, compact
models must be computationally efficient, must be simple enough for robust parameter
extraction, and must model device behavior over a wide range of bias, geometry
and operating temperature. Meeting all these challenges is often difficult. For example,
MOSFETs exhibit an exponential variation of current with applied bias in the
subthreshold region of operation and a polynomial variation of current with applied
bias above threshold. It is also desirable for compact models to have a good basis
in device physics, so that physical understanding can be used to guide model devel-
opment and ongoing model improvements, and so that process variations measured
as changes in test wafer measurements can be mapped, at least to a first order, into
changes in model parameters.
ASIM3, an enhanced version of the model described in [6], is the most advanced
MOSFET model available in the AT&T circuit simulator, ADVICE [7]. ASIM
includes subthreshold conduction, models short and narrow channel effects, and is
based on an advanced mobility model that accounts for mobility reduction due to
gate and backgate fields and due to velocity saturation. ASIM is charge-based, and
accurately models both the partitioning of charge between the source and drain and
the variation of overlap capacitance with bias. Both the geometry and temperature
dependence of MOSFET behavior are modeled by ASIM. ASIM includes models
for noise and substrate injection current. The current and charge models of ASIM
are continuous in function value and derivatives with respect to the applied biases
across all operating regions.
Figure 4.1 shows the output and subthreshold characteristics of ASIM, com-
pared with data from the MEDUSA device simulator. ASIM accurately models the
DC current, output conductance and transconductance of MOSFETs.

[Figure: panels of MOSFET output and subthreshold characteristics plotted against applied voltage (V)]

Figure 4.1. MOSFET Characteristics by ASIM and MEDUSA

Major deficiencies have existed in previous MOSFET models. First, most MOS-
FET models are formulated as regional models, that is, different modeling equations
are used in the subthreshold, triode and saturation regions of operation. Regional
models have limited continuity and display kinks and glitches at the region boundaries.
This causes problems for parameter extraction and DC convergence, limits
the accuracy of distortion analyses, makes some advanced techniques such as ho-
motopy [8] inapplicable to MOS circuits, limits the order of integration that can be
used for transient analyses, and leads to inefficient transient analyses as it causes
small time-steps to be used. Second, most MOSFET models are formulated with
the source node as the reference. This easily causes the model to display asym-
metries with respect to the source and drain, even though MOSFETs usually are
symmetric devices.
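
The continuity problem of regional models can be illustrated with a toy comparison. The Python sketch below is not ASIM and makes no claim about its equations. It contrasts a hypothetical two-region saturation-current model (exponential below threshold, square law above) with a single smooth expression built on a logarithmic interpolation of the gate overdrive, a device also found in other published compact models. All parameter values are invented for illustration.

```python
import math

VTH = 0.7      # threshold voltage in volts (hypothetical)
K = 1.0e-4     # current factor in A/V^2 (hypothetical)
N = 1.5        # subthreshold slope factor (hypothetical)
VT = 0.0259    # thermal voltage near room temperature, in volts
I0 = 1.0e-7    # current at threshold in the regional model (hypothetical)


def regional_current(vgs):
    """Two-region model: value is continuous at VTH but the slope jumps."""
    if vgs < VTH:
        return I0 * math.exp((vgs - VTH) / (N * VT))
    return I0 + K * (vgs - VTH) ** 2


def smooth_current(vgs):
    """Single expression, smooth in vgs across the threshold."""
    x = (vgs - VTH) / (2.0 * N * VT)
    # log(1 + exp(x)) tends to exp(x) for x << 0 and to x for x >> 0.
    soft = x if x > 30.0 else math.log1p(math.exp(x))
    v_eff = 2.0 * N * VT * soft
    return K * v_eff ** 2


def slope(model, vgs, h=1.0e-6):
    """One-sided finite-difference slopes just below and just above vgs."""
    below = (model(vgs) - model(vgs - h)) / h
    above = (model(vgs + h) - model(vgs)) / h
    return below, above


regional_slopes = slope(regional_current, VTH)  # the two values differ widely
smooth_slopes = slope(smooth_current, VTH)      # the two values nearly agree
```

The jump in transconductance of the regional model at threshold is the kind of kink that disturbs Newton-type DC solvers and distortion analysis, while the smooth expression still reduces to an exponential well below threshold and to the square law well above it.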

5. Circuit simulation.
Small feature size and mixed analog and digital components in today's VLSI IC
technologies demand more accurate technology modeling in circuit simulation than
in the past. The emergence of networked workstation environments demands
flexibility in task-oriented procedural simulation and robustness in design centering.
A circuit simulation system normally consists of six components: front-end for
user interactions, macro interface for procedural threads, design centering and opti-
mization, analysis engine, model interface for device models, and graphics display.

- The Front-End
The front-end provides a graphical user interface between the user and the
circuit simulator. In addition, it interacts with a schematic capture program
and also provides remote execution capabilities across the network. User-friendliness
is determined not only by the set of commands but also by its look
and feel.

- The Macro Interface

Through this interface, designers may drive the simulator by programming a
sequence of circuit simulation tasks, including updating the circuit, performing
simulation and analyzing results, thus eliminating repetitive manual procedures.
AT&T's circuit simulator ADVICE contains a C-language based
macro interface, and its procedural simulation controls the circuit simulation
at fine granularity at the simulation-command level by linking the engine
and the interface through UNIX pipes. ADVICE also has a specification-driven
generator [9] which generates C code for macro procedural simulation,
thus relieving designers of the burden of mastering programming tasks.

- The Design Optimization

Successful circuit design often requires design iterations to optimize the circuit.
Normally the optimizer is coupled closely with the macro interface to
control the analysis engine. The simulation language is extended to allow
specifications of design objectives. Statistical models [10] for semiconductor
devices are needed for yield and manufacturability analysis to determine the
impact of manufacturing and environmental variations on a design. Various
optimization algorithms can be used. For example, ADVICE contains
interactive features for design optimization and manufacturability analysis
based on a software system called CENTER [11]. The CENTER system
contains features of both deterministic optimization and manufacturability
analysis. The deterministic optimization in CENTER can be accomplished
by either sequential quadratic programming or random search.

- The Analysis Engine

The engine provides analysis capabilities for dc, ac, noise, sensitivity, tran-
sient, steady-state, Monte-Carlo, loop stability, and others. It handles a
comprehensive set of components including resistors, capacitors, inductors,
transformers, voltage sources, current sources, controlled sources, diodes,
bipolar transistors (BJTs), junction field-effect transistors (JFETs), MOSFETs,
Josephson junction devices, switches, and transmission lines. For most types
of components, there is an associated model with a set of model parameters
to characterize a specific IC technology. For example, ADVICE contains an
extended Gummel-Poon model [12] and a charge-based short channel model,
ASIM, which accurately model the devices in AT&T's technologies. Solving
the circuit equations involves a number of numerical techniques, such as
implicit time integration, Newton-Raphson iterations, homotopy methods,
and sparse matrix techniques.

- The Model Interface

The model interface allows users to introduce their own models into the
simulator [13]. It has proven to be an effective tool for developing new
models for circuit simulation.

- The Graphics Display

Graphics display is as important as the front-end. It allows the user to
visually examine the simulation results mostly via 2-D graphics.
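
Of the numerical techniques named under the analysis engine, Newton-Raphson iteration is the easiest to show in a few lines. The Python sketch below finds the DC operating point of a hypothetical circuit, a voltage source driving a resistor in series with a diode. It is not taken from ADVICE: the diode parameters, the crude step limit used as damping, and the function name are all assumptions made for illustration.

```python
import math


def diode_operating_point(v_source, resistance, i_sat=1e-14, v_thermal=0.0259,
                          max_step=0.1, tol=1e-12, max_iter=100):
    """Solve (v - v_source)/R + i_sat*(exp(v/v_thermal) - 1) = 0 for the diode voltage v."""
    v = 0.0  # initial guess
    for _ in range(max_iter):
        exp_term = math.exp(v / v_thermal)
        residual = (v - v_source) / resistance + i_sat * (exp_term - 1.0)
        derivative = 1.0 / resistance + i_sat * exp_term / v_thermal
        step = -residual / derivative
        # Limit the voltage change per iteration so the exponential cannot
        # run away; a simple stand-in for the limiting used in real simulators.
        step = max(-max_step, min(max_step, step))
        v += step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton-Raphson did not converge")


v_diode = diode_operating_point(5.0, 1000.0)
i_diode = (5.0 - v_diode) / 1000.0
```

In a full simulator the scalar derivative becomes a sparse Jacobian matrix assembled from every device model, which is why continuity of the compact models and sparse matrix techniques both matter for convergence and speed.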

6. TCAD system framework.
In the area of TCAD frameworks, a key issue is the distinction between frame-
works used primarily for (1) tool development and (2) tool integration/deployment.
In order to better differentiate these two, we will refer to a tool development envi-
ronment and a tool deployment framework.
In recent years, TCAD tools have matured to the extent that they are capable,
with moderate accuracy, of modeling the dominant performance-limiting features
of current technologies. With this maturity has come an increasing demand for
accuracy, robustness, integration (with other tools) and ease of use. These demands
place additional constraints on TCAD tool developers and require an increase in
development resources.
This situation has been recognized by several university development groups, as
well as their industrial counterparts. One of the early efforts in this area was the
Profile Interchange Format (PIF) [14] which is now becoming a standard for data
exchange between TCAD simulation tools. Data exchange between TCAD tools is
the first hurdle that must be overcome before tools can be integrated. Integration,
however, is only the first stage of deployment. Other issues such as capability,
accuracy, robustness, ease of use, and user-friendliness all play an important role in
gaining the user's acceptance.
TCAD frameworks can be viewed from two distinct points of view:
• As a framework for the integration of TCAD tools. Examples of such
a framework are the AT&T Mecca system [15] and Intel's Ease system [16],
which integrate process/device and circuit simulation with analysis tools
such as optimization and parameter extraction, as shown in Figure 6.1.
• As an environment, and associated set of support tools, for the development
of TCAD tools as shown in Figure 6.1. We are not aware of any significant
previous work in this area.
The integration framework is of immediate benefit to all TCAD practitioners, as
well as TCAD customer organizations: technology development, technology charac-
terization, and manufacturing. The development environment, on the other hand,
is of importance primarily to universities and industry R&D groups engaged in
TCAD tool development.
The next two subsections expand on the above definitions.

[Figure: Analysis Capabilities (Sensitivity Analysis, Worst-Case Analysis, Optimization) and Simulation Capabilities, joined by a User Interface]

Figure 6.1. An Integrated Set of TCAD Simulation and Analysis Tools

6.1 Summary. We have discussed two views of TCAD frameworks. These two
views are not exclusive. In fact, achieving a standard for TCAD tool development
would result in increased uniformity across the tools, which would greatly ease the
integration of such tools into a common TCAD system.
In addition to the traditional uses of TCAD in technology development and
characterization, opportunities are being pursued in:

Process Control and Diagnosis

Where TCAD tools can be used to develop algorithms for active process
control, and as an aid in diagnosing process faults.

Computer Integrated Manufacturing

Where TCAD tools can be integrated into the manufacturing environment
and used to evaluate and enhance reliability, yield and manufacturability.

7. Applications (Loop-closure, Optimization, Worst case).
Figure 1.1 illustrates how the individual AT&T technology CAD tools are put together into an
integrated system. With this system, given the process description and the structure
and geometry of a device, the compact device model parameters can be determined.
The compact model can be used in the ADVICE circuit simulator to characterize the
circuit performance. Furthermore, the effects of statistical variations in the process
control parameters on the compact model parameters and the circuit performance
can be determined. (Important process control parameters include furnace times
and temperatures, ion implant doses and energies, etc.) This technology CAD
system is a useful aid in predicting performance of circuits, verifying designs, and
developing new or modified technologies.
Extraction of compact circuit model parameters is routinely done for each tech-
nology and technology variant. The extraction is carried out for at least three cases:
nominal, worst-case slow and worst-case fast. For nominal, all the process input
conditions are nominal. For worst-case slow, the process input conditions are such
as to result in the slowest possible circuit performance (for digital MOS circuits,
this amounts to minimum current drive of the transistors), while for worst-case fast,
the process inputs are such as to result in the fastest possible circuit performance
(maximum current drive of the transistors). Thus, each technology is characterized
by a nominal, worst-case slow, and a worst-case fast compact device model. These
models are used in circuit simulators to characterize important or characteristic
subcircuit modules, and the worst-case models are used to verify that designs will
meet their specification limits. Finally, the accuracy of the models is verified versus
test data from the manufacturing line.
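
The three-corner characterization can be sketched with a deliberately simple model. In the Python fragment below, a textbook square-law expression for the transistor on-current stands in for a real compact model, and the channel lengths, threshold voltages and other numbers assigned to each corner are hypothetical. None of them are AT&T process data.

```python
def on_current(mobility_cox, width, length, vdd, vth):
    """Square-law saturation current with gate and drain both at vdd."""
    return 0.5 * mobility_cox * (width / length) * (vdd - vth) ** 2


# Hypothetical corner definitions: slow means long channel and high threshold.
CORNERS = {
    "slow":    {"length": 1.00e-6, "vth": 0.80},
    "nominal": {"length": 0.90e-6, "vth": 0.70},
    "fast":    {"length": 0.80e-6, "vth": 0.60},
}


def corner_currents(mobility_cox=1.0e-4, width=10.0e-6, vdd=5.0):
    """On-current predicted at each corner, in amperes."""
    return {
        name: on_current(mobility_cox, width, p["length"], vdd, p["vth"])
        for name, p in CORNERS.items()
    }


def within_limits(measured, currents):
    """True if a measured on-current lies between the slow and fast predictions."""
    return currents["slow"] <= measured <= currents["fast"]


predicted = corner_currents()
lot_ok = within_limits(predicted["nominal"] * 1.02, predicted)
```

A lot whose measured current falls outside the slow and fast predictions signals either a process excursion or a corner model that needs to be re-extracted.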
Optimization can be done using the CENTER software system, which exploits
features of the UNIX operating system. CENTER can be used to optimize IC
designs and semiconductor device technologies. But only the latter will be discussed
here. Figure 7.1 shows how CENTER is integrated into AT&T's technology CAD
system. The inputs include the technology objectives and constraints. CENTER
contains six numerical methods for nonlinear optimization:
- Sequential quadratic programming
- A projected, augmented Lagrangian algorithm
- A quasi-Newton solver
- A nonlinear least squares minimizer
- The Nelder-Mead simplex method
- Simulated annealing
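
Of the methods listed, simulated annealing is the simplest to write down without a numerical library. The following Python sketch shows the basic idea on an arbitrary test objective. It is an independent illustration and not the CENTER implementation: the geometric cooling schedule, the Gaussian proposal step, the fixed random seed and every name in it are assumptions.

```python
import math
import random


def simulated_annealing(objective, start, step=0.5, t_start=1.0, t_end=1.0e-4,
                        iterations=5000, seed=0):
    """Minimize objective(list_of_floats) by simulated annealing.

    Returns the best point visited and its objective value.
    """
    rng = random.Random(seed)
    current = list(start)
    f_current = objective(current)
    best, f_best = list(current), f_current
    for k in range(iterations):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (k / (iterations - 1))
        candidate = [x + rng.gauss(0.0, step) for x in current]
        f_candidate = objective(candidate)
        delta = f_candidate - f_current
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, f_current = candidate, f_candidate
            if f_current < f_best:
                best, f_best = list(current), f_current
    return best, f_best


def example_objective(point):
    """Hypothetical stand-in for a simulator-based cost function."""
    return (point[0] - 1.0) ** 2 + (point[1] + 2.0) ** 2


best_point, best_value = simulated_annealing(example_objective, [5.0, 5.0])
```

In a technology optimization the objective would wrap one or more simulator runs, so the number of function evaluations, several thousand here, is usually the limiting cost and favors the gradient-based methods on the list when the objective is smooth.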
Verification of the compact device models is done based upon measurements on
each lot in the manufacturing line. Figure 7.2 shows plots for the 0.9 μm CMOS
technology of the measured NMOS on-current (Ion) for 23 lots versus the Ion predicted by the
nominal and the worst-case slow and fast compact models. The measured data are
within the limits predicted by the models. Figure 7.3 shows the probability distribution
of the measured ring oscillator frequency for the 1.25 μm CMOS technology
versus the frequencies predicted by the compact circuit models. The predicted and
measured frequencies agree well.

[Figure: block diagram with blocks Problem Definition, User Interface, CENTER, Simulators and Task Modules]

Figure 7.1. CENTER Optimization System

[Figure: Ring Oscillator Frequencies, 1.25 μm CMOS Technology; measured frequency plotted against Normal Probability (%), with the "FAST" and "SLOW" predictions marked]

Figure 7.3. Predicted and Measured Ring Oscillator Frequencies

8. Current trends and conclusions.

The following summarizes the major observations and trends discussed in this
paper:
• TCAD is maturing. Several vendors have entered the market place.
• Process simulation tools require better physical models and numerical meth-
ods.
• Above-silicon simulation and visualization are increasingly important.
• Device and interconnect models are being improved to meet challenges for
submicron design.
• Circuit simulation will become more robust and efficient for large circuits
with extensive interconnects.
• Algorithms are needed to exploit evolving vector parallel computer archi-
tecture.
• Evolving standards and integration framework provide for collaborations
amongst companies and universities.
• Technology CAD will contribute to manufacturing studies.

9. Acknowledgements.
I thank past and present members of the Technology CAD Department at AT&T
Bell Laboratories for their contributions to the work described here. In addition, I
thank Heinz Dirks, Sally Liu, Jim Prendergast and Kishore Singhal for their help
in preparing this paper.

REFERENCES

[1] P. LLOYD, H.K. DIRKS, E.J. PRENDERGAST, AND K. SINGHAL, Technology CAD for Competitive Products, IEEE Trans. Computer-Aided Design, vol. CAD-9, Nov. 1990.
[2] B.R. PENUMALLI, A Comprehensive Two-Dimensional VLSI Process Simulation Program, BICEPS, IEEE Trans. Electron Devices, vol. ED-30, Sept. 1983.
[3] M.R. PINTO, W.M. COUGHRAN, JR., C.S. RAFFERTY, AND E. SANGIORGI, Device Simulation for Silicon ULSI, Computational Electronics, Ed. K. Hess, J.P. Leburton, and U. Ravaioli, Kluwer Academic Publishers, 1991.
[4] W.L. ENGL, R. LAUR, AND H.K. DIRKS, MEDUSA-A Simulator for Modular Circuits, IEEE Trans. Computer-Aided Design, vol. CAD-1, April 1982.
[5] B.R. CHAWLA AND H.K. GUMMEL, A Boundary Technique for Calculation of Distributed Resistance, IEEE Trans. Electron Devices, vol. ED-17, Oct. 1970.
[6] S.W. LEE AND R.C. RENNICK, A Compact IGFET Model-ASIM, IEEE Trans. Computer-Aided Design, vol. CAD-7, Sept. 1988.
[7] L.W. NAGEL, ADVICE for Circuit Simulation, Proc. ISCAS, Houston, 1980.
[8] L. TRAJKOVIC, R.C. MELVILLE, S.-C. FANG, Improving DC Convergence in a Circuit Simulator Using a Homotopy Method, IEEE Custom Integrated Circuits Conference - CICC-91, San Diego, CA, May 1991.
[9] M.S. TOTH, MakCal: An Application Generator for ADVICE, AT&T Technical Journal, vol. 70, Jan./Feb. 1991.
[10] S. LIU AND K. SINGHAL, A Statistical Model for MOSFETS, IEEE International Conference on Computer-Aided Design - ICCAD-85, Santa Clara, CA, Nov. 1985.
[11] K. SINGHAL, C.C. McANDREW, S.R. NASSIF, AND V. VISVANATHAN, The CENTER Design Optimization System, AT&T Technical Journal, vol. 68, May/June 1989.
[12] G.M. KULL, L.W. NAGEL, S.-W. LEE, P. LLOYD, E.J. PRENDERGAST, AND H.K. DIRKS, A Unified Circuit Model for Bipolar Transistors including Quasi-Saturation Effects, IEEE Trans. Electron Devices, vol. ED-32, June 1985.
[13] S. LIU, K.C. HSU, AND P. SUBRAMANIAM, ADMIT-ADVICE Modeling Interface Tool, Proc. 1988 Custom Integrated Circuits Conference, Rochester, 1988.
[14] S.G. DUVALL, An Interchange Format for Process and Device Simulation, IEEE Trans. Computer-Aided Design, vol. CAD-7, July 1988.
[15] E.J. PRENDERGAST, An Integrated Approach to Modeling, Proc. NASCODE IV, Dublin, June 1985.
[16] J. MAR, K. BHARGAVAN, S.G. DUVALL, R. FIRESTONE, D.J. LUCEY, S.N. NANGAONKAR, S. WU, K.-S. YU, AND F. ZARBAKHSH, EASE-An Application-Based CAD System for Process Design, IEEE Trans. Computer-Aided Design, vol. CAD-6, Nov. 1987.
THE BOLTZMANN-POISSON SYSTEM
IN WEAKLY COLLISIONAL SHEATHS

S. HAMAGUCHI, R. T. FAROUKI, AND M. DALVIE*

Abstract. Ion distribution functions in weakly-collisional direct-current (DC) sheaths and

collisionless radio-frequency (RF) sheaths are discussed from the viewpoint of kinetic theory. Ana-
lytical formulae for the ion distributions in a self-consistent field are obtained for weakly-collisional
sheaths, and are shown to be in good agreement with results from Monte Carlo simulations. A
detailed knowledge of the angular and energy distribution of the ion flux impinging on a surface is
of interest in the plasma-processing of semiconductor materials.

1. Introduction. In various plasma processes used in integrated-circuit (IC)

fabrication technology [1] [2], the outcome of an etch or deposition step can be sensi-
tively dependent on the angular and energy distribution of the ion flux bombarding
the semiconductor wafer. As the dimensions of ICs diminish and more stringent con-
trol over feature shapes and sizes is required, weakly-collisional or collisionless plasma
discharges are increasingly used for such processes, since the uni-directionality of the
ion flux increases as the frequency of ion-neutral collisions in the sheath diminishes.
In such weakly-collisional/collisionless plasmas, however, the incident-ion distributions
are expected to be sensitive functions of such basic plasma parameters as the
applied cathode voltage, sheath thickness, and bulk plasma temperature.
The goal of this paper is to determine the relation between these controllable
plasma parameters and the incident-ion distribution functions. To achieve this, we
discuss ion kinetics in a weakly-collisional direct-current (DC) sheath, based on a
boundary value problem of the steady-state Boltzmann-Poisson system. Assuming
that elastic hard-sphere collisions are the dominant form of ion-neutral encounter
and that the ion mean-free-path is large compared to the sheath thickness, we obtain
the ion velocity distribution function at the cathode to lowest order in the
collisionality parameter (i.e., the ratio of the sheath thickness to the ion mean free
path). Subsequently, angular and energy distributions of the ion flux are calculated
and compared with data from Monte Carlo simulations. We also briefly discuss a
time-periodic Vlasov-Poisson system that describes the ion kinetics in a collision-
less radio-frequency (RF) sheath. In the regime of high RF frequency, we obtain
a simple expression for the characteristic "double-peaked" profile of the ion energy
distribution [3] [4].
This paper is organized as follows: in the following section, the Boltzmann-Poisson
system for a collisional DC sheath is formulated and solved in the limit of small
collisionality. In sections 3 and 4, the angular and energy distributions of the incident-
ion flux in a DC sheath are calculated analytically and numerically. In section 5,
collisionless RF sheaths are discussed briefly. The final section contains conclusions.

* IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598

2. Kinetic equations for DC sheaths. As a model of a DC plasma sheath
[5][6], we consider a steady-state ($\partial/\partial t = 0$), capacitively coupled planar discharge
with the electric field in the z direction. For the sake of brevity, the sheath is assumed
to be composed purely of ions; we define the presheath/sheath boundary as the point
beyond which electrons are significantly depleted. Since, in most planar discharges,
the neutral density $n_g$ is much higher than the ion density $n$, we take into account
only two-body ion-neutral collisions. We also assume that the ions and the neutrals
have equal mass, and that the neutrals are uniformly distributed and cold, i.e., the
velocity distribution of the neutrals is given by $F(\mathbf{v}) = n_g\,\delta(\mathbf{v})$.
The ion distribution function $f(\mathbf{v}, z)$ is then governed by the following
non-dimensionalized Boltzmann-Poisson system:

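(1)  $$u_z \frac{\partial f}{\partial \zeta} - \frac{d\phi}{d\zeta}\frac{\partial f}{\partial u_z} = \frac{d}{\lambda_{\mathrm{mfp}}}\, u \left\{ \int \left(\frac{u'}{u}\right)^{4} f(\mathbf{u}',\zeta)\, \frac{\sigma}{\sigma_{\mathrm{total}}}\, d^2\Omega - f(\mathbf{u},\zeta) \right\},$$

(2)  $$\frac{d^2\phi}{d\zeta^2} = -\int f(\mathbf{u},\zeta)\, d^3u,$$

where $u = |\mathbf{u}|$ and $u' = |\mathbf{u}'|$. Here we have used the normalizations

(3)  $$\mathbf{v} = \mathbf{u}\,\omega_{pi} d, \qquad z = \zeta d, \qquad \Phi = \phi\,\frac{q n_I d^2}{\epsilon_0}, \qquad f \text{ measured in units of } \frac{n_I}{(\omega_{pi} d)^3},$$

where $n_I$ is the ion density at the presheath/sheath boundary, $n_I = n(0)$, $d$ is the
sheath thickness, $q$ is the ion charge, $\omega_{pi} = (q^2 n_I/\epsilon_0 m)^{1/2}$ is the ion plasma frequency,
$z$ is distance into the sheath measured from the presheath/sheath boundary, and $\Phi$
is the electric field potential. In Eq. (1), $\sigma$ is the differential cross section for
ion-neutral collisions, $\sigma_{\mathrm{total}}$ is the total cross section, and $\lambda_{\mathrm{mfp}} = (n_g \sigma_{\mathrm{total}})^{-1}$ is the mean
free path. The primed quantity $\mathbf{u}'$ denotes an ion velocity before a collision. Here
the ion-neutral collisions are assumed to be elastic hard-sphere collisions. Then the
differential cross section for scattering of ions into the solid angle element $d^2\Omega$ may
be written [7] as

(4)  $$\sigma\, d^2\Omega = \frac{\sigma_{\mathrm{total}}}{2\pi}\, \sin 2\chi \; d\chi\, d\psi,$$

where $\chi$ is the polar scattering angle measured in the laboratory system, and $\psi$ is the
azimuthal scattering angle about the $\mathbf{u}'$ direction. Note that $u = u' \cos\chi$.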
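
Equation (4) fixes the probability law of the laboratory-frame polar angle: after dividing by the total cross section and integrating over the azimuth, the density of the polar angle on the interval from 0 to π/2 is sin 2χ, whose cumulative distribution is sin²χ. The following minimal sketch shows how a collision could be sampled under that law. It is an illustration only and is not taken from the Monte Carlo code used in this paper; the function names and the inverse-transform sampling choice are assumptions.

```python
import math
import random

def sample_polar_angle(rng):
    """Draw chi on [0, pi/2] with density sin(2*chi) (hard spheres, equal masses)."""
    # The cumulative distribution is sin(chi)**2, so it can be inverted directly.
    return math.asin(math.sqrt(rng.random()))

def scatter_speed(u_before, chi):
    """Ion speed after the collision, using u = u' cos(chi)."""
    return u_before * math.cos(chi)

def mean_energy_fraction(n_samples=200_000, seed=1):
    """Average of (u/u')**2 = cos(chi)**2 over many sampled collisions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        chi = sample_polar_angle(rng)
        total += scatter_speed(1.0, chi) ** 2
    return total / n_samples
```

Under these assumptions the mean of cos²χ is 1/2, so an ion retains on average half of its kinetic energy in a single collision with a cold neutral of equal mass.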
We now solve Eqs. (1) and (2) in the limit of weak collisionality, $\epsilon = d/\lambda_{\mathrm{mfp}} \ll 1$.
For the sake of simplicity, the ions are assumed to enter the sheath with a fixed
velocity $v_B$ as a beam, so the boundary conditions for $f$ are given by

(5)  $$f(\mathbf{u}, \zeta = 0) = \frac{\delta(u_\perp)}{2\pi u_\perp}\,\delta(u_z - u_B) \quad \text{for } u_z \ge 0, \qquad f(\mathbf{u}, \zeta = 1) = 0 \quad \text{for } u_z < 0.$$

Here $u_B = v_B/\omega_{pi} d > 0$, and $u_\perp$ is the magnitude of the component of $\mathbf{u}$ perpendicular
to the z direction. It is known that the initial ion stream velocity $v_B$ is typically given
by the ion sound speed $v_B = (k_B T_e/m)^{1/2}$, where $k_B$ is the Boltzmann constant, $T_e$
is the electron temperature of the bulk plasma and $m$ is the ion mass (this is the
Bohm sheath criterion; see [8]). The boundary values for the potential are given by
$\phi = 0$ and $d\phi/d\zeta = -E_I$ at $\zeta = 0$, where $E_I$ denotes the (normalized) magnitude of
the electric field at the presheath/sheath boundary.
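
As a worked example of the magnitudes involved, the sketch below evaluates the ion sound speed and the normalized entry velocity defined above. The gas (argon), the electron temperature (3 eV), the boundary ion density and the sheath thickness are all hypothetical values chosen for illustration; none of them is taken from the paper.

```python
import math

EV = 1.602176634e-19      # one electron volt in joules (also the elementary charge in C)
AMU = 1.66053906660e-27   # atomic mass unit in kg
EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m

def bohm_speed(te_ev, ion_mass_amu):
    """Ion sound speed v_B = sqrt(k_B T_e / m), with k_B T_e supplied in eV."""
    return math.sqrt(te_ev * EV / (ion_mass_amu * AMU))

def ion_plasma_frequency(n_i, ion_mass_amu, charge=EV):
    """omega_pi = sqrt(q**2 n_I / (eps0 m)) in rad/s, for density n_i in m**-3."""
    return math.sqrt(charge ** 2 * n_i / (EPS0 * ion_mass_amu * AMU))

# Hypothetical operating point: singly charged argon, 3 eV electrons,
# 1e15 m^-3 ions at the sheath edge, 1 cm sheath.
v_b = bohm_speed(3.0, 39.948)
u_b = v_b / (ion_plasma_frequency(1.0e15, 39.948) * 0.01)
```

For these assumed values the Bohm speed is roughly 2.7 km/s and the normalized entry velocity is a few percent, which is consistent with treating the initial ion kinetic energy as small compared with the sheath potential energy.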
Assuming that the dependence of $f$ and $\phi$ on $\epsilon$ is analytic, we expand the ion
distribution function $f$ and the potential $\phi$ in terms of the small parameter $\epsilon$ in the
form $f = f_0 + \epsilon f_1 + \cdots$ and $\phi = \phi_0 + \epsilon\phi_1 + \cdots$. To the lowest order, we obtain from
Eqs. (1) and (2) the following equations for a collisionless sheath:

(6)  $$u_z \frac{\partial f_0}{\partial \zeta} - \frac{d\phi_0}{d\zeta}\frac{\partial f_0}{\partial u_z} = 0,$$

(7)  $$\frac{d^2\phi_0}{d\zeta^2} = -\int f_0\, d^3u.$$

With the use of the new independent variable

(8)  $$\mathcal{E} = \tfrac{1}{2}u_z^2 + \phi_0(\zeta),$$

Eq. (6) becomes $u_z\, \partial f_0(u_\perp, \mathcal{E}, \zeta)/\partial\zeta = 0$ and its solution with the boundary condition
(5) is given by

(9)  $$f_0 = \begin{cases} \dfrac{\delta(u_\perp)}{2\pi u_\perp}\,\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right) & \text{if } u_z > 0 \text{ and } \mathcal{E} \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$

The lowest-order potential $\phi_0$ may be calculated through substitution of Eq. (9)
into Eq. (7), i.e.,

(10)  $$\frac{d^2\phi_0}{d\zeta^2} = \frac{-u_B}{\sqrt{u_B^2 - 2\phi_0}}.$$

The exact, closed-form solution of Eq. (10) is derived in [10], where it is shown that
$\phi_0$ is a non-positive ($\phi_0 \le 0$) monotonically decreasing function for all $\zeta \ge 0$. It is
known that in the limit $u_B^2 \ll -2\phi_0$, the solution of Eq. (10) gives the collisionless
Child-Langmuir law [9] [10]:

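We now proceed to the first-order equations, which are given by

(11)  $$u_z \frac{\partial f_1}{\partial \zeta} - \frac{d\phi_0}{d\zeta}\frac{\partial f_1}{\partial u_z} - \frac{d\phi_1}{d\zeta}\frac{\partial f_0}{\partial u_z} = B^+ - B^-,$$

(12)  $$\frac{d^2\phi_1}{d\zeta^2} = -\int f_1\, d^3u,$$

where

(13)  $$B^+ = u \int \left(\frac{u'}{u}\right)^{4} f_0(\mathbf{u}')\, \frac{\sigma}{\sigma_{\mathrm{total}}}\, d^2\Omega = \begin{cases} \dfrac{1}{\pi u_z}\,\delta\!\left(\sqrt{h} - u_B\right) & \text{if } u_z > 0 \text{ and } h \equiv \dfrac{(u_\perp^2 + u_z^2)^2}{u_z^2} + 2\phi_0 \ge 0, \\ 0 & \text{otherwise,} \end{cases}$$

and

(14)  $$B^- = \begin{cases} u\,\dfrac{\delta(u_\perp)}{2\pi u_\perp}\,\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right) & \text{if } u_z > 0 \text{ and } \mathcal{E} \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$

For the derivation of Eq. (13), see [6].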

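In order to solve the first-order equation (11), we again transform the variable $u_z$ to
$\mathcal{E}$ defined in Eq. (8). It is convenient to split $f_1$ as $f_1 = f_1^+ + f_1^-$ in such a way that Eq.
(11) may be written as $u_z\, \partial f_1^+/\partial\zeta = B^+$ and $u_z\, \partial f_1^-/\partial\zeta - (d\phi_1/d\zeta)\, \partial f_0/\partial u_z = -B^-$,
or

(15)  $$\frac{\partial f_1^+}{\partial \zeta} = \begin{cases} \dfrac{1}{\pi u_z^2}\,\delta\!\left(\sqrt{h} - u_B\right) & \text{if } u_z > 0 \text{ and } h \ge 0, \\ 0 & \text{otherwise,} \end{cases}$$

with $u_z^2(\mathcal{E}, \zeta) = 2(\mathcal{E} - \phi_0(\zeta))$, and

(16)  $$\frac{\partial f_1^-}{\partial \zeta} - \frac{1}{u_z}\frac{d\phi_1}{d\zeta}\frac{\partial f_0}{\partial u_z} = \begin{cases} -\dfrac{\delta(u_\perp)}{2\pi u_\perp}\,\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right) & \text{if } u_z > 0 \text{ and } \mathcal{E} \ge 0, \\ 0 & \text{otherwise,} \end{cases}$$

where $u\,\delta(u_\perp) = u_z\,\delta(u_\perp)$ is used. Since

(17)  $$-\frac{1}{u_z}\frac{d\phi_1}{d\zeta}\frac{\partial f_0}{\partial u_z} = \frac{d\phi_1}{d\zeta}\,\frac{\delta(u_\perp)}{2\pi u_\perp}\,\frac{\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right)}{u_B\left(\sqrt{2\mathcal{E}} - u_B\right)},$$

we obtain from Eq. (16)

(18)  $$f_1^- = -\frac{\delta(u_\perp)}{2\pi u_\perp}\,\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right)\zeta - \frac{\delta(u_\perp)}{2\pi u_\perp}\,\frac{\delta\!\left(\sqrt{2\mathcal{E}} - u_B\right)}{u_B\left(\sqrt{2\mathcal{E}} - u_B\right)}\,\phi_1(\zeta)$$

if $u_z > 0$ and $\mathcal{E} \ge 0$, and $f_1^- = 0$ otherwise.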

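Since $\phi_0(\zeta)$ is a monotonically decreasing function, we transform the variable $\zeta$ to
$y$ by

(19)  $$y = -2\phi_0(\zeta).$$

Evidently $y \ge 0$ and $y \ge -2\mathcal{E}$. Then Eq. (15) becomes

(20)  $$\frac{\partial f_1^+}{\partial y} = -\frac{\delta\!\left(\sqrt{h} - u_B\right)}{2\phi_0'\,\pi\,(2\mathcal{E} + y)}$$

if $u_z > 0$ and $h \ge 0$. Otherwise $\partial f_1^+/\partial y = 0$. Here $\phi_0' = d\phi_0/d\zeta$ is evaluated at
$\zeta = \zeta(y) = \phi_0^{-1}(-y/2)$. The function $h$ may be written in terms of $\mathcal{E}$, $u_\perp$, and $y$ as

$$h = 2(u_\perp^2 + \mathcal{E}) + \frac{u_\perp^4}{2\mathcal{E} + y},$$

which is a monotonically decreasing function of $y$ ($> -2\mathcal{E}$). In integrating Eq. (15),
we find
(21)

(22)

(23) and

on the domain $y \ge 0$, $y \ge -2\mathcal{E}$, and $u_\perp \ge 0$. Otherwise, $f_1^+ = 0$. In Eq. (21),
$\zeta_c = \zeta_c(u_\perp, \mathcal{E})$ denotes the $\zeta$ value that satisfies $h = u_B^2$, or $\zeta_c = \zeta(y_c)$ with

(24)  $$y_c = -2\mathcal{E} + \frac{u_\perp^4}{u_B^2 - 2(u_\perp^2 + \mathcal{E})}.$$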

3. Angular distributions. We now calculate the angular distribution of the
ion flux

$\Gamma_\theta(\theta, z)$

(25)

where

(26)

In this section, we are concerned only with angular distributions for $\theta > 0$ and do not
count ballistic ions ($\theta = 0$) whose distribution function is given by the $\delta$ function.
In order to carry out the integration of Eq. (25), we need to determine the range
of the integration variable $u$ for which $f_1^+$ is given by Eq. (21). From the inequality
(22) and Eq. (26), we obtain the condition

(27)
Substituting Eq. (26) into the inequality (23) yields the condition

(28)
The discriminant of this quadratic equation for $u^2$ is given by

$$D = (2y + u_B^2\cos^2\theta)^2 - 4(y + u_B^2)\,y = u_B^2\left(u_B^2\cos^4\theta - 4y\sin^2\theta\right).$$

Therefore, the range of $u$ is given as follows: if

(29)  $$\frac{u_B^2}{4y} < \frac{\sin^2\theta}{\cos^4\theta} \quad (\Longleftrightarrow D < 0)$$

then we only need Eq. (27), i.e.,

(30)

On the other hand, if

(31)  $$\frac{u_B^2}{4y} \ge \frac{\sin^2\theta}{\cos^4\theta} \quad (\Longleftrightarrow D \ge 0)$$

then inequalities (27) and (28) must hold simultaneously, i.e.,

(32) and

It is easy to show that $(u_B^2 + y)\cos^2\theta \ge y + (u_B^2/2)\cos^2\theta + \sqrt{D}/2$ for all $y \ge 0$ and $\theta$.
It should be noted that the term $u_B^2/y = m v_B^2/2q|\Phi_0|$ denotes the ratio of the initial
ion kinetic energy to the zeroth-order potential energy $\Phi_0$ at $z = \zeta d$ and typically
takes a small value.
In the case $u_B^2/4y = m v_B^2/8q|\Phi_0| < \sin^2\theta/\cos^4\theta$ where the inequality (30) holds,
therefore, the angular distribution of the ion flux is given by

$\Gamma_\theta(\theta, z) =$

(33)

where the relation $u_B^2 - 2(u_\perp^2 + \mathcal{E}) = (u_B^2 + y) - u^2(1 + \sin^2\theta)$ is used. For small
angles $\theta$ satisfying $u_B^2/4y \ge \sin^2\theta/\cos^4\theta$, the range of integration of Eq. (33) needs
to be changed according to the inequality (32).
If the electric field is constant, then the term $-\phi_0' = E_I$ may be taken outside of
the integration and we can in fact carry out the integration:

(34)  $$\Gamma_\theta(\theta, z) = \frac{d}{\lambda_{\mathrm{mfp}}}\,\Gamma_0\,\frac{u_B^2 - 2\phi_0(\zeta)}{2E_I}\; g_\theta(\theta),$$

where $\Gamma_0 = n_I v_B$ denotes the total flux and

(35)  $$g_\theta(\theta) = \frac{2\sin\theta\cos\theta}{(1 + \sin^2\theta)^2}\left(4\log\frac{1}{\sin\theta} + \sin^4\theta - 1\right).$$

Equation (35) gives the profile of the ion flux angular distribution.
If the initial velocity $v_B$ is sufficiently small, so that the condition $u_B^2/4y \ll 1$ is
satisfied, then the inequality $u_B^2/4y = m v_B^2/8q|\Phi_0| < \sin^2\theta/\cos^4\theta$ holds for most of
$\theta > 0$, i.e., $u_B/2\sqrt{y} \lesssim \theta \le \pi/2$. In this case, Eq. (34) may be further simplified with
the use of $u_B^2 \ll -\phi_0 = E_I\zeta$ and given in dimensional form by

(36)  $$\Gamma_\theta(\theta, z) = \Gamma_0\,\frac{z}{\lambda_{\mathrm{mfp}}}\; g_\theta(\theta).$$

Thus, $\Gamma_\theta$ is seen to be independent of the electric field strength $E_I$ in the case of a
constant electric field.
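
The closed form in Eq. (35) can be checked numerically. The sketch below evaluates it and compares it with a direct Simpson quadrature of the integral of ξ³/(1 − (1 + sin²θ)ξ²) over 0 ≤ ξ ≤ cos θ, multiplied by 4 sin θ cos θ. That integral form is our reading of the constant-field limit of the ξ-integration used later for the self-consistent field, so it should be treated as an assumption; the function names are illustrative.

```python
import math

def g_theta(theta):
    """Closed-form constant-field angular profile, Eq. (35)."""
    s, c = math.sin(theta), math.cos(theta)
    return 2.0 * s * c / (1.0 + s * s) ** 2 * (4.0 * math.log(1.0 / s) + s ** 4 - 1.0)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def g_theta_quadrature(theta):
    """Same profile obtained from the xi-integral (assumed constant-field integrand)."""
    s, c = math.sin(theta), math.cos(theta)
    a = 1.0 + s * s
    integral = simpson(lambda xi: xi ** 3 / (1.0 - a * xi * xi), 0.0, c)
    return 4.0 * s * c * integral

# Area under the profile over 0 < theta <= pi/2 (lower limit offset to avoid log(1/0)).
area = simpson(g_theta, 1.0e-9, math.pi / 2.0)
```

Both evaluations should agree at any angle strictly between 0 and π/2, and the area under the profile comes out as 1, so that Eq. (36) assigns a fraction z/λ_mfp of the total flux to scattered ions.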
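In the case of self-consistent electric fields, where the potential $\phi_0$ is obtained
by solving Eq. (10), the electric field $\phi_0'$ is no longer independent of $\zeta$. The angular
distribution $\Gamma_\theta(\theta, z)$ then needs to be calculated directly from Eq. (33) with knowledge
of the dependence of $\phi_0'$ on $u$ and $\theta$.
Multiplying Eq. (10) by $d\phi_0/d\zeta$ and integrating with respect to $\zeta$, we obtain

$$\frac{1}{2}\left(\frac{d\phi_0}{d\zeta}\right)^2 - \frac{1}{2}E_I^2 = -u_B\int_0^{\phi_0}\frac{d\phi}{\sqrt{u_B^2 - 2\phi}},$$

where the boundary condition $d\phi_0/d\zeta = -E_I$ is used. Carrying out this integration
and substituting $\phi_0 = -y_c/2$ yields

(37)  $$-\phi_0'(\zeta_c) = \sqrt{2u_B}\,\left(\sqrt{u_B^2 + y_c} + K u_B\right)^{1/2},$$

where $K = E_I^2/2u_B^2 - 1$ ($> -1$). It is shown in [10] that the potential $\phi_0$ is a weak
function of $K$ for realistic values of $K$ ($-1 < K <$ ~). The function $y_c$ given in
Eq. (24) satisfies

(38)  $$u_B^2 + y_c = \frac{(u_B^2 + y - u^2)^2}{u_B^2 + y - (1 + \sin^2\theta)\,u^2}.$$

Equations (37) and (38) give an expression for the dependence of $\phi_0'(\zeta_c)$ on $u$ and $\theta$.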
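
The first integral obtained by multiplying Eq. (10) by dφ0/dζ can be verified numerically. The sketch below integrates Eq. (10) across the sheath with a classical fourth-order Runge-Kutta scheme and compares the resulting field with the value predicted by the first integral, |φ0'|² = 2u_B(√(u_B² − 2φ0) + K u_B) with K = E_I²/2u_B² − 1. The sign convention φ0'(0) = −E_I, the integrator, and the step count are assumptions made for this illustration; the parameter values are those quoted in the caption of Fig. 1.

```python
import math

def integrate_sheath(u_b, e_i, n_steps=4000):
    """Integrate Eq. (10) from zeta = 0 to zeta = 1 with classical RK4.

    The state is (phi, dphi) with phi(0) = 0 and dphi(0) = -e_i,
    where e_i is the field magnitude at the presheath/sheath boundary.
    """
    def rhs(phi, dphi):
        return dphi, -u_b / math.sqrt(u_b * u_b - 2.0 * phi)

    h = 1.0 / n_steps
    phi, dphi = 0.0, -e_i
    for _ in range(n_steps):
        k1 = rhs(phi, dphi)
        k2 = rhs(phi + 0.5 * h * k1[0], dphi + 0.5 * h * k1[1])
        k3 = rhs(phi + 0.5 * h * k2[0], dphi + 0.5 * h * k2[1])
        k4 = rhs(phi + h * k3[0], dphi + h * k3[1])
        phi += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        dphi += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return phi, dphi

def field_from_first_integral(u_b, e_i, phi):
    """Field magnitude |dphi/dzeta| predicted by the first integral of Eq. (10)."""
    k = e_i * e_i / (2.0 * u_b * u_b) - 1.0
    return math.sqrt(2.0 * u_b * (math.sqrt(u_b * u_b - 2.0 * phi) + k * u_b))

phi_end, dphi_end = integrate_sheath(u_b=1.0e-2, e_i=5.1e-3)
field_end = field_from_first_integral(1.0e-2, 5.1e-3, phi_end)
```

Agreement between the integrated field and the first-integral value at ζ = 1 is a convenient consistency check before the potential is used in the angular and energy integrals.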
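Introducing

(39)  $$\xi = \frac{u}{u_{\max}}, \qquad \alpha = \frac{u_B}{u_{\max}} \qquad \text{with } u_{\max} = \sqrt{u_B^2 + y},$$

(40)  $$g_\alpha(\xi) = \left(\frac{1 - \xi^2}{\sqrt{1 - (1 + \sin^2\theta)\,\xi^2}} + \alpha K\right)^{1/2}.$$

From Eq. (33), the angular distribution of the ion flux is then given by

(41)

where, in the case of a self-consistent field, we define

(42)  $$g_\theta^{SC}(\theta; \alpha, K) = 2\sin\theta\cos\theta \int_{I(\alpha)} \frac{\xi^3\, d\xi}{g_\alpha(\xi)\left(1 - (1 + \sin^2\theta)\,\xi^2\right)}.$$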

Here the range of integration $I(\alpha)$ is given as follows: from the inequality (30),

(43)  if $\alpha < \dfrac{2\sin\theta}{1 + \sin^2\theta}$ $\left(\Longleftrightarrow \dfrac{u_B^2}{4y} < \dfrac{\sin^2\theta}{\cos^4\theta}\right)$, then $I(\alpha) = \{0 \le \xi \le \cos\theta\}$.

Although $u_B^2/4y$ is generally small and the angular distribution for most values of $\theta$
is given by the integration over $I(\alpha)$ above, an accurate account of the small-angle
distribution must be given by a different integration range $I(\alpha)$. From the inequality
(32),

(44)  if $\alpha \ge \dfrac{2\sin\theta}{1 + \sin^2\theta}$, then

Here $\tilde{D} = \alpha^2\left(\alpha^2\cos^4\theta - 4(1 - \alpha^2)\sin^2\theta\right)$. We note that the function $g_\theta^{SC}$ depends
on $K$ through the term $\alpha K$ in the function $g_\alpha(\xi)$ of Eq. (40). Since the value $\alpha$ is
typically small and the dependence of the potential $\phi_0$ on the parameter $K$ is known
to be weak [10], the parameter dependence of $g_\theta^{SC}$ on $K$ is also weak.
Figure 1 shows a comparison of the theoretically-predicted angular distributions
and Monte Carlo simulation results in the case $d/\lambda_{\mathrm{mfp}} = 0.14$. The electric field used
in the Monte Carlo simulation and the self-consistent-field distribution (Eq. (42), the
solid line) is the solution to Eq. (10), subject to the boundary conditions $\phi_0(0) = 0$
and $E_I = -d\phi_0(0)/d\zeta = 2.8 \times 10^{-4}$. The constant-field approximation (Eq. (35)) is
given by the dashed line. The theoretical distribution for the ballistic ion component,
which is a delta function at $\theta = 0$, is not shown here and the Monte Carlo ballistic
ion component, represented by the first bin at $\theta = 0$, is truncated by the frame of
the figure; all curves are normalized so as to enclose unit area. A good agreement
between the analytic distributions and the simulation results is evident in Fig. 1.

4. Energy distributions. The energy distribution of the ion flux $\Gamma_{EN}(\eta, z)$ is
defined as

(45)  $$\Gamma_{EN}(\eta, z)\, d\eta = \int_0^{2\pi} d\varphi \int_0^{\pi/2} \sin\theta\, d\theta\; v_z f\, v^2\, dv,$$

where $\eta$ denotes the ratio of the ion kinetic energy to the kinetic energy of the ballistic
ions at $z$, i.e.,

(46)  $$\eta = \frac{\tfrac{1}{2} m v^2}{\tfrac{1}{2} m v_B^2 - q\Phi_0} = \frac{u^2}{u_B^2 + y}.$$

For scattered ions ($\eta < 1$), we have from Eq. (45)

(47)

FIG. 1. The angular distributions of the ion flux in the case of a self-consistent electric field obtained
from the Monte Carlo simulations (histogram) and Eq. (42) (the solid curve). For comparison, the
formula for the constant-field approximation (Eq. (35)) is also presented as a dashed line. The
dimensionless parameters used here are $d/\lambda_{\mathrm{mfp}} = 0.14$, $u_B = 1.0 \times 10^{-2}$, $E_I = 5.1 \times 10^{-3}$, and
$\zeta = 1$. (Axes: angular distribution versus $\theta/(\pi/2)$.)

Here we have used the relation $u_B^2 - 2(u_\perp^2 + \mathcal{E}) = (u_B^2 + y)\left(1 - (1 + \sin^2\theta)\,\eta\right)$.
The range of integration $J(\alpha)$ for $\theta$ is obtained from the conditions (27) and (28):
If $0 \le \eta \le 1 - \alpha^2$, then

(48)

and if $1 - \alpha^2 \le \eta < 1$, then

(49)

In the case of a constant electric field, we may substitute $-\phi_0'(\zeta_c) = E_I$ into
Eq. (47). We thus obtain

(50)

where

(51)  $$g_{EN}(\eta; \alpha) = \begin{cases} -\log(1 - \eta) & (0 \le \eta \le 1 - \alpha^2), \\ -\log\alpha^2 & (1 - \alpha^2 < \eta < 1), \end{cases}$$

which holds for any $\alpha$ ($0 < \alpha < 1$). We note that the distribution is constant for
$1 - \alpha^2 < \eta < 1$.
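
The piecewise profile of Eq. (51) is simple to evaluate directly. The sketch below implements the two branches, using the ranges of η stated in the text for the two integration domains, and integrates the profile over 0 ≤ η < 1 with a midpoint rule. The function names and the value α = 0.3 used for checking are illustrative assumptions.

```python
import math

def g_en(eta, alpha):
    """Constant-field energy profile of Eq. (51) for scattered ions, 0 <= eta < 1."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    if eta <= 1.0 - alpha * alpha:
        return -math.log(1.0 - eta)
    return -math.log(alpha * alpha)   # plateau for 1 - alpha**2 < eta < 1

def area(alpha, n=100_000):
    """Midpoint-rule integral of g_en over 0 <= eta < 1."""
    h = 1.0 / n
    return sum(g_en((i + 0.5) * h, alpha) for i in range(n)) * h
```

The two branches join continuously at η = 1 − α², and the area under the profile evaluates to 1 − α², which tends to 1 as the entry velocity becomes negligible.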
In the case of a self-consistent electric field, the potential $\phi_0$ is obtained by solving
Eq. (10). In this case, the electric field $\phi_0'$ also becomes a function of $\eta$ and $\theta$. As
shown in Eqs. (37) and (38), we may write $-\phi_0'(\zeta_c) = \sqrt{2u_B u_{\max}}\; g_E(\theta)$, where

(52)  $$g_E(\theta) = \sqrt{\sqrt{(1 - \eta)^2/\left(1 - (1 + \sin^2\theta)\,\eta\right)} + \alpha K}.$$

Here the function $g_E(\theta)$ and the function $g_\alpha(\xi)$ of Eq. (40) are equivalent, with the
relation $\eta = \xi^2 = u^2/(u_B^2 + y)$. From Eq. (47), the energy distribution of the ion flux
for a self-consistent electric field is then given by

(53)

where

(54)  $$g_{EN}^{SC}(\eta; \alpha, K) = \eta \int_{J(\alpha)} \frac{\sin\theta\cos\theta\, d\theta}{g_E(\theta)\left(1 - (1 + \sin^2\theta)\,\eta\right)}.$$

Figure 2 shows a comparison of theoretically-predicted energy distributions (the
constant-field approximation (51) for the dashed line and the self-consistent-field
distribution (54) for the solid line) to Monte Carlo simulation results in the case
$d/\lambda_{\mathrm{mfp}} = 0.14$. The electric field used in the Monte Carlo simulations and for Eq. (54)
is the same as that used in Fig. 1. The analytical distribution for the ballistic ion
component, which is a delta function at $\eta = 1$, is not shown in Fig. 2 and the Monte
Carlo ballistic ion component, represented by the bin at $\eta = 1$, is truncated by the
frame of the figure. A good agreement between the analytical formulae and the Monte
Carlo results is clearly seen in Fig. 2.

FIG. 2. The energy distributions of the ion flux in the case of a self-consistent electric field obtained
from the Monte Carlo simulations (histogram) and Eq. (54) (the solid curve). For comparison, the
formula for the constant-field approximation (Eq. (51)) is also presented as a dashed line. The
dimensionless parameters used here are the same as those for Fig. 1. (Axes: energy distribution
$\Gamma_{EN}$ versus ion energy $\eta$.)

5. RF sheaths. We now briefly discuss ion distributions in RF sheaths. For the
sake of simplicity we consider only a collisionless sheath, to which a time-periodic
cathode voltage is applied. We also assume, as before, the electrons to be sufficiently
depleted in the sheath that the sheath electric field is determined only by the ion
space charge. The validity of these assumptions will be discussed later in this section.
The ion velocity distribution function $f(t, z, v_z)$ at position $z$ is then governed, after
some normalization, by the following Vlasov-Poisson system:

(55)  $$\frac{\partial f}{\partial \tau} + u_z\frac{\partial f}{\partial \zeta} - \epsilon^2\,\frac{\partial \phi}{\partial \zeta}\frac{\partial f}{\partial u_z} = 0,$$

(56)  $$\frac{d^2\phi}{d\zeta^2} = -\int f(\tau, \zeta, u_z)\, du_z,$$

where the potential satisfies a time-periodicity, i.e., $\phi(\tau, \zeta) = \phi(\tau + 2\pi, \zeta)$. The
boundary condition for $f(\tau, \zeta, u_z)$ is given by

(57)  $$f(\tau, \zeta = 0, u_z) = \delta(u_z - u_B),$$

namely, the ions are assumed to be injected into the sheath as a beam in the z direction
with velocity $u_B$. Since the sheath is collisionless, the perpendicular component of the
velocity vanishes, i.e., $u_\perp = 0$, and we may consider that the distribution function
$f$ is already integrated over $u_\perp$. The normalizations used here are thus somewhat

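different from those in Eq. (3) and are given by

(58)  $$\tau = \omega t, \qquad u_z = v_z/\omega d, \qquad f \text{ measured in units of } n_I/(\omega d), \qquad \epsilon = \omega_{pi}/\omega,$$

where $\zeta = z/d$ and $\Phi = \phi\, q n_I d^2/\epsilon_0$ are the same as those in Eq. (3), and $\omega_{pi} = q\sqrt{n_I/m\epsilon_0}$
denotes the ion plasma frequency. The boundary conditions for the potential
$\phi(\tau, \zeta)$ are given by

(59)

where $\tilde{V}(\tau) = \epsilon_0 V(\omega t)/(q n_I d^2)$ denotes the normalized cathode voltage. Here $V(\omega t)$
is given as a time-periodic function in $t$ with period $2\pi/\omega$.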
The time average of a function $h(\tau)$ is defined as

(60)  $$\langle h \rangle = \frac{1}{2\pi}\int_0^{2\pi} h(\tau)\, d\tau.$$

We also introduce the following time-integral operator $\mathcal{L}$ applied to a time-dependent
function $h(\tau)$:

If the function $h(\tau)$ is $2\pi$-periodic in $\tau$ and satisfies $\langle h \rangle = 0$, then the function $\mathcal{L}h$
is also $2\pi$-periodic in $\tau$ and satisfies $\langle \mathcal{L}h \rangle = 0$. Defining $\phi_0(\zeta) = \langle \phi \rangle$ and $\phi_1(\tau, \zeta) =
\phi - \langle \phi \rangle$, we obtain the characteristic equations for Eq. (55):

(61)

(62)

We now seek the time-periodic solution of the system (55) and (56) satisfying

in the regime of high frequency, i.e., $\epsilon \ll 1$. When $\epsilon \ll 1$, the solution to the
characteristic equations at $z$ is given [11], up to order $\epsilon^2$, by

(63)

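where $\phi_1'(\tau, \zeta) = \partial\phi_1/\partial\zeta$. The derivation of Eq. (63) is based on a two-time scale
asymptotic expansion in small $\epsilon$. Since $df/d\tau = 0$ or $f$ = constant along the
characteristics, the standard method of characteristics yields the velocity distribution
function at $\zeta$:

(64)  $$f(\tau, \zeta, u_z) = \delta\!\left(\left\{\left[u_z + \epsilon^2\mathcal{L}\phi_1'\right]^2 + 2\epsilon^2\phi_0(\zeta)\right\}^{1/2} - u_B\right) = \frac{u_B}{u_{0f}(\zeta)}\,\delta\!\left(u_z - \left[u_{0f}(\zeta) - \epsilon^2\mathcal{L}\phi_1'\right]\right),$$

where $u_{0f}(\zeta) = \left[u_B^2 - 2\epsilon^2\phi_0(\zeta)\right]^{1/2}$. Here, terms up to order $\epsilon^2$ only in Eq. (63) are
used to derive Eq. (64).
Writing the kinetic energy as $\mathcal{E} = m v_z^2/2$, the energy distribution $f_{\mathcal{E}}$ of the ion
flux at the cathode is given by $f_{\mathcal{E}}\, d\mathcal{E} = v_z f(t, d, v_z)\, dv_z$, i.e., $f_{\mathcal{E}} = f(t, d, v_z)/m$. The
experimentally-observed ion energy flux distribution is then the time average of $f_{\mathcal{E}}$:

(65)  $$\langle f_{\mathcal{E}} \rangle = \frac{\omega}{2\pi}\int_0^{2\pi/\omega} f_{\mathcal{E}}\, dt = \frac{n_I}{2\pi m \omega d}\int_0^{2\pi} f(\tau, \zeta, u_z)\, d\tau.$$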

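In dimensional form, we obtain from Eq. (64)

(66)  $$\langle f_{\mathcal{E}} \rangle(\mathcal{E}) = \frac{n_I v_B}{\pi v_{0f}}\sqrt{\frac{\mathcal{E}}{2m}}\;\sum_i \left|\frac{d\mathcal{E}_\varphi}{d\varphi}\right|^{-1}_{\varphi_i},$$

where $v_{0f} = u_{0f}\,\omega d$, $\mathcal{E}_\varphi$ denotes the final kinetic energy as a function of the final phase
$\varphi = \omega t$, i.e.,

(67)

and the discrete phases $\varphi_i = \varphi_i(\mathcal{E})$ denote all the distinct solutions $\varphi$ of the equation
$\mathcal{E} = \mathcal{E}_\varphi(\varphi)$. In the case of sinusoidal time dependence, i.e., $E(t, z) = -\partial\Phi/\partial z =
E_0(z) + E_1(z)\cos\omega t$, the general expression (66) reduces to

(68)

where

$$v_\pm = v_{0f} \pm \frac{q E_1(d)}{m\omega}.$$

Numerical calculations of the energy distribution $\langle f_{\mathcal{E}} \rangle$, based on Eq. (68) and Monte
Carlo simulations, are found in [11].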
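
The double-peaked shape follows directly from Eq. (66): phases near the extrema of the final energy, where its derivative with respect to phase vanishes, are weighted most heavily. The sketch below illustrates this by histogramming final energies over one RF period. It assumes that ions arrive at the cathode with velocity v_0f + Δv sin φ, with Δv playing the role of qE_1(d)/(mω) in the definition of v±; this form and all numerical values are illustrative assumptions and are not results quoted from the paper.

```python
import math

def energy_histogram(v0f, dv, n_phase=100_000, n_bins=20, mass=1.0):
    """Histogram of final ion energies for phases sampled uniformly over one RF period."""
    e_min = 0.5 * mass * (v0f - dv) ** 2
    e_max = 0.5 * mass * (v0f + dv) ** 2
    counts = [0] * n_bins
    for k in range(n_phase):
        phase = 2.0 * math.pi * (k + 0.5) / n_phase
        v = v0f + dv * math.sin(phase)      # assumed arrival velocity at this phase
        energy = 0.5 * mass * v * v
        index = int((energy - e_min) / (e_max - e_min) * n_bins)
        counts[min(index, n_bins - 1)] += 1
    return counts

# Hypothetical values: modulation amplitude equal to 20% of the mean arrival velocity.
counts = energy_histogram(v0f=1.0, dv=0.2)
```

With these assumed values the two outermost energy bins collect the most ions, and the low-energy peak is the taller of the two because equal velocity intervals map to narrower energy intervals at lower velocity.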
Equation (64) gives the ion velocity distribution for any given electric field potential
$\phi$. However, in order to obtain a self-consistent electric field profile, one must
solve the Poisson equation (56), using Eq. (64). Carrying out the integration of
Eq. (56), we obtain

(69)

It is easy to see that Eq. (69) may be split into the following two equations:

(70)  $$\frac{d^2\phi_0}{d\zeta^2} = -\frac{u_B}{\sqrt{u_B^2 - 2\epsilon^2\phi_0(\zeta)}},$$

(71)  $$\frac{d^2\phi_1}{d\zeta^2} = 0.$$

Suppose that the (normalized) time-dependent cathode voltage is given by $\tilde{V}(\tau) =
\tilde{V}_0 + \tilde{V}_1\cos\tau$. Then the boundary conditions for Eqs. (70) and (71) become
(72)

and

(73)

Equation (70) subject to the boundary conditions (72) has a form similar to Eq. (10),
giving a collisionless DC-sheath potential. Equation (71) with the boundary conditions
(73) gives a uniform (i.e., z-independent) oscillation field, i.e., $\partial\phi_1/\partial\zeta =
\tilde{V}_1\cos\tau$.

The z-independent oscillating electric field derived above, however, is physically
implausible since in real RF sheaths the high-mobility electron gas responds almost
instantaneously to the RF excitation and gives rise to a rapid oscillating motion
of the relatively sharp presheath/sheath boundary. A realistic self-consistent model
must incorporate such electron effects in Poisson's equation (56). For example, if
the electrons are governed by the Boltzmann distribution (i.e., the electron density
is proportional to $\exp(q\Phi/k_B T_e)$), then Poisson's equation should be given, instead of
Eq. (56), by

(74)

where $\lambda_D = (\epsilon_0 k_B T_e/n_I q^2)^{1/2}$ denotes the Debye length. The self-consistent solution
from Eqs. (64) and (74) is beyond our present scope and will be discussed elsewhere.
Discussion of the effects of a high-frequency motion of the sharp presheath/sheath
boundary on the ion distribution may be found in [12].

6. Discussion. From the steady state Boltzmann-Poisson system, we derive an-

alytic formulae for the angular and energy distributions of the incident ion flux at the
cathode surface in weakly-collisional DC sheaths. The resulting distributions, given
by Eqs. (35), (42), (51) and (54), are compared with Monte Carlo simulations and are
found to be in good agreement. Although the model considered here is an idealiza-
tion of a real DC sheath, it captures the essential physics of the collision mechanism
and the self-consistent electric field. The idealization, moreover, renders the model
amenable to a relatively simple analytical treatment, and clarifies the dependence of
the distributions on various plasma parameters. In fact, it is not difficult to incor-
porate detailed physical features, such as more realistic presheath/sheath boundary
conditions and energy-dependent collision-cross sections, in the Monte Carlo simu-
lations in order to compare the resulting ion distributions with experimental results
under various conditions.

As for RF discharges, we briefly discussed the ion distribution in collisionless

RF sheaths, based on the time-periodic Vlasov-Poisson system. The time-averaged
energy distribution of incident ion flux at the cathode was derived analytically in
Eq. (68) for a given sinusoidally oscillating electric field. Using a more general ex-
pression for the distribution function given in Eq. (64), we obtain a simple configura-
tion of self-consistent RF electric field without taking into account the high-mobility
electron effects. However, more realistic self-consistent RF fields (taking into account
the highly mobile electron gas) could differ significantly from the RF field discussed
in the present work. A detailed discussion of this matter is beyond our present scope
and we defer it to a future publication.

REFERENCES
[1] S. M. Sze, VLSI Technology, McGraw-Hill, New York (1988).
[2] B. Chapman, Glow Discharge Processes, John Wiley & Sons, New York (1980).
[3] J. W. Coburn and E. Kay, J. Appl. Phys. 43, 4965 (1972).
[4] W. M. Holber and J. Forster, J. Vac. Sci. Technol. A 8, 3720 (1990).
[5] R. T. Farouki, S. Hamaguchi, and M. Dalvie, Phys. Rev. A 44, 2664 (1991).
[6] S. Hamaguchi, R. T. Farouki, and M. Dalvie, Phys. Rev. A 44, 3804 (1991).
[7] See, for example, L. D. Landau and E. M. Lifshitz, Mechanics, Pergamon Press, Oxford, 1960.
[8] W. P. Allis, "Motion of Ions and Electrons," in Handbuch der Physik, Vol. 21, Springer-Verlag,
Berlin, 1956, p. 383.
[9] C. D. Child, Phys. Rev. 32, 492 (1911); I. Langmuir, Phys. Rev. (Ser. II) 2, 450 (1913).
[10] R. T. Farouki, M. Dalvie, and L. F. Pavarino, J. Appl. Phys. 68, 6106 (1990).
[11] S. Hamaguchi, R. T. Farouki, and M. Dalvie, Phys. Rev. Lett. 68, 44 (1992).
[12] R. T. Farouki, S. Hamaguchi, and M. Dalvie, Phys. Rev. A 45, 5913 (1992).
AN INTERFACE METHOD FOR SEMICONDUCTOR
PROCESS SIMULATION
MICHAEL J. JOHNSON* AND CARL L. GARDNER**
Abstract. The diffusion of dopants in silicon at high temperatures is modeled by a nonlin-
ear parabolic system of partial differential equations on a two-dimensional region with a moving
boundary. A numerical solution using the L-stable TRBDF2 time integration method and a "box
method" spatial discretization is described.
Details are given of the methods used to specify and manipulate curves, and to define arbitrary
simply connected regions by their boundary curves. Numerical experiments are presented com-
paring the divided difference and TR/TR methods for dynamically adjusting the timestep, and
comparing Newton and Newton-Richardson iteration.

1. Introduction.
Semiconductor process simulation¹ models the nonlinear diffusion of dopant
atoms during the thermal annealing of silicon or other semiconductor wafers which
have been doped by ion implantation. When the temperature is raised, the dopant
atoms diffuse in the sample. The diffusivities of the dopant atoms depend on local
dopant concentration and change with time due to a number of transient effects.
During the anneal, portions of the surface of the wafer are allowed to oxidize.
The rate of oxide growth depends in part on local dopant concentration along the
oxide/silicon boundary, so that the mathematical model of this process becomes a
free boundary problem.
The nonlinear diffusion process is described by a set of conservation laws for
impurities

(1)  $$\frac{\partial C_\alpha}{\partial t} = \nabla\cdot\left(D_{\alpha\beta}\nabla C_\beta\right) \equiv F_\alpha$$

in a simply connected region $\Omega(t)$, where $C_\alpha(\mathbf{x}, t)$ is the concentration of the $\alpha$th
species of dopant, $D$ is a matrix of phenomenological diffusion coefficients, and $\alpha$,
$\beta = 1, \ldots, N$ label the types of impurities (the repeated index $\beta$ is summed). Note that $D = D(C_1, \ldots, C_N)$
includes the effects of the coupling of the impurity ions to the electric field.
The boundary $\partial\Omega(t)$ of $\Omega(t)$ represents the union of a silicon/mask interface
and a silicon/oxide interface. Boundary conditions of homogeneous Neumann type
are imposed by the physical constraints that no dopant ions may leave the silicon
region $\Omega(t)$ unless consumed by the growing oxide regions and that no migration
occurs across the oxide/silicon interface:
(2)  $$\left(\hat{n}\cdot\nabla C_\alpha\right)_{\partial\Omega(t)} = 0.$$
The code which positions the moving boundary is distinct from the code that calculates
dopant diffusion. This paper addresses the efficient numerical solution of
the diffusion problem only, using a prescribed boundary $\partial\Omega(t)$.
*IBM Corporation, Endicott, NY, 13760.
**Department of Computer Science, Duke University, Durham, NC 27706. Research supported
in part by the National Science Foundation under grant DMS-8905872.
1 See Ref. [1] for a review.

2. Numerical methods.
We use the composite TRBDF2 method [2] to integrate the solution in time_
To integrate Eq. (1) from t = tn to tn+l = tn + At n , we first apply the trapezoidal
rule (TR) to advance the solution from tn to tn+'r = tn + 'YAtn:

(3)

and then use the second-order backward differentiation formula (BDF2) to advance
the solution from tn+"Y to t n+l:

(4)

This composite one-step method is second-order accurate and L-stable [2]. The
importance of L-stability for diffusion is illustrated for a 1D computation in Figure
1. After a single timestep with $\Delta t = 50\,\Delta t_{\rm Euler} = 50\,\Delta y^2 / 2D_{\max}$, the TR method,
which is A-stable but not L-stable, exhibits severe unphysical oscillations near the
maximum of C.

Figure 1: L-stable vs. A-stable (1D example). [Plot of C versus y after one time step, with curves for TR and TRBDF2.]
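The contrast between the two methods can be seen on the scalar test problem $y' = \lambda y$. The following is a minimal sketch, assuming the TR and BDF2 partial steps take the form implied by the Newton equations (8) and (9) with $\gamma = 2 - \sqrt{2}$; the function names are illustrative and are not taken from the authors' code. For a very stiff mode ($z = \lambda \Delta t \to -\infty$) the TR amplification factor tends to $-1$, so the mode changes sign at every step without decaying, whereas the TRBDF2 factor tends to 0.

```python
import math

GAMMA = 2.0 - math.sqrt(2.0)


def tr_amplification(z):
    """Growth factor of one trapezoidal-rule step for y' = lam*y, z = lam*dt."""
    return (1.0 + 0.5 * z) / (1.0 - 0.5 * z)


def trbdf2_amplification(z, gamma=GAMMA):
    """Growth factor of one composite TR + BDF2 step for y' = lam*y, z = lam*dt."""
    # TR partial step from t_n to t_n + gamma*dt (starting from y_n = 1).
    y_gamma = (1.0 + 0.5 * gamma * z) / (1.0 - 0.5 * gamma * z)
    # BDF2 partial step from t_n + gamma*dt to t_n + dt.
    rhs = (y_gamma - (1.0 - gamma) ** 2) / (gamma * (2.0 - gamma))
    return rhs / (1.0 - (1.0 - gamma) / (2.0 - gamma) * z)


if __name__ == "__main__":
    for z in (-1.0, -50.0, -1.0e4):
        print(z, tr_amplification(z), trbdf2_amplification(z))
```

A timestep of 50 times the explicit limit, as in Figure 1, puts the highest spatial modes well into the regime where the TR factor is close to $-1$.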

We linearize $F^{n+1}$ in Eq. (4) (and similarly $F^{n+\gamma}$ in Eq. (3)) by approximating

(5)

where k = 0, 1, ... labels the Newton iterations, and the Frechet derivative

(6)

The new solution is obtained by setting

(7) $C^{n+1}_{(k+1)} = C^{n+1}_{(k)} + \lambda\, \delta C^{n+1}_{(k)}, \qquad C^{n+1}_{(0)} = C^{n+\gamma},$

where $\lambda$ is a damping factor [3] between 0 and 1, chosen to ensure that the norm of
the residual for Eq. (3) or (4) decreases monotonically. At each TR or BDF2 partial
step, we iterate until the Newton method converges.
The Newton equation for the TR partial step is
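(8) $\left[ 1 - \dfrac{\gamma \Delta t_n}{2} \left( \dfrac{\delta F}{\delta C} \right)^{n+\gamma}_{(k)} \right] \delta C^{n+\gamma}_{(k)} = -\left( C^{n+\gamma}_{(k)} - C^n \right) + \dfrac{\gamma \Delta t_n}{2} \left( F^{n+\gamma}_{(k)} + F^n \right) \equiv -G_{TR},$

where $G_{TR}$ is the residual for Eq. (3).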

The Newton equation for the BDF2 partial step is

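(9) $\left[ 1 - \dfrac{1-\gamma}{2-\gamma} \Delta t_n \left( \dfrac{\delta F}{\delta C} \right)^{n+1}_{(k)} \right] \delta C^{n+1}_{(k)} = -\left( C^{n+1}_{(k)} - \dfrac{1}{\gamma(2-\gamma)} C^{n+\gamma} + \dfrac{(1-\gamma)^2}{\gamma(2-\gamma)} C^n \right) + \dfrac{1-\gamma}{2-\gamma} \Delta t_n F^{n+1}_{(k)} \equiv -G_{BDF2},$

where $G_{BDF2}$ is the residual for Eq. (4).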

Note that the Jacobians for the TR and BDF2 partial steps have the same form
if $\gamma = 2 - \sqrt{2}$. If the solution is varying slowly, then Jacobian factorizations may
be reused (Newton-Richardson method) while retaining quadratic convergence of
the Newton method [2]. The effectiveness of reusing Jacobians in semiconductor
process simulations is discussed in Section 4.
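The special value of $\gamma$ can be recovered by equating the coefficients that multiply $\Delta t_n\,(\delta F/\delta C)$ in the two Newton equations, namely $\gamma/2$ for the TR partial step and $(1-\gamma)/(2-\gamma)$ for the BDF2 partial step. The short derivation below is a worked check under that reading of Eqs. (8) and (9):

```latex
\begin{align*}
\frac{\gamma}{2} = \frac{1-\gamma}{2-\gamma}
  &\;\Longrightarrow\; \gamma(2-\gamma) = 2(1-\gamma)
  \;\Longrightarrow\; \gamma^2 - 4\gamma + 2 = 0, \\
\gamma = 2 \pm \sqrt{2}, \quad 0 < \gamma < 1
  &\;\Longrightarrow\; \gamma = 2 - \sqrt{2} \approx 0.586 .
\end{align*}
```

With this choice both partial steps share the matrix $1 - (1 - 1/\sqrt{2})\,\Delta t_n\,(\delta F/\delta C)$, so a single factorization can serve both.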
Equations (8) and (9) are discretized in space by using the box method [4,5].
The box method, based on Gauss's theorem, evaluates the average divergence of a
vector quantity f over a box as

(10) $(\nabla \cdot f)_{\text{average, interior of box}} = \dfrac{\oint_{\text{boundary of box}} f \cdot n \, ds}{\text{area of box}}.$

The box method may be most easily implemented in computations on a rectangular

grid by taking the sides of the boxes to be halfway between grid points. For example,
along side 1 in Figure 2,

(11) $(\nabla u) \cdot n \approx \dfrac{u(b) - u(a)}{x(b) - x(a)},$

and

(12) $(v \nabla u) \cdot n \approx \left( \dfrac{v(a) + v(b)}{2} \right) \left( \dfrac{u(b) - u(a)}{x(b) - x(a)} \right).$

All spatial operators employed in the Newton method described above are of this
type.
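To make Eqs. (10) to (12) concrete, here is a minimal one-dimensional sketch of the box method on a nonuniform grid, with box faces halfway between grid points and zero flux at both ends (homogeneous Neumann conditions). The function name and the reduction to one dimension are assumptions made for illustration; the paper's implementation is two-dimensional.

```python
def box_divergence_1d(x, u, v):
    """Box-averaged d/dx (v du/dx) at each grid point, with zero-flux ends.

    Returns (div, width): the average divergence over each box and the box widths.
    """
    n = len(x)
    # Flux (v grad u).n across the face halfway between points i and i+1, Eq. (12).
    flux = [0.5 * (v[i] + v[i + 1]) * (u[i + 1] - u[i]) / (x[i + 1] - x[i])
            for i in range(n - 1)]
    div, width = [], []
    for i in range(n):
        left = x[i] if i == 0 else 0.5 * (x[i - 1] + x[i])
        right = x[i] if i == n - 1 else 0.5 * (x[i] + x[i + 1])
        f_left = 0.0 if i == 0 else flux[i - 1]        # no flux through the ends
        f_right = 0.0 if i == n - 1 else flux[i]
        width.append(right - left)
        # Eq. (10): net outward flux divided by the box size.
        div.append((f_right - f_left) / (right - left))
    return div, width


if __name__ == "__main__":
    x = [0.0, 0.5, 1.5, 2.0, 4.0]
    u = [xi * xi for xi in x]
    v = [1.0] * len(x)
    print(box_divergence_1d(x, u, v))
```

Because each interior face flux enters two neighboring boxes with opposite signs, the sum of `div[i] * width[i]` telescopes to the boundary fluxes, which is the discrete counterpart of the conservation of total dopant.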
To define the box at the boundary $\partial\Omega(t)$, consider, for a moment, point a in
Figure 2 as the origin of a coordinate system, and the dashed box as a unit square.
In our implementation of the box method, the smallest incremental area is one
octant of this unit square. To enable correct identification of interior octants, the
list of boundary points adheres to an orientation convention such that the previous
boundary point defines a starting octant, and the next boundary point defines a
stopping octant, with the interior octants identified by counterclockwise rotation
from previous to next, as shown in Figure 3.

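Figure 2: Box method in the interior of a rectangular grid. [Diagram of grid points with a dashed box around point a and its neighbor b.]

Figure 3: Box method on the boundary of a region. [Diagram of boundary point a between the previous and next boundary points, with oxide on one side of the boundary and silicon on the other.]
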
The box method with central differences couples the solution at one point to
the solution at nearby neighbors; there is no coupling between distant points. As a
result, the matrix representing the spatially discretized operator (and consequently,
the matrix to be solved at each timestep) is sparse. The discretized linear systems
are solved using the sparse matrix package of Bank [2].

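The timestep size $\Delta t$ is adjusted dynamically within a window $[\Delta t_{\min}, \Delta t_{\max}]$
by monitoring a divided-difference estimate of the local truncation error $\tau$ [2]:

(13) $\tau^{n+1} = k \, \Delta t_n^3 \, C^{(3)}$

(14) $\phantom{\tau^{n+1}} \approx 2 k \, \Delta t_n \left( \dfrac{1}{\gamma} F^n - \dfrac{1}{\gamma(1-\gamma)} F^{n+\gamma} + \dfrac{1}{1-\gamma} F^{n+1} \right),$

where

(15) $k = \dfrac{-3\gamma^2 + 4\gamma - 2}{12(2 - \gamma)}.$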
The three values of F employed in Eq. (14) have already been calculated in the
most recent TRBDF2 timestep.
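As an illustration of Eqs. (13) to (15), the following sketch evaluates the divided-difference estimate from the three stored values of $F$. It treats $C$ as a scalar for simplicity, and the function names are hypothetical. The bracket in Eq. (14) is $\Delta t_n^2$ times a second divided difference of $F$, so it approximates $\tfrac{1}{2}\Delta t_n^2\,C^{(3)}$ and reproduces Eq. (13) exactly when $F$ is quadratic in time.

```python
import math


def trbdf2_error_constant(gamma):
    """Error constant k of Eq. (15)."""
    return (-3.0 * gamma ** 2 + 4.0 * gamma - 2.0) / (12.0 * (2.0 - gamma))


def divided_difference_lte(f_n, f_gamma, f_np1, dt, gamma):
    """Local truncation error estimate of Eq. (14) from F at t_n, t_n+gamma*dt, t_n+dt."""
    k = trbdf2_error_constant(gamma)
    bracket = (f_n / gamma
               - f_gamma / (gamma * (1.0 - gamma))
               + f_np1 / (1.0 - gamma))
    return 2.0 * k * dt * bracket


if __name__ == "__main__":
    gamma = 2.0 - math.sqrt(2.0)
    # Test function C(t) = t**3, so F = dC/dt = 3 t**2 and C''' = 6.
    t, dt = 0.3, 0.2
    f = lambda s: 3.0 * s * s
    estimate = divided_difference_lte(f(t), f(t + gamma * dt), f(t + dt), dt, gamma)
    print(estimate, trbdf2_error_constant(gamma) * dt ** 3 * 6.0)
```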
An alternative approximation for $\tau^{n+1}$ involves re-taking the most recent partial
timestep (from $t_{n+\gamma}$ to $t_{n+1}$) using TR instead of BDF2 [6]. (We will refer to the
resulting value of $C$ as $C^{n+1}_{TR/TR}$.) The TR/TR step yields the approximation

(16) $\tau^{n+1} \approx 2.28 \left( C^{n+1}_{TRBDF2} - C^{n+1}_{TR/TR} \right)$

for $\gamma = 2 - \sqrt{2}$. Very few Newton iterations are necessary in taking the second TR
step, since $C^{n+1}$ is, in fact, already known. The performances of the TR/TR and
divided difference error estimators are compared in Section 4.
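The numerical factor in the TR/TR estimate can be checked by comparing leading error constants. The sketch below assumes the standard result that one TR step of size $h$ has leading local error $-h^3 C^{(3)}/12$, so that two TR steps of sizes $\gamma \Delta t$ and $(1-\gamma)\Delta t$ have constant $-(\gamma^3 + (1-\gamma)^3)/12$; the TRBDF2 constant is the $k$ of Eq. (15). The function names are illustrative, and only the magnitude of the factor is computed, since the sign of the difference depends on the sign convention adopted for $\tau$.

```python
import math


def trbdf2_error_constant(gamma):
    """Leading error constant of the composite TRBDF2 step, Eq. (15)."""
    return (-3.0 * gamma ** 2 + 4.0 * gamma - 2.0) / (12.0 * (2.0 - gamma))


def trtr_error_constant(gamma):
    """Leading error constant of two TR steps of sizes gamma*dt and (1-gamma)*dt."""
    return -(gamma ** 3 + (1.0 - gamma) ** 3) / 12.0


def trtr_estimator_factor(gamma):
    """Factor c such that |tau| ~ c * |C_TRBDF2 - C_TR/TR| to leading order."""
    k = trbdf2_error_constant(gamma)
    k_tt = trtr_error_constant(gamma)
    # The two solutions differ by (k - k_tt) * dt**3 * C''' at leading order.
    return abs(k / (k - k_tt))


if __name__ == "__main__":
    print(trtr_estimator_factor(2.0 - math.sqrt(2.0)))
```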

3. Interface method for moving boundaries.

The interface software which handles the geometry of regions with moving
boundaries consists of a library of subroutines which define and manipulate certain
data structures. The choice of structures employed here is motivated by the front-
tracking method of Glimm and McBryan and coworkers (see, e.g., [7]). The method
we describe is a simplified front-tracking code appropriate for stable interfaces. The
major structures employed here are the grid, curve, point, curve-point, and
region structures.
The grid structure defines the resolution in space. Included in the grid structure
are lists of x and y coordinates, allowing a tensor product gridding; that is, variable
spacing in either the x or y direction, or both. The goal of tensor product gridding
is to increase overall accuracy by refining the mesh in certain areas [8].
The curve structure is a linked list of curve-point structures; each curve-point
structure contains, in addition to its own x and y coordinates, a link to the pre-
vious and to the next curve-point structures. Curve-points need not lie on grid
points. Figure 4 shows graphically the information contained in the curve and grid
structures. The filled circles represent curve-point structures. It should be noted
that the sinusoidal curve, used here as an example, is not indicative of a realistic
oxide/silicon interface.
Each region structure defines one simply connected region by its boundary.
Boundary curves are represented by curve structures which close on themselves but
which are otherwise not self-intersecting. Within the region structure are tables
assigning to each point in the region an index into the list of unknowns to be
determined by the Newton method.

Figure 4: Graphical representation of curve and grid.

3.1. Processing of boundary curves. Figure 5 shows the curve of Figure 4

after additional points have been appended to construct a boundary curve.
The algorithm to define the interior of a region given its boundary is as follows.
For each value of y on the grid, we note each x coordinate at which the boundary
crosses this (y = constant) line. Starting from the first such crossing, we mark every
grid point as "in" until the second crossing, then every grid point is "out" until the
third crossing, etc. By retaining (within the region structure) the minimum and
maximum values of x and y as the region's boundary curve is formed, we avoid
having to loop over every grid point when defining the interior of the transition
region. Obviously, in order to follow this algorithm, the points of intersection of the
boundary curve with the grid lines must be known.
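The in/out marking along one grid row can be sketched as follows. This is a minimal illustration, assuming the crossings for the row are already known and that grid points lying exactly on the boundary count as "in"; the function name and the list-of-booleans representation are hypothetical and do not reflect the authors' region structure.

```python
from bisect import bisect_left, bisect_right


def mark_interior_row(grid_x, crossings):
    """Flag the grid points of one y = constant row that lie inside the region.

    grid_x    : sorted x coordinates of the grid points in this row
    crossings : x coordinates where the boundary crosses this row
    """
    inside = [False] * len(grid_x)
    xs = sorted(crossings)
    # Crossings pair up as (enter, leave); points between a pair are "in".
    for enter, leave in zip(xs[0::2], xs[1::2]):
        lo = bisect_left(grid_x, enter)
        hi = bisect_right(grid_x, leave)
        for i in range(lo, hi):
            inside[i] = True
    return inside


if __name__ == "__main__":
    print(mark_interior_row([0, 1, 2, 3, 4, 5, 6], [0.5, 2.5, 5, 6]))
```

Restricting the rows and columns visited to the stored minimum and maximum x and y of the boundary curve, as described above, avoids a loop over the whole grid.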
The process of finding these points of intersection is called meshing the curve.
The algorithm for meshing a curve is complicated by the consideration of finite
precision arithmetic; for example, a calculation which in exact arithmetic would
produce, say, b slightly larger than a, might with truncation produce just the op-
posite. To avoid problems of this kind, at every step we adjust values which are
very close to grid lines, moving them onto the line. More precisely, special variables
called smallx and smally are established. Points within 2 × smallx of a vertical
grid line are initially moved to the line (and similarly for horizontal grid lines using
smally). Thereafter, any two points within smallx of each other are considered to
have the same x coordinate (likewise for y coordinates using smally).
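The adjustment step can be sketched for a single coordinate as below. The tolerance name `small` stands for smallx or smally; the function names and the use of strict inequalities for "within" are assumptions for illustration only.

```python
def snap_to_grid_lines(value, grid_lines, small):
    """Move value onto the nearest grid line if it lies within 2*small of it."""
    nearest = min(grid_lines, key=lambda g: abs(g - value))
    return nearest if abs(nearest - value) < 2.0 * small else value


def same_coordinate(a, b, small):
    """After snapping, coordinates closer than small are treated as equal."""
    return abs(a - b) < small


if __name__ == "__main__":
    lines = [0.0, 1.0, 2.0]
    print(snap_to_grid_lines(1.0000001, lines, 1.0e-6))  # moved onto the line at 1.0
    print(snap_to_grid_lines(1.3, lines, 1.0e-6))        # left where it is
```

Snapping first and comparing with a tolerance afterwards means that later decisions such as "has a grid line been crossed?" do not hinge on the last bits of a floating-point result.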
After the adjustment, points are investigated in pairs. For each pair of points
(point1, point2), the mesh subroutine must determine whether a horizontal or vertical
grid line has been crossed. If more than one grid line has been crossed, the
crossing which is closest to point1 is entered into the linked list of points in the
meshed curve. This point of intersection then becomes the new point1. After all
such grid crossings have been treated, point2 is copied into the meshed curve if it
is on a grid line (after adjustment). Then, point2 becomes the new point1, and a
new point2 is extracted from the original curve. Figure 6 shows the curve of Figure
5 after meshing.

Figure 5: Boundary curve.

Figure 6: Meshed curve.

Computational information (such as the solution) is kept only at grid points, the
points of intersection of grid lines. The division of grid blocks (the area bounded by
neighboring grid lines) into many smaller sub blocks is not computationally useful
unless information can be associated with the subblocks, which would effectively
yield a refined grid. In other words, if resolution finer than a grid block is required,
then the grid should be refined. (At present, this must be done manually.) For
this reason, multiple crossings of any grid block by a curve are eliminated in an
operation called pruning, which we now describe.
In the following discussion, it will be useful to label as block (i, j) that grid block
which encompasses the area between $x_i$ and $x_{i+1}$ and between $y_j$ and $y_{j+1}$. To prune
a meshed curve, we move along the meshed curve one chord (two consecutive curve
points) at a time. If the chord crosses grid block (i, j), then we increment a counter
box(i,j). If box(i,j) exceeds one, we move backwards along the curve, removing a

point at a time from the curve structure, until box(i,j) = 1. Figure 7 shows the
curve of Figure 6 after pruning.

Figure 7: Pruned curve.

In keeping with the philosophy that information is to be kept only at grid

points, the pruned curve is finally replaced with the closest possible match given
by a sequence of grid points. To accomplish this, the "adjust" algorithm described
above is applied with 2 x smallx set to half the grid spacing (and similarly for
2 x smally). The resulting adjusted curve may contain redundant sequences of
curve points, since several points may be adjusted to the same destination. Worse
yet, the adjustment process can yield "kinks" if two non-consecutive curve points
are adjusted to the same grid point while intervening curve points are not. The
curve must therefore be "cleaned" by identifying and removing kinks and redundant
points.
The cleaned curve may contain "gaps," e.g., consecutive points separated by
more than one grid line. The cleaned curve is therefore re-meshed to yield the final
result, the projected curve. The projected curve is continuous in the sense that
consecutive curve points are separated by at most dXi and dYj. Figure 8 shows the
curve of Figure 7 after projecting.
At first glance, the projected curve of Figure 8 may seem a poor representation
of the boundary curve in Figure 5. However, this is merely an indication that the
grid spacing may be inappropriate given the characteristics of the curve. Figure 9
shows the projection of the same curve onto a finer grid.
Projecting of boundary curves simplifies the algorithm described above, by
which regions are defined by boundary curves. In addition, after projecting bound-
ary curves, the smallest division of area is one half grid block. This information is
used when applying the box method to points on the boundary of a region.
The experience of Glimm and McBryan was that projecting curves in this man-
ner numerically stabilizes an interface, making the method inappropriate for sim-
ulations of interface instabilities, such as in gas dynamics. The slow oxide growth

modeled here does not exhibit instabilities experimentally, however, and the pro-
jection method is quite appropriate.
To avoid accumulation of the positional error associated with projection, the
projected curve is never propagated. Instead, at each timestep, the exact boundary
curve is calculated and projected. The process of moving a boundary curve itself is
thus straightforward, but after the movement, algorithms of greater complexity are
required to deal with the consequences of the move, as we now describe.

Figure 8: Projected curve.

Figure 9: Projected curve on finer grid.

3.2. Transition regions. Physically, the moving boundary represents a growing
oxide. Calculations are not performed on the oxide region. To avoid re-tallying
the entire grid at each timestep, only the areas between the old and new boundary

curves are reassigned after moving the boundary. These areas are called the tran-
sition regions. Figure 10 shows the transition regions (labeled I-IV) formed when
the boundary curve of Figure 8 is moved to the position marked "new."
Boundary curves are always set up to encircle the regions they define in a
clockwise manner. This convention is used to ensure a consistent definition of the
concepts of moving "forward" or "backward" along a curve, and it is also employed
when discretizing via the box method, to decide locally which octants are in the
region.
The algorithm used to define each transition region is as follows:

Algorithm to define a single transition region.

• Move forward along the old boundary until a point is reached which is not
in the new boundary.
• Move backward along the old boundary one point. Call this point A. Copy
point A into the transition boundary as its first point.
• Move forward along the old boundary, copying each point into the transition
boundary, until another point is reached which is in the new boundary. Call
this point B. Copy point B into the transition boundary.
• Move backward along the new boundary, copying each point into the tran-
sition boundary, until point A is reached. Copy point A into the transition
boundary as its last point.
As can be seen in Figure 10, a single boundary movement can result in many
transition regions. Therefore, the transition region algorithm must be applied again
and again, starting from point B, etc., until the last point in the old boundary is
reached. Surprisingly, this still does not yield all the transition regions. The remain-
der must be found by reversing the roles of new and old above; that is, searching for
points in the new boundary which are not in the old boundary. Transition region
II in Figure 10 is of this type.
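The four steps for a single transition region can be sketched with boundaries stored as plain lists of grid points. This is an illustrative simplification: the function name is hypothetical, the search is assumed to start at an old-boundary point that also lies on the new boundary, and the linked-list structures and orientation bookkeeping of the actual code are not reproduced.

```python
def first_transition_region(old, new, start=0):
    """Return (region, index_of_B) for the first transition region found when
    walking the old boundary forward from old[start], or None if there is none."""
    new_index = {p: i for i, p in enumerate(new)}
    if old[start] not in new_index:
        raise ValueError("search must start on a point shared with the new boundary")
    # Move forward along the old boundary to a point that is not in the new one.
    i = start
    while i < len(old) and old[i] in new_index:
        i += 1
    if i == len(old):
        return None
    a = old[i - 1]                      # step back one point: this is point A
    region = [a]
    # Copy old-boundary points until one lies on the new boundary again (point B).
    while i < len(old) and old[i] not in new_index:
        region.append(old[i])
        i += 1
    if i == len(old):
        return None                     # the old boundary never rejoined the new one
    b = old[i]
    region.append(b)
    # Move backward along the new boundary from B until A is reached.
    j = new_index[b] - 1
    while new[j] != a:
        region.append(new[j])
        j -= 1
    region.append(a)                    # close the transition boundary at A
    return region, i


if __name__ == "__main__":
    old = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    new = [(0, 0), (1, 0), (1, 1), (2, 1), (3, 1), (3, 0), (4, 0)]
    print(first_transition_region(old, new))
```

Calling the function again with `start` set to the returned index of B continues the search along the old boundary; swapping the roles of `old` and `new` finds the remaining regions of the second type.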

Figure 10: Transition regions between old and new boundary curves.

Points in the transition region can be identified by knowledge of the boundary.

All these points must now be flagged as lying in the oxide region. Since points in
the transition region were formerly in silicon, they are already flagged as lying in
the silicon region as well. Points in the interior of the transition region must have
their flags changed to indicate removal from the silicon region. Points on the new
boundary are flagged as lying in both silicon and oxide regions.
Finally, the total amount of dopant in each transition region is calculated and
added to the running total of dopant in the oxide. In this way, checks for conser-
vation of total dopant can be implemented without recounting the oxide total; in
fact, information about points in the oxide may in principle be discarded. Given
homogeneous Neumann boundary conditions, the numerical methods employed here
for the discretized problem (i.e., TRBDF2 timestepping with box method spatial
discretization) conserve total dopant exactly in exact arithmetic. (The same boxes
must be used as are employed in the box method; dopant per box is the concentra-
tion at the one internal grid point times box area.) Thus, conservation checks are
useful rapid indicators of both software implementation error and rounding error
due to finite machine precision.

3.3. Dynamic nature of curve structures. As curves are meshed, pruned,

moved, etc., the number of points in the curve, and consequently the amount of
storage required to represent the curve, may increase or diminish. In order to
handle curves efficiently, dynamic allocation subroutines are used. These increase
or decrease the storage allocated to a given curve structure. On the other hand,
most curves contain many points, so that inefficiencies would result if the allocation
routines were called each time a point is added or deleted. Therefore, storage for
curves is increased and decreased in blocks which can hold many points.

4. Numerical experiments.
The numerical experiments described here have been summarized in the form
of tables and graphs below. In the tables, in the column labeled "Case," M is a
medium (40 × 20) grid and F is a fine (80 × 40) grid; SB refers to a stationary
boundary and MB to a moving boundary; the final digit is the number of dopant
species. The column labeled "MN" is the total number of unknowns. (M is the
number of spatial points and N is the number of dopant species.)
The implant (initial data) is the following Gaussian, which yields a total dose
of approximately $10.5 \times 10^{20}$ atoms/cm per species:

$C = e^{-30 (X^2 + \eta^2)} \times 10^{20}$ atoms/cm$^3$,

$X = \dfrac{x - \frac{1}{2}(x_{\max} + x_{\min})}{x_{\max} - x_{\min}}, \qquad \eta = \dfrac{y - y_{\min}}{y_{\max} - y_{\min}}.$

In cases of more than one species, the initial profiles are the same except that the
concentration of the second species is 0.9 times that of the first. In all cases, the
simulated annealing time is $T = 30$ min.

For the single species cases, the diffusivity was modeled as $D = aC + b$, with
$a = 5 \times 10^{-33}$ cm$^5$/sec and $b = 0.1 \times 10^{-13}$ cm$^2$/sec. For the dual species cases, the
diffusivity matrix was

(17) $D = \begin{pmatrix} aC_1 + b & -dC_1 \\ -dC_2 & aC_2 + b \end{pmatrix},$

with $d = 0.05 \times 10^{-33}$ cm$^5$/sec. The diagonal blocks in $D$ crudely approximate
the actual diffusivity matrix blocks for boron ($C_1$) and phosphorus ($C_2$) at 1000 °C
when the two concentrations are roughly equal. The off-diagonal blocks are greatly
simplified from the actual cross terms, which decrease rapidly as the concentrations
of the two species begin to differ significantly [9,10].

4.1. Timestep selection. As discussed in Section 2, the timestep size is
adjusted according to the most recent estimate of the local truncation error $\tau^{n+1}$,
which can be obtained by either a divided difference formula or the TR/TR method.
A comparison of the relative performance of these two methods is shown in Table
1, and condensed into scattergram form in Figure 11.

Table 1: Comparison of TR/TR and divided difference (DD) estimators. CPU times were measured on a Sun 4/280.

                      TR/TR                   DD             $\Delta t_{TR/TR} / \Delta t_{DD}$
Case      MN     CPU sec.  timesteps    CPU sec.  timesteps      min     max
M-SB-1     800      105       12           89        12          1.00    1.09
M-MB-1     800      129       13          115        15          1.04    1.86
M-SB-2    1600      515       11          454        12          1.04    1.08
M-MB-2    1600      615       13          571        15          1.04    1.24
F-SB-1    3200      721       11          644        12          1.05    1.08
F-MB-1    3200      889       14          948        19          1.05    2.26
F-SB-2    6400     5558       11         4846        12          1.05    1.08
F-MB-2    6400     5458       13         6635        19          1.05    2.26

For stationary boundary problems, the divided difference error estimate gives
superior performance. For large moving boundary cases, the TR/TR estimate is
preferable, for the following reason. At the beginning of a timestep the boundary
is moved to a place where $\nabla C$ was formerly nonzero. In calculating $F^n$ (at the
beginning of the timestep), we force $n \cdot \nabla C = 0$ on the boundary; but only a short
distance away from the boundary, the initial gradient of $C$ (which was inherited
from the preceding timestep) may be fairly steep. Since $F$ is calculated from second
spatial derivatives, we may expect the initial $F^n$ to contain some error associated
with this effect near the boundary. The other terms in Eq. (14), $F^{n+\gamma}$ and $F^{n+1}$, are

calculated from $C$'s that have undergone diffusion since the boundary was moved,
so they do not contain this error. Since $\tau^{n+1}$ is calculated from differences in these
values, a given relative error in $F^n$ will induce a relatively larger error in $\tau^{n+1}$ and,
hence, in the calculated timestep size.

Figure 11: Relative CPU usage for TR/TR and divided difference timestep adjusters. [Scatter plot of the ratio of TR/TR CPU time to DD CPU time versus number of unknowns, with separate markers for stationary boundary and moving boundary cases.]

4.2. Re-using Jacobians. If the Jacobian is changing slowly enough, there is

a potential for savings in execution time by re-using old factorizations, that is, by
employing Newton-Richardson iteration as opposed to Newton iteration. On the
other hand, since the reused Jacobian will no longer generate updates along the true
Newton direction, the number of iterations required for convergence may increase.
Software algorithms implementing Newton-Richardson iteration must therefore in-
clude some test for slow convergence, at which point the matrix is re-factored. Since
the assumption of a slowly changing Jacobian is then suspect, a flag may be set to
re-factor on subsequent iterations as well [11,3].
Our code includes three such tests. Ordinarily, we factor the matrix only at the
beginning of each composite TRBDF2 timestep. However, the matrix is re-factored
in any of the following circumstances:
(a) If MAX-NEWTONS are reached without convergence, the code switches
to standard Newton iteration for the duration of the timestep. (MAX-
NEWTONS has been set at 9.)
(b) If the norm of the residual exceeds MAX-RATIO times the norm of the resid-
ual at the previous iteration, the Jacobian is re-factored. (MAX-RATIO has
been set at 0.1, which implies that re-factorization occurs if the Newton-
Richardson method is producing less than one decimal digit per iteration.)

(c) If the damping factor $\lambda_{k+1}$ is less than 1.0, the Jacobian is re-factored. (The
damping factor is calculated by the formula of Bank and Rose [3].)
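The three tests can be collected into a single decision function, sketched below. The thresholds are the values quoted in (a) and (b); the function name, its arguments and the two-flag return value are hypothetical conveniences and are not taken from the authors' code.

```python
MAX_NEWTONS = 9     # test (a): iteration limit before switching to full Newton
MAX_RATIO = 0.1     # test (b): required reduction of the residual norm per iteration


def refactor_decision(iteration, residual_norm, previous_residual_norm, damping):
    """Return (refactor_now, full_newton_for_rest_of_timestep)."""
    # (a) Too many iterations without convergence: fall back to standard Newton.
    if iteration >= MAX_NEWTONS:
        return True, True
    # (b) Less than one decimal digit gained in this iteration.
    if residual_norm > MAX_RATIO * previous_residual_norm:
        return True, False
    # (c) The update had to be damped.
    if damping < 1.0:
        return True, False
    # Otherwise keep re-using the old factorization.
    return False, False


if __name__ == "__main__":
    print(refactor_decision(2, 0.5, 1.0, 1.0))   # slow convergence: re-factor
    print(refactor_decision(2, 0.01, 1.0, 1.0))  # converging well: keep the old Jacobian
```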
The performance of this Newton-Richardson method was compared experimen-
tally with that of ordinary Newton iteration. The results are summarized in Table
2. The Newton-Richardson method usually gives a modest performance improve-
ment over the Newton method, and occasionally only a slight degradation, so that
Newton-Richardson may be considered as the method of choice for most nonlinear
diffusion problems.

Table 2: Comparison of Newton-Richardson and Newton methods.

                                 Newton-Richardson              Newton
Case      MN     timesteps    CPU sec.  total iterations    CPU sec.  total iterations
M-SB-1     800      12           87           51               89           46
M-MB-1     800      15          103           58              115           57
M-SB-2    1600      12          466           49              454           45
M-MB-2    1600      15          521           58              571           57
F-SB-1    3200      12          670           51              644           46
F-MB-1    3200      19          926           74              948           69
F-SB-2    6400      12         4793           59             4846           58
F-MB-2    6400      19         6007           74             6635           74

5. Conclusion.
We have demonstrated an efficient set of algorithms appropriate for modeling
stable interfaces in two spatial dimensions, and we have applied these algorithms
to the solution of a set of nonlinear diffusion equations on a region with a mov-
ing boundary, from the field of semiconductor process modeling. We have shown
that for problems with a stationary boundary, a divided difference error estimator
gives optimal performance, while a TR/TR scheme is preferable with a moving
boundary. We have also demonstrated that in most nonlinear diffusion problems,
Newton-Richardson iteration yields a modest performance improvement over New-
ton iteration.

REFERENCES

[1] R.B. FAIR, C.L. GARDNER, M.J. JOHNSON, S.W. KENKEL, D.J. ROSE, J.E. ROSE, AND
R. SUBRAHMANYAN, Two dimensional process simulation using verified phenomenological
models, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 10 (1991), pp. 643-651.
[2] R.E. BANK, W.M. COUGHRAN, W. FICHTNER, E.H. GROSSE, D.J. ROSE, AND R.K. SMITH,
Transient simulation of silicon devices and circuits, IEEE Transactions on Computer-Aided
Design, vol. CAD-4 (1985), pp. 436-451.
[3] R.E. BANK AND D.J. ROSE, Global approximate Newton methods, Numerische Mathematik,
vol. 37 (1981), pp. 279-295.
[4] R.E. BANK, D.J. ROSE, AND W. FICHTNER, Numerical methods for semiconductor device
simulation, SIAM Journal on Scientific and Statistical Computing, vol. 4 (1983), pp. 416-435.
[5] R.S. VARGA, Matrix Iterative Analysis, Prentice-Hall, 1962.
[6] H.R. YEAGER AND R.W. DUTTON, An approach to solving multiparticle diffusion exhibiting
nonlinear stiff coupling, IEEE Transactions on Electron Devices, vol. ED-32 (1985), pp.
1964-1976.
[7] I.L. CHERN, J. GLIMM, O. McBRYAN, B. PLOHR, AND S. YANIV, Front tracking for gas
dynamics, Journal of Computational Physics, vol. 62 (1986), pp. 83-110.
[8] D.H. SELIM, Tensor product grid implementation for PREDICT2, Master's thesis, Duke
University, 1990.
[9] Microelectronics Center of North Carolina, Research Triangle Park, NC, PREDICT Users
Manual, 1986.
[10] M.J. JOHNSON, Numerical Methods for Semiconductor Process Simulation in Two Spatial
Dimensions: a Nonlinear Diffusion Problem with a Free Boundary, PhD thesis, Duke Uni-
versity, 1991.
[11] W.M. COUGHRAN, E.H. GROSSE, AND D.J. ROSE, Aspects of computational circuit analysis,
in VLSI CAD Tools and Applications (W. Fichtner and M. Morf, eds.), pp. 105-127, Kluwer
Publishers, Boston, 1986.
ASYMPTOTIC ANALYSIS OF A MODEL FOR THE DIFFUSION
OF DOPANT-DEFECT PAIRS

J.R. KING*

Abstract. Asymptotic methods are applied to a model describing the diffusion through silicon
of a dopant which pairs with both vacancies and self-interstitials. Several different asymptotic
limits are discussed for problems in both one and higher dimensions.

1. Introduction and model. The purpose of this paper is to summarise

some results of an asymptotic analysis of models for the diffusion of a dopant in
silicon mediated by mobile dopant-point defect pairs. Such models are now widely
accepted as providing accurate descriptions of the redistribution of many impurities
in semiconductors; see, for example, Fahey et al. [1]. The defects involved are the
vacancy (V) and the self-interstitial (I), each of which may pair with a dopant atom
(d) to produce a pair (dV or dI). For simplicity we shall follow Richardson and
Mulvaney [10] and Morehead and Lever [9] in treating all species as electrically
neutral. Six bimolecular reactions between the various species are then possible, as
follows.

                                      Forward reaction rate          Reverse reaction rate

(1) $V + I \rightleftharpoons 0$             $F_1 = K_1 c_V c_I$            $R_1 = K_1 c_V^* c_I^*$
(2) $d + V \rightleftharpoons dV$            $F_2 = K_2 w_V c_d c_V$        $R_2 = K_2 c_V^* c_{dV}$
(3) $d + I \rightleftharpoons dI$            $F_3 = K_3 w_I c_d c_I$        $R_3 = K_3 c_I^* c_{dI}$
(4) $dV + I \rightleftharpoons d$            $F_4 = K_4 c_{dV} c_I$         $R_4 = K_4 w_V c_I^* c_d$
(5) $dI + V \rightleftharpoons d$            $F_5 = K_5 c_{dI} c_V$         $R_5 = K_5 w_I c_V^* c_d$
(6) $dV + dI \rightleftharpoons 2d$          $F_6 = K_6 c_{dV} c_{dI}$      $R_6 = K_6 w_V w_I c_d^2$

Here $c$ with the appropriate suffix denotes the concentration of a given species, $c_V^*$
and $c_I^*$ are equilibrium concentrations of vacancies and interstitials respectively, and
$K_1$ to $K_6$, $w_V$ and $w_I$ are constants. We note that the $K$'s all have the same
dimensions and that $w_V$ and $w_I$ are dimensionless. Relationships between various
reaction coefficients have been derived by exploiting the principle of detailed balance
(see, for example, [7]) which implies that under equilibrium conditions each reaction
must individually be in balance. Thus in equilibrium

$c_V c_I = c_V^* c_I^*,$

*Department of Theoretical Mechanics, University of Nottingham NG7 2RD, England

which implies in particular that if, in addition, defect concentrations individually

take their equilibrium values then we have

The corresponding system of reaction-diffusion equations is

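(1.1) $\dfrac{\partial c_d}{\partial t} = R_2 - F_2 + R_3 - F_3 + F_4 - R_4 + F_5 - R_5 + 2F_6 - 2R_6,$

(1.2) $\dfrac{\partial c_{dV}}{\partial t} = D_{dV} \dfrac{\partial^2 c_{dV}}{\partial x^2} + F_2 - R_2 + R_4 - F_4 + R_6 - F_6,$

(1.3) $\dfrac{\partial c_{dI}}{\partial t} = D_{dI} \dfrac{\partial^2 c_{dI}}{\partial x^2} + F_3 - R_3 + R_5 - F_5 + R_6 - F_6,$

(1.4) $\dfrac{\partial c_V}{\partial t} = D_V \dfrac{\partial^2 c_V}{\partial x^2} + R_1 - F_1 + R_2 - F_2 + R_5 - F_5,$

(1.5) $\dfrac{\partial c_I}{\partial t} = D_I \dfrac{\partial^2 c_I}{\partial x^2} + R_1 - F_1 + R_3 - F_3 + R_4 - F_4,$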

where the D's are constant diffusivities and it is assumed that the dopant is unable
to diffuse in its unpaired state. Before proceeding further we make the simplifying
assumption that K2 and K3 are sufficiently large that the relations

(1.6) $c_{dV} = w_V c_d c_V / c_V^*, \qquad c_{dI} = w_I c_d c_I / c_I^*$

may be assumed valid. The governing system is then made up of the algebraic
equations (1.6) together with the following:

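(1.7) $\dfrac{\partial}{\partial t} (c_d + c_{dV} + c_{dI}) = \dfrac{\partial^2}{\partial x^2} (D_{dV} c_{dV} + D_{dI} c_{dI}),$

(1.8) $\dfrac{\partial}{\partial t} (c_V + c_{dV}) = \dfrac{\partial^2}{\partial x^2} (D_V c_V + D_{dV} c_{dV}) + R_1 - F_1 + R_4 - F_4 + R_5 - F_5 + R_6 - F_6,$

(1.9) $\dfrac{\partial}{\partial t} (c_I + c_{dI}) = \dfrac{\partial^2}{\partial x^2} (D_I c_I + D_{dI} c_{dI}) + R_1 - F_1 + R_4 - F_4 + R_5 - F_5 + R_6 - F_6,$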

which are obtained by taking suitable combinations of (1.1) - (1.5). We note that
conditions such as (1.6) must hold in order for the governing equation to be the
linear diffusion equation
$\dfrac{\partial c_d}{\partial t} = D_i \dfrac{\partial^2 c_d}{\partial x^2}$

when $c_d$ is small everywhere and $c_V \sim c_V^*$, $c_I \sim c_I^*$ everywhere, in this case the
intrinsic diffusivity $D_i$ being given by

$D_i = (D_{dV} w_V + D_{dI} w_I) / (1 + w_V + w_I).$
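The expression for the intrinsic diffusivity follows from (1.7) in two lines. The derivation below is a worked check, assuming that the pairing relations (1.6) are those obtained by setting $F_2 = R_2$ and $F_3 = R_3$, namely $c_{dV} = w_V c_d c_V / c_V^*$ and $c_{dI} = w_I c_d c_I / c_I^*$:

```latex
\begin{align*}
c_{dV} &= w_V c_d, \qquad c_{dI} = w_I c_d
  \qquad (\text{taking } c_V = c_V^*,\ c_I = c_I^*), \\
\frac{\partial}{\partial t}\bigl[(1 + w_V + w_I)\,c_d\bigr]
  &= \frac{\partial^2}{\partial x^2}\bigl[(D_{dV} w_V + D_{dI} w_I)\,c_d\bigr], \\
\frac{\partial c_d}{\partial t}
  &= \frac{D_{dV} w_V + D_{dI} w_I}{1 + w_V + w_I}\,
     \frac{\partial^2 c_d}{\partial x^2}
   = D_i\,\frac{\partial^2 c_d}{\partial x^2}.
\end{align*}
```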

We note, however, that (1.7) - (1.9) is a lower order system than (1.1) - (1.5),
and if the boundary conditions on (1.1) - (1.5) are not consistent with (1.6) then

boundary layers will occur in which (1.6) is not valid. Additional timescales are
also necessary to describe initial transients when (1.6) does not hold at t = o.
Using (1.6) we may rewrite (1.8) and (1.9) as

(1.10)

(1.11)

from which the enhancement to the defect generation-recombination rate resulting

from the presence of dopant is apparent.

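2. Non-dimensionalization and preliminary asymptotics. We now non-dimensionalize
(1.6), (1.7), (1.10) and (1.11) by writing

$c_d = c_d^* u, \quad c_V = c_V^* V, \quad c_I = c_I^* I, \quad c_{dV} = w_V c_d^* P_V, \quad c_{dI} = w_I c_d^* P_I,$

$t = T \bar{t},$

where $c_d^*$ is a representative dopant concentration and $T$ is a representative timescale;
we note that $V$ and $I$ now denote dimensionless concentrations. We then obtain
(dropping overbars)

(2.1) $P_V = uV, \quad P_I = uI,$

(2.2) $\dfrac{\partial}{\partial t} \big( (1 + w_V V + w_I I) u \big) = \dfrac{\partial^2}{\partial x^2} \big( (f_V V + f_I I) u \big),$

(2.3) $\dfrac{\partial}{\partial t} \big( (r_V + w_V u) V \big) = \dfrac{\partial^2}{\partial x^2} \big( (g_V r_V + f_V u) V \big) + \Big( \kappa_0 r + \kappa_1 u + \dfrac{\kappa_2 w_V w_I}{r} u^2 \Big) (1 - IV),$

(2.4) $\dfrac{\partial}{\partial t} \big( (r_I + w_I u) I \big) = \dfrac{\partial^2}{\partial x^2} \big( (g_I r_I + f_I u) I \big) + \Big( \kappa_0 r + \kappa_1 u + \dfrac{\kappa_2 w_V w_I}{r} u^2 \Big) (1 - IV),$

where we have dimensionless constants

$r_V = c_V^* / c_d^*, \quad r_I = c_I^* / c_d^*, \quad r = (r_V r_I)^{1/2},$

$f_V = D_{dV} w_V / (D_{dV} w_V + D_{dI} w_I), \quad f_I = D_{dI} w_I / (D_{dV} w_V + D_{dI} w_I),$

$g_V = D_V / (D_{dV} w_V + D_{dI} w_I), \quad g_I = D_I / (D_{dV} w_V + D_{dI} w_I),$

$\kappa_0 = K_1 (c_V^* c_I^*)^{1/2} T, \quad \kappa_1 = (K_4 w_V c_I^* + K_5 w_I c_V^*) T, \quad \kappa_2 = K_6 (c_V^* c_I^*)^{1/2} T.$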

We note the relationship

$f_V + f_I = 1,$

and that $r_V$, $r_I$ and $r$ are the only parameters which depend on $c_d^*$. The notation
used in (2.1) - (2.4) is somewhat different from that used earlier [4] in discussing a
special case of this model.
We now note two sets of boundary and initial conditions applicable to (2.2) -
(2.4), as follows.

(a) Surface source

(2.5)
at $x = 0$: $u = u_s$, $V = 1$, $I = 1$,
as $x \to +\infty$: $u \to 0$, $V \to 1$, $I \to 1$,
at $t = 0$: $u = 0$, $V = 1$, $I = 1$,

where $u_s$ is a prescribed constant.

(b) Implanted dopant

(2.6)
at $x = 0$: $\dfrac{\partial}{\partial x} \big( (f_V V + f_I I) u \big) = 0$, $V = 1$, $I = 1$,
as $x \to +\infty$: $u \to 0$, $V \to 1$, $I \to 1$,
at $t = 0$: $u = U(x)$, $V = 1$, $I = 1$,

where $U(x)$ is a prescribed function with

$Q = \int_0^\infty U(x)\,dx$

finite. The initial conditions on $V$ and $I$ neglect implantation damage effects, but
these initial conditions will in any case play little role in the subsequent analysis.
It is convenient to write

at $x = 0$: $u = u_s(t)$;

in this case $u_s$ must be determined as part of the solution.

Based on the parameters used by Morehead and Lever [9] we have the following
approximate orders of magnitude (taking $c_d^*$ to be the surface dopant concentration
and $T$ to be the duration of the diffusion):

$w_V, w_I \sim 10^{-3},$

Morehead and Lever [9] do not include terms corresponding to $\kappa_0$ or $\kappa_2$ and they assume
that $\kappa_1$ is sufficiently large that the generation-recombination term dominates
giving

The parameters used by Richardson and Mulvaney [10] also imply that $r_V$, $r_I$, $r$, $w_V$
and $w_I$ are small and that $g_V$ and $g_I$ are large.
We shall always therefore consider the limits

$r_V, r_I, r, w_V, w_I \to 0,$

and in much of what follows we shall concentrate on the case in which

and we assume that

$r_I = O(r_V), \quad g_I = O(g_V), \quad t = O(1).$

In this case the solution has a two region asymptotic structure made up of an
inner region ($x = O(1)$) on the length scale of dopant diffusion and an outer region
($x = O(g^{1/2})$ where $g = (g_V g_I)^{1/2}$) on the much longer length scale of defect diffusion.
In this limit, equations (2.3) and (2.4) imply that for $x = O(1)$ we have at
leading order

Matching into the outer region then implies that

(2.7)

which corresponds to the well-established assumption of flux balance (see, for ex-
ample, [11], [8]).
In each of cases (a) and (b) above, equation (2.7) implies that

(2.8) $(g_V r_V + f_V u)V - (g_I r_I + f_I u)I = g_V r_V - g_I r_I + (f_V - f_I)u_s$,

and the leading order problem in $x = O(1)$ is governed by (2.8) together with

(2.9) $\frac{\partial u}{\partial t} = \frac{\partial^2}{\partial x^2}\left((f_V V + f_I I)u\right)$,

(2.10) $\frac{\partial^2}{\partial x^2}\left((g_V r_V + f_V u)V\right) + \left(\kappa_0 r + \kappa_1 u + \frac{\kappa_2}{r} u^2\right)(1 - IV) = 0$,

where we have retained all the generation-recombination terms.

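In the outer region we write $x = g^{1/2} y$, and for $y = O(1)$, $g \gg 1$ the dopant
concentration $u$ is exponentially small with respect to $g$ and the dominant balance
is given by

(2.11) $\frac{\partial V}{\partial t} = \gamma \frac{\partial^2 V}{\partial y^2} + \frac{\kappa_0}{\rho}(1 - IV)$,

(2.12) $\frac{\partial I}{\partial t} = \frac{1}{\gamma} \frac{\partial^2 I}{\partial y^2} + \kappa_0 \rho (1 - IV)$,

where $\gamma = (g_V/g_I)^{1/2}$ and $\rho = (r_V/r_I)^{1/2}$. Matching the two regions together implies
that the conditions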

(2.13) as $x \to +\infty$: $u \to 0$, $\frac{\partial V}{\partial x} \to 0$

hold on (2.9) and (2.10), and defining $V_\infty$ and $I_\infty$ by

as $x \to +\infty$: $V(x,t) \to V_\infty(t)$, $I(x,t) \to I_\infty(t)$,

where $V$ and $I$ denote leading order solutions to the inner problem, then (2.11) and
(2.12) are governed by

(2.14)
at $y = 0$: $V = V_\infty(t)$, $I = I_\infty(t)$,
as $y \to +\infty$: $V \to 1$, $I \to 1$,
at $t = 0$: $V = 1$, $I = 1$.

In order to make further analytical progress we now consider limiting processes
involving the generation-recombination terms, the two possible limits being
discussed in the next two sections.

3. Negligible generation-recombination.

3.1 Intermediate dopant concentrations $r = O(1/g)$. The precise balance
of terms in (2.2) - (2.4) depends on the maximum dopant concentration present,
and there are two cases we need to discuss, namely an intermediate concentration
case in which $r = O(1/g)$ and a high concentration case with $r = O(w/g)$, where
$w = (w_V w_I)^{1/2}$ and we assume that $w_I = O(w_V)$. Here the value of $r$ is determined
by defining $c_d$ to be the surface concentration of dopant in the surface source case
and to be the maximum value of $c_d$ in the initial profile for the ion-implanted case.
We start by assuming that $r = O(1/g)$, which was the case considered in the
previous section and which led to the flux-balance condition (2.7). We now assume
in addition that

so that for $x = O(1)$ the generation-recombination terms in (2.3) and (2.4) may be
neglected. This implies that at leading order

so that

(3.1) $V = (g_V r_V + f_V u_s) / (g_V r_V + f_V u)$,

(3.2) $I = (g_I r_I + f_I u_s) / (g_I r_I + f_I u)$,

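and then (2.9) becomes

(3.3) $\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(D_1(u, u_s) \frac{\partial u}{\partial x}\right)$,

where

(3.4) $D_1(u, u_s) = f_V \frac{1 + s_V u_s}{(1 + s_V u)^2} + f_I \frac{1 + s_I u_s}{(1 + s_I u)^2}$

with

(3.5) $s_V = f_V / g_V r_V$, $s_I = f_I / g_I r_I$.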

We note that the contributions of vacancy and interstitial mechanisms to the
effective diffusivity $D_1$ are in this case additive. It follows from (3.1) and (3.2) that in
(2.14) we have

(3.6) $V_\infty = 1 + s_V u_s$, $I_\infty = 1 + s_I u_s$,

so that $s_V$ and $s_I$ provide measures of the degree of point defect supersaturation.
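The step from (3.1) - (3.2) to the diffusivity (3.4) can be checked numerically. The following is a minimal sketch, assuming the reading $D_1 = \frac{d}{du}\left[(f_V V + f_I I)u\right]$ implied by (2.9); the function names and the parameter values are hypothetical and chosen only for illustration, with `gr_V` and `gr_I` standing for the products $g_V r_V$ and $g_I r_I$.

```python
def defects_negligible_gr(u, u_s, f_V, gr_V, gr_I):
    """Leading-order V and I from (3.1)-(3.2); gr_V = g_V*r_V, gr_I = g_I*r_I."""
    f_I = 1.0 - f_V  # uses the relationship f_V + f_I = 1
    V = (gr_V + f_V * u_s) / (gr_V + f_V * u)
    I = (gr_I + f_I * u_s) / (gr_I + f_I * u)
    return V, I


def D1(u, u_s, f_V, gr_V, gr_I):
    """Effective diffusivity (3.4), with s_V and s_I defined as in (3.5)."""
    f_I = 1.0 - f_V
    s_V, s_I = f_V / gr_V, f_I / gr_I
    return (f_V * (1 + s_V * u_s) / (1 + s_V * u) ** 2
            + f_I * (1 + s_I * u_s) / (1 + s_I * u) ** 2)


def D1_finite_difference(u, u_s, f_V, gr_V, gr_I, h=1e-6):
    """Central difference of (f_V V + f_I I) u with respect to u."""
    f_I = 1.0 - f_V

    def flux_function(w):
        V, I = defects_negligible_gr(w, u_s, f_V, gr_V, gr_I)
        return (f_V * V + f_I * I) * w

    return (flux_function(u + h) - flux_function(u - h)) / (2 * h)


# Hypothetical parameter values, for illustration only.
params = dict(u_s=1.0, f_V=0.3, gr_V=0.5, gr_I=0.8)
for u in (0.1, 0.5, 0.9):
    print(u, D1(u, **params), D1_finite_difference(u, **params))
```

The two printed columns should agree to within the finite-difference error, and evaluating `defects_negligible_gr` at `u = 0` reproduces the supersaturation $1 + s_V u_s$ of (3.6).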

The diffusivity (3.4) exhibits a surface concentration enhancement, the tail
diffusivity being given by

$D_1(0, u_s) = f_V(1 + s_V u_s) + f_I(1 + s_I u_s)$.

The corresponding profile for $u$ will not, however, exhibit at high concentrations
a plateau of the kind that is observed for phosphorus diffusion in silicon (see [3]).
This follows because $D_1$ decreases with increasing $u$.
We note from (3.6) that if $gr \ll 1$ then the defect supersaturation is very large.
If it is sufficiently large then the number of pairs becomes comparable to the number
of unpaired dopant atoms and the dominant balance of terms changes. This occurs
when $gr = O(w)$.

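3.2 Very high dopant concentrations $r = O(w/g)$. The governing
equations are (2.2) together with

$\frac{\partial}{\partial t}\left((r_V + w_V u)V\right) = \frac{\partial^2}{\partial x^2}\left((g_V r_V + f_V u)V\right)$

and

$\frac{\partial}{\partial t}\left((r_I + w_I u)I\right) = \frac{\partial^2}{\partial x^2}\left((g_I r_I + f_I u)I\right)$.

We restrict attention here to the surface source case (2.5), in which case the solution
is self-similar with $u \equiv u(x/t^{1/2})$, $V \equiv V(x/t^{1/2})$, $I \equiv I(x/t^{1/2})$. Guided by (3.6), for
$r = O(w/g)$ we introduce the rescalings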

(3.7) u=wu, x = x/w t ,

and write

hV=gVTV/W,

to give the following leading order balance in x = 0(1):

au a2 ( ~ + hII~) ,
= - aX2 hvV
at
a ~
(3.8) at (v V u) ; (hV + fvu)i1) ,

! (fu/v)
02
aX2 (hI
~ ~
+ /Ju)I) ;

we note that flux-balance does not occur in this case.

It is clear from the rescalings (3.7) that since $w \ll 1$ the solution to (3.8) will
not enable us to satisfy the conditions of (2.5) on $x = 0$. It turns out that there is
a boundary layer in which we write

and obtain (imposing conditions from (2.5))

(3.9) V'" u./u ,

the required solution to (3.8) satisfies

V'" u./u ,

4. Dominant generation-recombination.

4.1 Intermediate dopant concentrations $r = O(1/g)$. We now consider
the case in which at least one

hold, so that for $u = O(1)$ equations (2.3) and (2.4) imply that at leading order

(4.1) $IV = 1$.

For $r = O(1/g)$ equation (2.8) holds, and it then follows from (4.1) that

(4.2) $V = \left\{ \left( (g_I r_I - g_V r_V + (f_I - f_V)u_s)^2 + 4(g_I r_I + f_I u)(g_V r_V + f_V u) \right)^{1/2} - (g_I r_I - g_V r_V + (f_I - f_V)u_s) \right\} / 2(g_V r_V + f_V u)$,

(4.3) $I = \left\{ \left( (g_I r_I - g_V r_V + (f_I - f_V)u_s)^2 + 4(g_I r_I + f_I u)(g_V r_V + f_V u) \right)^{1/2} + (g_I r_I - g_V r_V + (f_I - f_V)u_s) \right\} / 2(g_I r_I + f_I u)$.

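Substituting into (2.9) now gives

(4.4) $\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(D_2(u, u_s) \frac{\partial u}{\partial x}\right)$

with

(4.5) $D_2(u, u_s) = \left( (g_V r_V V + g_I r_I I)(f_V V + f_I I) + 4 f_V f_I u \right) / \left( (g_V r_V V + g_I r_I I) + (f_V V + f_I I)u \right)$.

An explicit expression for the dependence of $D_2$ on $u$ and $u_s$ may be obtained by
substituting for $V$ and $I$ using (4.2) and (4.3). Here we restrict further attention to
some special cases. Firstly we note that if $f_V = f_I = 1/2$ then (4.2) and (4.3) yield

$V = 1$, $I = 1$,

so that

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$.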
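The algebra behind (4.2) - (4.5) is easy to get wrong by a sign, so a numerical cross-check is useful. The sketch below, under the assumption that $D_2$ is the derivative of $(f_V V + f_I I)u$ with respect to $u$ (as (2.9) suggests), evaluates $V$ and $I$ from (4.2) - (4.3), confirms $IV = 1$ and the flux-balance relation (2.8), and compares (4.5) with a finite difference. All names and parameter values are hypothetical; `gr_V` and `gr_I` stand for $g_V r_V$ and $g_I r_I$.

```python
import math


def defects_dominant_gr(u, u_s, f_V, gr_V, gr_I):
    """V and I from (4.2)-(4.3), evaluated so that no small difference is formed."""
    f_I = 1.0 - f_V
    a = gr_V + f_V * u
    b = gr_I + f_I * u
    c = gr_I - gr_V + (f_I - f_V) * u_s
    root = math.sqrt(c * c + 4.0 * a * b)
    if c >= 0.0:
        I = (root + c) / (2.0 * b)  # (4.3)
        V = 1.0 / I                 # (4.1)
    else:
        V = (root - c) / (2.0 * a)  # (4.2)
        I = 1.0 / V                 # (4.1)
    return V, I


def D2(u, u_s, f_V, gr_V, gr_I):
    """Effective diffusivity (4.5)."""
    f_I = 1.0 - f_V
    V, I = defects_dominant_gr(u, u_s, f_V, gr_V, gr_I)
    defect_part = gr_V * V + gr_I * I
    dopant_part = f_V * V + f_I * I
    return ((defect_part * dopant_part + 4.0 * f_V * f_I * u)
            / (defect_part + dopant_part * u))


def D2_finite_difference(u, u_s, f_V, gr_V, gr_I, h=1e-6):
    """Central difference of (f_V V + f_I I) u with respect to u."""
    f_I = 1.0 - f_V

    def flux_function(w):
        V, I = defects_dominant_gr(w, u_s, f_V, gr_V, gr_I)
        return (f_V * V + f_I * I) * w

    return (flux_function(u + h) - flux_function(u - h)) / (2 * h)


# Hypothetical parameter values, for illustration only.
params = dict(u_s=1.0, f_V=0.3, gr_V=0.05, gr_I=0.08)
print(D2(0.4, **params), D2_finite_difference(0.4, **params))
# Special case f_V = f_I = 1/2: V = I = 1 and D2 = 1.
print(defects_dominant_gr(0.4, 1.0, 0.5, 0.05, 0.08), D2(0.4, 1.0, 0.5, 0.05, 0.08))
```

The second printed line illustrates the special case $f_V = f_I = 1/2$ noted above, in which the dopant equation reduces to linear diffusion.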

Of more interest is the case in which $gr \ll 1$, when three different scales for $u$ must
be considered. We assume that $f_I > f_V$; if $f_V > f_I$ the roles of vacancies and
interstitials are interchanged.

(i) $u = O(1)$ (high concentrations) (surface region).

For $g_V r_V, g_I r_I \ll 1$ we then obtain

$V \sim \left\{ \left( (f_I - f_V)^2 u_s^2 + 4 f_I f_V u^2 \right)^{1/2} - (f_I - f_V)u_s \right\} / 2 f_V u$,

$I \sim \left\{ \left( (f_I - f_V)^2 u_s^2 + 4 f_I f_V u^2 \right)^{1/2} + (f_I - f_V)u_s \right\} / 2 f_I u$,

and

(4.6) $D_2(u, u_s) \sim 4 f_I f_V u / \left( (f_I - f_V)^2 u_s^2 + 4 f_I f_V u^2 \right)^{1/2}$.

We note that in this case the terms $f_V V u$ and $f_I I u$ on the right-hand side of (2.9)
make equal contributions, whatever the value of $f_V / f_I$ (provided that $f_V, f_I \ne 0$).

(ii) $u = O\left((g_I r_I)^{1/2}\right)$ (intermediate concentrations) (kink region).

Then

$V \sim f_I u / (f_I - f_V)u_s$,

$I \sim (f_I - f_V)u_s / f_I u$,

and

(4.7) $D_2(u, u_s) \sim \frac{4 f_I f_V u}{(f_I - f_V)u_s} + \frac{g_I r_I (f_I - f_V)u_s}{f_I u^2}$.

(iii) $u = O(g_I r_I)$ (low concentrations) (tail region).

We now have

$V \sim (g_I r_I + f_I u) / (f_I - f_V)u_s$,

(4.8) $I \sim (f_I - f_V)u_s / (g_I r_I + f_I u)$,

and

(4.9)

We note that which defect is supersaturated and which is undersaturated here
depends only on which of $f_I$ and $f_V$ is the larger. We also note that there is again
a surface concentration enhancement of the tail diffusivity with

A uniformly valid approximation to $D_2$ is easily obtained in the form

(4.10)

The three region structure given by (i), (ii) and (iii) is reminiscent of that used
by Fair and Tsai [3] to describe phosphorus diffusion in silicon. The diffusivity is
largest in the tail region (iii), drops to a minimum in the intermediate region (ii)
and increases again in the high concentration region (i). It is evident from (4.6)
that in order to obtain a plateau effect due to a sufficiently large diffusivity at
high concentrations, we require that neither $f_I$ nor $f_V$ be too small. To obtain a
significant tail diffusivity (4.9) we require that $f_I$ and $f_V$ not be too close in value.
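The three-region behaviour can be made concrete by evaluating (4.5) across several decades of $u$ and comparing it with the regional forms (4.6) and (4.7). This is only an illustrative sketch: the values $f_V = 0.3$, $u_s = 1$ and $g_V r_V = g_I r_I = 10^{-6}$ are hypothetical, and the function names are not taken from the paper.

```python
import math


def D2_exact(u, u_s, f_V, gr_V, gr_I):
    """(4.5) with V, I from (4.1) and (4.3); assumes f_I > f_V so that c > 0."""
    f_I = 1.0 - f_V
    a, b = gr_V + f_V * u, gr_I + f_I * u
    c = gr_I - gr_V + (f_I - f_V) * u_s
    I = (math.sqrt(c * c + 4.0 * a * b) + c) / (2.0 * b)
    V = 1.0 / I
    defect_part = gr_V * V + gr_I * I
    dopant_part = f_V * V + f_I * I
    return ((defect_part * dopant_part + 4.0 * f_V * f_I * u)
            / (defect_part + dopant_part * u))


def D2_surface(u, u_s, f_V):
    """Surface-region form (4.6)."""
    f_I = 1.0 - f_V
    return 4.0 * f_I * f_V * u / math.sqrt((f_I - f_V) ** 2 * u_s ** 2
                                           + 4.0 * f_I * f_V * u * u)


def D2_kink(u, u_s, f_V, gr_I):
    """Kink-region form (4.7)."""
    f_I = 1.0 - f_V
    return (4.0 * f_I * f_V * u / ((f_I - f_V) * u_s)
            + gr_I * (f_I - f_V) * u_s / (f_I * u * u))


# Hypothetical parameters with g r << 1 and f_I > f_V.
u_s, f_V, gr = 1.0, 0.3, 1e-6
grid = [10.0 ** (-6.0 + 0.1 * k) for k in range(61)]  # u from 1e-6 to 1
values = [D2_exact(u, u_s, f_V, gr, gr) for u in grid]
u_at_min = grid[values.index(min(values))]

print("tail    D2(0)   =", D2_exact(0.0, u_s, f_V, gr, gr))
print("minimum D2      =", min(values), "near u =", u_at_min)
print("surface D2(u_s) =", D2_exact(u_s, u_s, f_V, gr, gr))
```

With these values the diffusivity is very large in the tail, passes through a minimum at an intermediate concentration, and recovers to an $O(1)$ value at the surface, which is the ordering described above.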
We note from (4.8) that

as $u \to 0$

so that the interstitial supersaturation becomes very large as $g_I r_I$ becomes small; if
it is sufficiently large then the number of interstitial-dopant pairs can be comparable
to the number of unpaired dopant atoms and a different balance of terms is again
needed. We now discuss this high supersaturation case.

4.2 Very high dopant concentrations $r = O(w/g)$. The governing
equations are (2.2), (4.1) and

$\frac{\partial}{\partial t}\left((r_V + w_V u)V - (r_I + w_I u)I\right) = \frac{\partial^2}{\partial x^2}\left((g_V r_V + f_V u)V - (g_I r_I + f_I u)I\right)$.

We again restrict attention to (2.5), in which case $u \equiv u(x/t^{1/2})$, $V \equiv V(x/t^{1/2})$ and
$I \equiv I(x/t^{1/2})$ again hold. We again treat the case $f_I > f_V$. The asymptotic structure
is complicated and we simply outline the various regions. We again write

$g_I r_I = h_I w$,

with $h_I = O(1)$, $w \ll 1$. The five regions describing the behaviour of the dopant are
as follows.

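(1) Surface region $u = O(1)$.

Writing

$u = u_0(x, t) + o(1)$ as $w \to 0$

we have

(4.11) $\frac{\partial u_0}{\partial t} = 4 f_I f_V \frac{\partial}{\partial x}\left( \frac{u_0}{\left( (f_I - f_V)^2 u_s^2 + 4 f_I f_V u_0^2 \right)^{1/2}} \frac{\partial u_0}{\partial x} \right)$

(cf. (4.6)). Equation (4.11) is to be solved as a moving boundary problem subject
to

(4.12)
at $x = 0$: $u_0 = u_s$,
at $x = s_0(t)$: $u_0 = 0$, $\frac{\partial u_0}{\partial x} = -(f_I - f_V)u_s \dot{s}_0 / 4 f_I f_V$,
at $t = 0$: $u_0 = 0$,

where $\dot{s}_0 \equiv ds_0/dt$ and $s_0(t)$ must be determined as part of the solution.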

(2) Interior layer 1 (kink region).

Writing $u = w^{1/2} u^*$, $x = s(t; w) + w^{1/2} z$ with

$u^* \sim u_0^*(z, t)$, $s \sim s_0(t)$ as $w \to 0$

gives (cf. (4.7))

(4.13) $-\dot{s}_0 u_0^* = \left( \frac{4 f_I f_V u_0^*}{(f_I - f_V)u_s} + \frac{h_I (f_I - f_V)u_s}{f_I u_0^{*2}} \right) \frac{\partial u_0^*}{\partial z}$,

and it is matching with this that leads to the condition (4.12). From (4.13) we
obtain

(4.14) $-\dot{s}_0 z = \frac{4 f_I f_V u_0^*}{(f_I - f_V)u_s} - \frac{h_I (f_I - f_V)u_s}{2 f_I u_0^{*2}}$,

where the arbitrary function of $t$ which arises on integrating (4.13) has been set
to zero. This may be achieved by appropriate specification of the $O(w^{1/2})$ term in
$s(t; w)$; completing the determination of $s$ to this order requires the matching of
further terms in the expansion for $u$, however.
It follows from (4.14) that

as $z \to +\infty$

(3) Interior layer 2.

We write
z
x = s(t;w) + In(l/w) ;

these scalings are necessary to match into region (4). At leading order we have

.- (t) _ hI(1I - Iv)u. Ouo

(4.15) -SoUo - a - I IU 2 -;:;:;- ,
o uZ

where aCt) remains to be determined. Equation (4.15) implies that

(4.16) as Z -+ +00

(4) Transition region.

We now have

u = w! In-!(l/w)u t , x = s + 0(1) ,

with
o=~ ( U ot-28U~)
8x '
8x
so that matching with (4.16) yields

(4.17)

(5) Tail region.

In this final region we write

u=wu, x = x/w t
and obtain at leading order

(4.18) 8 ~
at (Iouo/v) = az ((hI + /ruo)Io~) ,
8:£2

Yo = l/fo .
The required solution to (4.18) satisfies

(4.19)
as x -+ 0+ Uo <'V (hI(JI - fV)Ust/fr)
l.
2 Ix IntCl/x), Yo <'V /ruo/(JI - fv)u s,

10 <'V (JI - fv )u s l/ruo .

Matching (4.19) with (4.17) then yields

The discussion at the end of section 4.1 considered asymptotics on the diffusivity
$D_2$ but not on the corresponding profile of $u$. If the latter were considered, the
regions (1) - (4) would be essentially as in this section, while region (5) would
simplify to give the diffusivity (4.9) (this arises if $h_I \gg 1$ in (4.18)). We note that
(4.18) also corresponds to a limit in which flux-balance does not occur.

5. Higher dimensions. We restrict the discussion here to the case $r = O(1/g)$.
The appropriate generalisations of (2.8) - (2.10) are then

(5.1) $\nabla^2\left((g_V r_V + f_V u)V\right) = \nabla^2\left((g_I r_I + f_I u)I\right) = -\left(\kappa_0 r + \kappa_1 u + \frac{\kappa_2}{r} u^2\right)(1 - IV)$,

(5.2) $\frac{\partial u}{\partial t} = \nabla^2\left((f_V V + f_I I)u\right)$,

and we may immediately note the following. Schaake [11] claims that in any number
of dimensions there is a flux-balance condition, which for (5.1) would read

$\nabla\left((g_V r_V + f_V u)V\right) = \nabla\left((g_I r_I + f_I u)I\right)$;

this is however, other than in one dimension, a far more restrictive condition than
(5.1) and it cannot in general hold. This implies that in dimensions higher than one
it is not possible to obtain algebraic expressions (such as (3.1) and (3.2) or (4.2) and
(4.3)) relating the defect concentrations to the dopant concentrations in a simple
local manner; the only non-local effects in (3.1) - (3.2) and (4.2) - (4.3) arise from
the dependence on the surface concentration $u_s$. This would seem to indicate that
attempts to extend the Fair-Tsai model to higher dimensions, such as [6] and [2],
are largely inappropriate. Instead, coupled elliptic-parabolic systems such as (5.1)
- (5.2) should be solved; such reduced systems do not have the extreme stiffness
associated with the original systems.
We shall restrict further attention here to the form of the far-field behaviour
associated with the systems (5.1) and (5.2). If $\kappa_0$, $\kappa_1$ and $\kappa_2$ are sufficiently small
then (5.1) reduces to

(5.3) $\nabla^2\left((g_V r_V + f_V u)V\right) = \nabla^2\left((g_I r_I + f_I u)I\right) = 0$,

but when the recombination-generation terms dominate in the far-field we should
consider

(5.4) $IV = 1$.

We shall attempt to be as general as possible with regard to boundary and initial
conditions, subject to the following constraints. We consider a two-dimensional
problem on the domain

$-\infty < y < +\infty$, $0 \le x < +\infty$,

$x = 0$ being the silicon surface. We assume that

$u \to 0$ as $x \to +\infty$ or as $y \to -\infty$

and that the behaviour is one-dimensional in the limit $y \to +\infty$, and can therefore
be described by the analysis given earlier. Such conditions are applicable to the
important problem of diffusion under a mask edge.
It then follows that as $x \to +\infty$ with $y/x \to +\infty$ we have

(5.5)

where for (5.3) $V_\infty$ and $I_\infty$ are given by (3.6), while for (5.4) we have

(5.6) $I_\infty = \left\{ \left( (g_I r_I - g_V r_V + (f_I - f_V)u_s)^2 + 4 g_I r_I g_V r_V \right)^{1/2} + (g_I r_I - g_V r_V + (f_I - f_V)u_s) \right\} / 2 g_I r_I$.

We note that in either case we have

(5.7)

this follows from (2.8). In addition, for any $x$ we have

(5.8) as $y \to -\infty$: $V \to 1$, $I \to 1$.
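Writing $x = r\cos\theta$, $y = r\sin\theta$, the far-field defect behaviour as $x \to +\infty$ with
$y/x = O(1)$ may now be obtained. Since $u \to 0$ in this limit it follows from (5.1),
(5.7) and (5.8) that

(5.9) $g_V r_V V - g_I r_I I \sim g_V r_V - g_I r_I + \frac{1}{2}(f_V - f_I)\left(1 + \frac{2\theta}{\pi}\right)u_s$,

and for (5.3) we obtain

(5.10) $V \sim 1 + \frac{1}{2} s_V \left(1 + \frac{2\theta}{\pi}\right)u_s$,

while for (5.4) we have $V = 1/I$ with

$I \sim I^*(\theta, t)$

where

(5.11) $I^* = \left\{ \left( \left(g_I r_I - g_V r_V + \frac{1}{2}(f_I - f_V)\left(1 + \frac{2\theta}{\pi}\right)u_s\right)^2 + 4 g_I r_I g_V r_V \right)^{1/2} + \left(g_I r_I - g_V r_V + \frac{1}{2}(f_I - f_V)\left(1 + \frac{2\theta}{\pi}\right)u_s\right) \right\} / 2 g_I r_I$.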

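To determine the resulting far-field behaviour of the dopant concentration $u$ we
introduce an artificial small parameter $\delta$ by writing

$R = \delta r$.

We now define

for (5.3) and

$\psi(\theta, t) = f_I I^* + f_V / I^*$

for (5.4) (we note that $\psi$ depends on $\theta$ and $t$ only through the combination $\left(1 + \frac{2\theta}{\pi}\right)u_s(t)$).
Equation (5.2) then implies that in either case

(5.12) $\frac{\partial u}{\partial t} \sim \delta^2 \left( \frac{1}{R}\frac{\partial}{\partial R}\left(R \frac{\partial}{\partial R}(\psi u)\right) + \frac{1}{R^2}\frac{\partial^2}{\partial \theta^2}(\psi u) \right)$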

and we apply the W.K.B.J. method by assuming that

(5.13) $u \sim \Gamma(\delta)\left(a_0(R, \theta, t) + \dots\right)e^{-F(R, \theta, t)/\delta^2}$ as $\delta \to 0$,

for some function $\Gamma(\delta)$.

The far-field behaviour is largely determined by $F$, which satisfies the first order
partial differential equation

Since $\delta$ is an artificial small parameter, the required solution takes the form

with

(5.14) $\frac{\partial G}{\partial t} = -\psi(\theta, t)\left( \left(\frac{\partial G}{\partial \theta}\right)^2 + 4G^2 \right)$.

By solving (5.14) by the method of characteristics we may, for example, determine
the ratio of lateral to vertical diffusion lengths without having to consider the full
system (5.1) - (5.2). The appropriate boundary condition on (5.14) is obtained by
matching into the one-dimensional region corresponding to $y \to +\infty$; this implies
that

(5.15) as $\theta \to \frac{\pi}{2}$: $G \sim \left(\theta - \frac{\pi}{2}\right)^2 \Big/ 4\int_0^t \psi\left(\frac{\pi}{2}, u_s(t')\right)dt'$.
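A simple sanity check of (5.14) and (5.15) is available in the degenerate situation where $\psi$ is a constant, which is an assumption made here purely for illustration (in the text $\psi$ generally varies with $\theta$). In that case one-dimensional Gaussian spreading corresponds to $G = \cos^2\theta / (4\psi t)$, and the sketch below confirms by finite differences that this candidate satisfies (5.14) and approaches the matching form (5.15) as $\theta \to \pi/2$. The function names are hypothetical.

```python
import math


def G_constant_psi(theta, t, psi):
    """Candidate solution G = cos^2(theta) / (4 psi t) for constant psi."""
    return math.cos(theta) ** 2 / (4.0 * psi * t)


def residual_5_14(theta, t, psi, h=1e-5):
    """dG/dt + psi * ((dG/dtheta)^2 + 4 G^2), which should vanish by (5.14)."""
    dG_dt = (G_constant_psi(theta, t + h, psi)
             - G_constant_psi(theta, t - h, psi)) / (2 * h)
    dG_dtheta = (G_constant_psi(theta + h, t, psi)
                 - G_constant_psi(theta - h, t, psi)) / (2 * h)
    G = G_constant_psi(theta, t, psi)
    return dG_dt + psi * (dG_dtheta ** 2 + 4.0 * G * G)


def matching_form_5_15(theta, t, psi):
    """Right-hand side of (5.15) when psi is constant, so the integral is psi*t."""
    return (theta - math.pi / 2) ** 2 / (4.0 * psi * t)


psi, t = 1.3, 2.0  # hypothetical values
print("residual of (5.14):", residual_5_14(0.4, t, psi))
theta_near = math.pi / 2 - 1e-3
print("ratio to (5.15):",
      G_constant_psi(theta_near, t, psi) / matching_form_5_15(theta_near, t, psi))
```

For a $\theta$-dependent $\psi$ the same residual function could be used to test a numerical solution obtained by characteristics; that extension is not attempted here.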

If $u_s$ is a constant, so that $\psi$ does not depend on $t$, then the problem may be
further reduced to a single ordinary differential equation because $G$ takes the form

$G(\theta, t) = H(\theta)/t$

with

6. Discussion. The paper has outlined the results of applying singular
perturbation methods to a simple model for dopant-defect pair diffusion. Much of the
analysis carries over to more realistic models which allow for the electric charges
carried by the various species.

Some of the reduced problems given here are not new, but have been obtained
before by physical reasoning; see [8] and [9] in particular. We are, however, able to
state precise conditions on the governing parameters in order for such reductions
to be valid. In particular, the reduced problems which follow from flux-balance,
namely (3.3) - (3.4) and (4.2) - (4.5) (see also [8] and [9]), require that, for example,
$g_I r_I = O(1)$ or in dimensional terms that

The higher concentration problems (see sections 3.2 and 4.2), which are new, are
appropriate when, for example, $g_I r_I = O(w_I)$ so that

$c_d = O\left(c_I^* D_I / w_I (D_{dV} w_V + D_{dI} w_I)\right)$.

In these cases a fuller balance of terms occurs (see (3.8) and (4.18)), but the surface
region can take a particularly simple form (see (3.9)).
The reduced problems in one dimension often take the form of nonlinear diffu-
sion equations which contain a non-local dependence on the surface concentration
(this surface dependence arises from the formation of pairs at the surface which
diffuse in and then dissociate); see (3.3) and (4.4). Some general considerations for
such equations are given in [5]. A crucial step in the derivation of such models is
the balancing of fluxes of pairs and defects. As we have shown, such flux-balance
conditions do not in general hold for high concentrations or in more than one dimen-
sion. For this reason extensions of simplified one-dimensional models into higher
dimensions should be treated with caution. In higher dimensions the dopant dif-
fusion equations arising from the asymptotic analysis exhibit not only non-local
dependence but also direction dependence. This is exemplified by (5.12).
The results given at the end of section 4.1 illustrate how effective diffusivities
which may explain the form of diffused phosphorus profiles may be derived if both
interstitial and vacancy effects are considered (cf. [9] and [10]). For the surface
region diffusivity (4.6) to be sufficiently large we require that neither $f_I$ nor $f_V$ be
too small, so that both mechanisms must be operative. It is nevertheless possible for
$f_I$ to be significantly larger than $f_V$ (or vice versa) so that under low concentration
conditions the mechanism of diffusion may be dominated by a particular point
defect.

REFERENCES

[1] P.M. FAHEY, P.B. GRIFFIN AND J.D. PLUMMER, Point defects and dopant diffusion in silicon, Rev. Mod. Phys., 61 (1989), pp. 289-384.
[2] R.B. FAIR, C.L. GARDNER, M.J. JOHNSON, S.W. KENKEL, D.J. ROSE, J.E. ROSE AND R. SUBRAHMANYAN, Two-dimensional process simulation using verified phenomenological models, IEEE Trans. Comp.-Aided Des., 10 (1991), pp. 643-650.
[3] R.B. FAIR AND J.C.C. TSAI, A quantitative model for the diffusion of phosphorus in silicon and the emitter dip effect, J. Electrochem. Soc., 124 (1977), pp. 1107-1118.
[4] J.R. KING, Asymptotic analysis of an impurity-defect pair diffusion model, Q.J. Mech. Appl. Math., 44 (1991), pp. 369-412.
[5] J.R. KING, Surface-concentration dependent nonlinear diffusion, Euro. J. Appl. Math. (to appear).
[6] F. LAU AND U. GÖSELE, Two-dimensional phosphorus diffusion for soft drains in silicon MOS transistors, Appl. Phys. A, 40 (1986), pp. 101-107.
[7] J.W. MOORE AND R.G. PEARSON, Kinetics and mechanism, John Wiley, New York, (1981).
[8] F.F. MOREHEAD AND R.F. LEVER, Enhanced "tail" diffusion of phosphorus and boron in silicon: self-interstitial phenomena, Appl. Phys. Lett., 48 (1986), pp. 151-153.
[9] F.F. MOREHEAD AND R.F. LEVER, The steady-state model for coupled defect-impurity diffusion in silicon, J. Appl. Phys., 66 (1989), pp. 5349-5352.
[10] W.B. RICHARDSON AND B.J. MULVANEY, Plateau and kink in P profiles diffused into Si: a result of strong bimolecular recombination?, Appl. Phys. Lett., 53 (1988), pp. 1917-1919.
[11] H.F. SCHAAKE, The diffusion of phosphorus in silicon from high surface concentrations, J. Appl. Phys., 55 (1984), pp. 1208-1211.

A REACTION-DIFFUSION SYSTEM MODELING
PHOSPHORUS DIFFUSION

WALTER B. RICHARDSON, JR.*

Abstract. At very high concentrations phosphorus diffusion in silicon exhibits marked
nonlinearities. The hierarchy of physical models that attempt to explain this anomalous diffusion is
reviewed. An eight-species kinetic model is derived that yields a quasilinear, partly-dissipative
system of reaction-diffusion partial differential equations. The numerical method of lines is used to
solve the system for a simplified five-species model in three dimensions. The linear system in the
Newton iteration is solved using several matrix-free methods. In all cases the dimension of the
Krylov subspace must be quite large to ensure convergence. This suggests that preconditioning will
be more important for efficiency than choice of an accelerator.

Key words. nonlinear diffusion, reaction-diffusion, semiconductor doping, method of lines.

AMS(MOS) subject classifications. 65J15,60J60,35K57

1. Introduction. The trend toward shrinking device dimensions in Very Large
Scale Integration has produced an increased need for accurate simulation tools for
Technology Computer Aided Design. This has led to a hierarchy of increasingly
sophisticated models for device simulation including drift-diffusion, hydrodynamic,
and full Boltzmann. These tools rely upon process simulation for input data, including
accurate impurity concentration profiles and material boundaries. Physical processes
involved in fabricating an integrated circuit include lithography, etching, deposition,
diffusion, ion implantation, epitaxy, and oxidation. Process modeling is less mature
than device modeling, in part because the physics is less well understood. It provides a
wealth of open problems for mathematicians interested in nonlinear diffusion, moving
boundary problems, and large reaction-diffusion-convection systems.
This paper treats only the problem of phosphorus diffusion in silicon. The model
used currently and several of its refinements are reviewed. Experimental evidence
shows that phosphorus diffuses by a dual interstitialcy-vacancy mechanism. An eight-
species kinetic model is derived. Using experimentally determined estimates for the
diffusivities and simple kinetic approximations for the rate constants, it is shown that
numerical simulations of a high concentration predeposition do exhibit the anomalous
plateau, kink, and tail, characteristic of phosphorus diffusion. This model has been
implemented in 3-D using the numerical method of lines with the system integrator
LSODP. Results on several test problems indicate that for its linear solver, SIOM,
as well as several of the accelerators in the NSPCG package, the size of the Krylov
subspace must be taken sufficiently large to insure convergence in the inner Newton
iteration. Finally, analytic results for the partly-dissipative system are reviewed and
directions for future work given.

* Department of Mathematics, University of Texas at San Antonio, San Antonio, Texas 78249.
This work was supported in part by NSF Grant DMS-9024712.

2. Nonlinear Diffusion. High temperature diffusion is used to introduce
dopants such as phosphorus, arsenic, and boron into a silicon wafer in order to form
n-p junctions, as well as to anneal damage to the lattice caused by ion implantation.
When the impurity concentration $C$ is less than the intrinsic electron concentration
$n_i$, approximately $10^{18}$ cm$^{-3}$ at 1000°C, the heat equation represents dopant diffusion
well. For predeposition it is solved with a Dirichlet boundary condition at the top
surface of the wafer

(1) $C_t = D\,\Delta C$, $C(0, t) = C^*$ and $\frac{\partial C}{\partial x}(\infty, t) = 0$, $C(x, 0) = 0$.
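For constant $D$ in one dimension, problem (1) has the classical complementary error function solution $C(x, t) = C^*\,\mathrm{erfc}\left(x / 2\sqrt{Dt}\right)$. The short sketch below evaluates it; the diffusivity and surface concentration used in the example are hypothetical placeholders, not values from the paper.

```python
import math


def predeposition_profile(x_cm, t_s, D_cm2_s, C_star):
    """Solution of (1) in one dimension: C = C* erfc(x / (2 sqrt(D t)))."""
    if t_s <= 0.0:
        # Initial condition C(x, 0) = 0 in the bulk, C* imposed at the surface.
        return C_star if x_cm == 0.0 else 0.0
    return C_star * math.erfc(x_cm / (2.0 * math.sqrt(D_cm2_s * t_s)))


# Hypothetical illustration: D = 1e-14 cm^2/s, 10 minute predeposition.
D, t, C_star = 1e-14, 600.0, 1e18
diffusion_length = 2.0 * math.sqrt(D * t)  # cm
for depth in (0.0, 0.5 * diffusion_length, diffusion_length, 2.0 * diffusion_length):
    print(f"x = {depth:.3e} cm  C = {predeposition_profile(depth, t, D, C_star):.3e}")
```

A profile of this shape has no plateau or kink, which is why the concentration-dependent models discussed next are needed at high surface concentrations.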
As the ambient concentration $C^*$ increases, measured diffusivities show a marked
concentration dependence. To explain this, Schwettmann, Yoshida, and others
postulated that diffusion was more than a simple random-walk nearest neighbor
interchange mechanism. It is thermodynamically more favorable for an impurity atom to
bond with a vacancy, a lattice site not occupied by a silicon atom, and for the pair to move
through the lattice. Vacancies exist in several charge states, $V^0$, $V^+$, $V^-$, $V^=$, with
a density that is temperature and Fermi-level dependent. Diffusion depends on the
concentration of vacancies, and the concentration of, say, acceptor vacancies $[V^-]$ is
proportional to the electron concentration $n$. Assuming charge neutrality, $n$ depends
on $C$ as in $n = \frac{1}{2}\left(C + \sqrt{C^2 + 4n_i^2}\right)$, which gives a nonlinear diffusion equation

(2) $\frac{\partial C}{\partial t} = \nabla \cdot \left\{ D(C)\left[\nabla C + C\,\nabla(\ln n)\right] \right\}$.

Here $D(C)$ is a compound diffusivity

(3) $D(C) = D^0 + D^+\left(\frac{n_i}{n}\right) + D^-\left(\frac{n}{n_i}\right) + D^=\left(\frac{n}{n_i}\right)^2$

and $D^0$, $D^+$, $D^-$, and $D^=$ are the intrinsic diffusivities due to neutral, donor, acceptor,
and double acceptor vacancies, respectively. If more than one impurity is present,
there would be additional equations for the other impurities, and in the drift term
the concentration $C$ is replaced by the net electrically active concentration $N$. Using
(2) gives better results for extrinsic (high concentration) diffusion than (1), but still
fails to explain the pronounced nonlinearities that occur for a high-concentration
phosphorus source.
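The charge-neutrality relation and the compound diffusivity can be evaluated directly. The sketch below assumes the usual charge-state weighting for (3), with the donor term scaling as $n_i/n$ and the acceptor terms as $n/n_i$ and $(n/n_i)^2$; that weighting, the function names, and the numerical values are assumptions made for illustration rather than data from the paper.

```python
import math


def electron_concentration(C, n_i):
    """n = (C + sqrt(C^2 + 4 n_i^2)) / 2, from charge neutrality n = C + n_i^2 / n."""
    return 0.5 * (C + math.sqrt(C * C + 4.0 * n_i * n_i))


def compound_diffusivity(C, n_i, D_neutral, D_donor, D_acceptor, D_double):
    """Assumed form of (3): D0 + D+ (n_i/n) + D- (n/n_i) + D= (n/n_i)^2."""
    ratio = electron_concentration(C, n_i) / n_i
    return D_neutral + D_donor / ratio + D_acceptor * ratio + D_double * ratio ** 2


# Hypothetical intrinsic diffusivities (cm^2/s) and intrinsic concentration (cm^-3).
n_i = 1e18
coeffs = dict(D_neutral=1e-15, D_donor=0.0, D_acceptor=5e-16, D_double=1e-16)
for C in (1e16, 1e18, 1e20):
    n = electron_concentration(C, n_i)
    print(f"C = {C:.0e}  n/n_i = {n / n_i:.3f}  "
          f"D = {compound_diffusivity(C, n_i, **coeffs):.3e}")
```

For $C \ll n_i$ the ratio $n/n_i$ tends to one and $D$ is constant, recovering (1); for $C \gg n_i$ the double-acceptor term dominates and $D$ grows roughly like $C^2$.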
In addition to vacancies, interstitials - silicon atoms not residing on a lattice
site - are known to exist in numbers roughly equal to vacancies. Experiments such
as oxidation enhanced diffusion suggest that these species also aid in the diffusion
process. In 1974, Hu proposed that "P diffuses in Si via a dual mechanism, i.e., a
mixture of vacancy and interstitialcy mechanisms," by means of the reactions

$V + P_i \rightleftharpoons P_s + S$
$S + P_i \rightleftharpoons P_s + I$

where $S$ denotes a Si lattice atom, $P_s$ ($P_i$) substitutional (interstitial) phosphorus,
and $V$ ($I$) a vacancy (interstitial), respectively. Over the next fifteen years a number
of models were developed that refined this idea, including those of Mathiot-Pfister,
Morehead-Lever, Law-Dutton, and Mulvaney-Richardson. They solved for
concentrations of as many as three of the chemical species, using various assumptions about
equilibrium to simplify the model.

3. Diffusion-Limited Kinetics. A completely nonequilibrium model [1],[2] was
formulated in terms of the five reactions

;=0 <0> (R1)

;=0 P+V- (R2) (R3)
~ V- (R4) (R5)

This results in the quasilinear reaction diffusion system,

+ kneu {
+
[P ) +
n;
[e-]

(4) a[V-) V'. {Dv- (V'[V-) - [V-) V'ln(n)) }

with three analogous equations for interstitial species. Reaction R1 represents
bimolecular generation-recombination with rate constant $k_{bi}$; $\langle 0 \rangle$ denotes the result
of an interstitial silicon atom occupying a lattice site and annihilating the vacancy.
For reactions R2-R5, the forward and reverse rate constants are denoted by $k^f$ and
$k^r$ respectively. In the equation for electron concentration (denoted both by $[e^-]$ and
$n$), the last term dynamically enforces charge neutrality.
The single species model with a nonlinear diffusivity has been exchanged for a
large system each equation of which is quasilinear with constant diffusivity. The
kinetic model attempts to more accurately embody the physics and gives data about
species whose concentrations cannot be measured directly. This is at the expense of
more equations and a stiffer system. Boundary conditions on the upper surface are

(5)
a[p+v-] = 0
a[
(6) (2:) . [V-j<q [P+] = C*
ni ~

with, of course, analogous conditions for the interstitial species. In the Dirichlet
condition for P+, C* represents the concentration of phosphorus in the ambient gas.

Reflecting Neumann conditions are enforced for all species deep in the bulk. Initial
conditions are zero for all species except the electrons and defects, which are set to
their equilibrium levels.
For vacancies it is possible to estimate the diffusivities from first principles using
thermodynamics. For interstitials and pairs the data is nonexistent or of doubtful
accuracy, and experimental data of Mathiot-Pfister was used to give the following
values

For the reaction $A + B \rightleftharpoons C$, a simple kinetic argument due to Debye gives an estimate
for the forward rate constant for a diffusion-limited reaction of
(7)
where R is the encounter distance for A-B interaction. The rate constant for the
reverse reaction is approximated by

(8)

where n"
is the concentration of lattice sites and Eb is the binding energy of the
A-B pair. For reactions R4-R5 involving electrons this is precisely analogous to
Shockley-Read-Hall theory of recombination-generation in which
(9) $k^r = v_{th}\,\sigma_n\,n_i \exp\left((E_{V^-} - E_i)/kT\right)$
where $v_{th} \approx 10^7$ cm/sec is the thermal velocity of an electron, $\sigma_n \approx 10^{-15}$ cm$^2$
is the capture cross-section, and $E_{V^-}$ is the energy level of the acceptor vacancy.
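Equation (7) did not survive in this copy, so the following sketch assumes the standard diffusion-limited (Smoluchowski-type) estimate $k^f = 4\pi R (D_A + D_B)$ rather than quoting the paper's own expression. It shows how an encounter distance and a pair of diffusivities translate into a rate constant in cm$^3$ s$^{-1}$; the diffusivity value is hypothetical because Table I is not reproduced here.

```python
import math


def diffusion_limited_rate(R_cm, D_A, D_B):
    """Assumed standard form k_f = 4 pi R (D_A + D_B), in cm^3/s."""
    return 4.0 * math.pi * R_cm * (D_A + D_B)


def combined_diffusivity_for_rate(k_f, R_cm):
    """Invert the assumed formula to get D_A + D_B for a given rate."""
    return k_f / (4.0 * math.pi * R_cm)


R = 1e-7  # encounter distance in cm, the order of magnitude quoted in the text
# Hypothetical combined diffusivity (cm^2/s), chosen only to show the scale.
k_f = diffusion_limited_rate(R, 5.57e-8, 0.0)
print(f"k_f = {k_f:.2e} cm^3/s")
print(f"D_A + D_B for k_f = 7e-14: {combined_diffusivity_for_rate(7e-14, R):.2e} cm^2/s")
```

Under this assumed form, a rate of order $10^{-14}$ cm$^3$ s$^{-1}$ at an encounter distance of $10^{-7}$ cm corresponds to a combined diffusivity of order $10^{-8}$ to $10^{-7}$ cm$^2$ s$^{-1}$.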

Table II Reaction Constants at 900°C .

Reaction R2 R3 R4 R5
Forward (cm3s 1 ) 3.0 x 10 14 4.4 x 10 14 1.0 x 1O-1S 1.0 x 1O-1S
Reverse ( S-1 ) 1.4 x 10· 8.8 X 10 1 5.6 X lOw 1.7 X 1011

In order to obtain a marked plateau, the bimolecular generation-recombination rate,

kbi , was initially taken to be 1.7 x 1O-1 cm3 s-t, which is several orders of magnitude
greater than the one determined from (7). Using an encounter distance for vacancy-
interstitial interaction of roughly 10- 7 cm, the values of Dv and DJ from Table I give
an estimate of 7x 1O- 14 Cm3 S- 1 for kbi • It has been shown, however, that recombination
is strongly dependent on impurity concentration and this dependence will result in
a much larger effective rate. Note that there are four important alternate paths for
defect recombination

P+v- + JO ..... P+ + e- (R6) p+r + VO (R7)

P+v- + [- ..... P+ + 2e- (R8) p+r + v- ..... (R9)
[Figure 1: plot of concentration (cm$^{-3}$, logarithmic scale from $10^9$ to $10^{21}$) against distance (μm).]

Figure 1: Simulation of a phosphorus predeposition at 900°C for 10 minutes using the
eight-species nonequilibrium model in one dimension. This includes the additional
terms from reactions R6-R9. Crosses are experimental data of Yoshida et al., J. Appl.
Phys. p. 1498 (1974).

Inclusion of R6 in the model would result in an additional term of the form $-k_1 [P^+V^-][I^0]$
in the equation for $I^0$. At equilibrium

$[P^+V^-]^{eq} = 2 \times 10^{-16}\,[P^+]\left(\frac{n}{n_i}\right)[V^0]^{eq}$.

Given a surface phosphorus concentration of $3 \times 10^{20}$, $[P^+V^-] \approx 10^6\,[V^0]$ for the
region close to the surface and

For the reaction of $P^+V^-$ and $I^0$, orientation effects would be important as compared
with the direct recombination mechanism. Still, supposing that $k_1 \approx k_{bi}$, $k_{bi}^{eff}$ would
be $10^6$ times the value predicted from (7).
Numerical modeling using kinetic estimates for all rate constants ($k_{bi} = 10^{-14}$
cm$^3$ s$^{-1}$) and including reactions R6-R9 gives Figure 1 [3]. Note the pronounced
plateau, kink, and tail in the profile. Comparison between this and the five-species
model presented below suggests that the effect of charge is secondary. The primary
reason for the observed anomalies is the interaction between the two types of defects
and pairs via reactions R1, R6-R9.

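4. Numerical Solutions in Three Dimensions. The above diffusion models
were among several implemented in the one-dimensional process simulator PEPPER
[4] developed at Microelectronics and Computer Technology Corporation during
1985-1986. Work to extend these models to three dimensions is ongoing [5]. Results are
obtained for both the standard model (2) and a five-species kinetic model that results
from simplifying (4):

(10)
$\frac{\partial V}{\partial t} = D_V \nabla^2 V - k_E^f\,P \cdot V + k_E^r\,E - k_{bi}\left(V \cdot I - V^{eq} \cdot I^{eq}\right)$

$\frac{\partial I}{\partial t} = D_I \nabla^2 I - k_F^f\,P \cdot I + k_F^r\,F - k_{bi}\left(V \cdot I - V^{eq} \cdot I^{eq}\right)$

$\frac{\partial E}{\partial t} = D_E \nabla^2 E + k_E^f\,P \cdot V - k_E^r\,E$

$\frac{\partial F}{\partial t} = D_F \nabla^2 F + k_F^f\,P \cdot I - k_F^r\,F$

$\frac{\partial P}{\partial t} = -k_E^f\,P \cdot V + k_E^r\,E - k_F^f\,P \cdot I + k_F^r\,F$

Here $E$ represents the phosphorus-vacancy pair or E-center and $F$ denotes $P^+I^-$.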

The partial differential equations are discretized using the numerical method of lines
(NMOL). This continuous-in-time, discrete-in-space method converts (10) into a large
coupled nonlinear system of ordinary differential equations, which is then integrated
using the system integrator package LSODP (Livermore Solver for Ordinary Differential
equations - Projection), developed by Brown and Hindmarsh [6]. NMOL has proven
particularly effective on large reaction-diffusion problems; it is less effective when
strong convective terms are present. Members of the LSODE family are variable-order,
variable-step integrators that use a predictor-corrector scheme based on Backward
Differentiation Formulae to control error. This implicit method requires the solution
of a nonlinear algebraic system at each time step, which is performed via a Quasi (an
approximate Jacobian that is only updated as needed), Inexact (the linear system is
only approximately solved using the Scaled Incomplete Orthogonalization Method, SIOM)
Newton iteration. The ODE system is extremely stiff because: 1) there are over four
orders of magnitude difference in the diffusivities of the various species, as seen
from Table I; 2) a very fine grid is required to resolve the sharp gradients in defect
concentrations near the surface of the wafer, and it is the grid spacing which
determines the eigenvalues of the discrete Laplacian; and 3) the reaction kinetics are
strong, as is apparent from Table II. For convenience the simulation domain chosen is a
rectangular box with reflecting boundary conditions on all but the top side, where the
conditions (5)-(6) are enforced. Finite differences with a seven-point star discretize
the Laplacian and pointwise estimates are used for the reaction terms. The absolute and
relative tolerances are set at $10^{10}$ and 10% respectively and, unless otherwise
noted, the remaining LSODP parameters take their default values. A suite of three test
problems is used, with Problem A possessing symmetry in two directions so that it is
pseudo 1-D. Figure 2 shows the geometries for Problems B and C.
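
The method-of-lines procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PEPPER or LSODP implementation: it treats one species in one dimension with a hypothetical decay term -k u^2, uses fixed-step backward Euler (the first-order BDF) instead of a variable-order, variable-step BDF, and solves each Newton system with a dense direct solver instead of a Krylov iteration. All function names and parameter values are invented for the example.

```python
import numpy as np


def neumann_laplacian(n, h):
    """Second-difference matrix with reflecting (zero-flux) ends."""
    lap = np.zeros((n, n))
    for i in range(n):
        if i > 0:
            lap[i, i - 1] = 1.0
            lap[i, i] -= 1.0
        if i < n - 1:
            lap[i, i + 1] = 1.0
            lap[i, i] -= 1.0
    return lap / h**2


def rhs(u, lap, diff, k):
    """Semi-discrete right-hand side: diffusion plus a model reaction -k u^2."""
    return diff * (lap @ u) - k * u**2


def backward_euler_step(u, dt, lap, diff, k, tol=1e-12, max_iter=25):
    """One implicit step; the nonlinear system is solved by Newton's method."""
    v = u.copy()
    eye = np.eye(u.size)
    for _ in range(max_iter):
        residual = v - u - dt * rhs(v, lap, diff, k)
        # Exact Jacobian of the residual with respect to v.
        jac = eye - dt * (diff * lap - 2.0 * k * np.diag(v))
        dv = np.linalg.solve(jac, -residual)
        v = v + dv
        if np.linalg.norm(dv) <= tol * (1.0 + np.linalg.norm(v)):
            break
    return v


def integrate(u0, dt, steps, lap, diff, k):
    """Advance the ODE system a fixed number of implicit steps."""
    u = u0.copy()
    for _ in range(steps):
        u = backward_euler_step(u, dt, lap, diff, k)
    return u
```

With k = 0 the reflecting boundaries conserve the total amount of the species, which gives a simple check on the discretization.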

The SIOM algorithm [7] has very modest storage requirements and is "matrix free",
requiring only the matrix-vector product $Av$ for a given $v$. To solve $Ax = b$ it
takes a shifted Krylov subspace $K_m$ of $\mathbb{R}^n$ and seeks an approximate
solution $x^{(m)}$ which belongs to $K_m$ and such that the residual $r^{(m)}$ at
$x^{(m)}$ is orthogonal to $K_m$. An Arnoldi method recursively builds an orthonormal
basis $\{v_1, \ldots, v_m\}$ for $K_m$; "incomplete" refers to the fact that $v_k$ is
only orthogonalized against the previous $\{v_{k-1}, \ldots, v_{k-j}\}$. Saad has shown
that when $A$ is symmetric, SIOM reduces to conjugate gradient and in general is
equivalent to ORTHORES and ORTHOMIN.

Figure 2: Geometries for the three dimensional test problems. B possesses symmetry
along one axis and can be compared with output from a 2-D simulator. C represents
a problem that would be difficult to model using a sequence of 2-D simulations.
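
The incomplete orthogonalization idea can be illustrated with a short matrix-free sketch. It is not the SIOM routine used in LSODP: the scaling is omitted, the small projected system is solved directly, and the function name `iom_solve` and its arguments are hypothetical. The parameter `window` plays the role of j above, the number of most recent basis vectors against which each new vector is orthogonalized.

```python
import numpy as np


def iom_solve(matvec, b, m, window, x0=None):
    """Krylov projection with a truncated (incomplete) Arnoldi process.

    matvec : function returning A @ v, so A itself is never stored
    m      : maximum Krylov dimension
    window : number of most recent basis vectors used for orthogonalization
    """
    n = b.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    r0 = b - matvec(x0)
    beta = np.linalg.norm(r0)
    if beta == 0.0:
        return x0
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / beta
    k_used = m
    for k in range(m):
        w = matvec(V[:, k])
        # Orthogonalize only against the last `window` basis vectors.
        for i in range(max(0, k - window + 1), k + 1):
            H[i, k] = np.dot(w, V[:, i])
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-14 * beta:
            k_used = k + 1  # the subspace is invariant; stop early
            break
        V[:, k + 1] = w / H[k + 1, k]
    # Galerkin condition in the small basis: H_k y = beta * e_1.
    rhs = np.zeros(k_used)
    rhs[0] = beta
    y = np.linalg.solve(H[:k_used, :k_used], rhs)
    return x0 + V[:, :k_used] @ y
```

For a symmetric matrix a window of two already reproduces the three-term Lanczos recurrence, which is consistent with the remark that the method reduces to conjugate gradient in that case. For a nonsymmetric matrix a short window gives only an approximately orthogonal basis.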
Good results are achieved with the standard model with grids of up to 27,000
nodes. The kinetic model results in a much stiffer system, and to get convergence on
even the simple Problem A, the maximum dimension of the Krylov subspace must
be taken much larger than its default value of m = 5. Figure 3 shows the effect on
vacancy and interstitial profiles of changing the maximum Krylov dimension. The
integrator halted due to nonconvergence with m less than eight. For m = 10, a
solution is obtained, but it is unphysical and radically different from a 1-D PEPPER
simulation where a direct solver is used. Refining the grid does not improve the
situation: in Figure 4 a grid of 80 points is used in the z direction and m must be 30
before relatively flat profiles are obtained. Even if the Krylov subspace is large
enough for convergence, it may still be insufficient to produce a physically meaningful
solution. It is important to solve the linear system accurately at each timestep, and
this cannot be done simply by reducing the tolerance in the outer loop; one must enlarge
the subspace or cut back on the stepsize severely. For the SIOM algorithm in LSODP
the quantity Av's/Calls is the average Krylov dimension, $K_{avg}$, and represents a
measure of the efficiency of the method. Figure 5 shows $K_{avg}$ for both models on a
grid of 1000 points. As time increases the asymmetric Jacobian makes a larger
contribution to the linear system, effectively requiring a larger Krylov subspace to
obtain an accurate solution.
Figure 3: The effect on vacancy and interstitial profiles of increasing the Krylov
subspace dimension. This is Problem A for five species with 20 gridpoints in the z
direction. The lower curve of each pair is the vacancy, the upper the interstitial.
[Plot not reproduced. Axes: log of concentration versus distance (μ); curves labelled
m = 10 and m = 15.]

Figure 4: Same problem as in Figure 3, but with 80 gridpoints in the z direction.
Dim($K_m$) must be as large as 30 to achieve the expected flat defect profiles.
[Plot not reproduced. Axes: log of concentration versus distance (μ).]

Figure 5: Average Krylov dimension $K_{avg}$ as a function of time. The number
$K_{avg}$ is a measure of the relative effectiveness of the SIOM algorithm in solving
the linear system. [Plot not reproduced. Horizontal axis: time (sec).]

To compare SIOM with other accelerators, the calling sequence in LSODP is
modified so that routines from the NonSymmetric Preconditioned Conjugate Gradient
package [8] can be called. It uses various accelerator techniques such as Chebyshev
and generalized conjugate gradient. Four accelerators were chosen for comparison:
ORTHOMIN, GMRES, BCGS, and Minimum Error. Table III gives the runtimes
in seconds on a Sparcstation for the standard model (2) on a grid of one thousand
points, where F(u)'s is the number of evaluations of the RHS in (10), Av's is the
number of times the matrix-vector product is formed, and Calls is the number of
calls made to the iterative solver. All four algorithms performed well, with roughly
equivalent runtimes, although GMRES and ORTHOMIN were slightly faster.

TABLE III
Prob  Method       Time    F(u)'s   Av's   Calls
A     BCGS         132.7     314     244     69
      GMRES        109.3     253     183     69
      ME           135.2     314     244     69
      ORTHOMIN     107.7     253     183     69
B     BCGS        1452.0    3489    2784    704
      GMRES       1087.4    2589    1935    653
      ME          1448.3    3489    2784    704
      ORTHOMIN    1167.3    2793    2088    704
C     BCGS        1898.4    4544    3628    915
      GMRES       1555.7    3661    2739    921
      ME          1899.9    4544    3628    915
      ORTHOMIN    1537.6    3637    2721    915

Of more importance than the choice of a particular accelerator will be selection of an
effective preconditioner.
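
The ratio Av's/Calls, introduced earlier as the average Krylov dimension for SIOM, can also be formed from the columns of Table III. The short Python sketch below does this; reading the ratio as an average subspace dimension for the NSPCG accelerators is an assumption made here for illustration, and the dictionary layout and function name are hypothetical.

```python
# Rows of Table III: (problem, method) -> (time_s, rhs_evals, av_products, calls)
TABLE_III = {
    ("A", "BCGS"): (132.7, 314, 244, 69),
    ("A", "GMRES"): (109.3, 253, 183, 69),
    ("A", "ME"): (135.2, 314, 244, 69),
    ("A", "ORTHOMIN"): (107.7, 253, 183, 69),
    ("B", "BCGS"): (1452.0, 3489, 2784, 704),
    ("B", "GMRES"): (1087.4, 2589, 1935, 653),
    ("B", "ME"): (1448.3, 3489, 2784, 704),
    ("B", "ORTHOMIN"): (1167.3, 2793, 2088, 704),
    ("C", "BCGS"): (1898.4, 4544, 3628, 915),
    ("C", "GMRES"): (1555.7, 3661, 2739, 921),
    ("C", "ME"): (1899.9, 4544, 3628, 915),
    ("C", "ORTHOMIN"): (1537.6, 3637, 2721, 915),
}


def average_krylov_dimension(av_products, calls):
    """Matrix-vector products per call to the iterative solver."""
    return av_products / calls


# One ratio per row of the table.
K_AVG = {
    key: average_krylov_dimension(row[2], row[3])
    for key, row in TABLE_III.items()
}

if __name__ == "__main__":
    for (problem, method), value in sorted(K_AVG.items()):
        print(f"{problem} {method:9s} {value:.2f}")
```

For every row the ratio lies between roughly 2.6 and 4, with GMRES and ORTHOMIN at the lower end, in line with their slightly shorter runtimes.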

5. Conclusions and Future Work. Process modeling, and in particular modeling the
diffusion of impurities, is less mature than device modeling and represents an
area rich in open problems for applied mathematicians. As the trend towards Ultra
Large Scale Integration continues, there will be an even greater need for more
accurate physical models and numerical techniques to model diffusion, oxidation,
lithography, and vapor deposition. Detailed theoretical analyses of the
reaction-diffusion-convection equations that arise in process simulation are just
beginning. King [9] performs an asymptotic analysis for the standard model and the
phenomenological Fair-Tsai model for phosphorus. Yeager [10] examines a
"reactive-definite" system involving several species but only one reaction,
(11) $(u_i)_t = \Delta u_i \pm f(u_1, \ldots, u_n)$,
including the question of convergence of the discrete Newton method. Hollis and
Morgan [11] prove that the five-species system (10) satisfies their theory of
partly-dissipative reaction-diffusion systems
(12) $U_t = D \Delta U + F(U)$,
where $D = \mathrm{Diag}(d_1, \ldots, d_m)$ with $d_i \ge 0$, because the function F
corresponding to (10) satisfies conditions of 1) quasipositivity, 2) the intermediate
sum condition of Morgan, and 3) polynomial growth. They observe that the boundary
condition in (6) for substitutional phosphorus should be replaced by Dirichlet
conditions for the pairs, since $P^{+}$ has zero diffusivity. From a numerical
standpoint either set of conditions is enforced via a penalty method and chemical
equilibrium is very quickly achieved at the wafer surface, so that similar results are
obtained.
We have presented a hierarchy of models for phosphorus diffusion in Si, culminating
with an eight-species reaction-diffusion system. It has been shown that this
model can be implemented in three dimensions using current technology on grids of
$20^3$ points. As the mesh is further refined, runtimes become prohibitive for all the
iterative solvers and for both models. This is explained by noting that when the
Laplacian is discretized, the resulting matrix A has a spectrum that depends on the
mesh spacing. A becomes increasingly ill-conditioned as the mesh size goes to zero.
This carries over to the operators $\nabla \cdot (D(C) \nabla C)$ and $D \Delta U$, and
gives rise to stiffer ODE systems. Our results suggest that preconditioning will be
very important in solving these problems on realistic meshes. Future work will include
preconditioning with the strategies of Bramble and also Neuberger. A detailed theory of
convergence for the discretized system has yet to be worked out. This analysis should
be made for very irregular grids, yet with realistic boundary conditions such as
injection of interstitials during oxidation. Precise a posteriori error estimates would
allow for accurate grid refinement as part of the solution process.
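
The dependence of the conditioning on the mesh spacing can be seen in a small experiment. The Python sketch below uses the standard one-dimensional second-difference matrix with Dirichlet ends as a stand-in for the three-dimensional seven-point operator; that simplification and the function names are assumptions made for illustration.

```python
import numpy as np


def dirichlet_laplacian(n):
    """1-D second-difference matrix on n interior points of the unit interval."""
    h = 1.0 / (n + 1)
    matrix = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return matrix, h


def condition_number(n):
    """Ratio of the largest to the smallest eigenvalue of the symmetric matrix."""
    matrix, _ = dirichlet_laplacian(n)
    eigenvalues = np.linalg.eigvalsh(matrix)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues[0]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        _, h = dirichlet_laplacian(n)
        print(f"h = {h:.4f}  cond = {condition_number(n):.1f}")
```

Halving the mesh spacing multiplies the condition number by roughly four, so the spread of time scales in the semi-discrete system grows like the inverse square of the mesh spacing.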

REFERENCES

[1] W. B. Richardson and B. J. Mulvaney, Plateau and kink in P profiles diffused into Si
- A Result of Strong Bimolecular Recombination?, Appl. Phys. Lett., 53(1988), pp.
1917-1919.

[2] W. B. Richardson and B. J. Mulvaney, Nonequilibrium behavior of charged point defects
during phosphorus diffusion in silicon, J. Appl. Phys., 65(1989), pp. 2243-2247.

[3] B. J. Mulvaney and W. B. Richardson, The effect of concentration-dependent defect
recombination reactions on phosphorus diffusion in silicon, J. Appl. Phys., 67(1990),
pp. 3197-3199.

[4] B. J. Mulvaney, W. B. Richardson, and T. Crandle, PEPPER - A Process Simulator
for VLSI, IEEE Trans. on Computer-Aided Design, 8(1989), pp. 336-349.

[5] W. B. Richardson, G. F. Carey, and B. J. Mulvaney, Modeling Phosphorus Diffusion in
Three Dimensions, IEEE Trans. on Computer-Aided Design, 11(1992), pp. 487-496.

[6] P. N. Brown and A. C. Hindmarsh, SIAM J. Numer. Anal., 24(1987), pp. 610.

[7] Y. Saad, Krylov Subspace Methods for Solving Large Unsymmetric Linear Systems,
Math. of Comp., vol. 37, 105(1981).

[8] T. C. Oppe, W. D. Joubert, and D. R. Kincaid, Center for Numerical Analysis Report
CNA-216, University of Texas.

[9] J. R. King, On the diffusion of point defects in silicon, SIAM J. Appl. Math., 49(14),
1989, pp. 1018-1101.

[10] H. R. Yeager and R. W. Dutton, An Approach to Solving Multiparticle Diffusion
Exhibiting Nonlinear Stiff Coupling, IEEE Trans. Electron Devices, vol. ED-32, pp.
1964-1976, 1985.

[11] S. L. Hollis, J. J. Morgan, and W. B. Richardson, Partly Dissipative Reaction-Diffusion
Systems and a Model of Phosphorus Diffusion in Silicon, submitted.
ATOMIC DIFFUSION IN GaAs WITH CONTROLLED DEVIATION
FROM STOICHIOMETRY
KEN SUTO* AND JUN-ICHI NISHIZAWA **
Abstract. Although several models for atomic diffusion in GaAs have been presented, they
have not paid much attention to the effect of the arsenic vapor pressure, i.e., the
deviation from stoichiometry, and some of them are thought to be unrealistic.
On the other hand, we have shown that crystal growth from solution or melt under
applied vapor pressure, i.e., the temperature difference method under controlled vapor
pressure, can be explained by the equality of the arsenic chemical potentials, and that
the dominant point defects are arsenic interstitial atoms and arsenic vacancies, not
gallium vacancies and gallium interstitials.
On the basis of this theory, we present models for atomic diffusion of impurities and
point defects. We discuss sulfur diffusion, self-diffusion of gallium and arsenic, and
silicon diffusion. The models explain the arsenic vapor pressure dependences well, and
comparison with known experiments gives reasonable values for formation energies and
migration energies. We also discuss impurity-enhanced diffusion at a heterostructure
interface where sharp discontinuities of gallium and aluminum chemical potentials are
present.

Key words. chemical potentials, interstitials and vacancies

AMS(MOS) subject classifications.

1. Introduction.
The deviation from stoichiometry was found to greatly change the material
properties. In the case of GaAs, the heat-treatment experiment under applied
arsenic pressure showed the existence of an exact stoichiometric vapor pressure,
and it was found that interstitial arsenic atoms and arsenic vacancies were the dominant
point defects governing the deviation from stoichiometry [1-3]. On the other hand,
liquid phase and melt growth experiments were made with arsenic vapor pressure applied
to the solution [3-8], and it was found that the deviation from
stoichiometry was controlled by the applied vapor pressure and that stoichiometric
crystals were segregated at just the same arsenic vapor pressure as that of the
heat-treatment. The grown crystals were highly perfect. This growth method was called
the temperature difference method under controlled vapor pressure (TDM-CVP).
Figure 1 illustrates the experimental methods of the heat-treatment and TDM-CVP
growth.
The stoichiometry-controlled crystal growth in TDM-CVP was found to be due
to the increased saturation solubility of the solution under applied vapor pressure
[6,7], and it was found that the equality of the chemical potentials of arsenic holds
between the three phases. The details of the experiments and of the chemical potential
approach are given in Reference [7].
Atomic diffusion in GaAs is also thought to be greatly influenced by the
deviation from stoichiometry. However, most of the discussions so far have not
paid much attention to this point. In this paper, we discuss atomic diffusion
on the basis of the chemical potential approach developed for TDM-CVP. We
deal with diffusion of sulfur, self-diffusion of arsenic and gallium, and also diffusion
of silicon. As we show in the following sections, these are typical of interstitial
diffusion, diffusion via arsenic vacancies, and diffusion with site transfer. In these
discussions we assume that the corresponding point defects are in thermal equilibrium
under the applied vapor pressure. In the final section, however, we deal with atomic
diffusion at a hetero-interface, which is an example in which equality of the chemical
potentials does not hold, and this causes characteristic interface mixing phenomena.

*Department of Materials Science, Faculty of Engineering, Tohoku University, Aoba Aramaki,
Aoba-ku, Sendai 980, JAPAN.
**Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980, JAPAN.

Figure 1a. Heat-treatment of GaAs under controlled arsenic vapor pressure.
[Drawing not reproduced. Labels: two-zone furnace, ampoule, As.]

Figure 1b. Schematic diagram of the stoichiometry-controlled crystal growth method
(TDM-CVP). [Drawing not reproduced. Labels: source crystals at T1, solution, GaAs
substrate at T2; axis: temperature.]

2. The stoichiometry control by the applied arsenic vapor pressure.

In the TDM-CVP crystal growth, and also in the heat-treatment of GaAs
under applied arsenic vapor pressure, the deviation from the stoichiometry of the
crystal is determined by the equality of the arsenic chemical potentials. That is,

(2.1) $\mu_{As}^{solid} = \mu_{As}^{liquid} = \mu_{As}^{gas}$ (for TDM-CVP)

(2.2) $\mu_{As}^{solid} = \mu_{As}^{gas}$ (for heat-treatment).

There are four kinds of point defects which can cause the deviation from
stoichiometry, apart from antilattice defects. They are the arsenic interstitial atom,
$I_{As}$, the arsenic vacancy, $V_{As}$, the gallium interstitial, $I_{Ga}$, and the
gallium vacancy, $V_{Ga}$. However, we have experimentally shown that $I_{As}$ and
$V_{As}$ are dominant, while $I_{Ga}$ and $V_{Ga}$ are present at much lower
concentrations. In such a case, the $As_4$ vapor pressure which gives the exact
stoichiometry (which we call the optimum vapor pressure in TDM-CVP) is determined by
the equality of the concentrations of $I_{As}$ and $V_{As}$. Figure 2 shows the
experimental stoichiometric vapor pressures in TDM-CVP and in the heat-treatment,
together with the calculation based on $I_{As}$ and $V_{As}$. These three curves
are in very good agreement. Table 1 gives the free energies of formation for $I_{As}$
and $V_{As}$ adopted for the calculation, which were determined from the change in the
lattice parameter. Recently, photocapacitance studies have shown [9-11] that the
formation energy of the interstitial arsenic atoms is $\Delta H^{F'}_{I_{As}}$ = 1.1 eV,
which is essentially the same as the value adopted in the calculation. From these
results, the formation energies of $I_{As}$ and $V_{As}$ listed in Table 1 are thought
to be reliable enough.

$I_{As}$    1.52 eV    1.19 eV (1.1 eV*)    10.48 e.u.
$V_{As}$    1.38 eV    1.71 eV               1.18 e.u.

*Reference [10]

Table 1. Free energy parameters for arsenic interstitial atoms and arsenic vacancies
in GaAs.

Figure 2. Optimum arsenic vapor pressure as a function of temperature.
[Plot not reproduced. Horizontal axis: 10^3/T (K^-1), 0.6 to 1.0, with a top scale of
T (°C) from 1200 to 600. Legend: heat-treatment; crystal growth. Curve labels:
2.6 x 10^6 exp(-.../kT); calculation, 1.64 x 10^6 exp(-1.04 eV/kT).]

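We consider the following reactions for the formation of $I_{As}$ and $V_{As}$:

(2.3) As (surface, solid) = $I_{As}$; $\Delta G^{F}_{I_{As}}$

(2.4) As (lattice site) = $V_{As}$ + As (surface, solid); $\Delta G^{F}_{V_{As}}$

$\Delta G^{F}_{I_{As}}$ and $\Delta G^{F}_{V_{As}}$ are the free energy differences per
molecule, and are described as

(2.5)
$\Delta G^{F}_{I_{As}} = \Delta H^{F}_{I_{As}} - T \Delta S^{F}_{I_{As}}$
$\Delta G^{F}_{V_{As}} = \Delta H^{F}_{V_{As}} - T \Delta S^{F}_{V_{As}}$

where $\Delta H^{F}_{I_{As}}$ and $\Delta H^{F}_{V_{As}}$ are the enthalpies, and
$\Delta S^{F}_{I_{As}}$ and $\Delta S^{F}_{V_{As}}$ are the entropies of vibration.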
In order to refer to the applied $As_4$ vapor pressure, we need the following reaction
equation:

(2.6) As (surface, solid) = $\tfrac{1}{4} As_4$ (gas); $\Delta G^{sub}_{As}$

where $\Delta G^{sub}_{As}$ is the free energy of sublimation of elemental arsenic. If
we write the reactions of formation of $I_{As}$ and $V_{As}$ as

(2.7) $\tfrac{1}{4} As_4$ (gas) = $I_{As}$; $\Delta G^{F'}_{I_{As}}$

(2.8) As (lattice site) = $V_{As}$ + $\tfrac{1}{4} As_4$ (gas); $\Delta G^{F'}_{V_{As}}$

we have the following relations from Equations (2.3) to (2.8):

(2.9)
$\Delta G^{F'}_{I_{As}} = \Delta G^{F}_{I_{As}} - \Delta G^{sub}_{As}$
$\Delta G^{F'}_{V_{As}} = \Delta G^{F}_{V_{As}} + \Delta G^{sub}_{As}$

In these expressions F and F' denote free energies referred to solid-phase and
gas-phase arsenic, respectively. When the chemical potential of the gas phase is equal
to that of the solid phase, we have the following equations from Equations (2.7) and
(2.8).

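(2.10) $[I_{As}] = P_{As_4}^{1/4} \exp\left( -\Delta G^{F'}_{I_{As}} / kT \right)$

(2.11) $[V_{As}] = P_{As_4}^{-1/4} \exp\left( -\Delta G^{F'}_{V_{As}} / kT \right)$

In these expressions, $[I_{As}]$ and $[V_{As}]$ are defined as
$[I_{As}] = N_{I_{As}} / N^{0}_{I_{As}}$ and $[V_{As}] = N_{V_{As}} / N^{0}_{V_{As}}$,
where $N_{I_{As}}$ and $N_{V_{As}}$ are the concentrations of $I_{As}$ and $V_{As}$, and
$N^{0}_{I_{As}}$ and $N^{0}_{V_{As}}$ are the concentrations of all available
interstitial and substitutional arsenic sites.
Assuming that $N^{0}_{I_{As}} = N^{0}_{V_{As}}$, the $As_4$ vapor pressure corresponding
to the exact stoichiometry, which we call the optimum vapor pressure $P^{opt}_{As_4}$,
can be obtained from the condition that $[I_{As}] = [V_{As}]$.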
That is, from Equations (2.10) and (2.11) we have

(2.12)
$$
P^{opt}_{As_4}
= \exp\left( -\frac{4 \Delta G^{sub}_{As}}{kT} \right)
  \exp\left\{ -\frac{2 (\Delta G^{F}_{V_{As}} - \Delta G^{F}_{I_{As}})}{kT} \right\}
= \exp\left\{ -\frac{2 (\Delta G^{F'}_{V_{As}} - \Delta G^{F'}_{I_{As}})}{kT} \right\}.
$$

Figure 2 shows the calculation by this equation with the parameters listed
in Table 1. As for the gallium vacancies, $V_{Ga}$, and gallium interstitials,
$I_{Ga}$, similar expressions are possible, though their concentrations are thought to
be much smaller.
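That is, the formation energies are defined from the following reactions,

(2.13) Ga (surface, solid) = $I_{Ga}$; $\Delta G^{F}_{I_{Ga}}$

(2.14) Ga (lattice site) = $V_{Ga}$ + Ga (surface, solid); $\Delta G^{F}_{V_{Ga}}$.

In order to refer to the $As_4$ vapor pressure, we use the following equation

(2.15) GaAs (solid) = Ga (solid) + $\tfrac{1}{4} As_4$ (gas); $\Delta G^{sub}_{GaAs}$

where $\Delta G^{sub}_{GaAs}$ is the free energy of sublimation of the GaAs solid
phase. Then, we have

(2.16) GaAs (solid) = $I_{Ga}$ + $\tfrac{1}{4} As_4$ (gas); $\Delta G^{F'}_{I_{Ga}}$

(2.17) $\tfrac{1}{4} As_4$ = $V_{Ga}$ + As (lattice site); $\Delta G^{F'}_{V_{Ga}}$

with

(2.18)
$\Delta G^{F'}_{I_{Ga}} = \Delta G^{F}_{I_{Ga}} + \Delta G^{sub}_{GaAs}$
$\Delta G^{F'}_{V_{Ga}} = \Delta G^{F}_{V_{Ga}} - \Delta G^{sub}_{GaAs}$

Equations (2.16) and (2.17) give the arsenic vapor pressure and temperature
dependences of $[I_{Ga}]$ and $[V_{Ga}]$ as follows.

(2.19) $[I_{Ga}] = P_{As_4}^{-1/4} \exp\left( -\Delta G^{F'}_{I_{Ga}} / kT \right)$

(2.20) $[V_{Ga}] = P_{As_4}^{1/4} \exp\left( -\Delta G^{F'}_{V_{Ga}} / kT \right)$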
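
The mass-action relations of this section can be checked numerically. The Python sketch below evaluates the arsenic interstitial and vacancy fractions and the optimum pressure of Equation (2.12). It rests on assumptions that are not stated in the text: the 1.19 eV and 1.71 eV entries of Table 1 are taken to be the gas-referenced enthalpies, the entropy terms are dropped, and pressure is measured relative to an unspecified standard state. The function names are hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def interstitial_fraction(pressure, dg_i, temperature):
    """[I_As] proportional to P^(1/4), from the reaction (1/4) As4 (gas) = I_As."""
    return pressure**0.25 * math.exp(-dg_i / (K_B * temperature))


def vacancy_fraction(pressure, dg_v, temperature):
    """[V_As] proportional to P^(-1/4), from As (lattice) = V_As + (1/4) As4 (gas)."""
    return pressure**-0.25 * math.exp(-dg_v / (K_B * temperature))


def optimum_pressure(dg_i, dg_v, temperature):
    """Pressure at which [I_As] = [V_As], as in Equation (2.12)."""
    return math.exp(-2.0 * (dg_v - dg_i) / (K_B * temperature))


def arrhenius_energy(dg_i, dg_v, t1=1100.0, t2=1400.0):
    """Activation energy (eV) of the optimum pressure between two temperatures."""
    p1 = optimum_pressure(dg_i, dg_v, t1)
    p2 = optimum_pressure(dg_i, dg_v, t2)
    return -K_B * math.log(p1 / p2) / (1.0 / t1 - 1.0 / t2)


if __name__ == "__main__":
    # Assumed reading of Table 1: enthalpies only, entropy neglected.
    print(arrhenius_energy(1.19, 1.71))
```

Under that reading the activation energy of the optimum pressure is 2(1.71 - 1.19) = 1.04 eV, the same exponent that is printed on the calculated curve of Figure 2.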

3. Diffusion of sulfur.
There is a detailed experiment on the diffusion of sulfur in GaAs under applied
arsenic vapor pressures by Young and Pearson [12]. The diffusion coefficient D
increases as the square root of the $As_4$ vapor pressure, but it saturates at higher
vapor pressure, as shown in Figure 3. They proposed that a complex of a gallium
divacancy and a sulfur donor is responsible for the diffusion. However, as pointed
out in Sec. 2, the equilibrium concentration of gallium divacancies should be
so small that they could not be the dominant migrating species. On the other hand,
B. Tuck assumed arsenic vacancies in his calculation of the diffusion profile [13].
But we think that arsenic vacancies cannot explain the vapor pressure dependence
of D, in spite of his assertion.
In place of these, we propose an interstitial sulfur diffusion model. First, it
should be pointed out that the crossover point of the square-root line and the
constant-D line is close to the optimum vapor pressure described in Section 2,
both at T = 1000°C and 1130°C, as denoted by arrows in Figure 3. The model
must explain this fact, as well as the vapor pressure dependence.
We assume that a substitutional sulfur donor $S_{As}^{+}$ changes to an interstitial
sulfur, $I_{S}^{+}$, and an arsenic vacancy, $V_{As}^{0}$, or we assume that, in the
presence of an interstitial arsenic atom $I_{As}^{0}$, $S_{As}^{+}$ changes to an
interstitial molecular complex $(I_S \cdot I_{As})^{+}$ and an arsenic vacancy.

Figure 3. Diffusion coefficient of sulfur in GaAs by Young and Pearson [12]. Arrows
indicate the optimum arsenic vapor pressures by Nishizawa, Okuno and Tadano [5].
[Plot not reproduced. Horizontal axis: $P_{As_4}$ (atm). Legend: • 1003°C;
× 1003°C (preannealed); ○ 1130°C.]

The following reaction equations should hold in the above two cases, respectively.

(3.1) $S_{As}^{+} \rightleftharpoons I_{S}^{+} + V_{As}^{0}$; $\Delta G_{i1}$

(3.2) $S_{As}^{+} + I_{As}^{0} \rightleftharpoons (I_S \cdot I_{As})^{+} + V_{As}^{0}$; $\Delta G_{i2}$.

We have assumed that the charge state of sulfur does not change in this reaction,
because no strong concentration dependence such as that seen in Zn diffusion [17] is
observed. Also, the charge states of arsenic vacancies and arsenic interstitial
atoms have been assumed to be neutral, as was described in Section 2.
For interstitial diffusion under thermal equilibrium of $V_{As}$ and $I_{As}$, the
former equation gives a $(P_{As_4})^{1/4}$ dependence of D, while the latter gives a
$(P_{As_4})^{1/2}$ dependence. Because of the observed vapor pressure dependence, and
also because the diffusion coefficient of sulfur is much smaller than that of zinc, for
which it is well established that zinc interstitials are isolated atoms [14], it is
reasonable to assume that an interstitial sulfur and an interstitial arsenic atom
form a molecule-like complex as described by Equation (3.2).
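Following the diffusion equation which was established in the case of zinc diffusion,
the diffusion equation for sulfur is given by

(3.3) $\frac{\partial}{\partial t} (N_{sub} + N_{inter}) = \frac{\partial}{\partial x} \left( D'_{sub} \frac{\partial N_{sub}}{\partial x} + D'_{inter} \frac{\partial N_{inter}}{\partial x} \right)$

where $N_{sub}$ and $N_{inter}$ are the concentrations of substitutional and
interstitial species and $D'_{sub}$ and $D'_{inter}$ are their intrinsic diffusion
coefficients, respectively. If we can assume that $N_{sub} \gg N_{inter}$, and the
reaction given by Equation (3.2) under the applied arsenic vapor pressure is fast
enough, we get

(3.4) $\frac{\partial N_{sub}}{\partial t} = \frac{\partial}{\partial x} \left[ \left( D'_{sub} + D'_{inter} \frac{\partial N_{inter}}{\partial N_{sub}} \right) \frac{\partial N_{sub}}{\partial x} \right]$

which gives the following effective diffusion coefficient, $D_{sub}$, for
substitutional sulfur:

(3.5) $D_{sub} = D'_{sub} + D'_{inter} \frac{\partial N_{inter}}{\partial N_{sub}}$.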

As will be discussed in the next section, the substitutional impurity diffusion
via vacancies that is typical in elemental semiconductors is thought to be unlikely
in covalent compound semiconductors. Therefore, we consider only the
second term, i.e.,

(3.6) $D_{sub} \cong D'_{inter} \frac{\partial N_{inter}}{\partial N_{sub}}$.

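Under the arsenic chemical potential given by the applied arsenic vapor pressure,
the equilibrium for Equation (3.2) gives the following equation:

(3.7) $\frac{[(I_S \cdot I_{As})^{+}] \, [V_{As}^{0}]}{[S_{As}^{+}] \, [I_{As}^{0}]} = \exp\left( -\frac{\Delta G_{i2}}{kT} \right)$

where $[S_{As}^{+}]$ is the concentration of $S_{As}^{+}$ relative to the concentration
of all available sites for $S_{As}^{+}$, and the other symbols $[I_{As}^{0}]$,
$[V_{As}^{0}]$ and $[(I_S \cdot I_{As})^{+}]$ are likewise relative concentrations of
each species.
Assuming that the densities of all available sites for $S_{As}^{+}$ and
$(I_S \cdot I_{As})^{+}$ are the same, we get

(3.8) $\frac{\partial N_{inter}}{\partial N_{sub}} = \frac{[I_{As}^{0}]}{[V_{As}^{0}]} \exp\left( -\frac{\Delta G_{i2}}{kT} \right)$

so that the effective diffusion constant for substitutional sulfur is given by

(3.9) $D_{sub} = D'_{inter} \frac{[I_{As}^{0}]}{[V_{As}^{0}]} \exp\left( -\frac{\Delta G_{i2}}{kT} \right)$.

If thermal equilibrium is established under the applied $As_4$ vapor pressure,
$[I_{As}^{0}]$ and $[V_{As}^{0}]$ are given by Equations (2.10) and (2.11) in Section 2,
respectively. Then, $D_{sub}$ is expressed as follows using $P_{As_4}$ and the free
energies of formation for $I_{As}$, $V_{As}$ and $(I_S \cdot I_{As})^{+}$:

(3.10) $D_{sub} = D'_{inter} \, P_{As_4}^{1/2} \exp\left\{ -\frac{\Delta G^{F'}_{I_{As}} - \Delta G^{F'}_{V_{As}} + \Delta G_{i2}}{kT} \right\}$.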

On the other hand, the stoichiometric vapor pressure $P^{opt}_{As_4}$ is given by
Equation (2.12), at which $[I_{As}] = [V_{As}]$ holds. Therefore, the diffusion constant
at the stoichiometric vapor pressure, $D^{opt}_{sub}$, is given by

(3.11) $D^{opt}_{sub} = D'_{inter} \exp\left( -\frac{\Delta G_{i2}}{kT} \right)$.

$D'_{inter}$ is approximately given by the following form using the free energy of
migration of the interstitial molecule, $\Delta G_{m2} = \Delta H_{m2} - T \Delta S_{m2}$:

(3.12) $D'_{inter} \cong \frac{1}{6} d^2 \nu \exp\left( \frac{\Delta S_{m2}}{k} \right) \exp\left( -\frac{\Delta H_{m2}}{kT} \right)$

where d is a distance corresponding to a jump for interstitial migration, and $\nu$ is a
jump frequency.
Therefore, the activation energy $Q^{opt}_{S}$ for the sulfur diffusion under the
stoichiometric vapor pressure $P^{opt}_{As_4}$ is given by

$Q^{opt}_{S} = \Delta H_{m2} + \Delta H_{i2}$

where $\Delta H_{i2}$ is the enthalpy part of $\Delta G_{i2}$.

Young and Pearson gave $Q_S$ = 2.6 eV in the region where the vapor pressure
dependence of the diffusion coefficient saturates, which should be close to
$Q^{opt}_{S}$, as was shown in Figure 3. As a rough estimate of $\Delta H_{m2}$ and
$\Delta H_{i2}$, we note that the migration energies of self-interstitial diffusion in
silicon and germanium were estimated to be very small, in the range of 0 to 0.3 eV
[15,16]. In the case of an interstitial molecule, $\Delta H_{m2}$ should be considerably
larger than 0.3 eV.
There are two important interstitial sites, which have tetrahedral and hexagonal
symmetries as illustrated in Figure 4. In the elemental semiconductors Si and Ge, it
is considered that the tetrahedral site has a lower energy, so that the free energy
for migration corresponds to a jump through the hexagonal interstitial site [19].
Therefore, as a possible model of the interstitial complex, the two atoms S and As are
in neighbouring tetrahedral interstitial sites as shown in Figure 4, and they can
move as a whole, with each atom moving through a hexagonal saddle point, to one
of four equivalent neighbouring locations. Therefore, $\Delta H_{m2}$ for a molecule
should not be much larger than twice $\Delta H_{m}$ for isolated atoms; that is, we
roughly estimate that $\Delta H_{m2} \approx 0.6$ eV.
As for $\Delta H_{i2}$, it should be smaller than the activation energy for a Frenkel
pair of $I_{As}$ and $V_{As}$, that is

(3.13) As (lattice) $\rightleftharpoons I_{As} + V_{As}$; $\Delta G^{F}_{Frenkel} = \Delta G^{F}_{I_{As}} + \Delta G^{F}_{V_{As}}$.

From Table 1, $\Delta H^{F}_{Frenkel}$ = 2.9 eV is obtained and we get
$\Delta H_{i2} < \Delta H_{i1} < 2.9$ eV.
Therefore, it is understood that $Q^{opt}_{S} = \Delta H_{m2} + \Delta H_{i2}$ = 2.6 eV
is in the reasonable range.
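
The arithmetic behind this estimate can be written out explicitly. The block below is a worked check only; it assumes the rough value of 0.6 eV for the migration enthalpy quoted above and sums the enthalpy entries of Table 1, where either pair of columns gives the same Frenkel energy.

```latex
\begin{aligned}
\Delta H^{F}_{Frenkel} &= 1.52 + 1.38 = 1.19 + 1.71 = 2.90\ \text{eV},\\
\Delta H_{i2} &\approx Q^{opt}_{S} - \Delta H_{m2} \approx 2.6 - 0.6 = 2.0\ \text{eV} < 2.9\ \text{eV}.
\end{aligned}
```

The implied value of about 2.0 eV for the reaction enthalpy is therefore below the Frenkel-pair bound, as the argument requires.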

In contrast to the present sulfur case, in the case of Zn diffusion it was assumed that
zinc interstitials are isolated atoms. If the attractive interaction between
zinc and arsenic interstitial atoms is too strong, a stable complex may not be formed,
because they tend to occupy a single interstitial cell, so that they disrupt
the lattice and one of them returns to a substitutional site. We assume that the
molecular complex $(I_S \cdot I_{As})^{+}$ is stable because of a weaker interaction
between $I_S$ and $I_{As}$, as illustrated in Figure 4.

Figure 4. Interstitial sites in the GaAs lattice, and a model of a complex of sulfur
and arsenic interstitial atoms. [Drawing not reproduced.]

3.1 Saturation of the diffusion coefficient of sulfur at a high arsenic
vapor pressure. It was shown that the density of stacking faults in GaAs
grown by TDM-CVP rapidly increases with increasing arsenic vapor pressure
when it exceeds the optimum vapor pressure [3]. This fact suggests that arsenic
interstitial atoms tend to aggregate with each other to form larger-scale defects such
as dislocations, stacking faults, and precipitates at vapor pressures exceeding
the optimum vapor pressure [3,7].
We assume here that, at a high arsenic vapor pressure, an irreversible aggregation
reaction takes place while point defects such as arsenic interstitial atoms and arsenic
vacancies are still nearly in equilibrium with the applied vapor pressure. If we
assume that an aggregation reaction takes place when an interstitial molecule finds an
arsenic interstitial atom, the reaction can be expressed as

(3.14) $(I_S \cdot I_{As})^{+} + I_{As}^{0} \xrightarrow{k} \{2As + S\}$

where k is the reaction rate, and {2As + S} denotes the smallest aggregated
configuration. Then, we have the following rate equation,

(3.15) $\frac{d [(I_S \cdot I_{As})^{+}]}{dt} = -k \, [(I_S \cdot I_{As})^{+}] \, [I_{As}^{0}]$.

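Also, the reaction described by Equation (3.2) can be expressed by using the rate
constants $k_1$ and $k_2$ as follows:

(3.16) $(I_S \cdot I_{As})^{+} + V_{As}^{0} \underset{k_1}{\overset{k_2}{\rightleftharpoons}} S_{As}^{+} + I_{As}^{0}$

In a steady state, the net generation rate of $(I_S \cdot I_{As})^{+}$ equals 0, i.e.,

(3.17) $k_1 [S_{As}^{+}] [I_{As}^{0}] - k_2 [(I_S \cdot I_{As})^{+}] [V_{As}^{0}] - k [(I_S \cdot I_{As})^{+}] [I_{As}^{0}] = 0$.

Then, we have

(3.18) $[(I_S \cdot I_{As})^{+}] = \frac{k_1 [S_{As}^{+}] [I_{As}^{0}]}{k_2 [V_{As}^{0}] + k [I_{As}^{0}]}$.

Therefore, the equivalent diffusion coefficient is given by

(3.19) $D_{sub} = D'_{inter} \frac{\left( \partial N_{inter} / \partial N_{sub} \right)_0}{1 + \frac{k}{k_1} \left( \partial N_{inter} / \partial N_{sub} \right)_0}$

with

$\left( \frac{\partial N_{inter}}{\partial N_{sub}} \right)_0 = \frac{[I_{As}^{0}]}{[V_{As}^{0}]} \exp\left( -\frac{\Delta G_{i2}}{kT} \right) = P_{As_4}^{1/2} \exp\left\{ -\frac{\Delta G^{F'}_{I_{As}} - \Delta G^{F'}_{V_{As}} + \Delta G_{i2}}{kT} \right\}$.

$\left( \partial N_{inter} / \partial N_{sub} \right)_0$ is just the same as was given
by Equation (3.8).
The diffusion coefficient in the saturation region is given by
$D^{sat}_{sub} = D'_{inter} \, k_1 / k$, and the critical vapor pressure is given by

(3.20) $P^{critical}_{As_4} = \left( \frac{k_1}{k} \right)^2 \exp\left\{ \frac{2 (\Delta G^{F'}_{I_{As}} - \Delta G^{F'}_{V_{As}} + \Delta G_{i2})}{kT} \right\} = \left( \frac{k_2}{k} \right)^2 P^{opt}_{As_4}$.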

The experimental fact that $P^{critical}_{As_4}$ is close to the optimum vapor pressure
means that the values of k and $k_2$ are not greatly different from each other.
Comparing the two reactions,

(3.16) $(I_S \cdot I_{As})^{+} + V_{As}^{0} \underset{k_1}{\overset{k_2}{\rightleftharpoons}} S_{As}^{+} + I_{As}^{0}$

(3.14) $(I_S \cdot I_{As})^{+} + I_{As}^{0} \xrightarrow{k} \{2As + S\}$,

the reaction rate constants $k_2$ and k are determined by the random-walk areas of
$(I_S \cdot I_{As})^{+}$, $V_{As}^{0}$ and $I_{As}^{0}$, as far as we assume that they
react immediately when they come to nearest-neighbour sites. Therefore, if interstitial
arsenic atoms are isolated from each other, k may be larger than $k_2$ because the
diffusion constant of $I_{As}$ is thought to be the largest. However, if interstitial
arsenic atoms interact with each other at such a high concentration that aggregation
takes place and the effective diffusion constant of $I_{As}$ is reduced, then the
diffusion constant of $(I_S \cdot I_{As})^{+}$ determines both $k_2$ and k, so that
$k_2 \approx k$ will hold. The fact that the crossover point is close to the optimum
vapor pressure means that the above-mentioned mechanism is essentially correct. More
precisely, however, the crossover point is about twice the optimum pressure and the
curve bends more sharply than Equation (3.19) predicts. We think, therefore, that actual
aggregation takes place more suddenly when the concentration of arsenic interstitial
atoms exceeds some level.
Finally in this section, Figure 5 shows the calculated temperature dependence
of the diffusion coefficient of sulfur at stoichiometry and at the gallium-rich liquidus
line, as well as at the crossover point, assuming $\Delta H_{m2} + \Delta H_{i2}$ =
2.6 eV and $(k_2 / k)^2 = 2$.
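
The square-root rise and the saturation described in this subsection can be reproduced with a few lines of Python. The sketch below rewrites the steady-state expression (3.19) in terms of the pressure relative to the optimum pressure, using the mass-action relations for the arsenic interstitials and vacancies. The function name and the normalization by the diffusion coefficient at stoichiometry of Equation (3.11) are choices made here for illustration; the ratio $k_2/k$ is an input and is not taken from any measurement.

```python
import math


def relative_diffusion_coefficient(pressure, p_opt, k2_over_k):
    """D_sub divided by its equilibrium value at the optimum pressure.

    With x = sqrt(P / P_opt) equal to [I_As] / [V_As], the steady-state
    expression becomes x / (1 + x / (k2 / k)).
    """
    x = math.sqrt(pressure / p_opt)
    return x / (1.0 + x / k2_over_k)


def critical_pressure(p_opt, k2_over_k):
    """Crossover of the square-root line and the saturation level."""
    return k2_over_k**2 * p_opt


if __name__ == "__main__":
    ratio = math.sqrt(2.0)  # corresponds to (k2 / k)^2 = 2
    for p in (1e-4, 1e-2, 1.0, 2.0, 1e2, 1e4):
        print(p, relative_diffusion_coefficient(p, 1.0, ratio))
```

At low pressure the result follows the square-root law, at high pressure it tends to $k_2/k$, and at the critical pressure it equals half the saturation value, which shows how gradually this expression bends.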

Figure 5. Calculated temperature dependence of the diffusion coefficient of sulfur in
GaAs, together with experimental points from Young and Pearson (• at the
stoichiometric vapor pressure, ○ at the solidus boundary). [Plot not reproduced.
Horizontal axis: 10^3/T (K^-1), 0.7 to 1.0.]

4. Self-diffusion of Ga and As.

Although there is an early work on self-diffusion by Goldstein [17], the arsenic
vapor pressure was not controlled. Recently, Kashiwagi [18] has made a more
precise experiment on the self-diffusion of Ga and As under applied arsenic vapor
pressures. His result differs greatly from that of Goldstein with respect to the
activation energies.
Kashiwagi's result is that both $D_{As}$ and $D_{Ga}$ are proportional to
$(P_{As_4})^{-1/4}$, and the highest values, near the gallium-rich liquidus line, have
been given as a function of temperature as follows:

$D_{As} = 2.3 \times 10^{-5} \exp\left( -\frac{2.06\ \mathrm{eV}}{kT} \right)$

$D_{Ga} = 5.2 \times 10^{-5} \exp\left( -\frac{2.14\ \mathrm{eV}}{kT} \right)$.

The most striking result is that $D_{As}$ and $D_{Ga}$ are nearly equal. Although the
above expressions differ slightly from each other, they are equal within
the experimental uncertainty over wide ranges of vapor pressure and temperature.
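
How close the two fits are can be seen by evaluating their ratio. The Python sketch below assumes that both expressions are ordinary Arrhenius laws with negative exponents and evaluates them between 800°C and 1200°C; that temperature range and the function names are choices made here for illustration.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def d_arsenic(temperature):
    """Arsenic self-diffusion fit quoted from Kashiwagi (assumed Arrhenius form)."""
    return 2.3e-5 * math.exp(-2.06 / (K_B * temperature))


def d_gallium(temperature):
    """Gallium self-diffusion fit quoted from Kashiwagi (assumed Arrhenius form)."""
    return 5.2e-5 * math.exp(-2.14 / (K_B * temperature))


def ratio_at_celsius(t_celsius):
    """D_As / D_Ga at a temperature given in degrees Celsius."""
    temperature = t_celsius + 273.15
    return d_arsenic(temperature) / d_gallium(temperature)


if __name__ == "__main__":
    for t_celsius in (800, 900, 1000, 1100, 1200):
        print(t_celsius, round(ratio_at_celsius(t_celsius), 3))
```

Over this range the ratio stays within about 20 percent of unity, because the smaller prefactor of the arsenic fit is offset by its lower activation energy.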
It should first be pointed out that the diffusants are isotopic As and Ga (we denote them As' and Ga'), while the arsenic vapor pressure is that of ordinary natural arsenic. As a result, the chemical potential of isotopic arsenic, $\mu_{As'}$, is not controlled, and $[I_{As'}]$ should be much smaller than $[I_{As}]$. On the other hand, $V_{As}$ does not discriminate between As' and As. Therefore, this kind of experiment is not related to the diffusion of arsenic interstitial atoms.
We present a model in which both arsenic and gallium atoms diffuse with the assistance of arsenic vacancies.
It has usually been assumed that $V_{As}$ and $V_{Ga}$ migrate as shown in Figure 6 (a), that is, an arsenic atom jumps to $V_{As}$ directly from one of the next nearest lattice sites. By comparison, in silicon and other elemental semiconductors an atom need only jump to the nearest neighbour lattice site, as shown in Figure 6 (b). The latter requires only stretching of the lattice bonds, as illustrated by the dotted lines, but in case (a) the bonds must be broken and the atom must pass through an interstitial site. This requires a much higher migration energy than case (b). On the other hand, if an atom in a compound semiconductor could jump to a vacancy at the nearest neighbour lattice site, as in elemental semiconductors, then a line of antilattices would be generated, as illustrated in Figure 6 (c). From this discussion, a simple vacancy diffusion mechanism like that in silicon is hard to justify in covalent compound semiconductors.
The experimental fact that $D_{Ga} = D_{As}$ should not be accidental; it implies that Ga' (or Ga) and As' (or As) jump as a pair in the presence of an arsenic vacancy.

Figure 6. Illustrations of vacancy migration: a) a simple migration model in GaAs, b) migration in Si, c) migration forming antilattices.

Figure 7 illustrates a possible migration process. First, a Ga' (or Ga) jumps to the nearest neighbour $V_{As}$ site and, at the same time, the As' (or As) nearest to the Ga' jumps to the former Ga' site, so that the $V_{As}$ moves to the next nearest lattice site and a pair of antilattices Ga'-As' is formed. The paired antilattices in Figure 7 (b) are not a stable energy state but a saddle point through which (Ga'-As') moves to the final stable state. The second half of the step is as follows. The three atoms, namely the paired antilattices (Ga'-As') and a neighbouring Ga denoted Ga'' in Figure 7 (c), interchange as shown by the three arrows in Figure 7 (c) and relax to the final state shown in Figure 7 (d). It is assumed that the saddle point energy of the paired antilattices (Ga'-As') is high enough to cause the movement of the third atom Ga''. An interchange within (Ga'-As') alone would need a higher energy than the three-body interchange.
The migration of an arsenic vacancy should be much easier in this model than in the simpler process illustrated in Figure 6 (a). There may be a similar process based on $V_{Ga}$, but we can assume that the concentration of $V_{Ga}$ is much smaller than that of $V_{As}$, as explained in Section 2. Therefore, the diffusion coefficients of As and Ga are the same, and are given by the following form in terms of $[V_{As}]$.

Figure 7. Model of gallium and arsenic self-diffusion in GaAs via arsenic vacancies. Ga'-As' in a) are finally transferred as Ga'-As' in d).

(4.1)

where $\Delta G^{m}_{pair} = \Delta H^{m}_{pair} - T \Delta S^{m}_{pair}$ is the free energy of the saddle point corresponding to paired antilattices with $V_{As}$ at the nearest site.
Using Equation (2.10), the diffusion coefficient is expressed in terms of the applied As$_4$ vapor pressure and temperature as follows:

(4.2)  $D_{As} = D_{Ga} \simeq \frac{1}{6} d^2 \nu \, (P_{As_4})^{-1/4} \exp\left(-\frac{\Delta G^{m}_{pair} + \Delta G^{F'}_{V_{As}}}{kT}\right).$

In order to compare with the experiment, we need the expression at the gallium-rich liquidus line, where $[V_{As}]$ is a maximum and $P_{As_4}$ is a minimum. $(P_{As_4})^{-1/4}_{min}$ is obtained from the following equation:

(4.3)  GaAs (lattice) = Ga (liquid) + $\frac{1}{4}$As$_4$ ;  $\Delta G^{sub'}_{GaAs}$.

This is composed of the following equations:

(2.15)  GaAs (solid) = Ga (solid) + $\frac{1}{4}$As$_4$ ;  $\Delta G^{sub}_{GaAs}$

(4.4)  Ga (solid) = Ga (liquid) ;  $\Delta G^{f}_{Ga}$

so that we obtain

(4.5)  $\Delta G^{sub'}_{GaAs} = \Delta G^{sub}_{GaAs} + \Delta G^{f}_{Ga}$

where $\Delta G^{sub}_{GaAs}$ is the free energy of sublimation of GaAs and $\Delta G^{f}_{Ga}$ is the free energy of fusion of gallium. Equation (4.3) gives

(4.6)  $a_{Ga} \cdot (P_{As_4})^{1/4} = \exp\left(-\frac{\Delta G^{sub'}_{GaAs}}{kT}\right)$

where $a_{Ga}$ is the activity of liquid gallium.
At the gallium-rich liquidus line, at temperatures far below the melting point, the solubility of arsenic in liquid gallium is small and we can assume $a_{Ga} \simeq 1$. Then, we have

(4.7)  $(P_{As_4})^{-1/4}_{min} = \exp\left(\frac{\Delta G^{sub'}_{GaAs}}{kT}\right).$

Therefore, the highest value of the diffusion coefficient of As and Ga is given by

(4.8)  $D^{max}_{As} = D^{max}_{Ga} \simeq \frac{1}{6} d^2 \nu \exp\left(-\frac{\Delta G^{m}_{pair} + \Delta G^{F'}_{V_{As}} - \Delta G^{sub'}_{GaAs}}{kT}\right).$

In the expression $D = D_0 \exp(-Q/kT)$, $Q = 2.1$ eV was obtained by Kashiwagi, that is, $Q = \Delta H^{m}_{pair} + \Delta H^{F'}_{V_{As}} - \Delta H^{sub'}_{GaAs} \simeq 2.1$ eV. Using the known values $\Delta H^{sub}_{GaAs} = 1.14$ eV and $\Delta H^{f}_{Ga} = 0.058$ eV, we get $\Delta H^{sub'}_{GaAs} = 1.20$ eV. Also, we have shown that $\Delta H^{F'}_{V_{As}} = 1.71$ eV in Table 1. Therefore, we get $\Delta H^{m}_{pair} = 1.6$ eV.
This corresponds to the energy height of the saddle point, that is, the activation energy of migration of $V_{As}$.
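The enthalpy bookkeeping in this paragraph is simple enough to verify directly. The following lines only restate the arithmetic above; the variable names are ours.

```python
# Values quoted in the text, all in eV.
DH_SUB_GAAS = 1.14   # sublimation of GaAs, referred to solid gallium
DH_FUS_GA = 0.058    # fusion of gallium
Q_MEASURED = 2.1     # activation energy obtained by Kashiwagi
DH_F_VAS = 1.71      # formation enthalpy of the arsenic vacancy (Table 1)

# Equation (4.5): sublimation referred to liquid gallium.
dh_sub_liquid = DH_SUB_GAAS + DH_FUS_GA

# Q = dH_pair + dH_F(V_As) - dH_sub', solved for dH_pair.
dh_pair = Q_MEASURED - DH_F_VAS + dh_sub_liquid

print(round(dh_sub_liquid, 2), round(dh_pair, 1))
```

The unrounded values are 1.198 eV and 1.588 eV, which the text quotes as 1.20 eV and 1.6 eV.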
The migration energies of vacancies in silicon and germanium have been estimated to be 1.06 ~ 1.09 eV and 0.95 ~ 0.98 eV, respectively. Considering that the state in Figure 7 (b) is higher in energy than the saddle point state in elemental semiconductors, shown by the dotted line in Figure 6 (b), the estimated value of $\Delta H^{m}_{pair}$ is in a reasonable range. Using the values $\Delta H^{m}_{pair} = 1.6$ eV and $\Delta H^{F'}_{V_{As}} = 1.71$ eV, the self-diffusion coefficients in Equation (4.2) are described as follows:

(4.9)  $D_{As} = D_{Ga} = D_0 (P_{As_4})^{-1/4} \exp\left(-\frac{3.3\ \mathrm{eV}}{kT}\right).$
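Equation (4.9) separates the pressure and temperature dependence, which makes relative comparisons easy even though $D_0$ is not given numerically here. The sketch below evaluates only ratios, so $D_0$ cancels; the function name and the sample pressures and temperatures are our own choices.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def relative_self_diffusion(p_as4, t_kelvin, q_ev=3.3):
    """D / D_0 from Equation (4.9); p_as4 in any fixed pressure unit."""
    return p_as4 ** -0.25 * math.exp(-q_ev / (K_B * t_kelvin))


# A thousandfold increase in As4 pressure at fixed temperature (1000 C).
pressure_ratio = (relative_self_diffusion(1000.0, 1273.15)
                  / relative_self_diffusion(1.0, 1273.15))

# Cooling from 1000 C to 900 C at fixed pressure.
cooling_ratio = (relative_self_diffusion(1.0, 1173.15)
                 / relative_self_diffusion(1.0, 1273.15))

print(pressure_ratio, cooling_ratio)
```

Under these assumptions a thousandfold pressure increase lowers the coefficient by a factor of about 5.6, whereas a 100 °C drop at fixed pressure lowers it by more than a factor of ten.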

Figure 8 shows the calculated temperature and vapor pressure dependence in comparison with the experiments of Kashiwagi and Goldstein.
Goldstein's experiment was made under excess arsenic vapor pressure but without control of it; his experimental points lie almost entirely within the calculated region, but his activation energies may be meaningless.

[Figure 8: plot against $10^3/T$ (K$^{-1}$), from 0.7 to 1.0, with a top axis in T (°C) marked at 1200, 1000, and 800.]

Figure 8. Calculated temperature dependence of the self-diffusion coefficients of gallium and arsenic in GaAs, D(Ga) = D(As). I: experimental data from Kashiwagi [18]; ● and ○ are D(Ga) and D(As), respectively, by Goldstein [17].

5. Diffusion of silicon.
Diffusion coefficients of most foreign elements in GaAs are much larger than the self-diffusion coefficients of As and Ga, which can be interpreted by the movement of a pair of atoms via an arsenic vacancy.
Therefore, we must consider interstitial diffusion and other mechanisms for them. In the case of silicon in GaAs, silicon atoms can occupy both Ga and As lattice sites, so that we must consider a diffusion mechanism based on the site transfer of silicon atoms, in addition to the interstitial diffusion mechanism. Site transfer diffusion was first introduced for Si-Si pair diffusion [3,19], but in the present model the Si atoms need not be strongly paired.
Figure 9 illustrates the site transfer diffusion mechanism. As a first step, $Si_{Ga}$ (a silicon atom at a gallium site) transfers to a $V_{As}$ at the nearest neighbour site, resulting in the formation of $Si_{As}$ and $V_{Ga}$; the $V_{Ga}$ then leaves the lattice or recombines with an interstitial gallium, $I_{Ga}$, so that thermal equilibrium is reached. As a second step, $Si_{As}$ transfers to a $V_{Ga}$ when one comes to the nearest neighbour site, which results in the formation of $Si_{Ga}$ and $V_{As}$, and the latter also returns to thermal equilibrium. Each step can be described by the following reaction equations:

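(5.1)  $Si^{+}_{Ga} + V^{0}_{As} + e \overset{k_1}{\rightleftharpoons} Si^{-}_{As} + V^{0}_{Ga} + h$

(5.2)  $Si^{-}_{As} + V^{0}_{Ga} + h \overset{k_2}{\rightleftharpoons} Si^{+}_{Ga} + V^{0}_{As} + e.$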

Figure 9. Illustration of the first step of the silicon migration by the site transfer mechanism.

Both reactions are equilibrated, and the equilibration can be described by the
following equation.

(5.3)

where $K_t = \exp(-\Delta G_t/kT)$. We will later discuss the equilibrium concentration ratio $[Si^{+}_{Ga}]/[Si^{-}_{As}]$ using this equation.
Unlike the case of the self-diffusion discussed in Section 4, the state corresponding to Figure 9 (b) is not a saddle point but a stable state, because both $Si_{Ga}$ and $Si_{As}$ are stable. The first and second steps occur in series, and their probabilities of occurrence are proportional to $[V_{As}]$ and $[V_{Ga}]$, respectively. That is, the rate-determining step should be the latter, except at very high arsenic vapor pressures, because $[V_{Ga}]$ is usually much smaller than $[V_{As}]$. The formation energy of $V_{Ga}$ is estimated to be very high (about 3 eV, as will be discussed later). Therefore, the site transfer diffusion process should be dominant at higher temperatures, while we must take into account the interstitial diffusion mechanism at lower temperatures. Let us first discuss site transfer diffusion. The diffusion coefficients of $Si_{Ga}$ and $Si_{As}$ are the same for this mechanism and can be described in the following form:

(5.4)

where $\Gamma_{eff}$ is the effective jump rate, which can be described in terms of $\Gamma_{V_{As}}$ and $\Gamma_{V_{Ga}}$, the jump rates via $V_{As}$ and $V_{Ga}$, respectively (that is, they are proportional to $k_1$ and $k_2$ in Equations (5.1) and (5.2)):

(5.5)

with

(5.6)

(5.7)

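where $\Delta G_{V_{As}}$ and $\Delta G_{V_{Ga}}$ are the corresponding free energies of migration. At low and middle vapor pressures $\Gamma_{eff} \doteq \Gamma_{V_{Ga}}$ holds, and we have

(5.8)  $D(Si_{Ga}) = D(Si_{As}) \doteq \frac{1}{6} d^2 \nu_{Ga} [V_{Ga}] \exp\left(-\frac{\Delta G_{V_{Ga}}}{kT}\right)$

that is,

(5.9)  $D(Si_{Ga}) = D(Si_{As}) \simeq \frac{1}{6} d^2 \nu_{Ga} (P_{As_4})^{1/4} \exp\left(-\frac{\Delta G^{F'}_{V_{Ga}} + \Delta G_{V_{Ga}}}{kT}\right).$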

As shown in Figure 10, our experiment has shown that the diffusion depth increases monotonically with the As$_4$ pressure at 950 ~ 1000°C, which suggests site transfer diffusion. However, at the lower temperatures 900 ~ 875°C, the diffusion coefficient instead decreases with increasing vapor pressure in the lower pressure region, which suggests a contribution from the interstitial diffusion mechanism, as will be discussed later.
In the above expression $\Delta G^{F'}_{V_{Ga}}$ is expressed as

(2.18)

The enthalpy of sublimation of GaAs is known to be 1.14 eV. The enthalpy of formation of $V_{Ga}$ is estimated as follows. Vacancy formation energies for silicon and diamond were estimated to be 2.3 eV and 4 eV, respectively. We assume that the value for $V_{Ga}$ in GaAs lies roughly midway between the two, that is, $\Delta H^{F}_{V_{Ga}} \simeq 3$ eV and so $\Delta H^{F'}_{V_{Ga}} \simeq 2$ eV. On the other hand, the migration enthalpy $\Delta H_{V_{Ga}}$ in Equation (5.7) is estimated as follows. In the case of silicon crystals, $\Delta H_{V_{Si}}$ was estimated as $\Delta H_{V_{Si}} \simeq 0.33 \sim 1$ eV, while in Section 4 we obtained $\Delta H^{m}_{pair} \simeq 1.6$ eV for the two-atom migration, so that we roughly estimate $\Delta H_{V_{Ga}} \simeq 1$ eV.
Therefore, our estimate of the diffusion coefficient for the site transfer mechanism is

(5.10)

The equilibrium ratio between $[Si_{As}]$ and $[Si_{Ga}]$ is determined from Equation (5.3):

(5.3)

(5.11)

where $n$ and $p$ are the electron and hole concentrations, and $N_c$ and $N_v$ are the respective effective densities of states, that is, we have the relation $pn = N_c N_v \exp(-E_g/kT)$.
In the experiment shown in Figure 10, the diffused region became n-type, that is, $[Si_{Ga}] > [Si_{As}]$ holds. If we can assume $[Si_{Ga}] \gg [Si_{As}]$, then $n \simeq N_{Ga}[Si^{+}_{Ga}]$ holds and we have

(5.12)  $r = \frac{[Si^{-}_{As}]}{[Si^{+}_{Ga}]} = [Si^{+}_{Ga}]^2 \, \frac{[V^{0}_{As}]}{[V^{0}_{Ga}]} \, \frac{N^2_{Ga}}{N^2_c} \exp\left(-\frac{\Delta G_t - E_g}{kT}\right) = [Si^{+}_{Ga}]^2 \, \frac{N^2_{Ga}}{N^2_c} \, (P_{As_4})^{-1/2} \exp\left(-\frac{\Delta G_t - E_g + \Delta G^{F'}_{V_{As}} - \Delta G^{F'}_{V_{Ga}}}{kT}\right).$

[Figure 10: plot of diffusion depth against $P_{As_4}$ (Torr), from 1 to $10^3$ Torr, for temperatures between 875 and 1000 °C.]

Figure 10. Experimental diffusion depth of silicon in GaAs in Reference [3].

Next, we discuss the interstitial diffusion mechanism, which may become dominant at lower temperatures. If we assume an isolated Si interstitial atom, $I_{Si}$, rather than a molecular complex, the following two reaction equations hold:

(5.13)  $Si^{+}_{Ga} = I^{+}_{Si} + V^{0}_{Ga}$ ;  $\Delta G_i$

        $Si^{-}_{As} = I^{+}_{Si} + V^{0}_{As} + 2e$ ;  $\Delta G'_i$.

Comparing with Equation (5.3) we have the relation

(5.14)

We consider the case that $[Si^{+}_{Ga}] > [Si^{-}_{As}]$ holds; then Equation (5.13) gives

(5.15)  $\frac{\partial N_{inter}}{\partial N_{sub}} = \frac{[I^{+}_{Si}]}{[Si^{+}_{Ga}]} = \frac{1}{[V^{0}_{Ga}]} \exp\left(-\frac{\Delta G_i}{kT}\right).$

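Therefore, Equation (3.6) gives the diffusion coefficient as

(5.16)  $D(Si_{Ga}) = D_{inter} \frac{1}{[V^{0}_{Ga}]} \exp\left(-\frac{\Delta G_i}{kT}\right).$

Using Equation (5.15) and referring to Equation (3.11), we have

(5.17)  $D(Si_{Ga}) = D_{inter} (P_{As_4})^{-1/4} \exp\left(-\frac{\Delta G_i - \Delta G^{F'}_{V_{Ga}}}{kT}\right) = D'_0 (P_{As_4})^{-1/4} \exp\left(-\frac{\Delta G_i - \Delta G^{F'}_{V_{Ga}} + \Delta G^{m}_{Si}}{kT}\right)$

where $\Delta G^{m}_{Si}$ is the migration energy of the interstitial silicon.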

On the other hand, if we consider a molecular complex $(I_{Si} \cdot I_{As})$, the corresponding reaction equation is

(5.18)

which gives

(5.19)

In this case, no vapor pressure dependence is expected, because $[I_{As}]$ and $[V_{Ga}]$ have the same $(P_{As_4})^{1/4}$ dependence.
Experimentally, we observe an increase of the diffusion coefficient with decreasing vapor pressure at lower temperatures. Therefore we assume that silicon interstitial atoms are isolated. The enthalpy parts of the free energies in Equation (5.17) are estimated as follows. As a very crude estimate, we assume that $\Delta H_i$ is nearly equal to or less than the formation enthalpy of a Frenkel pair in the silicon crystal, that is,

$\Delta H_i < \Delta H^{F}_{V_{Si}} + \Delta H^{F}_{I_{Si}}.$

The latter was estimated to be about 3.1 ~ 3.3 eV [15]. Therefore, we tentatively assume that $\Delta H_i \simeq 2.5$ eV. $\Delta H^{m}_{Si}$ and $\Delta H^{F'}_{V_{Ga}}$ are estimated to be about 0.3 eV and 2 eV, respectively, as in the earlier discussions.
Then, we have the following expression for the interstitial diffusion mechanism.

(5.20)

Figure 11 shows the calculation according to Equations (5.8) and (5.20). The constant $D_0$ was fitted to the experimental point at the highest vapor pressure and temperature, while $D'_0$ was fitted at the lowest vapor pressure and temperature.
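The competition between the two mechanisms can be illustrated with the enthalpy estimates quoted above: about 2 + 1 = 3 eV with a $(P_{As_4})^{1/4}$ dependence for site transfer, following Equation (5.9), and about 2.5 - 2 + 0.3 = 0.8 eV with a $(P_{As_4})^{-1/4}$ dependence for the isolated interstitial, following Equation (5.17). The fitted prefactors are not given numerically in the text, so the sketch below normalizes both branches to be equal at a hypothetical anchor point of 900 °C and 10 Torr. That anchor, and all function names, are assumptions made only for illustration.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K

Q_SITE = 2.0 + 1.0         # dH_F'(V_Ga) + dH(V_Ga migration), eV
Q_INTER = 2.5 - 2.0 + 0.3  # dH_i - dH_F'(V_Ga) + dH_m(Si), eV

# Hypothetical anchor: both branches set equal at 900 C and 10 Torr.
T_REF = 900.0 + 273.15
P_REF = 10.0


def site_transfer(p_torr, t_kelvin):
    """Site transfer branch, normalized to 1 at the anchor point."""
    arrh = math.exp(-Q_SITE / K_B * (1.0 / t_kelvin - 1.0 / T_REF))
    return (p_torr / P_REF) ** 0.25 * arrh


def interstitial(p_torr, t_kelvin):
    """Interstitial branch, normalized to 1 at the anchor point."""
    arrh = math.exp(-Q_INTER / K_B * (1.0 / t_kelvin - 1.0 / T_REF))
    return (p_torr / P_REF) ** -0.25 * arrh


def dominant(p_torr, t_celsius):
    """Name of the larger branch at the given pressure and temperature."""
    t_kelvin = t_celsius + 273.15
    if site_transfer(p_torr, t_kelvin) > interstitial(p_torr, t_kelvin):
        return "site transfer"
    return "interstitial"
```

With this normalization, `dominant(10.0, 1000.0)` returns the site transfer branch and `dominant(10.0, 875.0)` the interstitial branch, while at 900 °C the outcome depends on whether the pressure lies above or below the anchor. The qualitative trend follows only from the two activation energies and pressure exponents; the position of the crossover is set by the assumed anchor.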

[Figure 11: plot against inverse temperature (K$^{-1}$ scale, 0.7 to 1.0), with a top axis in T (°C) marked at 1200, 1000, and 800; curves are labeled 1 Torr and 1000 Torr, with a separate interstitial branch.]

Figure 11. Calculated temperature dependence of the diffusion coefficient of silicon in GaAs.

As is seen in Figure 11, the two mechanisms compete with each other at temperatures around 900°C. The observed decrease in the diffusion depth around 80 Torr at 900°C, which is close to the optimum vapor pressure, results from a decrease of the surface concentration of silicon atoms in the vicinity of the stoichiometric vapor pressure, rather than from a decrease of the diffusion coefficient itself. This decrease of the surface concentration is a separate issue and is not discussed in this paper.

6. Diffusion at the heterostructure interface.

It was observed that the disordering of a superlattice is enhanced by the diffusion of impurities such as zinc and silicon [20,21]. Let us consider the interface between GaAs and AlAs, as illustrated in Figure 12. Unlike the free surface of GaAs, the heterointerface has no source of gallium that controls the chemical potential, so that there is a sharp decrease of the gallium chemical potential in the vicinity of the heterostructure interface. This means that the concentrations of $V_{Ga}$ and $I_{Ga}$ are far from their equilibrium values. First, we consider zinc diffusion by the interstitial mechanism.

[Figure 12: schematic showing the GaAs and AlAs regions, the surface, and the profiles of $\mu_{Ga}$ and $\mu_{Al}$.]

Figure 12. Schematic illustration of the changes in the gallium and aluminum chemical potentials at the GaAs-AlAs heterostructure interface.

The reaction can be expressed as follows.

(6.1)

In the homogeneous crystal, $[V_{Ga}]$ is assumed to be equilibrated with an external arsenic vapor pressure. This equilibration is caused by the reaction equations given in Section 2, that is,

(2.16)  GaAs (solid) = $I_{Ga}$ + $\frac{1}{4}$As$_4$ ;  $\Delta G^{F'}_{I_{Ga}}$

(2.17)  $\frac{1}{4}$As$_4$ = $V_{Ga}$ + As (lattice site) ;  $\Delta G^{F'}_{V_{Ga}}$.

However, the above equations implicitly assume the existence of a free surface, which is not the case at the heterostructure interface. Actually, a large amount of $V^{0}_{Ga}$ will be generated by the reaction in Equation (6.1), because $\Delta G_{Zn}$ should be much smaller than the formation energy of $V_{Ga}$ via Equation (2.17). These vacancies must recombine with $I_{Ga}$ or go out to the free surface for equilibration, but the concentration of $I_{Ga}$ is much lower than the equilibrium level, corresponding to the sharp decrease of the gallium chemical potential. Although a part of the $V_{Ga}$'s go out to the free surface, most of them contribute to the disordering at the interface.
In such a case we should consider that Equation (6.1) is equilibrated in itself, without the assistance of Equations (2.16) and (2.17). That is, we have

(6.2)  $\frac{[Zn^{+}_{i}][V^{0}_{Ga}]}{[Zn^{-}_{s}](p/N_v)^2} = \exp\left(-\frac{\Delta G_{Zn}}{kT}\right)$

but the condition $[Zn^{+}_{i}] = [V^{0}_{Ga}]$ must hold, because $Zn^{+}_{i}$ and $V^{0}_{Ga}$ are generated as a pair.
Therefore, the concentration of $V_{Ga}$ equilibrated by the reaction equation (6.1) is

(6.3)  $[V^{0}_{Ga}]^2 = [Zn^{-}_{s}](p/N_v)^2 \exp\left(-\frac{\Delta G_{Zn}}{kT}\right).$

If we can assume that $N_{Zn^{-}_{s}} = p$, we have

(6.4)  $[V^{0}_{Ga}] = [Zn^{-}_{s}]^{3/2} \left(\frac{N_{Ga}}{N_v}\right) \exp\left(-\frac{\Delta G_{Zn}}{2kT}\right).$

It is understood that $[V^{0}_{Ga}]$ at the interface is determined not by the arsenic vapor pressure but by the zinc concentration, and its activation energy is $\Delta H_{Zn}/2$, which should be much smaller than $\Delta H^{F'}_{V_{Ga}}$, the formation enthalpy in the homogeneously controlled crystal.
This excess $V^{0}_{Ga}$ concentration is the origin of the interface disordering. In the case of silicon diffusion, both the interstitial silicon formation and the site transfer reaction can generate $V_{Ga}$. At lower temperatures, where the interstitial diffusion is dominant, the reaction

(5.13)

is equilibrated. That is,

(6.5)  $\frac{[I^{+}_{Si}][V^{0}_{Ga}]}{[Si^{+}_{Ga}]} = \exp\left(-\frac{\Delta G_i}{kT}\right).$

From the condition that $[I^{+}_{Si}] = [V^{0}_{Ga}]$, we have

(6.6)  $[V^{0}_{Ga}] = [Si^{+}_{Ga}]^{1/2} \exp\left(-\frac{\Delta G_i}{2kT}\right).$

On the other hand, at higher temperatures, where the site transfer diffusion is dominant, the reaction equation which should be equilibrated is Equation (5.3), so that we have the equilibrium relation (5.11). But this time, the condition that $[Si^{-}_{As}] = [V^{0}_{Ga}]$ must hold. As for $[V^{0}_{As}]$, we assume it is controlled by the external vapor pressure, that is,

(2.11)

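Therefore, we have

(6.7)  $[V^{0}_{Ga}] = [Si^{+}_{Ga}]^{3/2} \left(\frac{N_{Ga}}{N_c}\right) [V^{0}_{As}]^{1/2} \exp\left(-\frac{\Delta G_t - E_g}{2kT}\right) = [Si^{+}_{Ga}]^{3/2} \left(\frac{N_{Ga}}{N_c}\right) (P_{As_4})^{-1/8} \exp\left(-\frac{\Delta G_t - E_g + \Delta G^{F'}_{V_{As}}}{2kT}\right).$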

Both Equations (6.6) and (6.7) also show that the concentration of gallium vacancies, which cause the interface disordering, increases strongly with the silicon concentration. We give the calculated result for Equation (6.6), which is the simplest case. As was discussed in Section 5, we assume that $\Delta H_i \simeq 2.5$ eV. Also, we simply assume $\Delta S_i = 0$. As shown in Figure 13, the concentration of gallium vacancies, $N_{V_{Ga}}$, at the interface can be much higher than in homogeneous crystals. In the case of thermal equilibrium, $N_{V_{Ga}}$ is expected to be of the order of $10^{11}$ to $10^{12}$ cm$^{-3}$ at 1000°C, if we assume $\Delta H^{F}_{V_{Ga}} = 3$ eV and $\Delta S^{F}_{V_{Ga}} = 0$ in Equations (2.18) and (2.20). Therefore $N_{V_{Ga}}$ shown in Figure 13 is more than $10^4$ times higher than in the homogeneous crystals. These excess $V_{Ga}$'s in the vicinity of the interface are rapidly occupied by aluminum interstitial atoms diffusing from the AlAs region, driven by the steep slope of the aluminum chemical potential. The same phenomenon also occurs in the AlAs region. These processes cause the gallium-aluminum mixing in the interface region.
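Equation (6.6) can be evaluated directly under the stated assumptions ($\Delta H_i \simeq 2.5$ eV, $\Delta S_i = 0$). The sketch below additionally assumes that bracketed concentrations are site fractions, a gallium site density of $2.2 \times 10^{22}$ cm$^{-3}$ for GaAs, and a purely hypothetical substitutional silicon concentration of $10^{18}$ cm$^{-3}$. None of these inputs is taken from the text, the function name is ours, and the result is not meant to reproduce Figure 13.

```python
import math

K_B = 8.617333e-5    # Boltzmann constant in eV/K
N_GA_SITES = 2.2e22  # assumed Ga site density of GaAs, cm^-3


def interface_vacancies(n_si_cm3, t_celsius, dh_i_ev=2.5):
    """Gallium vacancy concentration (cm^-3) from Equation (6.6).

    The silicon concentration is converted to a site fraction, the
    square-root law of (6.6) is applied with zero entropy term, and
    the result is converted back to a volume concentration.
    """
    t_kelvin = t_celsius + 273.15
    si_fraction = n_si_cm3 / N_GA_SITES
    vac_fraction = math.sqrt(si_fraction) * math.exp(
        -dh_i_ev / (2.0 * K_B * t_kelvin))
    return vac_fraction * N_GA_SITES


if __name__ == "__main__":
    # Hypothetical silicon level of 1e18 cm^-3 at 1000 C.
    print(f"{interface_vacancies(1e18, 1000.0):.2e}")
```

With these inputs the result is of order $10^{15}$ cm$^{-3}$, and it scales as the square root of the silicon concentration, so a fourfold increase in silicon doubles the vacancy concentration.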

[Figure 13: plot of the gallium vacancy concentration (cm$^{-3}$) against inverse temperature, with a top axis in T (°C) marked at 1200, 1000, and 800.]

Figure 13. Calculated concentration of gallium vacancies at the heterostructure interface.

REFERENCES

[1] J. NISHIZAWA, H. OTSUKA, S. YAMAKOSHI AND K. ISHIDA, Nonstoichiometry of Te-doped GaAs, Jpn. J. Appl. Phys., 13 (1974), pp. 46-56.
[2] J. NISHIZAWA, I. SHIOTA AND Y. OYAMA, Site location of As+-ion-implanted GaAs by means of a multidirectional and high-depth-resolution Rutherford backscattering/channelling technique, J. Phys. D: Appl. Phys., 19 (1986), pp. 1073-1078.
[3] J. NISHIZAWA, N. TOYAMA, Y. OYAMA AND K. INOKUCHI, Influence of Arsenic Pressure on the Defects in GaAs Crystals, Proc. of the Third Int. School on Semiconductor Optoelectronics (Cetniewo, 1981), Optoelectronic Materials and Devices, ed. by M.A. Herman, PWN-Polish Scientific Publishers, Warszawa, 1983, pp. 27-77.
[4] J. NISHIZAWA, S. SHINOZAKI AND K. ISHIDA, Properties of Sn-doped GaAs, J. Appl. Phys., 44 (1973), pp. 1638-1645.
[5] J. NISHIZAWA, Y. OKUNO AND H. TADANO, Nearly Perfect Crystal Growth of III-V Compounds by the Temperature Difference Method under Controlled Vapor Pressure, J. Crystal Growth, 31 (1975), pp. 215-222.
[6] J. NISHIZAWA AND Y. OKUNO, Stoichiometric Crystallization Method of III-V Compounds for LED's and Injection Lasers, Proc. of Second Int. School on Semiconductor Optoelectronics (Cetniewo, 1978), Semiconductor Optoelectronics, ed. by M.A. Herman, PWN-Polish Scientific Publishers, Warszawa, 1980, Chap. 5, pp. 101-130.
[7] J. NISHIZAWA, Y. OKUNO AND K. SUTO, Nearly perfect Crystal Growth in III-V and II-VI compound semiconductors, JARECT Vol. 19, Semiconductor Technologies (1986), ed. by J. Nishizawa, OHM & North-Holland, 1986, pp. 17-80.
[8] J. NISHIZAWA, Stoichiometry Control for Growth of III-V Crystals, J. Crystal Growth, 99 (1990), pp. 1-8.
[9] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Stoichiometry-Dependent Deep Levels in n-type GaAs, J. Appl. Phys., 67 (1990), pp. 1884-1896.
[10] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Formation Energy of Excess Arsenic atoms in n-type GaAs, Phys. Rev. Letters, 65 (1990), pp. 2555-2558.
[11] J. NISHIZAWA, Y. OYAMA AND K. DEZAKI, Stoichiometry-Dependent Deep Levels in p-GaAs prepared by annealing under excess arsenic vapor pressure, J. Appl. Phys., 69 (1991), pp. 1446-1453.
[12] A.B.Y. YOUNG AND G.L. PEARSON, Diffusion of Sulfur in Gallium Phosphide and Gallium Arsenide, J. Phys. Chem. Solids, 31 (1970), pp. 517-527.
[13] B. TUCK, Atomic Diffusion in III-V Semiconductors, Adam Hilger, Bristol and Philadelphia, 1988.
[14] K.K. SHIH, J.W. ALLEN AND G.L. PEARSON, Diffusion of Zinc in Gallium Arsenide under Excess Arsenic Pressure, J. Phys. Chem. Solids, 29 (1968), p. 379.
[15] K.H. BENNEMANN, New Method for Treating Lattice Point Defects in Covalent Crystals, Phys. Rev., 137 (1965), pp. A1497-A1514.
[16] R.R. HASIGUTI, Calculation of the Properties of Vacancies and Interstitials, p. 27 (U.S. Government Printing Office, Washington, D.C., 1966).
[17] B. GOLDSTEIN, Diffusion in Compound Semiconductors, Phys. Rev., 121 (1961), pp. 1305-1311.
[18] M. KASHIWAGI, (to appear).
[19] M.E. GREINER AND J.F. GIBBONS, Diffusion of silicon in gallium arsenide using rapid thermal processing: Experiment and model, Appl. Phys. Lett., 44 (1984), pp. 750-752.
[20] W.D. LAIDIG, N. HOLONYAK, JR., AND M.D. CAMRAS, Disorder of an AlAs-GaAs superlattice by impurity diffusion, Appl. Phys. Lett., 38 (1981), pp. 776-778.
[21] K. MEEHAN, N. HOLONYAK, JR., J.M. BROWN, M.A. NIXON AND P. GAVRILOVIC, Disorder of an Al$_x$Ga$_{1-x}$As-GaAs superlattice by donor diffusion, Appl. Phys. Lett., 45 (1984), pp. 549-551.

THEORY OF A STOCHASTIC ALGORITHM FOR
CAPACITANCE EXTRACTION IN INTEGRATED CIRCUITS*

YANNICK L. LE COZ AND RALPH B. IVERSON†

Abstract. We present the theory of a novel stochastic algorithm for high-speed capacitance
extraction in complex integrated circuits. The algorithm is most closely related to a statistical
procedure for solving Laplace's equation known as the floating random-walk method. Our analysis
begins with surface Green's functions for Laplace's equation on a scalable square domain. From
them, we obtain integrals for electric potential and electric field at the domain center. An electrode-
capacitance integral is next derived. This integral is expanded as an infinite sum, and probability
rules that statistically evaluate the sum are deduced. These rules define the algorithm.

1. Introduction. Future technological improvements in circuit integration will
make electrical connections just as important as the devices they join. It is commonly
accepted that electrical performance of integrated circuits will be limited, not by
device-switching speed, but by signal propagation along connection paths. Circuit
designers will therefore require software "tools" that can rapidly extract capacitance,
inductance, and resistance in the complex two- and three-dimensional geometries
typical of integrated circuits. With these thoughts in mind, we propose a highly
efficient stochastic algorithm for extracting capacitance in structures with numerous,
randomly oriented electrodes.
Before discussing capacitance extraction in further detail, we will first decom-
pose the integrated-circuit electrical connections into a set of idealized mathematical
objects. For the most part the set comprises electrodes, corresponding to electrical
connections themselves, and dielectrics, corresponding to various insulating layers.
We must now solve Laplace's equation[1] for this system. Usually, Laplace's equation
is solved for a series of electrode potentials (Dirichlet conditions), after which elec-
tric fields and inter-electrode capacitances are found. Numerical solution of Laplace's
equation is generally required, since integrated-circuit geometries are complex and,
to a certain extent, arbitrary. Conventional solution methods are deterministic,
employing most often finite-difference, finite-element, spectral, or boundary-integral
discretizations.[2-5] In two and three dimensions these methods are computationally
practical as long as the geometry is relatively simple, possessing few electrodes and
dielectrics. However, for complex integrated-circuit geometries these methods are
computationally prohibitive.
* Written material in this paper has been excerpted from a larger work in draft form which will be submitted to Solid State Electronics for future publication.
† Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180-3590.

FIG. 1. An example of a two-dimensional, rectilinear electrode arrangement. Gaussian boundaries are denoted $\mathcal{G}_1$, $\mathcal{G}_2$, and $\mathcal{G}_3$. The parameter $\xi$ measures length along any of these boundaries. Note also, electric-field $\mathbf{E}$ and outward unit normal $\mathbf{n}$ both depend on $\xi$.

To resolve this difficulty, we propose a random-walk algorithm that directly evaluates the inter-electrode capacitance matrix. The algorithm most closely resembles the floating random-walk method[6, 7] for solving the Laplace equation. It is particularly efficient in complex rectilinear geometries, in two and, even more so, in three dimensions. It has the added advantage of statistically estimating electric field only in regions where it is needed, namely the Gaussian surfaces surrounding each electrode. The fact that statistical errors in electric field tend to cancel during Gaussian-surface integration enhances algorithm efficiency as well. Importantly, when evaluating the capacitance matrix, our algorithm requires a number of Gaussian-surface integrations equal to the total electrode number. Usual capacitance-extraction procedures, in contrast, require on the order of the square of that number of integrations. We note lastly that, owing to its stochastic nature, the algorithm readily parallelizes for speed improvement.
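For orientation, the classical floating random-walk idea for Laplace's equation can be sketched in a few lines. This is not the algorithm developed in this paper, which walks on maximal squares, samples with Green's functions, and estimates capacitance directly. It is the textbook variant on circles, applied to a unit disk chosen here because the exact answer is known; all names and parameters are illustrative assumptions.

```python
import math
import random


def floating_walk_potential(x, y, boundary_value,
                            n_walks=4000, eps=1e-3, seed=0):
    """Estimate a harmonic function at (x, y) inside the unit disk.

    Each walk jumps to a uniformly random point on the largest circle
    centered at the current point that fits inside the disk, and stops
    once it is within eps of the boundary. The estimate is the average
    of the boundary values at the stopping points.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        px, py = x, y
        while True:
            dist = 1.0 - math.hypot(px, py)  # distance to the unit circle
            if dist < eps:
                break
            theta = rng.uniform(0.0, 2.0 * math.pi)
            px += dist * math.cos(theta)
            py += dist * math.sin(theta)
        r = math.hypot(px, py)
        total += boundary_value(px / r, py / r)  # project onto boundary
    return total / n_walks


if __name__ == "__main__":
    # Boundary data u = x is harmonic, so the exact interior value is x.
    print(floating_walk_potential(0.3, 0.2, lambda bx, by: bx))
```

The estimate carries a statistical error that shrinks as the inverse square root of the number of walks, and independent walks can be distributed over processors without communication, which is the property behind the parallelization remark above.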
Hereafter, we will assume a two-dimensional rectilinear geometry and neglect
variation in electric permittivity. The sections that follow constitute the theory of our
algorithm for capacitance extraction. We will first derive a nested-integral expansion
for the capacitance matrix associated with a general assembly of rectilinear electrodes.
To efficiently evaluate this expansion, we then deduce a novel stochastic algorithm.
We finish with a proof of the algorithm's mathematical validity.
2. Capacitance-integral expansion. We will now deduce an expression for
the electrode-capacitance matrix. Figure 1 is an example electrode arrangement.
These two-dimensional electrodes are shown in black, their edges parallel to the xy-
coordinate axes. It is understood that the electrodes extend infinitely in the +z and
-z directions. For N electrodes, off-diagonal elements of the capacitance matrix $C_{ij}$, $i \neq j$,¹ are defined according to

(1)  $q_i = \sum_{j=1}^{N} C_{ij}(V_i - V_j),$

where $i = 1, \ldots, N$. Above, the $V_1, \ldots, V_N$ and $q_1, \ldots, q_N$ denote electrode voltages and their corresponding charges per unit length in z. Appropriately, $C_{ij}$ is a capacitance per unit length in z as well.
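Definition (1) can be made concrete with a small numerical example. The sketch below uses a hypothetical three-electrode system with made-up, symmetric off-diagonal capacitances; the function name and all numbers are ours. It also shows that grounding every electrode except one isolates a single off-diagonal element.

```python
def electrode_charges(cap, volts):
    """Charges per unit length from definition (1).

    cap[i][j] holds the off-diagonal capacitance between electrodes i
    and j; diagonal entries are ignored because the j = i term of the
    sum vanishes identically.
    """
    n = len(volts)
    charges = []
    for i in range(n):
        q_i = sum(cap[i][j] * (volts[i] - volts[j])
                  for j in range(n) if j != i)
        charges.append(q_i)
    return charges


# Hypothetical capacitances (arbitrary units), symmetric by construction.
CAP = [
    [0.0, 2.0, 1.0],
    [2.0, 0.0, 3.0],
    [1.0, 3.0, 0.0],
]

# Ground electrodes 0 and 2, and hold electrode 1 at 5 volts.
volts = [0.0, 5.0, 0.0]
charges = electrode_charges(CAP, volts)
print(charges)
```

In this example the charge on each grounded electrode equals minus its capacitance to the driven electrode times the applied voltage, so dividing by that voltage and changing sign recovers the corresponding matrix element.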
Gauss's law permits us to write

(2)

where $\epsilon$ is a constant electric permittivity and $\mathcal{G}_i$ is a Gaussian boundary (actually a Gaussian-surface cross section) surrounding a single electrode $i$. The parameter $\xi$ measures boundary length, on which electric field $\mathbf{E}$ and outward unit normal $\mathbf{n}$ depend. Figure 1 shows this parameterization.
We can rewrite (2) by replacing each component of $\mathbf{E}$ with its integral equivalent. We will express these components as an integral along the edges of a square $S_{a(\xi)}$, parallel to the coordinate axes and centered about any particular boundary point on $\mathcal{G}_i$. For each $\xi$, the square will be chosen the largest possible containing no electrodes. The square's boundaries thus conform, at least in part, to the electrode boundaries themselves. Figure 2 gives examples of such constructions.

¹ Diagonal elements $C_{11}, C_{22}, \ldots$ represent what we call electrode "self-capacitances". They serve no use in electrical modeling.

FIG. 2. Examples of initial maximal squares. Boundaries $S_{a(\xi)}$ are of edge size $a(\xi)$ and are centered on Gaussian-boundary points (dark dots).

Denoting the edge size of these maximal squares as $a(\xi)$, we expand (2) as a double integral. To this end,

(3)  $q_i = -\oint_{\mathcal{G}_i} d\xi \, s_i(\xi) \oint_{S_{a(\xi)}} d\xi' \, w(\xi|\xi') \, G[a(\xi)|\xi'] \, \psi_{S_{a(\xi)}}(\xi'),$

where at each electrode $i = 1, \ldots, N$ we have defined a weight function

(4)

Above, $n_x$ and $n_y$ are components of $\mathbf{n}$. The quantities $G$, $G_x$, and $G_y$ are, respectively, suitably defined surface Green's functions[1] for electric potential, x component of electric field, and y component of electric field. We have also introduced, for completeness, arbitrary sampling functions $s_i(\xi)$, which do not affect the value of charge-integrals (3). These functions are assumed normalized over the $\mathcal{G}_i$,

(5)

To deduce our capacitance algorithm, we must express the $q_i$ in terms of electrode potentials. We begin by splitting the domain $S_{a(\xi)}$ into two parts, examples of which are shown in Fig. 3. The first, $\hat{S}_{a(\xi)}$, is the electrode part, and the second, $\tilde{S}_{a(\xi)}$, is the nonelectrode part. Equation (3) can then be written as a sum of two integrals, one over $\hat{S}_{a(\xi)}$ and the other over $\tilde{S}_{a(\xi)}$. The nonelectrode potential $\psi_{\tilde{S}_{a(\xi)}}(\xi')$ in the latter integral can be replaced. We do so with the aid of $G$. Note carefully, we must use a new dummy variable $\xi''$, and we choose² $S_{a(\xi')}$ as the new integration domain. Also, we have extended our domain-construction procedure as shown in Fig. 3. Geometrically, $S_{a(\xi')}$ is the largest square, containing no electrodes, centered about any particular boundary point on $\tilde{S}_{a(\xi)}$. If this entire process (splitting $S_{a(\xi')}$ into $\hat{S}_{a(\xi')}$ and $\tilde{S}_{a(\xi')}$, expanding as a sum of two integrals, and replacing the non-electrode-boundary potential by means of $G$) is repeated indefinitely, one obtains an infinite sum of nested integrals:³

(6)  $q_i = -\oint_{\mathcal{G}_i} d\xi \, s_i(\xi) \Big\{ \int_{\hat{S}_{a(\xi)}} d\xi' \, w(\xi|\xi') \, G[a(\xi)|\xi'] \, \psi_{\hat{S}_{a(\xi)}}(\xi')$
     $\quad + \int_{\tilde{S}_{a(\xi)}} d\xi' \, w(\xi|\xi') \, G[a(\xi)|\xi'] \int_{\hat{S}_{a(\xi')}} d\xi'' \, G[a(\xi')|\xi''] \, \psi_{\hat{S}_{a(\xi')}}(\xi'')$
     $\quad + \int_{\tilde{S}_{a(\xi)}} d\xi' \, w(\xi|\xi') \, G[a(\xi)|\xi'] \int_{\tilde{S}_{a(\xi')}} d\xi'' \, G[a(\xi')|\xi''] \int_{\hat{S}_{a(\xi'')}} d\xi''' \, G[a(\xi'')|\xi'''] \, \psi_{\hat{S}_{a(\xi'')}}(\xi''')$
     $\quad + \cdots \Big\}.$

² To simplify notation, we suppress the dependence of $a(\xi')$ on $\xi$. In general, we have $a(\xi^{\prime\cdots\prime}) = a(\xi, \xi', \ldots, \xi^{\prime\cdots\prime})$.
³ Higher-order additive terms in (6) are generated with a simple recursive sequence: (i) copy the last known term, (ii) place a tilde over its rightmost integral limit, (iii) replace the electrode potential $\psi$ at the end of the term by one further integral of the form $\int_{\hat{S}} d\xi^{\prime\cdots\prime} \, G \, \psi_{\hat{S}}$, taken over the electrode part of the next maximal square and written in a new, additionally primed dummy variable.

FIG. 3. Examples of subsequent maximal squares (initial square shaded). Each boundary $S_{a(\xi^{\prime\cdots\prime})}$ can be decomposed into an electrode part $\hat{S}_{a(\xi^{\prime\cdots\prime})}$ and a non-electrode part $\tilde{S}_{a(\xi^{\prime\cdots\prime})}$. These squares are centered on previous square-boundary points (dark dots).

Expansion (6) depends only on electrode potentials. Grounding all electrodes
except the jth, which we set to some arbitrary voltage $v_j$, reduces (1) to

(7)

for $i = 1, \ldots, N$ ($i \neq j$). Since the $q_i$ of (6) are linear functions of $v_j$, $C_{ij}$ can be
written as an integral expansion independent of $v_j$. This argument is valid for any
possible $j$; thus, off-diagonal elements of the capacitance matrix are independent of
electrode voltages, and we have

$$
\begin{aligned}
C_{ij} = \oint_{\mathcal{G}_i} d\xi\, s_i(\xi) \Big\{
 &\int_{S^j_{a(\xi)}} d\xi'\, w(\xi|\xi')\, G[a(\xi)|\xi'] \\
 &+ \int_{\tilde{S}_{a(\xi)}} d\xi'\, w(\xi|\xi')\, G[a(\xi)|\xi'] \int_{S^j_{a(\xi')}} d\xi''\, G[a(\xi')|\xi''] \\
 &+ \int_{\tilde{S}_{a(\xi)}} d\xi'\, w(\xi|\xi')\, G[a(\xi)|\xi'] \int_{\tilde{S}_{a(\xi')}} d\xi''\, G[a(\xi')|\xi''] \int_{S^j_{a(\xi'')}} d\xi'''\, G[a(\xi'')|\xi'''] \\
 &+ \cdots \Big\}.
\end{aligned}
\tag{8}
$$

The boundary⁴ $S^j_{a(\xi^{\prime\cdots\prime})}$ is simply the portion of $S_{a(\xi^{\prime\cdots\prime})}$ coincident with electrode j.
3. Extraction algorithm and proof. In practice, direct evaluation of (8) is
computationally prohibitive, especially for integrated-circuit geometries where N is
usually large. We resolve this difficulty with a stochastic algorithm that estimates
the $C_{ij}$:
1. Partition each integration variable in (8) into small segments of, possibly different,
size $\Delta\xi_r^{\prime\cdots\prime}$ ($r = 1, 2, \ldots$). Define corresponding discrete variables at each
segment center $\xi_r^{\prime\cdots\prime}$.
2. Introduce new variables $N_i$ and $W_{ij}$, the meaning of which will be made clear
shortly. Initially, set $N_i = W_{ij} = 0$ for all $ij$ ($i, j = 1, \ldots, N$). Set $i = 1$.
3. Randomly pick a $\xi$, say $\xi_r$, on $\mathcal{G}_i$, with discrete probability distribution
$P_\xi(\xi_r) = \int_{\Delta\xi_r} d\xi\, s_i(\xi)$.
4. Randomly pick a $\xi'$, say $\xi'_r$, on $S_{a(\xi_r)}$, with discrete probability distribution,
conditioned by $\xi_r$, $P_{\xi'}(\xi_r|\xi'_r) = \int_{\Delta\xi'_r} d\xi'\, G[a(\xi_r)|\xi']$.
5. If the last variable picked in Step 4 is not on an electrode boundary, then change
Step 4 as follows: mark every occurrence of $\xi$ with an additional prime⁵ $'$,
and repeat Step 4.
6. If the last variable picked in Step 4 is on the jth electrode boundary, then replace
$N_i$ with $N_i + 1$ and $W_{ij}$ with $W_{ij} + w(\xi_r|\xi'_r)$.
7. If $N_i$ is sufficiently large, go to Step 8. Else, change Step 4 to its original form
(written here above) and go to Step 3.
8. $C_{ij} = W_{ij}/N_i$ for $j = 1, \ldots, N$. If $i = N$, then stop. Else, replace $i$ with $i + 1$,
change Step 4 to its original form (written here above), and go to Step 3.

4 We mean here the superscript $\prime\cdots\prime$ to possibly include the unprimed case. That is, $\xi^{\prime\cdots\prime}$ is one
of $\xi, \xi', \xi'', \ldots$.

FIG. 4. Examples of first-, second-, and third-order walks. For clarity, the walks have been drawn
to start at the same boundary-point $\xi_r$ of $\mathcal{G}_i$.
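The bookkeeping in Steps 2 through 8 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the callables `sample_start`, `sample_hop`, `electrode_at`, and `weight` are hypothetical stand-ins for sampling the start-point distribution, sampling the hop distribution on the maximal square, testing for an electrode hit, and evaluating the weight function $w$; none of them is specified in this form in the text. The one-dimensional toy geometry in the usage lines is likewise an assumption chosen only so that the code runs.

```python
import random

def estimate_row(n_walks, n_electrodes, sample_start, sample_hop,
                 electrode_at, weight, rng):
    """Estimate C_ij (j = 0..n_electrodes-1) for one starting surface G_i."""
    W = [0.0] * n_electrodes          # Step 2: weight sums W_ij
    N = 0                             # Step 2: walk counter N_i
    while N < n_walks:                # Step 7: stop once N_i is large enough
        xi = sample_start(rng)        # Step 3: pick a start point on G_i
        first = sample_hop(xi, rng)   # Step 4: first hop to a square boundary
        point = first
        while electrode_at(point) is None:   # Step 5: keep hopping
            point = sample_hop(point, rng)
        j = electrode_at(point)       # Step 6: the walk ended on electrode j
        N += 1
        W[j] += weight(xi, first)     # the weight uses the first hop only
    return [w_ij / N for w_ij in W]   # Step 8: C_ij = W_ij / N_i

# Toy usage (hypothetical geometry): walks start at 0 on the integer line,
# hop +/-1 with equal probability, and are absorbed at -2 (electrode 0)
# or +2 (electrode 1). With unit weights the estimates are hit fractions.
rng = random.Random(0)
row = estimate_row(
    n_walks=2000, n_electrodes=2,
    sample_start=lambda r: 0,
    sample_hop=lambda x, r: x + r.choice((-1, 1)),
    electrode_at=lambda x: {-2: 0, 2: 1}.get(x),
    weight=lambda xi, xi1: 1.0,
    rng=rng,
)
```

Passing the samplers in as functions keeps the accumulation logic separate from the geometry, which is where a real implementation would spend nearly all of its effort.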
We will now explain why this capacitance-extraction algorithm works. For a given
starting electrode i, enumerate a possible set of $N_i$ random trajectories, or "walks",
generated by the algorithm that start on $\mathcal{G}_i$ and end on any electrode. Figure 4 gives
examples of such first-order ($\xi_r \to \xi'_r$), second-order ($\xi_r \to \xi'_r \to \xi''_r$), and third-order
($\xi_r \to \xi'_r \to \xi''_r \to \xi'''_r$) walks.

5 This applies to all occurrences of $\xi$, regardless of subscripts and superscripts. For example,
$P_{\xi'}(\xi_r|\xi'_r) = \int_{\Delta\xi'_r} d\xi'\, G[a(\xi_r)|\xi'] \;\to\; P_{\xi''}(\xi'_r|\xi''_r) = \int_{\Delta\xi''_r} d\xi''\, G[a(\xi'_r)|\xi'']$. Once Step 4 has been
properly changed it remains so until otherwise stated.
Note also, in keeping with footnote 2, $S_{a(\xi_r^{\prime\cdots\prime})} = S_{a(\xi_r, \xi'_r, \ldots, \xi_r^{\prime\cdots\prime})}$ in Step 4, where $\xi_r, \xi'_r, \ldots, \xi_r^{\prime\cdots\prime}$
are the set of random picks before entering or re-entering Step 4.

Consider, for the moment, a subset of this enumeration consisting of first-order
walks ending on a specific electrode j. Of the M total walks starting from $\mathcal{G}_i$,
$M P_\xi(\xi_r)$ of them start at $\xi_r$. Of those, a fraction $P_{\xi'}(\xi_r|\xi'_r)$ end at $\xi'_r$ on $S^j_{a(\xi_r)}$.
Therefore, the total number of walks from $\xi_r$ to $\xi'_r$ is simply $M P_\xi(\xi_r) P_{\xi'}(\xi_r|\xi'_r)$. The
algorithm sums $w(\xi_r|\xi'_r)$ over all possible $\xi_r\xi'_r$-pairs and divides by M, giving

$$
C_{ij}^{(1)} \approx \frac{W_{ij}^{(1)}}{M} = \sum_{\mathcal{G}_i} P_\xi(\xi_r) \sum_{S^j_{a(\xi_r)}} w(\xi_r|\xi'_r)\, P_{\xi'}(\xi_r|\xi'_r).
\tag{9}
$$

The sums in (9) are to be taken over discrete points $\xi_r$ and $\xi'_r$ on their respective
surfaces $\mathcal{G}_i$ and $S^j_{a(\xi_r)}$. In addition, we have designated first-order-walk contributions
with the superscript '(1)'. Observe that (9) and the discussion preceding it are valid
for any starting Gaussian surface and ending electrode, that is, any ij-pair ($i \neq j$).
Remember, as well, that (9) is a good approximation to $C_{ij}^{(1)}$ when M is large, that is, large
enough to ensure that the known probability distributions $P_\xi$ and $P_{\xi'}$ adequately
represent the actual distributions in our enumeration of walks.
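A connection with (8) follows upon replacing $P_\xi$ and $P_{\xi'}$ in (9) with their integral
equivalents from Steps 3 and 4 of the algorithm. We get

$$
C_{ij}^{(1)} \approx \sum_{\mathcal{G}_i} \Big[ \int_{\Delta\xi_r} d\xi\, s_i(\xi) \Big] \sum_{S^j_{a(\xi_r)}} w(\xi_r|\xi'_r) \Big\{ \int_{\Delta\xi'_r} d\xi'\, G[a(\xi_r)|\xi'] \Big\}.
\tag{10}
$$

The $\Delta\xi'_r$ are assumed small enough so that $w(\xi_r|\xi'_r)$ varies little over their extent.
This permits us to change $\xi'_r$ to $\xi'$ in $w$ and to move $w$ within the rightmost integrand
of (10). Hence,

$$
C_{ij}^{(1)} \approx \sum_{\mathcal{G}_i} \Big[ \int_{\Delta\xi_r} d\xi\, s_i(\xi) \Big] \sum_{S^j_{a(\xi_r)}} \int_{\Delta\xi'_r} d\xi'\, w(\xi_r|\xi')\, G[a(\xi_r)|\xi'].
\tag{11}
$$

Now, evaluating the rightmost sum above, we find immediately

$$
C_{ij}^{(1)} \approx \sum_{\mathcal{G}_i} \Big[ \int_{\Delta\xi_r} d\xi\, s_i(\xi) \Big] \int_{S^j_{a(\xi_r)}} d\xi'\, w(\xi_r|\xi')\, G[a(\xi_r)|\xi'].
\tag{12}
$$

Lastly, if we assume the $\Delta\xi_r$ are small enough, so that the rightmost integral in (12)
varies little over their extent, we can change $\xi_r$ to $\xi$ in $w$, $G$, and $a$, and move the
integral within the leftmost integrand. Evaluating the remaining sum, as before, gives
our final result:

$$
C_{ij}^{(1)} \approx \oint_{\mathcal{G}_i} d\xi\, s_i(\xi) \int_{S^j_{a(\xi)}} d\xi'\, w(\xi|\xi')\, G[a(\xi)|\xi'].
\tag{13}
$$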

The expression above is a first-order approximation to $C_{ij}$, in other words, the
first term in expansion (8). The general proof for the nth term, $C_{ij}^{(n)}$, in (8) proceeds
along lines similar to (9)-(13) and is left to the reader.
The algorithm actually sums $w$ for walks of varying order n. Consequently, it
generates a statistical estimate of $C_{ij}$ by summing statistical estimates of $C_{ij}^{(n)}$ over
n. Mathematically, we have

(14)

where $W_{ij}^{(n)}$ is the sum of weight-functions $w$ for all nth-order random walks starting
on Gaussian boundary $\mathcal{G}_i$ and ending on electrode j.

Acknowledgement. The authors thank Professor James D. Meindl for sponsoring
this work and for helpful technical discussions. The authors also thank Professor
Alan L. McWhorter for reading the original manuscript and suggesting improvements.

REFERENCES

[1] P.M. Morse and H. Feshbach, Methods of Theoretical Physics, Part I, McGraw-Hill, New York,
1953.
[2] A.H. Zemanian, "A Finite-Difference Procedure for the Exterior Problem Inherent in Capacitance
Computations for VLSI Interconnections", IEEE Trans. Electron Devices, vol. 35,
pp. 985-991, 1988.
[3] P.E. Cottrell and E.M. Buturla, "VLSI Wiring Capacitance", IBM J. Res. Develop., vol. 29,
pp. 277-288, 1985.
[4] F.S. Lai, "Coupling Capacitances in VLSI Circuits Calculated by Multi-Dimensional Discrete
Fourier Series", vol. 32, pp. 141-148, 1989.
[5] A.E. Ruehli and P.A. Brennan, "Efficient Capacitance Calculations for Three-Dimensional
Multiconductor Systems", IEEE Trans. Microwave Theory Tech., vol. MTT-21, pp. 76-82,
1973.
[6] G.M. Brown, "Monte Carlo Methods" in Modern Mathematics for Engineers, E.F. Beckenbach,
editor, McGraw-Hill, New York, 1956.
[7] A. Haji-Sheikh and E.M. Sparrow, "The Solution of Heat Conduction Problems by Probability
Methods", Trans. ASME, vol. C-89, pp. 121-131, 1967. (See, in particular, the section
"Authors Closure", and references therein.)
MOMENT-MATCHING APPROXIMATIONS FOR
LINEAR(IZED) CIRCUIT ANALYSIS*

NANDA GOPAL, ASHOK BALIVADA AND LAWRENCE T. PILLAGE†

Abstract. Moment-matching approximations appear to be a promising approach for linear
circuit analysis in several application areas. Asymptotic Waveform Evaluation (AWE) uses
moment-matching to approximate the time- or frequency-domain circuit response in terms of a
reduced-order model. AWE has been demonstrated as an efficient means for solving large, stiff, linear(ized)
circuits, in particular, the large RC- and RLC-circuit models which characterize high-speed VLSI
interconnect. However, since it is based upon moment-matching, which has been shown to be
equivalent to a Padé approximation in some cases, AWE is prone to yielding unstable waveform
approximations for stable circuits. In addition, it is difficult to quantify the time-domain error
for moment-matching approximations. We address the issues of stability and accuracy of
moment-matching approximations as they apply to linear circuit analysis.

1. Introduction. Moment-matching approximations appear to be a promising
approach for linear circuit analysis in several application areas. Asymptotic Waveform
Evaluation (AWE) [20] uses moment-matching to approximate the time- or frequency-domain
response of an nth-order circuit in terms of a reduced qth-order model. For
large, linear RLC circuits with thousands of poles, the response at any node tends
to be dominated by only a few of the poles; therefore, excellent approximations are
possible for $q \ll n$. The dominant poles also tend to be the lower-frequency poles
(poles closer to the origin), which contain most of the signal energy. The effects due
to higher-frequency poles, which influence the response only for a very short time and
contain little energy [7], are averaged into the waveform approximation. The order of
the AWE approximation, that is, the number of dominant poles in the approximation,
determines the overall waveform accuracy.
AWE has been demonstrated as an efficient means for solving large, stiff, linear
circuits, in particular, the large RC- and RLC-interconnect circuit models that are
difficult to evaluate using traditional circuit simulation algorithms. However, since
it is based upon moment-matching, which has been shown to be equivalent to a
Padé approximation in some cases, AWE is prone to yielding unstable waveform
approximations for stable circuits. In addition, it is difficult to quantify the
time-domain error, which makes it difficult to select an appropriate order for the
moment-matching approximation. In this paper we consider these stability and accuracy issues
as they apply to linear circuit analysis, and we propose some techniques for addressing
these problems for linear, passive RLC circuits.

2. Moment-matching methods. The use of moments in the simplification of
high-order systems is a well-known technique [3, 9, 26]. Moment values of actual,
physical, linear systems can be obtained from experimental data, calculated from an
exact transfer function expression, or measured from a model of the system. From 2q
moment values a qth-order model can be uniquely specified.

* This work was supported by the National Science Foundation under the grant MIP #9007917.
† Computer Engineering Research Center, Department of Electrical & Computer Engineering,
The University of Texas at Austin, Austin, Texas 78712

Model-order reduction via moment-matching is motivated by the characteristics
we seek in a good system model approximation. For an asymptotically stable system
with a proper, rational, transfer function $H(s)$, or impulse response $h(t)$, Zakian [26]
defines a "good" system model, $\hat{H}(s)$ or $\hat{h}(t)$, as one for which:
1. $[h(t) - \hat{h}(t)]$ converges rapidly to 0 as $t \to \infty$
2. $h(0) - \hat{h}(0) = 0$
3. $\max_{0 < t \le \infty} |h(t) - \hat{h}(t)| \le K > 0$
where K is a constant which is application dependent.
Condition 2 is easily satisfied by forcing the Initial Value Theorem to apply. To
ensure Condition 1, one potential criterion is:

$$
\int_0^\infty t^j \big[h(t) - \hat{h}(t)\big]\, dt = 0, \qquad j = 0, 1, 2, \ldots, (m + n)
\tag{1}
$$

where m and n are respectively the degrees of the numerator and denominator of
$\hat{H}(s)$. Equation (1) states that the first (m + n + 1) moments of $h$
and $\hat{h}$ are equal. Finally, it can be shown that Condition 3 is satisfied for sufficiently
large values of (m + n) [26].
The existence of the moments of the actual system function $h(t)$ is ensured if $h(t)$ is
piecewise continuous in $[0, \infty)$ and of exponential order $O[\exp(\sigma t)]$, $t \to \infty$, $\sigma < 0$ [26].
For passive, linear RLC circuits, which are asymptotically stable, the responses are
smooth and piecewise continuous in $[0, \infty)$, thus satisfying both the above requirements.
The preceding discussion developed a reduced-order model $\hat{H}(s)$ that is termed a
moment-approximant. The Padé approximants are a similar class of approximating
functions that are related to moment-approximants. A Padé approximant, denoted
$[P/Q]$, is a rational function approximation of a transfer function $H(s)$, analytic about
$s = 0$, such that the first $(P + Q + 1)$ coefficients of the Maclaurin expansions of $[P/Q]$
and $H(s)$ are equal [1, 2]. In the above definition, P and Q refer to the degrees of
the numerator and denominator polynomials, respectively, in the Padé approximant.
To establish the relation between the moment- and the Padé-approximants, consider
the Laplace transform definition of an analytic function $h(t)$:

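$$
H(s) = \int_0^\infty h(t)\, e^{-st}\, dt.
\tag{4}
$$

Expanding $e^{-st}$ in a Maclaurin series yields:

$$
H(s) = \int_0^\infty h(t) \Big[ 1 - st + \frac{s^2 t^2}{2!} - \cdots \Big]\, dt
\tag{5}
$$

$$
\phantom{H(s)} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}\, s^k \int_0^\infty t^k h(t)\, dt.
\tag{6}
$$

1 If, for any function $f(t)$, the integral

$$
\int_0^\infty f(t)\, dt
\tag{2}
$$

exists, the nth moment $M_n$ of the function about the origin is defined as [9]

$$
M_n = \int_0^\infty t^n f(t)\, dt.
\tag{3}
$$

It is shown in [9, 17] that the normalized moments $M_i/M_0$ are analogous to the mean of a probability
distribution function.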

In other words, the time moments of a function $h(t)$ are related to the coefficients
of the Maclaurin series expansion of its transform $H(s)$. The following theorem by Zakian [26]
explicitly defines the relation between the moment- and the Padé-approximants:
THEOREM 2.1. Let $h$ be piecewise continuous on $[0, \infty)$ and of exponential order
$O[\exp(\sigma t)]$, $t \to \infty$, $\sigma < 0$, and let the Laplace transform $\mathcal{L}\{\hat{h}\}$ be an asymptotically
stable $(m/n)$ rational function; then $\hat{h}$ is the $(m + n)$ moment-approximant of $h$ if and
only if $\mathcal{L}\{\hat{h}\}$ is the Padé approximant $[m/n]$ of $\mathcal{L}\{h\}$.
The reference to $\mathcal{L}\{\hat{h}\}$ being asymptotically stable is an important one, and will be
addressed in more detail in a later section on stability.
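The link between time moments and Maclaurin coefficients can be checked numerically. The sketch below is illustrative only: the two-pole impulse response (poles $-1$, $-3$ and residues $2$, $-1$) is a hypothetical example, not a circuit from this paper, and the trapezoidal integration limits are arbitrary choices. It compares $(-1)^j/j!$ times the jth time moment of $h(t)$ against the jth Maclaurin coefficient of $H(s) = \sum_l k_l/(s - p_l)$, which is $-\sum_l k_l/p_l^{\,j+1}$.

```python
import math

# Hypothetical stable two-pole impulse response h(t) = sum_l k_l * exp(p_l * t).
poles = [-1.0, -3.0]
residues = [2.0, -1.0]

def h(t):
    return sum(k * math.exp(p * t) for k, p in zip(residues, poles))

def time_moment(j, t_end=60.0, steps=100_000):
    """Trapezoidal estimate of the integral of t**j * h(t) over [0, t_end]."""
    dt = t_end / steps
    total = 0.0
    for n in range(steps + 1):
        t = n * dt
        f = (t ** j) * h(t)
        total += f if 0 < n < steps else 0.5 * f   # half weight at the ends
    return total * dt

def maclaurin_coeff(j):
    """j-th Maclaurin coefficient of H(s) = sum_l k_l / (s - p_l)."""
    return -sum(k / p ** (j + 1) for k, p in zip(residues, poles))

# The coefficient of s**j equals (-1)**j / j! times the j-th time moment.
pairs = [((-1) ** j / math.factorial(j) * time_moment(j), maclaurin_coeff(j))
         for j in range(4)]
```

Each pair in `pairs` should agree to within the integration error, which is the sense in which matching moments and matching series coefficients are the same requirement.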

3. Asymptotic waveform evaluation. Asymptotic Waveform Evaluation (AWE)
is a generalized approach to approximating the waveform response of linear(ized) RLC
circuits via moment-matching [19, 20]. AWE is most conveniently explained in terms
of the differential state equations for a lumped, linear, time-invariant (LLTI) circuit:

(7)

where $x$ is the n-dimensional state vector and $u$ is an m-dimensional excitation vector
of impulses. Such a circuit description can be found for most LLTI circuits. Modeling
the state variables permits the modeling of any output variable as a linear combination
of these state variables. While the following development can be applied to other
excitation forms such as step or ramp voltages, we consider only impulse excitations
since from them, all other responses can be obtained by analytical convolution and
superposition.
The Laplace transform solution of Eq. (7) is

(8)

which can be expanded into a Maclaurin series

(9)

Focusing on a specific component of $X(s)$, say the ith, the coefficients of the series
expansion can be denoted as:

$$
\begin{aligned}
m_0^i &= [-A^{-1} b]_i \\
m_1^i &= [-A^{-2} b]_i \\
&\;\;\vdots \\
m_{2q-1}^i &= [-A^{-2q} b]_i
\end{aligned}
\tag{10}
$$

where $m_j^i$ denotes the jth coefficient in the series expansion of the ith state variable.

The efficiency of AWE lies in the recursive computation of these coefficients $m_j^i$.
As explained in [20], the explicit construction/inversion of the state matrix A is
not required. Instead, finding $A^{-1}$ from Eq. (7) is equivalent to solving for the port
voltages of the open-circuit capacitance ports and port currents of the short-circuit
inductance ports [6]. To illustrate, consider the circuit in Fig. 1(a). The response
coefficients can be obtained from the circuit in Fig. 1(b), where all the capacitors have
been replaced by current sources, and inductors by voltage sources.

FIG. 1. Computing the (j + 1)th set of response coefficients in AWE: (a) original circuit; (b) circuit
with capacitors replaced by current sources and inductors replaced by voltage sources.

The recursion in Eq. (10) is initiated by replacing the source V in Fig. 1(b) by a
constant voltage of value 1, and setting all the capacitor and inductor sources to zero.
Solving this dc circuit for the capacitor voltages and inductor currents yields the first
coefficient $m_0$ for each state variable. This is equivalent to substituting $u = 1$, $\dot{x} = 0$
in Eq. (7), and solving for $x$. This yields $x = -A^{-1}b$, which, from Eq. (10), is the first
coefficient $m_0$.
Higher-order coefficients $m_j$ are then recursively obtained by shorting the excitation
source V, setting the capacitor-current sources equal to $-C_i m_{j-1}^i$, the
inductor-voltage sources equal to $-L_i m_{j-1}^i$, and solving for the port voltages and currents. This
is again equivalent to setting $u = 0$ in Eq. (7), $\dot{x} = x_{j-1}$, and solving for $x_j$ [13, 20].
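In matrix terms the recursion amounts to repeated solves with the same matrix: $x_0 = -A^{-1}b$ and then $A x_j = x_{j-1}$. The sketch below illustrates this under stated assumptions: the state equations are taken to have the form $\dot{x} = Ax + bu$, the two-section RC ladder with unit resistors and capacitors is a hypothetical example, and a small dense Gaussian-elimination routine stands in for the dc circuit solves that AWE actually performs (AWE avoids forming $A$ explicitly).

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[r]] for r, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                       # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def moments(A, b, count):
    """Return [x_0, x_1, ...] with x_0 = -A^{-1} b and A x_j = x_{j-1}."""
    xs = [solve(A, [-v for v in b])]
    for _ in range(count - 1):
        xs.append(solve(A, xs[-1]))
    return xs

# Hypothetical two-section RC ladder, R1 = R2 = 1 and C1 = C2 = 1,
# with the two node voltages as states in dx/dt = A x + b u.
A = [[-2.0, 1.0], [1.0, -1.0]]
b = [1.0, 0.0]
m = moments(A, b, 4)
```

For this ladder the first two vectors are the dc gains and the negated Elmore delays of the two nodes, which gives a quick sanity check on the signs.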

In AWE, the reduced qth-order model of the ith state variable has the form:

$$
[\hat{X}(s)]_i = \sum_{l=1}^{q} \frac{k_l^i}{s - p_l^i}
\tag{11}
$$

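where the terms $p_l^i$ are the $q$ unique dominant-pole approximations and the terms $k_l^i$
are the corresponding residues. The values of $p_l^i$ and $k_l^i$ are computed such that the
model in Eq. (11) best approximates the actual response in Eq. (9) in the sense of the
Padé approximation:

$$
m_0^i + m_1^i s + \cdots + m_{2q-1}^i s^{2q-1} = \frac{c_0 + c_1 s + \cdots + c_{q-1} s^{q-1}}{1 + a_1 s + \cdots + a_q s^q}.
\tag{12}
$$

Cross-multiplying and equating the coefficients of $s^q, s^{q+1}, \ldots, s^{2q-1}$ yields the following
set of linear equations for the denominator coefficients of Eq. (12) [13, 20]:

$$
\begin{bmatrix}
m_0^i & m_1^i & \cdots & m_{q-1}^i \\
m_1^i & m_2^i & \cdots & m_q^i \\
\vdots & \vdots & & \vdots \\
m_{q-1}^i & m_q^i & \cdots & m_{2q-2}^i
\end{bmatrix}
\begin{bmatrix} a_q \\ a_{q-1} \\ \vdots \\ a_1 \end{bmatrix}
= -
\begin{bmatrix} m_q^i \\ m_{q+1}^i \\ \vdots \\ m_{2q-1}^i \end{bmatrix}.
\tag{13}
$$

The roots of the characteristic polynomial

$$
1 + a_1 p + a_2 p^2 + \cdots + a_q p^q = 0
\tag{14}
$$

are the dominant pole approximations.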
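For q = 2 the whole pole computation fits in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: the model denominator is assumed normalized as $1 + a_1 s + a_2 s^2$, the 2x2 Hankel system is solved by Cramer's rule, and complex pole pairs are not handled. The moment values are those of a hypothetical two-section unit RC ladder, not data from this paper.

```python
import math

def awe_poles_q2(m):
    """Second-order pole estimates from the moments m[0..3].

    Assumes the denominator 1 + a1*s + a2*s**2. Matching the coefficients
    of s**2 and s**3 gives the 2x2 Hankel system
        [m0 m1] [a2]     [m2]
        [m1 m2] [a1] = - [m3]
    """
    m0, m1, m2, m3 = m
    det = m0 * m2 - m1 * m1
    a2 = (-m2 * m2 + m1 * m3) / det      # Cramer's rule, first unknown
    a1 = (-m0 * m3 + m1 * m2) / det      # Cramer's rule, second unknown
    disc = a1 * a1 - 4.0 * a2
    if disc < 0:
        raise ValueError("complex pole pair; not handled in this sketch")
    root = math.sqrt(disc)
    # Roots of a2*s**2 + a1*s + 1 = 0, listed closest to the origin first.
    return sorted([(-a1 + root) / (2.0 * a2), (-a1 - root) / (2.0 * a2)],
                  reverse=True)

# Moments of the far node of a hypothetical two-section unit RC ladder.
poles_q2 = awe_poles_q2([1.0, -3.0, 8.0, -21.0])
```

Because the example system has exactly two poles, the second-order model recovers them exactly; for a larger circuit the same call would return approximations to the dominant pair.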
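To solve for the corresponding residues $k_l^i$, the first q coefficients of the s terms in
the expansion of Eq. (11) are matched to those of Eq. (9) to obtain the system:

$$
\begin{aligned}
-\Big( \frac{k_1^i}{p_1^i} + \frac{k_2^i}{p_2^i} + \cdots + \frac{k_q^i}{p_q^i} \Big) &= m_0^i \\
&\;\;\vdots \\
-\Big( \frac{k_1^i}{(p_1^i)^q} + \frac{k_2^i}{(p_2^i)^q} + \cdots + \frac{k_q^i}{(p_q^i)^q} \Big) &= m_{q-1}^i
\end{aligned}
\tag{15}
$$

This may be rewritten in matrix form [20] as

$$
\mathbf{k}^i = -V^{-1} \mathbf{m}_L^i
\tag{16}
$$

where $\mathbf{m}_L^i$ is a vector of the low-order coefficients, $(m_0^i, m_1^i, \ldots, m_{q-1}^i)^T$, and $V$ is the
matrix

$$
V = \begin{bmatrix}
(p_1^i)^{-1} & (p_2^i)^{-1} & \cdots & (p_q^i)^{-1} \\
(p_1^i)^{-2} & (p_2^i)^{-2} & \cdots & (p_q^i)^{-2} \\
\vdots & \vdots & & \vdots \\
(p_1^i)^{-q} & (p_2^i)^{-q} & \cdots & (p_q^i)^{-q}
\end{bmatrix}.
$$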
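The residue step is a small linear solve once the poles are known. As a hedged illustration for q = 2, the sketch below solves $V\mathbf{k} = -\mathbf{m}_L$ by Cramer's rule, using the fact that each term $k/(s - p)$ contributes $-k/p^{\,j+1}$ to the coefficient of $s^j$. The function name and the pole and moment values are hypothetical (a two-section unit RC ladder), not data from this paper.

```python
def awe_residues_q2(poles, m_low):
    """Solve -(k1/p1**(j+1) + k2/p2**(j+1)) = m_j for j = 0, 1."""
    p1, p2 = poles
    m0, m1 = m_low
    # V k = -m_low with V = [[1/p1, 1/p2], [1/p1**2, 1/p2**2]]
    v11, v12 = 1.0 / p1, 1.0 / p2
    v21, v22 = v11 * v11, v12 * v12
    det = v11 * v22 - v12 * v21
    k1 = (-m0 * v22 + m1 * v12) / det    # Cramer's rule, first unknown
    k2 = (-m1 * v11 + m0 * v21) / det    # Cramer's rule, second unknown
    return [k1, k2]

# Poles of the hypothetical ladder (roots of s**2 + 3*s + 1) and its
# two low-order moments at the far node.
p = [(-3.0 + 5.0 ** 0.5) / 2.0, (-3.0 - 5.0 ** 0.5) / 2.0]
k = awe_residues_q2(p, [1.0, -3.0])
```

A quick consistency check is that the residues reproduce the dc value: $-(k_1/p_1 + k_2/p_2)$ should equal the zeroth moment.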

4. Order of approximation. The automatic selection of an appropriate order

of approximation q is a very hard problem, and may be influenced by several factors:
the required approximation accuracy, circuit behavior, numerical precision, and signal
bandwidth. Previous work suggested a stopping criterion based on the results of
successive orders of approximation [19]. A normalized root-mean-square error was
measured between the qth and (q + 1)th orders of approximation; when this error
decreased below a particular value, the approximation process was stopped.
However, it would be more prudent to use a stopping criterion based on a combi-
nation of several of the factors mentioned above. In [13], Huang showed that as the
order of approximation was increased, the model poles converged to actual system
poles. For a particular signal bandwidth fmax, it would therefore seem necessary to
increase the order of approximation until those poles corresponding to all frequencies
below and near fmax have converged.
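One way to mechanize this convergence test is to compare the pole sets of successive orders. The sketch below is an assumption-laden illustration: the 1% relative tolerance, the function name, and the bandwidth value are all hypothetical choices, and only the pole values themselves are taken from Table 1 (orders 4 and 5).

```python
def converged_poles(prev, curr, rel_tol=0.01):
    """Poles in `curr` that lie within rel_tol of some pole in `prev`."""
    kept = []
    for p in curr:
        nearest = min(prev, key=lambda q: abs(q - p))
        if abs(nearest - p) <= rel_tol * abs(p):
            kept.append(p)
    return kept

# Pole estimates for orders 4 and 5, taken from Table 1.
order4 = [-6.17363e7, -5.57263e8, -1.41178e9, -3.86961e9]
order5 = [-6.17363e7, -5.57018e8, -1.54068e9, -2.66287e9, -1.41686e11]
stable_set = converged_poles(order4, order5)

f_max = 5e8  # hypothetical signal bandwidth in rad/s
# Every pole below the bandwidth must belong to the converged set.
covered = all(abs(p) >= f_max or p in stable_set for p in order5)
```

With these inputs only the two lowest-frequency poles pass the tolerance, which matches the qualitative reading of Table 1 that the minimum poles settle first.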
For signals with very small rise-times, this approach might lead to excessively high
orders of approximation that might not be realizable by a stable model, given the finite
numerical precision used. This effect of numerical precision refers to the increasing
sensitivity of the model simplification scheme to errors in the series coefficient values
as the order of approximation is increased. By examining the coefficient values, a
practical limit on the highest order possible with a given set of coefficients, can be
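derived as shown below. The power series representation of $[X(s)]_i$, from Eqs. (8)-(10),
is

$$
[X(s)]_i = m_0^i + m_1^i s + m_2^i s^2 + \cdots
\tag{17}
$$

The Hankel matrix representing this power series for the first $(2q - 1)$ coefficients is

$$
H^{(0)} =
\begin{bmatrix}
m_0^i & m_1^i & \cdots & m_{q-1}^i \\
m_1^i & m_2^i & \cdots & m_q^i \\
\vdots & \vdots & & \vdots \\
m_{q-1}^i & m_q^i & \cdots & m_{2q-2}^i
\end{bmatrix}.
\tag{18}
$$
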
This is also recognized to be the matrix in Eq. (13) for the roots of the characteristic
polynomial.
From [12, 5], it is seen that the degree of a proper rational function is equal
to the rank of the Hankel matrix representing its power series expansion. In the
development of $[\hat{X}(s)]_i$, which is indeed a proper rational-function approximation,
the degree would represent the order of approximation that is sought, i.e., the degree
of $[\hat{X}(s)]_i$. Further, since, by assumption, the series expansions of the actual response
and the reduced-order model are to agree at least as far as the first 2q coefficients,
Equation (18) would represent the series expansion of $[\hat{X}(s)]_i$ as well. Hence, an upper
limit on the order of approximation would be $q \le \rho(H^{(0)})$, where $\rho$ denotes the rank.
Attempting to obtain an order of approximation higher than this limit would result
in the truncation noise being manifested as unstable poles or poles with relatively
insignificant residues. (A similar approach was also suggested to us by Pak Chan [4].)
Thus, it might not be possible to fit the signal bandwidth with a set of poles
spanning that bandwidth. Rather, the order of approximation is increased until
either the bandwidth requirement is satisfied, or there is amplification of numerical
noise in the form of poles with vanishing residues.
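The rank limit can be illustrated with a small numerical experiment. The sketch below is only an illustration under stated assumptions: rank is estimated by Gaussian elimination with a relative pivot threshold (a crude stand-in for a singular value decomposition), the threshold value is arbitrary, and the moments are those of a hypothetical two-pole system rather than a circuit from this paper.

```python
def hankel(m, q):
    """q x q Hankel matrix built from m[0 .. 2q-2]."""
    return [[m[r + c] for c in range(q)] for r in range(q)]

def numerical_rank(M, rel_tol=1e-9):
    """Rank by elimination; pivots below rel_tol * max|entry| count as zero."""
    A = [row[:] for row in M]
    n = len(A)
    scale = max(abs(v) for row in A for v in row) or 1.0
    rank = 0
    for c in range(n):
        if rank == n:
            break
        piv = max(range(rank, n), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) <= rel_tol * scale:
            continue                      # no usable pivot in this column
        A[rank], A[piv] = A[piv], A[rank]
        for r in range(rank + 1, n):
            f = A[r][c] / A[rank][c]
            for k in range(c, n):
                A[r][k] -= f * A[rank][k]
        rank += 1
    return rank

# Moments of a hypothetical two-pole system; they satisfy
# m[j+2] + 3*m[j+1] + m[j] = 0, so no Hankel matrix exceeds rank 2.
m = [1.0, -3.0, 8.0, -21.0, 55.0]
rank_q2 = numerical_rank(hankel(m, 2))
rank_q3 = numerical_rank(hankel(m, 3))
```

Asking for a third-order model here would exceed the rank, which is the situation the text associates with spurious poles.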

As an example, consider the 4000-node RC-tree in Fig. 2. The approximate dominant
poles and residues of the response at the output node are shown in Table 1 for
increasing orders of approximation. To obtain an approximation for a signal bandwidth
of, say, 5e+8 rad/s, a 3rd-order approximation would be sufficient, since
all the poles below that frequency have converged and do not shift appreciably at
higher orders. However, for signal frequencies much greater than 1e+11, attempting
to obtain models of order greater than 5 would yield poles with relatively insignificant
residues that do not influence the response. These poles would represent the
magnification of numerical noise and occur at random locations, as illustrated by the
"noise" poles in the 6th- and 7th-order approximations in Table 1.

FIG. 2. 4000-node RC-tree circuit.

TABLE 1
Poles and residues at output of 4000-node RC tree for increasing orders of approximation.

Order   Poles           Residues

1       -5.00467e+07    -1.00000e+00
2       -6.17908e+07    -1.27681e+00
        -4.05788e+08     2.76814e-01
3       -6.17363e+07    -1.27303e+00
        -5.75856e+08     5.03070e-01
        -9.80745e+08    -2.30044e-01
4       -6.17363e+07    -1.27303e+00
        -5.57263e+08     4.25811e-01
        -1.41178e+09    -1.90165e-01
        -3.86961e+09     3.73820e-02
5       -6.17363e+07    -1.27303e+00
        -5.57018e+08     4.24933e-01
        -1.54068e+09    -2.52229e-01
        -2.66287e+09     1.06412e-01
        -1.41686e+11    -6.08759e-03
6       -6.17363e+07    -1.27303e+00
        -5.57017e+08     4.24892e-01
        -1.54286e+09    -2.51642e-01
        -2.70264e+09     1.06551e-01
        -1.45949e+11    -6.77357e-03
        -5.73723e+07    -9.54016e-09
7       -6.17363e+07    -1.27303e+00
        -5.57110e+08     4.25406e-01
        -1.50857e+09    -2.38670e-01
        -2.68746e+09     9.06235e-02
        -1.45735e+11    -4.33107e-03
        -5.37956e+07     3.19245e-10
        -1.62724e+07    -5.20880e-17

The next section introduces concerns of instability that are inherent to moment-matching
and Padé approximation techniques, in particular. These concerns play an
important part in the actual approximation process.

5. Moment-matching instability. Theorem 2.1 requires the asymptotic stability
of the Padé approximant $\mathcal{L}\{\hat{h}\}$. However, obtaining the model poles from
Eqs. (12)-(14) may yield unstable models of systems that are asymptotically stable. In
the case of stable, linear(ized) RLC circuits, this instability manifests itself in the
form of unstable model poles, or poles in the right half of the s-plane. This is due
to an inherent instability problem associated with the Padé approximation [3] and
moment-matching methods in general.
As explained by Huang [13], the Zinn-Justin theorem [27] on the convergence of
the Padé approximants demonstrates the uniform convergence of the diagonal Padé
sequence to the actual system function, except in exceptional areas of the complex
plane. These exceptional areas can be made arbitrarily small by increasing the order of
approximation. This leads to the implication that those model poles that correspond
to actual system poles appear repeatedly in increasing orders of approximation [13].
This phenomenon is seen in the results in Table 1, where the convergence
of the minimum poles is clearly observed.
The model poles that correspond to actual system poles are easily distinguished by
their repeated occurrence and significant residues. However, those model poles that
do not correspond to any system pole, termed defective poles [13], are distinguished
by their insignificant residues and random occurrence. These defective poles may
occur at any order of approximation due to their random nature, and may even cause
the model to become unstable.
One of the reasons for the occurrence of these defective poles is the extreme
sensitivity of the Padé approximation technique to errors in the coefficient values.
Errors are due to truncation/rounding when working with finite-precision machines.
Table 2 demonstrates this extreme sensitivity for the response at node 2000 of a
4000-node RC-tree. A change of 1e-57 in the value of the 8th moment causes a stable
fourth-order approximation to become unstable.
In addition to numerical noise, certain system-pole patterns resist approximation
at certain orders. Using another example from [13], the pole-pattern in Fig. 3 depicts
a system function with several complex pole-pairs and numerous other insignificant
poles. Attempting an approximation with an odd number of model poles will yield a
model pole that does not correspond to any system pole, and may result in instability.
Similarly, the locations of the system zeros play a significant part in the stability of
the reduced-order model. This can be illustrated using the contrived 4-pole system in
Table 3. For this example, the system poles were fixed while the locations of the zeros
were varied, as in [13], and the stability of the resultant models at different orders
of approximation was observed. A better understanding of all the aforementioned
effects would greatly aid in the development of techniques to improve the reliability
of moment-matching methods.

TABLE 2
Illustration of sensitivity of moment-matching scheme to noise in the coefficient values of response
at node 2000 of 4000-node RC tree (perturbed coefficient shown in brackets).

Actual coefficients           poles           residues

 1.000000000000000e+00        -6.17363e+07    -8.99251e-01
-1.497267448118228e-08        -5.56846e+08    -3.01767e-01
 2.368287909248411e-16        -1.66283e+09     2.32082e-01
-3.823417609189066e-24        -7.18473e+09    -3.10641e-02
 6.190703522181460e-32
-1.002720592080674e-39
 1.624190893290490e-47
[-2.630850265524870e-55]

Perturbed coefficients        poles           residues

 1.000000000000000e+00        -6.17363e+07    -8.99254e-01
-1.497267448118228e-08        -5.72767e+08    -3.39101e-01
 2.368287909248411e-16        -1.28563e+09     2.38355e-01
-3.823417609189066e-24         3.73120e+04    -7.39886e-21
 6.190703522181460e-32
-1.002720592080674e-39
 1.624190893290490e-47
[-2.629163340894621e-55]

FIG. 3. Pole-pattern of an artificial system function.

TABLE 3
Effect of the location of system zeros on the stability of the reduced-order models of a 4-pole, 3-zero
function.

         System poles        -1       -10       -15       -100

Case 1   System zeros        -2       -20       -40
         O(1)-model poles    -1.66
         O(2)-model poles    -1.04    -22.31
         O(3)-model poles    -1.00    -9.23     -158.87
Case 2   System zeros        -2       -13       +6
         O(1)-model poles    -1.30
         O(2)-model poles    -1.16    +895.36
         O(3)-model poles    -1.00    -11.08    -104.44
Case 3   System zeros        +19      +8        -6
         O(1)-model poles    -0.84
         O(2)-model poles    -1.00    +84.12
         O(3)-model poles    -1.00    +3.61     +7.4-63
The problem of instability associated with moment-matching methods has been
the focus of numerous papers in several branches of engineering. An approach sug-
gested by Brown [3] to remedy the instability of a reduced-order model of a stable
system was to increase the order of approximation, i.e., increase the number of dom-
inant poles in the model. An auxiliary performance criterion was also proposed for
computing the additional parameters of the model in the event of the non-availability
of the extra system-moments required. However, caution must be exercised not to
exceed the order of the actual system itself.
In contrast, Zakian [26] proposes successively decreasing the order of approxi-
mation until a stable model is obtained. While a stable model may be eventually
achieved, the order at which stability is achieved may be too low for the application,
resulting in loss of accuracy. However, this method is also not fool-proof and it is
possible to obtain transfer functions with unstable 1st-order models [13].
Besides the above two, numerous other techniques have been proposed for ensuring
model-stability in moment-matching approximations [8, 7, 9, 15, 18, 24, 25]. However,
most of these techniques are unsuitable for circuit analysis. In the following section,
some recent approaches to overcoming the instability problem as applied to electrical
networks are described.

6. Minimizing instability. While the inherent instability of the Padé approximation
is difficult to detect or remedy without a priori knowledge of the actual
transfer function, numerical instability can be minimized. One of the most apparent
problems stems from the rapid divergence of the coefficients of the series expansion
of the actual response, as seen in Table 2. Such widely varying magnitudes of the
coefficients may cause the matrix in Eq. (13) to tend to singularity at very low orders
of approximation.

This ill-conditioning of the coefficient matrix can be minimized by employing
frequency scaling of the coefficient values. The normalized coefficients are used to
find a normalized solution which can be scaled back to obtain the desired values. In
AWE, the scale factor selected, with respect to the impulse-response coefficients $m$,
is:

$$
\gamma = \frac{m_0}{m_1}.
\tag{19}
$$

The normalization is achieved by scaling the jth impulse-response coefficient by $\gamma^j$.
The scaled coefficients are used to find a set of normalized poles and residues. The
desired values are then recovered by scaling back the poles by $\gamma$. An example of this
scaling technique is shown in Table 4.
TABLE 4
Frequency scaling of coefficient values of response at output of 4000-node RC-tree driven by a 5V
step excitation.

coefficient   unscaled                    scaled

m0             5.000000000000000e+00       5.000000000000000e+00
m1            -9.990664920421377e-08      -5.000000000000000e+00
m2             1.663647067197849e-15       4.166893693654019e+00
m3            -2.703917792898261e-23      -3.389380423777905e+00
m4             4.381506365320577e-31       2.748690814396173e+00

Scale factor γ: 5.004671901046116e+07
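The scaling in Table 4 is easy to reproduce. In the sketch below, the unscaled coefficients are copied from Table 4; the function names are hypothetical, and taking the magnitude of the ratio $m_0/m_1$ is an assumption made here so that the factor matches the positive value listed in the table.

```python
def scale_moments(m):
    """Scale the j-th coefficient by gamma**j, with gamma = |m0 / m1|."""
    gamma = abs(m[0] / m[1])   # magnitude is an assumption; see lead-in
    return gamma, [mj * gamma ** j for j, mj in enumerate(m)]

def unscale_poles(normalized_poles, gamma):
    """Recover the actual poles from poles found with scaled coefficients."""
    return [p * gamma for p in normalized_poles]

# Unscaled coefficients from Table 4.
m = [5.000000000000000e+00, -9.990664920421377e-08, 1.663647067197849e-15,
     -2.703917792898261e-23, 4.381506365320577e-31]
gamma, scaled = scale_moments(m)
```

After scaling, all five coefficients are of order one, so the Hankel matrix built from them no longer mixes entries that differ by dozens of orders of magnitude.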

However, in some cases, using the scaled coefficient values may still result in
the near-singularity of the Hankel matrix in Eq. (18). This was demonstrated by
Huang [13], who showed that the higher-order coefficients are increasingly influenced
by the minimal poles. Further, the contribution of the high-frequency poles decreases
with the ratio of the magnitudes of the low- and high-frequency poles. The effect of
scaling is not evinced here, since the ratio of the magnitudes of the poles is unaffected.
To overcome this, a method of frequency shifting, termed uniform predistortion,
was suggested in [13]. As compared to frequency scaling, where the energy-storage
elements are scaled, frequency shifting involves adding proportional resistors in parallel
or series to the energy-storage elements. This has the effect of moving the $j\omega$-axis to
the right, as shown in Fig. 4, and thus increasing the ratio of low- to high-frequency
poles. With respect to Fig. 4,

$$
\frac{p_1 + \lambda}{p_2 + \lambda} > \frac{p_1}{p_2}.
$$

The degree of shift, $\lambda$, of the $j\omega$-axis determines the change in the ratio of pole
magnitudes.
FIG. 4. Frequency shifting to improve the ratio of pole magnitudes.

Another technique of overcoming numerical instability, for the special case of passive,
linear RC-interconnect circuits, is described in [11, 10]. This approach attempts
to map a set of series coefficients of the homogeneous step response to a stable dominant
pole-residue representation using constrained optimization. This is facilitated by
the a priori knowledge that the poles of passive, linear RC-circuits are real and negative.
Hence, for a stable approximation, the model-poles should be real and negative
as well. This forms a nonlinear inequality constraint that can be easily incorporated in
the form of a variable transformation, $p_j = -\exp(x_j)$, on the system in Eq.(15). The
resultant, transformed system is optimized in x-space using unconstrained techniques.
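The variable transformation can be illustrated with a toy problem. The sketch below is not the RICE formulation: instead of matching moments, it fits a single term k·exp(p·t) to time samples by least squares, and it uses a golden-section search (a swapped-in, deliberately simple unconstrained method) over x with p = -exp(x). All names and the sample data are hypothetical.

```python
import math


def fit_stable_pole(times, samples, x_lo=-5.0, x_hi=5.0, iters=80):
    """Fit k*exp(p*t) to samples with p = -exp(x); x is unconstrained,
    so the returned pole is negative by construction."""

    def residual(x):
        p = -math.exp(x)                      # the variable transformation
        basis = [math.exp(p * t) for t in times]
        # For a fixed pole the best residue k has a closed form.
        k = sum(b * s for b, s in zip(basis, samples)) / sum(b * b for b in basis)
        err = sum((k * b - s) ** 2 for b, s in zip(basis, samples))
        return err, p, k

    g = (math.sqrt(5.0) - 1.0) / 2.0          # golden-section ratio
    a, b = x_lo, x_hi
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if residual(c)[0] < residual(d)[0]:
            b = d
        else:
            a = c
    _, p, k = residual(0.5 * (a + b))
    return p, k


# Hypothetical data generated from a known stable pole at -2 with residue 0.5.
times = [0.1 * i for i in range(51)]
samples = [0.5 * math.exp(-2.0 * t) for t in times]
pole, residue = fit_stable_pole(times, samples)
print(pole, residue)
```

Whatever value the search settles on in x-space, the pole it maps to lies in the left half-plane, so no explicit inequality constraint has to be enforced.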
This constrained optimization technique is employed in RICE (Rapid Interconnect
Circuit Evaluator) [23], an implementation of AWE for the analysis of interconnect
circuits. RICE uses an efficient path-tracing scheme [22], that minimizes
introduction of numerical errors, for the computation of the moments of the circuit
responses. In addition, problem conditioning is maintained throughout through the
use of frequency scaling and the use of numerical techniques such as the singular value
decomposition [11, 10].
Table 5 shows the results of using RICE to model the step response at node 1000
of the RC-tree in Fig.2. A 3rd-order unconstrained approximation yields an unstable
model (its third pole is positive), while the constrained optimization scheme yields a
stable model which compares very favorably with the output of a circuit simulator [21],
as shown in Fig.5.

TABLE 5
Unconstrained and constrained models of the step response at node 1000 of 4000-node RC-tree.

          poles (p_l)                      residues (k_l)
Unoptimized      Optimized        Unoptimized      Optimized
-6.17370e+07     -6.17274e+07     -4.90604e-01     -4.90336e-01
-6.08460e+08     -5.22779e+08     -5.13784e-01     -5.51879e-01
 5.49704e+08     -2.37423e+08      4.38880e-03      4.26101e-02

The RICE software represents an application-specific implementation of AWE
that exploits the tree-like topology of interconnect circuits. However, even a generalized
version of AWE [14], when compared with a circuit simulator [21], displays the
tremendous improvements in performance that motivate the use of moment-matching
methods for certain applications (see Table 6).

FIG. 5. Comparison of the constrained 3rd-order AWE model of the response of a 4000-node RC-tree
versus the output of a circuit simulator. (Voltage versus time; curves: RICE and PSPICE.)

TABLE 6
Execution time (in seconds) for various sizes of RC-interconnect circuit models. (An * indicates the
circuit was too large in terms of memory requirement.)

RC-Interconnect circuit size
Branches    Nodes     RICE    AWE(LU)    PSPICE
4,001       1,601     0.07    5.79       97.6
16,001      6,401     0.28    57.43      908.7
64,001      25,601    1.17    921.7      *
Although the results reported in Table 6 are for passive RC-interconnect circuits,
the implementation in RICE works equally efficiently for passive RLC-interconnect
models, as shown in Fig.6. The poles and residues of passive RLC circuits may be
complex, although they are still located in the left-half of the complex plane. Since
these circuits have a much higher response bandwidth than the RC circuits discussed
earlier, a much higher order of approximation is required to obtain a response model
that matches the output of a circuit simulator. Consequently, the instability problem
is more pronounced and occurs more frequently in RLC-circuit response models than
in RC-circuit response models. However, the gains in computation time to be achieved
by using AWE rather than a circuit simulator, are also multiplied by several orders
of magnitude. Fig.7 shows the comparison of the step response of the circuit in Fig.6
obtained from PSPICE and a 6th-order AWE model. Note that even a 6th-order
model does not capture all of the high-frequency effects due to an ideal step input.
As the rise time of the input increases, the AWE model approximates the actual
waveform more closely, as Fig.8 illustrates. The greater the rise time of the
input signal, the lower its high-frequency content and, hence, the lower the required
order of approximation.

FIG. 6. Typical RLC-interconnect circuit model.

FIG. 7. Comparison of the 6th-order AWE model of the step response of an RLC-interconnect circuit
versus the output of a circuit simulator. (Voltage versus time; curves: PSPICE and RICE 6th order.)
Circuits with controlled-sources and active devices may be inherently asymptotically
unstable and may possess transfer-function poles in the right half of the complex
plane. In such cases, AWE reduces to a "pure" Padé approximation, since the moments
of the response of a function with a positive pole cannot be obtained due to
the divergent nature of the response [16]. However, the problem posed here is the
determination of whether a positive model-pole reflects the instability problem associated
with the Padé approximation or is an approximation to the actual positive
system-pole.

FIG. 8. Comparison of the 6th-order AWE model and the output of a circuit simulator for a 10ns
input-signal rise time. (Voltage versus time; curves: PSPICE and RICE 6th order.)

7. Conclusion. Asymptotic Waveform Evaluation has been demonstrated as
an efficient approach to waveform estimation for linear RLC circuits, interconnects in
particular. Indeed, RICE, which is an application-specific implementation of AWE,
has proven to be capable of analyzing stiff interconnect-circuit models several orders
of magnitude faster than a circuit simulator, with no loss in accuracy.
However, there remain many unanswered questions regarding moment-matching
and Padé approximation, some of which have eluded eminent researchers for a long
time. Alleviating these problems of instability and order-estimation will enable the
extension of AWE to even more complex and challenging tasks.

Acknowledgements. The authors would like to thank Demosthenes F. Anastasakis
for his helpful discussions and assistance in the preparation of this document.

REFERENCES

[1] G. A. Baker, Jr. Essentials of Padé Approximants. Academic Press, 1975.
[2] G. A. Baker, Jr. and P. Graves-Morris. Encyclopedia of Mathematics and its Applications,
volume 13. Addison-Wesley Publishing Co., 1981.
[3] R. F. Brown. Model stability in use of moments to estimate pulse transfer functions. Electron.
Lett., 7, 1971.
[4] P. K. Chan. Comments on asymptotic waveform evaluation for timing analysis. Private
correspondence.
[5] C. Chen. Linear System Theory and Design. CBS College Publishing, 1984.
[6] L. O. Chua and P. Lin. Computer-Aided Analysis of Electronic Circuits: Algorithms and
Computational Techniques. Prentice-Hall, Inc., 1975.
[7] E. J. Davison. A method for simplifying linear dynamic systems. IEEE Trans. Auto. Control,
11, Jan 1966.
[8] J. F. J. Alexandro. Stable partial Padé approximations for reduced-order transfer functions.
IEEE Trans. Auto. Control, 29, 1984.
[9] L. G. Gibilaro and F. P. Lees. The reduction of complex transfer function models to simple
models using the method of moments. Chem. Eng. Sc., 24, 1969.
[10] N. Gopal and L. T. Pillage. Constrained approximation of dominant time constant(s) in RC
circuit delay models. Technical Report TR-CERC-TR-LTP-91-01, Comp. Eng. Res. Ctr.,
U. Texas (Austin), Jan 1991.
[11] N. Gopal, C. Ratzlaff, and L. T. Pillage. Constrained approximation of dominant time
constant(s) in RC circuit delay models. In Proc. 13th IMACS World Congress Comp. App.
Math., Jul 1991.
[12] P. Henrici. Applied and Computational Complex Analysis. John Wiley & Sons, 1974.
[13] X. Huang. Padé approximation of linear(ized) circuit responses. PhD thesis, Carnegie Mellon
Univ., Nov 1990.
[14] X. Huang, V. Raghavan, and R. A. Rohrer. AWEsim: A program for the efficient analysis of
linear(ized) circuits. In Proc. IEEE Int'l. Conf. Computer-Aided Des., Nov 1990.
[15] M. F. Hutton and B. Friedland. Routh approximations for reducing order of linear time-
invariant systems. IEEE Trans. Auto. Control, 20, 1975.
[16] S. M. Kendall and A. Stuart. The Advanced Theory of Statistics. MacMillan Pub. Co., Inc.,
1977.
[17] S. P. McCormick. Modeling and Simulation of VLSI Interconnections with Moments. PhD
thesis, Mass. Inst. Tech., June 1989.
[18] J. Pal. Stable reduced-order Padé approximants using the Routh-Hurwitz array. Electron.
Lett., 15, 1979.
[19] L. T. Pillage. Asymptotic Waveform Evaluation for Timing Analysis. PhD thesis, Carnegie
Mellon Univ., Apr 1989.
[20] L. T. Pillage and R. A. Rohrer. Asymptotic waveform evaluation for timing analysis. IEEE
Trans. Comp. Aided Design, 9, 1990.
[21] PSPICE USER'S MANUAL. Version 4.03. MicroSim Corp., Jan 1990.
[22] C. L. Ratzlaff. A fast algorithm for computing the time moments of RLC circuits. Master's
thesis, The Univ. of Texas at Austin, May 1991.
[23] C. L. Ratzlaff, N. Gopal, and L. T. Pillage. RICE: Rapid Interconnect Circuit Evaluator. In
Proc. 28th ACM/IEEE Design Auto. Conf., Jun 1991.
[24] R. H. Rosen and L. Lapidus. Minimum realization and systems modeling: Part I - Fundamental
theory and algorithms. A.I.Ch.E. J., 18, Jul 1972.
[25] Y. Shamash. Linear system reduction using Padé approximation to allow retention of dominant
modes. Int'l. J. Control, 21(2), 1975.
[26] V. Zakian. Simplification of linear time-invariant systems by moment approximants. Int'l. J.
Control, 18, 1973.
[27] J. Zinn-Justin. Strong interaction dynamics with Padé approximants. Phy. Rep., 1970.
SPECTRAL ALGORITHM FOR SIMULATION
OF INTEGRATED CIRCUITS

O.A. PALUSINSKI, F. SZIDAROVSZKY,
C. MARCJAN, AND M. ABDENNADHER*

Abstract. Waveform relaxation improves the efficiency of transient simulation of integrated
circuits at the expense of the large memory needed for storage of coupling variables and of
complicated intersubcircuit communication requiring interpolation. A new integration method based
on the expansion of unknown variables in Chebyshev series is developed. Such a method assures a
very compact representation of waveforms, minimizing storage requirements. Solutions are provided
in continuous form, therefore no extra interpolation is needed in the iterations. The resulting
algorithm was implemented and proved to be very efficient. A short description of the spectral
technique is presented and an application of spectral analysis in computing the transient behavior
of an MOS circuit is discussed. The computation proved to be much more efficient in comparison
with other methods.

1. Introduction. Electronic circuits are mathematically represented by a set
of nonlinear differential equations formulated using modified nodal analysis (MNA).
A method based on the Newton-Kantorovich approach is used for linearization of the
circuit. The resulting linearized system is a set of first-order differential
equations, which are solved by applying spectral analysis. A spectral technique
based on Chebyshev polynomials and their properties is applied in the
prototype software (SPEC), yielding accurate (globally controllable accuracy) and fast
[1] simulation processes.

* Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721

2. Representation of functions using Chebyshev series. The expansion of
a function c(t) defined in the interval [-1, 1] is written as an infinite series as follows

$${c(t) = \sum_{i=0}^{\infty}}{}^{\prime}\; c_i T_i(t), \tag{2.1}$$

where $T_i(t)$ denotes the Chebyshev polynomial of the first kind of degree i and $c_i$ are the
constant coefficients [2]. The prime at the summation symbol denotes that the first
term in the summation is halved. If a function $\bar{c}(\bar{t})$ is defined in a general interval
$[\bar{t}_1, \bar{t}_2]$ then it has to be scaled to the interval [-1, 1] to obtain the scaled function c(t)
used in (2.1). The scaling is performed using the following operation

$$t = \frac{2\bar{t} - (\bar{t}_1 + \bar{t}_2)}{\bar{t}_2 - \bar{t}_1}. \tag{2.2}$$

To simplify notation the equation (2.1) can be rewritten in the following vector form

$$\mathcal{C}\{c(t)\} = \mathbf{c}, \tag{2.3}$$

where $\mathcal{C}$ denotes the Chebyshev transformation and the entries of the vector $\mathbf{c}$ are
composed of expansion coefficients of the function c(t) as

$$\mathbf{c} = [c_0, c_1, c_2, \ldots]^T. \tag{2.4}$$
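As an illustration of the representation in (2.1), the sketch below computes a truncated set of coefficients and evaluates the series with the first term halved. It is a minimal example, not the SPEC implementation: the coefficients are obtained by sampling at Chebyshev-Gauss nodes (one common choice, assumed here), the affine map in `to_unit_interval` is the usual scaling to [-1, 1] and is likewise an assumption, and all function names are hypothetical.

```python
import math


def to_unit_interval(t_bar, t1_bar, t2_bar):
    """Affine map from [t1_bar, t2_bar] to [-1, 1] (assumed form of the scaling)."""
    return (2.0 * t_bar - (t1_bar + t2_bar)) / (t2_bar - t1_bar)


def cheb_coefficients(func, n):
    """Coefficients c_0..c_n of sum' c_i T_i(t), sampled at n+1 Chebyshev-Gauss nodes."""
    m = n + 1
    thetas = [math.pi * (k + 0.5) / m for k in range(m)]
    values = [func(math.cos(th)) for th in thetas]
    return [2.0 / m * sum(v * math.cos(i * th) for v, th in zip(values, thetas))
            for i in range(m)]


def cheb_eval(coeffs, t):
    """Evaluate the series with the first term halved, using T_i(t) = cos(i*acos(t))."""
    theta = math.acos(t)
    return 0.5 * coeffs[0] + sum(c * math.cos(i * theta)
                                 for i, c in enumerate(coeffs) if i > 0)


coeffs = cheb_coefficients(math.exp, 16)
print(cheb_eval(coeffs, 0.5), math.exp(0.5))
```

For a smooth function such as exp the coefficients decay very quickly, which is the compactness property the abstract relies on.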

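Using properties of Chebyshev polynomials, the transformation of the product of two
functions c(t) and g(t) can be written as

$$\mathcal{C}\{c(t)g(t)\} = C\mathbf{g} = G\mathbf{c}, \tag{2.5}$$

where

$$C = \begin{bmatrix} \tfrac{1}{2}c_0 & c_1 & \cdots \\ \tfrac{1}{2}c_1 & \ddots & \\ \vdots & & \\ \tfrac{1}{2}c_i & \cdots & \\ \vdots & & \end{bmatrix} \tag{2.6a}$$

and

$$G = \begin{bmatrix} \tfrac{1}{2}g_0 & g_1 & \cdots \\ \tfrac{1}{2}g_1 & \ddots & \\ \vdots & & \\ \tfrac{1}{2}g_i & \cdots & \\ \vdots & & \end{bmatrix} \tag{2.6b}$$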

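Chebyshev series offer a simple relation between the coefficients of expansion of a
function y(t) and the coefficients of expansion of its derivative y'(t) given by

$$\mathbf{y} = B\mathbf{y}^* + 2\mathbf{e}\,y(-1), \tag{2.7}$$

where

$$\mathbf{y}^* = \mathcal{C}\{y'(t)\}, \tag{2.8}$$

$$\mathbf{e} = [1, 0, 0, \ldots]^T, \tag{2.9}$$

and the matrix B is invariant and its first row is defined as:

$$B_{0i} = (\bar{t}_2 - \bar{t}_1)\cdot\begin{cases} \dfrac{1}{2} & i = 0 \\[4pt] -\dfrac{1}{4} & i = 1 \\[4pt] \dfrac{(-1)^{i+1}}{(i-1)(i+1)} & i = 2, 3, \ldots \end{cases} \tag{2.10}$$

and the remaining entries are

$$B_{ij} = \begin{cases} \dfrac{\bar{t}_2 - \bar{t}_1}{4i} & j = i-1 \\[4pt] -\dfrac{\bar{t}_2 - \bar{t}_1}{4i} & j = i+1 \\[4pt] 0 & \text{elsewhere.} \end{cases} \tag{2.11}$$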

3. Solving differential equations using Chebyshev series. Linear differential
equations are easily solved using spectral techniques with the use of Chebyshev
series and their properties. In this section a numerical algorithm is presented. Consider
the following scalar differential equation

$$\bar{c}(\bar{t})\,\frac{d\bar{y}}{d\bar{t}} = \bar{g}(\bar{t})\,\bar{y} + \bar{h}(\bar{t}) \tag{3.1}$$

$$\bar{y}(\bar{t}_1) = y_0, \tag{3.1a}$$

where the functions $\bar{c}(\bar{t})$, $\bar{g}(\bar{t})$ and $\bar{h}(\bar{t})$ are defined in the time interval

$$\bar{t} \in [\bar{t}_1, \bar{t}_2] \tag{3.2}$$

and $\bar{y} = \bar{y}(\bar{t})$ is the unknown function on the same interval. To simplify mathematical
operations equation (3.1) is scaled to the interval [-1,1] as described in the previous
section. As a result, equation (3.1) is rewritten as

$$c(t)\,\frac{dy}{dt} = g(t)\,y + h(t) \tag{3.3}$$

$$y(-1) = y_0, \tag{3.3a}$$

where y = y(t) is the unknown function, and c(t), g(t) and h(t) are known functions
of t on the interval [-1,1]. Using the notation introduced in the previous section
the above equation can be rewritten in the following transformed form as a relation
between the Chebyshev expansion coefficients

$$C\mathbf{y}^* = G\mathbf{y} + \mathbf{h}. \tag{3.4}$$

Using relation (2.7) the above equation is rearranged in the following form

$$(C - GB)\,\mathbf{y}^* = \bar{y}(\bar{t}_1)\,\mathbf{g} + \mathbf{h}, \tag{3.5}$$

which can be easily solved since it is a set of linear algebraic equations, where matrices
C, G, B and vectors $\mathbf{g}$ and $\mathbf{h}$ are known, defined in section 2.

In practice, Chebyshev expansions of the functions are performed to some selected
degree which depends on the required accuracy of the solution [3]. When the solution
is sought to degree N, equation (3.4) and relations (2.6ab) show that the forcing function
h(t) must be expanded into a Chebyshev series of degree N, whereas the functions c(t) and g(t)
must be expanded to degree 2N.
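The procedure of this section can be sketched on the simplest test problem, dy/dt = -y with y(-1) = 1 on [-1, 1], for which C is the identity, G is its negative, h is zero and g has the single nonzero coefficient g_0 = -2. The entries of the integration matrix below are an assumption: they were derived from the standard integration formulas for Chebyshev polynomials with interval length 2, and the small Gaussian-elimination solver is included only to keep the example self-contained. All names are hypothetical; this is not the SPEC code.

```python
import math


def integration_matrix(n, length=2.0):
    """(n+1)x(n+1) matrix B relating coefficients of y to those of y'.
    `length` is the length of the unscaled interval (2 for [-1, 1])."""
    b = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i == 0:
            b[0][i] = 0.5 * length
        elif i == 1:
            b[0][i] = -0.25 * length
        else:
            b[0][i] = length * (-1.0) ** (i + 1) / ((i - 1) * (i + 1))
    for i in range(1, n + 1):
        b[i][i - 1] = length / (4.0 * i)
        if i + 1 <= n:
            b[i][i + 1] = -length / (4.0 * i)
    return b


def solve_linear(a, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def cheb_eval(coeffs, t):
    theta = math.acos(t)
    return 0.5 * coeffs[0] + sum(c * math.cos(i * theta)
                                 for i, c in enumerate(coeffs) if i > 0)


def solve_decay(n, y_start=1.0):
    """Solve y' = -y, y(-1) = y_start: (I + B) y* = y_start * g, then y = B y* + 2 e y_start."""
    b = integration_matrix(n)
    lhs = [[(1.0 if i == j else 0.0) + b[i][j] for j in range(n + 1)]
           for i in range(n + 1)]
    rhs = [-2.0 * y_start] + [0.0] * n
    y_star = solve_linear(lhs, rhs)
    y = [sum(b[i][j] * y_star[j] for j in range(n + 1)) for i in range(n + 1)]
    y[0] += 2.0 * y_start
    return y


y = solve_decay(16)
print(cheb_eval(y, 1.0), math.exp(-2.0))
```

The solution is returned as a coefficient vector, so it can be evaluated at any point of the interval without interpolation.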

4. Simulation of MOS circuits. Analyzed MOS circuits are described as a
set of M nonlinear differential equations with M unknown functions formulated by
using MNA [4]. Those equations are scaled and rewritten in the following form:

$$C(v(t), t)\,\frac{dv(t)}{dt} = q(v(t), t), \qquad v(-1) = v_0, \tag{4.1}$$

where v(t) is an M-dimensional vector of circuit variables, $v_0$ is a vector of initial
conditions, C is an M x M square matrix with variable entries and q is an M-dimensional
vector of nonlinear functions.

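Special Case

Consider the scalar case (M = 1), linearize equation (4.1) around a given waveform
$v_p(t)$ to yield

$$c^p(t)\,\frac{dv}{dt} = g^p(t)\,v(t) + h^p(t), \qquad v(-1) = v_0, \tag{4.2}$$

where

$$\begin{aligned} c^p(t) &= c(v_p(t), t), \\ g^p(t) &= \left.\frac{\partial q}{\partial v}\right|_{v=v_p(t)} - \left.\frac{\partial c}{\partial v}\right|_{v=v_p(t)} \frac{dv_p}{dt}, \\ h^p(t) &= q[v_p(t), t] - g^p(t)\,v_p(t). \end{aligned} \tag{4.3}$$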

Equation (4.2) is solved by using the same procedure as in the case of equation (3.3).
Based on this linearization a Newton-Kantorovich-type iteration procedure is used to
recover the solution. The process continues until the iterates satisfy a convergence
condition

(4.4)

where ε is the convergence tolerance, k denotes the iteration count, and $\|\cdot\|_\infty$ denotes
the maximum-norm.
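The iteration can be sketched on a scalar test equation, dv/dt = -v² with v(0) = 1, whose exact solution is 1/(1 + t). Here c = 1, so the linearized coefficients reduce to g = -2·v_p and h = v_p². To keep the example short, each linearized problem is solved with the trapezoidal rule on a uniform grid rather than with Chebyshev series; that substitution, the test equation, and every name below are assumptions made for illustration only.

```python
def waveform_newton(n_steps=200, t_end=1.0, v0=1.0, tol=1e-10, max_iter=50):
    """Solve dv/dt = -v**2, v(0) = v0 by repeated linearization around the
    previous waveform; stop when the max-norm change falls below tol."""
    dt = t_end / n_steps
    prev = [v0] * (n_steps + 1)              # initial guess: constant waveform
    for iteration in range(1, max_iter + 1):
        g = [-2.0 * vp for vp in prev]       # dq/dv at v = v_p (dc/dv = 0 since c = 1)
        h = [vp * vp for vp in prev]         # q(v_p) - g*v_p = -v_p^2 + 2*v_p^2
        new = [v0]
        for n in range(n_steps):
            # Trapezoidal step for the linear equation dv/dt = g(t) v + h(t).
            num = new[n] * (1.0 + 0.5 * dt * g[n]) + 0.5 * dt * (h[n] + h[n + 1])
            new.append(num / (1.0 - 0.5 * dt * g[n + 1]))
        change = max(abs(a - b) for a, b in zip(new, prev))
        prev = new
        if change < tol:
            return prev, iteration
    return prev, max_iter


waveform, iterations = waveform_newton()
print(iterations, waveform[-1])
```

Each pass updates the whole waveform at once, and the stopping test is a maximum-norm comparison of successive iterates, in the spirit of the convergence condition described above.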

General Case

Equation (4.1) can be rewritten as

(4.5)

where

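$$c_{ij} = c_{ij}(v, t) \tag{4.6a}$$

$$q_i = q_i(v, t). \tag{4.6b}$$

Each subcircuit is linearized around a vector of waveforms

(4.7)

yielding an equation of the ith subcircuit:

$$\sum_{j=1}^{M} c_{ij}(v_p, t)\,\frac{dv_j}{dt} = \sum_{j=1}^{M} g_{ij}(t)\,v_j + q_i(v_p, t) - \sum_{j=1}^{M} g_{ij}(t)\,v_{pj}, \tag{4.8}$$

where

$$g_{ij}(t) = \left.\frac{\partial q_i}{\partial v_j}\right|_{v=v_p} - \sum_{l=1}^{M} \left.\frac{\partial c_{il}}{\partial v_j}\right|_{v=v_p} \frac{dv_{pl}}{dt}. \tag{4.9}$$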

After linearizing all subcircuits using (4.8) and (4.9), equation (4.7) is rewritten in
the form:

$$C^p(t)\,\frac{dv}{dt} = G^p(t)\,v + h^p(t), \tag{4.10}$$

where $C^p(t)$, $G^p(t)$ are M x M matrices and $\frac{dv}{dt}$, v and $h^p(t)$ are M-dimensional
vectors. Each element of the matrices $C^p(t)$, $G^p(t)$ is expanded into a Chebyshev series
of degree 2N and forms an (N + 1) x (N + 1) submatrix defined by (2.6ab). The
elements of the vector $h^p(t)$ are expanded into Chebyshev series of degree N and
form subvectors of order (N + 1) of coefficients of expansion. Using property (2.7)
the constructed equation for expansion coefficients can be assembled and represented
in a matrix form as:

(4.11)

where $\mathcal{A}^p$, $\mathcal{G}^p$, $\mathcal{B}$ are M(N + 1) x M(N + 1) square matrices and $\mu$, $\chi$ are M(N + 1)-
dimensional vectors. The vector $\vartheta^*$ is composed of (N + 1)-dimensional subvectors
$v_j^*$. A subvector $v_j^*$ contains the coefficients of the expansion of the jth component of
$\frac{dv}{dt}$. The blocks of matrices $\mathcal{A}^p$, $\mathcal{G}^p$ are determined by the submatrices created from
the expansion of $C^p(t)$ and $G^p(t)$. Matrix $\mathcal{B}$ is block diagonal with
M identical (N + 1) x (N + 1) square blocks defined by relations (2.10) and (2.11).
Vector $\chi$ is composed of (N + 1)-dimensional subvectors, each of which represents the expansion
coefficients of the respective element of vector $h^p(t)$. Vector $\mu$ is composed of the (N + 1)-
dimensional subvectors

(4.12)

k = 1,2, ... ,M,

which depend on the initial conditions $v_{0k}$ for the kth components of vector v. The
M(N + 1)-dimensional vector $\vartheta$ is calculated by using the relation

$$\vartheta = \mathcal{B}\vartheta^* + \mu. \tag{4.13}$$

The M(N + 1) vector $\vartheta$ is composed of M subvectors $v_k$ each containing (N + 1)
expansion coefficients.

5. Example.

5.1. Circuit model. Simulation of MOS circuits is based on the solution of ordinary
differential equations obtained using MNA. The circuit equations are written
in the general form (4.1). In order to provide a better description of the model and of the
details of the solution algorithm, a specific example is given below.

Model of CMOS NAND Gate

The schematic of a NAND gate built in CMOS technology is shown in Fig. 1. The
equivalent circuit is obtained by replacing the MOS transistors by an appropriate model
[5], and the equations for the unknown nodal voltages $(V_3, V_4)$ are written in the matrix
form

(5.1)

where

$$c_{11}(V_3, V_4) = 2\,c_{gdP} + 2\,c_{bdP}(V_3, 5.0) + c_{gdN} + c_{bdN}(V_3, V_4),$$

$$q_1(V_3, V_4, V_1, V_2) = c_{gdP}\,\frac{d}{dt}V_1 + (c_{gdP} + c_{gdN})\,\frac{d}{dt}V_2 - I_N(V_3, V_2, V_4) - I_P(V_3, V_2, 5.0) - I_P(V_3, V_1, 5.0)$$

and

$$q_2(V_3, V_4, V_1, V_2) = c_{gdN}\,\frac{d}{dt}V_2 + (c_{gbN} + c_{gsN})\,\frac{d}{dt}V_2 + I_N(V_3, V_2, V_4) - I_N(V_4, V_1, 0.0).$$

The nonlinear capacitances are in the form

(5.3)

The remaining capacitors are constant. Functions $I_N$ and $I_P$ represent the drain
current for the n-channel and p-channel MOS devices, respectively. The current
function $I_N$ depends on the regions of operation of the MOS transistors and it is
given by the following expressions

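FIG. 1. Electrical schematics of CMOS NAND gate.

a) forward region $V_{ds} = (V_d - V_s) \ge 0$

$$I_N(V_d, V_g, V_s) = \begin{cases} 0 & V_{gs} = (V_g - V_s) < V_t = 0.5 \\ \beta V_{ds}\,\bigl(2(V_{gs} - V_t) - V_{ds}\bigr)(1 + \lambda V_{ds}) & 0 < V_{ds} < V_{gs} - V_t \\ \beta (V_{gs} - V_t)^2 (1 + \lambda V_{ds}) & V_{gs} - V_t < V_{ds} \end{cases} \tag{5.4a}$$

b) reverse region $V_{ds} \le 0$

$$I_N(V_d, V_g, V_s) = \begin{cases} 0 & V_{gd} = (V_g - V_d) < V_t \\ \beta V_{ds}\,\bigl(2(V_{gs} - V_t) + V_{ds}\bigr)(1 - \lambda V_{ds}) & 0 < -V_{ds} < V_{gs} - V_t \\ -\beta (V_{gs} - V_t)^2 (1 - \lambda V_{ds}) & V_{gs} - V_t < -V_{ds} \end{cases} \tag{5.4b}$$

The p-channel device operates in the same manner as the n-channel device except
that all voltages and currents are reversed.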
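The forward-region expression (5.4a) translates directly into a piecewise function. The sketch below covers only that region; the zero current below threshold is assumed (it is the usual cutoff behaviour), the threshold 0.5 is taken from the text, and the values of beta and lam are placeholders chosen for illustration, not parameters from the paper.

```python
def drain_current_forward(vd, vg, vs, beta=1.0e-4, lam=0.02, vt=0.5):
    """n-channel drain current in the forward region (Vds >= 0).
    beta and lam are hypothetical values; vt = 0.5 follows the text."""
    vds = vd - vs
    vgs = vg - vs
    if vds < 0.0:
        raise ValueError("forward-region expression requires Vds >= 0")
    if vgs < vt:
        return 0.0                                   # cutoff (assumed zero current)
    overdrive = vgs - vt
    if vds < overdrive:
        # Triode region.
        return beta * vds * (2.0 * overdrive - vds) * (1.0 + lam * vds)
    # Saturation region.
    return beta * overdrive ** 2 * (1.0 + lam * vds)


print(drain_current_forward(5.0, 2.5, 0.0))
```

The two conducting branches agree at Vds = Vgs - Vt, so the current is continuous across the region boundary.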

5.2. Results of simulation. The CMOS NAND gate described above was simulated
by using the prototype software. The simulation was performed in the time
interval [0.0, 2.3] μs, which was divided into five subintervals (windows) as shown in
Fig. 2a and Fig. 3a. The Chebyshev expansion degree was set to 32 for all windows.
The result of the simulation, voltage $V_3$, is shown in Fig. 2a together with the driving
signals $V_1$ and $V_2$. The output shows a logical representation of the NAND function:
$V_3$ is low when $V_1$ and $V_2$ are both high, and $V_3$ is high otherwise. The
iteration process is illustrated in Fig. 3b, where the results of some initial iteration
steps are shown. The accuracy of the solution was set to 1.0 mV in each window.

FIG. 2. Simulation of a NAND gate obtained using SPEC simulator; a) driving signals, V1, V2, and
output, V3; b) the details of the output voltage V3.

6. Conclusions. A set of nonlinear differential equations describing an electronic
circuit is linearized with the use of a newly developed linearization scheme.
The resulting set of linear differential equations is then transformed to a set of algebraic
equations using spectral analysis based on Chebyshev polynomials. Solution
of this system is obtained in an efficient way due to the relatively small number of
variables and the sparsity of the system. The spectral method is of special importance
for bigger circuits, where the large number of variables is significantly reduced in
comparison with non-spectral methods, reducing memory requirements and simulation
time. The simulations can be successfully performed on smaller
computers that do not have large memory. Numerous properties of Chebyshev series
[6] bring high potential for further improvements to the simulator.

Acknowledgement. Research described in this paper was supported in part by
the NSF grant MIP-9017037, and the authors wish to express their gratitude for
this support. The idea of writing this paper was conceived during the Summer 1991
Workshop at the Institute for Mathematics and its Applications at the University of
Minnesota. O.A. Palusinski extends his thanks to the organizers for the invitation to
the workshop and sponsorship. From September 1991 until February 1992, O.A. Palusinski
was on sabbatical leave at the University of Karlsruhe, supported by a fellowship
from the German Science Foundation.

FIG. 3. Convergence process in computation of transient in the NAND, the example of output
variable, V3; a) V3 in the simulation range with marked window for which the iterations were
recorded; b) the details of iteration process in the selected window.

REFERENCES

[1] O.A. Palusinski, F. Szidarovszky, M. Abdennadher, C. Marcjan, K. Reiss, Accelerated
Simulation of Integrated Circuits Using Chebyshev Series, Proceedings of 1992 IEEE-ISCAS.
[2] O.A. Palusinski, M.W. Guarini and S.J. Wright, Spectral Technique in Electronic Circuit
Analysis, International Journal of Numerical Modeling: Electronic Networks, Devices and
Fields, 1, 137-151 (1988).
[3] M.W. Guarini and O.A. Palusinski, Functional Relaxations and Spectral Techniques in
Computer-Aided Circuit Analysis, International Journal of Numerical Modeling: Electronic
Networks, Devices and Fields, 3, 183-193 (1990).
[4] O.A. Palusinski, M.W. Guarini, Simulation of MOS Circuits Using Spectral Technique in
Relaxation Framework, COMPEL, the International Journal for Computation and Mathematics
in Electrical and Electronic Engineering, vol. 10, No 4, 363-365, Dec, 1991.
[5] L.W. Nagel, SPICE 2: A Computer Program to Simulate Semiconductor Circuits, Electronic
Research Laboratory Rep. No. ERL-M520, University of California, Berkeley, 1975. Also:
SPICE 2 Version 2G6, Department of Electrical Engineering and Computer Science,
University of California, Berkeley, ERL Report 1989.
[6] S. Paszkowski, Numerical Applications of Chebyshev Polynomials and Series, (in Polish),
PWN, Warsaw (1975).
[7] E. Lelarasmee, A.E. Ruehli and A. Sangiovanni-Vincentelli, The Waveform Relaxation Method
for Time Domain Analysis of Large Scale Integrated Circuits, IEEE Trans. Computer-Aided
Des. Integrated Circuits Syst., CAD-1, 131-145 (1982).
CONVERGENCE OF WAVEFORM RELAXATION FOR RC CIRCUITS

ALBERT E. RUEHLI* & CHARLES A. ZUKOWSKI†

Abstract. The waveform relaxation [WR] method of circuit simulation has demonstrated the
ability to handle large digital VLSI circuits without sacrificing accuracy. Existing programs use
reasonable heuristics for circuit partitioning of present day circuits. As circuit models begin to
include more and more parasitic elements, due to shrinking geometries, decreasing signal rise times
and increasing operating frequencies, the partitioning will become even more important. This paper
addresses the question of how partitioning should be done within complex interconnect models.
Specifically, we consider the partitioning of a limiting case RC circuit example, and investigate its
convergence properties and optimal time window size.

1. Introduction. Today, the number of parasitic elements required in VLSI
circuit models is rapidly increasing due to the decrease in geometries, decrease in
signal rise times and the corresponding increase in operating frequencies. Examples
include wire resistances and coupling capacitances. WR (waveform relaxation) [1]
programs for circuit simulation have demonstrated their ability to handle today's
large digital VLSI circuits with accuracy similar to that of SPICE [2] or ASTAP [3].
Little work has been done so far on WR programs which efficiently include models
with large numbers of parasitic elements. Hence, handling circuits in a WR program
with an increased number of parasitic elements is a problem of high priority.
In this paper we consider the special case of the low pass RC circuit shown in Fig. 1
which is typical of a simplified model for the interconnects in VLSI circuits. We chose
this circuit, with no forced drive connection, since it represents the most difficult case
for WR if it is partitioned at the resistor. This circuit is a complementary special
circuit to a so called "high pass" circuit which has been analyzed before [4]. The
circuit shown in Fig. 1 cannot be partitioned with the diagonally dominant Norton
algorithm [5] which is used in several WR programs. Waveform Relaxation for RC
circuits has been considered in two recent papers [6, 7]. Further, relaxation for RC
circuits has been investigated earlier in the context of bounding but the properties of
the convergence were not analyzed [8].
Here we are specifically interested in convergence behavior for the RC circuit.
First, it is evident that the convergence for low frequencies is in question since the
capacitors are open circuits at dc. Fortunately, we can show that the circuit converges,
as expected from the basic WR convergence proof. Further, the rate of convergence
is of importance. For efficiency reasons we hope that the relaxation converges in less
than five iterations. In fact, we derive a measure of the time window over which rapid
convergence can be expected.

* IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
† Columbia University, New York, NY 10027-6699

2. Analysis of RC circuit. The low pass RC circuit in Fig. 1 is a common
model for VLSI interconnects. In a sense it represents a worst case situation since
any other circuit element connected from either node to ground will improve local
convergence. The WR equations for this circuit when partitioned through the resistor
are

$$\dot{v}_1^{(k)}(t) + \alpha\,v_1^{(k)}(t) = \alpha\,v_3^{(k-1)}(t) + V_s(t) \tag{1}$$

$$\dot{v}_3^{(k)}(t) + \beta\,v_3^{(k)}(t) = \beta\,v_1^{(k-1)}(t) \tag{2}$$

where $\alpha := 1/R_2C_1$ and $\beta := 1/R_2C_3$.

FIG. 1. Low pass circuit

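We can analyze the circuit in the frequency domain by using the Laplace transform
for the homogeneous equation or

$$\begin{aligned} s\,\bar{v}_1^{(k)} + \alpha\,\bar{v}_1^{(k)} &= \alpha\,\bar{v}_3^{(k-1)}, \\ s\,\bar{v}_3^{(k)} + \beta\,\bar{v}_3^{(k)} &= \beta\,\bar{v}_1^{(k-1)}, \end{aligned} \tag{3}$$

where $\bar{v}(s)$ is the Laplace transform of v(t). The above equation is, in matrix
notation,

$$(sI + M)\,\bar{v}^{(k)} = N\,\bar{v}^{(k-1)}, \tag{4}$$

where the definitions of M and N are evident from comparisons of the last two
equations. We can rewrite Eq. 4 as

$$\bar{v}^{(k)} = (sI + M)^{-1} N\,\bar{v}^{(k-1)} = K(s)\,\bar{v}^{(k-1)}. \tag{5}$$

We utilize the following theorem [9] to show that the WR iteration has a potential
problem.
THEOREM 2.1. Let $X = L^p(\mathbb{R}_+, \mathbb{C}^n)$ with $1 \le p \le \infty$ and assume that the
eigenvalues of M have positive real parts. Then the spectral radius of the symbol
K(s) is
(6)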

Evaluating $\rho(K)$ we find

$$K(s) = \begin{bmatrix} 0 & \dfrac{\alpha}{s + \alpha} \\[6pt] \dfrac{\beta}{s + \beta} & 0 \end{bmatrix} \tag{7}$$

and it is clear that the maximum occurs for ω = 0 where $\rho(K(0)) = 1$, which
indicates that convergence problems occur as ω → 0. Fortunately we show that this
does not imply that there is a problem over any finite time interval, but only indicates
that convergence is non-uniform and degrades over the infinite time interval. Also,
this difficulty does not exist if other parasitic elements are connected at either node.
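The frequency dependence of the spectral radius can be tabulated directly. The sketch below assumes the symbol has zero diagonal and off-diagonal entries α/(s+α) and β/(s+β), as follows from Eq. 4 with M = diag(α, β); the eigenvalues of such a matrix are plus and minus the square root of the product of the off-diagonal entries. The function name is hypothetical.

```python
import math


def symbol_spectral_radius(omega, alpha=1.0, beta=1.0):
    """Spectral radius of K(j*omega) for the two-node low pass circuit."""
    s = complex(0.0, omega)
    product = (alpha / (s + alpha)) * (beta / (s + beta))
    return math.sqrt(abs(product))


for omega in (0.0, 0.1, 1.0, 10.0):
    print(omega, symbol_spectral_radius(omega))
```

The value is exactly 1 at ω = 0 and falls below 1 for every other frequency, which is why the difficulty is confined to the low-frequency, long-time behaviour.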
We simplify the problem by choosing α = β = 1 without any loss of key insight.
By executing the WR iteration in the s-domain, we arrive at the solution for iteration
(k):

$$\bar{v}_1^{(k)}(s) = \sum_{m=1}^{k} \frac{1}{(s + 1)^{2m-1}}. \tag{8}$$

The solution in the time domain corresponding to Eq. 8 is found from the inverse
Laplace transform as

$$v_1^{(k)}(t) = e^{-t} \sum_{m=0}^{k-1} \frac{t^{2m}}{(2m)!}. \tag{9}$$

Taking the limit as the iteration index k → ∞ we find that the limit is

$$\lim_{k \to \infty} v_1^{(k)}(t) = e^{-t}\cosh t = \tfrac{1}{2}\left(1 + e^{-2t}\right). \tag{10}$$

To verify that the limit is indeed the exact solution, we start by finding the
s-domain response of the circuit in Fig. 1,

$$\bar{v}_1 = \frac{s + \beta}{s\,\bigl(s + (\alpha + \beta)\bigr)}, \tag{11}$$

for a unit initial voltage. Converting this into its corresponding time solution, we
find

$$v_1(t) = \frac{\beta}{\alpha + \beta} + \frac{\alpha}{\alpha + \beta}\,e^{-(\alpha + \beta)t}, \tag{12}$$

which matches the limit waveform for α = β = 1.
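The convergence of the iterates toward the exact response can be observed numerically. This is a minimal sketch for α = β = 1 that evaluates the partial sums of Eq. 9 and compares them with the closed form ½(1 + e^{-2t}), which is what Eq. 11 gives for a unit initial voltage; the function names are illustrative.

```python
import math


def wr_iterate(k, t):
    """k-th WR iterate: exp(-t) times the first k terms of the cosh series."""
    return math.exp(-t) * sum(t ** (2 * m) / math.factorial(2 * m) for m in range(k))


def exact_response(t):
    """Exact response of the circuit for alpha = beta = 1 and unit initial voltage."""
    return 0.5 * (1.0 + math.exp(-2.0 * t))


for k in (1, 3, 5):
    errors = [abs(wr_iterate(k, t) - exact_response(t)) for t in (1.0, 3.0, 6.0)]
    print(k, errors)
```

For a fixed number of iterations the error is small at early times and grows with t, which is the non-uniform convergence discussed in the text.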

FIG. 2. Convergence of initial iterations

3. Convergence results. The above analysis indicates that a reasonable WR
sequence exists even for this worst case RC circuit. The questions which are addressed
in this section are the convergence rate and the choice of time window size T.
A sufficient condition for the convergence of WR sequences for conventional circuits
is that a capacitor is connected to ground from each node [1]. Furthermore,
RC circuits that are driven generally exhibit uniform convergence over the infinite
time interval. For this worst case circuit, we can show that the convergence does not
exhibit such uniformity because the sequence does not converge at t = ∞. We start
by applying the final value theorem $\lim_{\sigma \to 0} \sigma \bar{f}(\sigma) = f(\infty)$ to Eq. 11 to find that
the response of the RC circuit at t → ∞ is given by

$$v_1(\infty) = \frac{\beta}{\alpha + \beta} \tag{13}$$

which evaluates to $v_1(\infty) = \tfrac{1}{2}$ for α = β = 1. But for t → ∞, where $V_s(t) = 0$,
the WR sequence degenerates to $v_1^{(k)} = v_1^{(k-1)}$. If the initial waveform approaches
anything other than ½ for t → ∞, the sequence will never get any closer at this limit.
This non-uniformity of convergence implies that accuracy is only reached gradually
for larger and larger times, and implies that time windowing is appropriate.
Next, we investigate the rate of convergence for the model circuit. One of the issues
of interest is the appropriate selection of the time window size for this class of circuits.
Most WR programs allow a time window size T that is less than the entire analysis
time for increased efficiency. As is typical, a tradeoff exists in our example between
the number of iterations and the time window, and this relationship is quantified in
the following lemma.

TABLE 1
Window times
Iteration Time Windows
1 0.74
2 1.47
3 2.21
4 2.94
5 3.68

LEMMA 3.1. Let T E ~, then the WR sequence converges rapidly after the k-th
iteration for k ~eio.
The proof is fairly straight forward using the approximate identity (2m)! e!
$(2m)2m+i e- 2m . After some algebraic manipulations this leads to

(14)

In fact, we can see easily from Eq. 14 that the coefficients decrease faster than
O( ~ ) for the conditions on the time T and the iteration index k given in the lemma.
Fig. 2 shows how the first five iterations converge. In fact, the time window T of
rapid convergence is clearly visible. We can conclude from the above condition that
a reasonable choice for a time window as a function of iteration count is $k > eT/2$, as
pictured. For a comparison, we give the values for the equality in Table 1.
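
The entries of Table 1 are consistent with the equality $k = eT/2$, that is $T = 2k/e$ in units of the normalized time constant. The following minimal Python sketch assumes that reading of the bound (the function name `window_size` is ours, not the paper's) and reproduces the tabulated values to two decimals:

```python
import math

def window_size(k: int) -> float:
    """Time window T (in unit time constants) for which k = e*T/2 holds."""
    return 2.0 * k / math.e

# Print the window size for the first five iterations, rounded as in Table 1.
for k in range(1, 6):
    print(k, round(window_size(k), 2))
```

For a circuit with a time constant other than one, the window would scale with that time constant under the same assumption.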
We do get an indication from Fig. 2 and Table 1 how the useful time window
grows with the number of iterations. Of course, with our choices of $\alpha = \beta = 1$, this
is normalized to unit time constants. This implies, in general, that time constants
associated with this subcircuit should be sufficiently large such that the time window
is large enough that it does not constrain the number of time steps in a window too
much. "Large enough" may mean that the numerical integration needs at least 10
time points in a particular window. This condition may be guaranteed for a transition
or spike in the waveform. However, a problem with window size may exist if the WR
code has the same global time windows for the entire circuit.

4. Conclusions. The work presented in this paper increases understanding of
WR partitioning for an extended class of circuits. A simple example that captures
some of the properties of large interconnect subcircuits was analyzed in detail, and
the rate of convergence was related to time window size. We conclude that large
interconnect subcircuits can be partitioned efficiently if appropriate care is taken in
computing local feedback and choosing time window sizes.

REFERENCES

[1] E. Lelarasmee, A. E. Ruehli, and A. L. Sangiovanni-Vincentelli. The waveform relaxation method
for the time-domain analysis of large scale integrated circuits. IEEE Trans. on CAD of ICs
and Systems, CAD-1(3):131-145, July 1982.
[2] L. W. Nagel. SPICE2, a computer program to simulate semiconductor circuits. Memo UCB/ERL
M520, University of California, Berkeley, May 1975.
[3] W. T. Weeks, A. J. Jimenez, G. W. Mahoney, D. Mehta, H. Quassemzadeh, and T. R. Scott.
Algorithms for ASTAP - a network analysis program. IEEE Trans. on Circuit Theory,
CT-20:628-634, November 1973.
[4] U. Miekkala, O. Nevanlinna, and A. E. Ruehli. Convergence and circuit partitioning aspects for
waveform relaxation. Proc. of Fifth Distrib. Memory Computing Conf., D. W. Walker and
Q.F. Strout, Eds., IEEE Comp. Society Press, pages 605-611, 1990.
[5] J. White and A. L. Sangiovanni-Vincentelli. Partitioning algorithms and parallel implementations
of waveform relaxation algorithms for circuit simulation. IEEE Proc. Int. Symp.
Circuits and Systems, (ISCAS-85), pages 1069-1072, 1985.
[6] A. E. Ruehli, G. Gristede, and C. Zukowski. On partitioning and windowing for waveform
relaxation. In Proc. Seventh Int. Conf. Numerical Analysis of Semiconductor Devices and
Circuits, pages 69-72, Boulder, Colorado, April 1991. Front Range Press.
[7] B. Leimkuhler, U. Miekkala, and O. Nevanlinna. Waveform relaxation for linear RC circuits.
Impact of Comp. in Science and Engineering, pages 123-145, 1991.
[8] C. Zukowski. Relaxing bounds for linear RC mesh circuits. IEEE Trans. on CAD of ICs and
Systems, pages 305-312, April 1986.
[9] U. Miekkala and O. Nevanlinna. Convergence of waveform relaxation method. IEEE Int. Conf.
Circuits and Systems (ISCAS-88), pages 1643-1646, 1988.
SWITCHED NETWORKS

J. VLACH* AND D. BEDROSIAN**

Abstract. Inconsistent initial conditions, which can exist in switched networks, cannot be
handled by the usual integration routines. A method based on numerical inversion of the Laplace
transform was developed. It is equivalent to a high-order integration and can handle inconsistent
initial conditions, discontinuous functions and Dirac impulses. The method was used to write
programs for analysis of switched networks.

1. Introduction. The use of switched networks became possible with the
development of semiconductor switches which can operate at high speed and handle
large voltages and currents. Analysis of networks with such switches is possible by
using complex semiconductor models and classical simulators, but computer time
may become excessive while the exact switching responses are rarely of interest.
Making the switches ideal, by assuming their state to be either a true open or short
circuit, reduces considerably computing times, but new problems are created. The
most difficult one is the possibility of inconsistent initial conditions.
The easiest way to visualize inconsistent initial conditions is to consider two
capacitors, one of them discharged and one of them charged to some voltage. If the
two capacitors are suddenly connected by an ideal switch, the charged capacitor
immediately transfers part of its charge to the other capacitor and their common
voltage at the instant of switching becomes equal. The charge transfer is achieved by
an infinitely short impulse of current, the so-called Dirac impulse. Strictly speaking,
it is not a function, but it is used widely in electrical engineering. It is denoted by
$\delta(t)$ and can be defined as a rectangular impulse having duration $T$ and height $1/T$,
in the limit $T \to 0$.
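
The two-capacitor example can be worked numerically from charge conservation alone. The sketch below is only an illustration under that assumption; the function name `connect_capacitors` and the component values are hypothetical and do not come from the paper.

```python
def connect_capacitors(c1, v1, c2, v2):
    """Common voltage and transferred charge after an ideal switch joins two capacitors."""
    # Total charge is conserved at the instant of switching.
    v_common = (c1 * v1 + c2 * v2) / (c1 + c2)
    # Charge moved onto the second capacitor; this is the area of the current impulse.
    q_transferred = c2 * (v_common - v2)
    return v_common, q_transferred

# Hypothetical values: 1 uF charged to 5 V, connected to a discharged 1 uF capacitor.
v, q = connect_capacitors(1e-6, 5.0, 1e-6, 0.0)
print(v, q)
```

The switch current is then `q` multiplied by the Dirac impulse: a finite charge is transferred in zero time.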
The simple problem described above is easy to resolve, but in more compli-
cated cases it is very difficult to find the initial conditions after switching and very
often the possibility of impulsive currents is simply disregarded. This can have
detrimental influence on the results.
This paper describes development of an integration method which automati-
cally takes care of inconsistent initial conditions, discontinuous functions or Dirac
impulses at the instant of switching. Only small examples and brief description are
given here. For details, the reader is referred to the references.

*Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada.
**Analogy, Beaverton, Oregon 97075 USA.

2. Development of the method. It is known that the Laplace transform
correctly handles Dirac impulses and inconsistent initial conditions, but its use in
linear networks requires the knowledge of the eigenvalues, a step which is impractical
for larger systems. We avoid their evaluation by developing a numerical method of
inversion. It is applied like the commonly known integration methods, but it does
retain the ability of the Laplace transform to correctly handle inconsistent initial
conditions. Consider the inverse Laplace transform

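$$v(t) = \frac{1}{2\pi j}\int_{c-j\infty}^{c+j\infty} V(s)\,e^{st}\,ds$$

and use the substitution $s = z/t$ to obtain

$$v(t) = \frac{1}{2\pi j t}\int_{c-j\infty}^{c+j\infty} V\!\left(\frac{z}{t}\right) e^{z}\,dz.$$
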
Approximate $e^z$ with the Padé rational function:

$$e^z \approx R_{N,M}(z) = \frac{\sum_{i=0}^{N} (M+N-i)!\,\binom{N}{i}\,z^i}{\sum_{i=0}^{M} (-1)^i\,(M+N-i)!\,\binom{M}{i}\,z^i}.$$

For $N < M$ it is equivalent to

$$R_{N,M}(z) = \sum_{i=1}^{M} \frac{K_i}{z - z_i},$$

where the numerical evaluation of the roots, $z_i$, and residues, $K_i$, is done only once.
The inversion formula becomes

$$v(t) \approx \hat{v}(t) = \frac{1}{2\pi j t}\sum_{i=1}^{M}\int_{c-j\infty}^{c+j\infty} \frac{K_i}{z - z_i}\,V\!\left(\frac{z}{t}\right) dz.$$

Applying the residue calculus for the integral we arrive at

$$\hat{v}(t) = -\frac{1}{t}\sum_{i=1}^{M} K_i\,V\!\left(\frac{z_i}{t}\right).$$

It is proved in [1] (Appendix C) that the integration by residues is possible provided
the overall integrated function $V(z/t)\,R_{N,M}(z)$ has at least two more poles than zeros. Since
realizable functions have at most one more zero than pole, the approximation M = 4
and N = 0 will suffice, but other choices are valid as well. If M is an even integer,
we can simplify the formula to

(1)    $\hat{v}(t) = -\frac{2}{t}\sum_{i=1}^{M/2} \mathrm{Re}\left[K_i\,V\!\left(\frac{z_i}{t}\right)\right]$

and consider only upper half plane poles and their residues. It is shown in [1] that
formula (1) approximates the first M + N + 1 terms of the Taylor expansion of v(t)
for any t > 0.
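
As an illustration, the poles and residues of the Padé approximant and the upper-half-plane form of the inversion formula can be sketched in Python with NumPy. This is a minimal sketch, not the authors' program: the function names `pade_poles_residues` and `invert` are ours, the residues are computed as $P(z_i)/Q'(z_i)$ on the assumption that all poles are simple, and the test function $V(s) = 1/(s+1)$ with $v(t) = e^{-t}$ is chosen for convenience.

```python
import numpy as np
from math import comb, factorial

def pade_poles_residues(M=4, N=0):
    """Poles z_i and residues K_i of the Pade approximant R_{N,M}(z) of exp(z), N < M."""
    num = [factorial(M + N - i) * comb(N, i) for i in range(N + 1)]
    den = [(-1) ** i * factorial(M + N - i) * comb(M, i) for i in range(M + 1)]
    P = np.polynomial.Polynomial(num)   # coefficients in ascending powers of z
    Q = np.polynomial.Polynomial(den)
    z = Q.roots()
    K = P(z) / Q.deriv()(z)             # residue at a simple pole: P(z_i) / Q'(z_i)
    return z, K

def invert(V, t, z, K):
    """Approximate v(t) from V(s) using only the upper-half-plane poles (M even)."""
    up = z.imag > 0
    return -(2.0 / t) * np.sum((K[up] * V(z[up] / t)).real)

z, K = pade_poles_residues()            # M = 4, N = 0
t = 1e-3
v_hat = invert(lambda s: 1.0 / (s + 1.0), t, z, K)
print(v_hat, np.exp(-t))
```

The poles and residues are computed once and reused for every time point and every transform $V$.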
In the case of networks we can use the system equations

(2)    $(G + sC)V = W(s) + I(0^-)$

Here G and C are real matrices, W is a vector of external sources, V is the solution
vector in the Laplace domain and I is a vector of initial capacitor voltages and initial
inductor currents. In this equation, every $s$ is substituted by $z_i/t$, the evaluation
is repeated M/2 times, and the results are used in (1) to obtain the time domain
vector, v(t). The initial conditions vector, I, is obtained as the product

$$I = C\,v(t)$$

where v(t) is the result from the previous step. The possibility of resetting the
initial conditions by means of initial voltages and currents makes the procedure
equivalent to a numerical integration formula, somewhat resembling the Runge-
Kutta methods, because previous results are not needed. However, evaluations are
done in complex. In the following we assume the selection M = 4, N = O.
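
The stepping procedure can be sketched as follows for M = 4, N = 0. This is a hedged illustration only: the function name `step`, the two-node example matrices and the constant source are assumptions made here, and the initial-condition vector is taken as the product of C with the previous result.

```python
import numpy as np

# Upper-half-plane poles and residues of R_{0,4}(z) = 24 / (z^4 - 4z^3 + 12z^2 - 24z + 24).
Q = np.polynomial.Polynomial([24.0, -24.0, 12.0, -4.0, 1.0])
z = Q.roots()
K = 24.0 / Q.deriv()(z)
up = z.imag > 0
z, K = z[up], K[up]

def step(G, C, v_prev, h, W=None):
    """Advance (G + sC)V = W(s) + C v_prev by one step h, using M/2 = 2 complex solves."""
    acc = np.zeros(len(v_prev))
    for zi, Ki in zip(z, K):
        s = zi / h
        rhs = C @ v_prev + (W(s) if W is not None else 0.0)
        V = np.linalg.solve(G + s * C, rhs)
        acc += (Ki * V).real
    return -(2.0 / h) * acc

# Hypothetical two-node RC example driven by a constant source w, so W(s) = w / s.
G = np.array([[2.0, -1.0], [-1.0, 2.0]])
C = np.eye(2)
w = np.array([1.0, 0.0])
v = np.zeros(2)
for _ in range(100):                    # 100 steps of h = 0.01, up to t = 1
    v = step(G, C, v, 0.01, W=lambda s: w / s)
print(v)
```

Each step needs only the result of the previous one, which is the sense in which the procedure resembles a one-step integration formula.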

3. Initial conditions. When dealing with inconsistent initial conditions, we
may need the answer to one of two problems:
1. Find the initial condition at t = 0+.
2. Establish that the Dirac impulse exists at t = 0 and find its area.
Equation (1) cannot evaluate the response at t = 0 due to the division by t, but
a choice of a small t could approximate the initial conditions at t = 0+. Such
approximation turns out to be acceptable in some situations and unacceptable in
others. To resolve the problem consider the network in Figure l(a). The input is a
Dirac impulse, the output in the Laplace domain is

$$V = \frac{1}{s+1}$$

and in the time domain

$$v(t) = e^{-t}.$$

Application of formula (1) gives a value which is correct to about 15 decimal digits
(on a 16 decimal digit machine) for every t in the interval from $10^{-12}$ to $10^{-3}$,
see Figure 2(a). In each case the error corresponds to a single step. The error
starts growing for larger t, but that is to be expected, since the formula is an
approximation. In this case the Dirac impulse does not appear at the output.
Figure 1. Panels (a) and (b).

The situation changes drastically for Figure 1(b). It is the same network but
the output is taken at a different node. In the Laplace domain the output is

(3)    $V(s) = \frac{s}{s+1} = 1 - \frac{1}{s+1}$

and in time domain

$$v(t) = \delta(t) - e^{-t}$$

Figure 2(a). Relative error versus step size (s).

A Dirac impulse appears at the output. In this case formula (1) gives a large
error for a small step, $e = 10^{-4}$ for $t = 10^{-12}$, see Figure 2(b), again for a single
step. However, the error decreases almost linearly to $e \approx 10^{-13}$ for $t \approx 10^{-3}$. Here
the integration error is acceptable, but the solution is a poor approximation of the
initial condition at t = 0+.
In order to get correct initial conditions at t = 0+, even in the presence of the
Dirac impulse at t = 0, we propose to first make one large step forward, to get to
the minimum of the integration error. Afterwards, starting from this new point, we
make an exactly equal step backward. This backward step is essentially error-free,
since there is no impulse at the new starting point. The error in this step is the
same as in Figure 2(a). As a result of this two-step procedure we get correct initial
condition at t = 0+.
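
The two-step procedure can be tried on the two-capacitor example from the introduction. The sketch below is an assumption-laden illustration rather than the authors' code: the modified nodal formulation with unknowns $[v_1, v_2, i_{sw}]$, the unit element values and the function name `step` are all chosen here. For these values the analytic result is $v_1(0^+) = v_2(0^+) = 0.5$ and $i_{sw}(0^+) = 0.25$.

```python
import numpy as np

# Upper-half-plane poles and residues of R_{0,4}(z) (M = 4, N = 0).
Q = np.polynomial.Polynomial([24.0, -24.0, 12.0, -4.0, 1.0])
z = Q.roots()
K = 24.0 / Q.deriv()(z)
up = z.imag > 0
z, K = z[up], K[up]

def step(G, C, v_prev, h):
    """One source-free step of length h (h may be negative) with formula (1)."""
    acc = np.zeros(len(v_prev))
    for zi, Ki in zip(z, K):
        V = np.linalg.solve(G + (zi / h) * C, C @ v_prev)
        acc += (Ki * V).real
    return -(2.0 / h) * acc

# Unknowns [v1, v2, i_sw]: C1 = C2 = 1 joined by a closed ideal switch, R = 1 at node 2.
G = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, -1.0],
              [1.0, -1.0, 0.0]])
C = np.diag([1.0, 1.0, 0.0])
v_minus = np.array([1.0, 0.0, 0.0])     # inconsistent: v1 != v2 just before switching

h = 1e-3
v_h = step(G, C, v_minus, h)            # forward step across the impulse
v_plus = step(G, C, v_h, -h)            # equal step back; no impulse at the new start
print(v_plus)                           # approximately [0.5, 0.5, 0.25]
```

The backward step starts from a consistent state, so it carries no impulse and recovers the values just after switching.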
Figure 2(b). Relative error versus step size (s).

An explanation of why there is such a difference in the results is also available.
Consider the network in Figure 1(b) for which the output is

$$V(s) = \frac{s}{s+1} = 1 - \frac{1}{s+1}.$$

At t = 0+ the correct solution is $v(0^+) = -1$. If this function is inserted into
formula (1), we obtain

$$\hat{v}(t) = -\frac{2}{t}\,\mathrm{Re}\left[K_1\left(1 - \frac{t}{z_1 + t}\right) + K_2\left(1 - \frac{t}{z_2 + t}\right)\right].$$

For very small t and finite arithmetic precision the fractions will be dominated by
the units. The expression effectively reduces to

$$\hat{v}(t) \approx -\frac{2}{t}\,\mathrm{Re}\left[K_1 + K_2\right] = 0$$

due to the fact that the real parts of the residues are equal in magnitude but have
opposite signs. This error is eliminated by the two-step method.
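
The cancellation can be inspected numerically. In the sketch below (an illustration for M = 4, N = 0; the variable names are ours) the real parts of the two upper-half-plane residues sum to zero only up to rounding, and the leftover is what formula (1) divides by $t$.

```python
import numpy as np

Q = np.polynomial.Polynomial([24.0, -24.0, 12.0, -4.0, 1.0])  # denominator of R_{0,4}
z = Q.roots()
K = 24.0 / Q.deriv()(z)
up = z.imag > 0

# Re[K1 + K2] is zero in exact arithmetic; in floating point a rounding residue is left.
residual = np.sum(K[up]).real
for t in (1e-12, 1e-6, 1e-3):
    # Size of the spurious term (2/t) * Re[K1 + K2] contributed by the impulse part.
    print(t, abs(2.0 * residual / t))
```

The printed values depend on the machine and library, so no particular numbers are claimed here; only the growth of the spurious term as $t$ shrinks is of interest.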

4. Representation of $\delta(t)$. If the network contains switches whose state
depends on the solution variables of the problem, it may be important to discover
whether a Dirac impulse does or does not appear at the instant of switching. For
the purpose of explanation consider the network in Figure 3 where the transistor
Q1 is switched externally by a square wave. Both the transistor and the diode are
modeled as ideal switches. When the external square wave causes the transistor
to act as a short circuit, the voltage at the upper end of the inductor is negative
and the diode does not conduct. The current flowing through the inductor builds
around it a magnetic field. When the transistor switch is suddenly opened by the
external square wave, the flow of current is interrupted and, due to the magnetic
field, a positive Dirac impulse will appear at the upper end of L. This impulse
closes the diode and the current through the inductor can continue to flow into the
right part of the network.
The sequence of these events must be discovered by the analysis method in order
to correctly handle the switching. Thus we must discover whether there is or is not
an impulse at the instant of switching. The Dirac impulse cannot be represented
in the computer, because it has zero duration and is infinitely large. However, we
can represent the solution at the instant of switching by two components,

(4)

Figure 3. Switched network.

The first one is the true initial condition after switching, obtained by the two-
step method explained above. The term $v_\delta$ is a multiplicative factor corresponding
to the area of the Dirac impulse. This term is always finite, can be stored in the
memory of the computer, and has zero value when no impulse has occurred at the
instant of switching. The problem is to find this coefficient.
Consider the situation in which we have reached the instant t = 0- , just before
switching. Using the two step method we can obtain the term v(O+) but we still do
not know whether the impulse has occurred at t = O. To discover its existence we can
calculate the area between t = 0- and t = 0+ by the same two-step method. Since
in the Laplace domain the integration is expressed by division by s, we evaluate

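$$\int_{0^-}^{t} v(\tau)\,d\tau = -\frac{2}{t}\,\mathrm{Re}\sum_{i=1}^{2}\left[K_i\left(\frac{t}{z_i}\right)V\!\left(\frac{z_i}{t}\right)\right]$$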

and do the same for the step back. The difference of the two areas is the area of
the Dirac impulse. Note that this integration is done almost for free; it represents
only four additional multiplications for our selection of M = 4 and N = 0, and for
both steps forward and back.
It is interesting to see the accuracy of this method. For the function

$$V(s) = -5 + \frac{3}{s+2}$$

whose time domain response is

$$v(t) = -5\,\delta(t) + 3e^{-2t},$$


the forward step with 5 ms length provides

$$\int_{0^-}^{0.005} v(t)\,dt = -4.98507475$$

with relative error $2.5 \times 10^{-13}$. The backward step provides

$$\int_{0.005}^{0^+} v(t)\,dt = -0.01492525$$

with relative error $8.4 \times 10^{-11}$. The sum of the two integrals is -5.00000, with
relative error $3.7 \times 10^{-15}$. We thus have an accurate method to find out whether a
Dirac impulse did or did not appear at the instant of switching.
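
The area computation for this example can be sketched as follows. The sketch makes two assumptions that are not in the text: the backward step is modeled by restarting a single-pole response $a/(s+2)$ from the value $a$ obtained at $t = 0.005$, and the helper names `value` and `integral` are ours.

```python
import numpy as np

# Upper-half-plane poles and residues of R_{0,4}(z) (M = 4, N = 0).
Q = np.polynomial.Polynomial([24.0, -24.0, 12.0, -4.0, 1.0])
z = Q.roots()
K = 24.0 / Q.deriv()(z)
up = z.imag > 0
z, K = z[up], K[up]

def value(V, t):
    """Formula (1): approximate v(t) from V(s)."""
    return -(2.0 / t) * np.sum((K * V(z / t)).real)

def integral(V, t):
    """Integral of v from the start of the step to t: formula (1) applied to V(s)/s."""
    return -(2.0 / t) * np.sum((K * (t / z) * V(z / t)).real)

h = 0.005
V_fwd = lambda s: -5.0 + 3.0 / (s + 2.0)
area_fwd = integral(V_fwd, h)            # includes the impulse at t = 0
v_h = value(V_fwd, h)                    # approximately 3*exp(-2h)
V_back = lambda s: v_h / (s + 2.0)       # assumed restart from t = h, no impulse
area_back = integral(V_back, -h)         # step back to t = 0+
impulse_area = area_fwd + area_back
print(area_fwd, area_back, impulse_area)
```

The extra cost over the two-step method itself is one multiplication by $t/z_i$ per pole and per step.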

5. Application. The method was used to write programs for analysis of
switched networks driven by external clocks or operated by internally controlled
switches. If only external periodic clocks are present, frequency domain analysis
is possible [2,3]. If the network has some internally controlled switches, then only
time domain analysis can be used [4,5]. However, if such a network is periodic, then
one more application is possible, an accelerated method for finding the steady state
[6].
Steady state can always be reached by integrating the network equations over
sufficiently long time, until the transients die out. This is usually very expensive and
the uncertainty always remains whether the steady state has actually been reached
with sufficient accuracy. An accelerated method was developed, based on the idea
that in steady state the initial conditions at the beginning of the period must be
equal to the final conditions at the end of the same period. An error function can
be defined

(5)

and an iterative method used to reduce the error to zero. A suitable method is the
Newton-Raphson iteration, based on the equations

$$J^{(k)}[v(0^-)]\,\Delta v^{(k)} = -E[v(0^-)]$$
$$v^{(k+1)} = v^{(k)} + \Delta v^{(k)}$$
where the Jacobian matrix is

The integration method explained above is used to overcome problems with switch-
ing and possible inconsistent initial conditions. It is also used in the evaluation of
the Jacobian. Mathematical details are given in [6,7] where the efficiency of the
method is demonstrated on several practical examples.
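
The Newton-Raphson iteration for the steady state can be illustrated on a toy problem. Everything specific in this sketch is an assumption made for illustration: the error function is taken as the initial value minus the value one period later, the one-period map is an analytic stand-in for the integration method (a first-order system driven by a square wave), and the Jacobian is formed by finite differences rather than by the method of [6,7].

```python
import numpy as np

def period_map(v0):
    """Stand-in for one period of x' = -x + u(t), with u = 1 on [0,1) and 0 on [1,2)."""
    v1 = 1.0 + (v0 - 1.0) * np.exp(-1.0)   # first half period, source on
    return v1 * np.exp(-1.0)               # second half period, source off

def error(v0):
    """Assumed error function: mismatch between start and end of the period."""
    return v0 - period_map(v0)

def steady_state(v0, tol=1e-12, max_iter=20, eps=1e-6):
    v = np.atleast_1d(np.asarray(v0, dtype=float))
    for _ in range(max_iter):
        E = error(v)
        if np.linalg.norm(E) < tol:
            break
        J = np.empty((v.size, v.size))
        for j in range(v.size):            # finite-difference Jacobian, column by column
            dv = np.zeros(v.size)
            dv[j] = eps
            J[:, j] = (error(v + dv) - E) / eps
        v = v + np.linalg.solve(J, -E)     # Newton update: J dv = -E, v <- v + dv
    return v

v_ss = steady_state(0.0)
print(v_ss)
```

For this linear stand-in the analytic steady-state initial value is $1/(e+1)$, and the iteration reaches it in very few steps because the period map is affine.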

REFERENCES

[1] J. VLACH AND K. SINGHAL, Computer Methods for Circuit Analysis and Design, Van Nostrand
Reinhold, New York, 1983.
[2] A. OPAL AND J. VLACH, Consistent initial conditions of linear switched networks, IEEE
Transactions on Circuits and Systems, CAS-37 (3), March, 1990, pp. 364-372.
[3] A. OPAL AND J. VLACH, Analysis and sensitivity of periodically switched linear networks,
IEEE Transactions on Circuits and Systems, CAS-36 (4), April, 1989, pp. 522-532.
[4] D. BEDROSIAN AND J. VLACH, Time-Domain Analysis of Networks with Internally Controlled
Switches, Vol. 39 (3), March, 1992, pp. 199-212.
[5] D. BEDROSIAN AND J. VLACH, Analysis of Switched Networks, International Journal of
Circuit Theory and Applications, Vol. 20 (3), May-June, 1992, pp. 309-325.
[6] D. BEDROSIAN AND J. VLACH, Accelerated Steady-State Method for Networks with Internally
Controlled Switches, IEEE Transactions on Circuits and Systems I: Fundamental Theory and
Applications, July 1992, Volume 39, Number 7, pp. 520-530.
[7] D. BEDROSIAN AND J. VLACH, An Accelerated Steady-State Method for Networks with
Internally Controlled Switches, IEEE International Conf. on Computer-Aided Design, Santa
Clara, California, November 11-14, 1991, pp. 24-27.