MODULE 1
INTRODUCTION
1.1 WHAT IS AI?
We have claimed that AI is exciting, but we have not said what it is. In Figure
1.1 we see eight definitions of AI, laid out along two dimensions. The
definitions on top are concerned with thought processes and reasoning,
whereas the ones on the bottom address behavior. The definitions on the left
measure success in terms of fidelity to human performance, whereas the ones on
the right measure against an ideal performance measure, called rationality. A
system is rational if it does the “right thing,” given what it knows.
Historically, all four approaches to AI have been followed, each by different
people with different methods. A human-centered approach must be in part an
empirical science, involving observations and hypotheses about human
behavior. A rationalist approach involves a combination of mathematics and
engineering. The various groups have both disparaged and helped each other. Let
us look at the four approaches in more detail.
1.1.1 Acting humanly: The Turing Test approach
The Turing Test, proposed by Alan Turing (1950), was designed to provide a
satisfactory operational definition of intelligence. A computer passes the test if a
human interrogator, after posing some written questions, cannot tell whether the
written responses come from a person or from a computer. We discuss the
details of the test later, along with whether a computer would really be intelligent if it
passed. For now, we note that programming a computer to pass a rigorously
applied test provides plenty to work on. The computer would need to possess
the following capabilities:
1. Natural Language Processing to enable it to communicate successfully
in English;
2. Knowledge Representation to store what it knows or hears;
3. Automated Reasoning to use the stored information to answer questions
and to draw new conclusions;
4. Machine Learning to adapt to new circumstances and to detect and
extrapolate patterns.
Turing’s test deliberately avoided direct physical interaction between the
interrogator and the computer, because physical simulation of a person is
unnecessary for intelligence. However, the so-called total Turing Test includes
a video signal so that the interrogator can test the subject’s perceptual abilities,
as well as the opportunity for the interrogator to pass physical objects "through
the hatch." To pass the total Turing Test, the computer will need
1. Computer Vision to perceive objects, and
2. Robotics to manipulate objects and move about.
These six disciplines compose most of AI, and Turing deserves credit for
designing a test that remains relevant 60 years later. Yet AI researchers have
devoted little effort to passing the Turing Test, believing that it is more
important to study the underlying principles of intelligence than to duplicate an
exemplar. The quest for "artificial flight" succeeded when the Wright brothers
and others stopped imitating birds and started using wind tunnels and learning
about aerodynamics. Aeronautical engineering texts do not define the goal of
their field as making "machines that fly so exactly like pigeons that they can
fool even other pigeons."
1.1.2 Thinking humanly: The cognitive modeling approach
If we are going to say that a given program thinks like a human, we must have
some way of determining how humans think. We need to get inside the actual
workings of human minds. There are three ways to do this: through
introspection—trying to catch our own thoughts as they go by; through
psychological experiments—observing a person in action; and through brain
imaging—observing the brain in action. Once we have a sufficiently precise
theory of the mind, it becomes possible to express the theory as a computer
program. If the program’s input–output behavior matches corresponding human
behavior, that is evidence that some of the program’s mechanisms could also be
operating in humans. For example, Allen Newell and Herbert Simon, who
developed GPS, the "General Problem Solver" (Newell and Simon, 1961), were
not content merely to have their program solve problems correctly. They were
more concerned with comparing the trace of its reasoning steps to traces of
human subjects solving the same problems. The interdisciplinary field of cognitive science brings
together computer models from AI and experimental techniques from
psychology to construct precise and testable theories of the human mind.
Cognitive science is a fascinating field in itself, worthy of several textbooks and
at least one encyclopedia (Wilson and Keil, 1999). We will occasionally
comment on similarities or differences between AI techniques and human
cognition. Real cognitive science, however, is necessarily based on
experimental investigation of actual humans or animals. We will leave that for
other books, as we assume the reader has only a computer for experimentation.
In the early days of AI there was often confusion between the approaches: an
author would argue that an algorithm performs well on a task and that it is
therefore a good model of human performance, or vice versa. Modern authors
separate the two kinds of claims; this distinction has allowed both AI and
cognitive science to develop more rapidly. The two fields continue to fertilize
each other, most notably in computer vision, which incorporates
neurophysiological evidence into computational models.
1.1.3 Thinking rationally: The “laws of thought” approach
The Greek philosopher Aristotle was one of the first to attempt to codify "right
thinking," that is, irrefutable reasoning processes. His syllogisms provided
patterns for argument structures that always yielded correct conclusions when
given correct premises—for example, "Socrates is a man; all men are mortal;
therefore, Socrates is mortal." These laws of thought were supposed to govern
the operation of the mind; their study initiated the field called logic.
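As a concrete illustration (written here in the modern first-order notation described in the next paragraph, not in Aristotle's own notation), the Socrates syllogism can be stated as

\[
\forall x\,\big(\mathit{Man}(x) \Rightarrow \mathit{Mortal}(x)\big), \quad \mathit{Man}(\mathit{Socrates}) \ \vdash\ \mathit{Mortal}(\mathit{Socrates})
\]

where the conclusion follows by instantiating the universal rule with x = Socrates and applying modus ponens.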
Logicians in the 19th century developed a precise notation for statements about
all kinds of objects in the world and the relations among them. (Contrast this
with ordinary arithmetic notation, which provides only for statements about
numbers.) By 1965, programs existed that could, in principle, solve any
solvable problem described in logical notation. (Although if no solution exists,
the program might loop forever.) The so-called logicist tradition within artificial
intelligence hopes to build on such programs to create intelligent systems. There
are two main obstacles to this approach. First, it is not easy to take informal
knowledge and state it in the formal terms required by logical notation,
particularly when the knowledge is less than 100% certain. Second, there is a
big difference between solving a problem "in principle" and solving it in
practice. Even problems with just a few hundred
facts can exhaust the computational resources of any computer unless it has
some guidance as to which reasoning steps to try first. Although both of these
obstacles apply to any attempt to build computational reasoning systems, they
appeared first in the logicist tradition.
1.1.4 Acting rationally: The rational agent approach
An agent is just something that acts (agent comes from the Latin agere, to do).
Of course, all computer programs do something, but computer agents are
expected to do more: operate autonomously, perceive their environment, persist
over a prolonged time period, adapt to change, and create and pursue goals.
A rational agent is one that acts so as to achieve the best outcome or, when
there is uncertainty, the best expected outcome. In the "laws of thought"
approach to AI, the emphasis was on correct inferences. Making correct
inferences is sometimes part of being a rational agent, because one way to act
rationally is to reason logically to the conclusion that a given action will achieve
one’s goals and then to act on that conclusion. On the other hand, correct
inference is not all of rationality; in some situations, there is no provably correct
thing to do, but something must still be done. There are also ways of acting
rationally that cannot be said to involve inference. For example, recoiling from
a hot stove is a reflex action that is usually more successful than a slower action
taken after careful deliberation. All the skills needed for the Turing Test also
allow an agent to act rationally. Knowledge representation and reasoning enable
agents to reach good decisions.
The rational-agent approach has two advantages over the other approaches.
First, it is more general than the "laws of thought" approach because correct
inference is just one of several possible mechanisms for achieving rationality.
Second, it is more amenable to scientific development than are approaches
based on human behavior or human thought.
The standard of rationality is mathematically well defined and completely
general, and can be "unpacked" to generate agent designs that provably achieve
it.
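The agent view can be made concrete with a few lines of code. The following is a minimal sketch of the perceive–decide–act loop, not a design from the text; the ThermostatAgent class, its 20-degree setpoint, and the action names are illustrative assumptions.

```python
class Agent:
    """Minimal agent skeleton: map a stream of percepts to actions."""

    def select_action(self, percept):
        raise NotImplementedError


class ThermostatAgent(Agent):
    """A simple reflex agent that acts on the current percept only.

    The 20-degree setpoint and action names are illustrative assumptions,
    not values from the text.
    """

    def __init__(self, setpoint=20.0):
        self.setpoint = setpoint

    def select_action(self, percept):
        temperature = percept
        if temperature < self.setpoint:
            return "heat_on"
        return "heat_off"


def run(agent, percepts):
    """Drive the perceive-decide-act loop over a sequence of percepts."""
    return [agent.select_action(p) for p in percepts]


if __name__ == "__main__":
    print(run(ThermostatAgent(), [18.5, 19.9, 21.2]))
    # ['heat_on', 'heat_on', 'heat_off']
```

A rational agent would replace the hard-coded rule with whatever decision procedure maximizes its expected performance measure given its percepts.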
1.2 THE FOUNDATIONS OF ARTIFICIAL INTELLIGENCE
1. Philosophy:
• Can formal rules be used to draw valid conclusions?
• How does the mind arise from a physical brain?
• Where does knowledge come from?
• How does knowledge lead to action?
The final element in the philosophical picture of the mind is the connection
between knowledge and action. This question is vital to AI because intelligence
requires action as well as reasoning. Moreover, only by understanding how
actions are justified can we understand how to build an agent whose actions are
justifiable (or rational). Aristotle argued (in De Motu Animalium) that actions
are justified by a logical connection between goals and knowledge of the
action’s outcome.
2. Mathematics:
• What are the formal rules to draw valid conclusions?
• What can be computed?
• How do we reason with uncertain information?
Philosophers staked out some of the fundamental ideas of AI, but the leap to a
formal science required a level of mathematical formalization in three
fundamental areas: logic, computation, and probability.
The idea of formal logic can be traced back to the philosophers of ancient
Greece, but its mathematical development really began with the work of George
Boole (1815–1864), who worked out the details of propositional, or Boolean,
logic (Boole, 1847). In 1879, Gottlob Frege (1848–1925) extended Boole’s
logic to include objects and relations, creating the first-order logic that is used
today. Alfred Tarski (1902–1983) introduced a theory of reference that shows
how to relate the objects in a logic to objects in the real world.
The next step was to determine the limits of what could be done with
logic and computation.
The first nontrivial algorithm is thought to be Euclid’s algorithm for computing
greatest common divisors. The word algorithm (and the idea of studying them)
comes from al-Khowarazmi, a Persian mathematician of the 9th century, whose
writings also introduced Arabic numerals and algebra to Europe. Boole and
others discussed algorithms for logical deduction, and, by the late 19th century,
efforts were under way to formalize general mathematical reasoning as logical
deduction.
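Euclid's algorithm is short enough to state directly; here is a sketch of the standard formulation in Python (the example values are illustrative):

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b);
    when the remainder reaches zero, the other value is the GCD."""
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(252, 198))  # 18
```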
Besides logic and computation, the third great contribution of
mathematics to AI is the theory of probability. The Italian Gerolamo Cardano
(1501–1576) first framed the idea of probability, describing it in terms of the
possible outcomes of gambling events.
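Cardano's style of reasoning, counting equally likely outcomes of a game of chance, can be reproduced in a few lines; the two-dice example below illustrates that style and is not a calculation from the text.

```python
from fractions import Fraction
from itertools import product

# Probability that two fair dice sum to 7, found by enumerating
# all 36 equally likely outcomes, Cardano-style.
outcomes = list(product(range(1, 7), repeat=2))
favourable = [o for o in outcomes if sum(o) == 7]
print(Fraction(len(favourable), len(outcomes)))  # 1/6
```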
3. Economics:
• How should we make decisions so as to maximize payoff?
• How should we do this when others may not go along?
• How should we do this when the payoff may be far in the future?
The science of economics got its start in 1776, when Scottish philosopher Adam
Smith (1723–1790) published An Inquiry into the Nature and Causes of the
Wealth of Nations. While the ancient Greeks and others had made contributions
to economic thought, Smith was the first to treat it as a science, using the idea
that economies can be thought of as consisting of individual agents maximizing
their own economic well-being. Most people think of economics as being about
money, but economists will say that they are really studying how people make
choices that lead to preferred outcomes.
Work in economics and operations research has contributed much to our
notion of rational agents, yet for many years AI research developed along
entirely separate paths. One reason was the apparent complexity of making
rational decisions.
4. Neuroscience:
• How do brains process information?
Brains and digital computers have somewhat different properties. Computers
have a cycle time that is a million times faster than a brain. The brain makes up
for that with far more storage and interconnection than even a high-end personal
computer, although the largest supercomputers have a capacity that is similar to
the brain’s. (It should be noted, however, that the brain does not seem to use all
of its neurons simultaneously.) Futurists make much of these numbers, pointing
to an approaching singularity at which computers reach a superhuman level of
performance (Vinge, 1993; Kurzweil, 2005), but the raw comparisons are not
especially informative. Even with a computer of virtually unlimited capacity,
we still would not know how to achieve the brain’s level of intelligence.
5. Psychology:
• How do humans and animals think and act?
The origins of scientific psychology are usually traced to the work of the
German physicist Hermann von Helmholtz (1821–1894) and his student
Wilhelm Wundt (1832–1920). Helmholtz applied the scientific method to the
study of human vision, and his Handbook of Physiological Optics is even now
described as "the single most important treatise on the physics and physiology
of human vision" (Nalwa, 1993, p. 15). In 1879, Wundt opened the first
laboratory of experimental psychology, at the University of Leipzig. Wundt
insisted on carefully controlled experiments in which his workers would
perform a perceptual or associative task while introspecting on their thought
processes. The careful controls went a long way toward making psychology a
science, but the subjective nature of the data made it unlikely that an
experimenter would ever disconfirm his or her own theories. Biologists
studying animal behavior, on the other hand, lacked introspective data and
developed an objective methodology, as described by H. S. Jennings (1906) in
his influential work Behavior of the Lower Organisms. Applying this viewpoint
to humans, the behaviorism movement, led by John Watson (1878–1958),
rejected any theory involving mental processes on the grounds that introspection
could not provide reliable evidence. Behaviorists insisted on studying only
objective measures of the percepts (or stimulus) given to an animal and its
resulting actions (or response). Behaviorism discovered a lot about rats and
pigeons but had less success at understanding humans.
6. Computer Engineering:
• How can we build an efficient computer?
For artificial intelligence to succeed, we need two things: intelligence and an
artifact. The computer has been the artifact of choice. The modern digital
electronic computer was invented independently and almost simultaneously by
scientists in three countries embattled in World War II.
AI also owes a debt to the software side of computer science, which has
supplied the operating systems, programming languages, and tools needed to
write modern programs (and papers about them). But this is one area where the
debt has been repaid: work in AI has pioneered many ideas that have made their
way back to mainstream computer science, including time sharing, interactive
interpreters, personal computers with windows and mice, rapid development
environments, the linked list data type, automatic storage management, and key
concepts of symbolic, functional, declarative, and object-oriented programming.
7. Control theory and Cybernetics:
• How can artifacts operate under their own control?
Ktesibios of Alexandria (c. 250 B.C.) built the first self-controlling machine: a
water clock with a regulator that maintained a constant flow rate. This invention
changed the definition of what an artifact could do. Previously, only living
things could modify their behavior in response to changes in the environment.
Other examples of self-regulating feedback control systems include the steam
engine governor, created by James Watt (1736–1819), and the thermostat,
invented by Cornelis Drebbel (1572–1633), who also invented the submarine.
The mathematical theory of stable feedback systems was developed in the 19th
century.
Modern control theory, especially the branch known as stochastic optimal
control, has as its goal the design of systems that maximize an objective
function over time. This roughly matches our view of AI: designing systems
that behave optimally. Why, then, are AI and control theory two different
fields, despite the close connections among their founders? The answer lies in
the close coupling between the mathematical techniques that were familiar to
the participants and the corresponding sets of problems that were encompassed
in each world view. Calculus and matrix algebra, the tools of control theory,
lend themselves to systems that are describable by fixed sets of continuous
variables, whereas AI was founded in part as a way to escape from these
perceived limitations. The tools of logical inference and computation allowed
AI researchers to consider problems such as language, vision, and planning that
fell completely outside the control theorist’s purview.
8. Linguistics:
• How does language relate to thought?
Modern linguistics and AI, then, were "born" at about the same time, and grew
up together, intersecting in a hybrid field called computational linguistics or
natural language processing. The problem of understanding language soon
turned out to be considerably more complex than it seemed in 1957.
Understanding language requires an understanding of the subject matter and
context, not just an understanding of the structure of sentences. This might seem
obvious, but it was not widely appreciated until the 1960s. Much of the early
work in knowledge representation (the study of how to put knowledge into a
form that a computer can reason with) was tied to language and informed by
research in linguistics, which was connected in turn to decades of work on the
philosophical analysis of language.
1.3 HISTORY OF ARTIFICIAL INTELLIGENCE
1.3.1 The Gestation of AI
In 1943, Warren McCulloch and Walter Pitts laid the foundation for
artificial intelligence (AI) by creating a model of artificial neurons
inspired by brain physiology, propositional logic, and Turing's theory of
computation.
They demonstrated that networks of these neurons could compute any
function and implement logical operations. Donald Hebb (1949)
introduced Hebbian learning to modify connection strengths between
neurons, a concept still influential today.
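A McCulloch–Pitts unit is simply a thresholded sum of binary inputs. The sketch below shows how such units realize logical AND and OR; the particular weights and thresholds are illustrative choices, not values from the 1943 paper.

```python
def mcp_unit(inputs, weights, threshold):
    """McCulloch-Pitts neuron: output 1 ("fire") if the weighted sum of
    the binary inputs reaches the threshold, otherwise output 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def AND(x1, x2):
    return mcp_unit([x1, x2], weights=[1, 1], threshold=2)


def OR(x1, x2):
    return mcp_unit([x1, x2], weights=[1, 1], threshold=1)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
```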
In 1950, Harvard students Marvin Minsky and Dean Edmonds built the
first neural network computer, SNARC. Minsky later explored universal
computation in neural networks at Princeton.
Alan Turing's 1950 article introduced key AI concepts, including the
Turing Test, machine learning, genetic algorithms, and reinforcement
learning. Turing also proposed the Child Programme idea, simulating a
child's mind instead of an adult's.
1.3.2 The Birth of AI
In 1951, John McCarthy, a central figure in AI, completed his PhD at
Princeton. Later, in 1956, he organized a workshop at Dartmouth, which is
considered the starting point of AI. The goal was to figure out how to make
machines simulate human intelligence. Attendees included famous researchers
like Allen Newell and Herbert Simon. The workshop didn't bring big
breakthroughs, but it united key people.
For the next 20 years, AI was shaped by these people and their connections at
MIT, CMU, Stanford, and IBM. The Dartmouth proposal highlighted that AI
focuses on imitating human abilities, using computer science as its method. AI
became its own field because it had unique goals and methods, unlike control
theory, operations research, or decision theory.
1.3.3 Early Enthusiasm, great expectations
In the early days of AI, with basic computers, pioneers like John McCarthy and
others amazed people by making computers do clever things.
● Allen Newell and Herbert Simon made the General Problem Solver, a program
that solved problems the way humans do. It sparked the idea that intelligence
involves manipulating symbols.
● At IBM, Herbert Gelernter and Arthur Samuel created early AI programs.
● In 1958, McCarthy created Lisp, a key programming language for AI.
● McCarthy later started the AI lab at Stanford, with an emphasis on logic.
● Researchers explored "microworlds" such as the blocks world, limited domains
that still called for intelligent behavior.
● Early work on neural networks, inspired by McCulloch and Pitts, also
advanced.
● All these achievements set the stage for the future of AI.
1.3.4 A dose of reality
In 1957, Herbert Simon claimed that machines that think and learn already existed
and predicted that their abilities would soon rival the human mind's. But early AI
ran into problems.
● Translating languages failed because computers lacked knowledge.
● Scaling up to faster hardware did not help, because the number of possibilities
to explore in hard problems grows combinatorially.
● In 1973, the Lighthill report criticized AI, reducing support.
● In 1969, Minsky and Papert showed that perceptrons, the basic neural structures
of the time, had fundamental limits.
● New learning methods came later, but early AI struggled with big
expectations and real-world difficulties.
"AI Winter" symbolizes a period marked by reduced enthusiasm and backing
for advancements in artificial intelligence.
1.3.5 Knowledge-Based Systems
In the early days of AI, researchers used weak methods, that is, general-purpose
search, to find solutions.
● DENDRAL, an expert system, broke new ground by using domain-specific
knowledge to infer molecular structure. It replaced exhaustive search with
chemists' pattern-recognition rules, making it far more efficient.
● DENDRAL was knowledge-intensive and used specialized rules.
● MYCIN, another expert system, used backward chaining to identify
microorganisms causing severe diseases like bacteraemia and meningitis and to
propose antibiotics, with doses adjusted for patient weight (a minimal sketch of
backward chaining appears after this list).
● Since then, domain knowledge became crucial in natural language
understanding. While early systems like SHRDLU had limitations, Roger
Schank's work at Yale emphasized knowledge representation and reasoning for
language understanding.
● Real-world applications led to different languages, from logic-based Prolog to
Minsky's frame-based approach.
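To make the backward-chaining idea concrete, here is a toy sketch in Python; the rules, facts, and goals are invented for illustration and are not MYCIN's actual medical knowledge.

```python
# Toy backward chainer: to prove a goal, either find it among the known
# facts, or find a rule concluding it and recursively prove its premises.
RULES = [
    (["gram_negative", "rod_shaped"], "e_coli_likely"),    # illustrative rules,
    (["fever", "high_white_count"], "infection_present"),  # not from MYCIN
]
FACTS = {"gram_negative", "rod_shaped"}


def prove(goal, facts, rules):
    if goal in facts:
        return True
    return any(
        conclusion == goal and all(prove(p, facts, rules) for p in premises)
        for premises, conclusion in rules
    )


print(prove("e_coli_likely", FACTS, RULES))      # True
print(prove("infection_present", FACTS, RULES))  # False
```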
1.3.6 AI Becomes An Industry
In the early 1980s, the first successful commercial expert system, R1, operated
at Digital Equipment Corporation, saving millions of dollars. By 1988, major
corporations like DEC and DuPont had deployed numerous expert systems,
resulting in significant cost savings. The AI industry grew rapidly, reaching
billions of dollars with companies developing expert systems, vision systems,
robots, and specialized software and hardware. However, the period known as
the "AI Winter" followed, marked by companies failing to fulfill grand
promises, leading to a downturn in the AI industry.
1.3.7 The Return of Neural Networks
In the 1980s, researchers rediscovered a learning algorithm called
backpropagation, first found in 1969. They applied it to solve learning
problems in computer science and psychology. Some thought that connectionist
models, which emphasize neural networks, could challenge symbolic and logic-
based approaches in AI. There was a debate about whether manipulating
symbols played a crucial role in human thinking. Nowadays, we see both
connectionist and symbolic approaches as working together, not competing.
Current neural network research has two branches: one focuses on designing
effective systems, and the other studies the properties of real neurons.
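Back-propagation is, at its core, the chain rule applied to a network's error, layer by layer. The sketch below shows that gradient computation for a single sigmoid unit trained on logical OR; the data, learning rate, and iteration count are illustrative assumptions, and a multi-layer network would repeat the same chain-rule step backwards through each layer.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Train one sigmoid unit on logical OR by gradient descent on squared error.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, b = (random.uniform(-0.5, 0.5) for _ in range(3))
lr = 0.5

for _ in range(5000):
    for (x1, x2), target in data:
        y = sigmoid(w1 * x1 + w2 * x2 + b)
        # Chain rule: d(error)/d(pre-activation) = (y - target) * y * (1 - y)
        delta = (y - target) * y * (1 - y)
        w1 -= lr * delta * x1
        w2 -= lr * delta * x2
        b -= lr * delta

print([round(sigmoid(w1 * x1 + w2 * x2 + b), 2) for (x1, x2), _ in data])
```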
1.3.8 AI Adopts the Scientific Method
In recent years, there has been a significant shift in Artificial Intelligence (AI)
towards building on existing theories, rigorous experimentation, and real-world
applications.
AI, once isolated, is now integrating with fields like control theory and
statistics.
The scientific method is now firmly applied: hypotheses in AI must be subjected
to empirical experiments and statistical analysis.
The recent dominance of Hidden Markov Models (HMMs) in speech recognition is
due to their rigorous mathematical theory and their training on large bodies of
real speech data.
Similar trends are seen in machine translation and neural networks, which
now benefit from improved methodology and theoretical frameworks.
Judea Pearl's work in probabilistic reasoning led to a new acceptance of
probability and decision theory, with Bayesian networks dominating uncertain
reasoning in AI.
Normative expert systems, acting rationally based on decision theory,
have become prominent.
Similar revolutions have occurred in robotics, computer vision, and
knowledge representation, as increased formalization and integration with
machine learning prove effective in solving complex problems.
1.3.9 The Emergence of Intelligent Agents
Researchers are looking again at the "whole agent" problem in AI; the SOAR
architecture is one well-known example. The Internet has become an important
environment for intelligent agents, used in applications such as search engines.
Building complete agents has shown that AI's previously separate subfields need
to be reorganized and that uncertainty in sensory systems must be handled
explicitly. AI now works closely with fields such as control theory and
economics, for example in controlling robotic cars.
Despite successes, some AI leaders like McCarthy, Minsky, Nilsson, and
Winston weren't happy.
They wanted AI to return to its original goal of Human-Level AI (HLAI),
focusing on machines that think, learn, and create.
Another idea was Artificial General Intelligence (AGI), which seeks a universal
algorithm for learning and acting in any environment, together with ensuring
that the resulting AI is friendly rather than a cause for concern.
1.3.10 The availability of large data sets
In the past 60 years of computer science, people mostly focused on creating
algorithms.
But now, in AI, we're realizing that for many problems, it's more useful to
focus on the data instead of getting too caught up in which algorithm to
use.
This change is because we have a lot of data available, like trillions of
English words or billions of web images.
An important study by Yarowsky showed that, for tasks like figuring out
the meaning of a word in a sentence, you can do it really well without
human-labeled examples.
Another study by Banko and Brill found that having more data is often
more helpful than choosing a specific algorithm.
For instance, Hays and Efros improved a photo-filling tool by using a
bigger collection of photos.
This shift in thinking suggests that in AI, where we need a lot of
knowledge, we might rely more on learning from data instead of
manually coding everything.
With the rise of new AI applications, some say we're moving from "AI
Winter" to a new era, ―AI Summer‖, as AI becomes a fundamental part of
many industries, as noted by Kurzweil.