NLP Module 2
LECTURE 3:
MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
PROBABILITY THEORY
Probability Theory
● Deals with predicting how likely it is that something will happen.
● Experiment (or trial) - the process by which an observation is made.
● Sample space Ω - a collection of all basic outcomes assumed.
○ Discrete - having at most a countably infinite number of basic outcomes
○ Continuous - having an uncountable number of basic outcomes
● ∅ (the empty set) represents the impossible event.
● An experimental outcome must be an event, which is a subset of Ω.
Probability Theory
● Probabilities are numbers between 0 and 1, 0 indicates impossibility and 1 certainty.
● A probability function or distribution distributes a probability mass of 1 throughout
the sample space Ω.
● Formally, a discrete probability function is any function P: Ƒ → [0,1] such that:
○ P(Ω) = 1
○ Countable additivity: for disjoint sets A_j ∈ Ƒ, P(⋃_j A_j) = Σ_j P(A_j)
● A well-founded probability space consists of a sample space Ω, a σ-field of events Ƒ (a
set of subsets of Ω that is closed under complement and countable union of its elements
and has Ω as its maximal element), and a probability function P.
● Example: A fair coin is tossed 3 times. What is the chance of 2 heads?
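The coin example can be worked by listing the sample space directly. The sketch below assumes that "2 heads" means exactly two heads and that all eight outcomes are equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 2^3 = 8 equally likely sequences of H and T.
sample_space = list(product("HT", repeat=3))

# Event: the outcomes with exactly two heads (HHT, HTH, THH).
exactly_two = [outcome for outcome in sample_space if outcome.count("H") == 2]

# Under a uniform distribution, P(event) = |event| / |sample space|.
p_exactly_two = Fraction(len(exactly_two), len(sample_space))
print(p_exactly_two)  # 3/8
```

If the question is read as "at least two heads", the outcome HHH joins the event and the same count gives 4/8.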
Conditional Probability and Independence
● Conditional probability - the probability of an event given partial knowledge about the
outcome of an experiment, which restricts what experimental outcomes are possible.
○ Updated probability of an event given some knowledge.
● The probability of an event before additional knowledge is considered is called the prior
probability of the event.
● The new probability that results from using the additional knowledge is referred to as
the posterior probability of the event.
● The conditional probability of A given B, for P(B) > 0, is
P(A|B) = P(A ∩ B) / P(B)
● The multiplication rule is given by
P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A)
● Two events A, B are independent of each other if P(A ∩ B) = P(A) P(B)
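The definitions of conditional probability and independence can be checked on a small sample space. The following is a minimal sketch that assumes two fair six-sided dice; the events A, B, and C are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B), assuming P(B) > 0."""
    return prob(a & b) / prob(b)

A = {o for o in omega if o[0] == 6}          # first die shows 6
B = {o for o in omega if o[0] + o[1] >= 10}  # the sum is at least 10
C = {o for o in omega if o[1] % 2 == 0}      # second die is even

print(prob(A), cond_prob(A, B))            # prior 1/6, posterior 1/2
print(prob(A & C) == prob(A) * prob(C))    # True: A and C are independent
print(prob(A & B) == prob(A) * prob(B))    # False: A and B are dependent
```

Knowing that the sum is at least 10 raises the probability of a 6 on the first die from 1/6 to 1/2, while knowing the parity of the second die leaves it unchanged.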
Bayes' Theorem
● The general statement of Bayes' theorem is “The conditional probability of an event
A, given the occurrence of another event B, is equal to the product of the probability of B
given A and the probability of A, divided by the probability of event B.” i.e.
P(A|B) = P(B|A) P(A) / P(B)
where P(A) and P(B) are the probabilities of events A and B, and P(B) > 0.
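A small numeric sketch of Bayes' theorem follows. The scenario (spam filtering on a single word) and all three input probabilities are invented for illustration and do not come from the lecture.

```python
from fractions import Fraction

# Hypothetical inputs (assumptions, not measured values).
p_spam = Fraction(1, 5)              # prior P(spam)
p_word_given_spam = Fraction(1, 2)   # P(word | spam)
p_word_given_ham = Fraction(1, 20)   # P(word | not spam)

# Total probability of seeing the word: P(B) = P(B|A)P(A) + P(B|not A)P(not A).
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam | word) = P(word | spam) P(spam) / P(word).
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)  # 5/7
```

With these numbers, observing the word moves the probability of spam from a prior of 1/5 to a posterior of 5/7.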
Statistics
Median: The middle number in a data set when its values are listed in ascending or
descending order.
Mode: The number that occurs most often in a data set.
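Both statistics are available in Python's standard `statistics` module. The data set below is made up for illustration.

```python
import statistics

data = [3, 1, 4, 1, 5, 9, 2]  # hypothetical observations

# Sorted, the data reads 1, 1, 2, 3, 4, 5, 9, so the middle value is 3.
median_value = statistics.median(data)

# The value 1 occurs twice; every other value occurs once.
mode_value = statistics.mode(data)

print(median_value, mode_value)  # 3 1
```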
Counting
● Being able to identify and count the experimental outcomes is a necessary step in
assigning probabilities.
○ Multi Step experiment
○ Combinations
○ Permutations
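The three counting rules can be sketched with Python's `math` module (version 3.8 or later). The formulas used are the standard ones: a multi-step experiment with n1, n2, ... outcomes per step has n1 × n2 × ... outcomes in total, combinations count n! / (k! (n − k)!) unordered selections, and permutations count n! / (n − k)! ordered selections. The concrete numbers are illustrative.

```python
import math

# Multi-step experiment: toss a coin (2 outcomes), then roll a die (6 outcomes).
outcomes_per_step = [2, 6]
total_outcomes = math.prod(outcomes_per_step)

# Combinations: choose 2 items out of 5 when order does not matter.
n_combinations = math.comb(5, 2)

# Permutations: choose 2 items out of 5 when order matters.
n_permutations = math.perm(5, 2)

print(total_outcomes, n_combinations, n_permutations)  # 12 10 20
```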
Frequency
● Probability mass function (pmf): gives the probability that a discrete random variable
takes each of its possible numeric values, p(x) = P(X = x).
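One way to estimate a pmf is by relative frequency: count how often each value occurs and divide by the number of observations. The sketch below assumes a toy token list and takes the random variable X to be the length of a token; both choices are illustrative.

```python
from collections import Counter
from fractions import Fraction

tokens = "the cat sat on the mat the end".split()  # toy corpus (assumption)

# Random variable X: the number of characters in a token.
lengths = [len(token) for token in tokens]

# Relative frequency estimate: p(x) = count(x) / number of observations.
counts = Counter(lengths)
pmf = {x: Fraction(c, len(lengths)) for x, c in counts.items()}

print(pmf[3], pmf[2])     # 7/8 1/8
print(sum(pmf.values()))  # 1
```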
Mean and Variance
● The expectation is the mean or average of a random variable
● The variance of a random variable is a measure of whether the values of the random
variable tend to be consistent over trials or to vary a lot.
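For a discrete random variable with pmf p(x), the standard definitions are E(X) = Σ x p(x) and Var(X) = Σ (x − E(X))² p(x). A minimal sketch, assuming a fair six-sided die as the random variable:

```python
from fractions import Fraction

# pmf of a fair die: each face 1..6 has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf):
    """E(X) = sum of x * p(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum of (x - E(X))^2 * p(x) over all values x."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

print(expectation(die_pmf), variance(die_pmf))  # 7/2 35/12
```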
PROBABILITY DISTRIBUTION
Random Variables
● A random variable is a numerical description of the outcome of an experiment.
● Random variables are classified into two types:
○ Discrete Random Variable : A random variable that may assume either a finite
number of values or an infinite sequence of values such as 0, 1, 2, . . .
○ Continuous Random Variable: A random variable that may assume any
numerical value in an interval or collection of intervals
Discrete Probability Distribution
● The probability distribution for a random variable describes how probabilities are
distributed over the values of the random variable.
● For a discrete random variable x, a probability function, denoted by f (x), provides the
probability for each value of the random variable.
● The classical, subjective, and relative frequency methods of assigning probabilities
can be used to develop discrete probability distributions.
● A primary advantage of defining a random variable and its probability distribution is
that once the probability distribution is known, it is relatively easy to determine the
probability of a variety of events that may be of interest to a decision maker.
Binomial Probability Distribution
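● A binomial experiment exhibits the following four properties:
○ The experiment consists of a sequence of n identical trials.
○ Two outcomes are possible on each trial: a success or a failure.
○ The probability of a success, denoted by p, does not change from trial to trial.
○ The trials are independent.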
Binomial Probability Distribution
● In a binomial experiment, our interest is in the number of successes occurring in the n
trials.
● If we let x denote the number of successes occurring in the n trials, we see that x can
assume the values of 0, 1, 2, 3, . . . , n.
● Because the number of values is finite, x is a discrete random variable.
● The probability distribution associated with this random variable is called the
binomial probability distribution.
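The binomial probability function is f(x) = C(n, x) p^x (1 − p)^(n − x), where n is the number of trials, p the probability of success on each trial, and x the number of successes. The sketch below evaluates it for three tosses of a fair coin; the function name is illustrative.

```python
import math
from fractions import Fraction

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, Fraction(1, 2)  # three tosses of a fair coin, success = heads
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

print(distribution[2])             # 3/8
print(sum(distribution.values()))  # 1
```

The probabilities over x = 0, 1, ..., n sum to 1, as required of any discrete probability distribution.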
Poisson Probability Distribution
● Useful in estimating the number of occurrences over a specified interval of time or
space.
● If the following two properties are satisfied, then the number of occurrences is a random
variable described by the Poisson probability distribution:
○ The probability of an occurrence is the same for any two intervals of equal
length.
○ The occurrence or non-occurrence in any interval is independent of the
occurrence or non-occurrence in any other interval
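The Poisson probability function is f(x) = μ^x e^(−μ) / x!, where μ is the expected number of occurrences in the interval and x = 0, 1, 2, ... is the number of occurrences. The sketch below uses an assumed rate of μ = 2 occurrences per interval.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return mu ** x * math.exp(-mu) / math.factorial(x)

mu = 2.0  # hypothetical mean number of occurrences per interval

p_none = poisson_pmf(0, mu)  # equals e^(-mu)
p_at_most_two = sum(poisson_pmf(x, mu) for x in range(3))

print(p_none, p_at_most_two)
```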
Continuous Probability Distribution
● The probability function for a continuous random variable is known as the probability
density function (PDF).
● The PDF does not directly provide probabilities.
● However the area under the graph f(x) corresponding to a given interval does provide
the probability that the continuous random variable x assumes a value in that interval.
● Because the area under the graph of f(x) at any particular point is zero, one of the
implications of the definition of probability for continuous random variables is that
the probability of any particular value of the random variable is zero.
● For a continuous random variable, we consider probability only in terms of the
likelihood that a random variable assumes a value within a specified interval.
● The total area under the graph of f(x) is equal to 1.
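The area interpretation can be illustrated numerically. The sketch below assumes a uniform density on [0, 10], for which f(x) = 1 / (b − a) inside the interval and 0 outside, and approximates areas with a simple midpoint rule; the helper names are illustrative.

```python
import math

def uniform_pdf(x, a=0.0, b=10.0):
    """Density of a uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def area(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the area under f between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

p_interval = area(uniform_pdf, 2.0, 5.0)   # P(2 <= X <= 5), about 0.3
p_point = area(uniform_pdf, 2.0, 2.0)      # zero-width interval: 0.0
p_total = area(uniform_pdf, -5.0, 15.0)    # total area, about 1.0

print(p_interval, p_point, p_total)
```

The density value itself (0.1 here) is not a probability; only the area over an interval is.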
Uniform Probability Distribution
Normal Probability Distribution
● A random variable that has a normal distribution with a mean of zero and a standard
deviation of one is said to have a standard normal probability distribution.
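Python's `statistics.NormalDist` can be used to explore the standard normal distribution. The sketch below also converts a value from a hypothetical normal distribution (mean 100, standard deviation 15, chosen for illustration) to a standard score z = (x − μ) / σ.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of falling within one standard deviation of the mean.
p_within_one = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

# Hypothetical normal variable; z measures distance from the mean in
# units of standard deviations.
scores = NormalDist(mu=100.0, sigma=15.0)
z = (115.0 - scores.mean) / scores.stdev

print(round(p_within_one, 4))  # 0.6827
print(z)                       # 1.0
```

Because z is 1.0, P(X ≤ 115) for the hypothetical variable equals P(Z ≤ 1) for the standard normal.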
Uniform vs Normal Probability Distribution
ENGLISH GRAMMAR
English Grammar
● Grammar is defined as the rules for forming well-structured sentences.
● Grammar denotes syntactical rules that are used for conversation in natural languages.
● A grammar G can be written as a 4-tuple (N, T, S, P) where,
○ N or V_N = set of non-terminal symbols, or variables.
○ T or ∑ = set of terminal symbols.
○ S = Start symbol where S ∈ N
○ P = set of production rules over terminals and non-terminals.
○ Each production has the form α → β, where α and β are strings over V_N ∪ ∑ and at
least one symbol of α belongs to V_N.
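The 4-tuple definition can be written down as plain Python data and checked against the stated conditions. The toy grammar below (its symbols, words, and rules) is an assumption made for illustration, and `is_well_formed` is a hypothetical helper, not part of any library.

```python
# Toy grammar G = (N, T, S, P).
N = {"s", "np", "vp", "det", "n", "tv"}   # non-terminal symbols
T = {"the", "dog", "cat", "chased"}       # terminal symbols
S = "s"                                   # start symbol
P = [                                     # productions as (alpha, beta) pairs
    (("s",), ("np", "vp")),
    (("np",), ("det", "n")),
    (("vp",), ("tv", "np")),
    (("det",), ("the",)),
    (("n",), ("dog",)),
    (("n",), ("cat",)),
    (("tv",), ("chased",)),
]

def is_well_formed(N, T, S, P):
    """Check the conditions stated in the definition of a grammar."""
    if S not in N or N & T:
        return False  # start symbol must be a non-terminal; N and T are disjoint
    symbols = N | T
    for alpha, beta in P:
        if not all(sym in symbols for sym in alpha + beta):
            return False  # both sides are strings over N ∪ T
        if not any(sym in N for sym in alpha):
            return False  # alpha needs at least one non-terminal
    return True

print(is_well_formed(N, T, S, P))      # True
print(is_well_formed(N, T, "the", P))  # False: start symbol is a terminal
```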
English Grammar
● Each natural language has an underlying structure, usually referred to as its syntax.
● The fundamental idea of syntax is that words group together to form constituents:
groups of words, or phrases, that behave as a single unit.
● These constituents can combine to form bigger constituents and, eventually,
sentences.
Different Types of Grammar in NLP
● Set of Non-Terminals
● Set of Terminals
● Set of Productions
● Start Symbol
Syntactic categories and their common denotations in NLP: np - noun phrase, vp - verb
phrase, s - sentence, det - determiner (article), n - noun, tv - transitive verb (takes an
object), iv - intransitive verb, prep - preposition, pp - prepositional phrase, adj - adjective
Different Types of Grammar in NLP
Set of Non-terminals
The set of non-terminals is represented by V. Non-terminals are syntactic variables that
denote sets of strings, which help define the language generated by the grammar.
Set of Terminals
Terminals are also known as tokens, and their set is represented by Σ. They are the basic
symbols from which strings are formed.
Different Types of Grammar in NLP
Start Symbol
Production begins from the start symbol, which is represented by S. The start symbol is
always a non-terminal.
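How a sentence is produced from the start symbol can be sketched as a leftmost derivation: begin with the start symbol and repeatedly replace the leftmost non-terminal using one of its productions. The toy rules below use the category names listed earlier (s, np, vp, det, n, tv, iv); the words, the rules, and the `leftmost_derivation` helper are assumptions made for illustration.

```python
rules = {
    "s": [["np", "vp"]],
    "np": [["det", "n"]],
    "vp": [["tv", "np"], ["iv"]],
    "det": [["the"]],
    "n": [["dog"], ["cat"]],
    "tv": [["chased"]],
    "iv": [["slept"]],
}

def leftmost_derivation(rules, start, choices):
    """Yield each sentential form, always expanding the leftmost non-terminal.

    `choices` supplies an option index whenever a non-terminal has more
    than one production; the first option is used once it runs out.
    """
    form = [start]
    picks = iter(choices)
    yield list(form)
    while True:
        idx = next((i for i, sym in enumerate(form) if sym in rules), None)
        if idx is None:
            return  # only terminals remain: the derivation is complete
        options = rules[form[idx]]
        pick = next(picks, 0) if len(options) > 1 else 0
        form[idx:idx + 1] = options[pick]
        yield list(form)

steps = list(leftmost_derivation(rules, "s", choices=[0, 0, 1]))
for form in steps:
    print(" ".join(form))
# The last line printed is: the dog chased the cat
```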
PARTS OF SPEECH
Parts of Speech
● Every sentence you write or speak in English includes words that fall into some of the
nine parts of speech.
● These include
○ Nouns
○ Pronouns
○ Verbs
○ Adjectives
○ Adverbs
○ Prepositions
○ Conjunctions
○ Articles/determiners
○ Interjections
● The parts of speech are commonly divided into open classes (can be altered and added to
as language develops) and closed classes (pretty much set in stone).
● Also known as word classes, these are the building blocks of grammar.
Parts of Speech
Noun
Nouns name a person, place, thing, or idea. They are capitalized when they're the official
name of something or someone, called proper nouns in these cases. Examples: pirate,
Caribbean, ship, freedom, Captain Jack Sparrow.
Pronoun
Pronouns stand in for nouns in a sentence. Examples: I, you, he, she, it, ours, them, who,
which, anybody, ourselves.
Adjective
Adjectives describe nouns and pronouns. They specify which one, how much, what kind,
and more. Examples: hot, lazy, funny, unique, bright, beautiful, poor, smooth.
Parts of Speech
Verb
Verbs express the action or state of being in a sentence. Examples: sing, dance,
believes, seemed, finish, eat, drink, be, became.
Adverb
Adverbs describe verbs, adjectives, and even other adverbs. They specify when, where,
how, and why something happened and to what extent or how often. Examples: softly,
lazily, often, only, hopefully, sometimes.
Articles and Determiners
Articles and determiners specify and identify nouns, and there are indefinite and definite
articles. Examples: articles: a, an, the; determiners: these, that, those, enough, much, few,
which, what.
Parts of Speech
Preposition
Prepositions show spatial, temporal, and role relations between a noun or pronoun and the
other words in a sentence. They come at the start of a prepositional phrase, which contains
a preposition and its object. Examples: up, over, against, by, for, into, close to, out of, apart
from.
Conjunction
Conjunctions join words, phrases, and clauses in a sentence. There are coordinating,
subordinating, and correlative conjunctions. Examples: and, but, or, so, yet.
Interjection
Interjections are expressions that can stand on their own or be contained within sentences.
These words and phrases often carry strong emotions and convey reactions. Examples: ah,
whoops, ouch, yabba dabba do!
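A very rough way to assign parts of speech is a dictionary lookup. The sketch below builds a tiny lexicon from a few of the example words above and tags each word of a sentence; the lexicon, the `tag` helper, and the test sentence are illustrative assumptions. Real taggers must also handle unknown words and words that belong to more than one class.

```python
# Tiny hand-built lexicon using example words from the sections above.
LEXICON = {
    "pirate": "noun",
    "ship": "noun",
    "she": "pronoun",
    "lazy": "adjective",
    "sing": "verb",
    "softly": "adverb",
    "the": "article/determiner",
    "over": "preposition",
    "and": "conjunction",
    "ouch": "interjection",
}

def tag(sentence):
    """Pair each whitespace-separated word with its part of speech."""
    return [(word, LEXICON.get(word.lower(), "unknown"))
            for word in sentence.split()]

tagged = tag("She and the pirate sing softly")
for word, pos in tagged:
    print(word, pos)
```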
NATURAL LANGUAGE PROCESSING
LECTURE 8:
PHRASE STRUCTURE
Phrase Structures
● Languages have constraints on word order.
● Words are organized into phrases, groupings of words that are clumped as a unit.
● Syntax is the study of the regularities and constraints of word order and phrase
structure.
● Constituents are groupings of words that can be detected by their ability to occur
in various positions and by showing uniform syntactic possibilities for expansion.
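Phrase structure is often drawn as a tree whose internal nodes are constituents. The sketch below encodes one such tree as nested tuples of the form (label, children...) and lists the word span covered by each constituent; the sentence, the labels, and the helper functions are assumptions made for illustration.

```python
# Phrase structure tree for "the dog chased the cat".
tree = ("s",
        ("np", ("det", "the"), ("n", "dog")),
        ("vp", ("tv", "chased"),
               ("np", ("det", "the"), ("n", "cat"))))

def words(node):
    """Return the words covered by a node, in left-to-right order."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]  # a category directly above a single word
    return [w for child in children for w in words(child)]

def constituents(node):
    """List (label, phrase) pairs for the node and everything below it."""
    label, *children = node
    found = [(label, " ".join(words(node)))]
    for child in children:
        if not isinstance(child, str):
            found.extend(constituents(child))
    return found

for label, phrase in constituents(tree):
    print(label, "->", phrase)
```

Both "the dog" and "the cat" appear as np constituents, which reflects that a noun phrase behaves as a unit and can occur in more than one position in the sentence.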