NLP Module 2

The document covers foundational concepts in Natural Language Processing, focusing on mathematical and linguistic preliminaries such as probability theory, statistics, and grammar. Key topics include conditional probability, Bayes' theorem, random variables, and different types of grammar relevant to NLP, including context-free grammar. The document emphasizes the importance of these concepts in analyzing and understanding language structures and statistical data.

NATURAL LANGUAGE PROCESSING

LECTURE 3:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
PROBABILITY THEORY
Probability Theory
● Deals with predicting how likely it is that something will happen.
● Experiment (or trial) - the process by which an observation is made.
● Sample space Ω - the collection of all possible basic outcomes.
○ Discrete - having at most a countably infinite number of basic outcomes.
○ Continuous - having an uncountable number of basic outcomes.
● ∅ (the empty set) represents the impossible event.
● An event is a subset of Ω; each experimental outcome is a basic outcome in Ω.
Probability Theory
● Probabilities are numbers between 0 and 1, 0 indicates impossibility and 1 certainty.
● A probability function or distribution distributes a probability mass of 1 throughout
the sample space Ω.
● Formally, a discrete probability function is any function P: Ƒ → [0, 1] such that:
○ P(Ω) = 1
○ Countable additivity: P(A₁ ∪ A₂ ∪ …) = P(A₁) + P(A₂) + … for pairwise disjoint events A₁, A₂, … ∈ Ƒ
● A well-founded probability space consists of a sample space Ω, a σ-field of events Ƒ (a set of subsets of Ω that is closed under complement and countable union and has Ω as its maximal element), and a probability function P.
● Example: A fair coin is tossed 3 times. What is the chance of 2 heads?
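Working this out from the definitions above:
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, so there are 2³ = 8 equally likely basic outcomes.
The event "exactly 2 heads" is {HHT, HTH, THH}, so P(2 heads) = 3/8.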
Conditional Probability and Independence
● Conditional probability - partial knowledge about the outcome of an experiment that
influences what experimental outcomes are possible.
○ Updated probability of an event given some knowledge.
● The probability of an event before additional knowledge is considered is called the prior
probability of the event.
● The new probability that results from using the additional knowledge is referred to as
the posterior probability of the event.
P(A|B) = P(A ∩ B) / P(B)
● The multiplication rule is given by
P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A)
● Two events A, B are independent of each other if P(A ∩ B) = P(A) P(B)
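A small worked example (not from the slides): roll a fair die once, with A = "the number is even" and B = "the number is at most 4".
P(A) = 3/6 = 1/2, P(B) = 4/6 = 2/3, and P(A ∩ B) = P({2, 4}) = 2/6 = 1/3.
Since P(A) P(B) = 1/2 · 2/3 = 1/3 = P(A ∩ B), the events are independent.
Equivalently, P(A|B) = P(A ∩ B) / P(B) = (1/3) / (2/3) = 1/2 = P(A), as expected for independent events.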
Bayes' Theorem
● The general statement of Bayes' theorem is: "The conditional probability of an event A, given the occurrence of another event B, is equal to the probability of B given A times the probability of A, divided by the probability of B," i.e.
P(A|B) = P(B|A) P(A) / P(B)
where P(A) and P(B) are the probabilities of events A and B.
Bayes' Theorem and Conditional Probability

● Bayes' theorem lets us swap the order of dependence between events:
P(B|A) = P(A|B) P(B) / P(A)

● This is useful when one of the conditional quantities is difficult to determine directly.
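A minimal Python sketch of Bayes' rule (the event names and numbers below are illustrative assumptions, not taken from the slides):

    # Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)
    def bayes(p_b_given_a, p_a, p_b):
        return p_b_given_a * p_a / p_b

    # Assumed example: A = "email is spam", B = "email contains the word 'offer'"
    p_spam = 0.2                        # P(A), assumed prior
    p_offer_given_spam = 0.5            # P(B|A), assumed
    p_offer = 0.5 * 0.2 + 0.05 * 0.8    # P(B) by total probability = 0.14
    print(bayes(p_offer_given_spam, p_spam, p_offer))  # posterior P(A|B) ≈ 0.714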
Random Variable
● A random variable is a numerical description of the outcome of an experiment.
● Mathematically, it is simply a function X: Ω → R^n, where R is the set of real numbers.
● A stochastic process is a probabilistic (randomly generated) process; the term is used when referring to a sequence of results assumed to be generated by some underlying probability distribution.
● A discrete random variable is a function X: Ω → S, where S is a countable subset of R.
Statistics
● Statistics is the study of data collection, analysis, interpretation, presentation, and organization.
● Statistics is mainly divided into the following two categories:
○ Descriptive statistics
○ Inferential statistics
● In descriptive statistics, the data is described in a summarized way, whereas inferential statistics uses that data to draw conclusions that go beyond the description.
Statistics
Descriptive Statistics
● In descriptive statistics, the data is described in a summarized way.
● The summarization is done from a sample of the population using parameters such as the mean or standard deviation.
● Descriptive statistics is a way of using charts, graphs, and summary measures to organize, represent, and explain a set of data.
○ Data is typically arranged and displayed in tables or graphs such as histograms, pie charts, bar charts, or scatter plots.
○ Descriptive statistics is just descriptive; it does not involve generalizing beyond the data collected.
Statistics
Inferential Statistics
● In inferential statistics, we try to interpret the meaning of descriptive statistics.
● After the data has been collected, analyzed, and summarized, we use inferential statistics to describe the meaning of the collected data.
● Inferential statistics uses probability principles to assess whether trends observed in the research sample can be generalized to the larger population from which the sample was drawn.
● Inferential statistics is intended to test hypotheses and investigate relationships between variables, and can be used to make predictions about the population.
● Inferential statistics is used to draw conclusions and inferences, i.e., to make valid generalizations from samples.
Statistics
Mean: The mean is the arithmetic average of a data set, found by adding the numbers in the set and dividing by the number of observations.

Median: The median is the middle number in the data set when the values are listed in ascending or descending order.

Mode: The mode is the value that occurs most often in a data set.

For n observations x₁, x₂, …, xₙ, the mean is
x̄ = (x₁ + x₂ + … + xₙ) / n
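A quick check of these definitions with Python's standard library (the sample data is made up for illustration):

    import statistics

    data = [2, 3, 3, 5, 7, 10]
    print(statistics.mean(data))    # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
    print(statistics.median(data))  # middle of the sorted data: (3 + 5) / 2 = 4.0
    print(statistics.mode(data))    # most frequent value: 3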


NATURAL LANGUAGE PROCESSING
LECTURE 4:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
Counting
● Being able to identify and count the experimental outcomes is a necessary step in
assigning probabilities.
○ Multi-step experiments
○ Combinations
○ Permutations
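For reference, the standard counting rules behind these three bullets, together with the Python 3.8+ built-ins that compute them:

● Multi-step experiment: if step 1 has n₁ possible outcomes, step 2 has n₂, and so on, then a k-step experiment has n₁ · n₂ · … · n_k possible outcomes.
● Combinations: the number of ways to choose r objects from n, ignoring order, is C(n, r) = n! / (r! (n − r)!).
● Permutations: the number of ordered arrangements of r objects chosen from n is P(n, r) = n! / (n − r)!.

    import math
    print(math.comb(5, 2))  # C(5, 2) = 10
    print(math.perm(5, 2))  # P(5, 2) = 20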
Frequency
● Probability mass function (pmf): gives the probability that the random variable takes each of its possible numeric values.
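For example, for the earlier experiment of tossing a fair coin 3 times, with X = the number of heads, the pmf is:
p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8, and these values sum to 1.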
Mean and Variance
● The expectation E[X] is the mean or average of a random variable:
E[X] = Σ x · p(x), summed over all values x that the random variable can take.

● The variance of a random variable measures whether its values tend to be consistent over trials or to vary a lot:
Var(X) = E[(X − E[X])²] = E[X²] − (E[X])²
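Continuing the same example (X = number of heads in 3 tosses of a fair coin):
E[X] = 0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 1.5
E[X²] = 0·(1/8) + 1·(3/8) + 4·(3/8) + 9·(1/8) = 3
Var(X) = E[X²] − (E[X])² = 3 − 2.25 = 0.75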
PROBABILITY DISTRIBUTION
Random Variables
● A random variable is a numerical description of the outcome of an experiment.
● It is classified into two types:
○ Discrete Random Variable : A random variable that may assume either a finite
number of values or an infinite sequence of values such as 0, 1, 2, . . .
○ Continuous Random Variable: A random variable that may assume any
numerical value in an interval or collection of intervals
Discrete Probability Distribution
● The probability distribution for a random variable describes how probabilities are
distributed over the values of the random variable.
● For a discrete random variable x, a probability function, denoted by f (x), provides the
probability for each value of the random variable.
● The classical, subjective, and relative frequency methods of assigning probabilities
can be used to develop discrete probability distributions.
● A primary advantage of defining a random variable and its probability distribution is
that once the probability distribution is known, it is relatively easy to determine the
probability of a variety of events that may be of interest to a decision maker.
Binomial Probability Distribution
● A binomial experiment exhibits the following four properties:
○ The experiment consists of a sequence of n identical trials.
○ Two outcomes, success and failure, are possible on each trial.
○ The probability of success, denoted p, does not change from trial to trial.
○ The trials are independent.
Binomial Probability Distribution
● In a binomial experiment, our interest is in the number of successes occurring in the n
trials.
● If we let x denote the number of successes occurring in the n trials, we see that x can
assume the values of 0, 1, 2, 3, . . . , n.
● Because the number of values is finite, x is a discrete random variable.
● The probability distribution associated with this random variable is called the
binomial probability distribution.
Binomial Probability Distribution
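For reference, the binomial probability function and its moments are:
f(x) = C(n, x) · p^x · (1 − p)^(n − x), for x = 0, 1, …, n
where C(n, x) = n! / (x! (n − x)!), p is the probability of success on a single trial, and
E(x) = n·p, Var(x) = n·p·(1 − p).

A quick check in Python against the earlier 3-coin-toss example (n = 3, p = 0.5):

    from math import comb
    n, p = 3, 0.5
    print([comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)])
    # [0.125, 0.375, 0.375, 0.125] - i.e. 1/8, 3/8, 3/8, 1/8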
Poisson Probability Distribution
● Useful in estimating the number of occurrences over a specified interval of time or
space.
● If the following two properties are satisfied, then the no. of occurrences is a random
variable described by the Poisson probability distribution
○ The probability of an occurrence is the same for any two intervals of equal
length.
○ The occurrence or non-occurrence in any interval is independent of the
occurrence or non-occurrence in any other interval
Poisson Probability Distribution

● x is a discrete random variable indicating the no. of occurrences in the interval.
● The Poisson probability function is f(x) = (μ^x · e^(−μ)) / x!, for x = 0, 1, 2, …, where μ is the expected number of occurrences in the interval and e ≈ 2.71828.
● A property of the Poisson distribution is that the mean of the distribution and the
variance of the distribution are equal.
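An illustrative example (the numbers are assumed, not from the slides): if occurrences happen at a mean rate of μ = 3 per hour, the probability of exactly 2 occurrences in an hour is
f(2) = (3² · e⁻³) / 2! = (9 × 0.0498) / 2 ≈ 0.224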
Binomial vs Poisson Probability Distribution
NATURAL LANGUAGE PROCESSING
LECTURE 5:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
Continuous Probability Distribution
● The probability function for a continuous random variable is known as the probability density function (PDF).
● The PDF does not directly provide probabilities.
● However, the area under the graph of f(x) corresponding to a given interval does provide the probability that the continuous random variable x assumes a value in that interval.
● Because the area under the graph of f(x) at any particular point is zero, one of the
implications of the definition of probability for continuous random variables is that
the probability of any particular value of the random variable is zero.
● For a continuous random variable, we consider probability only in terms of the
likelihood that a random variable assumes a value within a specified interval.
● The total area under the graph of f(x) is equal to 1.
Uniform Probability Distribution
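For reference, the uniform probability density function over an interval [a, b] is:
f(x) = 1 / (b − a) for a ≤ x ≤ b, and f(x) = 0 elsewhere
E(x) = (a + b) / 2, Var(x) = (b − a)² / 12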
Normal Probability Distribution

● A random variable that has a normal distribution with a mean of zero and a standard
deviation of one is said to have a standard normal probability distribution.
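For reference, the normal probability density function with mean μ and standard deviation σ is:
f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²))
Any normal random variable x can be converted to the standard normal variable z by z = (x − μ) / σ.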
Uniform vs Normal Probability Distribution
ENGLISH GRAMMAR
English Grammar
● Grammar is defined as the rules for forming well-structured sentences.
● Grammar denotes syntactical rules that are used for conversation in natural languages.
● A grammar G can be written as a 4-tuple (N, T, S, P) where,
○ N or V_N = set of non-terminal symbols, or variables.
○ T or ∑ = set of terminal symbols.
○ S = Start symbol where S ∈ N
○ P = Production rules for terminals as well as non-terminals.
○ Each rule has the form α → β, where α and β are strings over V_N ∪ ∑ and at least one symbol of α belongs to V_N.
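As a small illustration of this 4-tuple (a toy grammar assumed for this example, not taken from the slides):
N = {S, NP, VP, Det, Noun, Verb}
T = {the, dog, cat, chased}
S = S (the start symbol)
P = { S → NP VP, NP → Det Noun, VP → Verb NP, Det → the, Noun → dog | cat, Verb → chased }
This grammar generates sentences such as "the dog chased the cat".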
English Grammar
● Each natural language has an underlying structure, usually referred to as syntax.
● The fundamental idea of syntax is that words group together to form constituents - groups of words or phrases that behave as a single unit.
● These constituents can combine to form bigger constituents and, eventually, sentences.

● Syntax describes the regularity and productivity of a language, making explicit the structure of sentences. The goal of syntactic analysis, or parsing, is to determine whether a sentence is well formed and to provide its syntactic structure.
English Grammar
Syntax also refers to the way words are arranged together. Some basic ideas related to
syntax:
● Constituency: Groups of words may behave as a single unit or phrase - A constituent,
for example, like a Noun phrase.
● Grammatical relations: These are the formalization of ideas from traditional grammar.
Examples include - subjects and objects.
● Subcategorization and dependency relations: These are the relations between words
and phrases, for example, a Verb followed by an infinitive verb.
● Regular languages and parts of speech: These refer to the way words are arranged together, but they cannot easily capture notions such as constituency, grammatical relations, or subcategorization and dependency relations.
NATURAL LANGUAGE PROCESSING
LECTURE 6:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
Different Types of Grammar in NLP

Types of Grammar in NLP:
● Context-Free Grammar (CFG)
● Constituency Grammar (CG)
● Dependency Grammar (DG)

CONTEXT FREE GRAMMAR (CFG)
CFG consists of a set of rules expressing how the symbols of the language can be grouped and ordered together, and a lexicon of words and symbols. A CFG is a finite set of grammar rules with the following four components:

● Set of Non-Terminals
● Set of Terminals
● Set of Productions
● Start Symbol

Syntactic categories and their common denotations in NLP: np - noun phrase, vp - verb
phrase, s - sentence, det - determiner (article), n - noun, tv - transitive verb (takes an
object), iv - intransitive verb, prep - preposition, pp - prepositional phrase, adj - adjective
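A minimal sketch of such a grammar in Python, assuming the NLTK library is installed (the tiny lexicon and rules are made up for illustration):

    import nltk

    # A toy CFG using the categories listed above (S, NP, VP, Det, N, TV, Prep, PP)
    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N | NP PP
        VP -> TV NP | VP PP
        PP -> Prep NP
        Det -> 'the' | 'a'
        N -> 'dog' | 'park'
        TV -> 'saw'
        Prep -> 'in'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("the dog saw a dog in the park".split()):
        print(tree)  # prints each phrase-structure tree for the sentence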

Set of Non-Terminals
It is represented by V. The non-terminals are syntactic variables that denote sets of strings and help define the language generated by the grammar.

Set of Terminals
Terminals are also known as tokens and are represented by Σ. Strings are formed from these basic symbols.
Set of Productions
It is represented by P. The production rules specify how terminals and non-terminals can be combined. Every production consists of:
● a non-terminal on the left-hand side,
● an arrow, and
● a sequence of terminals and/or non-terminals on the right-hand side.

Start Symbol
Derivations begin from the start symbol, represented by S. The start symbol is always a non-terminal.
Context Free Grammar (CFG)
Issues with using context-free grammar in NLP:

● Limited expressiveness: CFG is a limited formalism that cannot capture certain linguistic phenomena such as idiomatic expressions, coordination, ellipsis, and long-distance dependencies.
● Handling idiomatic expressions: CFG has a hard time handling idioms - phrases whose meaning cannot be inferred from the meanings of the individual words that make up the phrase.
● Handling coordination: CFG struggles to handle coordination, the linking of phrases or clauses with a conjunction.
● Handling ellipsis: CFG struggles to handle ellipsis, the omission of one or more words from a sentence that is recoverable from the context.
CONSTITUENCY GRAMMAR (CG)
● It is also known as Phrase structure grammar.
● It is called constituency Grammar as it is based on the constituency relation.
● Constituency grammar can organize any sentence into its three constituents - a subject, a context, and an object.
● Some fundamental points about constituency grammar and the constituency relation:
○ All related frameworks view sentence structure in terms of the constituency relation.
○ The constituency relation is derived from the subject-predicate division of Latin and Greek grammar.
○ Clause structure is studied in terms of a noun phrase (NP) and a verb phrase (VP).
● Advantages
○ Constituency grammar is not language-specific, making it easy to use the same model for
multiple languages or switch between languages, hence handling the multilingual issue
plaguing the other two types of grammar.
○ Since constituency grammar uses a parse tree to represent the hierarchical relationship
between the constituents of a sentence, it can be easily understood by humans and is more
intuitive than other representation grammars.
○ Constituency grammar is also simpler and easier to implement than other formalisms, such as dependency grammar, making it more accessible to researchers and practitioners.
○ Constituency grammar is robust to errors and can handle noisy or incomplete data.
○ Constituency grammar is also better equipped to handle coordination which is the linking
of phrases or clauses with a conjunction.
DEPENDENCY GRAMMAR (DG)
● Dependency grammar states that the words of a sentence depend on other words of the sentence. These words are connected by directed links, and the verb is considered the center of the clause structure.
● Dependency Grammar organizes the words of a sentence according to their dependencies.
Every other syntactic unit is connected to the verb in terms of a directed link. These
syntactic units are called dependencies.
○ One of the words in a sentence behaves as a root, and all the other words except that
word itself are linked directly or indirectly with the root using their dependencies.
○ These dependencies represent relationships among the words in a sentence, and
dependency grammar is used to infer the structure and semantic dependencies
between the words.
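A minimal sketch of extracting such dependencies in Python, assuming the spaCy library and its small English model (en_core_web_sm) are installed:

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("The dog chased the cat")

    # Each word is linked to its head; the main verb is the root of the clause.
    for token in doc:
        print(token.text, token.dep_, token.head.text)
    # e.g. "dog nsubj chased", "cat dobj chased", "chased ROOT chased"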
● Dependency grammar is the opposite of constituency grammar in that it is based on the dependency relation rather than the constituency relation: it lacks phrasal nodes.
Limitations:
● Ambiguity: Dependency grammar has issues with ambiguity when it comes to
interpreting the grammatical relationships between words, which are particularly
challenging when dealing with languages that have rich inflections or complex word
order variations.
● Data annotation: Dependency parsing also requires labeled data to train the model, which
is time-consuming and difficult to obtain.
● Handling long-distance dependencies: Dependency parsing also has issues handling long-distance dependencies, where related words in a sentence may be very far apart, making it difficult to accurately capture the grammatical structure of the sentence.
● Handling ellipsis and coordination: Dependency grammar also has a hard time handling
phenomena that are not captured by the direct relationships between words, such as
ellipsis and coordination, which are typically captured by constituency grammar.
NATURAL LANGUAGE PROCESSING
LECTURE 7:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
PARTS OF SPEECH
Parts of Speech
● Every sentence you write or speak in English includes words that fall into some of the nine parts of speech.
● These include:
○ Nouns
○ Pronouns
○ Verbs
○ Adjectives
○ Adverbs
○ Prepositions
○ Conjunctions
○ Articles/determiners
○ Interjections
● The parts of speech are commonly divided into open classes (which can be altered and added to as the language develops) and closed classes (which are pretty much set in stone).
● Also known as word classes, these are the building blocks of grammar.
Parts of Speech
Noun
Nouns name a person, place, thing, or idea. They are capitalized when they are the official name of something or someone; these are called proper nouns. Examples: pirate, Caribbean, ship, freedom, Captain Jack Sparrow.
Pronoun
Pronouns stand in for nouns in a sentence. Examples: I, you, he, she, it, ours, them, who,
which, anybody, ourselves.
Adjective
Adjectives describe nouns and pronouns. They specify which one, how much, what kind,
and more. Examples: hot, lazy, funny, unique, bright, beautiful, poor, smooth.
Parts of Speech
Verb
Verbs are action words that tell what happens in a sentence. Examples: sing, dance,
believes, seemed, finish, eat, drink, be, became
Adverb
Adverbs describe verbs, adjectives, and even other adverbs. They specify when, where, how, and why something happened, and to what extent or how often. Examples: softly, lazily, often, only, hopefully, sometimes.
Articles and Determiners
Articles and determiners specify and identify nouns, and there are indefinite and definite
articles. Examples: articles: a, an, the; determiners: these, that, those, enough, much, few,
which, what.
Parts of Speech
Preposition
Prepositions show spatial, temporal, and role relations between a noun or pronoun and the other words in a sentence. They come at the start of a prepositional phrase, which contains a preposition and its object. Examples: up, over, against, by, for, into, close to, out of, apart from.
Conjunction
Conjunctions join words, phrases, and clauses in a sentence. There are coordinating, subordinating, and correlative conjunctions. Examples: and, but, or, so, yet.
Interjection
Interjections are expressions that can stand on their own or be contained within sentences.
These words and phrases often carry strong emotions and convey reactions. Examples: ah,
whoops, ouch, yabba dabba do!
NATURAL LANGUAGE PROCESSING
LECTURE 8:

MODULE 2:
MATHEMATICAL AND LINGUISTIC
PRELIMINARIES
CONTENTS
● Probability Theory
● Conditional Probability and Independence
● Bayes Rule
● Random Variables
● Probability Distributions
● Statistics
● Counting
● Frequency
● Mean and Variance
● English Grammar
● Parts of Speech
● Phrase Structures
PHRASE STRUCTURE
Phrase Structures
● Languages have constraints on word order.
● Words are organized into phrases, groupings of words that are clumped as a unit.
● Syntax is the study of the regularities and constraints of word order and phrase
structure.
● Constituents are certain groupings of words that can be detected by their ability to occur in various positions and by showing uniform syntactic possibilities for expansion.
Phrase Structures
● A whole sentence is given the category S.
● A sentence normally rewrites as a subject noun phrase and a verb phrase.
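Typical rewrite rules of this kind (a standard illustration, not an exhaustive grammar; parentheses mark optional constituents):
S → NP VP
NP → (Det) (AdjP) N (PP)
VP → V (NP) (PP)
PP → P NP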
Phrase Structures
Some of the major phrase types are:
● Noun Phrase
○ A syntactic unit of the sentence in which information about the noun is gathered.
○ Consists of an optional determiner, zero or more adjective phrases, a noun head, and some post-modifiers such as prepositional phrases or clausal modifiers, with the constituents appearing in that order.
○ Clausal modifiers of nouns are referred to as relative clauses.
○ Eg: The homeless old man in the park that I tried to help yesterday
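The example NP above breaks down, in the stated order, as:
[Det The] [AdjP homeless] [AdjP old] [N man] [PP in the park] [relative clause that I tried to help yesterday]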
Phrase Structures
Some of the major phrase types are:
● Prepositional Phrase
○ Headed by a preposition and contains a noun phrase complement.
○ Can appear within all the other major phrase types.
Phrase Structures
Some of the major phrase types are:
● Verb Phrase
○ The verb is the head of the verb phrase
○ Organizes all elements of the sentence that depend syntactically on the verb
○ Eg: a. Getting to school on time was a struggle
b. He was trying to keep his temper
c. That woman quickly showed me the way to hide
Phrase Structures
Some of the major phrase types are:
● Adjective Phrase
○ Complex adjective phrases are less common
○ Eg: She is very sure of herself; He seemed a man who was quite certain to succeed.
