NLP 1.2
Dr Prabhjot Kaur
E16646
Assistant Professor
CSE(AIT), CU
NATURAL LANGUAGE PROCESSING : Course Objectives
The objectives of this course are:
• To understand the foundational concepts of speech and language processing,
including ambiguity and computational models.
• To explore the role of algorithms and automata in morphological parsing and
linguistic analysis.
• To familiarize students with language modelling techniques like n-grams and
smoothing, and their application in speech recognition.
• To analyze the structure of language through parsing, feature structures, and
probabilistic grammars.
• To introduce semantic representation techniques for understanding meaning in
natural language.
• To equip students with the skills to implement NLP systems using tools and
techniques like tagging, parsing, and unification.
COURSE OUTCOMES
On completion of this course, the students shall be able to:
Table of Contents
• N-Gram
• Bi-Gram
• Maximum Likelihood Estimation
• Smoothing
• Entropy
Words
Bi-Gram
● An N-gram is a sequence of N tokens (or words).
Example:
For the sentence “The cow jumps over the moon”, if N=2 (known as
bigrams), then the n-grams would be:
• the cow
• cow jumps
• jumps over
• over the
• the moon
So you have 5 n-grams in this case. Notice that we moved from the->cow
to cow->jumps to jumps->over, etc., essentially moving one word forward
to generate the next bigram.
If N=3 (known as trigrams), the n-grams for the same sentence would be:
• the cow jumps
• cow jumps over
• jumps over the
• over the moon
So you have 4 n-grams in this case.
If X = number of words in a given sentence K, the number of n-grams for sentence K would be:
Number of n-grams in K = X − (N − 1)
For the 6-word sentence above, this gives 6 − (2 − 1) = 5 bigrams and 6 − (3 − 1) = 4 trigrams.
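The sliding-window idea above is easy to express in code. Below is a minimal Python sketch (not part of the original slides; the function name ngrams is our own) that generates the bigrams and trigrams of the example sentence and confirms the X − (N − 1) count.

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) for a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "the cow jumps over the moon".split()

bigrams = ngrams(sentence, 2)   # 6 - (2 - 1) = 5 bigrams
trigrams = ngrams(sentence, 3)  # 6 - (3 - 1) = 4 trigrams

print(bigrams)                       # [('the', 'cow'), ('cow', 'jumps'), ('jumps', 'over'), ('over', 'the'), ('the', 'moon')]
print(len(bigrams), len(trigrams))   # 5 4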
How do N-gram language models work?
An N-gram language model predicts the probability of a given N-gram within any
sequence of words in the language. If we have a good N-gram model, we can
predict p(w | h), the probability of seeing the word w given a history h of
previous words, where the history contains n-1 words.
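As an illustration (not taken from the slides, and ignoring sentence-boundary markers), a bigram model (N=2) approximates the probability of the earlier example sentence as a product of one-word-history conditionals:
P(the cow jumps over the moon) ≈ P(the) · P(cow | the) · P(jumps | cow) · P(over | jumps) · P(the | over) · P(moon | the)
Here each word is conditioned only on the n − 1 = 1 word before it.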
• We must estimate this probability to construct an N-gram model.
Maximum Likelihood Estimation
● P_MLE(w1, …, wn) = C(w1, …, wn) / N, where C(w1, …, wn) is the frequency
of the n-gram w1, …, wn in the training data and N is the total number of training n-grams.
● P_MLE(wn | w1, …, wn−1) = C(w1, …, wn) / C(w1, …, wn−1)
● This estimate is called the Maximum Likelihood Estimate (MLE) because it
is the choice of parameters that gives the highest probability to the
training corpus.
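A minimal Python sketch of the bigram case of the MLE formula, P_MLE(wn | wn−1) = C(wn−1, wn) / C(wn−1); the toy corpus and names used here are illustrative, not from the slides.

from collections import Counter

corpus = "the cow jumps over the moon the cow sleeps".split()

unigram_counts = Counter(corpus)                  # C(w)
bigram_counts = Counter(zip(corpus, corpus[1:]))  # C(w_prev, w)

def p_mle(word, prev):
    """MLE estimate of P(word | prev) from raw relative frequencies."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(p_mle("cow", "the"))     # C(the, cow) / C(the) = 2/3
print(p_mle("moon", "the"))    # 1/3
print(p_mle("jumps", "moon"))  # 0.0 -- an unseen bigram gets zero probability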
Smoothing
● What do we do with words that are in our vocabulary (they are not
unknown words) but appear in a test set in an unseen context (for
example they appear after a word they never appeared after in
training)?
● To keep a language model from assigning zero probability to these
unseen events, we’ll have to shave off a bit of probability mass from
some more frequent events and give it to the events we’ve never seen.
● This modification is called smoothing or discounting.
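One simple form of smoothing is add-one (Laplace) smoothing. The sketch below (illustrative, reusing the toy corpus from the MLE sketch) adds 1 to every bigram count and adds V, the vocabulary size, to the denominator, so unseen bigrams no longer get zero probability.

from collections import Counter

corpus = "the cow jumps over the moon the cow sleeps".split()
V = len(set(corpus))  # vocabulary size

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def p_laplace(word, prev):
    """Add-one smoothed estimate: (C(prev, word) + 1) / (C(prev) + V)."""
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + V)

print(p_laplace("jumps", "moon"))  # (0 + 1) / (1 + 6) -- no longer zero
print(p_laplace("cow", "the"))     # (2 + 1) / (3 + 6) -- seen bigrams are discounted slightly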
Entropy
● Entropy is a measure of information.
● Given a random variable X ranging over whatever we are predicting
(words, letters, parts of speech, the set of which we’ll call χ) and with a
particular probability function, call it p(x), the entropy of the random
variable X is:
H(X) = − Σ x∈χ p(x) log2 p(x)
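A minimal Python sketch of this definition, applied to two illustrative word distributions (the distributions themselves are made up for the example).

import math

def entropy(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

uniform = {"the": 0.25, "cow": 0.25, "jumps": 0.25, "moon": 0.25}
skewed  = {"the": 0.70, "cow": 0.10, "jumps": 0.10, "moon": 0.10}

print(entropy(uniform))  # 2.0 bits: maximum uncertainty over four equally likely words
print(entropy(skewed))   # ~1.36 bits: a peaked distribution carries less uncertainty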
TEXTBOOKS
T1: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics,
and Speech Recognition by Daniel Jurafsky and James H. Martin
T2: Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper
REFERENCE BOOKS:
R1: Handbook of Natural Language Processing, Second Edition by Nitin Indurkhya and Fred J. Damerau
Course Link:
https://in.coursera.org/specializations/natural-language-processing
Video Link:
https://youtu.be/YVQcE5tV26s
Web Link:
https://www.tutorialspoint.com/natural_language_processing/natural_language_processing_tutorial.pdf
THANK YOU
For queries
Email:
[email protected]