MODULE 3
Syntax Analysis
Part-of-Speech (POS) Tagging
• The process of assigning one of the parts of speech to a given word.
• Parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions, and
their sub-categories.
• Annotate each word in a sentence with a part-of-speech marker.
• Lowest level of syntactic analysis.
• Useful for subsequent syntactic parsing and word sense disambiguation.
Word Sense Disambiguation (WSD) is the task of determining the correct
meaning of a word in a given context, as words can have multiple senses
What are POS Tags?
• Original Brown corpus used a large set of 87 POS tags.
• Most common in NLP today is the Penn Treebank set of 45 tags.
Closed vs. Open Class
• Closed class categories are composed of a small, fixed set of
grammatical function words for a given language.
• Pronouns, Prepositions, Modals, Determiners, Particles, Conjunctions
• Open class categories have a large number of words, and new ones are
easily invented.
• Nouns (Googler, textlish), Verbs (Google), Adjectives (geeky), Adverbs
(automagically)
Ambiguity in POS Tagging
• “Like” can be a verb or a preposition
• I like/VBP candy.
• Time flies like/IN an arrow.
• “Around” can be a preposition, particle, or adverb
• I bought it at the shop around/IN the corner.
• I never got around/RP to getting a car.
• A new Prius costs around/RB $25K.
Other Issues
• New, rare, or misspelled words that are not in the training data can’t be
tagged reliably. Example: Slang, names, or newly coined words like
“google” as a verb.
• A word’s POS tag often depends on surrounding words. Taggers must
analyze syntactic context, not just the word in isolation.
Example:
• “Book a ticket” → book is a verb
• “Read a book” → book is a noun
POS Tagging Process
• Usually assume a separate initial tokenization process that separates
and/or disambiguates punctuation, including detecting sentence
boundaries.
• Degree of ambiguity in English (based on Brown corpus)
• 11.5% of word types are ambiguous (11.5% of distinct words can have more than one
POS)
• 40% of word tokens are ambiguous (for 40% of words in running text, the POS depends
on context)
• Baseline: Picking the most frequent tag for each specific word type
gives about 90% accuracy.
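As a rough sketch of this baseline (the tiny tagged word list and its counts below are invented for illustration, not drawn from the Brown corpus):

```python
from collections import Counter, defaultdict

# Toy tagged corpus: (word, tag) pairs. The data here is illustrative only.
training_data = [
    ("the", "DT"), ("book", "NN"), ("is", "VBZ"), ("good", "JJ"),
    ("book", "VB"), ("the", "DT"), ("flight", "NN"),
    ("I", "PRP"), ("like", "VBP"), ("the", "DT"), ("book", "NN"),
]

# Count how often each word appears with each tag.
tag_counts = defaultdict(Counter)
for word, tag in training_data:
    tag_counts[word.lower()][tag] += 1

def most_frequent_tag(word, default="NN"):
    """Baseline tagger: pick the tag seen most often with this word."""
    counts = tag_counts.get(word.lower())
    return counts.most_common(1)[0][0] if counts else default

print([(w, most_frequent_tag(w)) for w in "I like the book".split()])
# -> [('I', 'PRP'), ('like', 'VBP'), ('the', 'DT'), ('book', 'NN')]
```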
Approaches to Part-of-Speech Tagging
• Rule-based POS tagging: relies on manually crafted linguistic rules and
dictionaries to assign POS tags to words based on their morphological
features and contextual patterns
• Rule-based taggers often employ a combination of lexical rules (based on word forms)
and contextual rules (based on the surrounding words) to disambiguate POS tags.
• Developing comprehensive rule sets for POS tagging can be time-consuming and
requires linguistic expertise (a disadvantage), but rule-based taggers can achieve high
accuracy for languages with well-defined grammatical structures (an advantage).
• Rule-based taggers employ hand-written rules to choose the correct tag when
a word has more than one possible tag.
• The first stage involves assigning each word a list of potential parts of speech using a
lexicon.
• In the second stage, the tagger narrows the list down to a single part of speech for each
word using large lists of hand-written disambiguation rules.
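A minimal sketch of this two-stage process (the small lexicon and the two contextual rules below are invented for illustration; real rule-based taggers use far larger hand-written rule sets):

```python
# Stage 1: a lexicon maps each word to its list of possible tags.
LEXICON = {
    "the": ["DT"],
    "book": ["NN", "VB"],
    "flight": ["NN"],
    "can": ["MD", "NN", "VB"],
}

def rule_based_tag(words):
    # Stage 1: assign every word its candidate tag list from the lexicon.
    candidates = [LEXICON.get(w.lower(), ["NN"]) for w in words]

    # Stage 2: hand-written contextual rules pick one tag per word.
    tags = []
    for i, cands in enumerate(candidates):
        tag = cands[0]                      # default: first lexicon entry
        prev = tags[i - 1] if i > 0 else None
        # Rule: after a determiner, prefer a noun reading if available.
        if prev == "DT" and "NN" in cands:
            tag = "NN"
        # Rule: at the start of an imperative sentence, prefer a verb reading.
        if i == 0 and "VB" in cands:
            tag = "VB"
        tags.append(tag)
    return list(zip(words, tags))

print(rule_based_tag("book the flight".split()))
# -> [('book', 'VB'), ('the', 'DT'), ('flight', 'NN')]
```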
Stochastic Part-of-Speech (POS) tagging
• Any model that includes frequency or probability (statistics) can be
called stochastic. A number of different approaches to part-of-speech
tagging that use such statistics are referred to as stochastic taggers.
Tag Sequence Probabilities
• It is an approach to stochastic tagging in which the tagger calculates the
probability of a given sequence of tags occurring.
• It is also called the n-gram approach, because the best tag for a given
word is determined by the probability with which it occurs with the n
previous tags.
Word Frequency Approach
• In this approach, the stochastic taggers disambiguate the words based
on the probability that a word occurs with a particular tag. We can also
say that the tag encountered most frequently with the word in the
training set is the one assigned to an ambiguous instance of that word.
The main issue with this approach is that it may yield an inadmissible
sequence of tags.
• An inadmissible sequence of tags is a grammatically invalid or
unlikely combination of POS tags that results from tagging without
considering the overall sentence structure.
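A toy sketch of the issue (all words, counts, and probabilities below are made up): the word-frequency tagger picks each word's most common tag independently, while a tag-sequence (bigram) check scores the resulting tag pairs and exposes the inadmissible combination.

```python
from collections import Counter

# Toy counts; in practice these come from a large tagged training corpus.
WORD_TAG_FREQ = {
    "to":   Counter({"TO": 100}),
    "book": Counter({"NN": 40, "VB": 10}),      # "book" is most often a noun
}

# Tag-sequence (bigram) probabilities P(tag_i | tag_{i-1}); toy numbers.
TAG_BIGRAM_PROB = {
    ("TO", "VB"): 0.60,   # "to" followed by a base-form verb: very common
    ("TO", "NN"): 0.00,   # treated as inadmissible in this toy grammar
    ("<s>", "TO"): 0.20,
}

def word_frequency_tags(words):
    """Pick the most frequent tag for each word independently."""
    return [WORD_TAG_FREQ[w].most_common(1)[0][0] for w in words]

def sequence_prob(tags):
    """Probability of a tag sequence under the bigram tag model."""
    prob = 1.0
    for prev, cur in zip(["<s>"] + tags[:-1], tags):
        prob *= TAG_BIGRAM_PROB.get((prev, cur), 0.0)
    return prob

tags = word_frequency_tags(["to", "book"])
print(tags, sequence_prob(tags))     # ['TO', 'NN'] 0.0  <- inadmissible sequence
print(sequence_prob(["TO", "VB"]))   # 0.12             <- preferred sequence
```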
Properties of Stochastic POS Tagging
Stochastic POS taggers possess the following properties −
• Tagging decisions are based on probabilities estimated from data.
• It requires a training corpus.
• Words that do not occur in the training corpus receive no probability
estimate (the unknown-word problem).
• Evaluation uses a testing corpus that is separate from the training corpus.
Transformation-based Tagging
• It is an example of transformation-based learning (TBL), a rule-based
technique for automatically assigning POS tags to the provided text.
• It is inspired by both the rule-based and stochastic taggers that were
previously explained. Rule-based and transformation taggers are similar
in that they are both based on rules that indicate which tags must be
applied to which words.
Working of Transformation Based Learning
(TBL)
• Start with the solution − TBL usually starts with some solution to
the problem and works in cycles.
• Most beneficial transformation chosen − In each cycle, TBL will
choose the most beneficial transformation.
• Apply to the problem − The transformation chosen in the last step
will be applied to the problem.
The algorithm stops when the transformation selected in step 2 no longer adds
value, or when there are no more transformations to select. This kind of
learning is best suited to classification tasks.
Detailed Process
• Initial Tagging
• Each word is given an initial POS tag.
• This is often done using a simple method (like assigning the most frequent tag
for each word based on training data).
• Rule Learning Phase
• The model then learns correction rules by comparing the initial tags to the
correct tags in the training data.
• Rules have the form: “Change tag X to tag Y if condition C is met” (e.g.,
change "NN" to "VB" if the previous word is "to")
Process (TBL)
Applying Transformations
• The rules are applied sequentially to the initial tags to correct
mistakes and improve accuracy.
• These transformations continue until no more improvements can be
made or a stopping condition is met.
Example Rule:
Rule: Change a word's tag from NN (noun) to VB (verb) if the preceding word is "to".
• Before: "to book a ticket" → "to/TO book/NN a/DT ticket/NN"
• After: Apply rule → "to/TO book/VB a/DT ticket/NN"
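A minimal sketch of applying such a transformation rule to an initially tagged sentence (the rule mirrors the example above; the code structure is only illustrative, not the Brill tagger itself):

```python
# Each rule: (from_tag, to_tag, condition on the previous word).
RULES = [
    ("NN", "VB", lambda prev_word: prev_word == "to"),
]

def apply_transformations(tagged):
    """Apply TBL correction rules left to right over an initially tagged sentence."""
    tagged = list(tagged)
    for i, (word, tag) in enumerate(tagged):
        prev_word = tagged[i - 1][0] if i > 0 else None
        for from_tag, to_tag, condition in RULES:
            if tag == from_tag and condition(prev_word):
                tagged[i] = (word, to_tag)
                tag = to_tag
    return tagged

initial = [("to", "TO"), ("book", "NN"), ("a", "DT"), ("ticket", "NN")]
print(apply_transformations(initial))
# -> [('to', 'TO'), ('book', 'VB'), ('a', 'DT'), ('ticket', 'NN')]
```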
Advantages of Transformation-based
Learning (TBL)
• We learn a small set of simple rules and these rules are enough for
tagging.
• Development as well as debugging is very easy in TBL because the
learned rules are easy to understand.
• Tagging complexity is reduced because TBL interleaves machine-learned
and human-generated rules.
Disadvantages
The disadvantages of TBL are as follows −
• Transformation-based learning (TBL) does not provide tag
probabilities.
• In TBL, the training time is very long, especially on large corpora.
Hidden Markov Model (HMM) POS Tagging
Hidden Markov Model
• An HMM may be defined as a doubly-embedded stochastic model, in
which the underlying stochastic process is hidden. This hidden process
can only be observed through another set of stochastic processes that
produces the sequence of observations.
Markov Model
Say that there are only three kinds of weather conditions, namely
• Rainy
• Sunny
• Cloudy
Now, consider a small kid, Peter, who loves to play outside. He loves it
when the weather is sunny, because all his friends come out to play in
sunny conditions. He hates rainy weather, for obvious reasons.
Every day, his mother observes the weather in the morning (that is when he
usually goes out to play) and, like always, Peter comes up to her right after
getting up and asks her what the weather is going to be like.
Since she is a responsible parent, she wants to answer that question as
accurately as possible. But the only thing she has is a set of observations
taken over multiple days of how the weather has been.
How does she make a prediction of the weather for today based on what
the weather has been for the past N days?
Say you have a sequence. Something like this:
Sunny, Rainy, Cloudy, Cloudy, Sunny, Sunny, Sunny, Rainy
So, the weather on any given day can be in any of the three
states.
Let’s say we decide to use a Markov Chain Model to solve this
problem. Now using the data that we have, we can construct
the following state diagram with the labelled probabilities.
• In order to compute the probability of today’s weather given N
previous observations, we will use the Markovian Property.
Example
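As a minimal sketch of such a Markov chain (the transition probabilities below are invented for illustration, since the labelled state diagram itself is not reproduced here):

```python
# Toy transition probabilities P(today's weather | yesterday's weather).
# Each row sums to 1; the numbers are illustrative only.
TRANSITIONS = {
    "Sunny":  {"Sunny": 0.6, "Rainy": 0.2, "Cloudy": 0.2},
    "Rainy":  {"Sunny": 0.3, "Rainy": 0.5, "Cloudy": 0.2},
    "Cloudy": {"Sunny": 0.4, "Rainy": 0.3, "Cloudy": 0.3},
}

def predict_today(yesterday):
    """Markov property: today's weather depends only on yesterday's weather."""
    probs = TRANSITIONS[yesterday]
    return max(probs, key=probs.get), probs

best, probs = predict_today("Rainy")
print(best, probs)   # Rainy {'Sunny': 0.3, 'Rainy': 0.5, 'Cloudy': 0.2}
```

Under the Markovian property, only yesterday's state matters, so the prediction reduces to a single lookup in the transition table.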
Hidden Markov Model
It’s the small kid Peter again, and this time he’s going to pester his new caretaker.
As a caretaker, one of the most important tasks for you is to tuck Peter into bed
and
make sure he is sound asleep.
Once you’ve tucked him in, you want to make sure he’s actually asleep and not up
to some mischief.
You cannot, however, enter the room again, as that would surely wake Peter up.
So, all you have to go on are the noises that might come from the room.
Either the room is quiet or there is noise coming from the room. These are your
observations.
His mother has given you the following state diagram.
The diagram has some states, observations, and probabilities.
Note that there is no direct correlation between sound from the room and
Peter being asleep.
Probabilities
There are two kinds of probabilities that we can see from the state
diagram.
• One is the emission probabilities, which represent the probability of
making a certain observation given a particular state. For example,
P(noise | awake) = 0.5 is an emission probability.
• The other is the transition probabilities, which represent the
probability of transitioning to another state given the current state. For
example, P(asleep | awake) = 0.4 is a transition probability.
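In code form, the two kinds of probabilities from this example might look like the sketch below. P(noise | awake) = 0.5 and P(asleep | awake) = 0.4 come from the text; the remaining values are filled in only so that each distribution sums to 1 and are assumptions.

```python
# Emission probabilities P(observation | state); states are "awake" / "asleep".
EMISSIONS = {
    "awake":  {"noise": 0.5, "quiet": 0.5},   # P(noise | awake) = 0.5 is from the text
    "asleep": {"noise": 0.1, "quiet": 0.9},   # assumed values
}

# Transition probabilities P(next state | current state).
TRANSITIONS = {
    "awake":  {"awake": 0.6, "asleep": 0.4},  # P(asleep | awake) = 0.4 is from the text
    "asleep": {"awake": 0.2, "asleep": 0.8},  # assumed values
}

# Probability that Peter, awake now, falls asleep next and the room is then quiet:
print(TRANSITIONS["awake"]["asleep"] * EMISSIONS["asleep"]["quiet"])   # 0.36
```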
HMMs for Part of Speech Tagging
• We know that to model any problem using a Hidden Markov Model
we need a set of observations and a set of possible states. The states in
an HMM are hidden.
• In the part of speech tagging problem, the observations are the words
themselves in the given sequence.
• As for the states, which are hidden, these would be the POS tags for
the words.
• The transition probabilities would be somewhat like P(VP | NP) that is,
what is the probability of the current word having a tag of Verb Phrase
given that the previous tag was a Noun Phrase.
• Emission probabilities would be P(john | NP) or P(will | VP) that is, what
is the probability that the word is, say, John given that the tag is a Noun
Phrase.
• Our problem here is that we have an initial state: Peter was awake when
you tucked him into bed. After that, you recorded a sequence of
observations, namely noise or quiet, at different time-steps. Using this set
of observations and the initial state, you want to find out whether Peter
would be awake or asleep after, say, N time steps.
• We draw all possible transitions starting from the initial state. There is an
exponential number of branches as we keep moving forward, so the model
grows exponentially after only a few time steps, even without considering
any observations.
(Figure: S0 is Awake and S1 is Asleep; the number of paths through the model
grows exponentially because of the transitions.)
• If we had the sequence of states, we could calculate the probability of the
sequence. But we don’t have the states; all we have is a sequence of
observations. This is why the model is referred to as a Hidden Markov
Model — the actual states over time are hidden.
Viterbi Algorithm
• The Viterbi algorithm is used to find the most likely sequence of
POS tags for a given sequence of words.
It is particularly useful in Hidden Markov Models (HMMs), where:
• Words are observed outputs
• Tags are hidden states
The algorithm uses dynamic programming to:
• Avoid recomputation
• Store best paths
• Track probability of the most likely tag sequence ending in each
possible tag at each word
Key components
• Hidden states: the set of possible POS tags
• Observations: the words of the input sentence
• Transition probabilities P(tag_i | tag_i-1) and emission probabilities P(word | tag)
• A dynamic-programming table of best scores, plus back-pointers to recover the best path
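A compact Viterbi sketch for HMM tagging (the tiny tag set and all probabilities below are invented for illustration; a real tagger would estimate them from a tagged corpus and add smoothing for unknown words):

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under an HMM."""
    # V[t][tag] = best probability of any tag sequence ending in `tag` at position t.
    V = [{}]
    backptr = [{}]
    for tag in tags:
        V[0][tag] = start_p.get(tag, 0.0) * emit_p[tag].get(words[0], 0.0)
        backptr[0][tag] = None

    for t in range(1, len(words)):
        V.append({})
        backptr.append({})
        for tag in tags:
            best_prev, best_prob = None, 0.0
            for prev in tags:
                prob = V[t - 1][prev] * trans_p[prev].get(tag, 0.0)
                if prob > best_prob:
                    best_prev, best_prob = prev, prob
            V[t][tag] = best_prob * emit_p[tag].get(words[t], 0.0)
            backptr[t][tag] = best_prev

    # Follow back-pointers from the best final tag.
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    return list(reversed(path))

TAGS = ["NOUN", "VERB", "DET"]
START = {"NOUN": 0.3, "VERB": 0.2, "DET": 0.5}
TRANS = {
    "DET":  {"NOUN": 0.9, "VERB": 0.05, "DET": 0.05},
    "NOUN": {"NOUN": 0.2, "VERB": 0.7,  "DET": 0.1},
    "VERB": {"NOUN": 0.2, "VERB": 0.1,  "DET": 0.7},
}
EMIT = {
    "DET":  {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "barks": 0.1, "cat": 0.4},
    "VERB": {"barks": 0.8, "dog": 0.1, "cat": 0.1},
}

print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
# -> ['DET', 'NOUN', 'VERB']
```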
Viterbi Algorithm
• Advantages
• Efficient: O(N×T²) time (N = words, T = tags)
• Guarantees globally optimal tag sequence
• Widely used in practical POS taggers
• Limitations
• Assumes Markov property: only previous tag matters
• Relies on good training data for accurate probabilities
• Struggles with unknown words (needs smoothing)
Generative vs. Discriminative models for POS tagging
• Generative
• Generative models, such as HMMs, aim to model the joint probability distribution
of words and their corresponding POS tags, essentially learning how these pairs are
generated.
• An HMM for POS tagging would learn probabilities for transitions between tags
(e.g., noun following an adjective) and for the emission of words given a tag (e.g.,
the probability of the word "cat" being a noun).
• Discriminative
• Discriminative models, like CRFs, aim to directly learn the conditional probability
of a tag given a word (or a sequence of words).
• They focus on learning the decision boundaries between different tag categories,
essentially learning which features are most indicative of a particular tag.
• A CRF for POS tagging would learn weights for various features (e.g., word itself,
surrounding words, prefixes, suffixes) to predict the most likely tag for a given
word.
Maximum Entropy Model
Why Maximum Entropy?
• Traditional models like Hidden Markov Models (HMMs) make
strong independence assumptions (e.g., only the previous tag
matters).
• MaxEnt allows us to incorporate diverse contextual features
without assuming independence.
• It’s discriminative, modeling P(tag | context) directly, unlike HMMs,
which are generative.
• It offers a flexible approach to handling contextual information and
resolving ambiguities.
• It belongs to the family of classifiers known as exponential or log-linear
classifiers. It works by extracting some set of features from the input and
combining them linearly (with their weights), using the obtained sum as
an exponent.
• Given the features and weights, our goal is to choose a class (for example,
a part-of-speech tag) for the word. MaxEnt does this by choosing the
most probable tag; the probability of a particular class c given the
observation x is:
P(c | x) = exp( Σᵢ wᵢ fᵢ(c, x) ) / Σ_{c′ ∈ C} exp( Σᵢ wᵢ fᵢ(c′, x) )
where the fᵢ are the feature functions, the wᵢ their weights, and C the set of candidate classes.
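A minimal sketch of this computation (the candidate tags, feature names, and weights below are invented for illustration):

```python
import math

def maxent_prob(features, weights, classes):
    """P(c | x) = exp(sum_i w_i * f_i(c, x)) / sum_c' exp(sum_i w_i * f_i(c', x))."""
    scores = {c: math.exp(sum(weights.get((feat, c), 0.0) for feat in features))
              for c in classes}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Features extracted for the word "book" in "to book a ticket" (illustrative).
features = ["word=book", "prev_word=to", "suffix=ook"]

# Learned weights for (feature, class) pairs; invented numbers.
weights = {
    ("word=book", "NN"): 1.0, ("word=book", "VB"): 0.8,
    ("prev_word=to", "VB"): 2.0, ("prev_word=to", "NN"): -0.5,
}

probs = maxent_prob(features, weights, classes=["NN", "VB"])
print(probs)                      # VB gets the higher probability here
print(max(probs, key=probs.get))  # -> 'VB'
```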
How Do Maximum Entropy Models Work for POS Tagging?
1. Feature Extraction:
• The model analyzes the context of a word, extracting
relevant features. These features can include the word
itself, its capitalization, surrounding words, previous and
next POS tags, and other contextual clues.
2. Feature Weighting:
• Each feature is assigned a weight based on its importance
in predicting the correct POS tag.
3. Probability Calculation:
• The model calculates the probability of each possible POS
tag for a given word by combining the weighted features and
applying a normalization factor.
4. Tag Assignment:
• The tag with the highest probability is then assigned to the
word.
Example of feature functions for POS tagging
Consider the word "running." Potential features could include:
• Current word: "running"
• Suffix: "-ing"
• Previous word: (e.g., "is")
• Previous tag: (e.g., verb)
These features, when combined with their corresponding weights, help
the model determine whether "running" is more likely to be a verb or a
noun in a particular context.
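As a sketch, a feature extractor along these lines might look as follows (the feature templates and names are arbitrary choices, not a fixed standard):

```python
def extract_features(words, tags, i):
    """Build a feature list for the word at position i (illustrative templates)."""
    word = words[i]
    return [
        f"word={word.lower()}",
        f"suffix3={word[-3:].lower()}",
        f"is_capitalized={word[0].isupper()}",
        f"prev_word={words[i - 1].lower() if i > 0 else '<s>'}",
        f"prev_tag={tags[i - 1] if i > 0 else '<s>'}",
    ]

words = ["He", "is", "running"]
tags = ["PRP", "VBZ"]          # tags already predicted for the preceding words
print(extract_features(words, tags, 2))
# -> ['word=running', 'suffix3=ing', 'is_capitalized=False',
#     'prev_word=is', 'prev_tag=VBZ']
```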
HMM vs. CRF
• In an HMM, the probability of a verb following a noun is static: wherever
that noun-to-verb transition occurs in the sentence, its probability
remains constant regardless of position, which is not what is observed in
natural language.
• Similarly, if a hidden state emits a word such as “eat”, that emission
probability also remains static.
• An HMM also has limited dependencies; for instance, the third hidden
state does not depend on the first hidden state, which is often not the
case in natural language.
• As for a CRF, one can draw as many connections (dependencies) as
needed.
Introduction
• A parser in NLP is a critical component that analyses the
grammatical structure of a sentence.
• It arranges words into specific groups such as nouns, verbs
and phrases.
• It breaks down a sentence into its constituent parts to
understand the syntactic relationships between them.
• POS tagging resolves lexical ambiguity, while parsing resolves
syntactic ambiguity.
Syntax
• It refers to the way in which words are arranged together to
form sentences.
• It involves rules to govern the structure of sentences and
how words are combined to convey meaning.
Parsing Ambiguity
• Parsing ambiguity refers to the phenomenon where a single
sentence can have multiple valid parse trees, each
representing a different syntactic interpretation.
• E.g.: “I saw the man with the telescope.” (two possible parses: “with the
telescope” can attach to the verb “saw” or to the noun phrase “the man”)
• The exponential increase in the number of possible parses as
sentences become more complex highlights the challenge of
syntactic ambiguity in NLP.
• Parsers must efficiently manage this ambiguity to generate the
correct parse tree for accurate language understanding.
• Advanced parsing techniques and algorithms, such as Probabilistic
context-free grammars and machine learning based parsers, are
often employed to handle this by assigning probabilities to
different parse trees and selecting the most likely one.
Modelling Constituency
• Involves representing the hierarchical structure of sentences
by identifying and labeling their constituent parts, such as
noun phrases (NP) and verb phrases (VP).
• This process aims to break down sentences into nested
phrases and construct parse trees that depict their syntactic
relationships.
• It enables machines to understand the grammatical structure
of sentences, facilitating accurate interpretation and
generation of natural language.
• Constituency parsers use parsing algorithms like top-down,
bottom-up, or statistical methods to generate parse trees.
• It also helps in semantic analysis and downstream
applications like named entity recognition and syntactic
ambiguity resolution.
• How do we arrange words into groups?
• How do we automatically find valid word arrangements when new
sentences are given?
• e.g., in English: “I play cricket” (correct) vs. “I cricket play” (incorrect)
• What is the formal tool to model this constituency, that is, how words
are arranged together?
• Which words come together, and which words do not?
• What groups make a sentence, what groups make a verb phrase, and
what group makes a noun phrase?
• The solution is Context-Free Grammar.
Context-Free Grammar (CFG)
• The most common approach to modelling
constituency involves using production
rules.
• These rules specify how the symbols of a
language can be grouped and arranged.
• For example, a noun phrase can consist of
either a proper noun or a determiner
followed by a nominal, where nominal can
include multiple nouns.
CFGs
• Provides rules that explain how to create grammatically
correct sentences in that language.
• A Context-Free Grammar (CFG) is defined by the tuple (T,
N, S, R) where:
• Terminal Symbols (T): These are the basic building blocks
of the language, such as words or punctuation marks.
• Non-terminal Symbols (N): These are categories of words
or phrases, like noun phrases or verb phrases. These don’t
represent actual words themselves but categories of words.
• Start Symbol (S): This is the starting point for any sentence.
It triggers the creation of a grammatically correct sentence.
• Production rules (R): These rules rewrite non-terminal
symbols into other symbols (terminal or non-terminal) to
ultimately generate a sentence.
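A small sketch of such a grammar, written here with NLTK's CFG helper (assuming NLTK is installed; the toy rules and lexicon cover only a couple of example sentences):

```python
import nltk

# A toy CFG: S is the start symbol, NP/VP/Det/... are non-terminals,
# and the quoted strings are terminal symbols (words).
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal | PropN
Nominal -> Noun | Nominal Noun
VP -> Verb NP
Det -> 'the' | 'a'
Noun -> 'flight' | 'ticket'
Verb -> 'book'
PropN -> 'I'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("I book a flight".split()):
    print(tree)
# (S (NP (PropN I)) (VP (Verb book) (NP (Det a) (Nominal (Noun flight)))))
```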
What is Parsing?
• The process of taking a string and a grammar and returning
all possible parse trees for that string.
• Top-Down (Goal-Oriented)
• Bottom-up (Data Directed)
Top-Down Parsing
• Top-down parsing starts with the root node S and builds down to the
leaves, assuming the input can be derived from the start symbol S.
• Initial Trees: Identify all trees that can start with S by
checking grammar rules with S on the left-hand side.
• Grow Trees: Expand trees downward using these rules until
they reach the POS categories at the bottom.
• Match Input: Reject trees whose leaves do not match the
input words.
Top-Down Parsing
• This method searches for a parse tree by starting at the top
and expanding downward, verifying against the input at
each step.
• E.g., the Earley parser, predictive parsing.
• For top-down parsing we can use depth-first or breadth-first
search and goal ordering.
• Issues: left-recursive rules, e.g. NP -> NP PP, can lead to infinite
recursion.
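A minimal top-down (recursive-descent) sketch over a toy grammar with no left-recursive rules, for exactly the reason noted above (the grammar and sentence are illustrative):

```python
# Toy grammar: each non-terminal maps to a list of alternative right-hand sides.
GRAMMAR = {
    "S":       [["NP", "VP"], ["VP"]],
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP":      [["Verb", "NP"], ["Verb"]],
    "Det":     [["that"], ["the"]],
    "Noun":    [["flight"], ["book"]],
    "Verb":    [["book"]],
}

def parse(symbol, words, pos):
    """Try to derive `symbol` from words[pos:]; yield (tree, next_pos) for each way."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:                    # try each production in turn
        yield from expand(rhs, words, pos, [symbol])

def expand(rhs, words, pos, tree):
    if not rhs:
        yield tuple(tree), pos
        return
    for subtree, next_pos in parse(rhs[0], words, pos):
        yield from expand(rhs[1:], words, next_pos, tree + [subtree])

sentence = "book that flight".split()
for tree, end in parse("S", sentence, 0):
    if end == len(sentence):                       # keep full-coverage parses only
        print(tree)
# ('S', ('VP', ('Verb', 'book'), ('NP', ('Det', 'that'), ('Nominal', ('Noun', 'flight')))))
```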
Parsing Example
Parse tree for “book that flight”:
(VP (Verb book)
    (NP (Det that)
        (Nominal (Noun flight))))
Bottom-Up Parsing
• The parser starts with the input words and builds trees
upward.
• Start with words: Begin with the words of the input.
• Apply Rules: Build trees by applying grammar rules one at
a time.
• Fit Rules: Look for places in the current parse where the
right-hand side of a rule can fit.
Bottom-up Parsers
• CYK (or CKY) parser: this algorithm works only on grammars in CNF
(Chomsky Normal Form). It also generates multiple trees when the
sentence is ambiguous.
• Shift-reduce parser: it does not require the grammar to be in CNF,
but it needs to handle backtracking.
• PCFG (Probabilistic Context-Free Grammar)
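A compact CYK sketch over a toy CNF grammar (the grammar below is an illustrative CNF fragment for the flight example; it only recognizes whether S can span the input rather than building the trees):

```python
from itertools import product

# Toy grammar in Chomsky Normal Form: A -> B C or A -> 'word'.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Verb", "NP"): {"VP", "S"},   # S -> Verb NP covers imperatives in this toy grammar
    ("Det", "Noun"): {"NP"},
}
LEXICAL_RULES = {
    "book":   {"Verb", "Noun"},
    "that":   {"Det"},
    "flight": {"Noun"},
}

def cyk_recognize(words):
    """Return True iff the CNF grammar above can derive `words` from S."""
    n = len(words)
    # table[i][j] = set of non-terminals spanning words[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(LEXICAL_RULES.get(w, set()))
    for length in range(2, n + 1):              # span length
        for i in range(n - length + 1):         # span start
            j = i + length - 1                  # span end
            for k in range(i, j):               # split point
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((b, c), set())
    return "S" in table[0][n - 1]

print(cyk_recognize("book that flight".split()))   # True
print(cyk_recognize("that book flight".split()))   # False
```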