Natural Language Processing
Course code: CSE3015
Module 3
Parsing Structure in Text
Prepared by
Dr. Venkata Rami Reddy Ch
SCOPE
Syllabus
• Shallow vs Deep parsing
• Approaches in parsing,
• Types of parsing-
• Regex parser,
• Dependency parser,
• Constituency Parsing
• Meaning Representation:
• Logical Semantics,
• Semantic Role Labelling,
• Distributional Semantics
• Discourse Processing: Anaphora and Coreference Resolution
Parsing in NLP
• Parsing is the process of analyzing the grammatical structure of a sentence to determine its
syntactic or semantic meaning.
• It is a crucial component of NLP and helps machines understand human language.
The main purposes of parsing in NLP include:
1. Understanding Sentence Structure – It helps in breaking down a sentence into its
grammatical components, such as nouns, verbs, subjects, and objects.
2. Syntactic Analysis – It determines the relationships between words in a sentence, which
is useful for applications like machine translation, question answering, and chatbots.
3. Semantic Analysis – Parsing provides a foundation for understanding meaning by identifying
roles like subjects, predicates, and modifiers.
4. Machine Translation – Syntax-based translation models rely on parsing to preserve the
grammatical structure in translated languages.
5. Question Answering and Chatbots – Understanding the structure of user queries helps
generate more relevant and meaningful responses.
Parsing in NLP
• The word syntax originates from the Greek word syntaxis, meaning “arrangement”, and
refers to how the words are arranged together.
• Sentence = S = Noun Phrase + Verb Phrase + Preposition Phrase
S = NP + VP + PP
• The different word groups that exist according to English grammar rules are:
Noun Phrase(NP): Determiner + Nominal Nouns = DET + Nominal
Verb Phrase (VP): Verb + range of combinations
Prepositional Phrase (PP): Preposition + Noun Phrase = P + NP
Shallow Parsing (Chunking) in NLP
• Shallow parsing, also known as chunking, is used to extract phrases or chunks from
sentences without analyzing their deeper syntactic structure.
• It identifies syntactic constituents (chunks) such as noun phrases (NPs), verb phrases (VPs),
and prepositional phrases (PPs).
Steps in Shallow Parsing
1. Tokenization – Splitting text into words.
2. POS Tagging – Assigning part-of-speech (POS) tags to words.
3. Chunking – Grouping words into meaningful phrases (noun phrases, verb phrases, etc.).
Example of Shallow Parsing
Sentence
• "The quick brown fox jumps over the lazy dog."
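A minimal sketch of shallow parsing for this sentence with NLTK's `RegexpParser`; the POS tags are supplied by hand here so the example does not depend on a downloaded tagger model:

```python
from nltk import RegexpParser

# Hand-tagged tokens for "The quick brown fox jumps over the lazy dog."
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

# Chunk grammar: a noun phrase is an optional determiner,
# any number of adjectives, then a noun.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)
print(tree)
```

The chunker groups "The quick brown fox" and "the lazy dog" into NP chunks while leaving the verb and preposition outside any chunk, which is exactly the "phrases only, no deep structure" behaviour of shallow parsing.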
Deep Parsing in NLP
• Deep parsing, also known as full parsing or syntactic parsing, is the process of analyzing the
full syntactic structure of a sentence, typically producing a parse tree or dependency graph.
• Unlike shallow parsing , which only identifies phrases like noun phrases (NPs) and verb
phrases (VPs), deep parsing provides a detailed hierarchical structure of the sentence,
including grammatical relations.
• Key Aspects of Deep Parsing
Full Sentence Structure
1. Identifies subject, object, verb, modifiers, etc.
2. Determines phrase structure (e.g., noun phrases, verb phrases).
3. Builds a parse tree (constituency tree) or dependency graph.
Deep Parsing in NLP
• A parser is used to implement the task of parsing.
• It may be defined as a software component that takes input data (text) and gives a
structural representation of the input after checking for correct syntax as per a formal
grammar.
• It builds a data structure, generally in the form of a parse tree, syntax tree, or other
hierarchical structure.
• The parser splits the sequence of text into groups of words that are related as phrases.
[Diagram: Input text → Parser → Valid parse tree, with a set of grammar rules (productions) feeding the parser]
Example parse tree
Input: Tom ate an apple
Shallow vs. Deep Parsing
Shallow Parsing:
• Generates only the phrases of the syntactic structure of a sentence.
• Can be used for less complex NLP applications.
• Also called chunking.
• Applications: NER, chunking, POS tagging.
Deep Parsing:
• Generates the complete syntactic structure of the sentence.
• Suitable for complex NLP applications.
• Also called full parsing.
• Applications: syntax analysis, machine translation, question answering.
Context-Free Grammar (CFG) in Parsing
• A context-free grammar (CFG) is a set of production rules used to generate all the possible sentences in a given language.
• CFG is widely used for syntactic parsing, where a parser constructs a parse tree to analyze the structure of a sentence
based on the given grammar.
• These rules specify how individual words in a sentence can be grouped to form constituents such as noun phrases, verb
phrases, prepositional phrases, etc.
Key Components of CFG
• A CFG is defined as a 4-tuple:
G=(N,Σ,P,S)
where:
• N (Non-terminals): A set of non-terminal symbols (e.g., S, NP, VP).
• Σ (Terminals): A set of terminal symbols (actual words or tokens in a language).
• P (Production Rules): Rules that define how non-terminals can be replaced by other non-terminals or terminals.
• S (Start Symbol): A special non-terminal from which parsing starts.
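As a sketch, the 4-tuple above can be written down directly with NLTK's `CFG` class; the toy rules below are illustrative, not a full grammar of English:

```python
from nltk import CFG

# N = {S, NP, VP, Det, N, V}, Σ = the quoted words,
# P = the rules below, S = the start symbol (the left-hand side of the first rule).
grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the' | 'a'
N -> 'dog' | 'cat'
V -> 'chases' | 'sees'
""")

print(grammar.start())             # the start symbol
print(len(grammar.productions()))  # each '|' alternative counts as its own production
```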
Context-Free Grammar (CFG) in Parsing
CFG:
• S → NP VP
• NP → Det N
• VP → V NP
• Det → "the" | "a"
• N → "dog" | "cat"
• V → "chases" | "sees"
• A parse tree (or syntax tree) plays a crucial role in NLP by visually representing the
syntactic structure of a sentence based on a given grammar.
• It is essential for understanding sentence structure and meaning.
Derivation of "the dog chases a cat":
• S ⇒ NP VP
• ⇒ Det N VP
• ⇒ the dog VP
• ⇒ the dog V NP
• ⇒ the dog chases NP
• ⇒ the dog chases Det N
• ⇒ the dog chases a N
• ⇒ the dog chases a cat
Approaches in parsing
• Top-Down Parsing
• Bottom-up parsing
Top-Down parsing
• Top-down parsing starts from the root (start symbol) of the grammar and tries to derive the
input sentence by applying production rules.
• It recursively expands the non-terminals until the input is matched or parsing fails.
• Top-down, left-to-right, and backtracking are the prominent search strategies used in this
approach.
Steps of Top-Down Parsing
1. Start with the start symbol
• Begin with the root node (usually S in a context-free grammar).
2. Expand using grammar rules
• Apply production rules to expand non-terminals in a depth-first manner.
3. Match against the input sentence
• Compare generated terminal symbols with input tokens.
• If a match is found, continue expanding the next non-terminal.
4. Backtrack (if necessary)
• If a production does not match the input tokens, backtrack and try a different production.
5. Repeat until the input is fully parsed
• Continue until the entire input sentence is matched with a valid parse tree.
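The steps above can be sketched with NLTK's `RecursiveDescentParser`, a backtracking, depth-first, top-down parser; the toy grammar is assumed for illustration:

```python
from nltk import CFG, RecursiveDescentParser

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the' | 'a'
N -> 'dog' | 'cat'
V -> 'chases' | 'sees'
""")

# The parser expands from S downward and backtracks on mismatches.
parser = RecursiveDescentParser(grammar)
sentence = "the dog chases a cat".split()
for tree in parser.parse(sentence):
    print(tree)
```

Passing `trace=2` to `RecursiveDescentParser` prints each expansion, match, and backtracking step, which makes the search strategy visible.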
Top-Down parsing
Bottom-up parsing
• Bottom-up parsing starts with the input words (tokens) and applies grammar rules in reverse
to construct the parse tree, moving from the leaves to the root.
• The goal of reaching the starting symbol S is accomplished through a series of reductions;
when the right-hand side of some rule matches the substring of the input string, the
substring is replaced with the left-hand side of the matched production, and the process is
repeated until the starting symbol is reached.
How It Works:
1. Start with the input sentence (sequence of words).
2. Build phrases (subtrees) by matching grammar rules.
3. Continue merging phrases until reaching the root (the start symbol of the grammar).
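A minimal sketch of bottom-up parsing with NLTK's `ShiftReduceParser`, which shifts tokens onto a stack and reduces them whenever the right-hand side of a rule matches the top of the stack; the toy grammar is assumed for illustration (shift-reduce parsing can miss parses on ambiguous grammars, so chart parsers are often preferred in practice):

```python
from nltk import CFG, ShiftReduceParser

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the' | 'a'
N -> 'dog' | 'cat'
V -> 'chases' | 'sees'
""")

# The parser works from the words (leaves) upward, reducing until it reaches S.
parser = ShiftReduceParser(grammar)
sentence = "the dog chases a cat".split()
for tree in parser.parse(sentence):
    print(tree)
```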
Sentence: John is playing a game
Sentence: John ate the cake
Cons of Top-Down Parsing
1. Backtracking Overhead – Can suffer from excessive backtracking in ambiguous grammars,
making parsing inefficient.
2. Limited Context Sensitivity – Struggles with complex languages where parsing depends on
deeper contextual information.
Cons of Bottom-Up Parsing
1. Complex Implementation – More difficult to implement compared to top-down
approaches.
2. Higher Memory Usage – Stores intermediate parse states, leading to increased memory
consumption.
3. Slower for Simple Grammars – For straightforward languages, top-down parsers (like
recursive descent) may be faster.
4. Less Readable Parse Trees – The parse tree is built in a non-intuitive order (from leaves to
root), making debugging harder.
Types of parsing/parsers
• Regex parser,
• Dependency parser,
• Constituency Parsing
Regexp Parser
• A Regex Parser in NLP is a rule-based method for chunking and parsing text using regular
expressions applied to Part-of-Speech (POS)-tagged words.
• A Regexp Parser uses regular expressions defined in the form of a grammar and applies them
to a POS-tagged string to generate a parse tree.
Steps of Regex Parser
1. Input Sentence
2. Tokenization → Split sentence into words
3. POS Tagging → Assign part-of-speech labels
4. Apply Regex Parser → Identify phrase structures
5. Extract Key Phrases → Find NP, VP, PP, etc.
6. Display Parse Tree → Draw a visual representation
Regexp Parser
Define the Grammar Rules
Create regex-based rules to identify phrases like Noun Phrases (NP), Verb Phrases (VP), and
Prepositional Phrases (PP)
grammar = r"""
NP: {<DT>? <JJ>* <NN>*}   # Noun Phrase (Determiner + Adjective(s) + Noun(s))
P: {<IN>}                 # Preposition (e.g., in, on, at)
V: {<V.*>}                # Verb (any verb form)
PP: {<P> <NP>}            # Prepositional Phrase (P + NP)
VP: {<V> <NP|PP>*}        # Verb Phrase (V + NP/PP)
"""
Create a Regex Parser
reg_parser = RegexpParser(grammar)
Tokenize and POS Tag a Sentence
sentence = "The quick brown fox jumps over the lazy dog"
pos_tags = pos_tag(word_tokenize(sentence) )
Parse the Sentence Using the Regex Parser
parsed_sentence = reg_parser.parse(pos_tags)
Regexp Parser
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
from nltk import pos_tag, word_tokenize, RegexpParser

text = "The quick brown fox jumps over the lazy dog"
tokens = word_tokenize(text)
tags = pos_tag(tokens)

reg_parser = RegexpParser("""
NP: {<DT>?<JJ>*<NN>}   # To extract Noun Phrases
P: {<IN>}              # To extract Prepositions
V: {<V.*>}             # To extract Verbs
PP: {<IN><NP>}         # To extract Prepositional Phrases
VP: {<V.*><NP|PP>*}    # To extract Verb Phrases
""")
result = reg_parser.parse(tags)
print('Parse Tree:', result)
result.draw()
Constituency Parsing
• Constituency Parsing in NLP is a syntactic analysis technique that breaks down a sentence
into its constituent components (phrases) based on a context-free grammar (CFG).
• The output of a constituency parser is typically a parse tree, which represents the
hierarchical structure of the sentence
• It creates a parse tree that represents the syntactic structure of a sentence according to
grammar rules.
• The process involves identifying noun phrases, verb phrases, and other constituents, and
then determining the relationships between them.
• It helps in understanding the grammatical structure of sentences, which is crucial for
various NLP tasks such as text summarization, machine translation, question answering, and
text classification.
Constituency Parsing
Steps in Constituency Parsing:
[Link] a Context-Free Grammar (CFG)
2. Constructing a Parse Tree
• A parse tree can be constructed from the CFG using either a Top-Down or Bottom-Up
parsing approach.
Example:
CFG:
S → NP VP
NP → Det N
VP → V NP
Det → "the" | "a"
N → "dog" | "cat"
V → "chases" | "sees"
Constituency Parsing
• The ChartParser is a bottom-up dynamic programming parsing algorithm that efficiently
constructs a parse tree by storing intermediate results in a chart table.

import nltk
from nltk import CFG

# Define a Context-Free Grammar (CFG)
grammar = CFG.fromstring("""
S -> NP VP
NP -> DT NN
VP -> VBD PP
PP -> IN NP
DT -> 'The' | 'the'
NN -> 'cat' | 'mat'
VBD -> 'sat'
IN -> 'on'
""")

# Create a ChartParser (bottom-up parser)
parser = nltk.ChartParser(grammar)
sentence = "The cat sat on the mat".split()

# Generate parse tree
for tree in parser.parse(sentence):
    print(tree)
    tree.pretty_print()
Applications of Constituency Parsing
1. Machine Translation
2. Information Retrieval
3. Question Answering
4. Text Summarization
5. Sentiment Analysis
6. Grammar Checking
Dependency Parsing
• Dependency parsing, on the other hand, focuses on identifying the grammatical
relationships between words in a sentence.
• The output is a dependency tree or graph that shows how words are related to each other.
• It involves constructing a tree-like structure of dependencies, where
each word is represented as a node and the relationships between words
are represented as edges.
• Dependency Parsing is a powerful technique for understanding the
meaning and structure of language, and is used in a variety of
applications, including text classification, sentiment analysis, and
machine translation.
Key Concepts of Dependency Parsing
1. Dependency Relations
• A sentence is represented as a directed graph where words (nodes) are connected by
dependency relations (edges).
• Each word (except the root) depends on a head (governor), forming a hierarchical tree.
2. Head and Dependent
• Head: The main word in a phrase (e.g., a verb in a sentence).
• Dependent: A word that modifies or depends on the head.
3. Root
• The central word of the sentence, usually the main verb, to which all other words are
directly or indirectly connected.
4. Dependency Types (Labels)
• Common relations include:
  • nsubj (nominal subject) – The subject of a verb.
  • dobj (direct object) – The object receiving the action.
  • amod (adjectival modifier) – An adjective modifying a noun.
  • prep (prepositional modifier) – A preposition connecting words.
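The concepts above can be sketched as a plain data structure; a library such as spaCy would produce these relations automatically, but a hand-annotated example (assumed here for illustration) makes the head/dependent/root structure explicit:

```python
# Dependency parse of "The dog chases a cat", hand-annotated for illustration.
# Each dependent maps to (head, relation); the root has no entry because it has no head.
deps = {
    "The": ("dog", "det"),       # determiner modifying the noun
    "dog": ("chases", "nsubj"),  # nominal subject of the verb
    "a":   ("cat", "det"),
    "cat": ("chases", "dobj"),   # direct object of the verb
}
root = "chases"  # the main verb; every other word depends on it directly or indirectly

def head_chain(word):
    """Follow head links from a word up to the root."""
    chain = [word]
    while word in deps:
        word = deps[word][0]
        chain.append(word)
    return chain

print(head_chain("The"))  # ['The', 'dog', 'chases']
```

Following head links from any word always terminates at the root, which is exactly the "directly or indirectly connected" property described above.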