Unit III - NLP

This document covers syntactic analysis, focusing on context-free grammars (CFGs) and their components, properties, and applications in natural language processing. It explains grammar rules for English, treebanks, and various normal forms for grammars, including Chomsky Normal Form and Greibach Normal Form. Additionally, it discusses the limitations of CFGs and the challenges associated with treebanks.


UNIT III: Syntactic Analysis: Context-Free Grammars, Grammar rules for English, Treebanks, Normal Forms for grammar – Dependency Grammar – Syntactic Parsing, Ambiguity, Dynamic Programming parsing – Shallow parsing – Probabilistic CFG, Probabilistic CYK, Probabilistic Lexicalized CFGs – Feature structures, Unification of feature structures.

Syntactic Analysis
Syntactic analysis is about understanding the structure of sentences and figuring out how
words fit together according to grammar rules. It’s like building a sentence puzzle to see how
everything connects.

Context-Free Grammars (CFGs)


Definition
A Context-Free Grammar (CFG) is a formal system used to describe the syntactic structure
of a language. It consists of a set of production rules that dictate how symbols in the language
can be combined to form valid sentences or strings.

Components of CFG
A CFG is defined as a 4-tuple G = (N, T, P, S), where:

1. N (Non-terminals): A finite set of variables representing syntactic categories or abstract structures (e.g., S, NP, VP).
2. T (Terminals): A finite set of symbols representing the actual words or tokens in the language (e.g., dog, run).
3. P (Production Rules): A set of rules of the form A → α, where:
o A ∈ N (a non-terminal).
o α ∈ (N ∪ T)* (a string of terminals and/or non-terminals).
4. S (Start Symbol): A special non-terminal from N that represents the start of the derivation process.

Example of a CFG for Simple English Sentences

Production Rules:
 S → NP VP
 NP → Det N | N
 VP → V NP | V
 Det → "a" | "the"
 N → "dog" | "cat"
 V → "chased" | "saw"

Derivation Example:
 Start with S.
 S → NP VP
 NP → Det N
 Det → "the", N → "dog", so NP → "the dog".
 VP → V NP
 V → "chased", NP → "the cat".
 Final sentence: "The dog chased the cat."
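As a rough illustration, the toy grammar above can be encoded directly as a Python dictionary and used to generate sentences. This is only a sketch; the names GRAMMAR and expand are illustrative, not part of any standard library.

import random

# The toy grammar above, encoded as a mapping from a non-terminal
# to its alternative right-hand sides (each RHS is a list of symbols).
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["a"], ["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["chased"], ["saw"]],
}

def expand(symbol):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:                 # terminal: return the word itself
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])      # pick one production for the non-terminal
    words = []
    for sym in rhs:
        words.extend(expand(sym))
    return words

print(" ".join(expand("S")))                  # e.g. "the dog chased a cat"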

Properties of CFGs
1. Generative: CFGs can generate all valid strings in a language and reject invalid ones.
2. Recursive: CFGs can define recursive structures, allowing representation of nested
constructs (e.g., "The cat that chased the dog saw the bird").
3. Ambiguity: A CFG is ambiguous if a string can have more than one valid parse tree.

Applications of CFGs
1. Natural Language Processing (NLP): CFGs model the syntax of languages for
parsing and understanding.
2. Programming Languages: Used in compilers to define the syntax of programming
languages.
3. Formal Languages: Defines languages in automata theory.

Limitations
 CFGs cannot naturally capture context-sensitive constraints such as agreement in number or gender (e.g., "he runs" vs. "they run") without multiplying grammar categories.
 They may struggle with long-distance dependencies in natural languages (e.g., "The book that John read was interesting").
Grammar rules for English
Grammar Rules for English (using Context-Free Grammar)
English grammar can be modeled with a Context-Free Grammar (CFG) to describe how
sentences are structured. Below are commonly used grammar rules for English, written in
CFG notation.

1. Basic Sentence Structure


 S → NP VP
(A sentence consists of a noun phrase followed by a verb phrase.)

2. Noun Phrase (NP)


A noun phrase represents a subject or object and can take several forms:

 NP → Det N
(Determiner followed by a noun.)
 NP → Det AdjP N
(Determiner, adjective phrase, and noun.)
 NP → N
(Just a noun.)
 NP → Pronoun
(A pronoun.)
 NP → NP PP
(A noun phrase followed by a prepositional phrase.)

3. Verb Phrase (VP)


A verb phrase expresses an action or state and can be expanded as:
 VP → V
(A single verb.)
 VP → V NP
(A verb followed by a noun phrase.)
 VP → V NP PP
(A verb, a noun phrase, and a prepositional phrase.)
 VP → V AdjP
(A verb followed by an adjective phrase.)
 VP → V S
(A verb followed by a subordinate clause.)
 VP → VP Conj VP
(Two verb phrases joined by a conjunction.)

4. Prepositional Phrase (PP)


 PP → Prep NP
(A preposition followed by a noun phrase.)

5. Adjective Phrase (AdjP)


 AdjP → Adj
(A single adjective.)
 AdjP → Adv Adj
(An adverb followed by an adjective.)

6. Adverb Phrase (AdvP)


 AdvP → Adv
(A single adverb.)
 AdvP → Adv AdvP
(An adverb modifying another adverb phrase.)

7. Determiners and Pronouns


 Det→"a" ∣ "the" ∣ "some" ∣ "many"Det \rightarrow \text{"a"} \, | \, \text{"the"} \, | \,
\text{"some"} \, | \, \text{"many"}

 Pronoun→"he" ∣ "she" ∣ "it" ∣ "they"Pronoun \rightarrow \text{"he"} \, | \, \text{"she"}


\, | \, \text{"it"} \, | \, \text{"they"}

8. Subordinate Clause (S)


 S → SubConj S
(A subordinating conjunction followed by a sentence.)
 Example subordinating conjunctions: SubConj → "that" | "because" | "if"

9. Conjunctions
 Conj→"and" ∣ "or" ∣ "but"Conj \rightarrow \text{"and"} \, | \, \text{"or"} \, | \,
\text{"but"}

Example Grammar Derivation


Input Sentence: "The dog chased the cat."
1. Start with S:
o S → NP VP
2. Expand NP:
o NP → Det N, Det → "the", N → "dog"
3. Expand VP:
o VP → V NP, V → "chased"
o NP → Det N, Det → "the", N → "cat"
Result: Parse tree for "The dog chased the cat."

 Example: A derivation of the sentence "the giraffe dreams": S ⇒ NP VP ⇒ Det N VP ⇒ the N VP ⇒ the giraffe VP ⇒ the giraffe IV ⇒ the giraffe dreams (where IV stands for an intransitive verb category).

Additional Notes
 These grammar rules are simplified and idealized; real English involves exceptions,
ambiguities, and context-sensitive features not captured by CFGs.

 Probabilistic CFGs (PCFGs) are often used to resolve ambiguities in natural English
sentences.
Treebanks
Definition
A Treebank is a linguistic resource that consists of a corpus of sentences annotated with
their syntactic structures, typically represented as parse trees. These parse trees are based on
a particular grammar (such as Context-Free Grammar) and describe the syntactic
relationships between words in the sentences.

Treebanks are used extensively in Natural Language Processing (NLP) for training and
evaluating syntactic parsers.

Key Features of Treebanks


1. Syntactic Representation: Sentences are annotated with their syntax using
constituency trees, dependency trees, or both.
2. Manual or Automatic Annotation:
 Many treebanks are manually annotated by linguists for high accuracy.

 Some are created using automatic parsers, often followed by manual


correction.
3. Standards and Formats:
 Annotation schemes are often standardized for specific languages.

 Common formats include Penn Treebank format, CoNLL-U format, and


XML-based formats.

Types of Representations in Treebanks


1. Constituency Trees (Phrase-Structure Trees):
 Represent sentences using a hierarchical structure of phrases.
 Based on Context-Free Grammar (CFG).
 Example (see the NLTK sketch after this list):
(S
  (NP (Det The) (N dog))
  (VP (V chased)
      (NP (Det the) (N cat))))
"The dog chased the cat."
2. Dependency Trees:
 Focus on relationships between words (head-dependent pairs).
 Example:
chased
  ↳ dog (subject)
  ↳ cat (object)
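A bracketed constituency annotation like the one in the example above can be loaded and inspected with NLTK's Tree class (assuming NLTK is installed); the bracket string below simply repeats the example.

from nltk import Tree

# The Penn-Treebank-style bracketing from the example above.
bracketed = "(S (NP (Det The) (N dog)) (VP (V chased) (NP (Det the) (N cat))))"

tree = Tree.fromstring(bracketed)
tree.pretty_print()                          # draws the constituency tree as ASCII art
print(tree.leaves())                         # ['The', 'dog', 'chased', 'the', 'cat']
print([t.label() for t in tree.subtrees()])  # phrase labels: ['S', 'NP', 'Det', ...]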

Commonly Used Treebanks


1. Penn Treebank:
 Language: English.
 Features: Annotated with constituency structures and POS tags.
 Widely used in NLP for syntactic parsing and language modeling.
2. Universal Dependencies (UD) Treebanks:
 Language: Multilingual (covers over 100 languages).
 Features: Dependency-based annotation using a consistent schema.
3. NEGRA and TIGER Treebanks:
 Language: German.
 Features: Both constituency and dependency structures.
4. TüBa-D/Z Treebank:
 Language: German.
 Features: Annotated with deep syntactic and semantic structures.
5. Chinese Treebank:
 Language: Chinese.
 Features: Includes constituency-based parse trees and POS tags.
6. Arabic Treebank:
 Language: Arabic.
 Features: Annotates complex morphological and syntactic structures.
Applications of Treebanks
1. Training and Evaluating Parsers:

o Supervised parsers (e.g., probabilistic or neural parsers) rely on treebanks for


learning.
2. Part-of-Speech (POS) Tagging:

o Treebanks often include POS tags, which are essential for various downstream
NLP tasks.
3. Language Modeling:
o Treebanks help in developing structured language models that consider syntax.
4. Linguistic Research:

o Used by linguists to study syntactic phenomena and variations across


languages.

Challenges with Treebanks


1. Ambiguity: Ambiguous sentences may require multiple annotations.
2. Cost of Annotation: Manual annotation is time-consuming and expensive.
3. Domain-Specificity: Treebanks are often limited to specific genres or domains (e.g.,
news articles), making generalization difficult.

Example
Sentence: "The quick brown fox jumps over the lazy dog."
1. Constituency Tree:
(S
  (NP (Det The) (Adj quick) (Adj brown) (N fox))
  (VP (V jumps)
      (PP (P over)
          (NP (Det the) (Adj lazy) (N dog)))))
2. Dependency Tree:
jumps
  ↳ fox (subject)
  ↳ over (prepositional modifier)
      ↳ dog (object of preposition)

Normal Forms for Grammar


In computational linguistics and formal language theory, normal forms for grammars are
standardized formats that simplify the structure of grammar rules. These forms are useful for
theoretical analysis, algorithm development, and practical applications such as parsing and
compiler design. Below are the most commonly used normal forms:

1. Chomsky Normal Form (CNF)


A context-free grammar (CFG) is in Chomsky Normal Form if all production rules follow
these restrictions:
1. Each production is of one of the forms:
o A → BC (where A, B, C are non-terminal symbols, and neither B nor C is the start symbol S)
o A → a (where a is a terminal symbol)
2. The start symbol S does not appear on the right-hand side of any production.
3. The grammar contains no ε-productions (rules of the form A → ε), except possibly S → ε.

Example
Original Grammar:
 S → AB | a | ε
 A → aA | a
Converted to CNF:
 S → AB | a
 A → XA | a
 X → a
(The rule A → aA mixes a terminal with a non-terminal, so a new non-terminal X → a is introduced to replace the terminal on the right-hand side.)

Applications:
 Used in CYK Parsing (Cocke–Younger–Kasami algorithm) for CFG parsing.
 Simplifies proofs and formal analysis.
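A minimal sketch of how CNF membership could be checked programmatically, assuming a grammar represented as (left-hand side, right-hand side) pairs; is_cnf is an illustrative helper, not a library function.

# Check whether every production satisfies Chomsky Normal Form.
# Non-terminals are taken to be the symbols that appear on some left-hand side.
def is_cnf(productions, start="S"):
    nonterminals = {lhs for lhs, _ in productions}
    for lhs, rhs in productions:
        if len(rhs) == 2:
            # A -> B C : both symbols must be non-terminals other than the start symbol
            if not all(sym in nonterminals and sym != start for sym in rhs):
                return False
        elif len(rhs) == 1:
            # A -> a : the single symbol must be a terminal
            if rhs[0] in nonterminals:
                return False
        elif len(rhs) == 0:
            # A -> epsilon : only allowed for the start symbol
            if lhs != start:
                return False
        else:
            return False
    return True

prods = [("S", ("A", "B")), ("S", ("a",)), ("A", ("X", "A")),
         ("A", ("a",)), ("X", ("a",)), ("B", ("b",))]
print(is_cnf(prods))   # True: this is the converted grammar from the example above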

2. Greibach Normal Form (GNF)


A CFG is in Greibach Normal Form if every production rule has the form:

 A → aα (where a is a terminal, and α is a sequence of zero or more non-terminals).

Example
Original Grammar:
 S → AB | aB
 A → b | aA
Converted to GNF:
 S → aB | bB | aAB
 A → b | aA
(The rule S → AB is removed by substituting the productions of A, so that every right-hand side begins with a terminal.)

Applications:
 GNF ensures that parsing can begin with a terminal symbol, making it suitable for
top-down parsing algorithms like recursive descent.

3. Kuroda Normal Form


A grammar is in Kuroda Normal Form if all production rules have one of these forms:
1. AB → CD
2. A → BC
3. A → B
4. A → a
where A, B, C, D are non-terminals and a is a terminal. Kuroda Normal Form is defined for context-sensitive (and, with ε-rules, unrestricted) grammars rather than for context-free grammars.

Applications:
 Kuroda Normal Form is rarely used in practice but helps in the theoretical analysis of context-sensitive and unrestricted grammars.

4. Backus-Naur Form (BNF)


Backus-Naur Form is a notation for defining formal languages, commonly used for programming language grammars. Rules are written as:

 <non-terminal> ::= <expression>, where <expression> consists of terminals and non-terminals combined with operators like |.

Example
For a programming language:
 <statement> ::= <if-statement> | <while-statement> | <assignment>
 <if-statement> ::= "if" <condition> "then" <statement>

Applications:
 Widely used in language specifications and compiler design.

5. Extended Backus-Naur Form (EBNF)


Extended Backus-Naur Form is an enhancement of BNF that allows more expressive rules,
including:
1. Optional elements (denoted by []).
2. Repetitions (denoted by {}).
3. Grouping (denoted by ()).

Example
For a programming language:

 <statement> ::= <if-statement> | <while-statement> | <assignment>
 <if-statement> ::= "if" <condition> "then" <statement> ["else" <statement>]

Applications:
 Used in language design and compiler construction.

6. Normal Forms in Dependency Grammar


For Dependency Grammars, normal forms often refer to simplifications such as:

 Binary branching structures (e.g., splitting multiple dependencies into binary


relations).
 Ensuring acyclic graphs in dependency trees.

Why Normalize Grammars?


1. Simplifies Parsing: Parsing algorithms (e.g., CYK or Earley) often require grammars
in a specific normal form.
2. Algorithmic Efficiency: Normal forms reduce complexity for machine processing.
3. Theoretical Analysis: Easier to prove properties like equivalence or ambiguity.
4. Standardization: Provides a uniform framework for comparing grammars.

Dependency Grammar
Definition
Dependency Grammar (DG) is a syntactic theory that emphasizes the relationships between
words in a sentence, known as dependencies. In this framework:

 Words (or tokens) are connected by directed links, where one word (the head)
governs or determines another word (the dependent).
 The syntactic structure of a sentence is represented as a dependency tree.

Key Concepts
1. Head: The central word in a dependency relationship, which determines the
grammatical function of the dependent.
o Example: In "The cat sleeps," the verb "sleeps" is the head of the sentence.
2. Dependent: A word that is grammatically subordinate to the head.

o Example: In "The cat sleeps," "cat" is dependent on "sleeps," and "The" is


dependent on "cat."
3. Dependency Tree:
o The structure is a rooted, directed tree in which each node is a word and each edge represents a dependency relation (every word has exactly one head, except the root).
o Typically, the root of the tree corresponds to the main verb or predicate.

Dependency Relations
Dependency relations encode syntactic and grammatical roles such as:
1. Subject: The doer of an action (e.g., "She runs.").
2. Object: The receiver of an action (e.g., "He reads a book.").

3. Modifier: A word that describes or adds information about another word (e.g., "The
big cat sleeps.").
4. Prepositional Relation: Links prepositions to their objects (e.g., "on the table").
Example: Dependency Tree
Sentence: "The quick brown fox jumps over the lazy dog."
Dependency Tree Representation:
 Root: jumps
o Subject: fox
 Modifiers: the, quick, brown
o Prepositional Modifier: over
 Object of Preposition: dog
 Modifiers: the, lazy
Graphically:
jumps
├── fox
│   ├── the
│   ├── quick
│   └── brown
└── over
    └── dog
        ├── the
        └── lazy

Advantages of Dependency Grammar


1. Simplicity: Focuses on binary relationships between words, making it intuitive for
many languages.

2. Cross-Linguistic Applicability: Suitable for free-word-order languages (e.g., Hindi,


Russian), where constituent structure may be less rigid.
3. Compact Representation: Uses fewer nodes and simpler structures compared to
constituency grammars.

4. Practical Utility: Widely used in modern Natural Language Processing (NLP) for
tasks like syntactic parsing and semantic role labeling.
Dependency Parsing
Dependency parsing is the process of analyzing a sentence to produce its dependency tree.
Parsing can be done using:
1. Rule-Based Parsing: Based on predefined grammatical rules.
2. Machine Learning Approaches:
 Transition-based parsing: Builds the tree incrementally.

 Graph-based parsing: Finds the best tree by maximizing the global score of
dependencies.

Applications of Dependency Grammar


1. Part-of-Speech Tagging: Enhances tagging accuracy by considering syntactic
relationships.

2. Machine Translation: Ensures syntactic and semantic coherence between source and
target languages.

3. Information Extraction: Helps identify subject-verb-object triples for extracting


facts.

4. Question Answering: Improves understanding of sentence structure to extract


relevant answers.

Limitations of Dependency Grammar


1. Inadequate for Some Constructions: Complex syntactic phenomena like
coordination or ellipsis can be challenging.

2. Ambiguity: Dependency relations can sometimes be ambiguous, especially in long or


complex sentences.

3. Annotation Cost: Building dependency treebanks is resource-intensive and requires


linguistic expertise.
Syntactic Parsing
Definition
Syntactic parsing, also known as syntactic analysis, is the process of analyzing the
syntactic structure of a sentence based on a formal grammar. It determines how words in a
sentence relate to each other and how they conform to the rules of syntax for a given
language. The result of parsing is typically a parse tree or dependency tree.

Goals of Syntactic Parsing


1. To identify the grammatical structure of a sentence.
2. To determine relationships between words (e.g., subject, object, predicate).
3. To assist in higher-level tasks like semantic analysis and machine translation.

Types of Syntactic Parsing


1. Constituency Parsing:

o Focuses on grouping words into constituents or phrases (e.g., noun phrases,


verb phrases).

o Based on Phrase Structure Grammar (PSG) or Context-Free Grammar


(CFG).
o Output: Parse tree that shows hierarchical relationships between phrases.
o Example:
 Sentence: "The cat sleeps."
 Parse Tree:
S
├── NP
│   ├── Det: The
│   └── N: cat
└── VP
    └── V: sleeps
2. Dependency Parsing:

o Focuses on identifying dependencies between words, where one word (the


head) governs another (the dependent).
o Output: Dependency tree showing relationships like subject, object, etc.
o Example:
 Sentence: "The cat sleeps."
 Dependency Tree:
sleeps
└── cat (subject)
    └── The (determiner)

Key Components
1. Grammar: Defines the rules for valid sentence structures.
o Example: S → NP VP (A sentence is a noun phrase followed by a verb phrase).
2. Parse Tree: A hierarchical representation of the syntactic structure.
3. Parsing Algorithm: The method used to generate the parse tree.

Common Parsing Algorithms


1. Top-Down Parsing
 Starts from the root of the parse tree and works down to the leaves.
 Example: Recursive Descent Parsing.
 Pros: Simple and intuitive.
 Cons: Inefficient for ambiguous or left-recursive grammars.

2. Bottom-Up Parsing
 Starts from the leaves (words) and works up to the root.
 Example: Shift-Reduce Parsing.
 Pros: Efficient for many practical applications.
 Cons: Can require complex grammar transformations.

3. Dynamic Programming Parsing


 Uses a chart to store intermediate results and avoid redundant computations.
 Examples:
o CYK Algorithm: Works on Chomsky Normal Form (CNF) grammars.
o Earley Parser: Handles all CFGs efficiently.
 Pros: Efficient and guarantees completeness.
 Cons: Computationally expensive for large grammars.
4. Probabilistic Parsing
 Extends traditional parsing by incorporating probabilities to handle ambiguity.
 Examples:
o Probabilistic Context-Free Grammars (PCFGs).
o Probabilistic CYK Parser.

Ambiguity in Parsing
Ambiguity arises when a sentence can be parsed in multiple ways:
1. Lexical Ambiguity: Words with multiple meanings (e.g., "bank").
2. Syntactic Ambiguity: Multiple valid parse trees for a sentence.
o Example: "The boy saw the man with the telescope."
 Did the boy use the telescope?
 Or did the man have the telescope?

Disambiguation Techniques:
 Probabilistic grammars (PCFGs).
 Semantic analysis.
 Contextual information.

Applications of Syntactic Parsing


1. Natural Language Processing (NLP):
o Machine Translation.
o Question Answering.
o Sentiment Analysis.
2. Speech Recognition:
o Helps convert spoken language into structured text.
3. Information Retrieval:
o Improves search engines by understanding sentence structure.
4. Code Analysis:
o Parsing programming languages for compilers and interpreters.
Ambiguity
Ambiguity in Syntactic Parsing
Definition
Ambiguity in syntactic parsing occurs when a sentence can be interpreted in more than one
way due to multiple valid syntactic structures or interpretations. This is a common challenge
in natural language processing (NLP) and computational linguistics.

Types of Ambiguity
1. Lexical Ambiguity:
 A single word has multiple meanings or parts of speech.
 Example:
 Sentence: “I saw a bat.”
 Ambiguity: Does "bat" mean an animal or a sports instrument?
2. Syntactic Ambiguity:

 A sentence can have multiple valid parse trees due to different grammatical
interpretations.
 Example:
 Sentence: “The boy saw the man with the telescope.”
 Ambiguity:
 Interpretation 1: The boy used the telescope to see the man.
 Interpretation 2: The man had the telescope.
3. Semantic Ambiguity:

 The meaning of the sentence is unclear even after resolving lexical and
syntactic ambiguities.
 Example:
 Sentence: “Visiting relatives can be annoying.”
 Ambiguity:
 Interpretation 1: The act of visiting relatives is annoying.
 Interpretation 2: Relatives who are visiting are annoying.
4. Pragmatic Ambiguity:
 The context of the sentence leads to multiple interpretations.
 Example:
 Sentence: “Can you pass the salt?”
 Ambiguity: Is it a request or a question about ability?
5. Structural Ambiguity:
 Arises from multiple ways to group words or phrases in a sentence.
 Example:
 Sentence: “Old men and women.”
 Ambiguity:
 Interpretation 1: Both men and women are old.
 Interpretation 2: Only the men are old, and the women are not.

Causes of Ambiguity
1. Polysemy: Words with multiple meanings.
2. Homonymy: Words that sound or look the same but have different meanings.
3. Grammatical Flexibility: Flexible word order in languages.
4. Contextual Variability: Lack of clear context in the sentence.
5. Complex Sentence Structures: Use of nested or compound clauses.

Ambiguity in NLP
Ambiguity is a significant challenge in natural language processing tasks like syntactic
parsing, semantic analysis, and machine translation.

How Ambiguity Affects NLP:


1. Syntactic Parsers: May produce multiple parse trees.
2. Translation Systems: Can result in incorrect or awkward translations.
3. Speech Recognition: May lead to misinterpretation of spoken words.

Disambiguation Techniques
1. Probabilistic Models:
 Use probabilistic grammars like Probabilistic Context-Free Grammar
(PCFG) to rank parse trees based on likelihood.
 Example: Viterbi algorithm for finding the most probable parse.
2. Contextual Information:
 Consider the surrounding words and phrases to resolve ambiguity.
 Example: Using word embeddings like Word2Vec or BERT.
3. Semantic Analysis:
 Use semantic rules to filter out implausible interpretations.
 Example: Semantic role labeling to determine roles like subject, object, etc.
4. Pragmatic and World Knowledge:
 Incorporate real-world knowledge to interpret ambiguous sentences.
 Example: “She is in the bank.” (Financial institution or riverbank?)
5. Supervised Learning:

 Train machine learning models on annotated datasets to learn disambiguation


patterns.

Examples of Resolving Ambiguity


1. Lexical Ambiguity:
 Input: “He went to the bank.”

 Resolution: Use context like preceding sentences to infer "bank" refers to a


financial institution.
2. Syntactic Ambiguity:
 Input: “I saw the man with a telescope.”
 Resolution: Use semantic plausibility or PCFGs to select the correct parse tree.
3. Structural Ambiguity:
 Input: “Flying planes can be dangerous.”

 Resolution: Analyze verb-noun dependencies to determine if "flying" is a


gerund or a participle.
Applications of Ambiguity Resolution
1. Machine Translation: Ensures accurate translations by resolving structural and
lexical ambiguities.

2. Speech Recognition: Improves transcription quality by handling homophones and


lexical ambiguities.
3. Question Answering Systems: Ensures correct interpretation of user queries.
4. Information Retrieval: Filters out irrelevant results caused by ambiguous queries.

Dynamic Programming Parsing


Definition
Dynamic Programming (DP) parsing is a computational approach that uses tabulation to
efficiently parse sentences by storing intermediate results. It avoids redundant computations,
making it especially useful for parsing ambiguous or complex sentences in natural language
processing (NLP).

Key Concepts in Dynamic Programming Parsing


1. Chart Parsing:
o A chart is a table that stores partial parse results for sub-spans of the sentence.

o Each cell in the chart represents a span of the sentence and contains possible
parses for that span.
2. Dynamic Programming:
o Breaks the problem into smaller subproblems.
o Reuses results of subproblems to construct solutions for larger spans.
3. Grammar Formalism:

o Often uses Context-Free Grammar (CFG) as the underlying grammar


model.

Advantages of Dynamic Programming Parsing


 Efficiently handles ambiguity by exploring all possible parses without redundant
computations.
 Guarantees completeness (finds all possible parses).
 Can handle long and complex sentences better than naive parsing methods.
Common Algorithms for Dynamic Programming Parsing
1. CYK Algorithm (Cocke-Younger-Kasami)
 Works on: Chomsky Normal Form (CNF) grammars.
 Input: A sentence and a CNF grammar.

 Output: A boolean indicating whether the sentence is part of the language, or a parse
tree.
Steps:
1. Convert the grammar into CNF.
 Each production must be of the form A → BC or A → a.

2. Create a 2D table (chart) where T[i][j] stores all non-terminals that can generate the substring from position i to j.
3. Fill the table:

 For substrings of increasing length, check which non-terminals can derive the
substring based on grammar rules and smaller spans.
4. The sentence is valid if the start symbol S spans the entire table.

Complexity: O(n³ · |G|), where n is the length of the sentence and |G| is the size of the grammar.
Example: Grammar:
 S → NP VP
 NP → Det N
 VP → V NP
 Det → 'the'
 N → 'cat'
 V → 'chased'
Sentence: "the cat chased the cat"
CYK Table:

Start End Non-terminals

0 1 Det
1 2 N
2 3 V
3 4 Det
4 5 N
0 2 NP
3 5 NP
2 5 VP
0 5 S
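The table above can be computed with a short CYK recognizer. The sketch below (with illustrative names binary, lexical, and cyk_recognize) implements the recurrence for the CNF grammar of this example; the three nested loops over span length, start position, and split point give the O(n³ · |G|) complexity noted above.

from collections import defaultdict

# The CNF grammar from the example above.
binary = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
lexical = {"the": {"Det"}, "cat": {"N"}, "chased": {"V"}}

def cyk_recognize(words, start="S"):
    n = len(words)
    # table[(i, j)] = set of non-terminals that derive words[i:j]
    table = defaultdict(set)
    for i, w in enumerate(words):
        table[(i, i + 1)] = set(lexical.get(w, set()))
    for length in range(2, n + 1):              # span length
        for i in range(0, n - length + 1):      # span start
            j = i + length                       # span end
            for k in range(i + 1, j):            # split point
                for (b, c), heads in binary.items():
                    if b in table[(i, k)] and c in table[(k, j)]:
                        table[(i, j)] |= heads
    return start in table[(0, n)], table

ok, chart = cyk_recognize("the cat chased the cat".split())
print(ok)   # True: S spans the whole sentence, as in the table above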

2. Earley Parser
 Works on: All Context-Free Grammars (not limited to CNF).
 Input: A sentence and a CFG.
 Output: Parse tree(s) or a boolean indicating sentence validity.
Steps:
1. Maintain a chart with states (active or completed).
o States track the progress of parsing a rule (e.g., S → NP · VP).
2. Perform three operations:

o Prediction: Add new grammar rules to the chart when a non-terminal is


encountered.
o Scanning: Match the current word with a terminal symbol.
o Completion: Combine rules when all their components have been parsed.
3. The sentence is valid if the start symbol S spans the entire input.
Complexity:
 Best Case: O(n²).
 Worst Case: O(n³).

Probabilistic Dynamic Programming Parsing


Incorporates probabilities into the parsing process to resolve ambiguities by selecting the
most likely parse tree.

Probabilistic CYK Parser:


 Extends the CYK algorithm by associating probabilities with grammar rules.
 Each cell in the chart stores the most probable derivation for the span it represents.

Steps:
1. Use Probabilistic Context-Free Grammar (PCFG) rules.
2. Modify the CYK table to store both non-terminals and their probabilities.
3. Use dynamic programming to calculate the most probable parse for each span.
4. Backtrack to construct the most likely parse tree.
Output: The parse tree with the highest probability.

Shallow Parsing with Dynamic Programming


 Aims to extract phrases (e.g., noun phrases, verb phrases) without constructing a full
parse tree.
 Efficiently finds phrase boundaries using a partial chart.
 Useful for tasks like named entity recognition (NER) and information extraction.

Applications of Dynamic Programming Parsing


1. Machine Translation:
o Improves translation quality by accurately analyzing sentence structure.
2. Speech Recognition:
o Handles ambiguous transcriptions by evaluating all possible parses.
3. Semantic Parsing:
o Assists in mapping sentences to logical forms for question-answering systems.
4. Information Extraction:
o Identifies structured data like entities and relations.

Shallow parsing
Shallow Parsing (Chunking)
Definition
Shallow parsing, also known as chunking, is a process in natural language processing (NLP)
that identifies non-overlapping segments (or chunks) in a sentence. These chunks typically
correspond to noun phrases (NP), verb phrases (VP), prepositional phrases (PP), etc.
Shallow parsing does not involve constructing full syntactic trees but focuses on extracting
these phrases.

The goal of shallow parsing is to segment the sentence into its basic components, which are
useful for tasks like named entity recognition (NER), information extraction, and part-of-
speech tagging.

Key Concepts
1. Chunks: Phrases that are formed by grouping words together based on their syntactic
role in the sentence.
o Noun Phrases (NP): Grouping of nouns with their modifiers.
o Verb Phrases (VP): Grouping of verbs with their arguments.
o Prepositional Phrases (PP): Preposition with its object.
2. Chunking vs. Full Parsing:

o Chunking: Extracts phrases, does not need to generate complete syntactic


structures.

o Full Parsing: Builds a complete syntactic tree with all details about sentence
structure.

Shallow Parsing Approach


Shallow parsing can be done using different methods like rule-based, statistical models, and
machine learning approaches.

1. Rule-based chunking: Involves using regular expressions or context-free grammar


(CFG) rules to define how chunks should be identified.

2. Statistical chunking: Uses machine learning models trained on annotated corpora to


identify chunks.
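As a concrete example of the rule-based approach above, the sketch below uses NLTK's RegexpParser with a single NP chunk rule over POS tags. It assumes NLTK and its tokenizer/tagger data are installed; the exact tags, and therefore the chunks, depend on the tagger.

import nltk

# Rule-based NP chunking with a regular-expression grammar over POS tags.
sentence = "The quick brown fox jumps over the lazy dog"
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))

# One chunk rule: an optional determiner, any adjectives, then a noun.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>}")
print(chunker.parse(tagged))
# e.g. (NP The/DT quick/JJ brown/JJ fox/NN) ... (NP the/DT lazy/JJ dog/NN)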

Applications of Shallow Parsing


1. Named Entity Recognition (NER): Identifying entities like names, dates, and
locations from noun phrases.

2. Information Extraction: Extracting structured information from unstructured text


(e.g., extracting addresses, dates).
3. Machine Translation: Simplifying sentence structure before translation.

4. Speech Recognition: Chunking helps identify the main components of spoken


sentences.

Conclusion
Shallow parsing is a useful technique to extract key phrases from sentences, and it can be
performed efficiently using regular expressions or machine learning techniques like Hidden
Markov Models (HMMs) or Conditional Random Fields (CRFs) for more complex cases.
Probabilistic CFG
Probabilistic Context-Free Grammar (PCFG)
A Probabilistic Context-Free Grammar (PCFG) is an extension of the standard Context-
Free Grammar (CFG), where each production rule is associated with a probability. These
probabilities represent the likelihood of a particular rule being applied, making PCFGs
suitable for tasks like parsing where different possible derivations of a sentence may have
different probabilities, helping to choose the most likely parse tree.

PCFG Structure
A PCFG consists of:

1. Non-Terminals (Variables): Represent different parts of a sentence (e.g., NP for


noun phrase, VP for verb phrase).
2. Terminals: The actual words of the sentence.
3. Production Rules: A set of rules describing how non-terminals can be replaced by
other non-terminals or terminals.

4. Probabilities: Associated with each production rule to indicate how likely it is to be


applied.
A PCFG is defined as a 4-tuple:
 G = (N, Σ, P, S)
o N is the set of non-terminals
o Σ is the set of terminals
o P is the set of production rules, where each rule has an associated probability
o S is the start symbol

Each production rule A → α in P has an associated probability P(A → α), and the sum of the probabilities of all rules for a non-terminal A must be 1:
Σ_α P(A → α) = 1

Example of a PCFG
Here is an example of a simple PCFG:
 S → NP VP with probability 0.9
 S → VP with probability 0.1
 NP → Det Noun with probability 0.7
 NP → Noun with probability 0.3
 VP → Verb NP with probability 0.8
 VP → Verb with probability 0.2
 Det → 'the' with probability 1.0
 Noun → 'cat' with probability 0.6
 Noun → 'dog' with probability 0.4
 Verb → 'chased' with probability 1.0
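The same grammar can be written in NLTK's PCFG notation and parsed with its Viterbi (dynamic-programming) parser, assuming NLTK is installed; rule probabilities appear in square brackets.

import nltk

# The example PCFG above, in NLTK's notation.
pcfg = nltk.PCFG.fromstring("""
S -> NP VP [0.9]
S -> VP [0.1]
NP -> Det Noun [0.7]
NP -> Noun [0.3]
VP -> Verb NP [0.8]
VP -> Verb [0.2]
Det -> 'the' [1.0]
Noun -> 'cat' [0.6]
Noun -> 'dog' [0.4]
Verb -> 'chased' [1.0]
""")

parser = nltk.ViterbiParser(pcfg)          # dynamic-programming parser for PCFGs
for tree in parser.parse("the cat chased the dog".split()):
    print(tree)
    # product of the rule probabilities used:
    # 0.9 * 0.7 * 1.0 * 0.6 * 0.8 * 1.0 * 0.7 * 1.0 * 0.4 ≈ 0.0847
    print("probability:", tree.prob())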

Probabilistic Parsing
In probabilistic parsing, we compute the most probable parse tree for a given sentence by
using algorithms like CYK or Earley but modified to account for the probabilities associated
with each production rule.

Output Example:
In a probabilistic chart parser, each cell of the parse table stores, for every non-terminal, the probability of its best derivation over the corresponding span of words; the probability of the best complete parse is read from the cell that covers the entire sentence (the NLTK sketch above illustrates this for the example grammar).

Probabilistic Parsing Applications


1. Natural Language Understanding (NLU): PCFGs help in understanding the
structure of sentences, making it easier to extract meaning.

2. Machine Translation: PCFGs can be used in translation models to select the most
likely sentence structure when translating from one language to another.

3. Speech Recognition: PCFGs are used to decode spoken sentences, providing the
most probable syntactic structure for a given audio input.

Conclusion
Probabilistic Context-Free Grammars (PCFGs) extend standard CFGs by incorporating
probabilities, allowing parsers to choose the most likely structure for a sentence. The
Probabilistic CYK algorithm is a powerful method for parsing sentences using PCFGs, and
it is widely used in tasks like syntactic parsing, machine translation, and speech recognition.

Probabilistic CYK
Probabilistic CYK (Cocke-Younger-Kasami) Algorithm
The Probabilistic CYK (PCYK) algorithm is a dynamic programming-based parsing
algorithm used for parsing sentences with Probabilistic Context-Free Grammars (PCFGs).
It extends the CYK algorithm to incorporate probabilities, allowing the selection of the most
probable parse tree for a given sentence. The PCYK algorithm is commonly used in syntactic
parsing tasks where we need to identify the structure of a sentence and assign probabilities to
possible parse trees.

How It Works
The core idea of the CYK algorithm is to fill a triangular table where each entry
corresponds to a span of words in the sentence, and each cell contains the possible non-
terminal symbols that can generate that span. In the Probabilistic CYK algorithm, each
production rule is associated with a probability, and the table will store the highest
probability for each non-terminal symbol that can generate a specific span.

PCYK Parsing Steps


1. Initialization: The first step is to fill in the diagonal of the table with the terminal
rules, i.e., the parts of speech that correspond to the words in the sentence.

2. Filling the table: The algorithm then fills in the rest of the table using production
rules from the PCFG. For each span, the algorithm checks all possible ways of
splitting the span into two smaller spans, and updates the table with the highest
probability rule that can generate the span.

3. Final step: The final step is to check the start symbol's probability in the top-right
corner of the table. This gives the probability of the best parse tree for the sentence.
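A compact sketch of the PCYK recurrence is shown below for a small hypothetical CNF PCFG (all rules and probabilities are illustrative, not taken from a corpus); each cell keeps, for every non-terminal, the probability of its best derivation over that span.

# A minimal probabilistic CYK (Viterbi) sketch for a small CNF PCFG.
binary = {                                   # A -> B C rules with probabilities
    "S":  [("NP", "VP", 1.0)],
    "NP": [("Det", "N", 1.0)],
    "VP": [("V", "NP", 1.0)],
}
lexical = {                                  # A -> word rules with probabilities
    "the": {"Det": 1.0},
    "cat": {"N": 0.6},
    "dog": {"N": 0.4},
    "chased": {"V": 1.0},
}

def pcyk(words, start="S"):
    n = len(words)
    # best[i][j] maps a non-terminal to the probability of its best parse of words[i:j]
    best = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):            # initialise the diagonal with lexical rules
        best[i][i + 1] = dict(lexical.get(w, {}))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):        # try every split point
                for a, rules in binary.items():
                    for b, c, p in rules:
                        if b in best[i][k] and c in best[k][j]:
                            prob = p * best[i][k][b] * best[k][j][c]
                            if prob > best[i][j].get(a, 0.0):
                                best[i][j][a] = prob
    return best[0][n].get(start, 0.0)

print(pcyk("the cat chased the dog".split()))   # 0.24 = 0.6 (cat) * 0.4 (dog)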

Conclusion
The Probabilistic CYK (PCYK) algorithm is a powerful technique for probabilistic
parsing using Probabilistic Context-Free Grammars (PCFGs). It allows for efficient
parsing of sentences by selecting the most probable parse tree from a set of possible parses.
This approach is widely used in syntactic parsing, machine translation, and speech
recognition.

Probabilistic Lexicalized CFGs

A probabilistic lexicalized CFG extends a plain PCFG by annotating each non-terminal with the lexical head of the phrase it covers (e.g., VP(chased) → V(chased) NP(cat)), so that rule probabilities can be conditioned on the actual words involved. Conditioning on head words helps resolve attachment and coordination ambiguities that word-independent rule probabilities cannot, and this idea underlies classic lexicalized statistical parsers such as the Collins parser.

Feature Structures
Feature Structures are a representation framework used in computational linguistics to
capture syntactic, semantic, and morphological properties of linguistic objects such as words,
phrases, or sentences. They are widely used in grammar formalism such as HPSG (Head-
driven Phrase Structure Grammar) and LFG (Lexical Functional Grammar), and are
key in many unification-based grammars.
A feature structure is essentially a collection of attributes (features) and their associated
values, which can be simple values or more complex structures. It can represent various
linguistic properties like tense, number, gender, person, case, and so on.

Basic Components of Feature Structures


1. Features: Attributes or properties that describe a linguistic object (e.g., person,
number, tense, case).
2. Values: The value associated with a feature. The value could be atomic (e.g., singular,
plural, past, present), or it could be another feature structure (recursive feature
structures).

3. Unification: The process of merging two feature structures into one, by combining
their features and values, if compatible.

Example of a Feature Structure


For example, a noun phrase (NP) could be represented with the following feature structure:
NP
├── number: singular
├── gender: masculine
├── case: nominative
This feature structure represents a noun phrase with the following properties:
 The number is singular.
 The gender is masculine.
 The case is nominative.

Unification of Feature Structures


Unification is the key operation in feature structure grammars, where two feature structures
are combined into one if they are compatible. Unification occurs when the feature structures
have the same features, and their corresponding values match. If the values are incompatible
(e.g., a noun marked as both singular and plural), unification fails.

Example of Unification
If we have two feature structures:
1. A feature structure for a noun:
Noun
├── number: singular
└── gender: masculine
2. A feature structure for a verb:
Verb
├── number: singular
└── tense: present
When unifying these two structures, we get:
Unified Structure
├── number: singular
├── gender: masculine
└── tense: present

The unification is successful because the number matches in both structures, and the tense is
added from the verb structure.

However, if we attempted to unify a noun with number: singular and number: plural,
unification would fail due to the conflicting values for the number feature.
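NLTK provides a ready-made implementation of feature structures and unification (assuming NLTK is installed), which mirrors the example above: unification either merges the features or returns None on conflict.

from nltk import FeatStruct

noun = FeatStruct(number="singular", gender="masculine")
verb = FeatStruct(number="singular", tense="present")

print(noun.unify(verb))
# [ gender = 'masculine', number = 'singular', tense = 'present' ]

conflict = FeatStruct(number="plural")
print(noun.unify(conflict))   # None -- unification fails on the number feature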

Applications of Feature Structures


1. Syntactic Parsing: Feature structures are used to represent the syntactic properties of
words, phrases, and sentences in parsing algorithms. For example, in HPSG, syntactic
rules are expressed in terms of feature structures.

2. Morphological Analysis: Feature structures are used to represent morphological


properties of words (e.g., root, tense, aspect, person, number).

3. Semantic Interpretation: Feature structures can also encode semantic properties,


allowing the construction of meaning representations during the parsing process.

4. Constraint-based Grammar Frameworks: Feature structures form the basis for


constraint-based grammar frameworks, which rely on unification to enforce
grammatical constraints.

5. Machine Translation: Feature structures are often used in interlingual representations in machine translation systems.

Conclusion

Feature structures are a powerful tool in computational linguistics for representing and
manipulating syntactic, semantic, and morphological information. By using unification to
combine feature structures, we can model complex linguistic phenomena in a structured and
efficient manner. Feature structures form the foundation of many advanced grammar
frameworks and parsing algorithms.
Feature Structures
Feature Structures are a powerful tool used in computational linguistics and formal
grammar frameworks, especially in unification-based grammars such as Head-driven
Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), and
Combinatory Categorial Grammar (CCG). A feature structure is a representation that
consists of features (attributes) and their corresponding values, which describe properties of
linguistic objects like words, phrases, or even entire sentences.

Feature structures allow for the representation of syntactic, semantic, and morphological
properties of linguistic elements in a modular and flexible way. They are particularly useful
in parsing, morphological analysis, and semantic interpretation.

Basic Components of Feature Structures


1. Features: These are the properties or attributes that describe the object. Examples
include number, gender, person, tense, case, aspect, voice, etc.
2. Values: The value associated with each feature can be atomic (such as singular,
plural, male, female, present, past) or another feature structure, creating a hierarchical
structure.

3. Unification: Unification is the operation used to combine two feature structures. If


the structures are compatible (i.e., no conflicting features), unification merges them
into a single, unified structure.

Example of a Feature Structure


Let's consider an example of a noun phrase (NP):
NP
├── number: singular
├── gender: masculine
├── case: nominative
This feature structure represents a noun phrase with the following properties:
 The number is singular.
 The gender is masculine.
 The case is nominative.

Unification of Feature Structures


Unification is the process of combining two feature structures. When unifying, the feature
structures must be compatible; if there is a conflict (e.g., two structures have the same feature
with different values), the unification will fail.

Example of Unification
1. A feature structure for a noun:
Noun
├── number: singular
├── gender: masculine
2. A feature structure for a verb:
Verb
├── number: singular
├── tense: present
When we try to unify these two structures, we get:
Unified Structure
├── number: singular
├── gender: masculine
├── tense: present

This is because the number feature is the same in both the noun and verb structures, so they
can be unified. The gender and tense features are simply added from their respective
structures.

However, if we attempt to unify a noun with number: singular and number: plural, unification
will fail because of the conflicting number values.

Applications of Feature Structures


Feature structures are used in various computational linguistics tasks, including:

1. Syntactic Parsing: Feature structures can represent syntactic properties like


agreement, constituency, and subcategorization. For example, in HPSG, syntax is
described in terms of feature structures and their unification.
2. Morphological Analysis: Words can be analyzed in terms of their morphological
features, such as tense, aspect, person, number, case, etc.

3. Semantic Interpretation: Feature structures can also capture semantic properties,


allowing them to be used for constructing meaning representations in semantic
parsing.

4. Machine Translation: Feature structures are often used to represent intermediate


linguistic forms in machine translation systems.

5. Constraint-based Grammar Frameworks: Many grammar frameworks use feature


structures to enforce linguistic constraints, ensuring that the generated sentences
conform to grammatical rules.
Conclusion
Feature structures are an essential tool in formal grammar frameworks and are widely used in
syntactic parsing, morphological analysis, and semantic processing. They allow for the
flexible and modular representation of linguistic properties and provide a powerful
mechanism for enforcing linguistic constraints through unification. Feature structures are
particularly useful in unification-based grammar frameworks such as HPSG and LFG, as
well as in constraint-based parsing and machine translation systems.

Unification of Feature Structures


Unification is a key operation in computational linguistics, especially within frameworks that
use feature structures to represent syntactic, semantic, and morphological properties. It is
the process of combining two or more feature structures into one, provided that they are
compatible (i.e., their features can be unified without conflict). The idea behind unification is
that if two structures have the same features, the values of those features should be
compatible; if they differ, the unification fails.

Unification is commonly used in unification-based grammars like Head-driven Phrase


Structure Grammar (HPSG), Lexical Functional Grammar (LFG), and other constraint-
based grammars. Unification ensures that the rules of the grammar are adhered to by
checking for consistency in features and values.

Basic Principles of Unification


1. Compatible Features: Two feature structures can be unified if they have the same
feature and the values for those features are either the same or compatible. For
example, if both structures have the feature number: singular, they can be unified.

2. Atomic vs. Structured Values: Feature values can be atomic (e.g., singular, plural,
masculine, feminine) or they can themselves be feature structures. When the value is
another feature structure, unification involves recursively unifying the two feature
structures.
3. Failure of Unification: If there is a conflict in feature values (e.g., two feature
structures that both have number: singular and number: plural), unification will fail.

4. Unification Operation: The unification operation merges two feature structures,


adding features from both if they are compatible. If there is a conflict, the unification
fails.
Unification Example
Let's consider two feature structures: one for a noun and one for a verb.

Example 1: Unifying Compatible Feature Structures


1. A feature structure for a noun:
Noun
├── number: singular
├── gender: masculine
2. A feature structure for a verb:
Verb
├── number: singular
├── tense: present
When these two feature structures are unified, we get:
Unified Structure
├── number: singular
├── gender: masculine
├── tense: present
Since both feature structures have a matching value for number: singular, they can be unified
successfully. The resulting unified structure combines features from both structures.

Example 2: Unification Fails Due to Conflicting Features


1. A feature structure for a noun:
Noun
├── number: singular
├── gender: masculine
2. A feature structure for a noun with conflicting number:
Noun
├── number: plural
├── gender: feminine

In this case, unification will fail because the feature number has conflicting values: singular
and plural. Since feature structures cannot have contradictory values, unification cannot
proceed.
Recursive Unification
Feature values can themselves be feature structures, which means unification must be
recursive. This allows us to model more complex linguistic structures, such as nested phrases.

Example of Recursive Unification


1. A feature structure for a noun phrase (NP):
NP
├── number: singular
├── case: nominative
2. A feature structure for a verb phrase (VP):
VP
├── number: singular
├── tense: present
3. The overall sentence structure (S):
S
├── subject: NP
├── predicate: VP
When we unify these structures, the result is:
Unified Structure
├── subject:
│ ├── number: singular
│ ├── case: nominative
├── predicate:
│ ├── number: singular
│ ├── tense: present

Here, the unification happens recursively, unifying the subject (NP) and predicate (VP)
feature structures, keeping their internal features intact.
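A minimal recursive unification sketch over feature structures represented as nested Python dictionaries is shown below; unify is an illustrative helper, and atomic values are plain strings.

# Recursive unification of feature structures as nested dictionaries.
def unify(fs1, fs2):
    """Return the unification of fs1 and fs2, or None if they conflict."""
    if isinstance(fs1, dict) and isinstance(fs2, dict):
        result = dict(fs1)
        for feature, value in fs2.items():
            if feature in result:
                merged = unify(result[feature], value)    # recurse into shared features
                if merged is None:
                    return None                            # conflicting values
                result[feature] = merged
            else:
                result[feature] = value                    # feature only in fs2
        return result
    return fs1 if fs1 == fs2 else None                     # atomic values must match

np = {"number": "singular", "case": "nominative"}
vp = {"number": "singular", "tense": "present"}
print(unify({"subject": np}, {"subject": {"number": "singular"}, "predicate": vp}))
# {'subject': {'number': 'singular', 'case': 'nominative'}, 'predicate': {...}}
print(unify({"number": "singular"}, {"number": "plural"}))   # None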

Applications of Unification
1. Syntactic Parsing: Unification is central in parsers based on unification grammars
(like HPSG), where syntactic structures are represented by feature structures.
2. Morphological Analysis: Unification can model the relationship between a root word
and its inflections (e.g., verb tense or noun number).
3. Semantic Parsing: Feature structures are often used in compositional semantics,
where syntactic feature structures are unified with semantic feature structures.

4. Machine Translation: In some systems, interlingual representations use feature


structures to unify syntactic, semantic, and lexical information across languages.

5. Constraint-Based Grammatical Frameworks: Feature structure unification ensures


that the constraints of a grammar are satisfied by checking the compatibility of
features during parsing.

Conclusion
Unification of feature structures is a critical operation in unification-based grammar
frameworks, and it plays a vital role in syntactic parsing, morphological analysis, and
semantic interpretation. Unification ensures that linguistic structures are consistent and
compatible, allowing for the construction of grammatically valid sentences. By combining
feature structures recursively, complex hierarchical structures can be represented and
processed.
