Unit III - NLP
Syntactic Analysis
Syntactic analysis is about understanding the structure of sentences and figuring out how
words fit together according to grammar rules. It’s like building a sentence puzzle to see how
everything connects.
Components of CFG
A CFG is defined as a 4-tuple G = (N, T, P, S), where:
1. N (Non-terminals): A finite set of symbols representing syntactic categories (e.g., S,
NP, VP).
2. T (Terminals): A finite set of symbols representing the actual words or tokens in the
language (e.g., dog, run).
3. P (Production Rules): A set of rules of the form A → α, where A is a non-terminal in
N and α is a string of terminals and/or non-terminals.
4. S (Start Symbol): A special non-terminal from N that represents the start of the
derivation process.
VP → V NP | V
Derivation Example:
Start with S.
S → NP VP
NP → Det N
Expanding the remaining non-terminals with lexical rules (e.g., Det → the, N → dog, V → runs) eventually yields a string of terminals, i.e., a complete sentence.
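To make this concrete, here is a minimal sketch using NLTK (assuming the nltk package is installed). The lexical rules Det → 'the', N → 'dog', V → 'runs' are illustrative additions, not part of the grammar fragment above.

import nltk
from nltk.parse.generate import generate

# The example grammar above, extended with illustrative lexical rules.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP | V
Det -> 'the'
N -> 'dog'
V -> 'runs'
""")

# Enumerate a few strings the grammar generates (its generative property).
for words in generate(grammar, n=5):
    print(' '.join(words))

# Parse one string back into a tree, mirroring the derivation sketched above.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog runs".split()):
    print(tree)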
Properties of CFGs
1. Generative: CFGs can generate all valid strings in a language and reject invalid ones.
2. Recursive: CFGs can define recursive structures, allowing representation of nested
constructs (e.g., "The cat that chased the dog saw the bird").
3. Ambiguity: A CFG is ambiguous if a string can have more than one valid parse tree.
Applications of CFGs
1. Natural Language Processing (NLP): CFGs model the syntax of languages for
parsing and understanding.
2. Programming Languages: Used in compilers to define the syntax of programming
languages.
3. Formal Languages: Defines languages in automata theory.
Limitations
CFGs cannot handle context-sensitive constructs, such as agreement in number or
gender (e.g., "he runs" vs. "they run").
They may struggle with long-distance dependencies in natural languages (e.g., "The
book that John read was interesting").
Grammar rules for English
Grammar Rules for English (using Context-Free Grammar)
English grammar can be modeled with a Context-Free Grammar (CFG) to describe how
sentences are structured. Below are commonly used grammar rules for English, written in
CFG notation.
NP → N
(Just a noun.)
9. Conjunctions
Conj → "and" | "or" | "but"
Example: A parse of the sentence "the giraffe dreams" is:
S => NP VP => Det N VP => the N VP => the giraffe VP => the giraffe IV => the giraffe dreams
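As a sketch, the same parse can be reproduced automatically with NLTK's chart parser (the category IV stands for an intransitive verb, matching the derivation above; the grammar here is a hypothetical toy fragment):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> IV
Det -> 'the'
N -> 'giraffe'
IV -> 'dreams'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the giraffe dreams".split()):
    print(tree)   # (S (NP (Det the) (N giraffe)) (VP (IV dreams)))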
Additional Notes
These grammar rules are simplified and idealized; real English involves exceptions,
ambiguities, and context-sensitive features not captured by CFGs.
Probabilistic CFGs (PCFGs) are often used to resolve ambiguities in natural English
sentences.
Treebanks
Definition
A Treebank is a linguistic resource that consists of a corpus of sentences annotated with
their syntactic structures, typically represented as parse trees. These parse trees are based on
a particular grammar (such as Context-Free Grammar) and describe the syntactic
relationships between words in the sentences.
Treebanks are used extensively in Natural Language Processing (NLP) for training and
evaluating syntactic parsers.
↳ dog (subject)
↳ cat (object)
o Treebanks often include POS tags, which are essential for various downstream
NLP tasks.
3. Language Modeling:
o Treebanks help in developing structured language models that consider syntax.
4. Linguistic Research:
o Treebanks provide empirical data for studying syntactic phenomena.
Example
Sentence: "The quick brown fox jumps over the lazy dog."
1. Constituency Tree:
(S
  (NP (Det The) (Adj quick) (Adj brown) (N fox))
  (VP (V jumps)
    (PP (P over)
      (NP (Det the) (Adj lazy) (N dog)))))
2. Dependency Tree:
jumps
  ↳ fox (subject)
  ↳ over (prepositional object)
    ↳ dog (object of preposition)
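Bracketed treebank annotations like the constituency tree above can be loaded and inspected programmatically; a brief sketch with NLTK's Tree class:

import nltk

tree = nltk.Tree.fromstring(
    "(S (NP (Det The) (Adj quick) (Adj brown) (N fox))"
    " (VP (V jumps) (PP (P over) (NP (Det the) (Adj lazy) (N dog)))))"
)

print(tree.pos())      # (word, POS) pairs recovered from the annotation
print(tree.height())   # depth of the parse tree
tree.pretty_print()    # ASCII rendering of the constituency structure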
Chomsky Normal Form (CNF)
A CFG is in Chomsky Normal Form if every production has one of these forms:
1. A → B C (exactly two non-terminals), or
2. A → a (a single terminal), and
3. The grammar does not allow ε-productions (productions of the form A → ε), except
for the start symbol.
Example
Original Grammar:
S → aA | a
A → aA | a
Converted to CNF:
S → X A | a
A → X A | a
X → a
Applications:
Used in CYK Parsing (Cocke–Younger–Kasami algorithm) for CFG parsing.
Simplifies proofs and formal analysis.
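As a quick sanity check on the conversion, the sketch below tests whether every production is in one of the two CNF shapes (A → B C or A → a). The rule encoding, uppercase strings for non-terminals and lowercase for terminals, is just an illustrative convention.

def is_cnf(rules):
    """rules maps each non-terminal to a list of right-hand sides (tuples of symbols)."""
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            if len(rhs) == 1 and rhs[0].islower():                 # A -> a
                continue
            if len(rhs) == 2 and all(s.isupper() for s in rhs):    # A -> B C
                continue
            return False
    return True

original = {"S": [("a", "A"), ("a",)], "A": [("a", "A"), ("a",)]}
converted = {"S": [("X", "A"), ("a",)], "A": [("X", "A"), ("a",)], "X": [("a",)]}

print(is_cnf(original))    # False: S -> aA mixes a terminal with a non-terminal
print(is_cnf(converted))   # True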
Greibach Normal Form (GNF)
A CFG is in GNF if every production has the form A → aα, where a is a terminal and α is a
(possibly empty) string of non-terminals.
Example
Original Grammar:
S → AB
A → a
B → b
Converted to GNF:
S → aB
A → a
B → b
Applications:
GNF ensures that parsing can begin with a terminal symbol, making it suitable for
top-down parsing algorithms like recursive descent.
Kuroda Normal Form
A normal form for non-context-free grammars in which every production has one of a few
restricted shapes (e.g., AB → CD, A → BC, A → a).
Applications:
Kuroda Normal Form is rarely used in practice but helps in the theoretical analysis of
unrestricted grammars.
Backus–Naur Form (BNF)
A notation for writing context-free grammar rules, using <angle-bracketed> non-terminals
and ::= for productions.
Example
For a programming language:
<statement> ::= <if-statement> | <while-statement> | <assignment>
Applications:
Widely used in language specifications and compiler design.
Extended Backus–Naur Form (EBNF)
Extends BNF with additional constructs such as optional parts [ ... ] and repetition { ... }.
Example
For a programming language:
<if-statement> ::= "if" <condition> "then" <statement> ["else" <statement>]
Applications:
Used in language design and compiler construction.
Dependency Grammar
Definition
Dependency Grammar (DG) is a syntactic theory that emphasizes the relationships between
words in a sentence, known as dependencies. In this framework:
Words (or tokens) are connected by directed links, where one word (the head)
governs or determines another word (the dependent).
The syntactic structure of a sentence is represented as a dependency tree.
Key Concepts
1. Head: The central word in a dependency relationship, which determines the
grammatical function of the dependent.
o Example: In "The cat sleeps," the verb "sleeps" is the head of the sentence.
2. Dependent: A word that is grammatically subordinate to the head.
Dependency Relations
Dependency relations encode syntactic and grammatical roles such as:
1. Subject: The doer of an action (e.g., "She runs.").
2. Object: The receiver of an action (e.g., "He reads a book.").
3. Modifier: A word that describes or adds information about another word (e.g., "The
big cat sleeps.").
4. Prepositional Relation: Links prepositions to their objects (e.g., "on the table").
Example: Dependency Tree
Sentence: "The quick brown fox jumps over the lazy dog."
Dependency Tree Representation:
Root: jumps
o Subject: fox
Modifiers: quick, brown
o Prepositional Modifier: over
Object of Preposition: dog
Modifiers: the, lazy
Graphically:
            jumps
           /      \
        fox        over
       /    \         \
   quick   brown      dog
                     /    \
                  the     lazy
4. Practical Utility: Widely used in modern Natural Language Processing (NLP) for
tasks like syntactic parsing and semantic role labeling.
Dependency Parsing
Dependency parsing is the process of analyzing a sentence to produce its dependency tree.
Parsing can be done using:
1. Rule-Based Parsing: Based on predefined grammatical rules.
2. Machine Learning Approaches:
Transition-based parsing: Builds the tree incrementally.
Graph-based parsing: Finds the best tree by maximizing the global score of
dependencies.
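In practice, dependency parsing is usually done with a pretrained statistical parser. A brief sketch with spaCy (this assumes the spacy package and its small English model en_core_web_sm are installed; spaCy's relation labels such as nsubj and pobj differ slightly from the informal labels used above):

import spacy

nlp = spacy.load("en_core_web_sm")    # pretrained, transition-based dependency parser
doc = nlp("The quick brown fox jumps over the lazy dog.")

# Each token is linked to its head by a labelled dependency relation.
for token in doc:
    print(f"{token.text:10} <--{token.dep_:10}-- {token.head.text}")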
2. Machine Translation: Ensures syntactic and semantic coherence between source and
target languages.
Key Components
1. Grammar: Defines the rules for valid sentence structures.
o Example: S → NP VP (A sentence is a noun phrase followed by a verb phrase).
2. Parse Tree: A hierarchical representation of the syntactic structure.
3. Parsing Algorithm: The method used to generate the parse tree.
1. Top-Down Parsing
Starts from the root (start symbol) and expands grammar rules toward the words.
Example: Recursive Descent Parsing.
2. Bottom-Up Parsing
Starts from the leaves (words) and works up to the root.
Example: Shift-Reduce Parsing.
Pros: Efficient for many practical applications.
Cons: Can require complex grammar transformations.
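To contrast the two strategies, here is a small sketch using NLTK's built-in top-down and bottom-up parsers on a toy grammar (similar to the grammar used in the CYK example later in these notes):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'cat' | 'dog'
V -> 'chased'
""")
sentence = "the cat chased the dog".split()

# Top-down: expands rules from S down toward the words.
for tree in nltk.RecursiveDescentParser(grammar).parse(sentence):
    print(tree)

# Bottom-up: shift-reduce, building constituents from the words upward.
for tree in nltk.ShiftReduceParser(grammar).parse(sentence):
    print(tree)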
Ambiguity in Parsing
Ambiguity arises when a sentence can be parsed in multiple ways:
1. Lexical Ambiguity: Words with multiple meanings (e.g., "bank").
2. Syntactic Ambiguity: Multiple valid parse trees for a sentence.
o Example: "The boy saw the man with the telescope."
Did the boy use the telescope?
Or did the man have the telescope?
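The two readings correspond to two distinct parse trees. The sketch below uses a small hypothetical grammar that allows the prepositional phrase to attach either to the verb phrase or to the noun phrase, so NLTK's chart parser returns both trees:

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | NP PP
VP -> V NP | VP PP
PP -> P NP
Det -> 'the'
N -> 'boy' | 'man' | 'telescope'
V -> 'saw'
P -> 'with'
""")

parser = nltk.ChartParser(grammar)
sentence = "the boy saw the man with the telescope".split()

# Two trees come back: PP attached to the VP (the boy used the telescope)
# versus PP attached to the object NP (the man had the telescope).
for tree in parser.parse(sentence):
    print(tree)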
Disambiguation Techniques:
Probabilistic grammars (PCFGs).
Semantic analysis.
Contextual information.
Types of Ambiguity
1. Lexical Ambiguity:
A single word has multiple meanings or parts of speech.
Example:
Sentence: “I saw a bat.”
Ambiguity: Does "bat" mean an animal or a sports instrument?
2. Syntactic Ambiguity:
A sentence can have multiple valid parse trees due to different grammatical
interpretations.
Example:
Sentence: “The boy saw the man with the telescope.”
Ambiguity:
Interpretation 1: The boy used the telescope to see the man.
Interpretation 2: The man had the telescope.
3. Semantic Ambiguity:
The meaning of the sentence is unclear even after resolving lexical and
syntactic ambiguities.
Example:
Sentence: “Visiting relatives can be annoying.”
Ambiguity:
Interpretation 1: The act of visiting relatives is annoying.
Interpretation 2: Relatives who are visiting are annoying.
4. Pragmatic Ambiguity:
The context of the sentence leads to multiple interpretations.
Example:
Sentence: “Can you pass the salt?”
Ambiguity: Is it a request or a question about ability?
5. Structural Ambiguity:
Arises from multiple ways to group words or phrases in a sentence.
Example:
Sentence: “Old men and women.”
Ambiguity:
Interpretation 1: Both men and women are old.
Interpretation 2: Only the men are old, and the women are not.
Causes of Ambiguity
1. Polysemy: Words with multiple meanings.
2. Homonymy: Words that sound or look the same but have different meanings.
3. Grammatical Flexibility: Flexible word order in languages.
4. Contextual Variability: Lack of clear context in the sentence.
5. Complex Sentence Structures: Use of nested or compound clauses.
Ambiguity in NLP
Ambiguity is a significant challenge in natural language processing tasks like syntactic
parsing, semantic analysis, and machine translation.
Disambiguation Techniques
1. Probabilistic Models:
Use probabilistic grammars like Probabilistic Context-Free Grammar
(PCFG) to rank parse trees based on likelihood.
Example: Viterbi algorithm for finding the most probable parse.
2. Contextual Information:
Consider the surrounding words and phrases to resolve ambiguity.
Example: Using word embeddings like Word2Vec or BERT.
3. Semantic Analysis:
Use semantic rules to filter out implausible interpretations.
Example: Semantic role labeling to determine roles like subject, object, etc.
4. Pragmatic and World Knowledge:
Incorporate real-world knowledge to interpret ambiguous sentences.
Example: “She is in the bank.” (Financial institution or riverbank?)
5. Supervised Learning:
o Train models on annotated corpora (e.g., treebanks) to choose the correct parse.
Parsing Algorithms
Chart parsers such as CYK and Earley rely on three key ideas:
1. Chart (Parse Table):
o Each cell in the chart represents a span of the sentence and contains possible
parses for that span.
2. Dynamic Programming:
o Breaks the problem into smaller subproblems.
o Reuses results of subproblems to construct solutions for larger spans.
3. Grammar Formalism:
o The grammar is a CFG, often required to be in Chomsky Normal Form (as in CYK).
1. CYK Parser
Works on: CFGs in Chomsky Normal Form (CNF).
Input: A sentence and a CFG in CNF.
Output: A boolean indicating whether the sentence is part of the language, or a parse
tree.
Steps:
1. Convert the grammar into CNF.
2. Create a 2D table (chart) where T[i][j] stores all non-terminals that can generate
the substring from position i to j.
3. Fill the table:
For substrings of increasing length, check which non-terminals can derive the
substring based on grammar rules and smaller spans.
4. The sentence is valid if the start symbol S spans the entire table.
Complexity: O(n³ · |G|), where n is the length of the sentence and |G| is the size of the
grammar.
Example: Grammar:
S → NP VP
NP → Det N
VP → V NP
Det → 'the'
N → 'cat'
V → 'chased'
Sentence: “the cat chased”
CYK Table:
Span [0,0] "the" → {Det}; [1,1] "cat" → {N}; [2,2] "chased" → {V}
Span [0,1] "the cat" → {NP}; [1,2] "cat chased" → {}
Span [0,2] "the cat chased" → {} (no S: VP → V NP needs an NP after the verb, so this
string is not a complete sentence under the grammar)
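A minimal sketch of the CYK recognizer in Python, using the grammar above (already in CNF). Printing the chart reproduces the table: under this toy grammar, "the cat chased" has no complete S.

from collections import defaultdict

# CNF grammar: binary rules A -> B C and lexical rules A -> word.
binary_rules = {("NP", "VP"): "S", ("Det", "N"): "NP", ("V", "NP"): "VP"}
lexical_rules = {"the": {"Det"}, "cat": {"N"}, "chased": {"V"}}

def cyk(words):
    n = len(words)
    table = defaultdict(set)           # table[(i, j)] = non-terminals covering words[i..j]
    for i, w in enumerate(words):      # length-1 spans come from the lexicon
        table[(i, i)] |= lexical_rules.get(w, set())
    for length in range(2, n + 1):     # longer spans are built from binary rules
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):      # every possible split point
                for b in table[(i, k)]:
                    for c in table[(k + 1, j)]:
                        if (b, c) in binary_rules:
                            table[(i, j)].add(binary_rules[(b, c)])
    return table

chart = cyk("the cat chased".split())
for span in sorted(chart):
    print(span, chart[span])
print("Complete sentence:", "S" in chart[(0, 2)])   # False for this example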
2. Earley Parser
Works on: All Context-Free Grammars (not limited to CNF).
Input: A sentence and a CFG.
Output: Parse tree(s) or a boolean indicating sentence validity.
Steps:
1. Maintain a chart with states (active or completed).
2. Process the sentence left to right, applying prediction, scanning, and completion
operations to the states.
3. The sentence is valid if a completed state for the start symbol S spans the entire input.
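NLTK provides an Earley-style chart parser that works directly on a CFG, with no CNF conversion required; a brief sketch (the toy lexicon is illustrative):

import nltk
from nltk.parse import EarleyChartParser

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP | V
Det -> 'the'
N -> 'cat' | 'dog'
V -> 'chased' | 'slept'
""")

parser = EarleyChartParser(grammar)
for tree in parser.parse("the cat chased the dog".split()):
    print(tree)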
3. Probabilistic CYK Parser
Works on: Probabilistic Context-Free Grammars (PCFGs) in CNF.
Steps:
1. Use Probabilistic Context-Free Grammar (PCFG) rules.
2. Modify the CYK table to store both non-terminals and their probabilities.
3. Use dynamic programming to calculate the most probable parse for each span.
4. Backtrack to construct the most likely parse tree.
Output: The parse tree with the highest probability.
Shallow parsing
Shallow Parsing (Chunking)
Definition
Shallow parsing, also known as chunking, is a process in natural language processing (NLP)
that identifies non-overlapping segments (or chunks) in a sentence. These chunks typically
correspond to noun phrases (NP), verb phrases (VP), prepositional phrases (PP), etc.
Shallow parsing does not involve constructing full syntactic trees but focuses on extracting
these phrases.
The goal of shallow parsing is to segment the sentence into its basic components, which are
useful for tasks like named entity recognition (NER), information extraction, and part-of-
speech tagging.
Key Concepts
1. Chunks: Phrases that are formed by grouping words together based on their syntactic
role in the sentence.
o Noun Phrases (NP): Grouping of nouns with their modifiers.
o Verb Phrases (VP): Grouping of verbs with their arguments.
o Prepositional Phrases (PP): Preposition with its object.
2. Chunking vs. Full Parsing:
o Full Parsing: Builds a complete syntactic tree with all details about sentence
structure.
o Shallow Parsing: Identifies only flat, non-overlapping chunks (NP, VP, PP, etc.)
without building the full tree.
Conclusion
Shallow parsing is a useful technique to extract key phrases from sentences, and it can be
performed efficiently using regular expressions or machine learning techniques like Hidden
Markov Models (HMMs) or Conditional Random Fields (CRFs) for more complex cases.
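A minimal chunking sketch with NLTK's regular-expression chunker. The POS tags are written by hand to keep the example self-contained; in practice they would come from a tagger, and the chunk patterns shown are only one possible choice.

import nltk

# (word, POS-tag) pairs for "The quick brown fox jumps over the lazy dog."
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"),
          ("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]

# NP: optional determiner, any adjectives, then a noun;
# PP: preposition followed by an NP; VP: verb followed by NPs/PPs.
chunk_grammar = r"""
  NP: {<DT>?<JJ>*<NN>}
  PP: {<IN><NP>}
  VP: {<VB.*><NP|PP>*}
"""

chunker = nltk.RegexpParser(chunk_grammar)
print(chunker.parse(tagged))   # shallow tree containing NP, PP, and VP chunks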
Probabilistic CFG
Probabilistic Context-Free Grammar (PCFG)
A Probabilistic Context-Free Grammar (PCFG) is an extension of the standard Context-
Free Grammar (CFG), where each production rule is associated with a probability. These
probabilities represent the likelihood of a particular rule being applied, making PCFGs
suitable for tasks like parsing where different possible derivations of a sentence may have
different probabilities, helping to choose the most likely parse tree.
PCFG Structure
A PCFG consists of:
The same components as a CFG: non-terminals, terminals, production rules, and a
start symbol.
A probability attached to each production rule, such that the probabilities of all rules
with the same left-hand side sum to 1.
Example of a PCFG
Here is an example of a simple PCFG:
S → NP VP with probability 0.9
S → VP with probability 0.1
NP → Det Noun with probability 0.7
NP → Noun with probability 0.3
VP → Verb NP with probability 0.8
VP → Verb with probability 0.2
Det → 'the' with probability 1.0
Noun → 'cat' with probability 0.6
Noun → 'dog' with probability 0.4
Verb → 'chased' with probability 1.0
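The same PCFG can be written down and used directly in NLTK; a brief sketch (ViterbiParser is a dynamic-programming parser that returns the most probable tree, annotated with its probability):

import nltk

pcfg = nltk.PCFG.fromstring("""
S -> NP VP [0.9]
S -> VP [0.1]
NP -> Det Noun [0.7]
NP -> Noun [0.3]
VP -> Verb NP [0.8]
VP -> Verb [0.2]
Det -> 'the' [1.0]
Noun -> 'cat' [0.6]
Noun -> 'dog' [0.4]
Verb -> 'chased' [1.0]
""")

parser = nltk.ViterbiParser(pcfg)
for tree in parser.parse("the cat chased the dog".split()):
    print(tree)          # most probable parse
    print(tree.prob())   # its probability (the product of the rule probabilities used)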
Probabilistic Parsing
In probabilistic parsing, we compute the most probable parse tree for a given sentence by
using algorithms like CYK or Earley but modified to account for the probabilities associated
with each production rule.
Output Example:
Parse Table: [[{}, {'S': 0.9}, {}], [{'NP': 0.63}, {'S': 0.81, 'VP': 0.72}, {'S': 0.63}], [{}, {},
{}]]
Best Parse Probability: 0.9
Applications of PCFGs
1. Syntactic Parsing: PCFGs rank the alternative parse trees of a sentence and select
the most probable one.
2. Machine Translation: PCFGs can be used in translation models to select the most
likely sentence structure when translating from one language to another.
3. Speech Recognition: PCFGs are used to decode spoken sentences, providing the
most probable syntactic structure for a given audio input.
Conclusion
Probabilistic Context-Free Grammars (PCFGs) extend standard CFGs by incorporating
probabilities, allowing parsers to choose the most likely structure for a sentence. The
Probabilistic CYK algorithm is a powerful method for parsing sentences using PCFGs, and
it is widely used in tasks like syntactic parsing, machine translation, and speech recognition.
Probabilistic CYK
Probabilistic CYK (Cocke-Younger-Kasami) Algorithm
The Probabilistic CYK (PCYK) algorithm is a dynamic programming-based parsing
algorithm used for parsing sentences with Probabilistic Context-Free Grammars (PCFGs).
It extends the CYK algorithm to incorporate probabilities, allowing the selection of the most
probable parse tree for a given sentence. The PCYK algorithm is commonly used in syntactic
parsing tasks where we need to identify the structure of a sentence and assign probabilities to
possible parse trees.
How It Works
The core idea of the CYK algorithm is to fill a triangular table where each entry
corresponds to a span of words in the sentence, and each cell contains the possible non-
terminal symbols that can generate that span. In the Probabilistic CYK algorithm, each
production rule is associated with a probability, and the table will store the highest
probability for each non-terminal symbol that can generate a specific span.
1. Initialization: Fill each length-1 cell with the non-terminals that can produce the
corresponding word, together with the probability of that lexical rule.
2. Filling the table: The algorithm then fills in the rest of the table using production
rules from the PCFG. For each span, the algorithm checks all possible ways of
splitting the span into two smaller spans, and updates the table with the highest-
probability rule that can generate the span.
3. Final step: The final step is to check the start symbol's probability in the top-right
corner of the table. This gives the probability of the best parse tree for the sentence.
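A compact sketch of the probabilistic CYK fill in Python. To stay in CNF, a reduced version of the PCFG from the previous section is used (unary rules such as S → VP are dropped), so the numbers are illustrative only.

from collections import defaultdict

# CNF PCFG: lexical rules (A -> word) and binary rules (A -> B C), each with a probability.
lexical = {"the": {"Det": 1.0}, "cat": {"Noun": 0.6}, "dog": {"Noun": 0.4}, "chased": {"Verb": 1.0}}
binary = {("NP", "VP"): ("S", 1.0), ("Det", "Noun"): ("NP", 1.0), ("Verb", "NP"): ("VP", 1.0)}

def pcyk(words):
    n = len(words)
    best = defaultdict(dict)   # best[(i, j)][A] = highest probability of A spanning words[i..j]
    for i, w in enumerate(words):
        for label, p in lexical.get(w, {}).items():
            best[(i, i)][label] = p
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):                    # try every split point, keep the maximum
                for b, pb in best[(i, k)].items():
                    for c, pc in best[(k + 1, j)].items():
                        if (b, c) in binary:
                            a, rule_prob = binary[(b, c)]
                            p = rule_prob * pb * pc
                            if p > best[(i, j)].get(a, 0.0):
                                best[(i, j)][a] = p
    return best

chart = pcyk("the cat chased the dog".split())
print(chart[(0, 4)].get("S"))   # probability of the best S covering the whole sentence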
Conclusion
The Probabilistic CYK (PCYK) algorithm is a powerful technique for probabilistic
parsing using Probabilistic Context-Free Grammars (PCFGs). It allows for efficient
parsing of sentences by selecting the most probable parse tree from a set of possible parses.
This approach is widely used in syntactic parsing, machine translation, and speech
recognition.
Unification of Feature Structures
Key concepts:
1. Feature: An attribute such as number, gender, or tense.
2. Value: The value assigned to a feature (e.g., singular, masculine, present); a value
may itself be another feature structure.
3. Unification: The process of merging two feature structures into one, by combining
their features and values, if compatible.
Example of Unification
If we have two feature structures:
1. A feature structure for a noun:
Noun
├── number: singular
└── gender: masculine
2. A feature structure for a verb:
Verb
├── number: singular
└── tense: present
When unifying these two structures, we get:
Unified Structure
├── number: singular
├── gender: masculine
└── tense: present
The unification is successful because the number matches in both structures, and the tense is
added from the verb structure.
However, if we attempted to unify a noun with number: singular and number: plural,
unification would fail due to the conflicting values for the number feature.
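The noun/verb example can be reproduced with NLTK's feature-structure implementation; unify returns the merged structure, or None when values clash. A brief sketch:

from nltk.featstruct import FeatStruct

noun = FeatStruct(number="singular", gender="masculine")
verb = FeatStruct(number="singular", tense="present")

# Compatible values: number matches, so gender and tense are simply combined.
print(noun.unify(verb))          # merged structure with number, gender, and tense

# Conflicting values: unification fails and returns None.
plural_noun = FeatStruct(number="plural")
print(noun.unify(plural_noun))   # None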
Feature structures are a powerful tool in computational linguistics for representing and
manipulating syntactic, semantic, and morphological information. By using unification to
combine feature structures, we can model complex linguistic phenomena in a structured and
efficient manner. Feature structures form the foundation of many advanced grammar
frameworks and parsing algorithms.
Feature Structures
Feature Structures are a powerful tool used in computational linguistics and formal
grammar frameworks, especially in unification-based grammars such as Head-driven
Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), and
Combinatory Categorial Grammar (CCG). A feature structure is a representation that
consists of features (attributes) and their corresponding values, which describe properties of
linguistic objects like words, phrases, or even entire sentences.
Feature structures allow for the representation of syntactic, semantic, and morphological
properties of linguistic elements in a modular and flexible way. They are particularly useful
in parsing, morphological analysis, and semantic interpretation.
Example of Unification
1. A feature structure for a noun:
Noun
├── number: singular
├── gender: masculine
2. A feature structure for a verb:
Verb
├── number: singular
├── tense: present
When we try to unify these two structures, we get:
Unified Structure
├── number: singular
├── gender: masculine
├── tense: present
This is because the number feature is the same in both the noun and verb structures, so they
can be unified. The gender and tense features are simply added from their respective
structures.
However, if we attempt to unify a noun with number: singular and number: plural, unification
will fail because of the conflicting number values.
2. Atomic vs. Structured Values: Feature values can be atomic (e.g., singular, plural,
masculine, feminine) or they can themselves be feature structures. When the value is
another feature structure, unification involves recursively unifying the two feature
structures.
3. Failure of Unification: If there is a conflict in feature values (e.g., one feature
structure has number: singular and the other has number: plural), unification will fail.
In this case, unification will fail because the feature number has conflicting values: singular
and plural. Since feature structures cannot have contradictory values, unification cannot
proceed.
Recursive Unification
Feature values can themselves be feature structures, which means unification must be
recursive. This allows us to model more complex linguistic structures, such as nested phrases.
In such cases, unification happens recursively, unifying the subject (NP) and predicate (VP)
feature structures while keeping their internal features intact (see the sketch below).
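Nested feature structures unify recursively in the same way. The sketch below uses a hypothetical agr (agreement) feature to mirror the subject/predicate idea; only the agreement sub-structures are unified, since the category values (NP vs. VP) intentionally differ.

from nltk.featstruct import FeatStruct

subject = FeatStruct(cat="NP", agr=FeatStruct(number="singular", person=3))
predicate = FeatStruct(cat="VP", agr=FeatStruct(number="singular"))

# The nested agr structures are merged recursively.
print(subject["agr"].unify(predicate["agr"]))            # number and person combined

# A clash inside the nested structure makes the whole unification fail.
plural_vp = FeatStruct(agr=FeatStruct(number="plural"))
print(FeatStruct(agr=subject["agr"]).unify(plural_vp))   # None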
Applications of Unification
1. Syntactic Parsing: Unification is central in parsers based on unification grammars
(like HPSG), where syntactic structures are represented by feature structures.
2. Morphological Analysis: Unification can model the relationship between a root word
and its inflections (e.g., verb tense or noun number).
3. Semantic Parsing: Feature structures are often used in compositional semantics,
where syntactic feature structures are unified with semantic feature structures.
Conclusion
Unification of feature structures is a critical operation in unification-based grammar
frameworks, and it plays a vital role in syntactic parsing, morphological analysis, and
semantic interpretation. Unification ensures that linguistic structures are consistent and
compatible, allowing for the construction of grammatically valid sentences. By combining
feature structures recursively, complex hierarchical structures can be represented and
processed.