0% found this document useful (0 votes)
6 views

9-Syntax_part1

The document outlines the topics covered in a Natural Language Processing course, focusing on Syntax and Context-Free Grammar (CFG). It explains the importance of syntax in language structure, derivation, and parsing, as well as the components of CFGs. Additionally, it provides examples and questions to illustrate grammatical and ungrammatical sentences within the context of CFGs.

Uploaded by

dw9324764
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views

9-Syntax_part1

The document outlines the topics covered in a Natural Language Processing course, focusing on Syntax and Context-Free Grammar (CFG). It explains the importance of syntax in language structure, derivation, and parsing, as well as the components of CFGs. Additionally, it provides examples and questions to illustrate grammatical and ungrammatical sentences within the context of CFGs.

Uploaded by

dw9324764
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 26

Year: 2024-2025

Spring Semester

Natural Language
Processing
Dr. Wafaa Samy
Dr. Hanaa Eissa
Syntax (Part 1)

Lecture (9)

2
Contents
• Syntax
• Context-Free Grammar (CFG)
o Derivation
o Parsing
o Grammatical and Ungrammatical Sentences

3
Syntax
• Study the order of words in a sentence and their
relationships.
o Syntax defines word categories and functions.

• Subject, verb, object is a sequence of functions


that corresponds to a common order in many
European languages including English and French.
• Syntactic Parsing determines the structure of a
sentence.
4
Syntax (Cont.)
• Why should we care about syntax?
• Grammars (and parsing) are key components
in many applications like:
o Grammar checkers
o Question answering
o Information extraction
o Machine translation

5
Context-Free Grammars
• Context-Free Grammars (CFGs).
o Also known as:
 Phrase structure grammars.
 Backus-Naur Form (BNF).
• CFG consist of: Rules, Terminals, and Non-terminals
1. Terminals:
o We’ll take these to be words (for now).
2. Non-Terminals:
o The constituents in a language.
 Like noun phrase, verb phrase, verb, noun, sentence, etc.

3. Rules:
o Rules are equations that consist of a single non-terminal on the left
and any number of terminals and non-terminals on the right.

6
Some NP Rules
• Here are some rules for our noun phrases (NP):

• Together, these three rules describe two kinds of NPs.


1. One that consists of a determiner followed by a nominal.
2. And another that says that proper names are NPs.
• The third rule illustrates two things:
o An explicit disjunction (by using “|”): Two kinds of nominal.
o A recursive definition: Same non-terminal “Nominal” on the
right and left-side of the rule.
7
L0 Grammar

8
Formal Definition

9
Context-Free Grammar
• A context-free grammar consists of:
1. A set of production rules, each of which expresses the ways
that symbols of the language can be grouped and ordered
together, and
2. A lexicon of words and symbols.

10
Derivation and Parsing
• A derivation is a sequence of rules applied to a string
that accounts for that string:
o Covers all the elements in the string.
o Covers only the elements in the string.

11
Example (1): Derivation

• Derivation for the statement:


• Use this grammar to
S ⇒ NP VP
make a derivation for the ⇒ ProperNoun VP
following statement: ⇒ Adel VP
⇒ Adel Verb NP
Adel study Automata with Maged ⇒ Adel study NP
⇒ Adel study NP PP
⇒ Adel study ProperNoun PP
⇒ Adel study Automata PP
⇒ Adel study Automata Preposition NP
⇒ Adel study Automata with NP
⇒ Adel study Automata with ProperNoun
12 ⇒ Adel study Automata with Maged
Derivation

If A → α is a production rule, and β and γ are any strings in the set (Σ U V)*, then
we say that β A γ directly derives β α γ, written as:
βAγ⇒βαγ

• Derivation is a generalization of direct derivation.

13
Language Defined by CFG
• The formal language defined by a CFG is the set of strings
that are derivable from the designated start symbol.

14
Parsing
• Parsing (or Syntactic parsing) is the process of taking
a string and a grammar and returning a (multiple?)
parse tree(s) for that string.
• It is completely analogous to running a finite-state
transducer with a tape.
o It’s just more powerful:
 Remember this means that there are languages we
can capture with CFGs that we can’t capture with
finite-state methods.

15
Example (2): Parsing

• Parse tree:

16 Adel study Automata with Maged


Example (3): Parsing

Parse Tree

17
Question (1)
• Which of the following rules is NOT a correct
context-free rule, where non-terminals are
{A, B}, and terminals are {a, b}.
1. A a
2. A  a | bA
3. AB  a
4. a  bA
5. Both 3 and 4

18
Question (1): Solution
• Which of the following rules is NOT a correct
context-free rule, where non-terminals are
{A, B}, and terminals are {a, b}.
1. A a
2. A  a | bA
3. AB  a
4. a  bA
5. Both 3 and 4

19
Question (2)
• Given {a, the, study} ∈ terminals and
{VP, Verb, Det} ∈ non-terminals, which of the
following production rules is NOT a correct
CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det

20
Question (2): Solution
• Given {a, the, study} ∈ terminals and
{VP, Verb, Det} ∈ non-terminals, which of the
following production rules is NOT a correct
CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det

21
Question (3)
• Given {x, y} is a set of terminals and A is a
non-terminal, which of the following words is
correctly generated from the grammar with a
rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx
4. xyy

22
Question (3): Solution
• Given {x, y} is a set of terminals and A is a
non-terminal, which of the following words is
correctly generated from the grammar with a
rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx
4. xyy

23
Grammatical and Ungrammatical
Sentences
• A CFG defines a formal language ( a set of strings ).

• Sentences (strings of words) that can be derived by a


grammar are in the formal language defined by that
grammar, and are called grammatical sentences.

24
Grammatical and Ungrammatical
Sentences (Cont.)
• Sentences that cannot be derived by a given formal
grammar are not in the language defined by that grammar,
and are referred to as ungrammatical sentences.

ungrammatical sentence grammatical sentence


25

You might also like