
Natural Language Processing [05 hours/week, 09 Credits] [Theory]

Eighth Semester: Computer Science & Engineering

Dr. M. B. Chandak
[email protected], www.mbchandak.com
Course Contents
• The course is divided into the following major components:
• Basics of Natural Language Processing and language modeling
techniques
• Syntactic and Semantic Parsing
• NLP applications: Information Extraction & Machine Translation
• Total units: 6
• Unit 1 and 2: Basics and modeling techniques
• Unit 3 and 4: Syntactic and Semantic Parsing
• Unit 5 and 6: Information Extraction & Machine Translation
Course Pre-requisite
• Basic knowledge of English Grammar
• Theoretical foundations of Computer Science
[TOFCS]
• Extension of Language Processing
• Python and Open Source tools
• Active Class participation and Regularity
Unitized course
Unit-I:
Introduction: NLP tasks in syntax, semantics, and pragmatics. Key
issues & applications such as information extraction, question
answering, and machine translation. The problem of ambiguity. The
role of machine learning. Brief history of the field.
UNIT-II:
N-gram Language Models: Role of language models. Simple N-gram
models. Estimating parameters and smoothing. Evaluating language
models. Part-of-Speech Tagging and Sequence Labeling: Lexical
syntax. Hidden Markov Models. Maximum Entropy models.
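Unit II's "estimating parameters and smoothing" can be illustrated with a minimal sketch: a bigram model with add-one (Laplace) smoothing trained on a hypothetical three-sentence corpus. The corpus and all probabilities below are toy illustrations, not part of the course material.

```python
from collections import Counter

# Toy training corpus (hypothetical); real models train on millions of sentences.
corpus = [
    "<s> i made her duck </s>",
    "<s> i made her a toy </s>",
    "<s> she made a duck </s>",
]

unigrams = Counter(w for sent in corpus for w in sent.split())
bigrams = Counter(
    pair for sent in corpus
    for pair in zip(sent.split(), sent.split()[1:])
)
V = len(unigrams)  # vocabulary size, used by the smoothing term

def bigram_prob(w1, w2):
    # P(w2 | w1) with add-one (Laplace) smoothing: unseen bigrams
    # still receive a small, non-zero probability.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

def sentence_prob(sentence):
    # Sentence probability as the product of its bigram probabilities.
    words = sentence.split()
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= bigram_prob(w1, w2)
    return p

print(round(bigram_prob("made", "her"), 4))  # seen bigram: (2+1)/(3+9) = 0.25
print(sentence_prob("<s> she made her duck </s>"))  # unseen sentence, still > 0
```

Without smoothing, any sentence containing an unseen bigram would get probability zero; add-one smoothing is the simplest of the techniques evaluated in this unit.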
Unitized course
Unit-III:
Grammar formalisms and tree banks. Efficient parsing for context-free grammars
(CFGs). Statistical parsing and probabilistic CFGs (PCFGs). Lexicalized PCFGs.
UNIT-IV:
Lexical semantics and word-sense disambiguation. Compositional semantics.
Semantic Role Labeling and Semantic Parsing.
Unit - V
Named entity recognition and relation extraction. IE using sequence labeling.
Automatic summarization. Subjectivity and sentiment analysis.
Unit - VI
Basic issues in MT. Statistical translation, word alignment, phrase-based
translation, and synchronous grammars.
Text and Reference books
1. D. Jurafsky and J. Martin; Speech and Language Processing; 2nd
edition, Pearson Education, 2009.
2. James Allen; Natural Language Understanding; Second Edition,
Benjamin/Cummings, 1995.
3. Eugene Charniak; Statistical Language Learning; MIT Press, 1993.
4. Web Resources
Course Outcomes
CO | Course Outcome | Unit
1 | Ability to differentiate various NLP tasks and understand the problem of ambiguity. | Unit 1
2 | Ability to model and preprocess language. | Unit 2
3 | Ability to perform syntactic parsing using different grammars. | Unit 3
4 | Ability to perform semantic parsing and word-sense disambiguation. | Unit 4
5 | Ability to perform Information Extraction and Machine Translation. | Units 5, 6
Grading Scheme: Internal Examination
• Total: 40 marks
• Three tests: best TWO counted [15 x 2 = 30 marks]
• Generally, the third test will be more challenging.
• 10 marks distribution:
(i) Class participation: 03 marks [may include attendance]
(ii) Assignment 1: 04 marks [Design/Coding] {After T1}
(iii) Assignment 2: 03 marks [Objective/Coding] {After T2}
(iv) Challenging problems [Individual]: 07 marks
Introduction: Basics
• Natural Language Processing (NLP) is the study of the computational
treatment of natural (human) language.
• In other words, teaching computers how to understand (and
generate) human language.
• It is a field at the intersection of Computer Science, Artificial Intelligence, and
Computational Linguistics.
• Natural language processing systems take strings of words (sentences)
as their input and produce structured representations capturing the
meaning of those strings as their output. The nature of this output
depends heavily on the task at hand.
Introduction: NLP tasks
• Processing language is a complex task.
• A modular approach is therefore followed.
Conferences:
ACL/NAACL, EMNLP, SIGIR, AAAI/IJCAI, Coling, HLT, EACL/NAACL, AMTA/MT Summit,
ICSLP/Eurospeech
Journals:
Computational Linguistics, TACL, Natural Language Engineering, Information Retrieval,
Information Processing and Management, ACM Transactions on Information Systems,
ACM TALIP, ACM TSLP
University centers:
Berkeley, Columbia, Stanford, CMU, JHU, Brown, UMass, MIT, UPenn, USC/ISI, Illinois,
Michigan, UW, Maryland, etc.
Toronto, Edinburgh, Cambridge, Sheffield, Saarland, Trento, Prague, QCRI, NUS, and
many others
Industrial research sites:
Google, MSR, Yahoo!, FB, IBM, SRI, BBN, MITRE, AT&T Labs
The ACL Anthology
http://www.aclweb.org/anthology
The ACL Anthology Network (AAN)
http://clair.eecs.umich.edu/aan/index.php
Why NLP is complex
• Natural language is extremely rich in form and structure, and
very ambiguous.
• How to represent meaning, and which structures map to which
meaning structures, are central questions.
• One input can mean many different things, and ambiguity can occur at
different levels:
• Lexical (word-level) ambiguity -- different meanings of words
• Syntactic ambiguity -- different ways to parse the sentence
• Interpreting partial information -- how to interpret pronouns
• Contextual information -- the context of the sentence may affect the meaning of
that sentence
Example: Ambiguity
• Consider Sentence: “I made her duck”
• Various levels of ambiguity
• How many different interpretations does this sentence have?
• What are the reasons for the ambiguity?
• The categories of knowledge of language can be thought of as
ambiguity resolving components.
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more ambiguous?
• Yes – deciding word boundaries
Example: Ambiguity
• Some interpretations of : I made her duck.
1. I cooked duck for her.
2. I cooked duck belonging to her.
3. I created a toy duck which she owns.
4. I caused her to quickly lower her head or body.
5. I used magic and turned her into a duck.
• duck – morphologically and syntactically ambiguous: noun or verb.
• her – syntactically ambiguous: dative or possessive.
• make – semantically ambiguous: cook or create.
• make – syntactically ambiguous
Example: Ambiguity Resolution
• Ambiguity resolution is possible by modeling language. For example:
• part-of-speech tagging -- deciding whether duck is a verb or a noun
• word-sense disambiguation -- deciding whether make means create or
cook
• lexical disambiguation -- resolving part-of-speech and word-sense
ambiguities; these are two important kinds of lexical
disambiguation
• syntactic disambiguation -- her duck is an example of syntactic
ambiguity, and can be addressed by probabilistic parsing
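The first of these steps can be sketched as a most-frequent-tag baseline built from word/tag counts. The tiny tagged corpus below is entirely hypothetical toy data; real taggers (e.g. the HMM-based taggers of Unit II) also condition on surrounding context rather than the word alone.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged observations, for illustration only.
tagged = [
    ("duck", "NOUN"), ("duck", "NOUN"), ("duck", "VERB"),
    ("her", "DET"), ("her", "DET"), ("her", "PRON"),
    ("made", "VERB"),
]

# Count how often each word appears with each tag.
counts = defaultdict(Counter)
for word, tag in tagged:
    counts[word][tag] += 1

def most_likely_tag(word):
    # Unigram baseline: pick the tag seen most often with this word,
    # ignoring context entirely.
    return counts[word].most_common(1)[0][0]

print(most_likely_tag("duck"))  # NOUN (wins 2-1 in this toy data)
print(most_likely_tag("her"))   # DET  (wins 2-1 in this toy data)
```

A baseline like this tags every occurrence of duck identically, which is exactly why "I made her duck" stays ambiguous: context-sensitive models are needed to choose between the noun and verb readings.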
Language: Knowledge components
• Phonology – concerns how words are related to the sounds for realization.

• Morphology – concerns how words are constructed from more basic meaning
units called morphemes. A morpheme is the primitive unit of meaning in a
language.

• Syntax – concerns how words can be put together to form correct sentences,
determines what structural role each word plays in the sentence, and what
phrases are subparts of other phrases.

• Semantics – concerns what words mean and how these meanings combine in
sentences to form sentence meaning. The study of context-independent
meaning.
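The morphology component above can be sketched as naive affix stripping: splitting a word into candidate morphemes using small prefix/suffix lists. The lists below are hypothetical toy data; real morphological analyzers use finite-state transducers and full lexicons rather than this kind of string surgery.

```python
# Hypothetical toy affix inventories, for illustration only.
PREFIXES = ["un", "re", "dis"]
SUFFIXES = ["ness", "ing", "ed", "s"]

def segment(word):
    # Strip at most one known prefix and one known suffix, keeping a
    # minimum stem length so short words are not over-segmented.
    morphemes = []
    for p in PREFIXES:
        if word.startswith(p) and len(word) > len(p) + 2:
            morphemes.append(p)
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) > len(s) + 2:
            return morphemes + [word[:-len(s)], s]
    return morphemes + [word]

print(segment("unhappiness"))  # ['un', 'happi', 'ness']
print(segment("cats"))         # ['cat', 's']
```

Note how even the "correct" output exposes a spelling-change problem (happi vs. happy); handling such alternations is part of what makes morphological analysis a genuine NLP task rather than simple string matching.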
Language: Knowledge components
• Pragmatics – concerns how sentences are used in different situations
and how use affects the interpretation of the sentence.

• Discourse – concerns how the immediately preceding sentences
affect the interpretation of the next sentence. For example, interpreting
pronouns and interpreting the temporal aspects of the information.

• World Knowledge – includes general knowledge about the world: what
each language user must know about the other's beliefs and goals.
