Machine Learning Natural Language 2023

This document discusses natural language processing (NLP) and machine learning approaches for NLP tasks. It covers various NLP applications including question answering systems like IBM Watson, information retrieval, machine translation, and information extraction. It also describes common NLP tasks such as segmentation, morphology, syntactic analysis including part-of-speech tagging and parsing, semantics, pragmatics, and discourse analysis. Finally, it discusses machine learning methods for NLP tasks such as part-of-speech tagging, parsing, and references seminal NLP books and papers.

Uploaded by

cawifi4523


Machine Learning Natural Language

NLP System : IBM Watson

Question Answering System

Quiz show Jeopardy!

• “The first person mentioned by name in 'The Man in the Iron Mask' is this hero of a previous book by the same author”

Natural Language Processing
• NLP focuses on developing systems that allow
computers to perform useful tasks involving human
language
– Also called Computational Linguistics
• NLP applications
– Information Retrieval
– Question Answering
– Machine Translation
– Information Extraction

NLP application : Information Retrieval
• Stemming
• Spell checking
• Query expansion
• Word sense disambiguation

NLP application : Question Answering

• Determine type of question and answer

• Parse the question and identify relations
POS tagging, Parsing, named entity recognition

NLP application : Machine Translation

• Sentence alignment
• POS tagging
• Parsing
• Sentence generation grammars
• Named Entity Recognition (“New Delhi”)

NLP application : Information Extraction

• Identifying/Extracting specific kinds of information

• Named entities (NEs): person, location, price, product
– Mohandas Karamchand Gandhi was born in Porbandar, Gujarat
• Coreference resolution: linking pronouns/abbreviations to entities
– “Indian Institute of Science” <> “IISc.”
• Relations: <DOB>, <spouse>, <attribute>
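
The entity and relation bullets above can be illustrated with a small pattern-based extractor. This is only a sketch under stated assumptions: the `NAME` and `BORN_IN` patterns and the `extract_birthplaces` helper are hypothetical, and runs of capitalized words stand in for a real named-entity recognizer.

```python
import re

# Hypothetical pattern for one relation type: "<Person> was born in <Place>".
# A run of capitalized words is a crude stand-in for a named-entity recognizer.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
BORN_IN = re.compile(rf"({NAME}) was born in ({NAME}(?:, {NAME})*)")


def extract_birthplaces(text):
    """Return (person, place) pairs matched by the BORN_IN pattern."""
    return [(m.group(1), m.group(2)) for m in BORN_IN.finditer(text)]


sentence = "Mohandas Karamchand Gandhi was born in Porbandar, Gujarat"
print(extract_birthplaces(sentence))
# [('Mohandas Karamchand Gandhi', 'Porbandar, Gujarat')]
```

Hand-written patterns like this one are brittle, which is part of the motivation for the learned models discussed later in the slides.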

NLP application : Categorization
• Topical : politics, sports, business
• Sentiment: positive, negative, neutral
POS tagging to obtain adjectives
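
A minimal sketch of sentiment categorization, assuming a tiny hand-made lexicon of sentiment-bearing adjectives. The word lists and the `classify_sentiment` function are illustrative assumptions; a real system would use a POS tagger to pick out the adjectives and a far larger lexicon or a trained classifier.

```python
# Hypothetical mini-lexicon of adjectives; not taken from any real resource.
POSITIVE = {"good", "great", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "sad"}


def classify_sentiment(text):
    """Label text by counting positive and negative lexicon words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_sentiment("The food was great and the service was excellent."))  # positive
print(classify_sentiment("A terrible, sad film."))                              # negative
print(classify_sentiment("The meeting is on Monday."))                          # neutral
```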

NLP : Tasks

• Segmentation : words, sentences

• Morphology : plural “boy” ---> “boys”, derivation “agree” ---> “agreement”
Stemming : "fishing", "fished", "fish", "fisher" ---> "fish"
• Syntactic Analysis : structural relationships between words
– Part of Speech (POS) Tagging
Machine[N] learning[V] natural[Adj] language[N]
– Parsing

[Figure: parse tree over Machine[N] learning[V] natural[Adj] language[N]]
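
The stemming example on this slide can be reproduced with a naive suffix-stripping sketch. The suffix list and the `naive_stem` function are assumptions made for illustration; this is not the Porter stemmer or any particular library's algorithm.

```python
# Suffixes are tried in this order; the list is illustrative, not exhaustive.
SUFFIXES = ("ing", "ed", "er", "s")


def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


words = ["fishing", "fished", "fish", "fisher"]
print([naive_stem(w) for w in words])  # ['fish', 'fish', 'fish', 'fish']
print(naive_stem("boys"))              # boy
```

Rules this simple over-strip many ordinary words, which is one face of the "numerous exceptions and irregularities" that make hand-written language rules hard to maintain.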

NLP : Tasks
• Semantics
– Word Sense Disambiguation : “I went to the bank” (financial institution or river bank?)
– Semantic role labelling :
“Mary[Agent] sold the book[Goods] to John[Recipient]”

• Pragmatics : how language is used to accomplish goals

– I’m sorry Dave, I’m afraid I can’t do that [Polite]
– I can't do that [Rude]

• Discourse
Coreference Resolution : linking pronouns/abbreviations to entities
“I saw Scott yesterday. He was fishing by the lake.”
“Indian Institute of Technology Hyderabad is a public institution
located in Hyderabad. IITH was established in 2007.”

Named Entity Recognition (NER) : person, location, price, product

Mohandas Karamchand Gandhi was born in Porbandar, Gujarat

NLP is hard

● Natural language is ambiguous

● Sentence Segmentation : “I went out with Mr. Smith.”
• Syntactic
“Flies[Noun/Verb] like flower[Noun/Verb]”

“I saw the man with the telescope” (I used the telescope) vs
“I saw the man with the telescope” (the man had the telescope)

• Semantic
“I put the plant in the window” vs “Ford put the plant in Mexico”
• Ambiguity is Explosive
“I saw the man on the hill with the telescope.”: 4 parses
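
The sentence segmentation ambiguity above (the period in “Mr.” does not end a sentence) can be sketched as follows. The abbreviation list and the `split_sentences` function are hypothetical and deliberately incomplete; they only show why a rule such as "split after every period" needs exceptions.

```python
# Hypothetical, far-from-complete list of abbreviations that end in a period.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "prof."}


def split_sentences(text):
    """Split on sentence-final punctuation, except after known abbreviations."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = token.endswith((".", "!", "?"))
        if ends_sentence and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words without final punctuation
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("I went out with Mr. Smith. We had lunch."))
# ['I went out with Mr. Smith.', 'We had lunch.']
```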

Machine Learning Natural Language
● “Rules” in language have numerous exceptions and irregularities
● Manual knowledge engineering is difficult, time-consuming, and error-prone.
● Use machine learning methods to automatically acquire the required
knowledge from appropriately annotated text corpora.
● Annotating corpora is easier and requires less expertise than manual
knowledge engineering.

Machine Learning POS Tagging
• Lowest level of syntactic analysis
• Useful for Parsing and word sense disambiguation
• Ambiguity in POS tagging
Flies[Noun] like[Verb] flower[Noun]
Time flies[Verb] like[Prep] an arrow.

Learning : Train models on human annotated corpora like the Penn Treebank.

POS Tagging
● Classification
Classify each word independently, but use information about the
surrounding words as input features.

NN VBZ VBP DT NN
Time flies like an arrow.

● Sequence Labeling
Tags of words are dependent on the tags of other words in
the sentence, particularly their neighbors.

NN VBZ IN DT NN
Time flies like an arrow.

POS Tagging is better modeled as a sequence labeling problem than as
a classification problem
- The same holds for Information Extraction and Named Entity Recognition

Statistical models: Hidden Markov Model (HMM), Maximum Entropy Markov
Model (MEMM), Conditional Random Field (CRF)
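
The contrast between per-word classification and sequence labeling can be sketched with a toy HMM decoded by the Viterbi algorithm. Every probability below is invented for illustration (nothing is estimated from the Penn Treebank), and the `tag_independently` baseline ignores context entirely, which is cruder than the feature-based classifier described on the slides.

```python
TAGS = ["NN", "VBZ", "VBP", "IN", "DT"]

# Made-up probabilities, chosen only so the example is easy to follow.
START = {"NN": 0.6, "DT": 0.3, "VBZ": 0.04, "VBP": 0.03, "IN": 0.03}
TRANS = {  # TRANS[prev][next] = P(next tag | previous tag)
    "NN": {"VBZ": 0.5, "NN": 0.2, "IN": 0.2, "VBP": 0.1},
    "VBZ": {"IN": 0.5, "DT": 0.3, "NN": 0.15, "VBP": 0.05},
    "VBP": {"DT": 0.5, "IN": 0.3, "NN": 0.2},
    "IN": {"DT": 0.6, "NN": 0.4},
    "DT": {"NN": 1.0},
}
EMIT = {  # EMIT[tag][word] = P(word | tag)
    "NN": {"time": 0.4, "flies": 0.2, "arrow": 0.4},
    "VBZ": {"flies": 0.8, "time": 0.2},
    "VBP": {"like": 0.9, "time": 0.1},
    "IN": {"like": 0.3, "with": 0.7},
    "DT": {"an": 1.0},
}


def tag_independently(words):
    """Baseline: pick the tag with the highest emission score for each word."""
    return [max(TAGS, key=lambda t: EMIT[t].get(w, 0.0)) for w in words]


def viterbi(words):
    """Return the most probable tag sequence under the toy HMM."""
    # Each column maps tag -> (best path probability, previous tag).
    best = [{t: (START.get(t, 0.0) * EMIT[t].get(words[0], 0.0), None) for t in TAGS}]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p][0] * TRANS[p].get(tag, 0.0))
            score = best[-1][prev][0] * TRANS[prev].get(tag, 0.0) * EMIT[tag].get(word, 0.0)
            column[tag] = (score, prev)
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


words = "time flies like an arrow".split()
print(tag_independently(words))  # ['NN', 'VBZ', 'VBP', 'DT', 'NN']
print(viterbi(words))            # ['NN', 'VBZ', 'IN', 'DT', 'NN']
```

With these numbers the baseline tags "like" as VBP because that is its most likely tag in isolation, while Viterbi prefers IN because a preposition is more plausible after VBZ and before DT.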


Parsing

• Ambiguity
“I saw the man with the telescope” (I used the telescope) vs
“I saw the man with the telescope” (the man had the telescope)
Probabilistic Context Free Grammars (PCFG)

• Structured Prediction

Strings ---> Trees (e.g. “Machine learning natural language” ---> parse tree)

Statistical models: Conditional Random Field, Structured perceptrons, Structured support vector machines
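
How a PCFG ranks the two readings of the telescope sentence can be sketched with a small CKY-style chart parser. The grammar, its rule probabilities, and the `all_parses` function are all assumptions made up for this example; the grammar is in Chomsky normal form and covers only this one sentence.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form; all probabilities are invented.
BINARY = [  # (parent, left child, right child, rule probability)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.6),
    ("VP", "VP", "PP", 0.4),
    ("NP", "Det", "N", 0.5),
    ("NP", "NP", "PP", 0.2),
    ("PP", "P", "NP", 1.0),
]
LEXICAL = {  # word -> [(symbol, rule probability)]
    "I": [("NP", 0.3)],
    "saw": [("V", 1.0)],
    "the": [("Det", 1.0)],
    "man": [("N", 0.5)],
    "telescope": [("N", 0.5)],
    "with": [("P", 1.0)],
}


def all_parses(words):
    """Return every (probability, tree) for S over the sentence, best first."""
    n = len(words)
    # chart[(i, j)][symbol] = list of (probability, tree) covering words[i:j]
    chart = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):
        for symbol, prob in LEXICAL[word]:
            chart[(i, i + 1)][symbol].append((prob, f"({symbol} {word})"))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right, prob in BINARY:
                    for lp, lt in chart[(i, k)][left]:
                        for rp, rt in chart[(k, j)][right]:
                            tree = f"({parent} {lt} {rt})"
                            chart[(i, j)][parent].append((prob * lp * rp, tree))
    return sorted(chart[(0, n)]["S"], reverse=True)


for prob, tree in all_parses("I saw the man with the telescope".split()):
    print(f"{prob:.5f}  {tree}")
```

Under these invented probabilities the parse that attaches the prepositional phrase to the verb phrase (seeing with the telescope) scores 0.00450, and the parse that attaches it to "the man" scores 0.00225. Enumerating every parse is only feasible for short sentences; practical parsers keep just the best entry per chart cell.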
Machine learning for NLP

• Transfer Learning, domain adaptation

– Adapting a model learned on a resource-rich language to a
resource-scarce language
• Deep learning
– Unsupervised learning of useful features

● Conferences : Association for Computational Linguistics (ACL),
International Conference on Computational Linguistics (COLING), Empirical Methods in NLP (EMNLP)

• Software tools
Stanford CoreNLP, OpenNLP, NLTK, LingPipe

References
Daniel Jurafsky and James H. Martin (2008). Speech and Language Processing.

Christopher D. Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing.

Machine Learning Methods in Natural Language Processing. http://www.cs.columbia.edu/~mcollins/papers/tutorial_colt.pdf

Lafferty, J., McCallum, A., Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data.

Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann and Yasemin Altun (2005). Large Margin Methods for Structured and Interdependent Output Variables.

Deep learning for NLP. http://www.socher.org/index.php/DeepLearningTutorial/DeepLearningTutorial
Thank you
