Chapter 6-NLP Basics

This document provides an overview of natural language processing (NLP). It defines NLP as the field of artificial intelligence concerned with giving computers the ability to understand human language. The key aspects covered include: how NLP combines computational linguistics and machine learning; common NLP tasks like syntax analysis, semantic analysis, and language generation; techniques used in NLP including part-of-speech tagging and named entity recognition; and challenges in NLP like ambiguity and understanding context and intent.

Uploaded by

amanterefe99

Chapter 6: Natural Language Processing (NLP) Basics

By Misganu T.
What is natural language processing?
• Natural language processing (NLP) is the branch of computer science, and
more specifically of artificial intelligence, concerned with giving
computers the ability to understand text and spoken words in much the
same way human beings can.
• NLP strives to build machines that understand text or voice data and
respond with text or speech of their own.
What is natural language processing? Cont.
• NLP combines computational linguistics, rule-based modeling of human
language, machine learning and deep learning models. Together, these
technologies enable computers to process human language in the form of
text or voice data and to ‘understand’ its full meaning, complete with
the speaker or writer’s intent and sentiment.
What is natural language processing? Cont.
A typical spoken interaction between a human and a machine (using speech
recognition) could go as follows:
1. A human talks to the machine
2. The machine captures the audio
3. The audio is converted to text
4. The text is processed
5. The resulting data is converted to audio
6. The machine responds to the human by playing the audio file
Why is NLP difficult?
• Natural language processing is considered a difficult problem in
computer science. It is the nature of human language that makes NLP
difficult.
• The rules that govern how information is passed using natural languages
are not easy for computers to understand.
• Some of these rules are high-level and abstract; for example, when
someone uses a sarcastic remark to convey information.
• Other rules are low-level; for example, using the character “s” to
signify the plurality of items.
Why is NLP difficult? Cont.

• Comprehensively understanding human language requires understanding
both the words and how the concepts are connected to deliver the
intended message.
• While humans can easily master a language, the ambiguity and imprecision
of natural languages make NLP difficult for machines.
NLP tasks (How does it work?)
• NLP entails applying algorithms to identify and extract the rules of
natural language, so that unstructured language data is converted into a
form that computers can understand.
• Once text has been provided, the computer uses algorithms to extract the
meaning of each sentence and collect the essential data from it.

What are the techniques used in NLP?
Syntactic analysis and semantic analysis are the main techniques used to
complete natural language processing tasks.
NLP tasks (How does it work?) Cont.
1. Syntactic analysis
• Syntax refers to the arrangement of words in a
sentence such that they make grammatical sense.
• In NLP, syntactic analysis is used to assess how
the natural language aligns with the grammatical
rules.
• Computer algorithms are used to apply
grammatical rules to a group of words and derive
meaning from them.
Syntactic analysis Cont.
Here are some syntax techniques that can be used:
• Stemming: cutting inflected words down to their root form.
• Morphological segmentation: dividing words into individual units called
morphemes.
• Word segmentation: dividing a large piece of continuous text into
distinct units.
• Part-of-speech tagging: also called grammatical tagging, this is the
process of determining the part of speech of a particular word or piece
of text based on its use and context. Examples of parts of speech: noun,
pronoun, verb, adjective, adverb, preposition, conjunction.
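Word segmentation and stemming can be sketched in a few lines of Python. This is a minimal illustration only: the suffix list and the function names `segment_words` and `stem` are assumptions made for this example, and real stemmers (such as the Porter stemmer) apply many more rules.

```python
import re

# Hypothetical suffix list, chosen only for this illustration.
SUFFIXES = ("ing", "ed", "es", "s")


def segment_words(text):
    """Word segmentation: split continuous text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Naive stemming: strip the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


tokens = segment_words("The cats jumped over the sleeping dogs.")
stems = [stem(token) for token in tokens]
print(tokens)  # ['the', 'cats', 'jumped', 'over', 'the', 'sleeping', 'dogs']
print(stems)   # ['the', 'cat', 'jump', 'over', 'the', 'sleep', 'dog']
```

Such a crude rule set fails on irregular forms (for example, "ran" or "mice"), which is one reason practical systems rely on larger rule sets or dictionaries.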
Syntactic analysis Cont.
• Parsing: determining the syntactic structure of a text by analyzing its
constituent words based on an underlying grammar of the language.
• Sentence breaking: placing sentence boundaries in a large piece of text.
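Sentence breaking can be approximated with a regular expression. The sketch below assumes that a sentence ends with ".", "!" or "?" followed by whitespace and an uppercase letter; the function name `break_sentences` is made up for this example.

```python
import re


def break_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace and an uppercase letter."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [part for part in parts if part]


sentences = break_sentences("NLP is hard. Why is that? Language is ambiguous!")
for sentence in sentences:
    print(sentence)
```

This rule would wrongly split after abbreviations such as "Dr. Smith", which shows why even sentence breaking involves ambiguity.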
2. Semantic analysis
• Semantics refers to the meaning that is conveyed by
a text. Semantic analysis is one of the difficult
aspects of Natural Language Processing that has not
been fully resolved yet.
• It involves applying computer algorithms to
understand the meaning and interpretation of words
and how sentences are structured.
Semantic analysis Cont.

Here are some techniques in semantic analysis:
• Named entity recognition (NER): It involves
determining the parts of a text that can be
identified and categorized into preset
groups. Examples of such groups include
names of people and names of places.
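One very simple way to illustrate named entity recognition is a lookup against lists of known names (a gazetteer). The names and labels in `GAZETTEER` and the function `recognize_entities` are assumptions for this sketch; practical NER systems usually learn from annotated data instead of relying on fixed lists.

```python
import re

# Hypothetical gazetteer: preset groups and a few example names for each.
GAZETTEER = {
    "PERSON": {"Mary", "Abebe"},
    "PLACE": {"Addis Ababa", "Paris"},
}


def recognize_entities(text):
    """Return (name, group) pairs for every gazetteer name found, in text order."""
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                found.append((match.start(), name, label))
    return [(name, label) for _, name, label in sorted(found)]


entities = recognize_entities("Mary flew from Paris to Addis Ababa.")
print(entities)
# [('Mary', 'PERSON'), ('Paris', 'PLACE'), ('Addis Ababa', 'PLACE')]
```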
Semantic analysis Cont.
• Word sense disambiguation is the selection of the meaning of a word with
multiple meanings, through a process of semantic analysis that determines
the sense that fits best in the given context. For example, word sense
disambiguation helps distinguish the meaning of the verb 'make' in ‘make
the grade’ (achieve) vs. ‘make a bet’ (place).
• Speech recognition, also called speech-to-text, is the task of reliably
converting voice data into text data. Speech recognition is required for
any application that follows voice commands or answers spoken questions.
What makes speech recognition especially challenging is the way people
talk: quickly, slurring words together, with varying emphasis and
intonation, in different accents, and often using incorrect grammar.
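Returning to word sense disambiguation, the 'make' example can be sketched as a context-overlap check, loosely in the spirit of dictionary-overlap methods. The cue words in `SENSES` are invented for this illustration and the function name `disambiguate` is an assumption.

```python
# Hypothetical sense inventory for the verb "make"; cue words are invented.
SENSES = {
    "achieve": {"grade", "team", "deadline", "cut"},
    "place": {"bet", "wager", "call", "order"},
}


def disambiguate(context_words):
    """Pick the sense whose cue words overlap most with the context words."""
    context = {word.lower() for word in context_words}
    scores = {sense: len(cues & context) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    # Return None when no cue word appears at all.
    return best if scores[best] > 0 else None


print(disambiguate("make the grade".split()))  # achieve
print(disambiguate("make a bet".split()))      # place
print(disambiguate("make it".split()))         # None
```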


Semantic analysis Cont.
• Natural language generation is sometimes described as the opposite of
speech recognition or speech-to-text; it is the task of putting
structured information into human language.
• Sentiment analysis attempts to extract subjective qualities (attitudes,
emotions, sarcasm, confusion, suspicion) from text.
• Co-reference resolution is the task of identifying if and when two words
refer to the same entity. The most common example is determining the
person or object to which a certain pronoun refers (e.g., ‘she’ = ‘Mary’),
but it can also involve identifying an anaphor or an idiom in the text.
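Sentiment analysis is often introduced with a word-list approach. The sketch below assumes a tiny hand-made polarity lexicon (`POLARITY`) and a simple negation rule; both are illustrative assumptions, and this approach cannot detect sarcasm or the subtler qualities listed above.

```python
import re

# Hypothetical polarity lexicon: +1 for positive words, -1 for negative words.
POLARITY = {"good": 1, "great": 1, "love": 1, "bad": -1, "terrible": -1, "hate": -1}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum word polarities; a negator flips the next polarity word that follows it."""
    score = 0
    negate = False
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATORS:
            negate = True
            continue
        value = POLARITY.get(token, 0)
        if value:
            score += -value if negate else value
            negate = False
    return score


print(sentiment_score("I love this great film"))  # 2
print(sentiment_score("This is not good"))        # -1
```

A positive score suggests positive sentiment, a negative score negative sentiment, and zero means the lexicon found nothing to go on.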


The Levels of Linguistic Description (Terminologies)
Language can be described at the following levels:
• Morphology
• Syntax
• Semantics
• Phonology
• Phonetics
• Lexicon
• Pragmatics
• Discourse analysis
• Text structure analysis
The Levels of Linguistic Description (Terminologies) Cont.
Morphology
The study of units of meaning in a language. A morpheme is the smallest
unit of language that has meaning or function, a definition that includes
words, prefixes, suffixes, and other word structures that impart meaning.
Syntax
The study of how words are combined to form sentences. This includes
examining parts of speech and how they combine to make larger
constructions.
Semantics
The study of meaning in language. Semantics examines the relations
between words and what they are being used to represent.
The Levels of Linguistic Description (Terminologies) Cont.
Phonology
The study of the sound patterns of a particular language. Aspects of study include
determining which phones are significant and have meaning (i.e., the phonemes);
how syllables are structured and combined; and what features are needed to
describe the discrete units (segments) in the language, and how they are interpreted.
Phonetics
The study of the sounds of human speech, and how they are made and perceived.
A phone is the term for an individual speech sound; a phoneme is the smallest
unit of sound that distinguishes meaning in a language.
Lexicon
The study of the words and phrases used in a language, that is, a language’s
vocabulary. A lexicon contains information (semantic, grammatical) about
individual words.
The Levels of Linguistic Description (Terminologies) Cont.
Pragmatics
The study of how the context of text affects the meaning of an expression,
and what information is necessary to infer a hidden or presupposed
meaning.
Discourse analysis
The study of exchanges of information, usually in the form of conversations,
and particularly the flow of information across sentence boundaries.
Text structure analysis
The study of how narratives and other textual styles are constructed to
make larger textual compositions.
Ambiguity
• We say some input is ambiguous if there are multiple
alternative linguistic structures that can be built for it.
Example:
– I made her duck.
• Possible interpretations:
1. I cooked waterfowl for her.
2. I cooked waterfowl belonging to her.
3. I created the (plaster?) duck she owns.
4. I caused her to quickly lower her head or body.
5. I waved my magic wand and turned her into undifferentiated waterfowl.
Ambiguity Cont.
• These different meanings are caused by a
number of ambiguities.
1. First, the words duck and her are morphologically
or syntactically ambiguous in their part-of-speech.
• Duck can be a verb or a noun, while
• her can be a dative pronoun or a possessive
pronoun.
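The part-of-speech ambiguity of her and duck can be made concrete by enumerating every tag combination for the example sentence. The mini-lexicon `LEXICON` and its tag names are assumptions for this sketch and cover only this one sentence.

```python
from itertools import product

# Hypothetical mini-lexicon listing the possible tags for each word.
LEXICON = {
    "i": ["PRONOUN"],
    "made": ["VERB"],
    "her": ["DATIVE_PRONOUN", "POSSESSIVE_PRONOUN"],
    "duck": ["NOUN", "VERB"],
}


def tag_sequences(sentence):
    """Return every possible (word, tag) sequence for a sentence in the lexicon."""
    words = sentence.lower().split()
    options = [LEXICON[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]


readings = tag_sequences("I made her duck")
print(len(readings))  # 4
for reading in readings:
    print(reading)
```

Two ambiguous words with two tags each already give four candidate taggings, and the number grows multiplicatively with every further ambiguous word.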
Duck (webster.com)
1duck noun, often attributive \ˈdək\, plural ducks
1 or plural duck
a : any of various swimming birds (family Anatidae, the duck family) in which the neck and legs
are short, the feet typically webbed, the bill often broad and flat, and the sexes usually
different from each other in plumage
b : the flesh of any of these birds used as food
2 : a female duck — compare DRAKE
3 chiefly British : DARLING —often used in plural but singular in construction
4 : PERSON, CREATURE

duck verb
Definition of DUCK
transitive verb
1 : to thrust under water
2 : to lower (as the head) quickly : BOW
3 : AVOID, EVADE <duck the issue>
intransitive verb
1 a : to plunge under the surface of water
b : to descend suddenly : DIP
2 a : to lower the head or body suddenly : DODGE
b : BOW, BOB
3 a : to move quickly
b : to evade a duty, question, or responsibility
Her (webster.com)
1her adj \(h)ər, ˈhər\
Definition of HER
: of or relating to her or herself especially as possessor, agent, or object of an
action <her house> <her research>
2her pronoun
objective form of SHE
dative pronoun
— used to refer to a certain woman, girl, or female animal as the object of a verb or a
preposition ▪ Tell her I said hello. ▪ Did you invite her? ▪ I gave the book to her. ▪ a gift
for her ▪ The dress fits her sister as well as her.
possessive pronoun
▪ I gave her book back to her.
Ambiguity Cont.
2. Second, the word make is semantically
ambiguous; it can mean create or cook.
3. Finally, the verb make is syntactically ambiguous
in a different way.
– Make can be transitive, that is, taking a single direct
object (2), or
– it can be ditransitive, that is, taking two objects (5),
meaning that the first object (her) got made into the
second object (duck).
– Finally, make can take a direct object and a verb (4),
meaning that the object (her) got caused to
perform the verbal action (duck).
Approaches for Disambiguation
• NLP models and algorithms can be viewed as ways to
resolve, or disambiguate, these ambiguities.
• For example:
– deciding whether duck is a verb or a noun can be
solved by part-of-speech tagging.
– deciding whether make means “create” or “cook”
can be solved by word sense disambiguation.
• Resolving part-of-speech ambiguity and resolving word sense
ambiguity are two important kinds of lexical
disambiguation.
– A wide variety of tasks can be framed as lexical
disambiguation problems. For example,
• A text-to-speech synthesis system reading the word lead
needs to decide whether it should be pronounced as in
lead pipe or as in lead me on.
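The lead example can be sketched as a tiny rule-based disambiguator that looks at the following word. The word lists and the function name `pronounce_lead` are assumptions for this illustration; the returned strings are rough respellings ("led" for the metal, "leed" for the verb), not a phonetic standard.

```python
# Hypothetical cue lists, chosen only to cover the two slide examples.
METAL_NOUNS = {"pipe", "paint", "pencil"}
OBJECT_PRONOUNS = {"me", "you", "him", "her", "us", "them"}


def pronounce_lead(words, index):
    """Guess how 'lead' at words[index] is pronounced from the next word."""
    following = words[index + 1].lower() if index + 1 < len(words) else ""
    if following in METAL_NOUNS:
        return "led"   # noun reading (the metal), as in "lead pipe"
    if following in OBJECT_PRONOUNS:
        return "leed"  # verb reading, as in "lead me on"
    return None        # undecided: the rules above do not apply


print(pronounce_lead("lead pipe".split(), 0))   # led
print(pronounce_lead("lead me on".split(), 0))  # leed
```

Returning `None` for unseen contexts makes explicit that hand-written rules cover only a few cases, which is why practical systems tend to combine part-of-speech tagging with statistical models.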
Some NLP applications
• Spelling and grammar checking
• Optical character recognition (OCR)
• Screen readers
• Machine aided translation
• Information retrieval
• Document classification
• Document clustering
• Information extraction
• Augmentative and alternative communication
Some NLP applications Cont.
• Question answering
• Summarization
• Text segmentation
• Exam marking
• Report generation
• Machine translation
• Email understanding
• Dialogue systems
Thank You