Natural Language Processing

The document discusses Natural Language Processing (NLP) including its components like Natural Language Understanding and Natural Language Generation. It describes various applications of NLP like machine translation, spam detection, question answering etc. It then discusses the different phases and steps to build an NLP pipeline including sentence segmentation, word tokenization, stemming, lemmatization, identifying stop words, dependency parsing, POS tagging, named entity recognition and chunking. It also discusses different types of ambiguities in language.

Uploaded by

baccha1556677788
Natural Language Processing (NLP)
Components of NLP

1. Natural Language Understanding (NLU)

Natural Language Understanding (NLU) helps a machine understand and analyze human language by extracting
metadata from content, such as concepts, entities, keywords, emotions, relations, and semantic roles.
NLU involves the following tasks:
o Mapping the given input into a useful representation.
o Analyzing different aspects of the language.

2. Natural Language Generation (NLG)

Natural Language Generation (NLG) acts as a translator that converts computerized data into a natural
language representation. It mainly involves text planning, sentence planning, and text realization.
Applications of NLP

• Machine Translation
• Spam Detection
• Spelling Correction
• Speech Recognition
• Chatbots
• Question Answering
• Sentiment Analysis

NLP Phases

1. Lexical Analysis
This phase scans the input text as a stream of characters and converts it into
meaningful lexemes. It divides the whole text into paragraphs, sentences, and words.

2. Syntactic Analysis (Parsing)
Syntactic analysis, also called parsing, checks grammar and word arrangement and shows the
relationships among the words.

3. Semantic Analysis
Semantic analysis is concerned with meaning representation. It mainly focuses on
the literal meaning of words, phrases, and sentences.

4. Discourse Integration
In discourse integration, the meaning of a sentence depends on the sentences that precede it and in turn
influences the meaning of the sentences that follow it.

5. Pragmatic Analysis
Pragmatic analysis is the fifth and last phase of NLP. It helps you discover the intended
effect by applying a set of rules that characterize cooperative dialogues.
How to build an NLP pipeline

Step 1: Sentence Segmentation

Sentence segmentation breaks a paragraph into separate sentences.

Example:
Independence Day is one of the important festivals for every Indian citizen. It is celebrated on the 15th of
August each year ever since India got independence from the British rule. This day celebrates
independence in the true sense.

Sentence segmentation produces the following result:

1. "Independence Day is one of the important festivals for every Indian citizen."
2. "It is celebrated on the 15th of August each year ever since India got independence from the British rule."
3. "This day celebrates independence in the true sense."
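
A minimal sketch of sentence segmentation in Python, assuming that every sentence ends with ".", "!" or "?" followed by whitespace. The function name segment_sentences is made up for this illustration, and the rule would wrongly split after abbreviations such as "Dr.", so real segmenters use more careful rules or trained models.

```python
import re

def segment_sentences(text):
    """Split text at whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

paragraph = (
    "Independence Day is one of the important festivals for every Indian citizen. "
    "It is celebrated on the 15th of August each year ever since India got "
    "independence from the British rule. "
    "This day celebrates independence in the true sense."
)

# Print the sentences as a numbered list.
for number, sentence in enumerate(segment_sentences(paragraph), start=1):
    print(number, sentence)
```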

Step 2: Word Tokenization

A word tokenizer breaks a sentence into separate words or tokens.

Example:
JavaTpoint offers Corporate Training, Summer Training, Online Training, and Winter Training.

The word tokenizer generates the following result:

"JavaTpoint", "offers", "Corporate", "Training", ",", "Summer", "Training", ",", "Online", "Training", ",", "and",
"Winter", "Training", "."
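
Word tokenization can be sketched with a single regular expression. This is an illustrative approach, not the tokenizer of any particular library: the hypothetical tokenize function treats each run of letters or digits as one token and each punctuation mark, commas included, as its own token.

```python
import re

def tokenize(sentence):
    # \w+ matches a run of letters, digits or underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize(
    "JavaTpoint offers Corporate Training, Summer Training, "
    "Online Training, and Winter Training."
)
print(tokens)
```

A rule this simple splits contractions such as "don't" into three tokens, which is one reason production tokenizers add language-specific rules.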

Step 3: Stemming

Stemming normalizes words into their base or root form. For example, celebrates, celebrated, and
celebrating all originate from the single root word "celebrate." The big problem with stemming is
that it sometimes produces a root word that has no meaning.

For example, intelligence, intelligent, and intelligently all reduce to the single root
"intelligen." In English, the word "intelligen" does not have any meaning.
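
The idea behind stemming can be sketched as plain suffix stripping. The suffix list and the naive_stem function are assumptions made for this illustration; established stemmers such as the Porter stemmer apply a much richer set of rules, so the exact stems differ from one stemmer to another.

```python
# Hypothetical suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "ly", "s")

def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word

stems = [naive_stem(w) for w in ("celebrates", "celebrated", "celebrating")]
print(stems)  # ['celebrat', 'celebrat', 'celebrat']
```

All three forms collapse to the same stem, but "celebrat" is not an English word, which is exactly the weakness of stemming described above.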

Step 4: Lemmatization

Lemmatization is quite similar to stemming. It groups the different inflected forms of a word under a single base
form, called the lemma. The main difference between stemming and lemmatization is that lemmatization produces
a root word that has a meaning.

For example, in lemmatization, the words intelligence, intelligent, and intelligently have the root word intelligent,
which has a meaning.
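
In its simplest form, lemmatization is a dictionary lookup. The sketch below assumes a tiny hand-written table (LEMMAS is hypothetical, and its "intelligent" entries simply follow the example above); real lemmatizers rely on large lexicons and usually on the part of speech as well.

```python
# Hypothetical lookup table from inflected form to lemma.
LEMMAS = {
    "celebrates": "celebrate",
    "celebrated": "celebrate",
    "celebrating": "celebrate",
    "intelligence": "intelligent",
    "intelligently": "intelligent",
}

def lemmatize(word):
    """Return the lemma from the table, or the lowercased word if it is unknown."""
    key = word.lower()
    return LEMMAS.get(key, key)

print([lemmatize(w) for w in ("celebrates", "Celebrated", "celebrating")])
# ['celebrate', 'celebrate', 'celebrate']
```

Unlike a stem, every value returned from the table is a real word.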

Step 5: Identifying Stop Words

In English, there are a lot of words that appear very frequently like "is", "and", "the", and "a". NLP pipelines will
flag these words as stop words. Stop words might be filtered out before doing any statistical analysis.

Example: He is a good boy.
In this sentence, "is" and "a" are stop words.
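
Filtering stop words amounts to a set membership test. The sketch below assumes a stop list containing only the four words named above; practical stop lists are much longer and depend on the task.

```python
# Only the four example stop words from the text; real lists are longer.
STOP_WORDS = {"is", "and", "the", "a"}

def remove_stop_words(tokens):
    """Keep the tokens whose lowercase form is not in the stop list."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]

print(remove_stop_words(["He", "is", "a", "good", "boy", "."]))
# ['He', 'good', 'boy', '.']
```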



Step 6: Dependency Parsing


Dependency parsing finds how all the words in a sentence are related to each other.
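
The result of dependency parsing is usually stored as a set of arcs from a head word to a dependent word. The sketch below does not parse anything: the arcs for "He is a good boy" are annotated by hand, following Universal Dependencies conventions, only to show one possible data structure. The names ARCS and dependents_by_head are made up for this illustration.

```python
from collections import defaultdict

# Hand-annotated arcs (head, relation, dependent) for "He is a good boy".
# This is not the output of a parser.
ARCS = [
    ("ROOT", "root", "boy"),
    ("boy", "nsubj", "He"),
    ("boy", "cop", "is"),
    ("boy", "det", "a"),
    ("boy", "amod", "good"),
]

def dependents_by_head(arcs):
    """Group (relation, dependent) pairs under their head word."""
    tree = defaultdict(list)
    for head, relation, dependent in arcs:
        tree[head].append((relation, dependent))
    return dict(tree)

tree = dependents_by_head(ARCS)
print(tree["boy"])
```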

Step 7: POS Tags

POS stands for parts of speech, which include noun, verb, adverb, and adjective. A POS tag indicates how a word
functions, both in meaning and grammatically, within a sentence. A word can have one or more parts of
speech depending on the context in which it is used.
Example: "Google" something on the Internet.
In the above example, Google is used as a verb, although it is a proper noun.
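
The dependence of a POS tag on context can be sketched with a toy rule that looks only at the previous word. The cue lists and the guess_tag function are assumptions made for this illustration; real taggers are trained on annotated text and consider far more context.

```python
DETERMINERS = {"a", "an", "the"}             # the next word is likely a noun
VERB_CUES = {"to", "will", "can", "please"}  # the next word is likely a verb

def guess_tag(tokens, index):
    """Guess NOUN or VERB for tokens[index] from the previous token only."""
    previous = tokens[index - 1].lower() if index > 0 else ""
    if previous in DETERMINERS:
        return "NOUN"
    if previous in VERB_CUES:
        return "VERB"
    return "UNKNOWN"

print(guess_tag(["Please", "google", "it"], 1))                # VERB
print(guess_tag(["We", "won", "the", "match"], 3))             # NOUN
print(guess_tag(["Try", "to", "match", "the", "colors"], 2))   # VERB
```

The same word, "match", receives two different tags because its neighbors differ.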

Step 8: Named Entity Recognition (NER)


Named Entity Recognition (NER) is the process of detecting named entities such as a person name, movie name,
organization name, or location.
Example: Steve Jobs introduced iPhone at the Macworld Conference in San Francisco, California.
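
One simple way to approximate NER is a gazetteer, that is, a lookup list of known names. The sketch below assumes a hypothetical hand-made gazetteer with illustrative labels; statistical and neural NER systems instead learn to recognize names they have never seen.

```python
# Hypothetical gazetteer; the labels are illustrative.
GAZETTEER = {
    "Steve Jobs": "PERSON",
    "iPhone": "PRODUCT",
    "Macworld Conference": "EVENT",
    "San Francisco": "LOCATION",
    "California": "LOCATION",
}

def find_entities(text):
    """Return (name, label) pairs for the first occurrence of each known name,
    ordered by position in the text."""
    found = []
    for name, label in GAZETTEER.items():
        start = text.find(name)
        if start != -1:
            found.append((start, name, label))
    return [(name, label) for _, name, label in sorted(found)]

sentence = ("Steve Jobs introduced iPhone at the Macworld Conference "
            "in San Francisco, California.")
for name, label in find_entities(sentence):
    print(name, label)
```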

Step 9: Chunking

Chunking collects individual pieces of information and groups them into larger pieces of a
sentence, such as phrases.
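
A rough sketch of chunking, assuming the tokens already carry POS tags. The rule used here (a maximal run of determiner, adjective, and noun tags that contains at least one noun forms a noun phrase) and the hand-assigned tags are simplifications for illustration only.

```python
NP_TAGS = {"DET", "ADJ", "NOUN"}  # tags allowed inside a noun-phrase chunk

def noun_chunks(tagged_tokens):
    """Group maximal runs of DET/ADJ/NOUN tokens that contain a noun."""
    chunks, current = [], []
    # The sentinel pair at the end flushes the last run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag in NP_TAGS:
            current.append((word, tag))
        else:
            if any(t == "NOUN" for _, t in current):
                chunks.append(" ".join(w for w, _ in current))
            current = []
    return chunks

# Hand-tagged tokens, not the output of a tagger.
tagged = [("He", "PRON"), ("is", "AUX"), ("a", "DET"), ("good", "ADJ"), ("boy", "NOUN")]
print(noun_chunks(tagged))  # ['a good boy']
```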
Ambiguity

Lexical Ambiguity
Lexical ambiguity exists when a single word has two or more possible meanings.
Example: Manya is looking for a match.
Here "match" may mean a partner or a game.

Syntactic Ambiguity
Syntactic ambiguity exists when a sentence has two or more possible meanings because of its structure.
Example: I saw the girl with the binoculars.
Either the speaker used the binoculars, or the girl had them.

Referential Ambiguity
Referential ambiguity exists when it is unclear what a pronoun refers to.
Example: Kiran went to Sunita. She said, "I am hungry."
Here "She" may refer to either Kiran or Sunita.
