
A10 NLP Exp 1


Terna Engineering College

Computer Engineering Department

Class: BE Sem.: VII

Course: Natural Language Processing[NLP]


PART A
(PART A: TO BE REFERRED TO BY STUDENTS)

Experiment No.01
A.1 Aim:
Perform and analyse word analysis and word generation to study morphology using the Virtual
Lab.

A.2 Prerequisite: Python/R

A.3 Outcome:
After successfully completing the experiment students will be able to
understand word analysis and word generation models using R/Python
A.4 Theory:

Word Analysis

A word can be simple or complex. For example, the word 'cat' is simple because one
cannot decompose it further into smaller parts. On the other hand, the word 'cats' is
complex, because it is made up of two parts: the root 'cat' and the plural suffix '-s'.

Analysis of a word into root and affix(es) is called morphological analysis of the word.
Identifying the root of a word is essential for many natural language processing tasks. A
root word can have various forms. For example, the word 'play' in English has the following
forms: 'play', 'plays', 'played' and 'playing'. Hindi shows a larger number of forms for the
word 'खेल' (khela), which is equivalent to 'play'. The forms of 'खेल' (khela) are the
following:
खेल(khela), खेला(khelaa), खेली(khelii), खेलूँगा(kheluungaa), खेलूँगी(kheluungii),
खेलेगा(khelegaa), खेलेगी(khelegii), खेलते(khelate), खेलती(khelatii), खेलने(khelane),
खेलकर(khelakar)

For the Telugu root ఆడడం (Adadam), the forms are the following:

Adutaanu, AdutunnAnu, Adenu, Ademu, AdevA, AdutAru, Adutunnaru, AdadAniki,
Adesariki, AdanA, Adinxi, Adutunxi, AdinxA, AdeserA, Adestunnaru, ...

Thus we understand that morphological richness varies from one language to another.
Indian languages are generally morphologically rich, and therefore morphological
analysis of words becomes a very significant task for Indian languages.
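
The root-plus-suffix decomposition described above can be sketched in a few lines of Python. This is only an illustration, assuming a tiny hand-written root list and suffix list; the names ROOTS, SUFFIXES and split_word are hypothetical and are not part of the Virtual Lab. A real analyser for Hindi or Telugu would need a far larger lexicon and spelling rules.

```python
# Hypothetical toy data: a few English roots and suffixes taken from the examples above.
ROOTS = {"cat", "play"}
SUFFIXES = ["ing", "ed", "s"]


def split_word(word):
    """Return (root, suffix) if the word is a known root or root + suffix, else None."""
    if word in ROOTS:
        return (word, "")          # simple word: nothing to strip
    for suffix in SUFFIXES:
        root = word[:-len(suffix)]
        if word.endswith(suffix) and root in ROOTS:
            return (root, "-" + suffix)
    return None                    # not analysable with this toy lexicon


for w in ["cat", "cats", "play", "plays", "played", "playing"]:
    print(w, "->", split_word(w))
```

Words outside the toy lists simply return None, which makes the limits of a lexicon-based analyser easy to see.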

Types of Morphology

Morphology is of two types:

1. Inflectional morphology

Deals with word forms of a root, where there is no change in lexical category. For
example, 'played' is an inflection of the root word 'play'. Here, both 'played' and 'play'
are verbs.

2. Derivational morphology
Deals with word forms of a root, where there is a change in the lexical category. For
example, the word form 'happiness' is a derivation of the word 'happy'. Here,
'happiness' is a derived noun form of the adjective 'happy'.
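
The distinction between the two types comes down to whether the lexical category of the word form differs from that of its root. A minimal sketch of that test follows, assuming a small hand-built lexicon; LEXICON and morphology_type are hypothetical names used only for illustration.

```python
# Hypothetical lexicon: word form -> (root, lexical category of the form).
LEXICON = {
    "play": ("play", "verb"),
    "played": ("play", "verb"),
    "happy": ("happy", "adjective"),
    "happiness": ("happy", "noun"),
}


def morphology_type(form):
    """Label a form as 'root', 'inflectional' or 'derivational'."""
    root, form_category = LEXICON[form]
    if form == root:
        return "root"
    root_category = LEXICON[root][1]
    # Same category as the root: inflection. Different category: derivation.
    return "inflectional" if form_category == root_category else "derivational"


for w in ["played", "happiness"]:
    print(w, "->", morphology_type(w))
```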

Morphological Features:

All words will have their lexical category attested during morphological analysis.
Nouns and pronouns can take suffixes that mark the following features: gender, number,
person, case.

For example, morphological analysis of a few words is given below:

Language | input: word | output: analysis
Hindi | लड़के (ladake) | rt=लड़का(ladakaa), cat=n, gen=m, num=sg, case=obl
Hindi | लड़के (ladake) | rt=लड़का(ladakaa), cat=n, gen=m, num=pl, case=dir
Hindi | लड़कों (ladakoM) | rt=लड़का(ladakaa), cat=n, gen=m, num=pl, case=obl
English | boy | rt=boy, cat=n, gen=m, num=sg
English | boys | rt=boy, cat=n, gen=m, num=pl


A verb can take suffixes of the following features: tense, aspect, modality, gender, number, person

Language | input: word | output: analysis
Hindi | हँसी (hansii) | rt=हँस(hans), cat=v, gen=fem, num=sg/pl, per=1/2/3, tense=past, aspect=pft
English | toys | rt=toy, cat=n, num=pl, per=3

'rt' stands for root. 'cat' stands for lexical category. The value of lexical category
can be noun, verb, adjective, pronoun, adverb, preposition.

'gen' stands for gender. The value of gender can be masculine or feminine.

'num' stands for number. The value of number can be singular (sg) or plural (pl).

'per' stands for person. The value of person can be 1, 2 or 3


The value of tense can be present, past or future. This feature is applicable for verbs.

The value of aspect can be perfect (pft), continuous (cont) or habitual (hab). This
feature is applicable for verbs.

'case' can be direct or oblique. This feature is applicable for nouns. A case is an oblique
case when a postposition occurs after the noun. If no postposition can occur after the noun,
then the case is a direct case. This is applicable for Hindi but not English, as English
doesn't have any postpositions. Some of the postpositions in Hindi are: का(kaa), की(kii),
के(ke), को(ko), में(meM)

Refer to the code below for word analysis and try it on at least 10 other English words.
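
A minimal sketch of such a word analysis for English is given here, producing output in the same rt/cat/num notation as the tables above. It assumes a tiny hand-written lexicon and only the suffixes -s and -ed; ROOT_CATEGORY, analyse and show are hypothetical names, and this is not the Virtual Lab's own analyser. To try other words, add their roots to the lexicon.

```python
# Hypothetical mini-lexicon (root -> lexical category); extend it to try more words.
ROOT_CATEGORY = {"boy": "n", "toy": "n", "play": "v"}


def analyse(word):
    """Return a list of feature dictionaries, one per analysis found."""
    analyses = []
    # Case 1: the word is itself a root.
    if word in ROOT_CATEGORY:
        if ROOT_CATEGORY[word] == "n":
            analyses.append({"rt": word, "cat": "n", "num": "sg"})
        else:
            analyses.append({"rt": word, "cat": "v"})
    # Case 2: root + "-s" (plural noun, or third-person singular present verb).
    if word.endswith("s") and word[:-1] in ROOT_CATEGORY:
        root = word[:-1]
        if ROOT_CATEGORY[root] == "n":
            analyses.append({"rt": root, "cat": "n", "num": "pl"})
        else:
            analyses.append({"rt": root, "cat": "v", "num": "sg",
                             "per": "3", "tense": "pr"})
    # Case 3: root + "-ed" (past-tense verb).
    if word.endswith("ed") and ROOT_CATEGORY.get(word[:-2]) == "v":
        analyses.append({"rt": word[:-2], "cat": "v", "tense": "past"})
    return analyses


def show(analysis):
    """Format one analysis in the 'rt=..., cat=...' style used in this handout."""
    return ", ".join(f"{key}={value}" for key, value in analysis.items())


for w in ["boy", "boys", "toys", "plays", "played"]:
    for a in analyse(w):
        print(w, "->", show(a))
```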

Word Generation

Given the root and suffix information, a word can be generated. For example,

Language | input: analysis | output: word
Hindi | rt=लड़का(ladakaa), cat=n, gen=m, num=sg, case=obl | लड़के (ladake)
Hindi | rt=लड़का(ladakaa), cat=n, gen=m, num=pl, case=dir | लड़के (ladake)
English | rt=boy, cat=n, num=pl | boys
English | rt=play, cat=v, num=sg, per=3, tense=pr | plays


- Morphological analysis and generation: Inverse processes.
- Analysis may involve non-determinism, since more than one analysis is possible.
- Generation is a deterministic process. If a language allows spelling variation,
then to that extent generation would also involve non-determinism.
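
The inverse relationship between the two processes can be illustrated with the Hindi forms from the tables in this section. In the sketch below, generation is a plain lookup from a feature bundle to one word, and analysis is the inverted lookup, which may return several feature bundles for the same word. The names GENERATE and ANALYSE are hypothetical, and a lookup table is only a stand-in for a real rule-based generator.

```python
from collections import defaultdict

# Surface forms and root copied from the tables in this section.
LADAKAA = "लड़का"   # root (ladakaa)
LADAKE = "लड़के"    # ladake
LADAKOM = "लड़कों"  # ladakoM

# Generation: (root, cat, gen, num, case) -> exactly one word (deterministic).
GENERATE = {
    (LADAKAA, "n", "m", "sg", "obl"): LADAKE,
    (LADAKAA, "n", "m", "pl", "dir"): LADAKE,
    (LADAKAA, "n", "m", "pl", "obl"): LADAKOM,
}

# Analysis: invert the table; one word may map to several feature bundles.
ANALYSE = defaultdict(list)
for features, word in GENERATE.items():
    ANALYSE[word].append(features)

print("generate:", GENERATE[(LADAKAA, "n", "m", "sg", "obl")])
for features in ANALYSE[LADAKE]:
    print("analyse:", LADAKE, "->", features)
```

Here the word ladake receives two analyses (singular oblique and plural direct), while each feature bundle generates a single word.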
PART B

(PART B: TO BE COMPLETED BY STUDENTS)

(Students must submit the soft copy as per the following segments within two hours of the
practical. The soft copy must be uploaded on the ERP, or emailed to the concerned lab
in-charge faculty at the end of the practical in case there is no ERP access available.)
Roll. No. 34 Name: Ganesh Deepak Sanap

Class: BE-A Batch: A2

Date of Experiment: 12/06/2024 Date of Submission:12/06/2024


Grade:
B.1 Software Code written by student:

(THE DOCUMENT EXECUTED DURING THE 2-HOUR LAB SESSION HAS TO BE PASTED)

B.2 Input and Output:

(THE DOCUMENT EXECUTED DURING THE 2-HOUR LAB SESSION HAS TO BE PASTED)


B.3 Observations and learning:

(WHAT THE STUDENT HAS UNDERSTOOD DURING THE 2-HOUR LAB SESSION HAS TO BE PASTED)

Using a Virtual Lab for word analysis and generation helps to understand word construction
by breaking down words into morphemes and identifying patterns in their formation. This
process also allows for the generation of new words following morphological rules,
demonstrating the systematic nature of language structure.

B.4 Conclusion:
(WHAT THE STUDENT HAS UNDERSTOOD DURING THE 2-HOUR LAB SESSION HAS TO BE PASTED)

Performed and analysed word analysis and word generation to study morphology using the
Virtual Lab.
B.5 Question of Curiosity
1. Choose a typical masculine noun, ending in ’A’, from your language. Write down its
various forms along with various features and their values associated with them.
The Spanish masculine noun "problema" (problem), despite ending in 'a', has the
following forms:

1. Singular: problema
- Features: Masculine, Singular

2. Plural: problemas
- Features: Masculine, Plural

Example sentences:
- Singular: El problema es difícil. (The problem is difficult.)
- Plural: Los problemas son difíciles. (The problems are difficult.)

2. English has a suffix –en whose use is illustrated in the following lists:

In regard to these data, answer the following questions:


A. What part of speech does the suffix –en attach to? That is, what is the part of
speech of the words in list A?

B. When the suffix –en is attached to a word, what part of speech is the resulting
word? Give some specific morphological properties of one of the words in list B, in
order to justify your answer.

A. Part of Speech of Words in List A


The suffix -en attaches to adjectives. The words in List A (e.g., red, mad, soft, wide,
sharp) are all adjectives.
B. Part of Speech of Resulting Words
When the suffix -en is attached to an adjective, the resulting word is a verb.

Example Word from List B: Soften

Specific Morphological Properties:


- Base Word (Adjective): soft
- Resulting Word (Verb): soften
Justification:
1. Inflectional Forms: The verb "soften" can take on various verb forms:
- Present: soften
- Past: softened
- Present participle: softening

2. Syntax: As a verb, "soften" is used to indicate an action or process. For example:


- "The fabric will soften after washing."

This demonstrates that the suffix -en converts adjectives into verbs, with the
resulting verb indicating the action of making something become more like the
adjective.
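
The adjective-to-verb derivation described above can be sketched as a small function. The two spelling adjustments in it (a final 'e' takes only 'n'; a final consonant after a single vowel is doubled) are assumptions chosen to cover the List A adjectives named in the answer, and add_en is a hypothetical name. The suffix does not attach to every English adjective, so this is not a general rule.

```python
VOWELS = set("aeiou")


def add_en(adjective):
    """Attach the verb-forming suffix -en with two simple spelling adjustments."""
    if adjective.endswith("e"):
        return adjective + "n"                      # wide -> widen
    if (len(adjective) >= 3
            and adjective[-1] not in VOWELS
            and adjective[-2] in VOWELS
            and adjective[-3] not in VOWELS):
        return adjective + adjective[-1] + "en"     # red -> redden
    return adjective + "en"                         # soft -> soften


for adj in ["red", "mad", "soft", "wide", "sharp"]:
    print(adj, "(adjective) ->", add_en(adj), "(verb)")
```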

3. Take one verb from your mother tongue, gloss it (i.e., give the English meaning) and
conjugate it in all tenses, aspects and persons.
Ans:-
Present Tense (Simple Present)

● 1st Person Singular: मी करतो/करते (mī kartō/kartē) – I do
● 2nd Person Singular: तू करतो/करते (tū kartō/kartē) – You do (informal)
● 3rd Person Singular: तो/ती/ते करतो/करते/करते (tō/tī/tē kartō/kartē/kartē) – He/She/It does
● 1st Person Plural: आम्ही करतो/करतो (āṁmhī kartō/kartō) – We do
● 2nd Person Plural: तुम्ही करतात (tumhī kartāt) – You do (formal/plural)
● 3rd Person Plural: ते करतात (tē kartāt) – They do

Past Tense (Simple Past)

● 1st Person Singular: मी केले (mī kēlē) – I did
● 2nd Person Singular: तू केले (tū kēlē) – You did (informal)
● 3rd Person Singular: तो/ती/ते केले (tō/tī/tē kēlē) – He/She/It did
● 1st Person Plural: आम्ही केले (āṁmhī kēlē) – We did
● 2nd Person Plural: तुम्ही केले (tumhī kēlē) – You did (formal/plural)
● 3rd Person Plural: ते केले (tē kēlē) – They did

Future Tense (Simple Future)

● 1st Person Singular: मी करू शकतो/शकते (mī karū śakatō/śakatē) – I will do
● 2nd Person Singular: तू करू शकतो/शकते (tū karū śakatō/śakatē) – You will do (informal)
● 3rd Person Singular: तो/ती/ते करू शकतो/शकते/शकते (tō/tī/tē karū śakatō/śakatē/śakatē) – He/She/It will do
● 1st Person Plural: आम्ही करू शकतो (āṁmhī karū śakatō) – We will do
● 2nd Person Plural: तुम्ही करू शकता (tumhī karū śakatā) – You will do (formal/plural)
● 3rd Person Plural: ते करू शकतात (tē karū śakatāt) – They will do

Perfect Tense (Present Perfect)

● 1st Person Singular: मी केले आहे (mī kēlē āhē) – I have done
● 2nd Person Singular: तू केले आहेस (tū kēlē āhēs) – You have done (informal)
● 3rd Person Singular: तो/ती/ते केले आहे (tō/tī/tē kēlē āhē) – He/She/It has done
● 1st Person Plural: आम्ही केले आहे (āṁmhī kēlē āhē) – We have done
● 2nd Person Plural: तुम्ही केले आहात (tumhī kēlē āhāt) – You have done (formal/plural)
● 3rd Person Plural: ते केले आहेत (tē kēlē āhēta) – They have done

Continuous Tense (Present Continuous)

● 1st Person Singular: मी करत आहे (mī karat āhē) – I am doing
● 2nd Person Singular: तू करत आहेस (tū karat āhēs) – You are doing (informal)
● 3rd Person Singular: तो/ती/ते करत आहे (tō/tī/tē karat āhē) – He/She/It is doing
● 1st Person Plural: आम्ही करत आहोत (āṁmhī karat āhōta) – We are doing
● 2nd Person Plural: तुम्ही करत आहात (tumhī karat āhāt) – You are doing (formal/plural)
● 3rd Person Plural: ते करत आहेत (tē karat āhēta) – They are doing

Future Continuous Tense


● 1st Person Singular: मी करत असेन (mī karat asēna) – I will be doing
● 2nd Person Singular: तू करत असशील (tū karat asśīla) – You will be doing (informal)
● 3rd Person Singular: तो/ती/ते करत असेल (tō/tī/tē karat asēla) – He/She/It will be doing
● 1st Person Plural: आम्ही करत असू (āṁmhī karat asū) – We will be doing
● 2nd Person Plural: तुम्ही करत असाल (tumhī karat asāla) – You will be doing (formal/plural)
● 3rd Person Plural: ते करत असतील (tē karat asatīla) – They will be doing

Perfect Continuous Tense (Present Perfect Continuous)

● 1st Person Singular: मी करत आलो आहे (mī karat ālō āhē) – I have been doing
● 2nd Person Singular: तू करत आला आहेस (tū karat ālā āhēs) – You have been doing (informal)
● 3rd Person Singular: तो/ती/ते करत आले आहे (tō/tī/tē karat ālē āhē) – He/She/It has been doing
● 1st Person Plural: आम्ही करत आलो आहे (āṁmhī karat ālō āhē) – We have been doing
● 2nd Person Plural: तुम्ही करत आला आहात (tumhī karat ālā āhāt) – You have been doing (formal/plural)
● 3rd Person Plural: ते करत आले आहेत (tē karat ālē āhēta) – They have been doing

4. Refer to the following data and answer the question below:

List 1: taller, shorter, higher, lower, smarter

List 2: mower, teacher, sailor, caller, operator

List 3: never, cover, finger, river

Do the words ending with ‘er’/‘or’ have some common features?
Common features are
Part of Speech:

● List 1: These words are adjectives (e.g., "taller," "shorter," "higher," "lower," "smarter").
● List 2: These words are nouns (e.g., "mower," "teacher," "sailor," "caller," "operator").
● List 3: These words do not share one part of speech: "never" is an adverb, "cover" is a
verb or a noun, and "finger" and "river" are nouns.

Function and Meaning:

● List 1: The suffix ‘-er’ is used to form the comparative degree of adjectives. It compares the
quality of nouns in terms of higher, lower, greater, etc.
● List 2: The suffixes ‘-er’ and ‘-or’ are used to form agent nouns. These are nouns that refer
to people or things that perform an action or function (e.g., "mower" (someone who
mows), "teacher" (someone who teaches), "sailor" (someone who sails)).
● List 3: The ending ‘er’ in these words is not a suffix. It is part of the root itself, so
the words cannot be split into a smaller root plus ‘-er’, and the ending indicates neither an
agent nor a comparative degree.

Formation and Usage:

● List 1: The ‘-er’ suffix is used with adjectives to create comparative forms (e.g., "taller" means
more tall).
● List 2: The ‘-er’ and ‘-or’ suffixes are used to create agent nouns from verbs or other bases,
indicating someone or something that performs an action (e.g., "teacher" from "teach").
● List 3: The ‘er’ ending belongs to the base form and adds no agent or comparative meaning
(e.g., "river" is not built from a smaller word; it simply names a natural feature).
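
The three-way distinction above can be checked mechanically: strip the ending and see whether what remains is a known adjective, a known verb, or not a word at all. The sketch below assumes a small hand-written table of bases; BASES and classify are hypothetical names, and the table covers only the words in the three lists.

```python
# Hypothetical mini-lexicon of base words and their lexical categories.
BASES = {
    "tall": "adjective", "short": "adjective", "high": "adjective",
    "low": "adjective", "smart": "adjective",
    "mow": "verb", "teach": "verb", "sail": "verb",
    "call": "verb", "operate": "verb",
}


def classify(word):
    """Classify a word ending in -er/-or by what is left after stripping the ending."""
    if not word.endswith(("er", "or")):
        return "no -er/-or ending"
    stem = word[:-2]
    for base in (stem, stem + "e"):        # 'operat' + 'e' -> 'operate'
        category = BASES.get(base)
        if category == "adjective":
            return f"comparative of '{base}'"
        if category == "verb":
            return f"agent noun from '{base}'"
    return "single morpheme (ending is part of the root)"


for w in ["taller", "lower", "teacher", "sailor", "operator", "never", "river"]:
    print(w, "->", classify(w))
```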

5. Identify root and suffix in the following words:


kissed
stronger
goodness
teacher
achievement

kissed
● Root: kiss
● Suffix: -ed

stronger
● Root: strong
● Suffix: -er

goodness
● Root: good
● Suffix: -ness

teacher
● Root: teach
● Suffix: -er

achievement
● Root: achieve
● Suffix: -ment

6. Generate words for the following features:


English:
root: boy, category: noun, number: singular
root: child, category: noun, number: plural
root: play, category: verb, gender: male, number: singular, person: first, tense: simple-present
root: play, category: verb, gender: male, number: singular, person: third, tense: simple-present

Write Python/Java code for 10 English words and perform word analysis.
Ans:-

import nltk
from nltk.corpus import wordnet
# Download WordNet data if you haven't already
nltk.download('wordnet')
nltk.download('omw-1.4')
# Define words based on features
words = {
    "boy": {
        "root": "boy",
        "category": "noun",
        "number": "singular"
    },
    "children": {
        "root": "child",
        "category": "noun",
        "number": "plural"
    },
    "plays": {
        "root": "play",
        "category": "verb",
        "gender": "male",
        "number": "singular",
        "person": "third",
        "tense": "simple-present"
    },
    "play": {
        "root": "play",
        "category": "verb",
        "gender": "male",
        "number": "singular",
        "person": "first",
        "tense": "simple-present"
    }
}

# Function to perform basic word analysis using WordNet synsets
def analyze_word(word):
    synsets = wordnet.synsets(word)
    if not synsets:
        return "No information found for this word."

    # Collect one entry per synset (word sense)
    analysis = []
    for synset in synsets:
        analysis.append({
            "Word": word,
            "Definition": synset.definition(),
            "Examples": synset.examples(),
            "Synonyms": synset.lemma_names(),
            "Antonyms": [antonym.name()
                         for lemma in synset.lemmas()
                         for antonym in lemma.antonyms()]
        })
    return analysis


# Generate and analyze words
for word, features in words.items():
    print(f"Features for word '{word}': {features}")
    print("Word Analysis:")
    analysis = analyze_word(word)
    if isinstance(analysis, list):
        for item in analysis:
            print(f"Definition: {item['Definition']}")
            print(f"Examples: {item['Examples']}")
            print(f"Synonyms: {item['Synonyms']}")
            print(f"Antonyms: {item['Antonyms']}")
    else:
        print(analysis)
    print("\n" + "=" * 40 + "\n")
