A10 NLP Exp 1
Experiment No.01
A.1 Aim:
Perform and analyse word analysis and word generation to study morphology using the
Virtual Lab
A.3 Outcome:
After successfully completing the experiment, students will be able to
understand word analysis and word generation models using R/Python.
A.4 Theory:
Word Analysis
A word can be simple or complex. For example, the word 'cat' is simple because one
cannot further decompose the word into smaller parts. On the other hand, the word
'cats' is complex, because the word is made up of two parts: root 'cat' and plural
suffix '-s'.
Analysis of a word into root and affix(es) is called morphological analysis of a word.
Identifying the root of a word is an important step in many natural language processing
tasks. A root word can have various forms. For example, the word 'play' in English has
the following forms: 'play', 'plays', 'played' and 'playing'. Hindi has a larger number
of forms for the word 'खेल' (khela), which is equivalent to 'play'. The forms of
'खेल' (khela) are the following:
खेल (khela), खेला (khelaa), खेली (khelii), खेलूंगा (kheluungaa), खेलूंगी (kheluungii),
खेलकर (khelakar)
For the Telugu root ఆడడం (Adadam), the forms are the following:
Thus we understand that morphological richness varies from one language to another.
Indian languages are generally morphologically rich, and therefore morphological
analysis of words becomes a very significant task for Indian languages.
Types of Morphology
1. Inflectional morphology
Deals with word forms of a root, where there is no change in lexical category. For
example, 'played' is an inflection of the root word 'play'. Here, both 'played' and 'play'
are verbs.
2. Derivational morphology
Deals with word forms of a root, where there is a change in the lexical category. For
example, the word form 'happiness' is a derivation of the word 'happy'. Here,
'happiness' is a derived noun form of the adjective 'happy'.
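The distinction between the two types can be sketched in a few lines of Python. This is only an illustration of the simplified definition given above (same lexical category means inflection, changed category means derivation); the function name `morphology_type` is an assumption made for this example and is not part of the Virtual Lab.

```python
def morphology_type(root_category, form_category):
    """Classify a root -> word-form relation by comparing lexical categories."""
    if root_category == form_category:
        return "inflectional"
    return "derivational"


# (root, category of root, word form, category of word form)
examples = [
    ("play", "verb", "played", "verb"),
    ("happy", "adjective", "happiness", "noun"),
]

for root, root_cat, form, form_cat in examples:
    print(f"{root} -> {form}: {morphology_type(root_cat, form_cat)}")
```

The rule is deliberately minimal: it only compares the two categories and does not inspect the affix itself.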
Morphological Features:
All words will have their lexical category attested during morphological analysis.
Nouns and pronouns can take suffixes marking the following features: gender, number,
person, case
Language   input: word      output: analysis
Hindi      हँसी (hansii)     rt=हँस (hans), cat=v, gen=fem, num=sg/pl, per=1/2/3, tense=past, aspect=pft
English    toys             rt=toy, cat=n, num=pl, per=3
'rt' stands for root. 'cat' stands for lexical category. The value of lexical category
can be noun, verb, adjective, pronoun, adverb or preposition. 'gen' stands for
gender. The value of gender can be masculine or feminine.
'num' stands for number. The value of number can be singular (sg) or plural (pl).
'per' stands for person. The value of person can be 1, 2 or 3.
The value of aspect can be perfect (pft), continuous (cont) or habitual (hab). This
feature is applicable to verbs.
'case' can be direct or oblique. This feature is applicable to nouns. A case is an oblique
case when a postposition occurs after the noun. If no postposition can occur after the
noun, then the case is a direct case. This applies to Hindi but not to English, which has
no postpositions. Some of the postpositions in Hindi are: का (kaa), की (kii), के (ke),
को (ko), में (meM)
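An analysis string such as 'rt=toy, cat=n, num=pl, per=3' is easier to work with in a program once it is turned into a dictionary of features. The sketch below shows one possible way to do this; the function name `parse_analysis` is hypothetical, and storing every value as a list (so that 'sg/pl' becomes two alternatives) is a design assumption made for this example.

```python
def parse_analysis(analysis):
    """Turn 'rt=toy, cat=n, num=pl, per=3' into a feature dictionary."""
    features = {}
    for pair in analysis.split(","):
        # Split each 'key=value' pair at the first '=' sign.
        key, _, value = pair.strip().partition("=")
        # A value such as 'sg/pl' lists alternatives; keep them as a list.
        features[key] = value.split("/")
    return features


toys = parse_analysis("rt=toy, cat=n, num=pl, per=3")
print(toys["rt"], toys["cat"], toys["num"], toys["per"])
```

Keeping every value as a list means ambiguous analyses such as num=sg/pl and unambiguous ones such as num=pl can be handled by the same code.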
Refer to the code below for word analysis and try it on at least 10 other English words.
Word Generation
Given the root and suffix information, a word can be generated. For example,
Language   input: analysis   output: word
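Word generation can be sketched as a function from a root and its features to a surface form. The code below is a minimal illustration for regular English nouns and verbs only. The function name `generate` and its rules are assumptions made for this example, and irregular forms (child -> children) or spelling changes (try -> tries) are not handled.

```python
# Endings after which English adds '-es' instead of '-s'.
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def add_s(root):
    """Attach the regular '-s' / '-es' suffix."""
    return root + ("es" if root.endswith(SIBILANT_ENDINGS) else "s")


def generate(root, category, number=None, tense=None, person=None):
    """Generate a regular English word form from a root and its features."""
    if category == "noun" and number == "plural":
        return add_s(root)
    if category == "verb":
        if tense == "past":
            return root + ("d" if root.endswith("e") else "ed")
        if tense == "simple-present" and person == "third" and number == "singular":
            return add_s(root)
    # No applicable rule: the word form is the bare root.
    return root


print(generate("toy", "noun", number="plural"))
print(generate("play", "verb", tense="past"))
print(generate("play", "verb", number="singular", person="third", tense="simple-present"))
```

The feature names (category, number, person, tense) follow the ones used in the word-analysis listing at the end of this document.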
(PART B: TO BE COMPLETED BY STUDENTS)
(Students must submit the soft copy as per the following segments within two hours of the
practical. The soft copy must be uploaded on the ERP or, if no ERP access is available,
emailed to the concerned lab in-charge faculty at the end of the practical.)
Roll. No. 34 Name: Ganesh Deepak Sanap
Using a Virtual Lab for word analysis and generation helps to understand word construction
by breaking down words into morphemes and identifying patterns in their formation. This
process also allows for the generation of new words following morphological rules,
demonstrating the systematic nature of language structure.
B.4 Conclusion:
(WHAT THE STUDENT HAS UNDERSTOOD DURING THE 2-HOUR LAB SESSION HAS TO BE PASTED HERE)
Word analysis and word generation were performed and analysed to study morphology using
the Virtual Lab.
B.5 Question of Curiosity
1. Choose a typical masculine noun, ending in ’A’, from your language. Write down its
various forms along with various features and their values associated with them.
The Spanish masculine noun "problema" (problem), despite ending in 'a', has the
following forms:
1. Singular: problema
- Features: Masculine, Singular
2. Plural: problemas
- Features: Masculine, Plural
Example sentences:
- Singular: El problema es difícil. (The problem is difficult.)
- Plural: Los problemas son difíciles. (The problems are difficult.)
2. English has a suffix –en whose use is illustrated in the following lists:
B. When the suffix –en is attached to a word, what part of speech is the resulting
word? Give some specific morphological properties of one of the words in list B, in
order to justify your answer.
This demonstrates that the suffix -en converts adjectives into verbs, with the
resulting verb indicating the action of making something become more like the
adjective.
3. Take one verb from your mother tongue, gloss it (i.e., give the English meaning) and
conjugate it in all tenses, aspects and persons.
Ans:-
Present Tense (Simple Present)
You did (formal/plural)
● 3rd Person Plural: ते केले (tē kēlē) – They did
● 2nd Person Plural: तुम्ही करू शकता (tumhī karū śakatā) – You will do (formal/plural)
● 3rd Person Plural: ते करू शकतात (tē karū śakatāt) – They will do
● 1st Person Singular: मी करत आलो आहे (mī karat ālō āhē) – I have been doing
● 2nd Person Singular: तू करत आला आहेस (tū karat ālā āhēs) – You have been doing (informal)
● 3rd Person Singular: तो/ती/ते करत आले आहे (tō/tī/tē karat ālē āhē) – He/She/It has been doing
● 1st Person Plural: आम्ही करत आलो आहे (āmhī karat ālō āhē) – We have been doing
● 2nd Person Plural: तुम्ही करत आला आहात (tumhī karat ālā āhāt) – You have been doing (formal/plural)
● 3rd Person Plural: ते करत आले आहेत (tē karat ālē āhēta) – They have been doing
Do the words ending with ‘er’/‘or’ have some common features?
Common Features are
Part of Speech:
● List 1: These words are adjectives (e.g., "taller," "shorter," "higher," "lower," "smarter").
● List 2: These words are nouns (e.g., "mower," "teacher," "sailor," "caller," "operator").
● List 3: These words are mostly nouns (e.g., "cover," "finger," "river"); "never" is an adverb.
● List 1: The suffix ‘-er’ is used to form the comparative degree of adjectives. It compares the
quality of nouns in terms of higher, lower, greater, etc.
● List 2: The suffixes ‘-er’ and ‘-or’ are used to form agent nouns. These are nouns that refer
to people or things that perform an action or function (e.g., "mower" (someone who
mows), "teacher" (someone who teaches), "sailor" (someone who sails)).
● List 3: Here the final ‘er’ is not a suffix at all. It is part of the word's base form and
indicates neither an agent nor a comparative degree.
● List 1: The ‘-er’ suffix is used with adjectives to create comparative forms (e.g., "taller" means
more tall).
● List 2: The ‘-er’ and ‘-or’ suffixes are used to create agent nouns from verbs or other bases,
indicating someone or something that performs an action (e.g., "teacher" from "teach").
● List 3: The final ‘er’ cannot be separated from the rest of the word (e.g., "river" is not
"riv" + ‘-er’; it denotes a natural feature, not an action).
kissed
● Root: kiss
● Suffix: -ed
stronger
● Root: strong
● Suffix: -er
goodness
● Root: good
● Suffix: -ness
teacher
● Root: teach
● Suffix: -er
achievement
● Root: achieve
● Suffix: -ment
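The root and suffix splits above can be reproduced with a small suffix-stripping sketch. It is a simplified illustration, not the method used by the Virtual Lab: the lexicon `KNOWN_ROOTS`, the suffix list and the function name `split_word` are all assumptions made for this example. A split is accepted only when the remainder is a known root, so a word like "river" is left undivided.

```python
# Hypothetical mini-lexicon of roots; a real analyser would use a full dictionary.
KNOWN_ROOTS = {"kiss", "strong", "good", "teach", "achieve", "river"}

# Candidate suffixes, longest first so that "-ness" is tried before "-s".
SUFFIXES = sorted(["ed", "er", "ness", "ment", "ing", "s"], key=len, reverse=True)


def split_word(word):
    """Return (root, suffix); suffix is None when the word is treated as simple."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            # Accept the split only if what remains is a known root.
            if root in KNOWN_ROOTS:
                return root, "-" + suffix
    return word, None


for w in ["kissed", "stronger", "goodness", "teacher", "achievement", "river"]:
    print(w, split_word(w))
```

Checking the remainder against a lexicon is what stops the sketch from cutting every word that merely ends in the letters "er".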
import nltk
from nltk.corpus import wordnet

# Download WordNet data if you haven't already
nltk.download('wordnet')
nltk.download('omw-1.4')

# Define words based on features
words = {
    "boy": {
        "root": "boy",
        "category": "noun",
        "number": "singular"
    },
    "children": {
        "root": "child",
        "category": "noun",
        "number": "plural"
    },
    "plays": {
        "root": "play",
        "category": "verb",
        "gender": "male",
        "number": "singular",
        "person": "third",
        "tense": "simple-present"
    },
    "play": {
        "root": "play",
        "category": "verb",
        "gender": "male",
        "number": "singular",
        "person": "first",
        "tense": "simple-present"
    }
}
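

def analyze_word(word):
    """Collect WordNet information for every sense (synset) of `word`."""
    # NOTE: the function header and the synset lookup are missing from the
    # original listing; the name `analyze_word` is an assumed reconstruction.
    synsets = wordnet.synsets(word)
    analysis = []
    for synset in synsets:
        analysis.append({
            "Word": word,
            "Definition": synset.definition(),
            "Examples": synset.examples(),
            "Synonyms": synset.lemma_names(),
            "Antonyms": [antonym.name() for lemma in synset.lemmas()
                         for antonym in lemma.antonyms()],
        })
    return analysis


# Print the first WordNet sense found for each word defined above
for word in words:
    senses = analyze_word(word)
    if senses:
        print(word, "->", senses[0]["Definition"])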