
Nlp Lab Manual

The document outlines various experiments using Python and the NLTK library for natural language processing tasks. It includes code examples for tokenization, stop word removal, stemming, word analysis, word sense disambiguation, and installation instructions for NLTK. Each experiment demonstrates specific NLP techniques and provides sample outputs for clarity.

Uploaded by

srivarsha013


EXPERIMENT 1

1Q) Write a Python program to perform the following tasks on text: a) Tokenization b) Stop Word Removal

a) Tokenization

import nltk
nltk.download('punkt_tab')
from nltk.tokenize import sent_tokenize, word_tokenize
text = "Natural language processing (NLP) is a field"
print(sent_tokenize(text))
print(word_tokenize(text))

output:

['Natural language processing (NLP) is a field']
['Natural', 'language', 'processing', '(', 'NLP', ')', 'is', 'a', 'field']
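
As a rough illustration of why the parentheses come out as separate tokens, a regular-expression tokenizer produces the same split on this sentence. This is a simplified sketch, not how word_tokenize is implemented; the function name simple_tokenize is an assumption made for this example.

```python
import re

def simple_tokenize(text):
    # \w+ matches a run of letters, digits or underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("Natural language processing (NLP) is a field"))
# ['Natural', 'language', 'processing', '(', 'NLP', ')', 'is', 'a', 'field']
```

Unlike word_tokenize, this pattern would also split contractions and hyphenated words at every punctuation mark, so it is only suitable for simple text.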

b) Stop Word Removal

List of English stop words:

import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
print(stopwords.words('english'))
output:

['a', 'about', 'above', 'after', 'again', 'against', 'ain', 'all', 'am', 'an', 'and', 'any', 'are', 'aren', "aren't", 'as', 'at', 'be',
'because', 'been', 'before', 'being', 'below', 'between', 'both', 'but', 'by', 'can', 'couldn', "couldn't", 'd', 'did',
'didn', "didn't", 'do', 'does', 'doesn', "doesn't", 'doing', 'don', "don't", 'down', 'during', 'each', 'few', 'for',
'from', 'further', 'had', 'hadn', "hadn't", 'has', 'hasn', "hasn't", 'have', 'haven', "haven't", 'having', 'he', "he'd",
"he'll", 'her', 'here', 'hers', 'herself', "he's", 'him', 'himself', 'his', 'how', 'i', "i'd", 'if', "i'll", "i'm", 'in', 'into', 'is',
'isn', "isn't", 'it', "it'd", "it'll", "it's", 'its', 'itself', "i've", 'just', 'll', 'm', 'ma', 'me', 'mightn', "mightn't", 'more',
'most', 'mustn', "mustn't", 'my', 'myself', 'needn', "needn't", 'no', 'nor', 'not', 'now', 'o', 'of', 'off', 'on', 'once',
'only', 'or', 'other', 'our', 'ours', 'ourselves', 'out', 'over', 'own', 're', 's', 'same', 'shan', "shan't", 'she', "she'd",
"she'll", "she's", 'should', 'shouldn', "shouldn't", "should've", 'so', 'some', 'such', 't', 'than', 'that', "that'll",
'the', 'their', 'theirs', 'them', 'themselves', 'then', 'there', 'these', 'they', "they'd", "they'll", "they're", "they've",
'this', 'those', 'through', 'to', 'too', 'under', 'until', 'up', 've', 'very', 'was', 'wasn', "wasn't", 'we', "we'd", "we'll",
"we're", 'were', 'weren', "weren't", "we've", 'what', 'when', 'where', 'which', 'while', 'who', 'whom', 'why',
'will', 'with', 'won', "won't", 'wouldn', "wouldn't", 'y', 'you', "you'd", "you'll", 'your', "you're", 'yours',
'yourself', 'yourselves', "you've"]

Stop word removal in a given text:

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

def preprocess_text(text):
    # Tokenize, then keep only the tokens that are not English stop words
    tokens = word_tokenize(text)
    stop_words = set(stopwords.words('english'))
    filtered_tokens = [word for word in tokens if word.lower() not in stop_words]

    return filtered_tokens

def main():
    text = "NLTK is a leading platform for building Python programs to work with human language data."
    preprocessed_text = preprocess_text(text)

    print("Original Text:")
    print(text)
    print("\nTokenized Text:")
    print(preprocessed_text)

if __name__ == "__main__":
    main()

output:
Original Text:
NLTK is a leading platform for building Python programs to work with human language data.

Tokenized Text:
['NLTK', 'leading', 'platform', 'building', 'Python', 'programs', 'work', 'human', 'language', 'data', '.']

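
The filtered output above still contains the '.' token, because punctuation is not part of the stop word list. A minimal sketch of dropping punctuation in the same pass is given below. It assumes an already tokenized list and a small hand-picked stop word set; both are illustrative and are not NLTK's data, and the function name is hypothetical.

```python
import string

# Illustrative subset only; NLTK's English stop word list is much longer.
STOP_WORDS = {"is", "a", "for", "to", "with"}

def remove_stop_words_and_punct(tokens):
    kept = []
    for token in tokens:
        is_stop_word = token.lower() in STOP_WORDS
        # A token counts as punctuation when every character is a punctuation mark.
        is_punct = all(ch in string.punctuation for ch in token)
        if not is_stop_word and not is_punct:
            kept.append(token)
    return kept

tokens = ["NLTK", "is", "a", "leading", "platform", "for", "building", "Python",
          "programs", "to", "work", "with", "human", "language", "data", "."]
print(remove_stop_words_and_punct(tokens))
```
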
EXPERIMENT 2
2Q) Write a Python program to implement the Porter stemmer algorithm for stemming.

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

def preprocess_text(text):
    # Tokenization
    tokens = word_tokenize(text)
    # Removing stop words
    stop_words = set(stopwords.words('english'))
    filtered_tokens = [word for word in tokens if word.lower() not in stop_words]

    return filtered_tokens

def apply_stemming(tokens):
    # Reduce each token to its Porter stem (the stemmer also lowercases the token)
    porter = PorterStemmer()
    stemmed_tokens = [porter.stem(token) for token in tokens]
    return stemmed_tokens

def main():
    text = "NLTK is a leading platform for building Python programs to work with human language data."
    preprocessed_text = preprocess_text(text)
    stemmed_text = apply_stemming(preprocessed_text)

    print("Original Text:")
    print(text)
    print("\nTokenized Text:")
    print(preprocessed_text)
    print("\nStemmed Text:")
    print(stemmed_text)

if __name__ == "__main__":
    main()

output:

Original Text:
NLTK is a leading platform for building Python programs to work with
human language data.

Tokenized Text:
['NLTK', 'leading', 'platform', 'building', 'Python', 'programs', 'work',
'human', 'language', 'data', '.']

Stemmed Text:
['nltk', 'lead', 'platform', 'build', 'python', 'program', 'work', 'human',
'languag', 'data', '.']
EXPERIMENT 3
3Q) Write a Python program using NLTK that performs word analysis and generation.

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer
import random

# Ensure necessary NLTK resources are downloaded

nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

# Initialize WordNet lemmatizer

lemmatizer = WordNetLemmatizer()

def word_analysis(word):
    # Get synsets (word senses) from WordNet
    synsets = wordnet.synsets(word)

    # Print each sense with definition and examples

    for i, synset in enumerate(synsets):
        print(f"Sense {i+1}: {synset.definition()}")
        print(f"Examples: {synset.examples()}")
        print()

def word_generation(word):
    # Get synsets (word senses) from WordNet
    synsets = wordnet.synsets(word)

    # Get related words (hyponyms, hypernyms, holonyms)

    related_words = set()
    for synset in synsets:
        for hyponym in synset.hyponyms():
            related_words.add(hyponym.lemmas()[0].name())
        for hypernym in synset.hypernyms():
            related_words.add(hypernym.lemmas()[0].name())
        for holonym in synset.part_holonyms():
            related_words.add(holonym.lemmas()[0].name())

    # Return a random related word, or None if WordNet lists none

    if not related_words:
        return None
    return random.choice(list(related_words))

# Test word analysis and generation

word = "bank"
print(f"Word: {word}")
print("Word Analysis:")
word_analysis(word)
print("Related Word:")
related_word = word_generation(word)
print(related_word)
output:

================================================ RESTART:
C:/Users/admin/nlp3.py
================================================
[nltk_data] Downloading package wordnet to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
Word: bank
Word Analysis:
Sense 1: sloping land (especially the slope beside a body of water)
Examples: ['they pulled the canoe up on the bank', 'he sat on the bank of the river
and watched the currents']

Sense 2: a financial institution that accepts deposits and channels the money into
lending activities
Examples: ['he cashed a check at the bank', 'that bank holds the mortgage on my
home']

Sense 3: a long ridge or pile

Examples: ['a huge bank of earth']

Sense 4: an arrangement of similar objects in a row or in tiers

Examples: ['he operated a bank of switches']

Sense 5: a supply or stock held in reserve for future use (especially in emergencies)
Examples: []

Sense 6: the funds held by a gambling house or the dealer in some gambling games
Examples: ['he tried to break the bank at Monte Carlo']

Sense 7: a slope in the turn of a road or track; the outside is higher than the inside
in order to reduce the effects of centrifugal force
Examples: []

Sense 8: a container (usually with a slot in the top) for keeping money at home
Examples: ['the coin bank was empty']

Sense 9: a building in which the business of banking transacted

Examples: ['the bank is on the corner of Nassau and Witherspoon']

Sense 10: a flight maneuver; aircraft tips laterally about its longitudinal axis
(especially in turning)
Examples: ['the plane went into a steep bank']

Sense 11: tip laterally

Examples: ['the pilot had to bank the aircraft']

Sense 12: enclose with a bank

Examples: ['bank roads']

Sense 13: do business with a bank or keep an account at a bank

Examples: ['Where do you bank in this town?']

Sense 14: act as the banker in a game or in gambling

Examples: []

Sense 15: be in the banking business

Examples: []

Sense 16: put into a bank account

Examples: ['She deposits her paycheck every month']

Sense 17: cover with ashes so to control the rate of burning

Examples: ['bank a fire']

Sense 18: have confidence or faith in

Examples: ['We can trust in God', 'Rely on your friends', 'bank on your good
education', "I swear by my grandmother's recipes"]

Related Word:
give
EXPERIMENT 4
4Q) Create a sample list of at least 5 words with ambiguous senses and write a Python program to implement word sense disambiguation (WSD).

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

# Ensure necessary NLTK resources are downloaded

nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

# Initialize WordNet lemmatizer

lemmatizer = WordNetLemmatizer()

def get_word_senses(word):
    # Get synsets (word senses) from WordNet
    synsets = wordnet.synsets(word)

    # Print each sense with definition and examples

    for i, synset in enumerate(synsets):
        print(f"Sense {i+1}: {synset.definition()}")
        print(f"Examples: {synset.examples()}")
        print()

def wsd_program(word, context):

    # Tokenize context
    tokens = nltk.word_tokenize(context)

    # Tag parts of speech (not used in the similarity score below)

    tagged = nltk.pos_tag(tokens)

    # Get synsets for word in context

    synsets = wordnet.synsets(word)

    # Initialize best sense and max similarity

    best_sense = None
    max_similarity = 0

    # Iterate through synsets

    for synset in synsets:
        # Get definition and examples for synset
        definition = synset.definition()
        examples = synset.examples()

        # Calculate similarity between context and synset definition

        similarity = calculate_similarity(context, definition)

        # Update best sense if similarity is higher

        if similarity > max_similarity:
            max_similarity = similarity
            best_sense = synset

    # Return best sense (None if no definition shares a token with the context)
    return best_sense

def calculate_similarity(context, definition):

    # Tokenize context and definition
    context_tokens = set(nltk.word_tokenize(context))
    definition_tokens = set(nltk.word_tokenize(definition))

    # Calculate Jaccard similarity: |intersection| / |union|

    similarity = (len(context_tokens & definition_tokens) /
                  len(context_tokens | definition_tokens))

    return similarity

# Test WSD program

words = ["bank", "bat", "cloud", "spring", "saw"]
contexts = [
    "I went to the bank to deposit my paycheck.",
    "The bat flew through the dark cave.",
    "The company uses cloud computing to store its data.",
    "The spring season is my favorite time of year.",
    "I saw the movie last night."
]

for word, context in zip(words, contexts):

    print(f"Word: {word}")
    print(f"Context: {context}")
    best_sense = wsd_program(word, context)
    if best_sense is not None:
        print(f"Best sense: {best_sense.definition()}")
    else:
        print("Best sense: none found")
    print()

output:
======================== RESTART: C:/Users/admin/nlp.py
========================
[nltk_data] Downloading package wordnet to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
Word: bank
Context: I went to the bank to deposit my paycheck.
Best sense: cover with ashes so to control the rate of burning

Word: bat
Context: The bat flew through the dark cave.
Best sense: use a bat

Word: cloud
Context: The company uses cloud computing to store its data.
Best sense: billow up in the form of a cloud

Word: spring
Context: The spring season is my favorite time of year.
Best sense: the season of growth

Word: saw
Context: I saw the movie last night.
Best sense: cut with a saw
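
Several of the senses chosen above are wrong for their contexts. For "bank", the selected definition shares only the function words "to" and "the" with the sentence, which is enough to win a Jaccard comparison on raw tokens. One possible refinement, sketched here in plain Python, is to compare content words only. The stop word set, the plural-stripping rule and the function names are assumptions made for illustration; the three definitions are copied from the Experiment 3 output.

```python
import re

# Small illustrative stop word list (an assumption; NLTK's English list is much longer).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "at", "i", "my",
              "is", "so", "with", "or", "that", "into", "for"}

def content_words(text):
    """Lowercase the text, drop stop words, and strip a trailing plural 's'."""
    words = re.findall(r"[a-z]+", text.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    # Crude stand-in for stemming: 'deposits' -> 'deposit'
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in kept}

def pick_sense(context, glosses):
    """Return the gloss sharing the most content words with the context, or None."""
    context_set = content_words(context)
    best_gloss, best_overlap = None, 0
    for gloss in glosses:
        overlap = len(context_set & content_words(gloss))
        if overlap > best_overlap:
            best_gloss, best_overlap = gloss, overlap
    return best_gloss

# Three of the WordNet definitions of "bank" printed in Experiment 3
glosses = [
    "sloping land (especially the slope beside a body of water)",
    "a financial institution that accepts deposits and channels the money into lending activities",
    "cover with ashes so to control the rate of burning",
]
context = "I went to the bank to deposit my paycheck."
print(pick_sense(context, glosses))  # prints the financial-institution definition
```

With stop words removed, the only shared content word is "deposit", so the financial sense is selected. The same filtering could be applied inside calculate_similarity using NLTK's stop word list and Porter stemmer.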

EXPERIMENT 5
5Q) Install the NLTK toolkit and perform stemming.
Install the NLTK toolkit
Here are the steps to install the NLTK toolkit in Python:
## Step 1: Install NLTK using pip
1. Open your terminal or command prompt.
2. Type the following command: pip install nltk
3. Press Enter to run the command.

## Step 2: Import NLTK in Python

1. Open a Python interpreter or create a new Python file.
2. Import the NLTK library: import nltk
## Step 3: Download NLTK Data
1. Use the NLTK downloader: nltk.download()
2. Download the entire corpus or select specific packages.

Some popular NLTK packages to download:

- nltk.download('punkt') for sentence tokenization

- nltk.download('wordnet') for word sense disambiguation
- nltk.download('averaged_perceptron_tagger') for part-of-speech tagging

## Step 4: Verify NLTK Installation

1. Run a simple NLTK command: nltk.word_tokenize("Hello, world!")
2. If NLTK is installed correctly, you should see the output: ['Hello', ',', 'world', '!']

That's it! You should now have NLTK installed and be ready to start working with
natural language processing tasks.
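
Before running the verification in Step 4, it can help to confirm that the package is visible to the interpreter at all. The sketch below uses only the standard library; the helper name is_installed is an assumption made for this example.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the named top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

print("nltk", "installed" if is_installed("nltk") else "missing")
```

If this prints "missing", pip probably installed NLTK for a different Python interpreter than the one running the script.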

Perform stemming:
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

# Ensure necessary NLTK resources are downloaded

nltk.download('wordnet')

# Initialize stemmer and lemmatizer
porter_stemmer = PorterStemmer()
wordnet_lemmatizer = WordNetLemmatizer()

def perform_stemming(word):
    # Perform stemming using Porter Stemmer
    porter_stem = porter_stemmer.stem(word)
    print(f"Porter Stem: {porter_stem}")

    # Perform lemmatization using WordNet Lemmatizer.
    # Without a pos argument the word is treated as a noun,
    # so a verb form such as "orbiting" is returned unchanged.

    wordnet_lemmatize = wordnet_lemmatizer.lemmatize(word)
    print(f"WordNet Lemmatize: {wordnet_lemmatize}")

# Test stemming
word = "orbiting"
print(f"Original Word: {word}")
perform_stemming(word)

output:

================================================ RESTART:
C:/Users/admin/nlp5.py
================================================
[nltk_data] Downloading package wordnet to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
Original Word: orbiting
Porter Stem: orbit
WordNet Lemmatize: orbiting

EXPERIMENT 6
6Q) Create a sample list of at least 10 words with POS tags and find the POS for any given word.
Here's a list of 10 words with their corresponding Part-of-Speech (POS) tags:
Word        POS Tag
--------------------------------
Run         VB (Verb)
Dog         NN (Noun)
Happy       JJ (Adjective)
Quickly     RB (Adverb)
Eat         VB (Verb)
Big         JJ (Adjective)
Car         NN (Noun)
Slowly      RB (Adverb)
Think       VB (Verb)
Beautiful   JJ (Adjective)
Here is Python code using NLTK to perform POS tagging:

import nltk

# Ensure necessary NLTK resources are downloaded

nltk.download('averaged_perceptron_tagger')
nltk.download('punkt')

def perform_pos_tagging(sentence):
    # Tokenize sentence
    tokens = nltk.word_tokenize(sentence)

    # Perform POS tagging

    tagged = nltk.pos_tag(tokens)

    return tagged

# Test POS tagging

sentence = "The dog runs quickly."
tagged = perform_pos_tagging(sentence)
print(tagged)
output:
================================================ RESTART:
C:/Users/admin/nlp6.py
================================================
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
[('The', 'DT'), ('dog', 'NN'), ('runs', 'VBZ'), ('quickly', 'RB'), ('.', '.')]
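
The question also asks for the POS of any given word. One simple way to answer it from the sample list, without calling the tagger, is a dictionary lookup. The sketch below copies the ten words from the table in this experiment; the names SAMPLE_POS and lookup_pos are assumptions made for illustration.

```python
# The ten sample words from the table, lowercased, mapped to their tags.
SAMPLE_POS = {
    "run": "VB", "dog": "NN", "happy": "JJ", "quickly": "RB", "eat": "VB",
    "big": "JJ", "car": "NN", "slowly": "RB", "think": "VB", "beautiful": "JJ",
}

def lookup_pos(word):
    """Return the tag stored for the word, or None if it is not in the sample list."""
    return SAMPLE_POS.get(word.lower())

print(lookup_pos("Dog"))     # NN
print(lookup_pos("banana"))  # None
```

A fixed table cannot handle words that take different tags in different sentences, which is why the NLTK program tags tokens in context instead.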

EXPERIMENT 7
7Q) Write a Python program to
a) Perform morphological analysis using the NLTK library
b) Generate n-grams using the NLTK n-grams library
c) Implement n-gram smoothing

a) Perform morphological analysis using the NLTK library:

import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet

# Ensure necessary NLTK resources are downloaded

nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

def perform_morphological_analysis(word):
    # Initialize lemmatizer
    lemmatizer = WordNetLemmatizer()

    # Perform part-of-speech tagging

    pos_tag = nltk.pos_tag([word])[0][1]

    # Map Penn Treebank tag to WordNet tag (by the first letter of the tag)

    tag_map = {'J': wordnet.ADJ, 'V': wordnet.VERB,
               'N': wordnet.NOUN, 'R': wordnet.ADV}
    wordnet_tag = tag_map.get(pos_tag[0])

    # Perform lemmatization
    if wordnet_tag:
        lemma = lemmatizer.lemmatize(word, pos=wordnet_tag)
    else:
        lemma = word

    return pos_tag, lemma

# Test morphological analysis

word = "running"
pos_tag, lemma = perform_morphological_analysis(word)
print(f"Word: {word}")
print(f"Part-of-speech tag: {pos_tag}")
print(f"Lemma: {lemma}")

output:
=============================================== RESTART:
C:/Users/admin/nlp7a.py
===============================================
[nltk_data] Downloading package wordnet to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
Word: running
Part-of-speech tag: VBG
Lemma: run

b)Generate n-grams using NLTK N-Grams library:

import nltk
from nltk.util import ngrams
from nltk.tokenize import word_tokenize

# Ensure necessary NLTK resources are downloaded

nltk.download('punkt')

def generate_ngrams(text, n):

    # Tokenize text
    tokens = word_tokenize(text)

    # Generate n-grams
    n_grams = list(ngrams(tokens, n))

    return n_grams

# Test n-gram generation
text = "This is a sample text for generating n-grams."
n = 3

n_grams = generate_ngrams(text, n)
print(f"{n}-grams:")
for n_gram in n_grams:
    print(n_gram)
output:
============================================== RESTART:
C:/Users/admin/nlp7b,py.py
==============================================
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
3-grams:
('This', 'is', 'a')
('is', 'a', 'sample')
('a', 'sample', 'text')
('sample', 'text', 'for')
('text', 'for', 'generating')
('for', 'generating', 'n-grams')
('generating', 'n-grams', '.')
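
The sliding window behind these tuples can be reproduced in plain Python, which may make the idea easier to see. This is a sketch of the technique, not NLTK's implementation; the function name sliding_ngrams is an assumption.

```python
def sliding_ngrams(tokens, n):
    # Build n staggered views of the token list (starting at offsets 0..n-1)
    # and zip them together; zip stops at the shortest view, so only
    # complete n-grams are produced.
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = ["This", "is", "a", "sample", "text"]
print(sliding_ngrams(tokens, 3))
# [('This', 'is', 'a'), ('is', 'a', 'sample'), ('a', 'sample', 'text')]
```

A list of k tokens yields k - n + 1 n-grams, which matches the 7 trigrams printed above for 9 tokens.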

c)Implement N-Grams Smoothing:

import nltk
from nltk.util import ngrams
from nltk.tokenize import word_tokenize
from nltk.probability import FreqDist

# Ensure necessary NLTK resources are downloaded

nltk.download('punkt')

def calculate_ngram_probabilities(text, n):

    # Tokenize text
    tokens = word_tokenize(text)

    # Generate n-grams
    n_grams = list(ngrams(tokens, n))

    # Calculate n-gram probabilities (relative frequencies)

    n_gram_freq_dist = FreqDist(n_grams)
    n_gram_probabilities = {n_gram: freq / len(n_grams)
                            for n_gram, freq in n_gram_freq_dist.items()}

    return n_gram_probabilities

def smooth_ngram_probabilities(n_gram_probabilities, alpha):

    # Apply Laplace-style additive smoothing to the relative frequencies:
    # add alpha to each probability and renormalize over the observed n-grams
    num_types = len(n_gram_probabilities)
    smoothed_n_gram_probabilities = {n_gram: (prob + alpha) / (1 + alpha * num_types)
                                     for n_gram, prob in n_gram_probabilities.items()}
    return smoothed_n_gram_probabilities

# Test n-gram smoothing

text = "This is a sample text for generating n-grams. This text is just a sample."
n = 2
alpha = 0.1

n_gram_probabilities = calculate_ngram_probabilities(text, n)
smoothed_n_gram_probabilities = smooth_ngram_probabilities(n_gram_probabilities, alpha)

print("N-gram Probabilities:")
for n_gram, prob in n_gram_probabilities.items():
    print(f"{n_gram}: {prob}")

print("\nSmoothed N-gram Probabilities:")

for n_gram, prob in smoothed_n_gram_probabilities.items():
    print(f"{n_gram}: {prob}")

output:
================================================ RESTART:
C:/Users/admin/nlp7c.py
===============================================
[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\admin\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
N-gram Probabilities:
('This', 'is'): 0.06666666666666667
('is', 'a'): 0.06666666666666667
('a', 'sample'): 0.13333333333333333
('sample', 'text'): 0.06666666666666667
('text', 'for'): 0.06666666666666667
('for', 'generating'): 0.06666666666666667
('generating', 'n-grams'): 0.06666666666666667
('n-grams', '.'): 0.06666666666666667
('.', 'This'): 0.06666666666666667
('This', 'text'): 0.06666666666666667
('text', 'is'): 0.06666666666666667
('is', 'just'): 0.06666666666666667
('just', 'a'): 0.06666666666666667
('sample', '.'): 0.06666666666666667

Smoothed N-gram Probabilities:

('This', 'is'): 0.06944444444444445
('is', 'a'): 0.06944444444444445
('a', 'sample'): 0.09722222222222221
('sample', 'text'): 0.06944444444444445
('text', 'for'): 0.06944444444444445
('for', 'generating'): 0.06944444444444445
('generating', 'n-grams'): 0.06944444444444445
('n-grams', '.'): 0.06944444444444445
('.', 'This'): 0.06944444444444445
('This', 'text'): 0.06944444444444445
('text', 'is'): 0.06944444444444445
('is', 'just'): 0.06944444444444445
('just', 'a'): 0.06944444444444445
('sample', '.'): 0.06944444444444445
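
The program above smooths the relative frequencies of bigrams that were already observed, so a bigram that never occurred still has no probability. A more conventional formulation adds alpha to the counts and estimates the conditional probability P(word | previous word) = (count(previous, word) + alpha) / (count(previous) + alpha * V), where V is the vocabulary size. The sketch below assumes whitespace tokenization and uses the hypothetical helper name bigram_model; alpha must be greater than zero.

```python
from collections import Counter

def bigram_model(tokens, alpha=1.0):
    """Return (prob, vocab) where prob(prev, word) is the add-alpha smoothed estimate."""
    vocab = sorted(set(tokens))
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # How often each word appears as the first element of a bigram
    context_counts = Counter(tokens[:-1])

    def prob(prev, word):
        return ((bigram_counts[(prev, word)] + alpha) /
                (context_counts[prev] + alpha * len(vocab)))

    return prob, vocab

tokens = "this is a sample text this text is just a sample".split()
prob, vocab = bigram_model(tokens, alpha=1.0)
print(prob("a", "sample"))  # seen twice after "a": (2 + 1) / (2 + 6) = 0.375
print(prob("a", "text"))    # never seen after "a": (0 + 1) / (2 + 6) = 0.125
```

For any fixed previous word, the smoothed probabilities over the whole vocabulary sum to 1, and unseen bigrams receive a small non-zero share.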

EXPERIMENT 8
8Q) Use the NLTK package to convert an audio file to text and a text file to an audio file.
Program to convert an audio file to text:
## Audio to Text
# This program uses the speech_recognition library to convert an audio file to text; NLTK is used only to tokenize the transcript.
import speech_recognition as sr
from nltk.tokenize import word_tokenize

def audio_to_text(audio_file):
    # Create a speech recognition object
    r = sr.Recognizer()

    # Use the audio file as input

    with sr.AudioFile(audio_file) as source:
        audio = r.record(source)

    # Transcribe the audio

    try:
        text = r.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
        return None
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
        return None

# Test the function

audio_file = "example.wav"
text = audio_to_text(audio_file)
print("Transcribed Text:")
print(text)

# Tokenize the text (skipped when transcription failed and text is None)

if text is not None:
    tokens = word_tokenize(text)
    print("\nTokenized Text:")
    print(tokens)
Output:
Transcribed Text:
Hello, how are you?

Tokenized Text:
['Hello', ',', 'how', 'are', 'you', '?']

Program to Convert text file to audio:

## Text to Audio
# This program uses the gTTS library to convert text (here a string) to an audio file.

from gtts import gTTS
from nltk.tokenize import word_tokenize

def text_to_audio(text, audio_file):
    # Create a gTTS object
    tts = gTTS(text=text, lang='en')

    # Save the audio to a file

    tts.save(audio_file)

# Test the function

text = "Hello, world! This is an example text."
audio_file = "example.mp3"
text_to_audio(text, audio_file)

# Tokenize the text

tokens = word_tokenize(text)
print("Tokenized Text:")
print(tokens)

Output:
Tokenized Text:
['Hello', ',', 'world', '!', 'This', 'is', 'an', 'example', 'text', '.']

Note:

- Make sure to install the required libraries by running: pip install SpeechRecognition gTTS nltk
- Replace "example.wav" and "example.mp3" with your actual audio file paths.
- recognize_google in the speech_recognition library sends the audio to Google's web service, so it requires an active internet connection.
- The gTTS library also requires an active internet connection to work.
