dhara_NLP_Practical

The document outlines a series of practical exercises related to Natural Language Toolkit (NLTK) and its integration with various Python modules such as Pyttsx3 and Speech Recognition. It includes tasks on text-to-speech conversion, tokenization, and audio file manipulation, along with installation commands and code snippets. Additionally, it provides a schedule with dates for each practical exercise and the corresponding signatures.

Uploaded by

uzmasaiyed713

INDEX

Sr. no | Practical | Date | Signature
1 | NLTK with Pyttsx3 module: Volume, Voice, Rate, Save to file | 19-12-2023 |
2 | NLTK with Speech Recognition module: Microphone, PyAudio, Play Sound | 03-01-2024 |
3 | Corpus: Brown, Stop words | 28-03-2024 |
4 | Tokenization: Word Tokenization, Sentence Tokenization, Blank Function | 30-01-2024 |
5 | Bigrams: Unigrams Module, Bigrams Module, Trigrams Module, Ngrams Module | 06-03-2024 |
6 | Stemming: Porter Stemmer Module, Lemmatization Module, Functools Using Reduce Module | 15-03-2024 |
7 | POS (Part Of Speech) tagging: Upenn_tagset Method | 03-04-2024 |
8 | SpaCy: Load Method, Doc Container, Split Method, Sentence Boundary Detection (SBD), Token Attributes | 08-04-2024 |

4/23/24, 12:24 PM NLTK.16.01.23

In [1]: !pip install pyttsx3

Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pyttsx3 in c:\users\dhara\appdata\roaming\python\python311\site-packages (2.90)
Requirement already satisfied: comtypes in c:\users\dhara\appdata\roaming\python\python311\site-packages (from pyttsx3) (1.2.1)
Requirement already satisfied: pypiwin32 in c:\users\dhara\appdata\roaming\python\python311\site-packages (from pyttsx3) (223)
Requirement already satisfied: pywin32 in c:\programdata\anaconda3\lib\site-packages (from pyttsx3) (305.1)

In [2]: import pyttsx3

In [3]: engine = pyttsx3.init()


engine.say("hello Dhara!")
engine.runAndWait()

In [4]: import nltk as nl

In [5]: print("nltk version:"+ nl.__version__)

nltk version:3.8.1

In [6]: text_speech = pyttsx3.init()

In [7]: answer = input("Which text you want to convert: ") ## by user interface
text_speech.say(answer)
text_speech.runAndWait()

Which text you want to convert: hello

Read a text file aloud (the speech itself comes from pyttsx3, not NLTK)

In [8]: import pyttsx3

pyjob = pyttsx3.init()
# Read the whole file; the with-block closes it afterwards
with open(r"C:\\dkp\\nltk..txt") as fo:
    ip = fo.read()
pyjob.say(ip)
pyjob.runAndWait()

Get the current speaking rate and set a new one

In [9]: import pyttsx3


engine = pyttsx3.init()
rate = engine.getProperty('rate') #getting details of current speaking rate
print(rate) #printing current voice rate
engine.setProperty('rate',125) #setting up new voice rate

200

In [10]: volume = engine.getProperty('volume') #getting current volume level


print(volume) #printing current volume level
engine.setProperty('volume',1.0) #setting up volume level between 0 to 1

1.0

In [11]: voices = engine.getProperty('voices')


engine.setProperty('voice',voices[1].id)
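
`voices[1]` assumes that at least two voices are installed, which may not hold on every machine. A minimal sketch of a more defensive selection follows; `pick_voice_id` is a hypothetical helper (not part of pyttsx3) that falls back to the first voice when the preferred index does not exist.

```python
from types import SimpleNamespace


def pick_voice_id(voices, preferred_index=1):
    """Return the id of voices[preferred_index], or of the first voice as a fallback.

    `voices` is any sequence of objects with an `.id` attribute, such as the
    list returned by engine.getProperty('voices').
    """
    if not voices:
        raise ValueError("no speech voices are installed")
    if 0 <= preferred_index < len(voices):
        return voices[preferred_index].id
    return voices[0].id


if __name__ == "__main__":
    # Stand-in objects (hypothetical ids) so the sketch runs without a speech engine.
    demo = [SimpleNamespace(id="voice-a"), SimpleNamespace(id="voice-b")]
    print(pick_voice_id(demo))
    print(pick_voice_id(demo[:1]))
```

With a real engine this would be used as `engine.setProperty('voice', pick_voice_id(engine.getProperty('voices')))`.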


In [12]: engine.say("guru brahma guru vishnu guru devo maheshvaray guru sakshat parabrahma t
engine.say("my current speaking rate is "+ str(rate))
engine.runAndWait()
engine.stop()

In [13]: engine.save_to_file("hello",'test.mp3')
engine.runAndWait()

Create audio from a text file with pyttsx3 and vary the speaking rate

Homework

In [14]: pyjob = pyttsx3.init()


fo = open(r"C:\Users\dhara\OneDrive\Desktop\shlok.txt")
ip = fo.read()
pyjob.setProperty("rate",300)
pyjob.setProperty('volume',1) #higher speed and high volume
pyjob.say(ip)
pyjob.runAndWait()

In [15]: pyjob = pyttsx3.init()


fo = open(r"C:\Users\dhara\OneDrive\Desktop\shlok.txt")
ip = fo.read()
pyjob.setProperty("rate",50)
pyjob.setProperty('volume',0.1) #lower speed and lower volume
pyjob.say(ip)
pyjob.runAndWait()

In [16]: pyjob.setProperty("rate",700)
pyjob.setProperty('volume',1) #higher speed and higher volume
pyjob.say(ip)
pyjob.runAndWait()

In [17]: pyjob.setProperty("rate",100)
pyjob.setProperty('volume',1) #normal speed and higher volume
pyjob.say("om bhur bhuvasvah tatsavi turvareniyam bhargodevasyadhimahi dhiyo yo na
pyjob.runAndWait()

In [18]: pyjob.setProperty("rate",100)
pyjob.setProperty('volume',1)
pyjob.say("taro virah pan lage vahalo re valam aavo ne aavo ne ")
pyjob.runAndWait()

In [19]: pyjob.setProperty("rate",100)
pyjob.setProperty('volume',1)
pyjob.say("after seeing all the results we can conclude that our model has better a
pyjob.runAndWait()

In [20]: pyjob.setProperty("rate",100)
pyjob.setProperty('volume',1)
pyjob.say("haayl haayl aam haaltino tha")
pyjob.runAndWait()
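
The homework cells above try rates of 300, 50, 700 and 100. pyttsx3 documents `rate` as words per minute, so a rough speaking time can be estimated before running the engine. This is only a sketch: `estimated_seconds` is a hypothetical helper, and real engines pause at punctuation, so measured times will differ.

```python
def estimated_seconds(text, rate_wpm):
    """Rough speaking time in seconds: word count divided by words per minute."""
    if rate_wpm <= 0:
        raise ValueError("rate_wpm must be positive")
    words = len(text.split())
    return words * 60.0 / rate_wpm


if __name__ == "__main__":
    line = "taro virah pan lage vahalo re valam aavo ne aavo ne"
    # Compare the rates used in the cells above.
    for rate in (50, 100, 300, 700):
        print(rate, "wpm ->", round(estimated_seconds(line, rate), 2), "s")
```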

convert text file into mp3 file

In [21]: pyjob = pyttsx3.init()


fo = open(r"C:\Users\dhara\OneDrive\Desktop\shlok.txt")
ip = fo.read()
pyjob.setProperty("rate",300)
pyjob.setProperty('volume',1)
pyjob.save_to_file(ip,r"C:\Users\dhara\OneDrive\Desktop\shlok.mp3")
pyjob.runAndWait()

In [22]: import os

In [23]: os.startfile(r"C:\Users\dhara\OneDrive\Desktop\shlok.mp3")

In [24]: get_ipython().system('pip install playsound')

Defaulting to user installation because normal site-packages is not writeable


Requirement already satisfied: playsound in c:\users\dhara\appdata\roaming\python
\python311\site-packages (1.2.2)

In [25]: import playsound as ps

In [26]: ps.playsound(r"C:\Users\dhara\OneDrive\Desktop\shlok.mp3")

---------------------------------------------------------------------------
PlaysoundException Traceback (most recent call last)
Cell In[26], line 1
----> 1 ps.playsound(r"C:\Users\dhara\OneDrive\Desktop\shlok.mp3")

File ~\AppData\Roaming\Python\Python311\site-packages\playsound.py:35, in _playsoundWin(sound, block)
32 return buf.value
34 alias = 'playsound_' + str(random())
---> 35 winCommand('open "' + sound + '" alias', alias)
36 winCommand('set', alias, 'time format milliseconds')
37 durationInMS = winCommand('status', alias, 'length')

File ~\AppData\Roaming\Python\Python311\site-packages\playsound.py:31, in _playsoundWin.<locals>.winCommand(*command)
27 windll.winmm.mciGetErrorStringA(errorCode, errorBuffer, 254)
28 exceptionMessage = ('\n Error ' + str(errorCode) + ' for command:'
29 '\n ' + command.decode() +
30 '\n ' + errorBuffer.value.decode())
---> 31 raise PlaysoundException(exceptionMessage)
32 return buf.value

PlaysoundException:
Error 277 for command:
open "C:\Users\dhara\OneDrive\Desktop\shlok.mp3" alias playsound_0.7199283380254814
A problem occurred in initializing MCI.
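
One possible cause of an MCI error like this is a mismatch between the file name and the file contents. The Windows speech driver behind pyttsx3 is commonly reported to write WAV data whatever extension is given; that is an assumption to check, not something this notebook confirms. The sketch below uses only the standard library to look at the first bytes of a file; `sniff_audio_format` is a hypothetical helper.

```python
def sniff_audio_format(path):
    """Guess 'wav', 'mp3' or 'unknown' from the first bytes of a file."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    # WAV files start with a RIFF chunk whose form type is WAVE.
    if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    # MP3 files usually start with an ID3v2 tag ...
    if head[:3] == b"ID3":
        return "mp3"
    # ... or directly with an MPEG frame sync (eleven set bits).
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Pass the path of the saved file on the command line.
    for name in sys.argv[1:]:
        print(name, "->", sniff_audio_format(name))
```

If the result is "wav" for a file named `.mp3`, saving it under a `.wav` name before playing it is one thing to try.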

In [27]: from playsound import playsound # convert any audio local fil
try:
    playsound(r"C:\Users\dhara\Downloads\shlok (online-audio-converter.com).mp3")
except Exception:
    # retry once if the first attempt fails
    playsound(r"C:\Users\dhara\Downloads\shlok (online-audio-converter.com).mp3")

If the code above does not work, try pinning playsound to version 1.2.2 as below

In [28]: pip install playsound==1.2.2

Defaulting to user installation because normal site-packages is not writeable


Requirement already satisfied: playsound==1.2.2 in c:\users\dhara\appdata\roaming
\python\python311\site-packages (1.2.2)
Note: you may need to restart the kernel to use updated packages.


In [29]: from playsound import playsound


try:
    playsound(r"C:\Users\dhara\Downloads\shlok (online-audio-converter.com).mp3")
except Exception as exc:
    # report the failure instead of silently ignoring it
    print("Playback failed:", exc)
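
Swallowing every exception makes playback failures hard to diagnose. A sketch of an alternative is to try several players in turn and keep the error from each one. `play_with_fallback` is a hypothetical helper; the players are passed in as plain callables, so the sketch runs without any audio library installed.

```python
def play_with_fallback(path, players):
    """Try each (name, callable) pair in order.

    Returns (name_that_worked, errors), where errors is a list of
    (name, exception) pairs for the players that failed. The name is
    None when every player failed.
    """
    errors = []
    for name, play in players:
        try:
            play(path)
            return name, errors
        except Exception as exc:  # keep the reason instead of hiding it
            errors.append((name, exc))
    return None, errors


if __name__ == "__main__":
    def broken_player(path):
        raise RuntimeError("cannot open " + path)

    def working_player(path):
        print("playing", path)

    used, errors = play_with_fallback(
        "example.mp3", [("broken", broken_player), ("working", working_player)]
    )
    print("used:", used)
    for name, exc in errors:
        print("failed:", name, "-", exc)
```

On Windows the list could be `[("playsound", playsound), ("os.startfile", os.startfile)]`, reusing the two approaches already tried in this notebook.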

In [34]: pip install SpeechRecognition

Defaulting to user installation because normal site-packages is not writeable
Collecting SpeechRecognition
Obtaining dependency information for SpeechRecognition from https://files.pythonhosted.org/packages/9e/e9/edd24b7000e209f34b5f7d13daa05337a1c306b120c0b92bb24e4527d579/SpeechRecognition-3.10.3-py2.py3-none-any.whl.metadata
Downloading SpeechRecognition-3.10.3-py2.py3-none-any.whl.metadata (29 kB)
Requirement already satisfied: requests>=2.26.0 in c:\programdata\anaconda3\lib\site-packages (from SpeechRecognition) (2.31.0)
Requirement already satisfied: typing-extensions in c:\programdata\anaconda3\lib\site-packages (from SpeechRecognition) (4.7.1)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.26.0->SpeechRecognition) (2023.7.22)
Downloading SpeechRecognition-3.10.3-py2.py3-none-any.whl (32.8 MB)
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.10.3
Note: you may need to restart the kernel to use updated packages.

In [35]: import speech_recognition as sr

In [37]: sr.__version__

Out[37]: '3.10.3'

In [38]: recognizer = sr.Recognizer()

In [41]: pip install PyAudio

Defaulting to user installation because normal site-packages is not writeable
Collecting PyAudio
Obtaining dependency information for PyAudio from https://files.pythonhosted.org/packages/82/d8/f043c854aad450a76e476b0cf9cda1956419e1dacf1062eb9df3c0055abe/PyAudio-0.2.14-cp311-cp311-win_amd64.whl.metadata
Downloading PyAudio-0.2.14-cp311-cp311-win_amd64.whl.metadata (2.7 kB)
Downloading PyAudio-0.2.14-cp311-cp311-win_amd64.whl (164 kB)
Installing collected packages: PyAudio
Successfully installed PyAudio-0.2.14
Note: you may need to restart the kernel to use updated packages.

In [42]: def record_audio():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    return audio

In [43]: ##Recognizing Speech

def recognize_speech(audio):
    try:
        text = recognizer.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        print("Sorry, I couldn't understand that.")
    except sr.RequestError:
        print("Sorry, there was an error processing your request.")

In [46]: #Putting It All Together

import speech_recognition as sr

recognizer = sr.Recognizer()

def record_audio():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    return audio

def recognize_speech(audio):
    try:
        text = recognizer.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        print("Sorry, I couldn't understand that.")
    except sr.RequestError:
        print("Sorry, there was an error processing your request.")

if __name__ == "__main__":
    audio = record_audio()
    recognize_speech(audio)

Listening...
You said: hello
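
`recognize_google` sends the audio to a web service, so an `sr.RequestError` may be a passing network problem rather than a permanent failure. A sketch of retrying such errors follows. `transcribe_with_retry` is a hypothetical helper, not part of SpeechRecognition; it takes the recognition function as a parameter so that it runs without a microphone or network.

```python
import time


def transcribe_with_retry(recognize, audio, retry_on=(ConnectionError,),
                          attempts=3, delay=0.0):
    """Call recognize(audio), retrying when one of `retry_on` is raised.

    Returns the recognized text, or re-raises the last error once all
    attempts are used up. Other exception types propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return recognize(audio)
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # wait before the next try
    raise last_error


if __name__ == "__main__":
    # Stand-in recognizer that fails twice, then succeeds.
    state = {"calls": 0}

    def flaky(audio):
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("service unreachable")
        return "hello"

    print(transcribe_with_retry(flaky, audio=b""))
```

With the notebook's objects the call would look like `transcribe_with_retry(recognizer.recognize_google, audio, retry_on=(sr.RequestError,))`. `sr.UnknownValueError` is left out of `retry_on` on purpose, since unintelligible audio will not improve on a second request.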

In [48]: pip install pocketsphinx

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.

Collecting pocketsphinx
Obtaining dependency information for pocketsphinx from https://files.pythonhosted.org/packages/12/c0/525628542371625d05314efe4ecdadfba51d783b21bc2af2357ac24dc2ce/pocketsphinx-5.0.3-cp311-cp311-win_amd64.whl.metadata
Downloading pocketsphinx-5.0.3-cp311-cp311-win_amd64.whl.metadata (6.8 kB)
Collecting sounddevice (from pocketsphinx)
Obtaining dependency information for sounddevice from https://files.pythonhosted.org/packages/39/ae/5e84220bfca4256e4ca2a62a174636089ab6ff671b5f9ddd7e8238587acd/sounddevice-0.4.6-py3-none-win_amd64.whl.metadata
Downloading sounddevice-0.4.6-py3-none-win_amd64.whl.metadata (1.4 kB)
Requirement already satisfied: CFFI>=1.0 in c:\programdata\anaconda3\lib\site-packages (from sounddevice->pocketsphinx) (1.15.1)
Requirement already satisfied: pycparser in c:\programdata\anaconda3\lib\site-packages (from CFFI>=1.0->sounddevice->pocketsphinx) (2.21)
Downloading pocketsphinx-5.0.3-cp311-cp311-win_amd64.whl (29.1 MB)
------------- -------------------------- 9.6/29.1 MB 2.3 MB/s eta 0:00:09
------------- -------------------------- 9.7/29.1 MB 2.3 MB/s eta 0:00:09
------------- -------------------------- 9.8/29.1 MB 2.3 MB/s eta 0:00:09
------------- -------------------------- 9.9/29.1 MB 2.3 MB/s eta 0:00:09
------------- -------------------------- 10.0/29.1 MB 2.3 MB/s eta 0:00:09
------------- -------------------------- 10.2/29.1 MB 2.3 MB/s eta 0:00:09
-------------- ------------------------- 10.3/29.1 MB 2.3 MB/s eta 0:00:09
-------------- ------------------------- 10.4/29.1 MB 2.3 MB/s eta 0:00:09
-------------- ------------------------- 10.5/29.1 MB 2.3 MB/s eta 0:00:08
-------------- ------------------------- 10.6/29.1 MB 2.3 MB/s eta 0:00:08
-------------- ------------------------- 10.8/29.1 MB 2.3 MB/s eta 0:00:08
-------------- ------------------------- 10.9/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.0/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.1/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.2/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.3/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.4/29.1 MB 2.3 MB/s eta 0:00:08
--------------- ------------------------ 11.6/29.1 MB 2.4 MB/s eta 0:00:08
---------------- ----------------------- 11.7/29.1 MB 2.4 MB/s eta 0:00:08
---------------- ----------------------- 11.8/29.1 MB 2.4 MB/s eta 0:00:08
---------------- ----------------------- 11.9/29.1 MB 2.4 MB/s eta 0:00:08
---------------- ----------------------- 12.0/29.1 MB 2.4 MB/s eta 0:00:08

localhost:8889/nbconvert/html/NLTK.16.01.23.ipynb?download=false 11/16
4/23/24, 12:24 PM NLTK.16.01.23
---------------- ----------------------- 12.1/29.1 MB 2.4 MB/s eta 0:00:08
---------------- ----------------------- 12.3/29.1 MB 2.4 MB/s eta 0:00:08
----------------- ---------------------- 12.4/29.1 MB 2.4 MB/s eta 0:00:08
----------------- ---------------------- 12.5/29.1 MB 2.4 MB/s eta 0:00:07
----------------- ---------------------- 12.6/29.1 MB 2.4 MB/s eta 0:00:07
----------------- ---------------------- 12.7/29.1 MB 2.4 MB/s eta 0:00:07
----------------- ---------------------- 12.8/29.1 MB 2.4 MB/s eta 0:00:07
----------------- ---------------------- 13.0/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.1/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.2/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.3/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.4/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.6/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.7/29.1 MB 2.4 MB/s eta 0:00:07
------------------ --------------------- 13.8/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 13.9/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 14.0/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 14.1/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 14.2/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 14.4/29.1 MB 2.4 MB/s eta 0:00:07
------------------- -------------------- 14.5/29.1 MB 2.4 MB/s eta 0:00:07
-------------------- ------------------- 14.6/29.1 MB 2.4 MB/s eta 0:00:07
-------------------- ------------------- 14.7/29.1 MB 2.4 MB/s eta 0:00:07
-------------------- ------------------- 14.8/29.1 MB 2.4 MB/s eta 0:00:07
-------------------- ------------------- 14.8/29.1 MB 2.4 MB/s eta 0:00:07
-------------------- ------------------- 14.9/29.1 MB 2.3 MB/s eta 0:00:07
-------------------- ------------------- 15.0/29.1 MB 2.3 MB/s eta 0:00:07
-------------------- ------------------- 15.1/29.1 MB 2.3 MB/s eta 0:00:06
-------------------- ------------------- 15.2/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.3/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.5/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.6/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.7/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.8/29.1 MB 2.3 MB/s eta 0:00:06
--------------------- ------------------ 15.9/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.0/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.1/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.2/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.4/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.5/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.6/29.1 MB 2.3 MB/s eta 0:00:06
---------------------- ----------------- 16.7/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 16.8/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 16.9/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 17.0/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 17.2/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 17.3/29.1 MB 2.3 MB/s eta 0:00:06
----------------------- ---------------- 17.4/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 17.5/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 17.6/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 17.7/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 17.8/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 18.0/29.1 MB 2.3 MB/s eta 0:00:05
------------------------ --------------- 18.1/29.1 MB 2.3 MB/s eta 0:00:05
------------------------- -------------- 18.2/29.1 MB 2.3 MB/s eta 0:00:05
------------------------- -------------- 18.3/29.1 MB 2.4 MB/s eta 0:00:05
------------------------- -------------- 18.4/29.1 MB 2.4 MB/s eta 0:00:05
------------------------- -------------- 18.5/29.1 MB 2.4 MB/s eta 0:00:05
------------------------- -------------- 18.6/29.1 MB 2.4 MB/s eta 0:00:05
------------------------- -------------- 18.7/29.1 MB 2.4 MB/s eta 0:00:05
------------------------- -------------- 18.9/29.1 MB 2.4 MB/s eta 0:00:05
-------------------------- ------------- 19.0/29.1 MB 2.4 MB/s eta 0:00:05
-------------------------- ------------- 19.1/29.1 MB 2.4 MB/s eta 0:00:05
-------------------------- ------------- 19.2/29.1 MB 2.4 MB/s eta 0:00:05

localhost:8889/nbconvert/html/NLTK.16.01.23.ipynb?download=false 12/16
4/23/24, 12:24 PM NLTK.16.01.23
-------------------------- ------------- 19.3/29.1 MB 2.4 MB/s eta 0:00:05
-------------------------- ------------- 19.4/29.1 MB 2.4 MB/s eta 0:00:05
-------------------------- ------------- 19.6/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 19.7/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 19.8/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 19.9/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 20.0/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 20.1/29.1 MB 2.4 MB/s eta 0:00:04
--------------------------- ------------ 20.3/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.4/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.5/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.6/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.7/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.8/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 20.9/29.1 MB 2.4 MB/s eta 0:00:04
---------------------------- ----------- 21.1/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.2/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.3/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.4/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.5/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.6/29.1 MB 2.4 MB/s eta 0:00:04
----------------------------- ---------- 21.7/29.1 MB 2.4 MB/s eta 0:00:04
------------------------------ --------- 21.9/29.1 MB 2.4 MB/s eta 0:00:04
------------------------------ --------- 22.0/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.1/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.1/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.2/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.3/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------ --------- 22.4/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.4/29.1 MB 2.4 MB/s eta 0:00:03
------------------------------ --------- 22.5/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.6/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.6/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.7/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.7/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.8/29.1 MB 2.3 MB/s eta 0:00:03
------------------------------- -------- 22.8/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 22.9/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 23.0/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 23.0/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 23.1/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 23.2/29.1 MB 2.2 MB/s eta 0:00:03
------------------------------- -------- 23.2/29.1 MB 2.2 MB/s eta 0:00:03
-------------------------------- ------- 23.3/29.1 MB 2.2 MB/s eta 0:00:03
-------------------------------- ------- 23.3/29.1 MB 2.2 MB/s eta 0:00:03
-------------------------------- ------- 23.4/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.5/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.5/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.6/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.7/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.7/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.8/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.9/29.1 MB 2.1 MB/s eta 0:00:03
-------------------------------- ------- 23.9/29.1 MB 2.1 MB/s eta 0:00:03
--------------------------------- ------ 24.0/29.1 MB 2.1 MB/s eta 0:00:03
--------------------------------- ------ 24.1/29.1 MB 2.1 MB/s eta 0:00:03
--------------------------------- ------ 24.1/29.1 MB 2.1 MB/s eta 0:00:03
--------------------------------- ------ 24.2/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.3/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.4/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.4/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.5/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.6/29.1 MB 2.0 MB/s eta 0:00:03
--------------------------------- ------ 24.6/29.1 MB 2.0 MB/s eta 0:00:03

localhost:8889/nbconvert/html/NLTK.16.01.23.ipynb?download=false 13/16
4/23/24, 12:24 PM NLTK.16.01.23
---------------------------------- ----- 24.7/29.1 MB 2.0 MB/s eta 0:00:03
---------------------------------- ----- 24.8/29.1 MB 2.0 MB/s eta 0:00:03
---------------------------------- ----- 24.9/29.1 MB 2.0 MB/s eta 0:00:03
---------------------------------- ----- 24.9/29.1 MB 2.0 MB/s eta 0:00:03
---------------------------------- ----- 25.0/29.1 MB 2.0 MB/s eta 0:00:03
---------------------------------- ----- 25.1/29.1 MB 2.0 MB/s eta 0:00:02
---------------------------------- ----- 25.2/29.1 MB 2.0 MB/s eta 0:00:02
---------------------------------- ----- 25.3/29.1 MB 2.0 MB/s eta 0:00:02
---------------------------------- ----- 25.3/29.1 MB 2.0 MB/s eta 0:00:02
---------------------------------- ----- 25.4/29.1 MB 2.0 MB/s eta 0:00:02
----------------------------------- ---- 25.5/29.1 MB 2.0 MB/s eta 0:00:02
----------------------------------- ---- 25.6/29.1 MB 2.0 MB/s eta 0:00:02
----------------------------------- ---- 25.7/29.1 MB 2.0 MB/s eta 0:00:02
----------------------------------- ---- 25.7/29.1 MB 2.0 MB/s eta 0:00:02
----------------------------------- ---- 25.8/29.1 MB 1.9 MB/s eta 0:00:02
----------------------------------- ---- 25.9/29.1 MB 1.9 MB/s eta 0:00:02
----------------------------------- ---- 26.0/29.1 MB 1.9 MB/s eta 0:00:02
----------------------------------- ---- 26.1/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.2/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.3/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.3/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.4/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.5/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.6/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.7/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.7/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------ --- 26.8/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------- -- 27.0/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------- -- 27.0/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------- -- 27.1/29.1 MB 1.9 MB/s eta 0:00:02
------------------------------------- -- 27.2/29.1 MB 1.9 MB/s eta 0:00:01
------------------------------------- -- 27.3/29.1 MB 1.9 MB/s eta 0:00:01
------------------------------------- -- 27.4/29.1 MB 1.9 MB/s eta 0:00:01
------------------------------------- -- 27.5/29.1 MB 1.9 MB/s eta 0:00:01
------------------------------------- -- 27.6/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 27.7/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 27.8/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 27.9/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 28.0/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 28.0/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 28.1/29.1 MB 1.9 MB/s eta 0:00:01
-------------------------------------- - 28.3/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.4/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.4/29.1 MB 1.9 MB/s eta 0:00:01
--------------------------------------- 28.5/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.7/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.8/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.8/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 28.9/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 29.1/29.1 MB 1.8 MB/s eta 0:00:01
--------------------------------------- 29.1/29.1 MB 1.8 MB/s eta 0:00:01
---------------------------------------- 29.1/29.1 MB 1.8 MB/s eta 0:00:00
Downloading sounddevice-0.4.6-py3-none-win_amd64.whl (199 kB)
Installing collected packages: sounddevice, pocketsphinx
Successfully installed pocketsphinx-5.0.3 sounddevice-0.4.6

In [52]: # obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Sphinx
try:
    print("Sphinx thinks you said " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))

Say something!
Sphinx thinks you said an

In [57]: import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.Microphone() as source:
    print("Adjusting noise ")
    recognizer.adjust_for_ambient_noise(source, duration=1)
    print("Recording for 4 seconds")
    # timeout is how long to wait for speech to start;
    # phrase_time_limit caps the length of the recording itself
    recorded_audio = recognizer.listen(source, timeout=4, phrase_time_limit=4)
    print("Done recording")

try:
    print("Recognizing the text")
    text = recognizer.recognize_google(
        recorded_audio,
        language="en-US"
    )
    print("Decoded Text : {}".format(text))
except Exception as ex:
    print(ex)

# list the audio devices visible to PyAudio
sr.Microphone.list_microphone_names()

Adjusting noise
Recording for 4 seconds
Done recording
Recognizing the text
Decoded Text : get lost

Out[57]: ['Microsoft Sound Mapper - Input',
'Microphone Array (Intel® Smart ',
'Microsoft Sound Mapper - Output',
'Speaker (Realtek(R) Audio)',
'Primary Sound Capture Driver',
'Microphone Array (Intel® Smart Sound Technology for Digital Microphones)',
'Primary Sound Driver',
'Speaker (Realtek(R) Audio)',
'Speaker (Realtek(R) Audio)',
'Microphone Array (Intel® Smart Sound Technology for Digital Microphones)',
'Microphone (Realtek HD Audio Mic input)',
'Speakers 1 (Realtek HD Audio output with SST)',
'Speakers 2 (Realtek HD Audio output with SST)',
'PC Speaker (Realtek HD Audio output with SST)',
'Stereo Mix (Realtek HD Audio Stereo input)',
'Headphones 1 (Realtek HD Audio 2nd output with SST)',
'Headphones 2 (Realtek HD Audio 2nd output with SST)',
'PC Speaker (Realtek HD Audio 2nd output with SST)',
'Microphone Array 1 ()',
'Microphone Array 2 ()',
'Microphone Array 3 ()']

In [56]: import speech_recognition as sr

recognizer = sr.Recognizer()

# reading the sound from a file
# note: sr.AudioFile expects WAV, AIFF or FLAC audio data
with sr.AudioFile(r"C:\Users\dhara\OneDrive\Desktop\shlok.mp3") as source:
    recorded_audio = recognizer.listen(source)
    print("Done recording")

# recognizing the audio
try:
    print("Recognizing the text")
    text = recognizer.recognize_google(
        recorded_audio,
        language="en-US"
    )
    print("Decoded Text : {}".format(text))
except Exception as ex:
    print(ex)

Done recording
Recognizing the text

4/23/24, 12:25 PM NLP.27.03.24

Tokenization
In [1]: import nltk

In [2]: nltk.download()

showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml

Out[2]: True

In [4]: import nltk.corpus

In [5]: import os

In [6]: x="hello data science, how are you? where were you these days?"
type(x)

Out[6]: str

In [7]: from nltk.tokenize import word_tokenize, sent_tokenize
z=word_tokenize(x)
y= sent_tokenize(x)
print(z)
print(y)

['hello', 'data', 'science', ',', 'how', 'are', 'you', '?', 'where', 'were',
 'you', 'these', 'days', '?']
['hello data science, how are you?', 'where were you these days?']

In [8]: s="python is awesome language"
tokens= nltk.word_tokenize(s)
tokens

Out[8]: ['python', 'is', 'awesome', 'language']

In [9]: print(os.listdir(nltk.data.find("corpora")))

['abc', 'abc.zip', 'alpino', 'alpino.zip', 'bcp47.zip', 'biocreative_ppi',
'biocreative_ppi.zip', 'brown', 'brown.zip', 'brown_tei', 'brown_tei.zip',
'cess_cat', 'cess_cat.zip', 'cess_esp', 'cess_esp.zip', 'chat80', 'chat80.zip',
'city_database', 'city_database.zip', 'cmudict', 'cmudict.zip',
'comparative_sentences', 'comparative_sentences.zip', 'comtrans.zip',
'conll2000', 'conll2000.zip', 'conll2002', 'conll2002.zip', 'conll2007.zip',
'crubadan', 'crubadan.zip', 'dependency_treebank', 'dependency_treebank.zip',
'dolch', 'dolch.zip', 'europarl_raw', 'europarl_raw.zip', 'extended_omw.zip',
'floresta', 'floresta.zip', 'framenet_v15', 'framenet_v15.zip', 'framenet_v17',
'framenet_v17.zip', 'gazetteers', 'gazetteers.zip', 'genesis', 'genesis.zip',
'gutenberg', 'gutenberg.zip', 'ieer', 'ieer.zip', 'inaugural', 'inaugural.zip',
'indian', 'indian.zip', 'jeita.zip', 'kimmo', 'kimmo.zip', 'knbc.zip',
'lin_thesaurus', 'lin_thesaurus.zip', 'machado.zip', 'mac_morpho',
'mac_morpho.zip', 'masc_tagged.zip', 'movie_reviews', 'movie_reviews.zip',
'mte_teip5', 'mte_teip5.zip', 'names', 'names.zip', 'nombank.1.0.zip',
'nonbreaking_prefixes', 'nonbreaking_prefixes.zip', 'nps_chat', 'nps_chat.zip',
'omw-1.4.zip', 'omw.zip', 'opinion_lexicon', 'opinion_lexicon.zip',
'panlex_swadesh.zip', 'paradigms', 'paradigms.zip', 'pe08', 'pe08.zip', 'pil',
'pil.zip', 'pl196x', 'pl196x.zip', 'ppattach', 'ppattach.zip',
'problem_reports', 'problem_reports.zip', 'product_reviews_1',
'product_reviews_1.zip', 'product_reviews_2', 'product_reviews_2.zip',
'propbank.zip', 'pros_cons', 'pros_cons.zip', 'ptb', 'ptb.zip', 'qc', 'qc.zip',
'reuters.zip', 'rte', 'rte.zip', 'semcor.zip', 'senseval', 'senseval.zip',
'sentence_polarity', 'sentence_polarity.zip', 'sentiwordnet',
'sentiwordnet.zip', 'shakespeare', 'shakespeare.zip', 'sinica_treebank',
'sinica_treebank.zip', 'smultron', 'smultron.zip', 'state_union',
'state_union.zip', 'stopwords', 'stopwords.zip', 'subjectivity',
'subjectivity.zip', 'swadesh', 'swadesh.zip', 'switchboard', 'switchboard.zip',
'timit', 'timit.zip', 'toolbox', 'toolbox.zip', 'treebank', 'treebank.zip',
'twitter_samples', 'twitter_samples.zip', 'udhr', 'udhr.zip', 'udhr2',
'udhr2.zip', 'unicode_samples', 'unicode_samples.zip',
'universal_treebanks_v20.zip', 'verbnet', 'verbnet.zip', 'verbnet3',
'verbnet3.zip', 'webtext', 'webtext.zip', 'wordnet.zip', 'wordnet2021.zip',
'wordnet2022', 'wordnet2022.zip', 'wordnet31.zip', 'wordnet_ic',
'wordnet_ic.zip', 'words', 'words.zip', 'ycoe', 'ycoe.zip']

In [10]: from nltk.corpus import brown
brown.words()

Out[10]: ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]

In [11]: nltk.corpus.gutenberg.fileids()

Out[11]: ['austen-emma.txt',
'austen-persuasion.txt',
'austen-sense.txt',
'bible-kjv.txt',
'blake-poems.txt',
'bryant-stories.txt',
'burgess-busterbrown.txt',
'carroll-alice.txt',
'chesterton-ball.txt',
'chesterton-brown.txt',
'chesterton-thursday.txt',
'edgeworth-parents.txt',
'melville-moby_dick.txt',
'milton-paradise.txt',
'shakespeare-caesar.txt',
'shakespeare-hamlet.txt',
'shakespeare-macbeth.txt',
'whitman-leaves.txt']

In [12]: bible= nltk.corpus.gutenberg.words( 'bible-kjv.txt')
bible

Out[12]: ['[', 'The', 'King', 'James', 'Bible', ']', 'The', ...]

In [13]: for words in bible[:100]:
    # end="" suppresses the newline, so the tokens are printed run together
    print(words, sep="", end="")

[TheKingJamesBible]TheOldTestamentoftheKingJamesBibleTheFirstBookofMoses:CalledGen
esis1:1InthebeginningGodcreatedtheheavenandtheearth.1:2Andtheearthwaswithoutform,a
ndvoid;anddarknesswasuponthefaceofthedeep.AndtheSpiritofGodmoveduponthefaceofthewa
ters.1:3AndGodsaid,Lettherebelight:andtherewaslight.1:4AndGodsawthelight,thatit

In [16]: from nltk.tokenize import word_tokenize

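In [17]: NLP="Why Tokenization is Required? ANS:Every sentence gets its meaning by the words present in it.So by analyzing the words present in the text we can easily interpret the meaning of the text.Once we have a list of words we can also use statistical tools and methods to get more insights into the text."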

In [18]: type(NLP)

Out[18]: str

In [19]: NLP_tokens= word_tokenize(NLP)
NLP_tokens

Out[19]: ['Why',
'Tokenization',
'is',
'Required',
'?',
'ANS',
':',
'Every',
'sentence',
'gets',
'its',
'meaning',
'by',
'the',
'words',
'present',
'in',
'it.So',
'by',
'analyzing',
'the',
'words',
'present',
'in',
'the',
'text',
'we',
'can',
'easily',
'interpret',
'the',
'meaning',
'of',
'the',
'text.Once',
'we',
'have',
'a',
'list',
'of',
'words',
'we',
'can',
'also',
'use',
'statistical',
'tools',
'and',
'methods',
'to',
'get',
'more',
'insights',
'into',
'the',
'text',
'.']

In [20]: from nltk.tokenize import sent_tokenize
NLP_TOEKN_SENT=sent_tokenize(NLP)
NLP_TOEKN_SENT

Out[20]: ['Why Tokenization is Required?',
'ANS:Every sentence gets its meaning by the words present in it.So by analyzing the words present in the text we can easily interpret the meaning of the text.Once we have a list of words we can also use statistical tools and methods to get more insights into the text.']
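
The tokens 'it.So' and 'text.Once' in Out[19], and the single long second sentence in Out[20], come from the missing space after the full stops in the input string. A minimal sketch of one possible clean-up step, assuming it is acceptable to insert a space wherever a full stop is directly followed by a capital letter. This heuristic is an illustration only, not part of the original notebook; it would also split abbreviations such as "U.S.A", and the helper name add_missing_spaces is hypothetical.

```python
import re

def add_missing_spaces(text):
    """Insert a space after a full stop that is directly followed by a capital letter."""
    return re.sub(r"\.(?=[A-Z])", ". ", text)

# Illustrative fragment modelled on the notebook's paragraph
sample = "words present in it.So by analyzing the text.Once we have a list"
fixed = add_missing_spaces(sample)
print(fixed)
```

Passing the cleaned string to word_tokenize and sent_tokenize should then let both tokenizers see the sentence boundaries.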

In [21]: len(NLP_tokens)

Out[21]: 57

In [22]: from nltk.probability import FreqDist
fdist=FreqDist()

In [23]: for words in NLP_tokens:
    fdist[words.lower()]+=1
fdist

Out[23]: FreqDist({'the': 6, 'words': 3, 'we': 3, 'meaning': 2, 'by': 2, 'present': 2, 'in': 2, 'text': 2, 'can': 2, 'of': 2, ...})

In [24]: fdist['natural']

Out[24]: 0

In [25]: fdist['meaning']

Out[25]: 2

In [26]: len(fdist) #of the 57 tokens, 41 are unique after lowercasing

Out[26]: 41

In [27]: #selecting top 5 tokens with highest frequency
fdist_tops5= fdist.most_common(5) # the five most frequent tokens
fdist_tops5

Out[27]: [('the', 6), ('words', 3), ('we', 3), ('meaning', 2), ('by', 2)]

In [28]: #selecting top 10 tokens with highest frequency
fdist_tops10= fdist.most_common(10)
fdist_tops10

Out[28]: [('the', 6),
('words', 3),
('we', 3),
('meaning', 2),
('by', 2),
('present', 2),
('in', 2),
('text', 2),
('can', 2),
('of', 2)]

In [29]: # most_common(10) returns the 10 most frequent tokens, so this repeats In [28]
fdist_BOTTOM10= fdist.most_common(10)
fdist_BOTTOM10

Out[29]: [('the', 6),
('words', 3),
('we', 3),
('meaning', 2),
('by', 2),
('present', 2),
('in', 2),
('text', 2),
('can', 2),
('of', 2)]
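
In [29] names its result fdist_BOTTOM10 but calls most_common(10), so it lists the ten most frequent tokens again. A minimal sketch of one way to get the least frequent tokens instead, using collections.Counter from the standard library as a stand-in for FreqDist (which subclasses Counter in NLTK 3). The short token list is illustrative and is not the notebook's data.

```python
from collections import Counter

# Illustrative tokens, not the notebook's paragraph
tokens = "the words in the text give the text its meaning".split()
counts = Counter(tokens)

# most_common() with no argument returns every item, most frequent first,
# so slicing from the end yields the least frequent items.
top3 = counts.most_common(3)
bottom3 = counts.most_common()[-3:]
print(top3)
print(bottom3)
```

With the notebook's data, the same slicing idea would be fdist.most_common()[-10:].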

NLP_blank
In [30]: #now lets look at other tokenizer, blankline tokenizer over the same paragraph, it
from nltk.tokenize import blankline_tokenize
NLP_blank= blankline_tokenize(NLP)

In [31]: NLP_blank

Out[31]: ['Why Tokenization is Required? ANS:Every sentence gets its meaning by the words present in it.So by analyzing the words present in the text we can easily interpret the meaning of the text.Once we have a list of words we can also use statistical tools and methods to get more insights into the text.']

In [32]: len(NLP_blank)

Out[32]: 1

In [33]: #to count the number of paragraphs separated by a blank line
len(NLP_blank)

Out[33]: 1

In [35]: # to view 1st paragraph use this, to view 2nd para use index 1 and so on...
NLP_blank[0]

Out[35]: 'Why Tokenization is Required? ANS:Every sentence gets its meaning by the words present in it.So by analyzing the words present in the text we can easily interpret the meaning of the text.Once we have a list of words we can also use statistical tools and methods to get more insights into the text.'

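In [36]: NLP1="Natural Language Processing, usually shortend as NLP is a branch of artificial intelligence that deals with the interaction between computers andd humans using the natural language.The ultimate objective of NLP is to read, decipher, understand, and make sense of the human languagees in a manaer that is valuable.Most NLP techniques rely on machine learning to derive meaning from human language."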

In [37]: NLP1_blank= blankline_tokenize(NLP1)

In [38]: NLP1_blank

Out[38]: ['Natural Language Processing, usually shortend as NLP is a branch of artificial intelligence that deals with the interaction between computers andd humans using the natural language.The ultimate objective of NLP is to read, decipher, understand, and make sense of the human languagees in a manaer that is valuable.Most NLP techniques rely on machine learning to derive meaning from human language.']
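
Both blank-line results above contain a single element because neither input string contains a blank line. A minimal sketch of the same idea on a text that does, using the standard-library re module as a stand-in for blankline_tokenize. The sample text and the helper name split_paragraphs are assumptions made for illustration.

```python
import re

def split_paragraphs(text):
    """Split text wherever a blank line occurs and drop empty pieces."""
    return [part for part in re.split(r"\s*\n\s*\n\s*", text) if part]

# Two paragraphs separated by one blank line; a single newline does not split
sample = "First paragraph, line one.\nStill the first paragraph.\n\nSecond paragraph."
paragraphs = split_paragraphs(sample)
print(len(paragraphs))
print(paragraphs[1])
```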

Bigram
A bigram model assumes that the occurrence of each word depends only on the word that precedes it.
Each pair of adjacent words is therefore counted as a single feature (one gram). Unigram features,
which look at each word in isolation, generally cannot give a good model.
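
To make the bigram idea concrete, here is a minimal sketch that estimates P(word | previous word) from raw counts. It uses only the standard library; the toy sentence and the helper name bigram_probability are hypothetical and do not come from the notebook.

```python
from collections import Counter

# Illustrative tokens, not the notebook's paragraph
tokens = "the words in the text give the text its meaning".split()

# Pair each token with its successor to form bigrams
bigram_counts = Counter(zip(tokens, tokens[1:]))
# Count how often each token appears as the first element of a bigram
context_counts = Counter(tokens[:-1])

def bigram_probability(prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for an unseen context."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

print(bigram_probability("the", "text"))
print(bigram_probability("the", "words"))
```

Here "the" is followed by "text" twice and by "words" once, so the first estimate is twice the second. A unigram count alone could not express that difference.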

In [39]: from nltk.util import bigrams, trigrams, ngrams

In [40]: NLP_bigrams= list(nltk.bigrams(NLP_tokens))
NLP_bigrams

Out[40]: [('Why', 'Tokenization'),
('Tokenization', 'is'),
('is', 'Required'),
('Required', '?'),
('?', 'ANS'),
('ANS', ':'),
(':', 'Every'),
('Every', 'sentence'),
('sentence', 'gets'),
('gets', 'its'),
('its', 'meaning'),
('meaning', 'by'),
('by', 'the'),
('the', 'words'),
('words', 'present'),
('present', 'in'),
('in', 'it.So'),
('it.So', 'by'),
('by', 'analyzing'),
('analyzing', 'the'),
('the', 'words'),
('words', 'present'),
('present', 'in'),
('in', 'the'),
('the', 'text'),
('text', 'we'),
('we', 'can'),
('can', 'easily'),
('easily', 'interpret'),
('interpret', 'the'),
('the', 'meaning'),
('meaning', 'of'),
('of', 'the'),
('the', 'text.Once'),
('text.Once', 'we'),
('we', 'have'),
('have', 'a'),
('a', 'list'),
('list', 'of'),
('of', 'words'),
('words', 'we'),
('we', 'can'),
('can', 'also'),
('also', 'use'),
('use', 'statistical'),
('statistical', 'tools'),
('tools', 'and'),
('and', 'methods'),
('methods', 'to'),
('to', 'get'),
('get', 'more'),
('more', 'insights'),
('insights', 'into'),
('into', 'the'),
('the', 'text'),
('text', '.')]

In [41]: NLP_trigrams= list(nltk.trigrams(NLP_tokens))
NLP_trigrams

Out[41]: [('Why', 'Tokenization', 'is'),
('Tokenization', 'is', 'Required'),
('is', 'Required', '?'),
('Required', '?', 'ANS'),
('?', 'ANS', ':'),
('ANS', ':', 'Every'),
(':', 'Every', 'sentence'),
('Every', 'sentence', 'gets'),
('sentence', 'gets', 'its'),
('gets', 'its', 'meaning'),
('its', 'meaning', 'by'),
('meaning', 'by', 'the'),
('by', 'the', 'words'),
('the', 'words', 'present'),
('words', 'present', 'in'),
('present', 'in', 'it.So'),
('in', 'it.So', 'by'),
('it.So', 'by', 'analyzing'),
('by', 'analyzing', 'the'),
('analyzing', 'the', 'words'),
('the', 'words', 'present'),
('words', 'present', 'in'),
('present', 'in', 'the'),
('in', 'the', 'text'),
('the', 'text', 'we'),
('text', 'we', 'can'),
('we', 'can', 'easily'),
('can', 'easily', 'interpret'),
('easily', 'interpret', 'the'),
('interpret', 'the', 'meaning'),
('the', 'meaning', 'of'),
('meaning', 'of', 'the'),
('of', 'the', 'text.Once'),
('the', 'text.Once', 'we'),
('text.Once', 'we', 'have'),
('we', 'have', 'a'),
('have', 'a', 'list'),
('a', 'list', 'of'),
('list', 'of', 'words'),
('of', 'words', 'we'),
('words', 'we', 'can'),
('we', 'can', 'also'),
('can', 'also', 'use'),
('also', 'use', 'statistical'),
('use', 'statistical', 'tools'),
('statistical', 'tools', 'and'),
('tools', 'and', 'methods'),
('and', 'methods', 'to'),
('methods', 'to', 'get'),
('to', 'get', 'more'),
('get', 'more', 'insights'),
('more', 'insights', 'into'),
('insights', 'into', 'the'),
('into', 'the', 'text'),
('the', 'text', '.')]

In [42]: NLP_ngrams= list(nltk.ngrams(NLP_tokens,4))


NLP_ngrams

Out[42]:
[('Why', 'Tokenization', 'is', 'Required'),
('Tokenization', 'is', 'Required', '?'),
('is', 'Required', '?', 'ANS'),
('Required', '?', 'ANS', ':'),
('?', 'ANS', ':', 'Every'),
('ANS', ':', 'Every', 'sentence'),
(':', 'Every', 'sentence', 'gets'),
('Every', 'sentence', 'gets', 'its'),
('sentence', 'gets', 'its', 'meaning'),
('gets', 'its', 'meaning', 'by'),
('its', 'meaning', 'by', 'the'),
('meaning', 'by', 'the', 'words'),
('by', 'the', 'words', 'present'),
('the', 'words', 'present', 'in'),
('words', 'present', 'in', 'it.So'),
('present', 'in', 'it.So', 'by'),
('in', 'it.So', 'by', 'analyzing'),
('it.So', 'by', 'analyzing', 'the'),
('by', 'analyzing', 'the', 'words'),
('analyzing', 'the', 'words', 'present'),
('the', 'words', 'present', 'in'),
('words', 'present', 'in', 'the'),
('present', 'in', 'the', 'text'),
('in', 'the', 'text', 'we'),
('the', 'text', 'we', 'can'),
('text', 'we', 'can', 'easily'),
('we', 'can', 'easily', 'interpret'),
('can', 'easily', 'interpret', 'the'),
('easily', 'interpret', 'the', 'meaning'),
('interpret', 'the', 'meaning', 'of'),
('the', 'meaning', 'of', 'the'),
('meaning', 'of', 'the', 'text.Once'),
('of', 'the', 'text.Once', 'we'),
('the', 'text.Once', 'we', 'have'),
('text.Once', 'we', 'have', 'a'),
('we', 'have', 'a', 'list'),
('have', 'a', 'list', 'of'),
('a', 'list', 'of', 'words'),
('list', 'of', 'words', 'we'),
('of', 'words', 'we', 'can'),
('words', 'we', 'can', 'also'),
('we', 'can', 'also', 'use'),
('can', 'also', 'use', 'statistical'),
('also', 'use', 'statistical', 'tools'),
('use', 'statistical', 'tools', 'and'),
('statistical', 'tools', 'and', 'methods'),
('tools', 'and', 'methods', 'to'),
('and', 'methods', 'to', 'get'),
('methods', 'to', 'get', 'more'),
('to', 'get', 'more', 'insights'),
('get', 'more', 'insights', 'into'),
('more', 'insights', 'into', 'the'),
('insights', 'into', 'the', 'text'),
('into', 'the', 'text', '.')]
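
The three outputs above follow one pattern: a window of n tokens slides over the list one position at a time, so N tokens give N - n + 1 n-grams (here 57 tokens give 56 bigrams, 55 trigrams and 54 four-grams). Below is a minimal pure-Python sketch of that sliding window. The function name sliding_ngrams is an assumption for illustration; it is not NLTK's implementation, only a sketch that produces the same tuples for a plain list.

```python
def sliding_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    # zip stops at the shortest shifted copy, which yields len(tokens) - n + 1 tuples.
    return list(zip(*(tokens[i:] for i in range(n))))

# First five tokens of the notebook's token list.
tokens = ['Why', 'Tokenization', 'is', 'Required', '?']

print(sliding_ngrams(tokens, 2))
print(sliding_ngrams(tokens, 4))
```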

In [43]: from nltk.corpus import stopwords


stop_words = set(stopwords.words('english')) # here english language. you may do i

print(stop_words)

{'our', 'than', 'between', "don't", "you'd", "mustn't", "wasn't", 'yourself',
'at', 'the', "hadn't", 'wouldn', 'because', 'not', "didn't", 'below', 'own',
'were', 'just', 's', 'on', "doesn't", 'by', 'shan', "shouldn't", 'which', 'for',
'myself', 'didn', 'of', 'what', 'and', 'yourselves', "you've", 'through', 'why',
"aren't", 'itself', 'themselves', 'all', 'nor', 'don', 'very', 'more', 'mustn',
'down', 'an', 'ain', "hasn't", "weren't", 'does', 'their', 'who', "it's", 'a',
'no', 'before', "won't", 'theirs', 'or', 'with', 'was', 'up', 'hadn', 'doing',
'this', 'during', 'now', 'weren', 'over', 'your', 'each', 'wasn', 'do', 'you',
'i', 'while', 'there', 'd', 'll', 'here', 'most', 'yours', 'did', 'after',
'both', 'whom', 'above', 'any', 'am', "mightn't", 'he', 'only', 'hers', 'have',
'it', 'until', 'if', 'are', 'when', 'other', "should've", 'me', 'these',
'ourselves', 'as', 'needn', 'has', 't', 'can', 'under', 'about', 'such',
'haven', 'been', 'we', 'will', 'she', 'then', 'herself', 'aren', 'is', 'should',
'being', 'ours', 'against', 'isn', 'too', 'that', 'mightn', 'again', 'him',
"needn't", 'but', "you'll", 'm', 'those', 'once', 'hasn', "shan't", 'couldn',
'some', 'out', 'had', 'o', 'her', 'shouldn', "wouldn't", 'further', "couldn't",
'from', "haven't", "isn't", 'doesn', 'in', 'ma', 'my', 'won', 'its', "you're",
'having', 'same', 'to', 'into', 'few', "that'll", 'where', 'y', 'how', 'so',
'be', "she's", 've', 'they', 'his', 'off', 'them', 'himself', 're'}

In [44]: print(NLP_tokens)
filtered_sen=[]
for w in NLP_tokens:
    if w not in stop_words:
        filtered_sen.append(w)
print(filtered_sen)

['Why', 'Tokenization', 'is', 'Required', '?', 'ANS', ':', 'Every', 'sentence',
'gets', 'its', 'meaning', 'by', 'the', 'words', 'present', 'in', 'it.So', 'by',
'analyzing', 'the', 'words', 'present', 'in', 'the', 'text', 'we', 'can',
'easily', 'interpret', 'the', 'meaning', 'of', 'the', 'text.Once', 'we', 'have',
'a', 'list', 'of', 'words', 'we', 'can', 'also', 'use', 'statistical', 'tools',
'and', 'methods', 'to', 'get', 'more', 'insights', 'into', 'the', 'text', '.']
['Why', 'Tokenization', 'Required', '?', 'ANS', ':', 'Every', 'sentence', 'gets',
'meaning', 'words', 'present', 'it.So', 'analyzing', 'words', 'present', 'text',
'easily', 'interpret', 'meaning', 'text.Once', 'list', 'words', 'also', 'use',
'statistical', 'tools', 'methods', 'get', 'insights', 'text', '.']

In [45]: filtered_sent=[w for w in NLP_tokens if w not in stop_words]


print(filtered_sent)

['Why', 'Tokenization', 'Required', '?', 'ANS', ':', 'Every', 'sentence', 'gets',
'meaning', 'words', 'present', 'it.So', 'analyzing', 'words', 'present', 'text',
'easily', 'interpret', 'meaning', 'text.Once', 'list', 'words', 'also', 'use',
'statistical', 'tools', 'methods', 'get', 'insights', 'text', '.']
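
Note that 'Why' survives the filter above even though 'why' appears in the printed stop-word set: the set entries are lowercase and the membership test is case-sensitive. Below is a minimal sketch of a case-insensitive variant. The small stop_words set is a hand-picked subset of the words printed above, an assumption made so the sketch runs without the NLTK corpus; the token list is likewise shortened.

```python
# Small illustrative subset of the stop words printed above (assumption:
# chosen by hand so the sketch runs without downloading the NLTK corpus).
stop_words = {'why', 'is', 'its', 'by', 'the', 'in', 'we', 'can'}

# Shortened token list modelled on NLP_tokens.
tokens = ['Why', 'Tokenization', 'is', 'Required', '?', 'Every', 'sentence']

# Compare the lowercased token, but keep the original spelling in the result.
filtered = [w for w in tokens if w.lower() not in stop_words]
print(filtered)
```

Whether sentence-initial words such as 'Why' should be dropped depends on the task, so this is a design choice rather than a correction.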

In [46]: from nltk.stem import PorterStemmer


ps=PorterStemmer()

sample_words= ["python","pythoning","pythoner","pythoned", "pythonly"]

for w in sample_words:
    print(ps.stem(w))

python
python
python
python
pythonli

In [47]: for w in NLP_tokens:
    print(ps.stem(w))

whi
token
is
requir
?
an
:
everi
sentenc
get
it
mean
by
the
word
present
in
it.so
by
analyz
the
word
present
in
the
text
we
can
easili
interpret
the
mean
of
the
text.onc
we
have
a
list
of
word
we
can
also
use
statist
tool
and
method
to
get
more
insight
into
the
text
.

In [48]: from nltk.corpus import state_union


from nltk.tokenize import PunktSentenceTokenizer  # unsupervised
train_txt= state_union.raw("2005-GWBush.txt")
sample_txt= state_union.raw("2006-GWBush.txt")
customsent_tokenizer= PunktSentenceTokenizer(train_txt)
tokenized=customsent_tokenizer.tokenize(sample_txt)


def process_content():
    # Tokenize each sentence into words, tag parts of speech, and print the result.
    try:
        for i in tokenized:
            words= nltk.word_tokenize(i)
            tagged=nltk.pos_tag(words)
            print(tagged)
    except Exception as e:
        print(str(e))

process_content()

[('PRESIDENT', 'NNP'), ('GEORGE', 'NNP'), ('W.', 'NNP'), ('BUSH', 'NNP'), ("'S",
'POS'), ('ADDRESS', 'NNP'), ('BEFORE', 'IN'), ('A', 'NNP'), ('JOINT', 'NNP'), ('SE
SSION', 'NNP'), ('OF', 'IN'), ('THE', 'NNP'), ('CONGRESS', 'NNP'), ('ON', 'NNP'),
('THE', 'NNP'), ('STATE', 'NNP'), ('OF', 'IN'), ('THE', 'NNP'), ('UNION', 'NNP'),
('January', 'NNP'), ('31', 'CD'), (',', ','), ('2006', 'CD'), ('THE', 'NNP'), ('PR
ESIDENT', 'NNP'), (':', ':'), ('Thank', 'NNP'), ('you', 'PRP'), ('all', 'DT'),
('.', '.')]
[('Mr.', 'NNP'), ('Speaker', 'NNP'), (',', ','), ('Vice', 'NNP'), ('President', 'N
NP'), ('Cheney', 'NNP'), (',', ','), ('members', 'NNS'), ('of', 'IN'), ('Congres
s', 'NNP'), (',', ','), ('members', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('Suprem
e', 'NNP'), ('Court', 'NNP'), ('and', 'CC'), ('diplomatic', 'JJ'), ('corps', 'N
N'), (',', ','), ('distinguished', 'JJ'), ('guests', 'NNS'), (',', ','), ('and',
'CC'), ('fellow', 'JJ'), ('citizens', 'NNS'), (':', ':'), ('Today', 'VB'), ('our',
'PRP$'), ('nation', 'NN'), ('lost', 'VBD'), ('a', 'DT'), ('beloved', 'VBN'), (',',
','), ('graceful', 'JJ'), (',', ','), ('courageous', 'JJ'), ('woman', 'NN'), ('wh
o', 'WP'), ('called', 'VBD'), ('America', 'NNP'), ('to', 'TO'), ('its', 'PRP$'),
('founding', 'NN'), ('ideals', 'NNS'), ('and', 'CC'), ('carried', 'VBD'), ('on',
'IN'), ('a', 'DT'), ('noble', 'JJ'), ('dream', 'NN'), ('.', '.')]
[('Tonight', 'NN'), ('we', 'PRP'), ('are', 'VBP'), ('comforted', 'VBN'), ('by', 'I
N'), ('the', 'DT'), ('hope', 'NN'), ('of', 'IN'), ('a', 'DT'), ('glad', 'JJ'), ('r
eunion', 'NN'), ('with', 'IN'), ('the', 'DT'), ('husband', 'NN'), ('who', 'WP'),
('was', 'VBD'), ('taken', 'VBN'), ('so', 'RB'), ('long', 'RB'), ('ago', 'RB'),
(',', ','), ('and', 'CC'), ('we', 'PRP'), ('are', 'VBP'), ('grateful', 'JJ'), ('fo
r', 'IN'), ('the', 'DT'), ('good', 'JJ'), ('life', 'NN'), ('of', 'IN'), ('Corett
a', 'NNP'), ('Scott', 'NNP'), ('King', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('President', 'NNP'), ('George', 'NNP'), ('W.', 'NNP'), ('Bush', 'NNP'), ('react
s', 'VBZ'), ('to', 'TO'), ('applause', 'VB'), ('during', 'IN'), ('his', 'PRP$'),
('State', 'NNP'), ('of', 'IN'), ('the', 'DT'), ('Union', 'NNP'), ('Address', 'NN
P'), ('at', 'IN'), ('the', 'DT'), ('Capitol', 'NNP'), (',', ','), ('Tuesday', 'NN
P'), (',', ','), ('Jan', 'NNP'), ('.', '.')]
[('31', 'CD'), (',', ','), ('2006', 'CD'), ('.', '.')]
[('White', 'NNP'), ('House', 'NNP'), ('photo', 'NN'), ('by', 'IN'), ('Eric', 'NN
P'), ('DraperEvery', 'NNP'), ('time', 'NN'), ('I', 'PRP'), ("'m", 'VBP'), ('invite
d', 'JJ'), ('to', 'TO'), ('this', 'DT'), ('rostrum', 'NN'), (',', ','), ('I', 'PR
P'), ("'m", 'VBP'), ('humbled', 'VBN'), ('by', 'IN'), ('the', 'DT'), ('privilege',
'NN'), (',', ','), ('and', 'CC'), ('mindful', 'NN'), ('of', 'IN'), ('the', 'DT'),
('history', 'NN'), ('we', 'PRP'), ("'ve", 'VBP'), ('seen', 'VBN'), ('together', 'R
B'), ('.', '.')]
[('We', 'PRP'), ('have', 'VBP'), ('gathered', 'VBN'), ('under', 'IN'), ('this', 'D
T'), ('Capitol', 'NNP'), ('dome', 'NN'), ('in', 'IN'), ('moments', 'NNS'), ('of',
'IN'), ('national', 'JJ'), ('mourning', 'NN'), ('and', 'CC'), ('national', 'JJ'),
('achievement', 'NN'), ('.', '.')]
[('We', 'PRP'), ('have', 'VBP'), ('served', 'VBN'), ('America', 'NNP'), ('throug
h', 'IN'), ('one', 'CD'), ('of', 'IN'), ('the', 'DT'), ('most', 'RBS'), ('conseque
ntial', 'JJ'), ('periods', 'NNS'), ('of', 'IN'), ('our', 'PRP$'), ('history', 'N
N'), ('--', ':'), ('and', 'CC'), ('it', 'PRP'), ('has', 'VBZ'), ('been', 'VBN'),
('my', 'PRP$'), ('honor', 'NN'), ('to', 'TO'), ('serve', 'VB'), ('with', 'IN'),
('you', 'PRP'), ('.', '.')]
[('In', 'IN'), ('a', 'DT'), ('system', 'NN'), ('of', 'IN'), ('two', 'CD'), ('parti
es', 'NNS'), (',', ','), ('two', 'CD'), ('chambers', 'NNS'), (',', ','), ('and',
'CC'), ('two', 'CD'), ('elected', 'JJ'), ('branches', 'NNS'), (',', ','), ('ther
e', 'EX'), ('will', 'MD'), ('always', 'RB'), ('be', 'VB'), ('differences', 'NNS'),
('and', 'CC'), ('debate', 'NN'), ('.', '.')]
[('But', 'CC'), ('even', 'RB'), ('tough', 'JJ'), ('debates', 'NNS'), ('can', 'M
D'), ('be', 'VB'), ('conducted', 'VBN'), ('in', 'IN'), ('a', 'DT'), ('civil', 'J
J'), ('tone', 'NN'), (',', ','), ('and', 'CC'), ('our', 'PRP$'), ('differences',
'NNS'), ('can', 'MD'), ('not', 'RB'), ('be', 'VB'), ('allowed', 'VBN'), ('to', 'T
O'), ('harden', 'VB'), ('into', 'IN'), ('anger', 'NN'), ('.', '.')]
[('To', 'TO'), ('confront', 'VB'), ('the', 'DT'), ('great', 'JJ'), ('issues', 'NN
S'), ('before', 'IN'), ('us', 'PRP'), (',', ','), ('we', 'PRP'), ('must', 'MD'),
('act', 'VB'), ('in', 'IN'), ('a', 'DT'), ('spirit', 'NN'), ('of', 'IN'), ('goodwi
ll', 'NN'), ('and', 'CC'), ('respect', 'NN'), ('for', 'IN'), ('one', 'CD'), ('anot
her', 'DT'), ('--', ':'), ('and', 'CC'), ('I', 'PRP'), ('will', 'MD'), ('do', 'V
B'), ('my', 'PRP$'), ('part', 'NN'), ('.', '.')]
[('Tonight', 'NNP'), ('the', 'DT'), ('state', 'NN'), ('of', 'IN'), ('our', 'PRP
$'), ('Union', 'NNP'), ('is', 'VBZ'), ('strong', 'JJ'), ('--', ':'), ('and', 'C
C'), ('together', 'RB'), ('we', 'PRP'), ('will', 'MD'), ('make', 'VB'), ('it', 'PR
P'), ('stronger', 'JJR'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('In', 'IN'), ('this', 'DT'), ('decisive', 'JJ'), ('year', 'NN'), (',', ','), ('y
ou', 'PRP'), ('and', 'CC'), ('I', 'PRP'), ('will', 'MD'), ('make', 'VB'), ('choice
s', 'NNS'), ('that', 'WDT'), ('determine', 'VBP'), ('both', 'DT'), ('the', 'DT'),
('future', 'NN'), ('and', 'CC'), ('the', 'DT'), ('character', 'NN'), ('of', 'IN'),
('our', 'PRP$'), ('country', 'NN'), ('.', '.')]
[('We', 'PRP'), ('will', 'MD'), ('choose', 'VB'), ('to', 'TO'), ('act', 'VB'), ('c
onfidently', 'RB'), ('in', 'IN'), ('pursuing', 'VBG'), ('the', 'DT'), ('enemies',
'NNS'), ('of', 'IN'), ('freedom', 'NN'), ('--', ':'), ('or', 'CC'), ('retreat', 'N
N'), ('from', 'IN'), ('our', 'PRP$'), ('duties', 'NNS'), ('in', 'IN'), ('the', 'D
T'), ('hope', 'NN'), ('of', 'IN'), ('an', 'DT'), ('easier', 'JJR'), ('life', 'N
N'), ('.', '.')]
[('We', 'PRP'), ('will', 'MD'), ('choose', 'VB'), ('to', 'TO'), ('build', 'VB'),
('our', 'PRP$'), ('prosperity', 'NN'), ('by', 'IN'), ('leading', 'VBG'), ('the',
'DT'), ('world', 'NN'), ('economy', 'NN'), ('--', ':'), ('or', 'CC'), ('shut', 'V
B'), ('ourselves', 'PRP'), ('off', 'RP'), ('from', 'IN'), ('trade', 'NN'), ('and',
'CC'), ('opportunity', 'NN'), ('.', '.')]
[('In', 'IN'), ('a', 'DT'), ('complex', 'JJ'), ('and', 'CC'), ('challenging', 'J
J'), ('time', 'NN'), (',', ','), ('the', 'DT'), ('road', 'NN'), ('of', 'IN'), ('is
olationism', 'NN'), ('and', 'CC'), ('protectionism', 'NN'), ('may', 'MD'), ('see
m', 'VB'), ('broad', 'JJ'), ('and', 'CC'), ('inviting', 'NN'), ('--', ':'), ('ye
t', 'CC'), ('it', 'PRP'), ('ends', 'VBZ'), ('in', 'IN'), ('danger', 'NN'), ('and',
'CC'), ('decline', 'NN'), ('.', '.')]
[('The', 'DT'), ('only', 'JJ'), ('way', 'NN'), ('to', 'TO'), ('protect', 'VB'),
('our', 'PRP$'), ('people', 'NNS'), (',', ','), ('the', 'DT'), ('only', 'JJ'), ('w
ay', 'NN'), ('to', 'TO'), ('secure', 'VB'), ('the', 'DT'), ('peace', 'NN'), (',',
','), ('the', 'DT'), ('only', 'JJ'), ('way', 'NN'), ('to', 'TO'), ('control', 'V
B'), ('our', 'PRP$'), ('destiny', 'NN'), ('is', 'VBZ'), ('by', 'IN'), ('our', 'PRP
$'), ('leadership', 'NN'), ('--', ':'), ('so', 'IN'), ('the', 'DT'), ('United', 'N
NP'), ('States', 'NNPS'), ('of', 'IN'), ('America', 'NNP'), ('will', 'MD'), ('cont
inue', 'VB'), ('to', 'TO'), ('lead', 'VB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Abroad', 'RB'), (',', ','), ('our', 'PRP$'), ('nation', 'NN'), ('is', 'VBZ'),
('committed', 'VBN'), ('to', 'TO'), ('an', 'DT'), ('historic', 'JJ'), (',', ','),
('long-term', 'JJ'), ('goal', 'NN'), ('--', ':'), ('we', 'PRP'), ('seek', 'VBP'),
('the', 'DT'), ('end', 'NN'), ('of', 'IN'), ('tyranny', 'NN'), ('in', 'IN'), ('ou
r', 'PRP$'), ('world', 'NN'), ('.', '.')]
[('Some', 'DT'), ('dismiss', 'VBP'), ('that', 'DT'), ('goal', 'NN'), ('as', 'IN'),
('misguided', 'JJ'), ('idealism', 'NN'), ('.', '.')]
[('In', 'IN'), ('reality', 'NN'), (',', ','), ('the', 'DT'), ('future', 'JJ'), ('s
ecurity', 'NN'), ('of', 'IN'), ('America', 'NNP'), ('depends', 'VBZ'), ('on', 'I
N'), ('it', 'PRP'), ('.', '.')]
[('On', 'IN'), ('September', 'NNP'), ('the', 'DT'), ('11th', 'CD'), (',', ','),
('2001', 'CD'), (',', ','), ('we', 'PRP'), ('found', 'VBD'), ('that', 'IN'), ('pro
blems', 'NNS'), ('originating', 'VBG'), ('in', 'IN'), ('a', 'DT'), ('failed', 'J
J'), ('and', 'CC'), ('oppressive', 'JJ'), ('state', 'NN'), ('7,000', 'CD'), ('mile
s', 'NNS'), ('away', 'RB'), ('could', 'MD'), ('bring', 'VB'), ('murder', 'NN'),
('and', 'CC'), ('destruction', 'NN'), ('to', 'TO'), ('our', 'PRP$'), ('country',
'NN'), ('.', '.')]
[('Dictatorships', 'NNP'), ('shelter', 'NN'), ('terrorists', 'NNS'), (',', ','),
('and', 'CC'), ('feed', 'VB'), ('resentment', 'NN'), ('and', 'CC'), ('radicalism',
'NN'), (',', ','), ('and', 'CC'), ('seek', 'JJ'), ('weapons', 'NNS'), ('of', 'I
N'), ('mass', 'NN'), ('destruction', 'NN'), ('.', '.')]
[('Democracies', 'NNS'), ('replace', 'VB'), ('resentment', 'NN'), ('with', 'IN'),
('hope', 'NN'), (',', ','), ('respect', 'VB'), ('the', 'DT'), ('rights', 'NNS'),
('of', 'IN'), ('their', 'PRP$'), ('citizens', 'NNS'), ('and', 'CC'), ('their', 'PR
P$'), ('neighbors', 'NNS'), (',', ','), ('and', 'CC'), ('join', 'VB'), ('the', 'D
T'), ('fight', 'NN'), ('against', 'IN'), ('terror', 'NN'), ('.', '.')]
[('Every', 'DT'), ('step', 'NN'), ('toward', 'IN'), ('freedom', 'NN'), ('in', 'I
N'), ('the', 'DT'), ('world', 'NN'), ('makes', 'VBZ'), ('our', 'PRP$'), ('countr
y', 'NN'), ('safer', 'NN'), ('--', ':'), ('so', 'IN'), ('we', 'PRP'), ('will', 'M
D'), ('act', 'VB'), ('boldly', 'RB'), ('in', 'IN'), ('freedom', 'NN'), ("'s", 'PO
S'), ('cause', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Far', 'RB'), ('from', 'IN'), ('being', 'VBG'), ('a', 'DT'), ('hopeless', 'NN'),
('dream', 'NN'), (',', ','), ('the', 'DT'), ('advance', 'NN'), ('of', 'IN'), ('fre
edom', 'NN'), ('is', 'VBZ'), ('the', 'DT'), ('great', 'JJ'), ('story', 'NN'), ('o
f', 'IN'), ('our', 'PRP$'), ('time', 'NN'), ('.', '.')]
[('In', 'IN'), ('1945', 'CD'), (',', ','), ('there', 'EX'), ('were', 'VBD'), ('abo
ut', 'RB'), ('two', 'CD'), ('dozen', 'NN'), ('lonely', 'RB'), ('democracies', 'VB
Z'), ('in', 'IN'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]
[('Today', 'NN'), (',', ','), ('there', 'EX'), ('are', 'VBP'), ('122', 'CD'),
('.', '.')]
[('And', 'CC'), ('we', 'PRP'), ("'re", 'VBP'), ('writing', 'VBG'), ('a', 'DT'),
('new', 'JJ'), ('chapter', 'NN'), ('in', 'IN'), ('the', 'DT'), ('story', 'NN'),
('of', 'IN'), ('self-government', 'JJ'), ('--', ':'), ('with', 'IN'), ('women', 'N
NS'), ('lining', 'VBG'), ('up', 'RP'), ('to', 'TO'), ('vote', 'VB'), ('in', 'IN'),
('Afghanistan', 'NNP'), (',', ','), ('and', 'CC'), ('millions', 'NNS'), ('of', 'I
N'), ('Iraqis', 'NNP'), ('marking', 'VBG'), ('their', 'PRP$'), ('liberty', 'NN'),
('with', 'IN'), ('purple', 'JJ'), ('ink', 'NN'), (',', ','), ('and', 'CC'), ('me
n', 'NNS'), ('and', 'CC'), ('women', 'NNS'), ('from', 'IN'), ('Lebanon', 'NNP'),
('to', 'TO'), ('Egypt', 'NNP'), ('debating', 'VBG'), ('the', 'DT'), ('rights', 'NN
S'), ('of', 'IN'), ('individuals', 'NNS'), ('and', 'CC'), ('the', 'DT'), ('necessi
ty', 'NN'), ('of', 'IN'), ('freedom', 'NN'), ('.', '.')]
[('At', 'IN'), ('the', 'DT'), ('start', 'NN'), ('of', 'IN'), ('2006', 'CD'), (',',
','), ('more', 'JJR'), ('than', 'IN'), ('half', 'PDT'), ('the', 'DT'), ('people',
'NNS'), ('of', 'IN'), ('our', 'PRP$'), ('world', 'NN'), ('live', 'VBP'), ('in', 'I
N'), ('democratic', 'JJ'), ('nations', 'NNS'), ('.', '.')]
[('And', 'CC'), ('we', 'PRP'), ('do', 'VBP'), ('not', 'RB'), ('forget', 'VB'), ('t
he', 'DT'), ('other', 'JJ'), ('half', 'NN'), ('--', ':'), ('in', 'IN'), ('places',
'NNS'), ('like', 'IN'), ('Syria', 'NNP'), ('and', 'CC'), ('Burma', 'NNP'), (',',
','), ('Zimbabwe', 'NNP'), (',', ','), ('North', 'NNP'), ('Korea', 'NNP'), (',',
','), ('and', 'CC'), ('Iran', 'NNP'), ('--', ':'), ('because', 'IN'), ('the', 'D
T'), ('demands', 'NNS'), ('of', 'IN'), ('justice', 'NN'), (',', ','), ('and', 'C
C'), ('the', 'DT'), ('peace', 'NN'), ('of', 'IN'), ('this', 'DT'), ('world', 'N
N'), (',', ','), ('require', 'VBP'), ('their', 'PRP$'), ('freedom', 'NN'), (',',
','), ('as', 'RB'), ('well', 'RB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('President', 'NNP'), ('George', 'NNP'), ('W.', 'NNP'), ('Bush', 'NNP'), ('delive
rs', 'VBZ'), ('his', 'PRP$'), ('State', 'NN'), ('of', 'IN'), ('the', 'DT'), ('Unio
n', 'NNP'), ('Address', 'NNP'), ('at', 'IN'), ('the', 'DT'), ('Capitol', 'NNP'),
(',', ','), ('Tuesday', 'NNP'), (',', ','), ('Jan', 'NNP'), ('.', '.')]
[('31', 'CD'), (',', ','), ('2006', 'CD'), ('.', '.')]
[('White', 'NNP'), ('House', 'NNP'), ('photo', 'NN'), ('by', 'IN'), ('Eric', 'NN
P'), ('Draper', 'NNP'), ('No', 'NNP'), ('one', 'NN'), ('can', 'MD'), ('deny', 'V
B'), ('the', 'DT'), ('success', 'NN'), ('of', 'IN'), ('freedom', 'NN'), (',',
','), ('but', 'CC'), ('some', 'DT'), ('men', 'NNS'), ('rage', 'VB'), ('and', 'C
C'), ('fight', 'VB'), ('against', 'IN'), ('it', 'PRP'), ('.', '.')]
[('And', 'CC'), ('one', 'CD'), ('of', 'IN'), ('the', 'DT'), ('main', 'JJ'), ('sour
ces', 'NNS'), ('of', 'IN'), ('reaction', 'NN'), ('and', 'CC'), ('opposition', 'N
N'), ('is', 'VBZ'), ('radical', 'JJ'), ('Islam', 'NNP'), ('--', ':'), ('the', 'D
T'), ('perversion', 'NN'), ('by', 'IN'), ('a', 'DT'), ('few', 'JJ'), ('of', 'IN'),
('a', 'DT'), ('noble', 'JJ'), ('faith', 'NN'), ('into', 'IN'), ('an', 'DT'), ('ide
ology', 'NN'), ('of', 'IN'), ('terror', 'NN'), ('and', 'CC'), ('death', 'NN'),
('.', '.')]
[('Terrorists', 'NNS'), ('like', 'IN'), ('bin', 'NN'), ('Laden', 'NNP'), ('are',
'VBP'), ('serious', 'JJ'), ('about', 'IN'), ('mass', 'NN'), ('murder', 'NN'), ('--
', ':'), ('and', 'CC'), ('all', 'DT'), ('of', 'IN'), ('us', 'PRP'), ('must', 'M
D'), ('take', 'VB'), ('their', 'PRP$'), ('declared', 'JJ'), ('intentions', 'NNS'),
('seriously', 'RB'), ('.', '.')]
[('They', 'PRP'), ('seek', 'VBP'), ('to', 'TO'), ('impose', 'VB'), ('a', 'DT'),
('heartless', 'NN'), ('system', 'NN'), ('of', 'IN'), ('totalitarian', 'JJ'), ('con
trol', 'NN'), ('throughout', 'IN'), ('the', 'DT'), ('Middle', 'NNP'), ('East', 'NN
P'), (',', ','), ('and', 'CC'), ('arm', 'NN'), ('themselves', 'PRP'), ('with', 'I
N'), ('weapons', 'NNS'), ('of', 'IN'), ('mass', 'NN'), ('murder', 'NN'), ('.',
'.')]
[('Their', 'PRP$'), ('aim', 'NN'), ('is', 'VBZ'), ('to', 'TO'), ('seize', 'VB'),
('power', 'NN'), ('in', 'IN'), ('Iraq', 'NNP'), (',', ','), ('and', 'CC'), ('use',
'VB'), ('it', 'PRP'), ('as', 'IN'), ('a', 'DT'), ('safe', 'JJ'), ('haven', 'NN'),
('to', 'TO'), ('launch', 'VB'), ('attacks', 'NNS'), ('against', 'IN'), ('America',
'NNP'), ('and', 'CC'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]
[('Lacking', 'VBG'), ('the', 'DT'), ('military', 'JJ'), ('strength', 'NN'), ('to',
'TO'), ('challenge', 'VB'), ('us', 'PRP'), ('directly', 'RB'), (',', ','), ('the',
'DT'), ('terrorists', 'NNS'), ('have', 'VBP'), ('chosen', 'VBN'), ('the', 'DT'),
('weapon', 'NN'), ('of', 'IN'), ('fear', 'NN'), ('.', '.')]
[('When', 'WRB'), ('they', 'PRP'), ('murder', 'VBP'), ('children', 'NNS'), ('at',
'IN'), ('a', 'DT'), ('school', 'NN'), ('in', 'IN'), ('Beslan', 'NNP'), (',', ','),
('or', 'CC'), ('blow', 'VB'), ('up', 'RP'), ('commuters', 'NNS'), ('in', 'IN'),
('London', 'NNP'), (',', ','), ('or', 'CC'), ('behead', 'VB'), ('a', 'DT'), ('boun
d', 'NN'), ('captive', 'NN'), (',', ','), ('the', 'DT'), ('terrorists', 'NNS'),
('hope', 'VBP'), ('these', 'DT'), ('horrors', 'NNS'), ('will', 'MD'), ('break', 'V
B'), ('our', 'PRP$'), ('will', 'MD'), (',', ','), ('allowing', 'VBG'), ('the', 'D
T'), ('violent', 'NN'), ('to', 'TO'), ('inherit', 'VB'), ('the', 'DT'), ('Earth',
'NNP'), ('.', '.')]
[('But', 'CC'), ('they', 'PRP'), ('have', 'VBP'), ('miscalculated', 'VBN'), (':',
':'), ('We', 'PRP'), ('love', 'VBP'), ('our', 'PRP$'), ('freedom', 'NN'), (',',
','), ('and', 'CC'), ('we', 'PRP'), ('will', 'MD'), ('fight', 'VB'), ('to', 'TO'),
('keep', 'VB'), ('it', 'PRP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('In', 'IN'), ('a', 'DT'), ('time', 'NN'), ('of', 'IN'), ('testing', 'VBG'),
(',', ','), ('we', 'PRP'), ('can', 'MD'), ('not', 'RB'), ('find', 'VB'), ('securit
y', 'NN'), ('by', 'IN'), ('abandoning', 'VBG'), ('our', 'PRP$'), ('commitments',
'NNS'), ('and', 'CC'), ('retreating', 'VBG'), ('within', 'IN'), ('our', 'PRP$'),
('borders', 'NNS'), ('.', '.')]
[('If', 'IN'), ('we', 'PRP'), ('were', 'VBD'), ('to', 'TO'), ('leave', 'VB'), ('th
ese', 'DT'), ('vicious', 'JJ'), ('attackers', 'NNS'), ('alone', 'RB'), (',', ','),
('they', 'PRP'), ('would', 'MD'), ('not', 'RB'), ('leave', 'VB'), ('us', 'PRP'),
('alone', 'RB'), ('.', '.')]
[('They', 'PRP'), ('would', 'MD'), ('simply', 'RB'), ('move', 'VB'), ('the', 'D
T'), ('battlefield', 'NN'), ('to', 'TO'), ('our', 'PRP$'), ('own', 'JJ'), ('shore
s', 'NNS'), ('.', '.')]
[('There', 'EX'), ('is', 'VBZ'), ('no', 'DT'), ('peace', 'NN'), ('in', 'IN'), ('re
treat', 'NN'), ('.', '.')]
[('And', 'CC'), ('there', 'EX'), ('is', 'VBZ'), ('no', 'DT'), ('honor', 'NN'), ('i
n', 'IN'), ('retreat', 'NN'), ('.', '.')]
[('By', 'IN'), ('allowing', 'VBG'), ('radical', 'JJ'), ('Islam', 'NNP'), ('to', 'T
O'), ('work', 'VB'), ('its', 'PRP$'), ('will', 'MD'), ('--', ':'), ('by', 'IN'),
('leaving', 'VBG'), ('an', 'DT'), ('assaulted', 'JJ'), ('world', 'NN'), ('to', 'T
O'), ('fend', 'VB'), ('for', 'IN'), ('itself', 'PRP'), ('--', ':'), ('we', 'PRP'),
('would', 'MD'), ('signal', 'VB'), ('to', 'TO'), ('all', 'PDT'), ('that', 'IN'),
('we', 'PRP'), ('no', 'DT'), ('longer', 'RBR'), ('believe', 'VBP'), ('in', 'IN'),
('our', 'PRP$'), ('own', 'JJ'), ('ideals', 'NNS'), (',', ','), ('or', 'CC'), ('eve
n', 'RB'), ('in', 'IN'), ('our', 'PRP$'), ('own', 'JJ'), ('courage', 'NN'), ('.',
'.')]
[('But', 'CC'), ('our', 'PRP$'), ('enemies', 'NNS'), ('and', 'CC'), ('our', 'PRP
$'), ('friends', 'NNS'), ('can', 'MD'), ('be', 'VB'), ('certain', 'JJ'), (':',
':'), ('The', 'DT'), ('United', 'NNP'), ('States', 'NNPS'), ('will', 'MD'), ('no
t', 'RB'), ('retreat', 'VB'), ('from', 'IN'), ('the', 'DT'), ('world', 'NN'),
(',', ','), ('and', 'CC'), ('we', 'PRP'), ('will', 'MD'), ('never', 'RB'), ('surre
nder', 'VB'), ('to', 'TO'), ('evil', 'VB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('America', 'NNP'), ('rejects', 'VBZ'), ('the', 'DT'), ('false', 'JJ'), ('comfor
t', 'NN'), ('of', 'IN'), ('isolationism', 'NN'), ('.', '.')]
[('We', 'PRP'), ('are', 'VBP'), ('the', 'DT'), ('nation', 'NN'), ('that', 'IN'),
('saved', 'VBD'), ('liberty', 'NN'), ('in', 'IN'), ('Europe', 'NNP'), (',', ','),
('and', 'CC'), ('liberated', 'VBD'), ('death', 'NN'), ('camps', 'NNS'), (',',
','), ('and', 'CC'), ('helped', 'VBD'), ('raise', 'VB'), ('up', 'RP'), ('democraci
es', 'NNS'), (',', ','), ('and', 'CC'), ('faced', 'VBD'), ('down', 'IN'), ('an',
'DT'), ('evil', 'JJ'), ('empire', 'NN'), ('.', '.')]
[('Once', 'RB'), ('again', 'RB'), (',', ','), ('we', 'PRP'), ('accept', 'VBP'),
('the', 'DT'), ('call', 'NN'), ('of', 'IN'), ('history', 'NN'), ('to', 'TO'), ('de
liver', 'VB'), ('the', 'DT'), ('oppressed', 'VBN'), ('and', 'CC'), ('move', 'VB'),
('this', 'DT'), ('world', 'NN'), ('toward', 'IN'), ('peace', 'NN'), ('.', '.')]
[('We', 'PRP'), ('remain', 'VBP'), ('on', 'IN'), ('the', 'DT'), ('offensive', 'J
J'), ('against', 'IN'), ('terror', 'NN'), ('networks', 'NNS'), ('.', '.')]
[('We', 'PRP'), ('have', 'VBP'), ('killed', 'VBN'), ('or', 'CC'), ('captured', 'VB
N'), ('many', 'JJ'), ('of', 'IN'), ('their', 'PRP$'), ('leaders', 'NNS'), ('--',
':'), ('and', 'CC'), ('for', 'IN'), ('the', 'DT'), ('others', 'NNS'), (',', ','),
('their', 'PRP$'), ('day', 'NN'), ('will', 'MD'), ('come', 'VB'), ('.', '.')]
[('President', 'NNP'), ('George', 'NNP'), ('W.', 'NNP'), ('Bush', 'NNP'), ('greet
s', 'VBZ'), ('members', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), ('after', 'I
N'), ('his', 'PRP$'), ('State', 'NN'), ('of', 'IN'), ('the', 'DT'), ('Union', 'NN
P'), ('Address', 'NNP'), ('at', 'IN'), ('the', 'DT'), ('Capitol', 'NNP'), (',',
','), ('Tuesday', 'NNP'), (',', ','), ('Jan', 'NNP'), ('.', '.')]
[('31', 'CD'), (',', ','), ('2006', 'CD'), ('.', '.')]
[('White', 'NNP'), ('House', 'NNP'), ('photo', 'NN'), ('by', 'IN'), ('Eric', 'NN
P'), ('Draper', 'NNP'), ('We', 'PRP'), ('remain', 'VBP'), ('on', 'IN'), ('the', 'D
T'), ('offensive', 'JJ'), ('in', 'IN'), ('Afghanistan', 'NNP'), (',', ','), ('wher
e', 'WRB'), ('a', 'DT'), ('fine', 'JJ'), ('President', 'NNP'), ('and', 'CC'),
('a', 'DT'), ('National', 'NNP'), ('Assembly', 'NNP'), ('are', 'VBP'), ('fightin
g', 'VBG'), ('terror', 'NN'), ('while', 'IN'), ('building', 'VBG'), ('the', 'DT'),
('institutions', 'NNS'), ('of', 'IN'), ('a', 'DT'), ('new', 'JJ'), ('democracy',
'NN'), ('.', '.')]
[('We', 'PRP'), ("'re", 'VBP'), ('on', 'IN'), ('the', 'DT'), ('offensive', 'JJ'),
('in', 'IN'), ('Iraq', 'NNP'), (',', ','), ('with', 'IN'), ('a', 'DT'), ('clear',
'JJ'), ('plan', 'NN'), ('for', 'IN'), ('victory', 'NN'), ('.', '.')]
[('First', 'RB'), (',', ','), ('we', 'PRP'), ("'re", 'VBP'), ('helping', 'VBG'),
('Iraqis', 'NNP'), ('build', 'VB'), ('an', 'DT'), ('inclusive', 'JJ'), ('governmen
t', 'NN'), (',', ','), ('so', 'IN'), ('that', 'DT'), ('old', 'JJ'), ('resentment
s', 'NNS'), ('will', 'MD'), ('be', 'VB'), ('eased', 'VBN'), ('and', 'CC'), ('the',
'DT'), ('insurgency', 'NN'), ('will', 'MD'), ('be', 'VB'), ('marginalized', 'VB
N'), ('.', '.')]
[('Second', 'JJ'), (',', ','), ('we', 'PRP'), ("'re", 'VBP'), ('continuing', 'VB
G'), ('reconstruction', 'NN'), ('efforts', 'NNS'), (',', ','), ('and', 'CC'), ('he
lping', 'VBG'), ('the', 'DT'), ('Iraqi', 'NNP'), ('government', 'NN'), ('to', 'T
O'), ('fight', 'VB'), ('corruption', 'NN'), ('and', 'CC'), ('build', 'VB'), ('a',
'DT'), ('modern', 'JJ'), ('economy', 'NN'), (',', ','), ('so', 'IN'), ('all', 'D
T'), ('Iraqis', 'NNP'), ('can', 'MD'), ('experience', 'VB'), ('the', 'DT'), ('bene
fits', 'NNS'), ('of', 'IN'), ('freedom', 'NN'), ('.', '.')]
[('And', 'CC'), (',', ','), ('third', 'JJ'), (',', ','), ('we', 'PRP'), ("'re", 'V
BP'), ('striking', 'VBG'), ('terrorist', 'JJ'), ('targets', 'NNS'), ('while', 'I
N'), ('we', 'PRP'), ('train', 'VBP'), ('Iraqi', 'JJ'), ('forces', 'NNS'), ('that',
'WDT'), ('are', 'VBP'), ('increasingly', 'RB'), ('capable', 'JJ'), ('of', 'IN'),
('defeating', 'VBG'), ('the', 'DT'), ('enemy', 'NN'), ('.', '.')]
[('Iraqis', 'NNP'), ('are', 'VBP'), ('showing', 'VBG'), ('their', 'PRP$'), ('coura
ge', 'NN'), ('every', 'DT'), ('day', 'NN'), (',', ','), ('and', 'CC'), ('we', 'PR
P'), ('are', 'VBP'), ('proud', 'JJ'), ('to', 'TO'), ('be', 'VB'), ('their', 'PRP
$'), ('allies', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('cause', 'NN'), ('of', 'I
N'), ('freedom', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Our', 'PRP$'), ('work', 'NN'), ('in', 'IN'), ('Iraq', 'NNP'), ('is', 'VBZ'),
('difficult', 'JJ'), ('because', 'IN'), ('our', 'PRP$'), ('enemy', 'NN'), ('is',
'VBZ'), ('brutal', 'JJ'), ('.', '.')]
[('But', 'CC'), ('that', 'DT'), ('brutality', 'NN'), ('has', 'VBZ'), ('not', 'R
B'), ('stopped', 'VBN'), ('the', 'DT'), ('dramatic', 'JJ'), ('progress', 'NN'),
('of', 'IN'), ('a', 'DT'), ('new', 'JJ'), ('democracy', 'NN'), ('.', '.')]
[('In', 'IN'), ('less', 'JJR'), ('than', 'IN'), ('three', 'CD'), ('years', 'NNS'),
(',', ','), ('the', 'DT'), ('nation', 'NN'), ('has', 'VBZ'), ('gone', 'VBN'), ('fr
om', 'IN'), ('dictatorship', 'NN'), ('to', 'TO'), ('liberation', 'NN'), (',',
','), ('to', 'TO'), ('sovereignty', 'VB'), (',', ','), ('to', 'TO'), ('a', 'DT'),
('constitution', 'NN'), (',', ','), ('to', 'TO'), ('national', 'JJ'), ('election
s', 'NNS'), ('.', '.')]
[('At', 'IN'), ('the', 'DT'), ('same', 'JJ'), ('time', 'NN'), (',', ','), ('our',
'PRP$'), ('coalition', 'NN'), ('has', 'VBZ'), ('been', 'VBN'), ('relentless', 'VB
N'), ('in', 'IN'), ('shutting', 'VBG'), ('off', 'RP'), ('terrorist', 'JJ'), ('infi
ltration', 'NN'), (',', ','), ('clearing', 'VBG'), ('out', 'RP'), ('insurgent', 'J
J'), ('strongholds', 'NNS'), (',', ','), ('and', 'CC'), ('turning', 'VBG'), ('ove
r', 'RP'), ('territory', 'NN'), ('to', 'TO'), ('Iraqi', 'NNP'), ('security', 'N
N'), ('forces', 'NNS'), ('.', '.')]
[('I', 'PRP'), ('am', 'VBP'), ('confident', 'JJ'), ('in', 'IN'), ('our', 'PRP$'),
('plan', 'NN'), ('for', 'IN'), ('victory', 'NN'), (';', ':'), ('I', 'PRP'), ('am',
'VBP'), ('confident', 'JJ'), ('in', 'IN'), ('the', 'DT'), ('will', 'MD'), ('of',
'IN'), ('the', 'DT'), ('Iraqi', 'NNP'), ('people', 'NNS'), (';', ':'), ('I', 'PR
P'), ('am', 'VBP'), ('confident', 'JJ'), ('in', 'IN'), ('the', 'DT'), ('skill', 'N
N'), ('and', 'CC'), ('spirit', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('military',
'JJ'), ('.', '.')]
[('Fellow', 'NNP'), ('citizens', 'NNS'), (',', ','), ('we', 'PRP'), ('are', 'VB
P'), ('in', 'IN'), ('this', 'DT'), ('fight', 'NN'), ('to', 'TO'), ('win', 'VB'),
(',', ','), ('and', 'CC'), ('we', 'PRP'), ('are', 'VBP'), ('winning', 'VBG'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('The', 'DT'), ('road', 'NN'), ('of', 'IN'), ('victory', 'NN'), ('is', 'VBZ'),
('the', 'DT'), ('road', 'NN'), ('that', 'WDT'), ('will', 'MD'), ('take', 'VB'),
('our', 'PRP$'), ('troops', 'NNS'), ('home', 'NN'), ('.', '.')]
[('As', 'IN'), ('we', 'PRP'), ('make', 'VBP'), ('progress', 'NN'), ('on', 'IN'),
('the', 'DT'), ('ground', 'NN'), (',', ','), ('and', 'CC'), ('Iraqi', 'NNP'), ('fo
rces', 'NNS'), ('increasingly', 'RB'), ('take', 'VBP'), ('the', 'DT'), ('lead', 'N
N'), (',', ','), ('we', 'PRP'), ('should', 'MD'), ('be', 'VB'), ('able', 'JJ'),
('to', 'TO'), ('further', 'JJ'), ('decrease', 'VB'), ('our', 'PRP$'), ('troop', 'N
N'), ('levels', 'NNS'), ('--', ':'), ('but', 'CC'), ('those', 'DT'), ('decisions',
'NNS'), ('will', 'MD'), ('be', 'VB'), ('made', 'VBN'), ('by', 'IN'), ('our', 'PRP
$'), ('military', 'JJ'), ('commanders', 'NNS'), (',', ','), ('not', 'RB'), ('by',
'IN'), ('politicians', 'NNS'), ('in', 'IN'), ('Washington', 'NNP'), (',', ','),
('D.C', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Our', 'PRP$'), ('coalition', 'NN'), ('has', 'VBZ'), ('learned', 'VBN'), ('fro
m', 'IN'), ('our', 'PRP$'), ('experience', 'NN'), ('in', 'IN'), ('Iraq', 'NNP'),
('.', '.')]
[('We', 'PRP'), ("'ve", 'VBP'), ('adjusted', 'VBN'), ('our', 'PRP$'), ('military',
'JJ'), ('tactics', 'NNS'), ('and', 'CC'), ('changed', 'VBD'), ('our', 'PRP$'), ('a
pproach', 'NN'), ('to', 'TO'), ('reconstruction', 'NN'), ('.', '.')]
[('Along', 'IN'), ('the', 'DT'), ('way', 'NN'), (',', ','), ('we', 'PRP'), ('hav
e', 'VBP'), ('benefitted', 'VBN'), ('from', 'IN'), ('responsible', 'JJ'), ('critic
ism', 'NN'), ('and', 'CC'), ('counsel', 'NN'), ('offered', 'VBN'), ('by', 'IN'),
('members', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), ('of', 'IN'), ('both', 'D
T'), ('parties', 'NNS'), ('.', '.')]
[('In', 'IN'), ('the', 'DT'), ('coming', 'VBG'), ('year', 'NN'), (',', ','), ('I',
'PRP'), ('will', 'MD'), ('continue', 'VB'), ('to', 'TO'), ('reach', 'VB'), ('out',
'RP'), ('and', 'CC'), ('seek', 'VB'), ('your', 'PRP$'), ('good', 'JJ'), ('advice',
'NN'), ('.', '.')]
[('Yet', 'RB'), (',', ','), ('there', 'EX'), ('is', 'VBZ'), ('a', 'DT'), ('differe
nce', 'NN'), ('between', 'IN'), ('responsible', 'JJ'), ('criticism', 'NN'), ('tha
t', 'WDT'), ('aims', 'VBZ'), ('for', 'IN'), ('success', 'NN'), (',', ','), ('and',
'CC'), ('defeatism', 'NN'), ('that', 'WDT'), ('refuses', 'VBZ'), ('to', 'TO'), ('a
cknowledge', 'VB'), ('anything', 'NN'), ('but', 'CC'), ('failure', 'NN'), ('.',
'.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Hindsight', 'NNP'), ('alone', 'RB'), ('is', 'VBZ'), ('not', 'RB'), ('wisdom',
'JJ'), (',', ','), ('and', 'CC'), ('second-guessing', 'NN'), ('is', 'VBZ'), ('no
t', 'RB'), ('a', 'DT'), ('strategy', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('With', 'IN'), ('so', 'RB'), ('much', 'JJ'), ('in', 'IN'), ('the', 'DT'), ('bala
nce', 'NN'), (',', ','), ('those', 'DT'), ('of', 'IN'), ('us', 'PRP'), ('in', 'I
N'), ('public', 'JJ'), ('office', 'NN'), ('have', 'VBP'), ('a', 'DT'), ('duty', 'N
N'), ('to', 'TO'), ('speak', 'VB'), ('with', 'IN'), ('candor', 'NN'), ('.', '.')]
[('A', 'DT'), ('sudden', 'JJ'), ('withdrawal', 'NN'), ('of', 'IN'), ('our', 'PRP
$'), ('forces', 'NNS'), ('from', 'IN'), ('Iraq', 'NNP'), ('would', 'MD'), ('abando
n', 'VB'), ('our', 'PRP$'), ('Iraqi', 'NNP'), ('allies', 'NNS'), ('to', 'TO'), ('d
eath', 'NN'), ('and', 'CC'), ('prison', 'NN'), (',', ','), ('would', 'MD'), ('pu
t', 'VB'), ('men', 'NNS'), ('like', 'IN'), ('bin', 'NN'), ('Laden', 'NNP'), ('an
d', 'CC'), ('Zarqawi', 'NNP'), ('in', 'IN'), ('charge', 'NN'), ('of', 'IN'), ('a',
'DT'), ('strategic', 'JJ'), ('country', 'NN'), (',', ','), ('and', 'CC'), ('show',
'VBP'), ('that', 'IN'), ('a', 'DT'), ('pledge', 'NN'), ('from', 'IN'), ('America',
'NNP'), ('means', 'VBZ'), ('little', 'JJ'), ('.', '.')]
[('Members', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), (',', ','), ('however', 'R
B'), ('we', 'PRP'), ('feel', 'VBP'), ('about', 'IN'), ('the', 'DT'), ('decisions',
'NNS'), ('and', 'CC'), ('debates', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('past',
'NN'), (',', ','), ('our', 'PRP$'), ('nation', 'NN'), ('has', 'VBZ'), ('only', 'R
B'), ('one', 'CD'), ('option', 'NN'), (':', ':'), ('We', 'PRP'), ('must', 'MD'),
('keep', 'VB'), ('our', 'PRP$'), ('word', 'NN'), (',', ','), ('defeat', 'VB'), ('o
ur', 'PRP$'), ('enemies', 'NNS'), (',', ','), ('and', 'CC'), ('stand', 'VBP'), ('b
ehind', 'IN'), ('the', 'DT'), ('American', 'JJ'), ('military', 'NN'), ('in', 'I
N'), ('this', 'DT'), ('vital', 'JJ'), ('mission', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Laura', 'NNP'), ('Bush', 'NNP'), ('is', 'VBZ'), ('applauded', 'VBN'), ('as', 'I
N'), ('she', 'PRP'), ('is', 'VBZ'), ('introduced', 'VBN'), ('Tuesday', 'NNP'), ('e
vening', 'NN'), (',', ','), ('Jan', 'NNP'), ('.', '.')]
[('31', 'CD'), (',', ','), ('2006', 'CD'), ('during', 'IN'), ('the', 'DT'), ('Stat
e', 'NNP'), ('of', 'IN'), ('the', 'DT'), ('Union', 'NNP'), ('Address', 'NNP'), ('a
t', 'IN'), ('United', 'NNP'), ('States', 'NNPS'), ('Capitol', 'NNP'), ('in', 'I
N'), ('Washington', 'NNP'), ('.', '.')]
[('White', 'NNP'), ('House', 'NNP'), ('photo', 'NN'), ('by', 'IN'), ('Eric', 'NN
P'), ('Draper', 'NNP'), ('Our', 'PRP$'), ('men', 'NNS'), ('and', 'CC'), ('women',
'NNS'), ('in', 'IN'), ('uniform', 'JJ'), ('are', 'VBP'), ('making', 'VBG'), ('sacr
ifices', 'NNS'), ('--', ':'), ('and', 'CC'), ('showing', 'VBG'), ('a', 'DT'), ('se
nse', 'NN'), ('of', 'IN'), ('duty', 'NN'), ('stronger', 'JJR'), ('than', 'IN'),
('all', 'DT'), ('fear', 'NN'), ('.', '.')]
[('They', 'PRP'), ('know', 'VBP'), ('what', 'WP'), ('it', 'PRP'), ("'s", 'VBZ'),
('like', 'IN'), ('to', 'TO'), ('fight', 'VB'), ('house', 'NN'), ('to', 'TO'), ('ho
use', 'NN'), ('in', 'IN'), ('a', 'DT'), ('maze', 'NN'), ('of', 'IN'), ('streets',
'NNS'), (',', ','), ('to', 'TO'), ('wear', 'VB'), ('heavy', 'JJ'), ('gear', 'NN'),
('in', 'IN'), ('the', 'DT'), ('desert', 'NN'), ('heat', 'NN'), (',', ','), ('to',
'TO'), ('see', 'VB'), ('a', 'DT'), ('comrade', 'NN'), ('killed', 'VBN'), ('by', 'I
N'), ('a', 'DT'), ('roadside', 'NN'), ('bomb', 'NN'), ('.', '.')]
[('And', 'CC'), ('those', 'DT'), ('who', 'WP'), ('know', 'VBP'), ('the', 'DT'),
('costs', 'NNS'), ('also', 'RB'), ('know', 'VBP'), ('the', 'DT'), ('stakes', 'NN
S'), ('.', '.')]
[('Marine', 'JJ'), ('Staff', 'NNP'), ('Sergeant', 'NNP'), ('Dan', 'NNP'), ('Clay',
'NNP'), ('was', 'VBD'), ('killed', 'VBN'), ('last', 'JJ'), ('month', 'NN'), ('figh
ting', 'VBG'), ('in', 'IN'), ('Fallujah', 'NNP'), ('.', '.')]
[('He', 'PRP'), ('left', 'VBD'), ('behind', 'RP'), ('a', 'DT'), ('letter', 'NN'),
('to', 'TO'), ('his', 'PRP$'), ('family', 'NN'), (',', ','), ('but', 'CC'), ('hi
s', 'PRP$'), ('words', 'NNS'), ('could', 'MD'), ('just', 'RB'), ('as', 'RB'), ('we
ll', 'RB'), ('be', 'VB'), ('addressed', 'VBN'), ('to', 'TO'), ('every', 'DT'), ('A
merican', 'NNP'), ('.', '.')]
[('Here', 'RB'), ('is', 'VBZ'), ('what', 'WP'), ('Dan', 'NNP'), ('wrote', 'VBD'),
(':', ':'), ('``', '``'), ('I', 'PRP'), ('know', 'VBP'), ('what', 'WP'), ('honor',
'NN'), ('is', 'VBZ'), ('.', '.')]
[('...', ':'), ('It', 'PRP'), ('has', 'VBZ'), ('been', 'VBN'), ('an', 'DT'), ('hon
or', 'NN'), ('to', 'TO'), ('protect', 'VB'), ('and', 'CC'), ('serve', 'VB'), ('al
l', 'DT'), ('of', 'IN'), ('you', 'PRP'), ('.', '.')]
[('I', 'PRP'), ('faced', 'VBD'), ('death', 'NN'), ('with', 'IN'), ('the', 'DT'),
('secure', 'NN'), ('knowledge', 'NN'), ('that', 'IN'), ('you', 'PRP'), ('would',
'MD'), ('not', 'RB'), ('have', 'VB'), ('to', 'TO'), ('....', 'VB')]
[('Never', 'RB'), ('falter', 'NN'), ('!', '.')]
[('Do', 'VBP'), ("n't", 'RB'), ('hesitate', 'VB'), ('to', 'TO'), ('honor', 'VB'),
('and', 'CC'), ('support', 'VB'), ('those', 'DT'), ('of', 'IN'), ('us', 'PRP'),
('who', 'WP'), ('have', 'VBP'), ('the', 'DT'), ('honor', 'NN'), ('of', 'IN'), ('pr
otecting', 'VBG'), ('that', 'DT'), ('which', 'WDT'), ('is', 'VBZ'), ('worth', 'J
J'), ('protecting', 'VBG'), ('.', '.'), ("''", "''")]
[('Staff', 'NNP'), ('Sergeant', 'NNP'), ('Dan', 'NNP'), ('Clay', 'NNP'), ("'s", 'P
OS'), ('wife', 'NN'), (',', ','), ('Lisa', 'NNP'), (',', ','), ('and', 'CC'), ('hi
s', 'PRP$'), ('mom', 'NN'), ('and', 'CC'), ('dad', 'NN'), (',', ','), ('Sara', 'NN
P'), ('Jo', 'NNP'), ('and', 'CC'), ('Bud', 'NNP'), (',', ','), ('are', 'VBP'), ('w
ith', 'IN'), ('us', 'PRP'), ('this', 'DT'), ('evening', 'NN'), ('.', '.')]
[('Welcome', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Our', 'PRP$'), ('nation', 'NN'), ('is', 'VBZ'), ('grateful', 'JJ'), ('to', 'T
O'), ('the', 'DT'), ('fallen', 'VBN'), (',', ','), ('who', 'WP'), ('live', 'VBP'),
('in', 'IN'), ('the', 'DT'), ('memory', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('co
untry', 'NN'), ('.', '.')]
[('We', 'PRP'), ("'re", 'VBP'), ('grateful', 'JJ'), ('to', 'TO'), ('all', 'DT'),
('who', 'WP'), ('volunteer', 'VBP'), ('to', 'TO'), ('wear', 'VB'), ('our', 'PRP
$'), ('nation', 'NN'), ("'s", 'POS'), ('uniform', 'NN'), ('--', ':'), ('and', 'C
C'), ('as', 'IN'), ('we', 'PRP'), ('honor', 'VBP'), ('our', 'PRP$'), ('brave', 'N
N'), ('troops', 'NNS'), (',', ','), ('let', 'VB'), ('us', 'PRP'), ('never', 'RB'),
('forget', 'VBP'), ('the', 'DT'), ('sacrifices', 'NNS'), ('of', 'IN'), ('America',
'NNP'), ("'s", 'POS'), ('military', 'JJ'), ('families', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Our', 'PRP$'), ('offensive', 'JJ'), ('against', 'IN'), ('terror', 'NN'), ('invo
lves', 'VBZ'), ('more', 'JJR'), ('than', 'IN'), ('military', 'JJ'), ('action', 'N
N'), ('.', '.')]
[('Ultimately', 'RB'), (',', ','), ('the', 'DT'), ('only', 'JJ'), ('way', 'NN'),
('to', 'TO'), ('defeat', 'VB'), ('the', 'DT'), ('terrorists', 'NNS'), ('is', 'VB
Z'), ('to', 'TO'), ('defeat', 'VB'), ('their', 'PRP$'), ('dark', 'JJ'), ('vision',
'NN'), ('of', 'IN'), ('hatred', 'VBN'), ('and', 'CC'), ('fear', 'VBN'), ('by', 'I
N'), ('offering', 'VBG'), ('the', 'DT'), ('hopeful', 'JJ'), ('alternative', 'NN'),
('of', 'IN'), ('political', 'JJ'), ('freedom', 'NN'), ('and', 'CC'), ('peaceful',
'JJ'), ('change', 'NN'), ('.', '.')]
[('So', 'IN'), ('the', 'DT'), ('United', 'NNP'), ('States', 'NNPS'), ('of', 'IN'),
('America', 'NNP'), ('supports', 'NNS'), ('democratic', 'JJ'), ('reform', 'NN'),
('across', 'IN'), ('the', 'DT'), ('broader', 'JJR'), ('Middle', 'NNP'), ('East',
'NNP'), ('.', '.')]
[('Elections', 'NNS'), ('are', 'VBP'), ('vital', 'JJ'), (',', ','), ('but', 'CC'),
('they', 'PRP'), ('are', 'VBP'), ('only', 'RB'), ('the', 'DT'), ('beginning', 'N
N'), ('.', '.')]
[('Raising', 'VBG'), ('up', 'RP'), ('a', 'DT'), ('democracy', 'NN'), ('requires',
'VBZ'), ('the', 'DT'), ('rule', 'NN'), ('of', 'IN'), ('law', 'NN'), (',', ','),
('and', 'CC'), ('protection', 'NN'), ('of', 'IN'), ('minorities', 'NNS'), (',',
','), ('and', 'CC'), ('strong', 'JJ'), (',', ','), ('accountable', 'JJ'), ('instit
utions', 'NNS'), ('that', 'IN'), ('last', 'JJ'), ('longer', 'JJR'), ('than', 'I
N'), ('a', 'DT'), ('single', 'JJ'), ('vote', 'NN'), ('.', '.')]
[('The', 'DT'), ('great', 'JJ'), ('people', 'NNS'), ('of', 'IN'), ('Egypt', 'NN
P'), ('have', 'VBP'), ('voted', 'VBN'), ('in', 'IN'), ('a', 'DT'), ('multi-party',
'JJ'), ('presidential', 'JJ'), ('election', 'NN'), ('--', ':'), ('and', 'CC'), ('n
ow', 'RB'), ('their', 'PRP$'), ('government', 'NN'), ('should', 'MD'), ('open', 'V
B'), ('paths', 'NNS'), ('of', 'IN'), ('peaceful', 'JJ'), ('opposition', 'NN'), ('t
hat', 'WDT'), ('will', 'MD'), ('reduce', 'VB'), ('the', 'DT'), ('appeal', 'NN'),
('of', 'IN'), ('radicalism', 'NN'), ('.', '.')]
[('The', 'DT'), ('Palestinian', 'JJ'), ('people', 'NNS'), ('have', 'VBP'), ('vote
d', 'VBN'), ('in', 'IN'), ('elections', 'NNS'), ('.', '.')]
[('And', 'CC'), ('now', 'RB'), ('the', 'DT'), ('leaders', 'NNS'), ('of', 'IN'),
('Hamas', 'NNP'), ('must', 'MD'), ('recognize', 'VB'), ('Israel', 'NNP'), (',',
','), ('disarm', 'NN'), (',', ','), ('reject', 'JJ'), ('terrorism', 'NN'), (',',
','), ('and', 'CC'), ('work', 'NN'), ('for', 'IN'), ('lasting', 'VBG'), ('peace',
'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Saudi', 'NNP'), ('Arabia', 'NNP'), ('has', 'VBZ'), ('taken', 'VBN'), ('the', 'D
T'), ('first', 'JJ'), ('steps', 'NNS'), ('of', 'IN'), ('reform', 'NN'), ('--',
':'), ('now', 'RB'), ('it', 'PRP'), ('can', 'MD'), ('offer', 'VB'), ('its', 'PRP
$'), ('people', 'NNS'), ('a', 'DT'), ('better', 'JJR'), ('future', 'NN'), ('by',
'IN'), ('pressing', 'VBG'), ('forward', 'RB'), ('with', 'IN'), ('those', 'DT'),
('efforts', 'NNS'), ('.', '.')]
[('Democracies', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('Middle', 'NNP'), ('East',
'NNP'), ('will', 'MD'), ('not', 'RB'), ('look', 'VB'), ('like', 'IN'), ('our', 'PR
P$'), ('own', 'JJ'), (',', ','), ('because', 'IN'), ('they', 'PRP'), ('will', 'M
D'), ('reflect', 'VB'), ('the', 'DT'), ('traditions', 'NNS'), ('of', 'IN'), ('thei
r', 'PRP$'), ('own', 'JJ'), ('citizens', 'NNS'), ('.', '.')]
[('Yet', 'RB'), ('liberty', 'NN'), ('is', 'VBZ'), ('the', 'DT'), ('future', 'NN'),
('of', 'IN'), ('every', 'DT'), ('nation', 'NN'), ('in', 'IN'), ('the', 'DT'), ('Mi
ddle', 'NNP'), ('East', 'NNP'), (',', ','), ('because', 'IN'), ('liberty', 'NN'),
('is', 'VBZ'), ('the', 'DT'), ('right', 'NN'), ('and', 'CC'), ('hope', 'NN'), ('o
f', 'IN'), ('all', 'DT'), ('humanity', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('President', 'NNP'), ('George', 'NNP'), ('W.', 'NNP'), ('Bush', 'NNP'), ('wave
s', 'VBZ'), ('toward', 'IN'), ('the', 'DT'), ('upper', 'JJ'), ('visitors', 'NNS'),
('gallery', 'NN'), ('of', 'IN'), ('the', 'DT'), ('House', 'NNP'), ('Chamber', 'NN
P'), ('following', 'VBG'), ('his', 'PRP$'), ('State', 'NN'), ('of', 'IN'), ('the',
'DT'), ('Union', 'NNP'), ('remarks', 'NNS'), ('Tuesday', 'NNP'), (',', ','), ('Ja
n', 'NNP'), ('.', '.')]
[('31', 'CD'), (',', ','), ('2006', 'CD'), ('at', 'IN'), ('the', 'DT'), ('United',
'NNP'), ('States', 'NNPS'), ('Capitol', 'NNP'), ('.', '.')]
[('White', 'NNP'), ('House', 'NNP'), ('photo', 'NN'), ('by', 'IN'), ('Eric', 'NN
P'), ('Draper', 'NNP'), ('The', 'DT'), ('same', 'JJ'), ('is', 'VBZ'), ('true', 'J
J'), ('of', 'IN'), ('Iran', 'NNP'), (',', ','), ('a', 'DT'), ('nation', 'NN'), ('n
ow', 'RB'), ('held', 'VBN'), ('hostage', 'NN'), ('by', 'IN'), ('a', 'DT'), ('smal
l', 'JJ'), ('clerical', 'JJ'), ('elite', 'NN'), ('that', 'WDT'), ('is', 'VBZ'),
('isolating', 'VBG'), ('and', 'CC'), ('repressing', 'VBG'), ('its', 'PRP$'), ('peo
ple', 'NNS'), ('.', '.')]
[('The', 'DT'), ('regime', 'NN'), ('in', 'IN'), ('that', 'DT'), ('country', 'NN'),
('sponsors', 'NNS'), ('terrorists', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('Palesti
nian', 'JJ'), ('territories', 'NNS'), ('and', 'CC'), ('in', 'IN'), ('Lebanon', 'NN
P'), ('--', ':'), ('and', 'CC'), ('that', 'DT'), ('must', 'MD'), ('come', 'VB'),
('to', 'TO'), ('an', 'DT'), ('end', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('The', 'DT'), ('Iranian', 'JJ'), ('government', 'NN'), ('is', 'VBZ'), ('defyin
g', 'VBG'), ('the', 'DT'), ('world', 'NN'), ('with', 'IN'), ('its', 'PRP$'), ('nuc
lear', 'JJ'), ('ambitions', 'NNS'), (',', ','), ('and', 'CC'), ('the', 'DT'), ('na
tions', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('world', 'NN'), ('must', 'MD'), ('no
t', 'RB'), ('permit', 'VB'), ('the', 'DT'), ('Iranian', 'JJ'), ('regime', 'NN'),
('to', 'TO'), ('gain', 'VB'), ('nuclear', 'JJ'), ('weapons', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('America', 'NNP'), ('will', 'MD'), ('continue', 'VB'), ('to', 'TO'), ('rally',
'VB'), ('the', 'DT'), ('world', 'NN'), ('to', 'TO'), ('confront', 'VB'), ('these',
'DT'), ('threats', 'NNS'), ('.', '.')]
[('Tonight', 'NNP'), (',', ','), ('let', 'VB'), ('me', 'PRP'), ('speak', 'VB'),
('directly', 'RB'), ('to', 'TO'), ('the', 'DT'), ('citizens', 'NNS'), ('of', 'I
N'), ('Iran', 'NNP'), (':', ':'), ('America', 'NNP'), ('respects', 'VBZ'), ('you',
'PRP'), (',', ','), ('and', 'CC'), ('we', 'PRP'), ('respect', 'VBP'), ('your', 'PR
P$'), ('country', 'NN'), ('.', '.')]
[('We', 'PRP'), ('respect', 'VBP'), ('your', 'PRP$'), ('right', 'NN'), ('to', 'T
O'), ('choose', 'VB'), ('your', 'PRP$'), ('own', 'JJ'), ('future', 'NN'), ('and',
'CC'), ('win', 'VB'), ('your', 'PRP$'), ('own', 'JJ'), ('freedom', 'NN'), ('.',
'.')]
[('And', 'CC'), ('our', 'PRP$'), ('nation', 'NN'), ('hopes', 'VBZ'), ('one', 'C
D'), ('day', 'NN'), ('to', 'TO'), ('be', 'VB'), ('the', 'DT'), ('closest', 'JJS'),
('of', 'IN'), ('friends', 'NNS'), ('with', 'IN'), ('a', 'DT'), ('free', 'JJ'), ('a
nd', 'CC'), ('democratic', 'JJ'), ('Iran', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('To', 'TO'), ('overcome', 'VB'), ('dangers', 'NNS'), ('in', 'IN'), ('our', 'PRP
$'), ('world', 'NN'), (',', ','), ('we', 'PRP'), ('must', 'MD'), ('also', 'RB'),
('take', 'VB'), ('the', 'DT'), ('offensive', 'JJ'), ('by', 'IN'), ('encouraging',
'VBG'), ('economic', 'JJ'), ('progress', 'NN'), (',', ','), ('and', 'CC'), ('fight
ing', 'VBG'), ('disease', 'NN'), (',', ','), ('and', 'CC'), ('spreading', 'VBG'),
('hope', 'NN'), ('in', 'IN'), ('hopeless', 'JJ'), ('lands', 'NNS'), ('.', '.')]
[('Isolationism', 'NNP'), ('would', 'MD'), ('not', 'RB'), ('only', 'RB'), ('tie',
'VB'), ('our', 'PRP$'), ('hands', 'NNS'), ('in', 'IN'), ('fighting', 'VBG'), ('ene
mies', 'NNS'), (',', ','), ('it', 'PRP'), ('would', 'MD'), ('keep', 'VB'), ('us',
'PRP'), ('from', 'IN'), ('helping', 'VBG'), ('our', 'PRP$'), ('friends', 'NNS'),
('in', 'IN'), ('desperate', 'JJ'), ('need', 'NN'), ('.', '.')]
[('We', 'PRP'), ('show', 'VBP'), ('compassion', 'JJ'), ('abroad', 'RB'), ('becaus
e', 'IN'), ('Americans', 'NNPS'), ('believe', 'VBP'), ('in', 'IN'), ('the', 'DT'),
('God-given', 'NNP'), ('dignity', 'NN'), ('and', 'CC'), ('worth', 'NN'), ('of', 'I
N'), ('a', 'DT'), ('villager', 'NN'), ('with', 'IN'), ('HIV/AIDS', 'NNP'), (',',
','), ('or', 'CC'), ('an', 'DT'), ('infant', 'NN'), ('with', 'IN'), ('malaria', 'N
NS'), (',', ','), ('or', 'CC'), ('a', 'DT'), ('refugee', 'JJ'), ('fleeing', 'NN'),
('genocide', 'NN'), (',', ','), ('or', 'CC'), ('a', 'DT'), ('young', 'JJ'), ('gir
l', 'NN'), ('sold', 'VBN'), ('into', 'IN'), ('slavery', 'NN'), ('.', '.')]
[('We', 'PRP'), ('also', 'RB'), ('show', 'VBP'), ('compassion', 'NN'), ('abroad',
'RB'), ('because', 'IN'), ('regions', 'NNS'), ('overwhelmed', 'VBN'), ('by', 'I
N'), ('poverty', 'NN'), (',', ','), ('corruption', 'NN'), (',', ','), ('and', 'C
C'), ('despair', 'NN'), ('are', 'VBP'), ('sources', 'NNS'), ('of', 'IN'), ('terror
ism', 'NN'), (',', ','), ('and', 'CC'), ('organized', 'VBD'), ('crime', 'NN'),
(',', ','), ('and', 'CC'), ('human', 'JJ'), ('trafficking', 'NN'), (',', ','), ('a
nd', 'CC'), ('the', 'DT'), ('drug', 'NN'), ('trade', 'NN'), ('.', '.')]
[('In', 'IN'), ('recent', 'JJ'), ('years', 'NNS'), (',', ','), ('you', 'PRP'), ('a
nd', 'CC'), ('I', 'PRP'), ('have', 'VBP'), ('taken', 'VBN'), ('unprecedented', 'J
J'), ('action', 'NN'), ('to', 'TO'), ('fight', 'VB'), ('AIDS', 'NNP'), ('and', 'C
C'), ('malaria', 'NNS'), (',', ','), ('expand', 'VBP'), ('the', 'DT'), ('educatio
n', 'NN'), ('of', 'IN'), ('girls', 'NNS'), (',', ','), ('and', 'CC'), ('reward',
'RB'), ('developing', 'VBG'), ('nations', 'NNS'), ('that', 'WDT'), ('are', 'VBP'),
('moving', 'VBG'), ('forward', 'RB'), ('with', 'IN'), ('economic', 'JJ'), ('and',
'CC'), ('political', 'JJ'), ('reform', 'NN'), ('.', '.')]
[('For', 'IN'), ('people', 'NNS'), ('everywhere', 'RB'), (',', ','), ('the', 'D
T'), ('United', 'NNP'), ('States', 'NNPS'), ('is', 'VBZ'), ('a', 'DT'), ('partne
r', 'NN'), ('for', 'IN'), ('a', 'DT'), ('better', 'JJR'), ('life', 'NN'), ('.',
'.')]
[('Short-changing', 'VBG'), ('these', 'DT'), ('efforts', 'NNS'), ('would', 'MD'),
('increase', 'VB'), ('the', 'DT'), ('suffering', 'NN'), ('and', 'CC'), ('chaos',
'NN'), ('of', 'IN'), ('our', 'PRP$'), ('world', 'NN'), (',', ','), ('undercut', 'J
J'), ('our', 'PRP$'), ('long-term', 'JJ'), ('security', 'NN'), (',', ','), ('and',
'CC'), ('dull', 'VB'), ('the', 'DT'), ('conscience', 'NN'), ('of', 'IN'), ('our',
'PRP$'), ('country', 'NN'), ('.', '.')]
[('I', 'PRP'), ('urge', 'VBP'), ('members', 'NNS'), ('of', 'IN'), ('Congress', 'NN
P'), ('to', 'TO'), ('serve', 'VB'), ('the', 'DT'), ('interests', 'NNS'), ('of', 'I
N'), ('America', 'NNP'), ('by', 'IN'), ('showing', 'VBG'), ('the', 'DT'), ('compas
sion', 'NN'), ('of', 'IN'), ('America', 'NNP'), ('.', '.')]
[('Our', 'PRP$'), ('country', 'NN'), ('must', 'MD'), ('also', 'RB'), ('remain', 'V
B'), ('on', 'IN'), ('the', 'DT'), ('offensive', 'JJ'), ('against', 'IN'), ('terror
ism', 'NN'), ('here', 'RB'), ('at', 'IN'), ('home', 'NN'), ('.', '.')]
[('The', 'DT'), ('enemy', 'NN'), ('has', 'VBZ'), ('not', 'RB'), ('lost', 'VBN'),
('the', 'DT'), ('desire', 'NN'), ('or', 'CC'), ('capability', 'NN'), ('to', 'TO'),
('attack', 'VB'), ('us', 'PRP'), ('.', '.')]
[('Fortunately', 'RB'), (',', ','), ('this', 'DT'), ('nation', 'NN'), ('has', 'VB
Z'), ('superb', 'VBN'), ('professionals', 'NNS'), ('in', 'IN'), ('law', 'NN'), ('e
nforcement', 'NN'), (',', ','), ('intelligence', 'NN'), (',', ','), ('the', 'DT'),
('military', 'JJ'), (',', ','), ('and', 'CC'), ('homeland', 'VBP'), ('security',
'NN'), ('.', '.')]
[('These', 'DT'), ('men', 'NNS'), ('and', 'CC'), ('women', 'NNS'), ('are', 'VBP'),
('dedicating', 'VBG'), ('their', 'PRP$'), ('lives', 'NNS'), (',', ','), ('protecti
ng', 'VBG'), ('us', 'PRP'), ('all', 'DT'), (',', ','), ('and', 'CC'), ('they', 'PR
P'), ('deserve', 'VBP'), ('our', 'PRP$'), ('support', 'NN'), ('and', 'CC'), ('ou
r', 'PRP$'), ('thanks', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('They', 'PRP'), ('also', 'RB'), ('deserve', 'VBP'), ('the', 'DT'), ('same', 'J
J'), ('tools', 'NNS'), ('they', 'PRP'), ('already', 'RB'), ('use', 'VBP'), ('to',
'TO'), ('fight', 'VB'), ('drug', 'NN'), ('trafficking', 'NN'), ('and', 'CC'), ('or
ganized', 'VBN'), ('crime', 'NN'), ('--', ':'), ('so', 'RB'), ('I', 'PRP'), ('as
k', 'VBP'), ('you', 'PRP'), ('to', 'TO'), ('reauthorize', 'VB'), ('the', 'DT'),
('Patriot', 'NNP'), ('Act', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('It', 'PRP'), ('is', 'VBZ'), ('said', 'VBD'), ('that', 'IN'), ('prior', 'JJ'),
('to', 'TO'), ('the', 'DT'), ('attacks', 'NNS'), ('of', 'IN'), ('September', 'NN
P'), ('the', 'DT'), ('11th', 'CD'), (',', ','), ('our', 'PRP$'), ('government', 'N
N'), ('failed', 'VBD'), ('to', 'TO'), ('connect', 'VB'), ('the', 'DT'), ('dots',
'NNS'), ('of', 'IN'), ('the', 'DT'), ('conspiracy', 'NN'), ('.', '.')]
[('We', 'PRP'), ('now', 'RB'), ('know', 'VBP'), ('that', 'IN'), ('two', 'CD'), ('o
f', 'IN'), ('the', 'DT'), ('hijackers', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('Uni
ted', 'NNP'), ('States', 'NNPS'), ('placed', 'VBD'), ('telephone', 'NN'), ('call
s', 'NNS'), ('to', 'TO'), ('al', 'VB'), ('Qaeda', 'NNP'), ('operatives', 'VBZ'),
('overseas', 'RB'), ('.', '.')]
[('But', 'CC'), ('we', 'PRP'), ('did', 'VBD'), ('not', 'RB'), ('know', 'VB'), ('ab
out', 'IN'), ('their', 'PRP$'), ('plans', 'NNS'), ('until', 'IN'), ('it', 'PRP'),
('was', 'VBD'), ('too', 'RB'), ('late', 'JJ'), ('.', '.')]
[('So', 'RB'), ('to', 'TO'), ('prevent', 'VB'), ('another', 'DT'), ('attack', 'N
N'), ('--', ':'), ('based', 'VBN'), ('on', 'IN'), ('authority', 'NN'), ('given',
'VBN'), ('to', 'TO'), ('me', 'PRP'), ('by', 'IN'), ('the', 'DT'), ('Constitution',
'NNP'), ('and', 'CC'), ('by', 'IN'), ('statute', 'NN'), ('--', ':'), ('I', 'PRP'),
('have', 'VBP'), ('authorized', 'VBN'), ('a', 'DT'), ('terrorist', 'JJ'), ('survei
llance', 'NN'), ('program', 'NN'), ('to', 'TO'), ('aggressively', 'RB'), ('pursu
e', 'VB'), ('the', 'DT'), ('international', 'JJ'), ('communications', 'NNS'), ('o
f', 'IN'), ('suspected', 'JJ'), ('al', 'JJ'), ('Qaeda', 'NNP'), ('operatives', 'NN
S'), ('and', 'CC'), ('affiliates', 'NNS'), ('to', 'TO'), ('and', 'CC'), ('from',
'IN'), ('America', 'NNP'), ('.', '.')]
[('Previous', 'JJ'), ('Presidents', 'NNS'), ('have', 'VBP'), ('used', 'VBN'), ('th
e', 'DT'), ('same', 'JJ'), ('constitutional', 'JJ'), ('authority', 'NN'), ('I', 'P
RP'), ('have', 'VBP'), (',', ','), ('and', 'CC'), ('federal', 'JJ'), ('courts', 'N
NS'), ('have', 'VBP'), ('approved', 'VBN'), ('the', 'DT'), ('use', 'NN'), ('of',
'IN'), ('that', 'DT'), ('authority', 'NN'), ('.', '.')]
[('Appropriate', 'JJ'), ('members', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), ('h
ave', 'VBP'), ('been', 'VBN'), ('kept', 'VBN'), ('informed', 'VBN'), ('.', '.')]
[('The', 'DT'), ('terrorist', 'JJ'), ('surveillance', 'NN'), ('program', 'NN'),
('has', 'VBZ'), ('helped', 'VBN'), ('prevent', 'VB'), ('terrorist', 'JJ'), ('attac
ks', 'NNS'), ('.', '.')]
[('It', 'PRP'), ('remains', 'VBZ'), ('essential', 'JJ'), ('to', 'TO'), ('the', 'D
T'), ('security', 'NN'), ('of', 'IN'), ('America', 'NNP'), ('.', '.')]
[('If', 'IN'), ('there', 'EX'), ('are', 'VBP'), ('people', 'NNS'), ('inside', 'I
N'), ('our', 'PRP$'), ('country', 'NN'), ('who', 'WP'), ('are', 'VBP'), ('talkin
g', 'VBG'), ('with', 'IN'), ('al', 'NN'), ('Qaeda', 'NNP'), (',', ','), ('we', 'PR
P'), ('want', 'VBP'), ('to', 'TO'), ('know', 'VB'), ('about', 'IN'), ('it', 'PR
P'), (',', ','), ('because', 'IN'), ('we', 'PRP'), ('will', 'MD'), ('not', 'RB'),
('sit', 'VB'), ('back', 'RB'), ('and', 'CC'), ('wait', 'NN'), ('to', 'TO'), ('be',
'VB'), ('hit', 'VBN'), ('again', 'RB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('In', 'IN'), ('all', 'PDT'), ('these', 'DT'), ('areas', 'NNS'), ('--', ':'), ('f
rom', 'IN'), ('the', 'DT'), ('disruption', 'NN'), ('of', 'IN'), ('terror', 'NN'),
('networks', 'NNS'), (',', ','), ('to', 'TO'), ('victory', 'NN'), ('in', 'IN'),
('Iraq', 'NNP'), (',', ','), ('to', 'TO'), ('the', 'DT'), ('spread', 'NN'), ('of',
'IN'), ('freedom', 'NN'), ('and', 'CC'), ('hope', 'NN'), ('in', 'IN'), ('trouble
d', 'JJ'), ('regions', 'NNS'), ('--', ':'), ('we', 'PRP'), ('need', 'VBP'), ('th
e', 'DT'), ('support', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('friends', 'NNS'),
('and', 'CC'), ('allies', 'NNS'), ('.', '.')]
[('To', 'TO'), ('draw', 'VB'), ('that', 'DT'), ('support', 'NN'), (',', ','), ('w
e', 'PRP'), ('must', 'MD'), ('always', 'RB'), ('be', 'VB'), ('clear', 'JJ'), ('i
n', 'IN'), ('our', 'PRP$'), ('principles', 'NNS'), ('and', 'CC'), ('willing', 'J
J'), ('to', 'TO'), ('act', 'VB'), ('.', '.')]
[('The', 'DT'), ('only', 'JJ'), ('alternative', 'NN'), ('to', 'TO'), ('American',
'JJ'), ('leadership', 'NN'), ('is', 'VBZ'), ('a', 'DT'), ('dramatically', 'RB'),
('more', 'RBR'), ('dangerous', 'JJ'), ('and', 'CC'), ('anxious', 'JJ'), ('world',
'NN'), ('.', '.')]
[('Yet', 'CC'), ('we', 'PRP'), ('also', 'RB'), ('choose', 'VBP'), ('to', 'TO'),
('lead', 'VB'), ('because', 'IN'), ('it', 'PRP'), ('is', 'VBZ'), ('a', 'DT'), ('pr
ivilege', 'NN'), ('to', 'TO'), ('serve', 'VB'), ('the', 'DT'), ('values', 'NNS'),
('that', 'WDT'), ('gave', 'VBD'), ('us', 'PRP'), ('birth', 'NN'), ('.', '.')]
[('American', 'JJ'), ('leaders', 'NNS'), ('--', ':'), ('from', 'IN'), ('Roosevel
t', 'NNP'), ('to', 'TO'), ('Truman', 'NNP'), ('to', 'TO'), ('Kennedy', 'NNP'), ('t
o', 'TO'), ('Reagan', 'NNP'), ('--', ':'), ('rejected', 'VBD'), ('isolation', 'N
N'), ('and', 'CC'), ('retreat', 'NN'), (',', ','), ('because', 'IN'), ('they', 'PR
P'), ('knew', 'VBD'), ('that', 'IN'), ('America', 'NNP'), ('is', 'VBZ'), ('alway
s', 'RB'), ('more', 'RBR'), ('secure', 'JJ'), ('when', 'WRB'), ('freedom', 'NN'),
('is', 'VBZ'), ('on', 'IN'), ('the', 'DT'), ('march', 'NN'), ('.', '.')]
[('Our', 'PRP$'), ('own', 'JJ'), ('generation', 'NN'), ('is', 'VBZ'), ('in', 'I
N'), ('a', 'DT'), ('long', 'JJ'), ('war', 'NN'), ('against', 'IN'), ('a', 'DT'),
('determined', 'JJ'), ('enemy', 'NN'), ('--', ':'), ('a', 'DT'), ('war', 'NN'),
('that', 'WDT'), ('will', 'MD'), ('be', 'VB'), ('fought', 'VBN'), ('by', 'IN'),
('Presidents', 'NNS'), ('of', 'IN'), ('both', 'DT'), ('parties', 'NNS'), (',',
','), ('who', 'WP'), ('will', 'MD'), ('need', 'VB'), ('steady', 'JJ'), ('bipartisa
n', 'JJ'), ('support', 'NN'), ('from', 'IN'), ('the', 'DT'), ('Congress', 'NNP'),
('.', '.')]
[('And', 'CC'), ('tonight', 'NN'), ('I', 'PRP'), ('ask', 'VBP'), ('for', 'IN'),
('yours', 'NNS'), ('.', '.')]
[('Together', 'RB'), (',', ','), ('let', 'VB'), ('us', 'PRP'), ('protect', 'VB'),
('our', 'PRP$'), ('country', 'NN'), (',', ','), ('support', 'VB'), ('the', 'DT'),
('men', 'NNS'), ('and', 'CC'), ('women', 'NNS'), ('who', 'WP'), ('defend', 'VBP'),
('us', 'PRP'), (',', ','), ('and', 'CC'), ('lead', 'VB'), ('this', 'DT'), ('worl
d', 'NN'), ('toward', 'IN'), ('freedom', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Here', 'RB'), ('at', 'IN'), ('home', 'NN'), (',', ','), ('America', 'NNP'), ('a
lso', 'RB'), ('has', 'VBZ'), ('a', 'DT'), ('great', 'JJ'), ('opportunity', 'NN'),
(':', ':'), ('We', 'PRP'), ('will', 'MD'), ('build', 'VB'), ('the', 'DT'), ('prosp
erity', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('country', 'NN'), ('by', 'IN'), ('s
trengthening', 'VBG'), ('our', 'PRP$'), ('economic', 'JJ'), ('leadership', 'NN'),
('in', 'IN'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]
[('Our', 'PRP$'), ('economy', 'NN'), ('is', 'VBZ'), ('healthy', 'JJ'), ('and', 'C
C'), ('vigorous', 'JJ'), (',', ','), ('and', 'CC'), ('growing', 'VBG'), ('faster',
'RBR'), ('than', 'IN'), ('other', 'JJ'), ('major', 'JJ'), ('industrialized', 'VB
N'), ('nations', 'NNS'), ('.', '.')]
[('In', 'IN'), ('the', 'DT'), ('last', 'JJ'), ('two-and-a-half', 'JJ'), ('years',
'NNS'), (',', ','), ('America', 'NNP'), ('has', 'VBZ'), ('created', 'VBN'), ('4.
6', 'CD'), ('million', 'CD'), ('new', 'JJ'), ('jobs', 'NNS'), ('--', ':'), ('mor
e', 'JJR'), ('than', 'IN'), ('Japan', 'NNP'), ('and', 'CC'), ('the', 'DT'), ('Euro
pean', 'NNP'), ('Union', 'NNP'), ('combined', 'VBD'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Even', 'RB'), ('in', 'IN'), ('the', 'DT'), ('face', 'NN'), ('of', 'IN'), ('high
er', 'JJR'), ('energy', 'NN'), ('prices', 'NNS'), ('and', 'CC'), ('natural', 'J
J'), ('disasters', 'NNS'), (',', ','), ('the', 'DT'), ('American', 'JJ'), ('peopl
e', 'NNS'), ('have', 'VBP'), ('turned', 'VBN'), ('in', 'IN'), ('an', 'DT'), ('econ
omic', 'JJ'), ('performance', 'NN'), ('that', 'WDT'), ('is', 'VBZ'), ('the', 'D
T'), ('envy', 'NN'), ('of', 'IN'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]
[('The', 'DT'), ('American', 'JJ'), ('economy', 'NN'), ('is', 'VBZ'), ('preeminen
t', 'JJ'), (',', ','), ('but', 'CC'), ('we', 'PRP'), ('can', 'MD'), ('not', 'RB'),
('afford', 'VB'), ('to', 'TO'), ('be', 'VB'), ('complacent', 'JJ'), ('.', '.')]
[('In', 'IN'), ('a', 'DT'), ('dynamic', 'JJ'), ('world', 'NN'), ('economy', 'NN'),
(',', ','), ('we', 'PRP'), ('are', 'VBP'), ('seeing', 'VBG'), ('new', 'JJ'), ('com
petitors', 'NNS'), (',', ','), ('like', 'IN'), ('China', 'NNP'), ('and', 'CC'),
('India', 'NNP'), (',', ','), ('and', 'CC'), ('this', 'DT'), ('creates', 'VBZ'),
('uncertainty', 'NN'), (',', ','), ('which', 'WDT'), ('makes', 'VBZ'), ('it', 'PR
P'), ('easier', 'JJR'), ('to', 'TO'), ('feed', 'VB'), ('people', 'NNS'), ("'s", 'P
OS'), ('fears', 'NNS'), ('.', '.')]
[('So', 'IN'), ('we', 'PRP'), ("'re", 'VBP'), ('seeing', 'VBG'), ('some', 'DT'),
('old', 'JJ'), ('temptations', 'NNS'), ('return', 'NN'), ('.', '.')]
[('Protectionists', 'NNS'), ('want', 'VBP'), ('to', 'TO'), ('escape', 'VB'), ('com
petition', 'NN'), (',', ','), ('pretending', 'VBG'), ('that', 'IN'), ('we', 'PR
P'), ('can', 'MD'), ('keep', 'VB'), ('our', 'PRP$'), ('high', 'JJ'), ('standard',
'NN'), ('of', 'IN'), ('living', 'NN'), ('while', 'IN'), ('walling', 'VBG'), ('of
f', 'RP'), ('our', 'PRP$'), ('economy', 'NN'), ('.', '.')]
[('Others', 'NNS'), ('say', 'VBP'), ('that', 'IN'), ('the', 'DT'), ('government',
'NN'), ('needs', 'VBZ'), ('to', 'TO'), ('take', 'VB'), ('a', 'DT'), ('larger', 'JJ
R'), ('role', 'NN'), ('in', 'IN'), ('directing', 'VBG'), ('the', 'DT'), ('econom
y', 'NN'), (',', ','), ('centralizing', 'VBG'), ('more', 'JJR'), ('power', 'NN'),
('in', 'IN'), ('Washington', 'NNP'), ('and', 'CC'), ('increasing', 'VBG'), ('taxe
s', 'NNS'), ('.', '.')]
[('We', 'PRP'), ('hear', 'VBP'), ('claims', 'NNS'), ('that', 'IN'), ('immigrants',
'NNS'), ('are', 'VBP'), ('somehow', 'RB'), ('bad', 'JJ'), ('for', 'IN'), ('the',
'DT'), ('economy', 'NN'), ('--', ':'), ('even', 'RB'), ('though', 'IN'), ('this',
'DT'), ('economy', 'NN'), ('could', 'MD'), ('not', 'RB'), ('function', 'VB'), ('wi
thout', 'IN'), ('them', 'PRP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('All', 'PDT'), ('these', 'DT'), ('are', 'VBP'), ('forms', 'NNS'), ('of', 'IN'),
('economic', 'JJ'), ('retreat', 'NN'), (',', ','), ('and', 'CC'), ('they', 'PRP'),
('lead', 'VBP'), ('in', 'IN'), ('the', 'DT'), ('same', 'JJ'), ('direction', 'NN'),
('--', ':'), ('toward', 'IN'), ('a', 'DT'), ('stagnant', 'JJ'), ('and', 'CC'), ('s
econd-rate', 'JJ'), ('economy', 'NN'), ('.', '.')]
[('Tonight', 'NNP'), ('I', 'PRP'), ('will', 'MD'), ('set', 'VB'), ('out', 'RP'),
('a', 'DT'), ('better', 'JJR'), ('path', 'NN'), (':', ':'), ('an', 'DT'), ('agend
a', 'NN'), ('for', 'IN'), ('a', 'DT'), ('nation', 'NN'), ('that', 'WDT'), ('compet
es', 'VBZ'), ('with', 'IN'), ('confidence', 'NN'), (';', ':'), ('an', 'DT'), ('age
nda', 'NN'), ('that', 'WDT'), ('will', 'MD'), ('raise', 'VB'), ('standards', 'NN
S'), ('of', 'IN'), ('living', 'NN'), ('and', 'CC'), ('generate', 'VB'), ('new', 'J
J'), ('jobs', 'NNS'), ('.', '.')]
[('Americans', 'NNPS'), ('should', 'MD'), ('not', 'RB'), ('fear', 'VB'), ('our',
'PRP$'), ('economic', 'JJ'), ('future', 'NN'), (',', ','), ('because', 'IN'), ('w
e', 'PRP'), ('intend', 'VBP'), ('to', 'TO'), ('shape', 'VB'), ('it', 'PRP'), ('.',
'.')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('begins', 'NNS'),
('with', 'IN'), ('keeping', 'VBG'), ('our', 'PRP$'), ('economy', 'NN'), ('growin
g', 'VBG'), ('.', '.')]
[('And', 'CC'), ('our', 'PRP$'), ('economy', 'NN'), ('grows', 'VBZ'), ('when', 'WR
B'), ('Americans', 'NNPS'), ('have', 'VBP'), ('more', 'JJR'), ('of', 'IN'), ('thei
r', 'PRP$'), ('own', 'JJ'), ('money', 'NN'), ('to', 'TO'), ('spend', 'VB'), (',',
','), ('save', 'VB'), (',', ','), ('and', 'CC'), ('invest', 'JJS'), ('.', '.')]
[('In', 'IN'), ('the', 'DT'), ('last', 'JJ'), ('five', 'CD'), ('years', 'NNS'),
(',', ','), ('the', 'DT'), ('tax', 'NN'), ('relief', 'NN'), ('you', 'PRP'), ('pass
ed', 'VBN'), ('has', 'VBZ'), ('left', 'VBN'), ('$', '$'), ('880', 'CD'), ('billio
n', 'CD'), ('in', 'IN'), ('the', 'DT'), ('hands', 'NNS'), ('of', 'IN'), ('America
n', 'JJ'), ('workers', 'NNS'), (',', ','), ('investors', 'NNS'), (',', ','), ('sma
ll', 'JJ'), ('businesses', 'NNS'), (',', ','), ('and', 'CC'), ('families', 'NNS'),
('--', ':'), ('and', 'CC'), ('they', 'PRP'), ('have', 'VBP'), ('used', 'VBN'), ('i
t', 'PRP'), ('to', 'TO'), ('help', 'VB'), ('produce', 'VB'), ('more', 'JJR'), ('th
an', 'IN'), ('four', 'CD'), ('years', 'NNS'), ('of', 'IN'), ('uninterrupted', 'J
J'), ('economic', 'JJ'), ('growth', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Yet', 'RB'), ('the', 'DT'), ('tax', 'NN'), ('relief', 'NN'), ('is', 'VBZ'), ('s
et', 'VBN'), ('to', 'TO'), ('expire', 'VB'), ('in', 'IN'), ('the', 'DT'), ('next',
'JJ'), ('few', 'JJ'), ('years', 'NNS'), ('.', '.')]
[('If', 'IN'), ('we', 'PRP'), ('do', 'VBP'), ('nothing', 'NN'), (',', ','), ('Amer
ican', 'NNP'), ('families', 'NNS'), ('will', 'MD'), ('face', 'VB'), ('a', 'DT'),
('massive', 'JJ'), ('tax', 'NN'), ('increase', 'NN'), ('they', 'PRP'), ('do', 'VB
P'), ('not', 'RB'), ('expect', 'VB'), ('and', 'CC'), ('will', 'MD'), ('not', 'R
B'), ('welcome', 'VB'), ('.', '.')]
[('Because', 'IN'), ('America', 'NNP'), ('needs', 'VBZ'), ('more', 'JJR'), ('tha
n', 'IN'), ('a', 'DT'), ('temporary', 'JJ'), ('expansion', 'NN'), (',', ','), ('w
e', 'PRP'), ('need', 'VBP'), ('more', 'JJR'), ('than', 'IN'), ('temporary', 'JJ'),
('tax', 'NN'), ('relief', 'NN'), ('.', '.')]
[('I', 'PRP'), ('urge', 'VBP'), ('the', 'DT'), ('Congress', 'NNP'), ('to', 'TO'),
('act', 'VB'), ('responsibly', 'RB'), (',', ','), ('and', 'CC'), ('make', 'VB'),
('the', 'DT'), ('tax', 'NN'), ('cuts', 'NNS'), ('permanent', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('requires', 'VB
Z'), ('us', 'PRP'), ('to', 'TO'), ('be', 'VB'), ('good', 'JJ'), ('stewards', 'NN
S'), ('of', 'IN'), ('tax', 'NN'), ('dollars', 'NNS'), ('.', '.')]
[('Every', 'DT'), ('year', 'NN'), ('of', 'IN'), ('my', 'PRP$'), ('presidency', 'N
N'), (',', ','), ('we', 'PRP'), ("'ve", 'VBP'), ('reduced', 'VBN'), ('the', 'DT'),
('growth', 'NN'), ('of', 'IN'), ('non-security', 'JJ'), ('discretionary', 'JJ'),
('spending', 'NN'), (',', ','), ('and', 'CC'), ('last', 'JJ'), ('year', 'NN'), ('y
ou', 'PRP'), ('passed', 'VBD'), ('bills', 'NNS'), ('that', 'IN'), ('cut', 'VBD'),
('this', 'DT'), ('spending', 'NN'), ('.', '.')]
[('This', 'DT'), ('year', 'NN'), ('my', 'PRP$'), ('budget', 'NN'), ('will', 'MD'),
('cut', 'VB'), ('it', 'PRP'), ('again', 'RB'), (',', ','), ('and', 'CC'), ('reduc
e', 'VB'), ('or', 'CC'), ('eliminate', 'VB'), ('more', 'JJR'), ('than', 'IN'), ('1
40', 'CD'), ('programs', 'NNS'), ('that', 'WDT'), ('are', 'VBP'), ('performing',
'VBG'), ('poorly', 'RB'), ('or', 'CC'), ('not', 'RB'), ('fulfilling', 'JJ'), ('ess
ential', 'JJ'), ('priorities', 'NNS'), ('.', '.')]
[('By', 'IN'), ('passing', 'VBG'), ('these', 'DT'), ('reforms', 'NNS'), (',',
','), ('we', 'PRP'), ('will', 'MD'), ('save', 'VB'), ('the', 'DT'), ('American',
'NNP'), ('taxpayer', 'NN'), ('another', 'DT'), ('$', '$'), ('14', 'CD'), ('billio
n', 'CD'), ('next', 'JJ'), ('year', 'NN'), (',', ','), ('and', 'CC'), ('stay', 'V
B'), ('on', 'IN'), ('track', 'NN'), ('to', 'TO'), ('cut', 'VB'), ('the', 'DT'),
('deficit', 'NN'), ('in', 'IN'), ('half', 'NN'), ('by', 'IN'), ('2009', 'CD'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('I', 'PRP'), ('am', 'VBP'), ('pleased', 'JJ'), ('that', 'IN'), ('members', 'NN
S'), ('of', 'IN'), ('Congress', 'NNP'), ('are', 'VBP'), ('working', 'VBG'), ('on',
'IN'), ('earmark', 'NN'), ('reform', 'NN'), (',', ','), ('because', 'IN'), ('the',
'DT'), ('federal', 'JJ'), ('budget', 'NN'), ('has', 'VBZ'), ('too', 'RB'), ('man
y', 'JJ'), ('special', 'JJ'), ('interest', 'NN'), ('projects', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('And', 'CC'), ('we', 'PRP'), ('can', 'MD'), ('tackle', 'VB'), ('this', 'DT'),
('problem', 'NN'), ('together', 'RB'), (',', ','), ('if', 'IN'), ('you', 'PRP'),
('pass', 'VBP'), ('the', 'DT'), ('line-item', 'JJ'), ('veto', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('We', 'PRP'), ('must', 'MD'), ('also', 'RB'), ('confront', 'VB'), ('the', 'DT'),
('larger', 'JJR'), ('challenge', 'NN'), ('of', 'IN'), ('mandatory', 'JJ'), ('spend
ing', 'NN'), (',', ','), ('or', 'CC'), ('entitlements', 'NNS'), ('.', '.')]
[('This', 'DT'), ('year', 'NN'), (',', ','), ('the', 'DT'), ('first', 'JJ'), ('o
f', 'IN'), ('about', 'RB'), ('78', 'CD'), ('million', 'CD'), ('baby', 'NN'), ('boo
mers', 'NNS'), ('turn', 'VBP'), ('60', 'CD'), (',', ','), ('including', 'VBG'),
('two', 'CD'), ('of', 'IN'), ('my', 'PRP$'), ('Dad', 'NNP'), ("'s", 'POS'), ('favo
rite', 'JJ'), ('people', 'NNS'), ('--', ':'), ('me', 'PRP'), ('and', 'CC'), ('Pres
ident', 'NNP'), ('Clinton', 'NNP'), ('.', '.')]
[('(', '('), ('Laughter', 'NNP'), ('.', '.'), (')', ')')]
[('This', 'DT'), ('milestone', 'NN'), ('is', 'VBZ'), ('more', 'JJR'), ('than', 'I
N'), ('a', 'DT'), ('personal', 'JJ'), ('crisis', 'NN'), ('--', ':'), ('(', '('),
('laughter', 'NN'), (')', ')'), ('--', ':'), ('it', 'PRP'), ('is', 'VBZ'), ('a',
'DT'), ('national', 'JJ'), ('challenge', 'NN'), ('.', '.')]
[('The', 'DT'), ('retirement', 'NN'), ('of', 'IN'), ('the', 'DT'), ('baby', 'NN'),
('boom', 'NN'), ('generation', 'NN'), ('will', 'MD'), ('put', 'VB'), ('unprecedent
ed', 'JJ'), ('strains', 'NNS'), ('on', 'IN'), ('the', 'DT'), ('federal', 'JJ'),
('government', 'NN'), ('.', '.')]
[('By', 'IN'), ('2030', 'CD'), (',', ','), ('spending', 'VBG'), ('for', 'IN'), ('S
ocial', 'NNP'), ('Security', 'NNP'), (',', ','), ('Medicare', 'NNP'), ('and', 'C
C'), ('Medicaid', 'NNP'), ('alone', 'RB'), ('will', 'MD'), ('be', 'VB'), ('almos
t', 'RB'), ('60', 'CD'), ('percent', 'NN'), ('of', 'IN'), ('the', 'DT'), ('entir
e', 'JJ'), ('federal', 'JJ'), ('budget', 'NN'), ('.', '.')]
[('And', 'CC'), ('that', 'DT'), ('will', 'MD'), ('present', 'VB'), ('future', 'J
J'), ('Congresses', 'NNS'), ('with', 'IN'), ('impossible', 'JJ'), ('choices', 'NN
S'), ('--', ':'), ('staggering', 'VBG'), ('tax', 'NN'), ('increases', 'NNS'),
(',', ','), ('immense', 'JJ'), ('deficits', 'NNS'), (',', ','), ('or', 'CC'), ('de
ep', 'JJ'), ('cuts', 'NNS'), ('in', 'IN'), ('every', 'DT'), ('category', 'NN'),
('of', 'IN'), ('spending', 'NN'), ('.', '.')]
[('Congress', 'NNP'), ('did', 'VBD'), ('not', 'RB'), ('act', 'VB'), ('last', 'J
J'), ('year', 'NN'), ('on', 'IN'), ('my', 'PRP$'), ('proposal', 'NN'), ('to', 'T
O'), ('save', 'VB'), ('Social', 'NNP'), ('Security', 'NNP'), ('--', ':'), ('(',
'('), ('applause', 'NN'), (')', ')'), ('--', ':'), ('yet', 'RB'), ('the', 'DT'),
('rising', 'VBG'), ('cost', 'NN'), ('of', 'IN'), ('entitlements', 'NNS'), ('is',
'VBZ'), ('a', 'DT'), ('problem', 'NN'), ('that', 'WDT'), ('is', 'VBZ'), ('not', 'R
B'), ('going', 'VBG'), ('away', 'RB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('And', 'CC'), ('every', 'DT'), ('year', 'NN'), ('we', 'PRP'), ('fail', 'VBP'),
('to', 'TO'), ('act', 'VB'), (',', ','), ('the', 'DT'), ('situation', 'NN'), ('get
s', 'VBZ'), ('worse', 'JJR'), ('.', '.')]
[('So', 'RB'), ('tonight', 'JJ'), (',', ','), ('I', 'PRP'), ('ask', 'VBP'), ('yo
u', 'PRP'), ('to', 'TO'), ('join', 'VB'), ('me', 'PRP'), ('in', 'IN'), ('creatin
g', 'VBG'), ('a', 'DT'), ('commission', 'NN'), ('to', 'TO'), ('examine', 'VB'),
('the', 'DT'), ('full', 'JJ'), ('impact', 'NN'), ('of', 'IN'), ('baby', 'NN'), ('b
oom', 'NN'), ('retirements', 'NNS'), ('on', 'IN'), ('Social', 'NNP'), ('Security',
'NNP'), (',', ','), ('Medicare', 'NNP'), (',', ','), ('and', 'CC'), ('Medicaid',
'NNP'), ('.', '.')]
[('This', 'DT'), ('commission', 'NN'), ('should', 'MD'), ('include', 'VB'), ('memb
ers', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), ('of', 'IN'), ('both', 'DT'), ('p
arties', 'NNS'), (',', ','), ('and', 'CC'), ('offer', 'VBP'), ('bipartisan', 'J
J'), ('solutions', 'NNS'), ('.', '.')]
[('We', 'PRP'), ('need', 'VBP'), ('to', 'TO'), ('put', 'VB'), ('aside', 'RP'), ('p
artisan', 'JJ'), ('politics', 'NNS'), ('and', 'CC'), ('work', 'NN'), ('together',
'RB'), ('and', 'CC'), ('get', 'VB'), ('this', 'DT'), ('problem', 'NN'), ('solved',
'VBD'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('requires', 'VB
Z'), ('us', 'PRP'), ('to', 'TO'), ('open', 'VB'), ('more', 'JJR'), ('markets', 'NN
S'), ('for', 'IN'), ('all', 'DT'), ('that', 'DT'), ('Americans', 'NNPS'), ('make',
'VBP'), ('and', 'CC'), ('grow', 'VB'), ('.', '.')]
[('One', 'CD'), ('out', 'NN'), ('of', 'IN'), ('every', 'DT'), ('five', 'CD'), ('fa
ctory', 'NN'), ('jobs', 'NNS'), ('in', 'IN'), ('America', 'NNP'), ('is', 'VBZ'),
('related', 'VBN'), ('to', 'TO'), ('global', 'JJ'), ('trade', 'NN'), (',', ','),
('and', 'CC'), ('we', 'PRP'), ('want', 'VBP'), ('people', 'NNS'), ('everywhere',
'RB'), ('to', 'TO'), ('buy', 'VB'), ('American', 'NNP'), ('.', '.')]
[('With', 'IN'), ('open', 'JJ'), ('markets', 'NNS'), ('and', 'CC'), ('a', 'DT'),
('level', 'JJ'), ('playing', 'NN'), ('field', 'NN'), (',', ','), ('no', 'DT'), ('o
ne', 'NN'), ('can', 'MD'), ('out-produce', 'VB'), ('or', 'CC'), ('out-compete', 'V
B'), ('the', 'DT'), ('American', 'JJ'), ('worker', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('requires', 'VB
Z'), ('an', 'DT'), ('immigration', 'NN'), ('system', 'NN'), ('that', 'WDT'), ('uph
olds', 'VBZ'), ('our', 'PRP$'), ('laws', 'NNS'), (',', ','), ('reflects', 'VBZ'),
('our', 'PRP$'), ('values', 'NNS'), (',', ','), ('and', 'CC'), ('serves', 'VBZ'),
('the', 'DT'), ('interests', 'NNS'), ('of', 'IN'), ('our', 'PRP$'), ('economy', 'N
N'), ('.', '.')]
[('Our', 'PRP$'), ('nation', 'NN'), ('needs', 'VBZ'), ('orderly', 'JJ'), ('and',
'CC'), ('secure', 'JJ'), ('borders', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('To', 'TO'), ('meet', 'VB'), ('this', 'DT'), ('goal', 'NN'), (',', ','), ('we',
'PRP'), ('must', 'MD'), ('have', 'VB'), ('stronger', 'JJR'), ('immigration', 'N
N'), ('enforcement', 'NN'), ('and', 'CC'), ('border', 'NN'), ('protection', 'NN'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('And', 'CC'), ('we', 'PRP'), ('must', 'MD'), ('have', 'VB'), ('a', 'DT'), ('rati
onal', 'JJ'), (',', ','), ('humane', 'JJ'), ('guest', 'JJS'), ('worker', 'NN'),
('program', 'NN'), ('that', 'WDT'), ('rejects', 'VBZ'), ('amnesty', 'JJ'), (',',
','), ('allows', 'VBZ'), ('temporary', 'JJ'), ('jobs', 'NNS'), ('for', 'IN'), ('pe
ople', 'NNS'), ('who', 'WP'), ('seek', 'VBP'), ('them', 'PRP'), ('legally', 'RB'),
(',', ','), ('and', 'CC'), ('reduces', 'NNS'), ('smuggling', 'VBG'), ('and', 'C
C'), ('crime', 'NN'), ('at', 'IN'), ('the', 'DT'), ('border', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('requires', 'VB
Z'), ('affordable', 'JJ'), ('health', 'NN'), ('care', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Our', 'PRP$'), ('government', 'NN'), ('has', 'VBZ'), ('a', 'DT'), ('responsibil
ity', 'NN'), ('to', 'TO'), ('provide', 'VB'), ('health', 'NN'), ('care', 'NN'),
('for', 'IN'), ('the', 'DT'), ('poor', 'JJ'), ('and', 'CC'), ('the', 'DT'), ('elde
rly', 'JJ'), (',', ','), ('and', 'CC'), ('we', 'PRP'), ('are', 'VBP'), ('meeting',
'VBG'), ('that', 'IN'), ('responsibility', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('For', 'IN'), ('all', 'DT'), ('Americans', 'NNPS'), ('--', ':'), ('for', 'IN'),
('all', 'DT'), ('Americans', 'NNPS'), (',', ','), ('we', 'PRP'), ('must', 'MD'),
('confront', 'VB'), ('the', 'DT'), ('rising', 'VBG'), ('cost', 'NN'), ('of', 'I
N'), ('care', 'NN'), (',', ','), ('strengthen', 'VB'), ('the', 'DT'), ('doctor-pat
ient', 'JJ'), ('relationship', 'NN'), (',', ','), ('and', 'CC'), ('help', 'NN'),
('people', 'NNS'), ('afford', 'VBP'), ('the', 'DT'), ('insurance', 'NN'), ('covera
ge', 'NN'), ('they', 'PRP'), ('need', 'VBP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('We', 'PRP'), ('will', 'MD'), ('make', 'VB'), ('wider', 'JJR'), ('use', 'NN'),
('of', 'IN'), ('electronic', 'JJ'), ('records', 'NNS'), ('and', 'CC'), ('other',
'JJ'), ('health', 'NN'), ('information', 'NN'), ('technology', 'NN'), (',', ','),
('to', 'TO'), ('help', 'VB'), ('control', 'VB'), ('costs', 'NNS'), ('and', 'CC'),
('reduce', 'VB'), ('dangerous', 'JJ'), ('medical', 'JJ'), ('errors', 'NNS'), ('.',
'.')]
[('We', 'PRP'), ('will', 'MD'), ('strengthen', 'VB'), ('health', 'NN'), ('saving
s', 'NNS'), ('accounts', 'NNS'), ('--', ':'), ('making', 'VBG'), ('sure', 'JJ'),
('individuals', 'NNS'), ('and', 'CC'), ('small', 'JJ'), ('business', 'NN'), ('empl
oyees', 'NNS'), ('can', 'MD'), ('buy', 'VB'), ('insurance', 'NN'), ('with', 'IN'),
('the', 'DT'), ('same', 'JJ'), ('advantages', 'VBZ'), ('that', 'IN'), ('people',
'NNS'), ('working', 'VBG'), ('for', 'IN'), ('big', 'JJ'), ('businesses', 'NNS'),
('now', 'RB'), ('get', 'VBP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('We', 'PRP'), ('will', 'MD'), ('do', 'VB'), ('more', 'JJR'), ('to', 'TO'), ('mak
e', 'VB'), ('this', 'DT'), ('coverage', 'NN'), ('portable', 'JJ'), (',', ','), ('s
o', 'IN'), ('workers', 'NNS'), ('can', 'MD'), ('switch', 'VB'), ('jobs', 'NNS'),
('without', 'IN'), ('having', 'VBG'), ('to', 'TO'), ('worry', 'VB'), ('about', 'I
N'), ('losing', 'VBG'), ('their', 'PRP$'), ('health', 'NN'), ('insurance', 'NN'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('And', 'CC'), ('because', 'IN'), ('lawsuits', 'NNS'), ('are', 'VBP'), ('drivin
g', 'VBG'), ('many', 'JJ'), ('good', 'JJ'), ('doctors', 'NNS'), ('out', 'IN'), ('o
f', 'IN'), ('practice', 'NN'), ('--', ':'), ('leaving', 'VBG'), ('women', 'NNS'),
('in', 'IN'), ('nearly', 'RB'), ('1,500', 'CD'), ('American', 'JJ'), ('counties',
'NNS'), ('without', 'IN'), ('a', 'DT'), ('single', 'JJ'), ('OB/GYN', 'NNP'), ('--
', ':'), ('I', 'PRP'), ('ask', 'VBP'), ('the', 'DT'), ('Congress', 'NNP'), ('to',
'TO'), ('pass', 'VB'), ('medical', 'JJ'), ('liability', 'NN'), ('reform', 'NN'),
('this', 'DT'), ('year', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Keeping', 'VBG'), ('America', 'NNP'), ('competitive', 'JJ'), ('requires', 'VB
Z'), ('affordable', 'JJ'), ('energy', 'NN'), ('.', '.')]
[('And', 'CC'), ('here', 'RB'), ('we', 'PRP'), ('have', 'VBP'), ('a', 'DT'), ('ser
ious', 'JJ'), ('problem', 'NN'), (':', ':'), ('America', 'NNP'), ('is', 'VBZ'),
('addicted', 'VBN'), ('to', 'TO'), ('oil', 'NN'), (',', ','), ('which', 'WDT'),
('is', 'VBZ'), ('often', 'RB'), ('imported', 'VBN'), ('from', 'IN'), ('unstable',
'JJ'), ('parts', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]
[('The', 'DT'), ('best', 'JJS'), ('way', 'NN'), ('to', 'TO'), ('break', 'VB'), ('t
his', 'DT'), ('addiction', 'NN'), ('is', 'VBZ'), ('through', 'IN'), ('technology',
'NN'), ('.', '.')]
[('Since', 'IN'), ('2001', 'CD'), (',', ','), ('we', 'PRP'), ('have', 'VBP'), ('sp
ent', 'VBN'), ('nearly', 'RB'), ('$', '$'), ('10', 'CD'), ('billion', 'CD'), ('t
o', 'TO'), ('develop', 'VB'), ('cleaner', 'JJR'), (',', ','), ('cheaper', 'JJR'),
(',', ','), ('and', 'CC'), ('more', 'RBR'), ('reliable', 'JJ'), ('alternative', 'J
J'), ('energy', 'NN'), ('sources', 'NNS'), ('--', ':'), ('and', 'CC'), ('we', 'PR
P'), ('are', 'VBP'), ('on', 'IN'), ('the', 'DT'), ('threshold', 'NN'), ('of', 'I
N'), ('incredible', 'JJ'), ('advances', 'NNS'), ('.', '.')]
[('So', 'RB'), ('tonight', 'JJ'), (',', ','), ('I', 'PRP'), ('announce', 'VBP'),
('the', 'DT'), ('Advanced', 'NNP'), ('Energy', 'NNP'), ('Initiative', 'NNP'), ('--
', ':'), ('a', 'DT'), ('22-percent', 'JJ'), ('increase', 'NN'), ('in', 'IN'), ('cl
ean-energy', 'JJ'), ('research', 'NN'), ('--', ':'), ('at', 'IN'), ('the', 'DT'),
('Department', 'NNP'), ('of', 'IN'), ('Energy', 'NNP'), (',', ','), ('to', 'TO'),
('push', 'VB'), ('for', 'IN'), ('breakthroughs', 'NNS'), ('in', 'IN'), ('two', 'C
D'), ('vital', 'JJ'), ('areas', 'NNS'), ('.', '.')]
[('To', 'TO'), ('change', 'VB'), ('how', 'WRB'), ('we', 'PRP'), ('power', 'NN'),
('our', 'PRP$'), ('homes', 'NNS'), ('and', 'CC'), ('offices', 'NNS'), (',', ','),
('we', 'PRP'), ('will', 'MD'), ('invest', 'VB'), ('more', 'RBR'), ('in', 'IN'),
('zero-emission', 'JJ'), ('coal-fired', 'JJ'), ('plants', 'NNS'), (',', ','), ('re
volutionary', 'JJ'), ('solar', 'NN'), ('and', 'CC'), ('wind', 'NN'), ('technologie
s', 'NNS'), (',', ','), ('and', 'CC'), ('clean', 'JJ'), (',', ','), ('safe', 'J
J'), ('nuclear', 'JJ'), ('energy', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('We', 'PRP'), ('must', 'MD'), ('also', 'RB'), ('change', 'VB'), ('how', 'WRB'),
('we', 'PRP'), ('power', 'NN'), ('our', 'PRP$'), ('automobiles', 'NNS'), ('.',
'.')]
[('We', 'PRP'), ('will', 'MD'), ('increase', 'VB'), ('our', 'PRP$'), ('research',
'NN'), ('in', 'IN'), ('better', 'JJR'), ('batteries', 'NNS'), ('for', 'IN'), ('hyb
rid', 'JJ'), ('and', 'CC'), ('electric', 'JJ'), ('cars', 'NNS'), (',', ','), ('an
d', 'CC'), ('in', 'IN'), ('pollution-free', 'JJ'), ('cars', 'NNS'), ('that', 'WD
T'), ('run', 'VBP'), ('on', 'IN'), ('hydrogen', 'NN'), ('.', '.')]
[('We', 'PRP'), ("'ll", 'MD'), ('also', 'RB'), ('fund', 'VB'), ('additional', 'J
J'), ('research', 'NN'), ('in', 'IN'), ('cutting-edge', 'JJ'), ('methods', 'NNS'),
('of', 'IN'), ('producing', 'VBG'), ('ethanol', 'NN'), (',', ','), ('not', 'RB'),
('just', 'RB'), ('from', 'IN'), ('corn', 'NN'), (',', ','), ('but', 'CC'), ('fro
m', 'IN'), ('wood', 'NN'), ('chips', 'NNS'), ('and', 'CC'), ('stalks', 'NNS'),
(',', ','), ('or', 'CC'), ('switch', 'VB'), ('grass', 'NN'), ('.', '.')]
[('Our', 'PRP$'), ('goal', 'NN'), ('is', 'VBZ'), ('to', 'TO'), ('make', 'VB'), ('t
his', 'DT'), ('new', 'JJ'), ('kind', 'NN'), ('of', 'IN'), ('ethanol', 'JJ'), ('pra
ctical', 'JJ'), ('and', 'CC'), ('competitive', 'JJ'), ('within', 'IN'), ('six', 'C
D'), ('years', 'NNS'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Breakthroughs', 'NNS'), ('on', 'IN'), ('this', 'DT'), ('and', 'CC'), ('other',
'JJ'), ('new', 'JJ'), ('technologies', 'NNS'), ('will', 'MD'), ('help', 'VB'), ('u
s', 'PRP'), ('reach', 'VB'), ('another', 'DT'), ('great', 'JJ'), ('goal', 'NN'),
(':', ':'), ('to', 'TO'), ('replace', 'VB'), ('more', 'JJR'), ('than', 'IN'), ('7
5', 'CD'), ('percent', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('oil', 'NN'), ('impo
rts', 'NNS'), ('from', 'IN'), ('the', 'DT'), ('Middle', 'NNP'), ('East', 'NNP'),
('by', 'IN'), ('2025', 'CD'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('By', 'IN'), ('applying', 'VBG'), ('the', 'DT'), ('talent', 'NN'), ('and', 'C
C'), ('technology', 'NN'), ('of', 'IN'), ('America', 'NNP'), (',', ','), ('this',
'DT'), ('country', 'NN'), ('can', 'MD'), ('dramatically', 'RB'), ('improve', 'V
B'), ('our', 'PRP$'), ('environment', 'NN'), (',', ','), ('move', 'VB'), ('beyon
d', 'IN'), ('a', 'DT'), ('petroleum-based', 'JJ'), ('economy', 'NN'), (',', ','),
('and', 'CC'), ('make', 'VB'), ('our', 'PRP$'), ('dependence', 'NN'), ('on', 'I
N'), ('Middle', 'NNP'), ('Eastern', 'NNP'), ('oil', 'NN'), ('a', 'DT'), ('thing',
'NN'), ('of', 'IN'), ('the', 'DT'), ('past', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('And', 'CC'), ('to', 'TO'), ('keep', 'VB'), ('America', 'NNP'), ('competitive',
'JJ'), (',', ','), ('one', 'CD'), ('commitment', 'NN'), ('is', 'VBZ'), ('necessar
y', 'JJ'), ('above', 'IN'), ('all', 'DT'), (':', ':'), ('We', 'PRP'), ('must', 'M
D'), ('continue', 'VB'), ('to', 'TO'), ('lead', 'VB'), ('the', 'DT'), ('world', 'N
N'), ('in', 'IN'), ('human', 'JJ'), ('talent', 'NN'), ('and', 'CC'), ('creativit
y', 'NN'), ('.', '.')]
[('Our', 'PRP$'), ('greatest', 'JJS'), ('advantage', 'NN'), ('in', 'IN'), ('the',
'DT'), ('world', 'NN'), ('has', 'VBZ'), ('always', 'RB'), ('been', 'VBN'), ('our',
'PRP$'), ('educated', 'VBN'), (',', ','), ('hardworking', 'VBG'), (',', ','), ('am
bitious', 'JJ'), ('people', 'NNS'), ('--', ':'), ('and', 'CC'), ('we', 'PRP'),
("'re", 'VBP'), ('going', 'VBG'), ('to', 'TO'), ('keep', 'VB'), ('that', 'DT'),
('edge', 'NN'), ('.', '.')]
[('Tonight', 'NN'), ('I', 'PRP'), ('announce', 'VBP'), ('an', 'DT'), ('American',
'JJ'), ('Competitiveness', 'NNP'), ('Initiative', 'NNP'), (',', ','), ('to', 'T
O'), ('encourage', 'VB'), ('innovation', 'NN'), ('throughout', 'IN'), ('our', 'PRP
$'), ('economy', 'NN'), (',', ','), ('and', 'CC'), ('to', 'TO'), ('give', 'VB'),
('our', 'PRP$'), ('nation', 'NN'), ("'s", 'POS'), ('children', 'NNS'), ('a', 'D
T'), ('firm', 'NN'), ('grounding', 'VBG'), ('in', 'IN'), ('math', 'NN'), ('and',
'CC'), ('science', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('First', 'RB'), (',', ','), ('I', 'PRP'), ('propose', 'VBP'), ('to', 'TO'), ('do
uble', 'VB'), ('the', 'DT'), ('federal', 'JJ'), ('commitment', 'NN'), ('to', 'T
O'), ('the', 'DT'), ('most', 'RBS'), ('critical', 'JJ'), ('basic', 'JJ'), ('resear
ch', 'NN'), ('programs', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('physical', 'JJ'),
('sciences', 'NNS'), ('over', 'IN'), ('the', 'DT'), ('next', 'JJ'), ('10', 'CD'),
('years', 'NNS'), ('.', '.')]
[('This', 'DT'), ('funding', 'NN'), ('will', 'MD'), ('support', 'VB'), ('the', 'D
T'), ('work', 'NN'), ('of', 'IN'), ('America', 'NNP'), ("'s", 'POS'), ('most', 'RB
S'), ('creative', 'JJ'), ('minds', 'NNS'), ('as', 'IN'), ('they', 'PRP'), ('explor
e', 'VBP'), ('promising', 'VBG'), ('areas', 'NNS'), ('such', 'JJ'), ('as', 'IN'),
('nanotechnology', 'NN'), (',', ','), ('supercomputing', 'NN'), (',', ','), ('an
d', 'CC'), ('alternative', 'JJ'), ('energy', 'NN'), ('sources', 'NNS'), ('.',
'.')]
[('Second', 'JJ'), (',', ','), ('I', 'PRP'), ('propose', 'VBP'), ('to', 'TO'), ('m
ake', 'VB'), ('permanent', 'JJ'), ('the', 'DT'), ('research', 'NN'), ('and', 'C
C'), ('development', 'NN'), ('tax', 'NN'), ('credit', 'NN'), ('--', ':'), ('(',
'('), ('applause', 'NN'), (')', ')'), ('--', ':'), ('to', 'TO'), ('encourage', 'V
B'), ('bolder', 'VB'), ('private-sector', 'JJ'), ('initiatives', 'NNS'), ('in', 'I
N'), ('technology', 'NN'), ('.', '.')]
[('With', 'IN'), ('more', 'JJR'), ('research', 'NN'), ('in', 'IN'), ('both', 'C
C'), ('the', 'DT'), ('public', 'NN'), ('and', 'CC'), ('private', 'JJ'), ('sector
s', 'NNS'), (',', ','), ('we', 'PRP'), ('will', 'MD'), ('improve', 'VB'), ('our',
'PRP$'), ('quality', 'NN'), ('of', 'IN'), ('life', 'NN'), ('--', ':'), ('and', 'C
C'), ('ensure', 'VB'), ('that', 'DT'), ('America', 'NNP'), ('will', 'MD'), ('lea
d', 'VB'), ('the', 'DT'), ('world', 'NN'), ('in', 'IN'), ('opportunity', 'NN'),
('and', 'CC'), ('innovation', 'NN'), ('for', 'IN'), ('decades', 'NNS'), ('to', 'T
O'), ('come', 'VB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Third', 'NNP'), (',', ','), ('we', 'PRP'), ('need', 'VBP'), ('to', 'TO'), ('enc
ourage', 'VB'), ('children', 'NNS'), ('to', 'TO'), ('take', 'VB'), ('more', 'JJ
R'), ('math', 'NN'), ('and', 'CC'), ('science', 'NN'), (',', ','), ('and', 'CC'),
('to', 'TO'), ('make', 'VB'), ('sure', 'JJ'), ('those', 'DT'), ('courses', 'NNS'),
('are', 'VBP'), ('rigorous', 'JJ'), ('enough', 'RB'), ('to', 'TO'), ('compete', 'V
B'), ('with', 'IN'), ('other', 'JJ'), ('nations', 'NNS'), ('.', '.')]
[('We', 'PRP'), ("'ve", 'VBP'), ('made', 'VBN'), ('a', 'DT'), ('good', 'JJ'), ('st
art', 'NN'), ('in', 'IN'), ('the', 'DT'), ('early', 'JJ'), ('grades', 'NNS'), ('wi
th', 'IN'), ('the', 'DT'), ('No', 'NNP'), ('Child', 'NNP'), ('Left', 'NNP'), ('Beh
ind', 'NNP'), ('Act', 'NNP'), (',', ','), ('which', 'WDT'), ('is', 'VBZ'), ('raisi
ng', 'VBG'), ('standards', 'NNS'), ('and', 'CC'), ('lifting', 'VBG'), ('test', 'N
N'), ('scores', 'NNS'), ('across', 'IN'), ('our', 'PRP$'), ('country', 'NN'),
('.', '.')]
[('Tonight', 'NNP'), ('I', 'PRP'), ('propose', 'VBP'), ('to', 'TO'), ('train', 'V
B'), ('70,000', 'CD'), ('high', 'JJ'), ('school', 'NN'), ('teachers', 'NNS'), ('t
o', 'TO'), ('lead', 'VB'), ('advanced-placement', 'JJ'), ('courses', 'NNS'), ('i
n', 'IN'), ('math', 'NN'), ('and', 'CC'), ('science', 'NN'), (',', ','), ('bring',
'VBG'), ('30,000', 'CD'), ('math', 'NN'), ('and', 'CC'), ('science', 'NN'), ('prof
essionals', 'NNS'), ('to', 'TO'), ('teach', 'VB'), ('in', 'IN'), ('classrooms', 'N
NS'), (',', ','), ('and', 'CC'), ('give', 'VB'), ('early', 'JJ'), ('help', 'NN'),
('to', 'TO'), ('students', 'NNS'), ('who', 'WP'), ('struggle', 'VBP'), ('with', 'I
N'), ('math', 'NN'), (',', ','), ('so', 'IN'), ('they', 'PRP'), ('have', 'VBP'),
('a', 'DT'), ('better', 'JJR'), ('chance', 'NN'), ('at', 'IN'), ('good', 'JJ'),
(',', ','), ('high-wage', 'JJ'), ('jobs', 'NNS'), ('.', '.')]
[('If', 'IN'), ('we', 'PRP'), ('ensure', 'VB'), ('that', 'IN'), ('America', 'NN
P'), ("'s", 'POS'), ('children', 'NNS'), ('succeed', 'VB'), ('in', 'IN'), ('life',
'NN'), (',', ','), ('they', 'PRP'), ('will', 'MD'), ('ensure', 'VB'), ('that', 'I
N'), ('America', 'NNP'), ('succeeds', 'VBZ'), ('in', 'IN'), ('the', 'DT'), ('worl
d', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Preparing', 'VBG'), ('our', 'PRP$'), ('nation', 'NN'), ('to', 'TO'), ('compet
e', 'VB'), ('in', 'IN'), ('the', 'DT'), ('world', 'NN'), ('is', 'VBZ'), ('a', 'D
T'), ('goal', 'NN'), ('that', 'IN'), ('all', 'DT'), ('of', 'IN'), ('us', 'PRP'),
('can', 'MD'), ('share', 'NN'), ('.', '.')]
[('I', 'PRP'), ('urge', 'VBP'), ('you', 'PRP'), ('to', 'TO'), ('support', 'VB'),
('the', 'DT'), ('American', 'JJ'), ('Competitiveness', 'NNP'), ('Initiative', 'NN
P'), (',', ','), ('and', 'CC'), ('together', 'RB'), ('we', 'PRP'), ('will', 'MD'),
('show', 'VB'), ('the', 'DT'), ('world', 'NN'), ('what', 'WP'), ('the', 'DT'), ('A
merican', 'JJ'), ('people', 'NNS'), ('can', 'MD'), ('achieve', 'VB'), ('.', '.')]
[('America', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('great', 'JJ'), ('force', 'NN'),
('for', 'IN'), ('freedom', 'NN'), ('and', 'CC'), ('prosperity', 'NN'), ('.', '.')]
[('Yet', 'RB'), ('our', 'PRP$'), ('greatness', 'NN'), ('is', 'VBZ'), ('not', 'R
B'), ('measured', 'VBN'), ('in', 'IN'), ('power', 'NN'), ('or', 'CC'), ('luxurie
s', 'NNS'), (',', ','), ('but', 'CC'), ('by', 'IN'), ('who', 'WP'), ('we', 'PRP'),
('are', 'VBP'), ('and', 'CC'), ('how', 'WRB'), ('we', 'PRP'), ('treat', 'VBP'),
('one', 'CD'), ('another', 'DT'), ('.', '.')]
[('So', 'IN'), ('we', 'PRP'), ('strive', 'VBP'), ('to', 'TO'), ('be', 'VB'), ('a',
'DT'), ('compassionate', 'NN'), (',', ','), ('decent', 'NN'), (',', ','), ('hopefu
l', 'JJ'), ('society', 'NN'), ('.', '.')]
[('In', 'IN'), ('recent', 'JJ'), ('years', 'NNS'), (',', ','), ('America', 'NNP'),
('has', 'VBZ'), ('become', 'VBN'), ('a', 'DT'), ('more', 'RBR'), ('hopeful', 'J
J'), ('nation', 'NN'), ('.', '.')]
[('Violent', 'JJ'), ('crime', 'NN'), ('rates', 'NNS'), ('have', 'VBP'), ('fallen',
'VBN'), ('to', 'TO'), ('their', 'PRP$'), ('lowest', 'JJS'), ('levels', 'NNS'), ('s
ince', 'IN'), ('the', 'DT'), ('1970s', 'CD'), ('.', '.')]
[('Welfare', 'NN'), ('cases', 'NNS'), ('have', 'VBP'), ('dropped', 'VBN'), ('by',
'IN'), ('more', 'JJR'), ('than', 'IN'), ('half', 'NN'), ('over', 'IN'), ('the', 'D
T'), ('past', 'JJ'), ('decade', 'NN'), ('.', '.')]
[('Drug', 'NN'), ('use', 'NN'), ('among', 'IN'), ('youth', 'NN'), ('is', 'VBZ'),
('down', 'RB'), ('19', 'CD'), ('percent', 'NN'), ('since', 'IN'), ('2001', 'CD'),
('.', '.')]
[('There', 'EX'), ('are', 'VBP'), ('fewer', 'JJR'), ('abortions', 'NNS'), ('in',
'IN'), ('America', 'NNP'), ('than', 'IN'), ('at', 'IN'), ('any', 'DT'), ('point',
'NN'), ('in', 'IN'), ('the', 'DT'), ('last', 'JJ'), ('three', 'CD'), ('decades',
'NNS'), (',', ','), ('and', 'CC'), ('the', 'DT'), ('number', 'NN'), ('of', 'IN'),
('children', 'NNS'), ('born', 'VBN'), ('to', 'TO'), ('teenage', 'VB'), ('mothers',
'NNS'), ('has', 'VBZ'), ('been', 'VBN'), ('falling', 'VBG'), ('for', 'IN'), ('a',
'DT'), ('dozen', 'NN'), ('years', 'NNS'), ('in', 'IN'), ('a', 'DT'), ('row', 'N
N'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('These', 'DT'), ('gains', 'NNS'), ('are', 'VBP'), ('evidence', 'NN'), ('of', 'I
N'), ('a', 'DT'), ('quiet', 'JJ'), ('transformation', 'NN'), ('--', ':'), ('a', 'D
T'), ('revolution', 'NN'), ('of', 'IN'), ('conscience', 'NN'), (',', ','), ('in',
'IN'), ('which', 'WDT'), ('a', 'DT'), ('rising', 'VBG'), ('generation', 'NN'), ('i
s', 'VBZ'), ('finding', 'VBG'), ('that', 'IN'), ('a', 'DT'), ('life', 'NN'), ('o
f', 'IN'), ('personal', 'JJ'), ('responsibility', 'NN'), ('is', 'VBZ'), ('a', 'D
T'), ('life', 'NN'), ('of', 'IN'), ('fulfillment', 'NN'), ('.', '.')]
[('Government', 'NNP'), ('has', 'VBZ'), ('played', 'VBN'), ('a', 'DT'), ('role',
'NN'), ('.', '.')]
[('Wise', 'NNP'), ('policies', 'NNS'), (',', ','), ('such', 'JJ'), ('as', 'IN'),
('welfare', 'NN'), ('reform', 'NN'), ('and', 'CC'), ('drug', 'NN'), ('education',
'NN'), ('and', 'CC'), ('support', 'NN'), ('for', 'IN'), ('abstinence', 'NN'), ('an
d', 'CC'), ('adoption', 'NN'), ('have', 'VBP'), ('made', 'VBN'), ('a', 'DT'), ('di
fference', 'NN'), ('in', 'IN'), ('the', 'DT'), ('character', 'NN'), ('of', 'IN'),
('our', 'PRP$'), ('country', 'NN'), ('.', '.')]
[('And', 'CC'), ('everyone', 'NN'), ('here', 'RB'), ('tonight', 'RB'), (',', ','),
('Democrat', 'NNP'), ('and', 'CC'), ('Republican', 'NNP'), (',', ','), ('has', 'VB
Z'), ('a', 'DT'), ('right', 'NN'), ('to', 'TO'), ('be', 'VB'), ('proud', 'JJ'),
('of', 'IN'), ('this', 'DT'), ('record', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Yet', 'RB'), ('many', 'JJ'), ('Americans', 'NNPS'), (',', ','), ('especially',
'RB'), ('parents', 'NNS'), (',', ','), ('still', 'RB'), ('have', 'VBP'), ('deep',
'JJ'), ('concerns', 'NNS'), ('about', 'IN'), ('the', 'DT'), ('direction', 'NN'),
('of', 'IN'), ('our', 'PRP$'), ('culture', 'NN'), (',', ','), ('and', 'CC'), ('th
e', 'DT'), ('health', 'NN'), ('of', 'IN'), ('our', 'PRP$'), ('most', 'JJS'), ('bas
ic', 'JJ'), ('institutions', 'NNS'), ('.', '.')]
[('They', 'PRP'), ("'re", 'VBP'), ('concerned', 'VBN'), ('about', 'IN'), ('unethic
al', 'JJ'), ('conduct', 'NN'), ('by', 'IN'), ('public', 'JJ'), ('officials', 'NN
S'), (',', ','), ('and', 'CC'), ('discouraged', 'VBN'), ('by', 'IN'), ('activist',
'NN'), ('courts', 'NNS'), ('that', 'WDT'), ('try', 'VBP'), ('to', 'TO'), ('redefin
e', 'VB'), ('marriage', 'NN'), ('.', '.')]
[('They', 'PRP'), ('worry', 'VBP'), ('about', 'IN'), ('children', 'NNS'), ('in',
'IN'), ('our', 'PRP$'), ('society', 'NN'), ('who', 'WP'), ('need', 'VBP'), ('direc
tion', 'NN'), ('and', 'CC'), ('love', 'NN'), (',', ','), ('and', 'CC'), ('about',
'IN'), ('fellow', 'JJ'), ('citizens', 'NNS'), ('still', 'RB'), ('displaced', 'VB
N'), ('by', 'IN'), ('natural', 'JJ'), ('disaster', 'NN'), (',', ','), ('and', 'C
C'), ('about', 'IN'), ('suffering', 'VBG'), ('caused', 'VBN'), ('by', 'IN'), ('tre
atable', 'JJ'), ('diseases', 'NNS'), ('.', '.')]
[('As', 'IN'), ('we', 'PRP'), ('look', 'VBP'), ('at', 'IN'), ('these', 'DT'), ('ch
allenges', 'NNS'), (',', ','), ('we', 'PRP'), ('must', 'MD'), ('never', 'RB'), ('g
ive', 'VB'), ('in', 'IN'), ('to', 'TO'), ('the', 'DT'), ('belief', 'NN'), ('that',
'IN'), ('America', 'NNP'), ('is', 'VBZ'), ('in', 'IN'), ('decline', 'NN'), (',',
','), ('or', 'CC'), ('that', 'IN'), ('our', 'PRP$'), ('culture', 'NN'), ('is', 'VB
Z'), ('doomed', 'VBN'), ('to', 'TO'), ('unravel', 'VB'), ('.', '.')]
[('The', 'DT'), ('American', 'JJ'), ('people', 'NNS'), ('know', 'VBP'), ('better',
'JJR'), ('than', 'IN'), ('that', 'DT'), ('.', '.')]
[('We', 'PRP'), ('have', 'VBP'), ('proven', 'VBN'), ('the', 'DT'), ('pessimists',
'NNS'), ('wrong', 'JJ'), ('before', 'RB'), ('--', ':'), ('and', 'CC'), ('we', 'PR
P'), ('will', 'MD'), ('do', 'VB'), ('it', 'PRP'), ('again', 'RB'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('depends', 'VBZ'), ('on', 'I
N'), ('courts', 'NNS'), ('that', 'IN'), ('deliver', 'VBP'), ('equal', 'JJ'), ('jus
tice', 'NN'), ('under', 'IN'), ('the', 'DT'), ('law', 'NN'), ('.', '.')]
[('The', 'DT'), ('Supreme', 'NNP'), ('Court', 'NNP'), ('now', 'RB'), ('has', 'VB
Z'), ('two', 'CD'), ('superb', 'JJ'), ('new', 'JJ'), ('members', 'NNS'), ('--',
':'), ('new', 'JJ'), ('members', 'NNS'), ('on', 'IN'), ('its', 'PRP$'), ('bench',
'NN'), (':', ':'), ('Chief', 'JJ'), ('Justice', 'NNP'), ('John', 'NNP'), ('Robert
s', 'NNP'), ('and', 'CC'), ('Justice', 'NNP'), ('Sam', 'NNP'), ('Alito', 'NNP'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('I', 'PRP'), ('thank', 'VBD'), ('the', 'DT'), ('Senate', 'NNP'), ('for', 'IN'),
('confirming', 'VBG'), ('both', 'DT'), ('of', 'IN'), ('them', 'PRP'), ('.', '.')]
[('I', 'PRP'), ('will', 'MD'), ('continue', 'VB'), ('to', 'TO'), ('nominate', 'V
B'), ('men', 'NNS'), ('and', 'CC'), ('women', 'NNS'), ('who', 'WP'), ('understan
d', 'VBP'), ('that', 'IN'), ('judges', 'NNS'), ('must', 'MD'), ('be', 'VB'), ('ser
vants', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('law', 'NN'), (',', ','), ('and', 'C
C'), ('not', 'RB'), ('legislate', 'VB'), ('from', 'IN'), ('the', 'DT'), ('bench',
'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Today', 'NN'), ('marks', 'VBZ'), ('the', 'DT'), ('official', 'JJ'), ('retiremen
t', 'NN'), ('of', 'IN'), ('a', 'DT'), ('very', 'RB'), ('special', 'JJ'), ('America
n', 'NNP'), ('.', '.')]
[('For', 'IN'), ('24', 'CD'), ('years', 'NNS'), ('of', 'IN'), ('faithful', 'JJ'),
('service', 'NN'), ('to', 'TO'), ('our', 'PRP$'), ('nation', 'NN'), (',', ','),
('the', 'DT'), ('United', 'NNP'), ('States', 'NNPS'), ('is', 'VBZ'), ('grateful',
'JJ'), ('to', 'TO'), ('Justice', 'NNP'), ('Sandra', 'NNP'), ('Day', 'NNP'), ("O'Co
nnor", 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('has', 'VBZ'), ('institution
s', 'NNS'), ('of', 'IN'), ('science', 'NN'), ('and', 'CC'), ('medicine', 'NN'),
('that', 'WDT'), ('do', 'VBP'), ('not', 'RB'), ('cut', 'VB'), ('ethical', 'JJ'),
('corners', 'NNS'), (',', ','), ('and', 'CC'), ('that', 'IN'), ('recognize', 'VB
P'), ('the', 'DT'), ('matchless', 'NN'), ('value', 'NN'), ('of', 'IN'), ('every',
'DT'), ('life', 'NN'), ('.', '.')]
[('Tonight', 'NNP'), ('I', 'PRP'), ('ask', 'VBP'), ('you', 'PRP'), ('to', 'TO'),
('pass', 'VB'), ('legislation', 'NN'), ('to', 'TO'), ('prohibit', 'VB'), ('the',
'DT'), ('most', 'RBS'), ('egregious', 'JJ'), ('abuses', 'NNS'), ('of', 'IN'), ('me
dical', 'JJ'), ('research', 'NN'), (':', ':'), ('human', 'JJ'), ('cloning', 'VB
G'), ('in', 'IN'), ('all', 'DT'), ('its', 'PRP$'), ('forms', 'NNS'), (',', ','),
('creating', 'VBG'), ('or', 'CC'), ('implanting', 'VBG'), ('embryos', 'NN'), ('fo
r', 'IN'), ('experiments', 'NNS'), (',', ','), ('creating', 'VBG'), ('human-anima
l', 'JJ'), ('hybrids', 'NNS'), (',', ','), ('and', 'CC'), ('buying', 'NN'), (',',
','), ('selling', 'NN'), (',', ','), ('or', 'CC'), ('patenting', 'VBG'), ('human',
'JJ'), ('embryos', 'NN'), ('.', '.')]
[('Human', 'NNP'), ('life', 'NN'), ('is', 'VBZ'), ('a', 'DT'), ('gift', 'NN'), ('f
rom', 'IN'), ('our', 'PRP$'), ('Creator', 'NNP'), ('--', ':'), ('and', 'CC'), ('th
at', 'IN'), ('gift', 'NN'), ('should', 'MD'), ('never', 'RB'), ('be', 'VB'), ('dis
carded', 'VBN'), (',', ','), ('devalued', 'VBD'), ('or', 'CC'), ('put', 'VB'), ('u
p', 'RP'), ('for', 'IN'), ('sale', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('expects', 'VBZ'), ('electe
d', 'VBN'), ('officials', 'NNS'), ('to', 'TO'), ('uphold', 'VB'), ('the', 'DT'),
('public', 'JJ'), ('trust', 'NN'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Honorable', 'JJ'), ('people', 'NNS'), ('in', 'IN'), ('both', 'DT'), ('parties',
'NNS'), ('are', 'VBP'), ('working', 'VBG'), ('on', 'IN'), ('reforms', 'NNS'), ('t
o', 'TO'), ('strengthen', 'VB'), ('the', 'DT'), ('ethical', 'JJ'), ('standards',
'NNS'), ('of', 'IN'), ('Washington', 'NNP'), ('--', ':'), ('I', 'PRP'), ('suppor
t', 'VBP'), ('your', 'PRP$'), ('efforts', 'NNS'), ('.', '.')]
[('Each', 'DT'), ('of', 'IN'), ('us', 'PRP'), ('has', 'VBZ'), ('made', 'VBN'),
('a', 'DT'), ('pledge', 'NN'), ('to', 'TO'), ('be', 'VB'), ('worthy', 'JJ'), ('o
f', 'IN'), ('public', 'JJ'), ('responsibility', 'NN'), ('--', ':'), ('and', 'CC'),
('that', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('pledge', 'NN'), ('we', 'PRP'), ('mus
t', 'MD'), ('never', 'RB'), ('forget', 'VB'), (',', ','), ('never', 'RB'), ('dismi
ss', 'NN'), (',', ','), ('and', 'CC'), ('never', 'RB'), ('betray', 'NN'), ('.',
'.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('As', 'IN'), ('we', 'PRP'), ('renew', 'VBP'), ('the', 'DT'), ('promise', 'NN'),
('of', 'IN'), ('our', 'PRP$'), ('institutions', 'NNS'), (',', ','), ('let', 'VB'),
('us', 'PRP'), ('also', 'RB'), ('show', 'VBP'), ('the', 'DT'), ('character', 'N
N'), ('of', 'IN'), ('America', 'NNP'), ('in', 'IN'), ('our', 'PRP$'), ('compassio
n', 'NN'), ('and', 'CC'), ('care', 'NN'), ('for', 'IN'), ('one', 'CD'), ('anothe
r', 'DT'), ('.', '.')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('gives', 'VBZ'), ('special',
'JJ'), ('attention', 'NN'), ('to', 'TO'), ('children', 'NNS'), ('who', 'WP'), ('la
ck', 'VBP'), ('direction', 'NN'), ('and', 'CC'), ('love', 'NN'), ('.', '.')]
[('Through', 'IN'), ('the', 'DT'), ('Helping', 'NNP'), ('America', 'NNP'), ("'s",
'POS'), ('Youth', 'NNP'), ('Initiative', 'NNP'), (',', ','), ('we', 'PRP'), ('ar
e', 'VBP'), ('encouraging', 'VBG'), ('caring', 'VBG'), ('adults', 'NNS'), ('to',
'TO'), ('get', 'VB'), ('involved', 'VBN'), ('in', 'IN'), ('the', 'DT'), ('life',
'NN'), ('of', 'IN'), ('a', 'DT'), ('child', 'NN'), ('--', ':'), ('and', 'CC'), ('t
his', 'DT'), ('good', 'JJ'), ('work', 'NN'), ('is', 'VBZ'), ('being', 'VBG'), ('le
d', 'VBN'), ('by', 'IN'), ('our', 'PRP$'), ('First', 'NNP'), ('Lady', 'NNP'),
(',', ','), ('Laura', 'NNP'), ('Bush', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('This', 'DT'), ('year', 'NN'), ('we', 'PRP'), ('will', 'MD'), ('add', 'VB'), ('r
esources', 'NNS'), ('to', 'TO'), ('encourage', 'VB'), ('young', 'JJ'), ('people',
'NNS'), ('to', 'TO'), ('stay', 'VB'), ('in', 'IN'), ('school', 'NN'), (',', ','),
('so', 'RB'), ('more', 'JJR'), ('of', 'IN'), ('America', 'NNP'), ("'s", 'POS'),
('youth', 'NN'), ('can', 'MD'), ('raise', 'VB'), ('their', 'PRP$'), ('sights', 'NN
S'), ('and', 'CC'), ('achieve', 'VBP'), ('their', 'PRP$'), ('dreams', 'NNS'),
('.', '.')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('comes', 'VBZ'), ('to', 'T
O'), ('the', 'DT'), ('aid', 'NN'), ('of', 'IN'), ('fellow', 'JJ'), ('citizens', 'N
NS'), ('in', 'IN'), ('times', 'NNS'), ('of', 'IN'), ('suffering', 'NN'), ('and',
'CC'), ('emergency', 'NN'), ('--', ':'), ('and', 'CC'), ('stays', 'NNS'), ('at',
'IN'), ('it', 'PRP'), ('until', 'IN'), ('they', 'PRP'), ("'re", 'VBP'), ('back',
'RB'), ('on', 'IN'), ('their', 'PRP$'), ('feet', 'NNS'), ('.', '.')]
[('So', 'RB'), ('far', 'RB'), ('the', 'DT'), ('federal', 'JJ'), ('government', 'N
N'), ('has', 'VBZ'), ('committed', 'VBN'), ('$', '$'), ('85', 'CD'), ('billion',
'CD'), ('to', 'TO'), ('the', 'DT'), ('people', 'NNS'), ('of', 'IN'), ('the', 'D
T'), ('Gulf', 'NNP'), ('Coast', 'NNP'), ('and', 'CC'), ('New', 'NNP'), ('Orleans',
'NNP'), ('.', '.')]
[('We', 'PRP'), ("'re", 'VBP'), ('removing', 'VBG'), ('debris', 'NN'), ('and', 'C
C'), ('repairing', 'NN'), ('highways', 'NNS'), ('and', 'CC'), ('rebuilding', 'VB
G'), ('stronger', 'JJR'), ('levees', 'NNS'), ('.', '.')]
[('We', 'PRP'), ("'re", 'VBP'), ('providing', 'VBG'), ('business', 'NN'), ('loan
s', 'NNS'), ('and', 'CC'), ('housing', 'NN'), ('assistance', 'NN'), ('.', '.')]
[('Yet', 'RB'), ('as', 'IN'), ('we', 'PRP'), ('meet', 'VBP'), ('these', 'DT'), ('i
mmediate', 'JJ'), ('needs', 'NNS'), (',', ','), ('we', 'PRP'), ('must', 'MD'), ('a
lso', 'RB'), ('address', 'VB'), ('deeper', 'JJR'), ('challenges', 'NNS'), ('that',
'WDT'), ('existed', 'VBD'), ('before', 'IN'), ('the', 'DT'), ('storm', 'NN'), ('ar
rived', 'VBD'), ('.', '.')]
[('In', 'IN'), ('New', 'NNP'), ('Orleans', 'NNP'), ('and', 'CC'), ('in', 'IN'),
('other', 'JJ'), ('places', 'NNS'), (',', ','), ('many', 'JJ'), ('of', 'IN'), ('ou
r', 'PRP$'), ('fellow', 'JJ'), ('citizens', 'NNS'), ('have', 'VBP'), ('felt', 'VB
N'), ('excluded', 'VBN'), ('from', 'IN'), ('the', 'DT'), ('promise', 'NN'), ('of',
'IN'), ('our', 'PRP$'), ('country', 'NN'), ('.', '.')]
[('The', 'DT'), ('answer', 'NN'), ('is', 'VBZ'), ('not', 'RB'), ('only', 'RB'),
('temporary', 'JJ'), ('relief', 'NN'), (',', ','), ('but', 'CC'), ('schools', 'NN
S'), ('that', 'WDT'), ('teach', 'VBP'), ('every', 'DT'), ('child', 'NN'), (',',
','), ('and', 'CC'), ('job', 'NN'), ('skills', 'NNS'), ('that', 'IN'), ('bring',
'VBG'), ('upward', 'JJ'), ('mobility', 'NN'), (',', ','), ('and', 'CC'), ('more',
'JJR'), ('opportunities', 'NNS'), ('to', 'TO'), ('own', 'VB'), ('a', 'DT'), ('hom
e', 'NN'), ('and', 'CC'), ('start', 'VB'), ('a', 'DT'), ('business', 'NN'), ('.',
'.')]
[('As', 'IN'), ('we', 'PRP'), ('recover', 'VBP'), ('from', 'IN'), ('a', 'DT'), ('d
isaster', 'NN'), (',', ','), ('let', 'VB'), ('us', 'PRP'), ('also', 'RB'), ('wor
k', 'NN'), ('for', 'IN'), ('the', 'DT'), ('day', 'NN'), ('when', 'WRB'), ('all',
'DT'), ('Americans', 'NNPS'), ('are', 'VBP'), ('protected', 'VBN'), ('by', 'IN'),
('justice', 'NN'), (',', ','), ('equal', 'JJ'), ('in', 'IN'), ('hope', 'NN'),
(',', ','), ('and', 'CC'), ('rich', 'JJ'), ('in', 'IN'), ('opportunity', 'NN'),
('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('A', 'DT'), ('hopeful', 'JJ'), ('society', 'NN'), ('acts', 'NNS'), ('boldly', 'R
B'), ('to', 'TO'), ('fight', 'VB'), ('diseases', 'NNS'), ('like', 'IN'), ('HIV/AID
S', 'NNP'), (',', ','), ('which', 'WDT'), ('can', 'MD'), ('be', 'VB'), ('prevente
d', 'VBN'), (',', ','), ('and', 'CC'), ('treated', 'VBD'), (',', ','), ('and', 'C
C'), ('defeated', 'VBD'), ('.', '.')]
[('More', 'JJR'), ('than', 'IN'), ('a', 'DT'), ('million', 'CD'), ('Americans', 'N
NPS'), ('live', 'VBP'), ('with', 'IN'), ('HIV', 'NNP'), (',', ','), ('and', 'CC'),
('half', 'NN'), ('of', 'IN'), ('all', 'DT'), ('AIDS', 'NNP'), ('cases', 'NNS'),
('occur', 'VBP'), ('among', 'IN'), ('African', 'JJ'), ('Americans', 'NNPS'), ('.',
'.')]
[('I', 'PRP'), ('ask', 'VBP'), ('Congress', 'NNP'), ('to', 'TO'), ('reform', 'V
B'), ('and', 'CC'), ('reauthorize', 'VB'), ('the', 'DT'), ('Ryan', 'NNP'), ('Whit
e', 'NNP'), ('Act', 'NNP'), (',', ','), ('and', 'CC'), ('provide', 'VB'), ('new',
'JJ'), ('funding', 'NN'), ('to', 'TO'), ('states', 'NNS'), (',', ','), ('so', 'I
N'), ('we', 'PRP'), ('end', 'VBP'), ('the', 'DT'), ('waiting', 'NN'), ('lists', 'N
NS'), ('for', 'IN'), ('AIDS', 'NNP'), ('medicines', 'NNS'), ('in', 'IN'), ('Americ
a', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('We', 'PRP'), ('will', 'MD'), ('also', 'RB'), ('lead', 'VB'), ('a', 'DT'), ('nat
ionwide', 'JJ'), ('effort', 'NN'), (',', ','), ('working', 'VBG'), ('closely', 'R
B'), ('with', 'IN'), ('African', 'JJ'), ('American', 'JJ'), ('churches', 'NNS'),
('and', 'CC'), ('faith-based', 'JJ'), ('groups', 'NNS'), (',', ','), ('to', 'TO'),
('deliver', 'VB'), ('rapid', 'JJ'), ('HIV', 'NNP'), ('tests', 'NNS'), ('to', 'T
O'), ('millions', 'NNS'), (',', ','), ('end', 'VBP'), ('the', 'DT'), ('stigma', 'N
N'), ('of', 'IN'), ('AIDS', 'NNP'), (',', ','), ('and', 'CC'), ('come', 'VB'), ('c
loser', 'JJR'), ('to', 'TO'), ('the', 'DT'), ('day', 'NN'), ('when', 'WRB'), ('the
re', 'EX'), ('are', 'VBP'), ('no', 'DT'), ('new', 'JJ'), ('infections', 'NNS'),
('in', 'IN'), ('America', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]
[('Fellow', 'NNP'), ('citizens', 'NNS'), (',', ','), ('we', 'PRP'), ("'ve", 'VB
P'), ('been', 'VBN'), ('called', 'VBN'), ('to', 'TO'), ('leadership', 'NN'), ('i
n', 'IN'), ('a', 'DT'), ('period', 'NN'), ('of', 'IN'), ('consequence', 'NN'),
('.', '.')]
[('We', 'PRP'), ("'ve", 'VBP'), ('entered', 'VBN'), ('a', 'DT'), ('great', 'JJ'),
('ideological', 'JJ'), ('conflict', 'NN'), ('we', 'PRP'), ('did', 'VBD'), ('nothin
g', 'NN'), ('to', 'TO'), ('invite', 'VB'), ('.', '.')]
[('We', 'PRP'), ('see', 'VBP'), ('great', 'JJ'), ('changes', 'NNS'), ('in', 'IN'),
('science', 'NN'), ('and', 'CC'), ('commerce', 'NN'), ('that', 'WDT'), ('will', 'M
D'), ('influence', 'VB'), ('all', 'DT'), ('our', 'PRP$'), ('lives', 'NNS'), ('.',
'.')]
[('Sometimes', 'RB'), ('it', 'PRP'), ('can', 'MD'), ('seem', 'VB'), ('that', 'D
T'), ('history', 'NN'), ('is', 'VBZ'), ('turning', 'VBG'), ('in', 'IN'), ('a', 'D
T'), ('wide', 'JJ'), ('arc', 'NN'), (',', ','), ('toward', 'IN'), ('an', 'DT'),
('unknown', 'JJ'), ('shore', 'NN'), ('.', '.')]
[('Yet', 'CC'), ('the', 'DT'), ('destination', 'NN'), ('of', 'IN'), ('history', 'N
N'), ('is', 'VBZ'), ('determined', 'VBN'), ('by', 'IN'), ('human', 'JJ'), ('actio
n', 'NN'), (',', ','), ('and', 'CC'), ('every', 'DT'), ('great', 'JJ'), ('movemen
t', 'NN'), ('of', 'IN'), ('history', 'NN'), ('comes', 'VBZ'), ('to', 'TO'), ('a',
'DT'), ('point', 'NN'), ('of', 'IN'), ('choosing', 'NN'), ('.', '.')]
[('Lincoln', 'NNP'), ('could', 'MD'), ('have', 'VB'), ('accepted', 'VBN'), ('peac
e', 'NN'), ('at', 'IN'), ('the', 'DT'), ('cost', 'NN'), ('of', 'IN'), ('disunity',
'NN'), ('and', 'CC'), ('continued', 'JJ'), ('slavery', 'NN'), ('.', '.')]
[('Martin', 'NNP'), ('Luther', 'NNP'), ('King', 'NNP'), ('could', 'MD'), ('have',
'VB'), ('stopped', 'VBN'), ('at', 'IN'), ('Birmingham', 'NNP'), ('or', 'CC'), ('a
t', 'IN'), ('Selma', 'NNP'), (',', ','), ('and', 'CC'), ('achieved', 'VBD'), ('onl
y', 'RB'), ('half', 'PDT'), ('a', 'DT'), ('victory', 'NN'), ('over', 'IN'), ('segr
egation', 'NN'), ('.', '.')]
[('The', 'DT'), ('United', 'NNP'), ('States', 'NNPS'), ('could', 'MD'), ('have',
'VB'), ('accepted', 'VBN'), ('the', 'DT'), ('permanent', 'JJ'), ('division', 'N
N'), ('of', 'IN'), ('Europe', 'NNP'), (',', ','), ('and', 'CC'), ('been', 'VBN'),
('complicit', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('oppression', 'NN'), ('of', 'I
N'), ('others', 'NNS'), ('.', '.')]
[('Today', 'NN'), (',', ','), ('having', 'VBG'), ('come', 'VBN'), ('far', 'RB'),
('in', 'IN'), ('our', 'PRP$'), ('own', 'JJ'), ('historical', 'JJ'), ('journey', 'N
N'), (',', ','), ('we', 'PRP'), ('must', 'MD'), ('decide', 'VB'), (':', ':'), ('Wi
ll', 'MD'), ('we', 'PRP'), ('turn', 'VB'), ('back', 'RP'), (',', ','), ('or', 'C
C'), ('finish', 'VB'), ('well', 'RB'), ('?', '.')]
[('Before', 'IN'), ('history', 'NN'), ('is', 'VBZ'), ('written', 'VBN'), ('down',
'RP'), ('in', 'IN'), ('books', 'NNS'), (',', ','), ('it', 'PRP'), ('is', 'VBZ'),
('written', 'VBN'), ('in', 'IN'), ('courage', 'NN'), ('.', '.')]
[('Like', 'IN'), ('Americans', 'NNPS'), ('before', 'IN'), ('us', 'PRP'), (',',
','), ('we', 'PRP'), ('will', 'MD'), ('show', 'VB'), ('that', 'DT'), ('courage',
'NN'), ('and', 'CC'), ('we', 'PRP'), ('will', 'MD'), ('finish', 'VB'), ('well', 'R
B'), ('.', '.')]
[('We', 'PRP'), ('will', 'MD'), ('lead', 'VB'), ('freedom', 'NN'), ("'s", 'POS'),
('advance', 'NN'), ('.', '.')]
[('We', 'PRP'), ('will', 'MD'), ('compete', 'VB'), ('and', 'CC'), ('excel', 'VB'),
('in', 'IN'), ('the', 'DT'), ('global', 'JJ'), ('economy', 'NN'), ('.', '.')]
[('We', 'PRP'), ('will', 'MD'), ('renew', 'VB'), ('the', 'DT'), ('defining', 'VB
G'), ('moral', 'JJ'), ('commitments', 'NNS'), ('of', 'IN'), ('this', 'DT'), ('lan
d', 'NN'), ('.', '.')]
[('And', 'CC'), ('so', 'RB'), ('we', 'PRP'), ('move', 'VBP'), ('forward', 'RB'),
('--', ':'), ('optimistic', 'JJ'), ('about', 'IN'), ('our', 'PRP$'), ('country',
'NN'), (',', ','), ('faithful', 'JJ'), ('to', 'TO'), ('its', 'PRP$'), ('cause', 'N
N'), (',', ','), ('and', 'CC'), ('confident', 'NN'), ('of', 'IN'), ('the', 'DT'),
('victories', 'NNS'), ('to', 'TO'), ('come', 'VB'), ('.', '.')]
[('May', 'NNP'), ('God', 'NNP'), ('bless', 'NN'), ('America', 'NNP'), ('.', '.')]
[('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')]

In [50]: import os

In [52]: from nltk.corpus import stopwords

In [54]: stop_words1 = set(stopwords.words('odia'))  # 'odia' is not a file in the NLTK stopwords corpus here, so this raises OSError; a shipped language such as 'english' works

print(stop_words1)


---------------------------------------------------------------------------
OSError Traceback (most recent call last)
Cell In[54], line 1
----> 1 stop_words1 = set(stopwords.words('odia')) # here english language. you m
ay do in other language
3 print(stop_words1)

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\corpus\reader\wordlist.py:21,
in WordListCorpusReader.words(self, fileids, ignore_lines_startswith)
18 def words(self, fileids=None, ignore_lines_startswith="\n"):
19 return [
20 line
---> 21 for line in line_tokenize(self.raw(fileids))
22 if not line.startswith(ignore_lines_startswith)
23 ]

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\corpus\reader\api.py:218, in
CorpusReader.raw(self, fileids)
216 contents = []
217 for f in fileids:
--> 218 with self.open(f) as fp:
219 contents.append(fp.read())
220 return concat(contents)

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\corpus\reader\api.py:231, in
CorpusReader.open(self, file)
223 """
224 Return an open stream that can be used to read the given file.
225 If the file's encoding is not None, then the stream will
(...)
228 :param file: The file identifier of the file to read.
229 """
230 encoding = self.encoding(file)
--> 231 stream = self._root.join(file).open(encoding)
232 return stream

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\data.py:334, in FileSystemPathPointer.join(self, fileid)
332 def join(self, fileid):
333 _path = os.path.join(self._path, fileid)
--> 334 return FileSystemPathPointer(_path)

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\compat.py:41, in py3_data.<locals>._decorator(*args, **kwargs)
39 def _decorator(*args, **kwargs):
40 args = (args[0], add_py3_data(args[1])) + args[2:]
---> 41 return init_func(*args, **kwargs)

File C:\ProgramData\anaconda3\Lib\site-packages\nltk\data.py:312, in FileSystemPathPointer.__init__(self, _path)
310 _path = os.path.abspath(_path)
311 if not os.path.exists(_path):
--> 312 raise OSError("No such file or directory: %r" % _path)
313 self._path = _path

OSError: No such file or directory: 'C:\\Users\\dhara\\AppData\\Roaming\\nltk_data\\corpora\\stopwords\\odia'
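
The OSError above is raised because no file named 'odia' exists in the downloaded stopwords corpus. One way to guard against this is to check the requested language against the languages that are actually present before loading. The following is only a minimal sketch: the helper name `pick_stopword_language` and the short sample list are illustrative assumptions, not part of NLTK. With NLTK installed, the real list of shipped languages would come from `stopwords.fileids()`.

```python
def pick_stopword_language(requested, available, fallback="english"):
    """Return `requested` if it is available, otherwise `fallback`.

    Hypothetical helper for illustration only; `available` is the list of
    language names that the stopwords corpus actually ships.
    """
    if requested in available:
        return requested
    if fallback in available:
        return fallback
    raise ValueError(f"neither {requested!r} nor {fallback!r} is available")


# Illustrative subset only. With NLTK installed one would instead use:
#   from nltk.corpus import stopwords
#   available = stopwords.fileids()
available = ["english", "french", "german"]

language = pick_stopword_language("odia", available)
print(language)
```

The chosen name can then be passed to `stopwords.words(language)` in place of the hard-coded 'odia'.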

4/23/24, 1:23 PM NLP.02.04.24

Parts of speech (POS) tagging

POS tagging is a linguistic task in NLP in which each word in a document is assigned a part of
speech (grammatical category), such as adverb, adjective, or noun.

In NLP applications, POS tagging is useful for tasks such as machine translation.

Example: "mubin bring an book from library and gave to dhara." This sentence has grammatical
errors; the corrected words are "brings", "a", "the", and "gives".

In [1]: import nltk

In [2]: text = " The indian Criket Team saw a transition after the 2811 world cup. Senior p

In [3]: from nltk.tokenize import word_tokenize

In [4]: result = word_tokenize(text)

In [5]: print(result)

['The', 'indian', 'Criket', 'Team', 'saw', 'a', 'transition', 'after', 'the', '281
1', 'world', 'cup', '.', 'Senior', 'players', 'were', 'dropped', 'and', 'younster
s', 'were', 'given', 'the', 'chances']

In [6]: nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\dhara\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
Out[6]: True

Write a Python program for POS tagging

In [7]: final = nltk.pos_tag(result)

In [8]: print(final)

[('The', 'DT'), ('indian', 'JJ'), ('Criket', 'NNP'), ('Team', 'NNP'), ('saw', 'VB
D'), ('a', 'DT'), ('transition', 'NN'), ('after', 'IN'), ('the', 'DT'), ('2811',
'CD'), ('world', 'NN'), ('cup', 'NN'), ('.', '.'), ('Senior', 'JJ'), ('players',
'NNS'), ('were', 'VBD'), ('dropped', 'VBN'), ('and', 'CC'), ('younsters', 'NNS'),
('were', 'VBD'), ('given', 'VBN'), ('the', 'DT'), ('chances', 'NNS')]
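
A quick way to summarise a tagged sentence is to count how often each tag occurs. The sketch below does this with `collections.Counter` from the standard library. The list `final` is copied from the output above so that the example runs without NLTK; the variable name `tag_counts` is an illustrative choice.

```python
from collections import Counter

# Tagged tokens copied from the nltk.pos_tag output shown above.
final = [('The', 'DT'), ('indian', 'JJ'), ('Criket', 'NNP'), ('Team', 'NNP'),
         ('saw', 'VBD'), ('a', 'DT'), ('transition', 'NN'), ('after', 'IN'),
         ('the', 'DT'), ('2811', 'CD'), ('world', 'NN'), ('cup', 'NN'),
         ('.', '.'), ('Senior', 'JJ'), ('players', 'NNS'), ('were', 'VBD'),
         ('dropped', 'VBN'), ('and', 'CC'), ('younsters', 'NNS'),
         ('were', 'VBD'), ('given', 'VBN'), ('the', 'DT'), ('chances', 'NNS')]

# Count the tag (second element) of every (word, tag) pair.
tag_counts = Counter(tag for _, tag in final)

# Print tags from most to least frequent.
for tag, count in tag_counts.most_common():
    print(tag, count)
```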

In [9]: nltk.download('tagsets')

[nltk_data] Downloading package tagsets to
[nltk_data]     C:\Users\dhara\AppData\Roaming\nltk_data...
[nltk_data]   Package tagsets is already up-to-date!
Out[9]: True

In [10]: nltk.help.upenn_tagset('DT')

DT: determiner
all an another any both del each either every half la many much nary
neither no some such that the them these this those

In [11]: nltk.help.upenn_tagset('JJ')

JJ: adjective or numeral, ordinal
third ill-mannered pre-war regrettable oiled calamitous first separable
ectoplasmic battery-powered participatory fourth still-to-be-named
multilingual multi-disciplinary ...

In [12]: nltk.help.upenn_tagset('VBD')

VBD: verb, past tense
dipped pleaded swiped regummed soaked tidied convened halted registered
cushioned exacted snubbed strode aimed adopted belied figgered
speculated wore appreciated contemplated ...

In [13]: nltk.help.upenn_tagset('CD')

CD: numeral, cardinal
mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one forty-
seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025
fifteen 271,124 dozen quintillion DM2,000 ...

In [14]: nltk.help.upenn_tagset('CC')

CC: conjunction, coordinating
& 'n and both but either et for less minus neither nor or plus so
therefore times v. versus vs. whether yet
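
The tag descriptions printed above can also be kept in a small lookup table so that tagged tokens are shown with a readable label. This is only a sketch: the dictionary `TAG_DESCRIPTIONS` holds just the five descriptions looked up above, and the function name `describe` and the 'unknown tag' placeholder are illustrative assumptions.

```python
# Descriptions copied from the nltk.help.upenn_tagset output shown above.
TAG_DESCRIPTIONS = {
    'DT': 'determiner',
    'JJ': 'adjective or numeral, ordinal',
    'VBD': 'verb, past tense',
    'CD': 'numeral, cardinal',
    'CC': 'conjunction, coordinating',
}


def describe(tagged_tokens, descriptions=TAG_DESCRIPTIONS):
    """Attach a description to each (word, tag) pair.

    Tags missing from the table are labelled 'unknown tag'.
    """
    return [(word, tag, descriptions.get(tag, 'unknown tag'))
            for word, tag in tagged_tokens]


# A few tokens taken from the tagged sentence in this practical.
sample = [('The', 'DT'), ('indian', 'JJ'), ('saw', 'VBD'),
          ('2811', 'CD'), ('and', 'CC'), ('cup', 'NN')]

for word, tag, meaning in describe(sample):
    print(f"{word:8} {tag:4} {meaning}")
```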

In [15]: # Print every word tagged as a determiner (DT)
         for word, tag in final:
             if tag == 'DT':
                 print(word)

The
a
the
the

In [16]: print(final)

[('The', 'DT'), ('indian', 'JJ'), ('Criket', 'NNP'), ('Team', 'NNP'), ('saw', 'VB
D'), ('a', 'DT'), ('transition', 'NN'), ('after', 'IN'), ('the', 'DT'), ('2811',
'CD'), ('world', 'NN'), ('cup', 'NN'), ('.', '.'), ('Senior', 'JJ'), ('players',
'NNS'), ('were', 'VBD'), ('dropped', 'VBN'), ('and', 'CC'), ('younsters', 'NNS'),
('were', 'VBD'), ('given', 'VBN'), ('the', 'DT'), ('chances', 'NNS')]

In [17]: # Print every word tagged as a singular proper noun (NNP)
         for word, tag in final:
             if tag == 'NNP':
                 print(word)

Criket
Team

In [18]: # Print every word tagged as a past-tense verb (VBD)
         for word, tag in final:
             if tag == 'VBD':
                 print(word)

saw
were
were

In [19]: # Print every word tagged as a plural noun (NNS)
         for word, tag in final:
             if tag == 'NNS':
                 print(word)

players
younsters
chances
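
The four cells above repeat the same loop with a different tag each time. A possible alternative is to group all words by tag in a single pass and then look up whichever tags are needed. The sketch below assumes a helper named `group_by_tag` (an illustrative name, not an NLTK function) and reuses the tagged list from the earlier output so that it runs without NLTK.

```python
from collections import defaultdict


def group_by_tag(tagged_tokens):
    """Return a dict mapping each tag to its words, in sentence order."""
    groups = defaultdict(list)
    for word, tag in tagged_tokens:
        groups[tag].append(word)
    return dict(groups)


# Tagged tokens copied from the nltk.pos_tag output shown earlier.
final = [('The', 'DT'), ('indian', 'JJ'), ('Criket', 'NNP'), ('Team', 'NNP'),
         ('saw', 'VBD'), ('a', 'DT'), ('transition', 'NN'), ('after', 'IN'),
         ('the', 'DT'), ('2811', 'CD'), ('world', 'NN'), ('cup', 'NN'),
         ('.', '.'), ('Senior', 'JJ'), ('players', 'NNS'), ('were', 'VBD'),
         ('dropped', 'VBN'), ('and', 'CC'), ('younsters', 'NNS'),
         ('were', 'VBD'), ('given', 'VBN'), ('the', 'DT'), ('chances', 'NNS')]

groups = group_by_tag(final)

# Show the same four tags that the cells above filter one at a time.
for tag in ('DT', 'NNP', 'VBD', 'NNS'):
    print(tag, groups.get(tag, []))
```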

spaCy
spaCy (yes, spelled with a lowercase “s” and uppercase “C”) is a natural language processing
framework. Natural language processing, or NLP, is a branch of linguistics that seeks to parse
human language in a computer system. This field is generally referred to as computational
linguistics, though it has far-reaching applications beyond academic linguistic research.
In spaCy, spans can be given more specificity by classifying them into different groups; these
groups are stored in SpanGroup containers.

In [20]: !pip install spacy

Defaulting to user installation because normal site-packages is not writeable
Collecting spacy
Obtaining dependency information for spacy from https://files.pythonhosted.org/p
ackages/92/fb/d1f0605e1e8627226c6c96053fe1632e9a04a3fbcd8b5d715528cb95eb97/spacy-
3.7.4-cp311-cp311-win_amd64.whl.metadata
Downloading spacy-3.7.4-cp311-cp311-win_amd64.whl.metadata (27 kB)
Collecting spacy-legacy<3.1.0,>=3.0.11 (from spacy)
Obtaining dependency information for spacy-legacy<3.1.0,>=3.0.11 from https://fi
les.pythonhosted.org/packages/c3/55/12e842c70ff8828e34e543a2c7176dac4da006ca6901c9
e8b43efab8bc6b/spacy_legacy-3.0.12-py2.py3-none-any.whl.metadata
Downloading spacy_legacy-3.0.12-py2.py3-none-any.whl.metadata (2.8 kB)
Collecting spacy-loggers<2.0.0,>=1.0.0 (from spacy)
Obtaining dependency information for spacy-loggers<2.0.0,>=1.0.0 from https://fi
les.pythonhosted.org/packages/33/78/d1a1a026ef3af911159398c939b1509d5c36fe524c7b64
4f34a5146c4e16/spacy_loggers-1.0.5-py3-none-any.whl.metadata
Downloading spacy_loggers-1.0.5-py3-none-any.whl.metadata (23 kB)
Collecting murmurhash<1.1.0,>=0.28.0 (from spacy)
Obtaining dependency information for murmurhash<1.1.0,>=0.28.0 from https://file
s.pythonhosted.org/packages/71/46/af01a20ec368bd9cb49a1d2df15e3eca113bbf6952cc1f2a
47f1c6801a7f/murmurhash-1.0.10-cp311-cp311-win_amd64.whl.metadata
Downloading murmurhash-1.0.10-cp311-cp311-win_amd64.whl.metadata (2.0 kB)
Collecting cymem<2.1.0,>=2.0.2 (from spacy)
Obtaining dependency information for cymem<2.1.0,>=2.0.2 from https://files.pyth
onhosted.org/packages/c1/c3/dd044e6f62a3d317c461f6f0c153c6573ed13025752d779e514000
c15dd2/cymem-2.0.8-cp311-cp311-win_amd64.whl.metadata
Downloading cymem-2.0.8-cp311-cp311-win_amd64.whl.metadata (8.6 kB)
Collecting preshed<3.1.0,>=3.0.2 (from spacy)
Obtaining dependency information for preshed<3.1.0,>=3.0.2 from https://files.py
thonhosted.org/packages/e4/fc/78cdbdb79f5d6d45949e72c32445d6c060977ad50a1dcfc03926
22165f7c/preshed-3.0.9-cp311-cp311-win_amd64.whl.metadata
Downloading preshed-3.0.9-cp311-cp311-win_amd64.whl.metadata (2.2 kB)
Collecting thinc<8.3.0,>=8.2.2 (from spacy)
Obtaining dependency information for thinc<8.3.0,>=8.2.2 from https://files.pyth
onhosted.org/packages/de/a5/c242d57dc7a8afe677aa48ce370d84be3d04523cbb819c4a36b64f
35155c/thinc-8.2.3-cp311-cp311-win_amd64.whl.metadata
Downloading thinc-8.2.3-cp311-cp311-win_amd64.whl.metadata (15 kB)
Collecting wasabi<1.2.0,>=0.9.1 (from spacy)
Obtaining dependency information for wasabi<1.2.0,>=0.9.1 from https://files.pyt
honhosted.org/packages/8f/69/26cbf0bad11703241cb84d5324d868097f7a8faf2f1888354dac8
883f3fc/wasabi-1.1.2-py3-none-any.whl.metadata
Downloading wasabi-1.1.2-py3-none-any.whl.metadata (28 kB)
Collecting srsly<3.0.0,>=2.4.3 (from spacy)
Obtaining dependency information for srsly<3.0.0,>=2.4.3 from https://files.pyth
onhosted.org/packages/eb/f5/e3f29993f673d91623df6413ba64e815dd2676fd7932cbc5e73474
02ddae/srsly-2.4.8-cp311-cp311-win_amd64.whl.metadata
Downloading srsly-2.4.8-cp311-cp311-win_amd64.whl.metadata (20 kB)
Collecting catalogue<2.1.0,>=2.0.6 (from spacy)
Obtaining dependency information for catalogue<2.1.0,>=2.0.6 from https://files.
pythonhosted.org/packages/9e/96/d32b941a501ab566a16358d68b6eb4e4acc373fab3c3c4d7d9
e649f7b4bb/catalogue-2.0.10-py3-none-any.whl.metadata
Downloading catalogue-2.0.10-py3-none-any.whl.metadata (14 kB)
Collecting weasel<0.4.0,>=0.1.0 (from spacy)
Obtaining dependency information for weasel<0.4.0,>=0.1.0 from https://files.pyt
honhosted.org/packages/d5/e5/b63b8e255d89ba4155972990d42523251d4d1368c4906c646597f
63870e2/weasel-0.3.4-py3-none-any.whl.metadata
Downloading weasel-0.3.4-py3-none-any.whl.metadata (4.7 kB)
Collecting typer<0.10.0,>=0.3.0 (from spacy)
Obtaining dependency information for typer<0.10.0,>=0.3.0 from https://files.pyt
honhosted.org/packages/62/39/82c9d3e10979851847361d922a373bdfef4091020da7f893acfaf
07c0225/typer-0.9.4-py3-none-any.whl.metadata
Downloading typer-0.9.4-py3-none-any.whl.metadata (14 kB)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in c:\programdata\anaconda
3\lib\site-packages (from spacy) (5.2.1)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in c:\programdata\anaconda3\lib
\site-packages (from spacy) (4.65.0)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in c:\programdata\anaconda3
\lib\site-packages (from spacy) (2.31.0)
Collecting pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 (from spacy)
Obtaining dependency information for pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 from h
ttps://files.pythonhosted.org/packages/16/ca/330c4f3bd983bb24ac12c7fd1e08c26c8aed7
0bc64498cf38c770321067f/pydantic-2.7.0-py3-none-any.whl.metadata
Downloading pydantic-2.7.0-py3-none-any.whl.metadata (103 kB)
Requirement already satisfied: jinja2 in c:\programdata\anaconda3\lib\site-package
s (from spacy) (3.1.2)
Requirement already satisfied: setuptools in c:\programdata\anaconda3\lib\site-pac
kages (from spacy) (68.0.0)
Requirement already satisfied: packaging>=20.0 in c:\programdata\anaconda3\lib\sit
e-packages (from spacy) (23.0)
Collecting langcodes<4.0.0,>=3.2.0 (from spacy)
Obtaining dependency information for langcodes<4.0.0,>=3.2.0 from https://files.
pythonhosted.org/packages/fe/c3/0d04d248624a181e57c2870127dfa8d371973561caf54333c8
5e8f9133a2/langcodes-3.3.0-py3-none-any.whl.metadata
Downloading langcodes-3.3.0-py3-none-any.whl.metadata (29 kB)
Requirement already satisfied: numpy>=1.19.0 in c:\programdata\anaconda3\lib\site-
packages (from spacy) (1.24.3)
Collecting annotated-types>=0.4.0 (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spac
y)
Obtaining dependency information for annotated-types>=0.4.0 from https://files.p
ythonhosted.org/packages/28/78/d31230046e58c207284c6b2c4e8d96e6d3cb4e52354721b944d
3e1ee4aa5/annotated_types-0.6.0-py3-none-any.whl.metadata
Downloading annotated_types-0.6.0-py3-none-any.whl.metadata (12 kB)
Collecting pydantic-core==2.18.1 (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spac
y)
Obtaining dependency information for pydantic-core==2.18.1 from https://files.py
thonhosted.org/packages/65/9c/04371826c287b9e0233b2a7c910ea0275a41d6a9574e186a43ea
d32cd22c/pydantic_core-2.18.1-cp311-none-win_amd64.whl.metadata
Downloading pydantic_core-2.18.1-cp311-none-win_amd64.whl.metadata (6.7 kB)
Requirement already satisfied: typing-extensions>=4.6.1 in c:\programdata\anaconda
3\lib\site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (4.7.1)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\programdata\anaconda
3\lib\site-packages (from requests<3.0.0,>=2.13.0->spacy) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in c:\programdata\anaconda3\lib\site-p
ackages (from requests<3.0.0,>=2.13.0->spacy) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\programdata\anaconda3\lib
\site-packages (from requests<3.0.0,>=2.13.0->spacy) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib
\site-packages (from requests<3.0.0,>=2.13.0->spacy) (2023.7.22)
Collecting blis<0.8.0,>=0.7.8 (from thinc<8.3.0,>=8.2.2->spacy)
Obtaining dependency information for blis<0.8.0,>=0.7.8 from https://files.pytho
nhosted.org/packages/2f/09/da0592c74560cc33396504698122f7a56747c82a5e072ca7d2c3397
898e1/blis-0.7.11-cp311-cp311-win_amd64.whl.metadata
Downloading blis-0.7.11-cp311-cp311-win_amd64.whl.metadata (7.6 kB)
Collecting confection<1.0.0,>=0.0.1 (from thinc<8.3.0,>=8.2.2->spacy)
Obtaining dependency information for confection<1.0.0,>=0.0.1 from https://file
s.pythonhosted.org/packages/39/78/f9d18da7b979a2e6007bfcea2f3c8cc02ed210538ae1ce7e
69092aed7b18/confection-0.1.4-py3-none-any.whl.metadata
Downloading confection-0.1.4-py3-none-any.whl.metadata (19 kB)
Requirement already satisfied: colorama in c:\programdata\anaconda3\lib\site-packa
ges (from tqdm<5.0.0,>=4.38.0->spacy) (0.4.6)
Requirement already satisfied: click<9.0.0,>=7.1.1 in c:\programdata\anaconda3\lib
\site-packages (from typer<0.10.0,>=0.3.0->spacy) (8.0.4)
Collecting cloudpathlib<0.17.0,>=0.7.0 (from weasel<0.4.0,>=0.1.0->spacy)
Obtaining dependency information for cloudpathlib<0.17.0,>=0.7.0 from https://fi
les.pythonhosted.org/packages/0f/6e/45b57a7d4573d85d0b0a39d99673dc1f5eea9d92a1a460
3b35e968fbf89a/cloudpathlib-0.16.0-py3-none-any.whl.metadata
Downloading cloudpathlib-0.16.0-py3-none-any.whl.metadata (14 kB)

Requirement already satisfied: MarkupSafe>=2.0 in c:\programdata\anaconda3\lib\sit
e-packages (from jinja2->spacy) (2.1.1)
Downloading spacy-3.7.4-cp311-cp311-win_amd64.whl (12.1 MB)
Downloading catalogue-2.0.10-py3-none-any.whl (17 kB)
Downloading cymem-2.0.8-cp311-cp311-win_amd64.whl (39 kB)
Downloading langcodes-3.3.0-py3-none-any.whl (181 kB)
Downloading murmurhash-1.0.10-cp311-cp311-win_amd64.whl (25 kB)
Downloading preshed-3.0.9-cp311-cp311-win_amd64.whl (122 kB)
Downloading pydantic-2.7.0-py3-none-any.whl (407 kB)
Downloading pydantic_core-2.18.1-cp311-none-win_amd64.whl (1.9 MB)
Downloading spacy_legacy-3.0.12-py2.py3-none-any.whl (29 kB)
Downloading spacy_loggers-1.0.5-py3-none-any.whl (22 kB)
Downloading srsly-2.4.8-cp311-cp311-win_amd64.whl (479 kB)
Downloading thinc-8.2.3-cp311-cp311-win_amd64.whl (1.5 MB)
Downloading typer-0.9.4-py3-none-any.whl (45 kB)
Downloading wasabi-1.1.2-py3-none-any.whl (27 kB)
Downloading weasel-0.3.4-py3-none-any.whl (50 kB)
Downloading annotated_types-0.6.0-py3-none-any.whl (12 kB)
Downloading blis-0.7.11-cp311-cp311-win_amd64.whl (6.6 MB)
Downloading cloudpathlib-0.16.0-py3-none-any.whl (45 kB)
Downloading confection-0.1.4-py3-none-any.whl (35 kB)
Installing collected packages: cymem, wasabi, spacy-loggers, spacy-legacy, pydanti
c-core, murmurhash, langcodes, cloudpathlib, catalogue, blis, annotated-types, typ
er, srsly, pydantic, preshed, confection, weasel, thinc, spacy
Successfully installed annotated-types-0.6.0 blis-0.7.11 catalogue-2.0.10 cloudpat
hlib-0.16.0 confection-0.1.4 cymem-2.0.8 langcodes-3.3.0 murmurhash-1.0.10 preshed
-3.0.9 pydantic-2.7.0 pydantic-core-2.18.1 spacy-3.7.4 spacy-legacy-3.0.12 spacy-l
oggers-1.0.5 srsly-2.4.8 thinc-8.2.3 typer-0.9.4 wasabi-1.1.2 weasel-0.3.4
WARNING: The script weasel.exe is installed in 'C:\Users\dhara\AppData\Roaming\P
ython\Python311\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warnin
g, use --no-warn-script-location.
WARNING: The script spacy.exe is installed in 'C:\Users\dhara\AppData\Roaming\Py
thon\Python311\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warnin
g, use --no-warn-script-location.

In [21]: !python -m spacy download en_core_web_sm

Defaulting to user installation because normal site-packages is not writeable
Collecting en-core-web-sm==3.7.1
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_
web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
Requirement already satisfied: spacy<3.8.0,>=3.7.2 in c:\users\dhara\appdata\roami
ng\python\python311\site-packages (from en-core-web-sm==3.7.1) (3.7.4)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in c:\users\dhara\appda
ta\roaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-s
m==3.7.1) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in c:\users\dhara\appda
ta\roaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-s
m==3.7.1) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in c:\users\dhara\appdata
\roaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm=
=3.7.1) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in c:\users\dhara\appdata\roami
ng\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in c:\users\dhara\appdata\roa
ming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.
7.1) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in c:\users\dhara\appdata\roami
ng\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (8.2.3)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in c:\users\dhara\appdata\roam
ing\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (1.1.2)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in c:\users\dhara\appdata\roami
ng\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in c:\users\dhara\appdata\r
oaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==
3.7.1) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in c:\users\dhara\appdata\roam
ing\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (0.3.4)

Requirement already satisfied: typer<0.10.0,>=0.3.0 in c:\users\dhara\appdata\roam
ing\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.
1) (0.9.4)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in c:\programdata\anaconda
3\lib\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (5.2.1)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in c:\programdata\anaconda3\lib
\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (4.65.0)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in c:\programdata\anaconda3
\lib\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.31.0)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in c:\users\dh
ara\appdata\roaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-c
ore-web-sm==3.7.1) (2.7.0)
Requirement already satisfied: jinja2 in c:\programdata\anaconda3\lib\site-package
s (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.1.2)
Requirement already satisfied: setuptools in c:\programdata\anaconda3\lib\site-pac
kages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (68.0.0)
Requirement already satisfied: packaging>=20.0 in c:\programdata\anaconda3\lib\sit
e-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (23.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in c:\users\dhara\appdata\r
oaming\python\python311\site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==
3.7.1) (3.3.0)
Requirement already satisfied: numpy>=1.19.0 in c:\programdata\anaconda3\lib\site-
packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.24.3)
Requirement already satisfied: annotated-types>=0.4.0 in c:\users\dhara\appdata\ro
aming\python\python311\site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->s
pacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.6.0)
Requirement already satisfied: pydantic-core==2.18.1 in c:\users\dhara\appdata\roa
ming\python\python311\site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->sp
acy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.18.1)
Requirement already satisfied: typing-extensions>=4.6.1 in c:\programdata\anaconda
3\lib\site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.
2->en-core-web-sm==3.7.1) (4.7.1)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\programdata\anaconda
3\lib\site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-we
b-sm==3.7.1) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in c:\programdata\anaconda3\lib\site-p
ackages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1)
(3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\programdata\anaconda3\lib
\site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm=
=3.7.1) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib
\site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm=
=3.7.1) (2023.7.22)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in c:\users\dhara\appdata\roamin
g\python\python311\site-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=3.7.2->e
n-core-web-sm==3.7.1) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in c:\users\dhara\appdata
\roaming\python\python311\site-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=
3.7.2->en-core-web-sm==3.7.1) (0.1.4)
Requirement already satisfied: colorama in c:\programdata\anaconda3\lib\site-packa
ges (from tqdm<5.0.0,>=4.38.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.4.6)
Requirement already satisfied: click<9.0.0,>=7.1.1 in c:\programdata\anaconda3\lib
\site-packages (from typer<0.10.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.
7.1) (8.0.4)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in c:\users\dhara\appda
ta\roaming\python\python311\site-packages (from weasel<0.4.0,>=0.1.0->spacy<3.8.0,
>=3.7.2->en-core-web-sm==3.7.1) (0.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\programdata\anaconda3\lib\sit
e-packages (from jinja2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.1.1)
Installing collected packages: en-core-web-sm
Successfully installed en-core-web-sm-3.7.1
[+] Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')


In [22]: import spacy

In [24]: nlp = spacy.load("en_core_web_sm")

In [25]: nlp

Out[25]: <spacy.lang.en.English at 0x23c667bcf10>

Containers are spaCy objects that hold a large amount of data about a text. When we
analyze a text with spaCy, we create different container objects to do so.
spaCy provides several container types; we will focus on three: Doc, Span, and
Token.
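To make the relationship between these three containers concrete, here is a minimal pure-Python sketch. ToyDoc, ToySpan, and ToyToken are hypothetical stand-ins invented for this illustration and are not spaCy classes. They only mimic the idea that a Doc is a sequence of Token objects and that slicing a Doc gives a Span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyToken:
    """One item of a document: its text and its position (hypothetical class)."""
    text: str
    i: int


class ToySpan:
    """A contiguous slice of a ToyDoc (hypothetical class)."""

    def __init__(self, doc, start, end):
        self.doc, self.start, self.end = doc, start, end

    def __iter__(self):
        return iter(self.doc.tokens[self.start:self.end])

    def __len__(self):
        return self.end - self.start

    @property
    def text(self):
        return " ".join(token.text for token in self)


class ToyDoc:
    """A sequence of ToyToken objects built from pre-split words (hypothetical class)."""

    def __init__(self, words):
        self.tokens = [ToyToken(word, i) for i, word in enumerate(words)]

    def __len__(self):
        return len(self.tokens)

    def __iter__(self):
        return iter(self.tokens)

    def __getitem__(self, key):
        # An integer index yields a single token; a slice yields a span.
        if isinstance(key, slice):
            start, end, _ = key.indices(len(self.tokens))
            return ToySpan(self, start, end)
        return self.tokens[key]


toy_doc = ToyDoc(["good", "morning", "everyone", "!"])
toy_span = toy_doc[0:2]
print(len(toy_doc), toy_span.text, toy_doc[2].text)
```

The real spaCy containers carry far more information (linguistic annotations, whitespace, vectors), so this sketch should be read only as a picture of how the three levels nest.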

In [26]: import os

In [27]: cwd = os.getcwd()  # Get the current working directory (cwd)
files = os.listdir(cwd)  # Get all the files in that directory
print("Files in %r: %s" % (cwd, files))

Files in 'C:\\Users\\dhara': ['.anaconda', '.conda', '.condarc', '.continuum', '.i
dlerc', '.ipynb_checkpoints', '.ipython', '.jupyter', '.keras', '.matplotlib', '.m
s-ad', '.spyder-py3', '10.10.23.ipynb', 'Amazon_sales_data_vaishvi.ipynb', 'anacon
da3', 'AppData', 'Application Data', 'Contacts', 'Cookies', 'correlation 02.01.24.
ipynb', 'cv.12.10.23.ipynb', 'dhara_IAR14042_18.04.24_NLPexam.ipynb', 'Dhara_iar14
042_ML_20.03.24.ipynb', 'Documents', 'Downloads', 'Favorites', 'geo.12.10.23.ipyn
b', 'IntelGraphicsProfiles', 'Jedi', 'leukemia.25.12.ipynb', 'leukemia.26.12.ipyn
b', 'leukemia_detection_model.h5', 'Links', 'Local Settings', 'logistic reg.19.03.
24.ipynb', 'log_reg.pkl', 'ML 29.02.24.ipynb', 'ML 7.02.24.ipynb', 'ML polynomial
reg 28.12.23.ipynb', 'ML.31.01.24.ipynb', 'ML.cross_validation.03.04.24.ipynb', 'M
L.DENDOGRAM.09.04.24.ipynb', 'ML.roc_curve.28.03.24.ipynb', 'ML.SVM.03.04.24.ipyn
b', 'Music', 'My Documents', 'NetHood', 'NLP Practicals', 'NLP.02.04.24.ipynb', 'N
LP.27.03.24.ipynb', 'NLTK.16.01.23.ipynb', 'NTUSER.DAT', 'ntuser.dat.LOG1', 'ntuse
r.dat.LOG2', 'NTUSER.DAT{d15a58bc-c195-11ed-adb9-d5da1bd57bc0}.TM.blf', 'NTUSER.DA
T{d15a58bc-c195-11ed-adb9-d5da1bd57bc0}.TMContainer00000000000000000001.regtrans-m
s', 'NTUSER.DAT{d15a58bc-c195-11ed-adb9-d5da1bd57bc0}.TMContainer00000000000000000
002.regtrans-ms', 'ntuser.ini', 'OneDrive', 'PrintHood', 'Project_2_Budget_Data_An
alysis.ipynb', 'Recent', 'rv 14.12.23.ipynb', 'salary prediction-Copy1.ipynb', 'sa
lary prediction-Copy2.ipynb', 'salary prediction.ipynb', 'salary.pkl', 'salary_dha
ra.pkl', 'satellite', 'Saved Games', 'Searches', 'SendTo', 'Start Menu', 'Template
s', 'test.mp3', 'Untitled.1.ipynb', 'Untitled.ipynb', 'Untitled1.ipynb', 'Untitled
10.ipynb', 'Untitled11.ipynb', 'Untitled12.ipynb', 'Untitled13.ipynb', 'Untitled1
4.ipynb', 'Untitled15.ipynb', 'Untitled16.ipynb', 'Untitled17.ipynb', 'Untitled18.
ipynb', 'Untitled2.ipynb', 'Untitled3.ipynb', 'Untitled4.ipynb', 'Untitled5.ipyn
b', 'Untitled6.ipynb', 'Untitled7.ipynb', 'Untitled8.ipynb', 'Untitled9.ipynb', 'V
ideos']

In [28]: nlp1=spacy.load('en_core_web_sm')

In [29]: nlp1

Out[29]: <spacy.lang.en.English at 0x23c667b8f90>

In [34]: with open(r"C:\Users\dhara\Downloads\good morning everyone!.txt") as f:
    text = f.read()

In [35]: text

Out[35]: 'good morning everyone!'


Creating a Doc Container

What is a doc in spaCy? The 'doc' object contains all the information about the text:
the words, the whitespace, and so on. It is created by calling the pipeline on a string:
doc = nlp(text). A 'doc' can be iterated over token by token, and each token has a '.text'
attribute (an attribute, not a method) that returns its text, for example:
for token in doc: print(token.text)

In [36]: doc = nlp(text)

In [37]: doc

Out[37]: good morning everyone!

In [38]: print(len(doc))
print(len(text))

4
22

The text is the same, but the two lengths differ. Why? len(text) counts characters, while
len(doc) counts tokens. To see this, let's print the items of each object one by one.

In [41]: for token in text[:10]:
    print(token)

g
o
o
d

m
o
r
n
i

In [42]: for token in text.split()[:10]:
    print(token)

good
morning
everyone!

In [43]: words = text.split()[:10]

In [44]: words

Out[44]: ['good', 'morning', 'everyone!']

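In [47]: # doc has only 4 tokens, so the slice doc[5:8] is empty and the loop never runs.
# Compare from index 0 instead, up to the length of the shorter list (words).
for i, token in enumerate(doc[:len(words)]):
    print(f"SpaCy Token {i}:\n{token}\nWord Split {i}:\n{words[i]}\n\n")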
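The three views of the same text (characters, whitespace-separated words, and tokens) can be reproduced with the standard library alone. This is only a sketch: the regular expression below is an assumed, rough stand-in for a punctuation-aware tokenizer and is not how spaCy tokenizes.

```python
import re

text = "good morning everyone!"

characters = list(text)           # what len(text) counts
whitespace_words = text.split()   # punctuation stays glued to the word
# Runs of word characters, or any single character that is neither
# a word character nor whitespace (for example "!").
rough_tokens = re.findall(r"\w+|[^\w\s]", text)

print(len(characters), len(whitespace_words), len(rough_tokens))
print(whitespace_words[-1], rough_tokens[-2:])
```

Splitting on whitespace leaves 'everyone!' as one item, while a punctuation-aware tokenizer separates 'everyone' from '!', which accounts for the token count of 4 reported by len(doc).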

Sentence Boundary Detection (SBD)

In NLP, sentence boundary detection, or SBD, is the identification of sentences in a text.
Again, this may seem fairly easy to do with rules. One could use split("."), but in English the
period is also used to mark abbreviations, so a naive split on periods cuts sentences in the
wrong places.
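The problem with a period-based rule is easy to see on a small case. The sentence below is a made-up example, not taken from the notebook, and the snippet uses only plain string methods.

```python
text = "Dr. Smith arrived at 9 a.m. He left early."

# Naive rule: treat every period as the end of a sentence.
fragments = [part.strip() for part in text.split(".") if part.strip()]

for fragment in fragments:
    print(repr(fragment))
print(len(fragments))
```

The text holds two sentences, yet the naive rule returns four fragments because "Dr." and "a.m." also contain periods. This is the kind of error that iterating over doc.sents is meant to avoid.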

In [48]: for sent in doc.sents:
    print(sent)

good morning everyone!

In [50]: sentence1 = list(doc.sents)[0]
print(sentence1)

good morning everyone!

Token Attributes

Token objects carry many attributes that are essential for performing NLP in spaCy. We will
be working with a few of them, such as: .text, .head, .left_edge, .right_edge, .ent_type_,
.ent_iob_, .lemma_, .morph, .pos_, .dep_, .lang_

In [51]: token2 = sentence1[2]
print(token2)

everyone

In [52]: token2.text

Out[52]: 'everyone'

In [53]: token2.head

Out[53]: everyone

In [54]: token2.left_edge

Out[54]: good

In [55]: token2.right_edge

Out[55]: !

In [56]: token2.ent_type

Out[56]: 0

In [57]: token2.ent_type_

Out[57]: ''

The .ent_iob_ attribute holds the IOB code of the named entity tag. “B” means the token begins
an entity, “I” means it is inside an entity, “O” means it is outside an entity, and "" means no
entity tag is set.

In [58]: token2.ent_iob_

Out[58]: 'O'
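To see how the B, I, and O codes line up with multi-token entities, here is a small sketch in plain Python. The function iob_tags, the example sentence, and the hand-written entity spans are all hypothetical and chosen for illustration; they are not output from a spaCy model.

```python
def iob_tags(n_tokens, entities):
    """Return one IOB code per token.

    entities is a list of (start, end) token offsets, with end exclusive.
    """
    tags = ["O"] * n_tokens          # every token starts outside an entity
    for start, end in entities:
        tags[start] = "B"            # first token of the entity
        for i in range(start + 1, end):
            tags[i] = "I"            # remaining tokens of the entity
    return tags


tokens = ["Barack", "Obama", "visited", "New", "York", "City", "today"]
entities = [(0, 2), (3, 6)]          # hand-written spans, not model output
tags = iob_tags(len(tokens), entities)

for token, tag in zip(tokens, tags):
    print(token, tag)
```

In the notebook's sentence no token belongs to an entity, which is why token2.ent_iob_ is 'O' and token2.ent_type_ is an empty string.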

In [59]: token2.lemma_

Out[59]: 'everyone'


In [61]: sentence1[2].lemma_

Out[61]: 'everyone'

In [62]: sentence1[2].morph

Out[62]: Number=Sing|PronType=Ind

In [63]: token2.pos_

Out[63]: 'PRON'

In [64]: token2.dep_

Out[64]: 'ROOT'

In [66]: token2.lang_

Out[66]: 'en'

In [68]: for token in sentence1:
    print(token.text, token.pos_, token.dep_)

good ADJ amod
morning NOUN compound
everyone PRON ROOT
! PUNCT punct
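The values of .head, .left_edge, and .right_edge seen earlier follow from the dependency tree. The sketch below rebuilds them in plain Python. The heads list is an assumption inferred from the dep labels printed above (spaCy derives it with its parser), and subtree, left_edge, and right_edge are helper functions written for this illustration, not spaCy functions.

```python
words = ["good", "morning", "everyone", "!"]
# Index of each token's head; the root points to itself.
# Assumed from the dep labels above: good -> morning (amod),
# morning -> everyone (compound), ! -> everyone (punct).
heads = [1, 2, 2, 2]


def subtree(i):
    """Indices of token i and all of its descendants, in document order."""
    nodes = {i}
    changed = True
    while changed:
        changed = False
        for child, head in enumerate(heads):
            if head in nodes and child not in nodes:
                nodes.add(child)
                changed = True
    return sorted(nodes)


def left_edge(i):
    """Leftmost word in the subtree of token i."""
    return words[subtree(i)[0]]


def right_edge(i):
    """Rightmost word in the subtree of token i."""
    return words[subtree(i)[-1]]


print(words[heads[2]], left_edge(2), right_edge(2))
print(left_edge(1), right_edge(1))
```

Because 'everyone' is the root, its subtree covers the whole sentence, so its left edge is 'good' and its right edge is '!', matching the notebook output. For 'morning' the subtree is only 'good morning'.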

In [69]: from spacy import displacy
displacy.render(sentence1, style="dep")

[Dependency diagram: arcs labeled amod and compound over the tokens good (ADJ), morning (NOUN), everyone (PRON)]

