NLP Unit 3&4
The Unit 5 notes PDF from the faculty covers all its topics, so this PDF does not
contain Unit 5 at all.
You can ignore the one uncovered topic or look it up yourself; treat it as
optional DIY.
Made by – Utkarsh M.
UNIT – 3
Information extraction is a challenging task that requires the use of
various techniques, including named entity recognition (NER), regular
expressions, and text matching, among others.
1. Initial processing
The first step is to break down a text into fragments such as zones, phrases,
segments, and tokens. This function can be performed by tokenizers, text zoners,
segmenters, and splitters, among other components.
In the initial processing stage, part-of-speech tagging and phrasal unit
identification (noun or verb phrases) are usually the next tasks.
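As a rough illustration, here is a minimal sketch of this initial-processing stage using NLTK (assuming the standard punkt and perceptron-tagger resources are installed):

```python
# A minimal sketch of initial processing with NLTK:
# sentence splitting, tokenization, and POS tagging.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

text = "Apple acquired a startup in London. The deal closed on Monday."

for sentence in nltk.sent_tokenize(text):   # split text into sentences
    tokens = nltk.word_tokenize(sentence)   # split each sentence into tokens
    print(nltk.pos_tag(tokens))             # tag each token with its part of speech
```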
2. Proper names identification
One of the most important stages in the information extraction chain is the
identification of various classes of proper names, such as names of people or
organizations, dates, monetary amounts, places, addresses, and so on.
They may be found in practically any sort of text and are widely used in the
extraction process.
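A minimal sketch of this stage with spaCy's pretrained pipeline (assuming the en_core_web_sm model is installed):

```python
# A minimal sketch of proper-name identification (NER) with spaCy.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Sundar Pichai joined Google in 2004 for a salary of $100,000.")

for ent in doc.ents:
    # prints each span with its class, e.g. PERSON, ORG, DATE, MONEY
    print(ent.text, ent.label_)
```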
3. Parsing
The syntactic analysis of the sentences in the texts is done at this step. After
recognizing the fundamental entities in the previous stage, the sentences are
processed to find the noun groups and verb groups that surround some of those
entities.
4. Extraction of events and relations
This stage establishes relations between the extracted concepts. This is
accomplished by developing and applying extraction rules that describe
various patterns.
The text is compared to certain patterns, and if a match is discovered, the text
element is labeled and retrieved later.
5. Coreference or Anaphora resolution
Coreference resolution is used to identify all of the ways an entity is named
throughout the text. The step in which we decide whether or not noun phrases
refer to the same entity is called coreference or anaphora resolution.
6. Output results generation
This stage entails converting the structures collected during the preceding
processes into output templates that follow the format defined by the user. It
might comprise a variety of normalization processes.
Working of NER:
NER works through the pipeline stages described above: the text is tokenized,
candidate spans are detected, and each span is classified into an entity type
such as PERSON, ORGANIZATION, DATE, or MONEY.
Relation Identification or Relation Extraction:
The main goal of relationship extraction is to extract valuable insights from text
that enrich our understanding of the relationships that bind people,
organizations, concepts, etc.
Template Filling:
Template filling is the task of filling the slots of a predefined template (for
example, an event's type, its participants, and its time and place) with values
extracted from the text; the filled templates are the structured output of the
extraction system.
Language Models in NLP:
II. Unigram: The unigram is the simplest type of language model. It doesn't
look at any conditioning context in its calculations. It evaluates each word
or term independently. Unigram models commonly handle language
processing tasks such as information retrieval.
The unigram is the foundation of a more specific model variant called the
query likelihood model, which is used in information retrieval to examine a pool
of documents and match the most relevant one to a specific query. (A toy
unigram sketch follows after this list.)
III. Bidirectional: Unlike n-gram models, which analyze text in one direction
(backwards), bidirectional models analyze text in both directions,
backwards and forwards. These models can predict any word in a
sentence or body of text by using every other word in the text. Examining
text bidirectionally increases result accuracy.
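Here is the toy unigram sketch promised above: word probabilities are just relative frequencies, and a query's likelihood is the product of its word probabilities (the corpus and query are made up):

```python
# A minimal unigram language model: each word's probability is its
# relative frequency in the corpus, with no conditioning context.
from collections import Counter

corpus = "the cat sat on the mat the dog sat".split()
counts = Counter(corpus)
total = len(corpus)

def unigram_prob(word):
    return counts[word] / total

# The probability of a whole query is the product of its word
# probabilities, which is the core of the query likelihood model.
query = ["the", "cat"]
p = 1.0
for w in query:
    p *= unigram_prob(w)
print(p)  # P(the) * P(cat) = (3/9) * (1/9)
```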
Probabilistic Language Model:
A probabilistic language model assigns probabilities to word sequences,
typically by predicting each word from its context. For example,
p(NLP | this article is on) is the probability that the next word is "NLP"
given the preceding words "this article is on".
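A minimal sketch of estimating such a conditional probability from bigram counts (the toy corpus is made up for illustration):

```python
# A minimal sketch of estimating p(word | previous word) from bigram
# counts, the simplest conditional model behind p(NLP | ... on).
from collections import Counter

tokens = "this article is on NLP and this article is on parsing".split()
bigrams = Counter(zip(tokens, tokens[1:]))   # counts of adjacent pairs
unigrams = Counter(tokens)                   # counts of single words

def bigram_prob(prev, word):
    return bigrams[(prev, word)] / unigrams[prev]

print(bigram_prob("on", "NLP"))  # count("on NLP") / count("on") = 1/2
```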
Made by – Utkarsh M.
The hidden Markov Model (HMM) is a statistical model that is used to describe
the probabilistic relationship between a sequence of observations and a
sequence of hidden states. It is often used in situations where the underlying
system or process that generates the observations is unknown or hidden, hence
it has the name “Hidden Markov Model.”
It is used to predict future observations or classify sequences, based on the
underlying hidden process that generates the data.
An HMM consists of two types of variables: hidden states and observations.
• The hidden states are the underlying variables that generate the observed
data, but they are not directly observable.
• The observations are the variables that are measured and observed.
The relationship between the hidden states and the observations is modeled
using a probability distribution.
The Hidden Markov Model (HMM) models the relationship between the hidden states
and the observations using two sets of probabilities: the transition probabilities
and the emission probabilities.
• The transition probabilities describe the probability of transitioning from
one hidden state to another.
• The emission probabilities describe the probability of observing an
output given a hidden state.
Step 3: Define the state transition probabilities
These are the probabilities of transitioning from one state to another. This
forms the transition matrix, which describes the probability of moving
from one state to another.
Step 4: Define the observation likelihoods
These are the probabilities of generating each observation from each
state. This forms the emission matrix, which describes the probability of
generating each observation from each state.
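A minimal HMM sketch with made-up numbers, scoring an observation sequence with the standard forward algorithm:

```python
# A minimal HMM sketch: two hidden weather states, two possible
# observations, scored with the forward algorithm. All numbers are
# made up for illustration.
import numpy as np

states = ["Rainy", "Sunny"]            # hidden states
obs_symbols = ["walk", "umbrella"]     # observable outputs

pi = np.array([0.6, 0.4])              # initial state probabilities
A = np.array([[0.7, 0.3],              # transition matrix: P(next | current)
              [0.4, 0.6]])
B = np.array([[0.1, 0.9],              # emission matrix: P(observation | state)
              [0.8, 0.2]])

observations = [1, 0, 0]               # umbrella, walk, walk

# Forward algorithm: alpha[i] = P(obs so far, current state = i)
alpha = pi * B[:, observations[0]]
for t in range(1, len(observations)):
    alpha = (alpha @ A) * B[:, observations[t]]

print("P(observation sequence) =", alpha.sum())
```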
Topic Modelling:
Topic modelling identifies the topics present in a document or corpus and the
words that characterize them. This is useful because processing every word of
every document individually is far more expensive than working at the level of
topics. For example, suppose there are 1000 documents with 500 words each:
processing them word by word means handling 500 × 1000 = 500,000 items, whereas
if the documents are grouped under, say, 5 topics, the processing is only about
5 × 500 = 2,500 items.
Based on this, topic modelling divides the documents into different topics.
Since there are no labelled outputs through which to learn this task, it is an
unsupervised learning method. This type of modelling is very useful when there
are many documents and we want to know what kind of information is present in
them. Doing this manually takes a lot of time, whereas topic modelling can do
it in very little time.
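A minimal topic-modelling sketch using scikit-learn's LDA on a toy corpus (the documents are made up):

```python
# A minimal topic-modelling sketch with Latent Dirichlet Allocation.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the stock market rallied as investors bought shares",
    "the team won the match with a late goal",
    "shares fell after the market report",
    "the striker scored a goal in the final match",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)              # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# print the top words for each discovered topic
words = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [words[i] for i in topic.argsort()[-4:]]
    print(f"Topic {k}: {top}")
```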
Graph Models:
[I don’t know if this is talking about GNNs or Graph Representation stuff, so
proceed with caution.]
Graphs have always formed an essential part of NLP applications ranging from
syntax-based Machine Translation, knowledge graph-based question answering,
abstract meaning representation for common sense reasoning tasks, and so on.
But with the advent of end-to-end deep learning systems, the use of such
traditional parsing algorithms declined.
Syntactic and Semantic Parse Graphs
Consider the output of the spaCy dependency parser (shown as a figure in the
original notes): every node is a word and every edge is a dependency parse tag,
and every word can carry its POS tag as an attribute. Some might argue that
powerful attention mechanisms can automatically learn these syntactic and
semantic relationships. (A sketch of reading the parse as a graph follows.)
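Here is the promised sketch of reading spaCy's parser output as a graph, where each (head, dependency, child) arc is an edge:

```python
# A minimal sketch of viewing the spaCy dependency parse as a graph:
# each token is a node, each head -> child arc is a labelled edge.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog")

for token in doc:
    # edge: head --dep--> token, with the POS tag as a node attribute
    print(f"{token.head.text} --{token.dep_}--> {token.text} ({token.pos_})")
```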
Knowledge Graph
A knowledge graph represents a collection of interlinked descriptions of entities:
real-world objects, events, situations, or abstract concepts. Every node is an
entity and edges describe the relations between them. The most famous KGs in NLP
include DBpedia, WikiData, and ConceptNet.
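A minimal knowledge-graph sketch with networkx; the triples are made up for illustration:

```python
# A minimal knowledge-graph sketch: nodes are entities, each edge
# carries its relation as a label.
import networkx as nx

kg = nx.DiGraph()
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Paris", "instance_of", "City"),
]
for head, relation, tail in triples:
    kg.add_edge(head, tail, relation=relation)

for u, v, data in kg.edges(data=True):
    print(u, data["relation"], v)
```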
Temporal graphs
LSTMs have been shown to handle long-range dependencies poorly, so connecting
words/documents through edges that represent instants in time is one possible
solution.
Feature Selection and classifiers:
Feature selection is a process that chooses a subset of features from the original
features so that the feature space is optimally reduced according to a certain
criterion.
Feature selection is a critical step in the feature construction process. In text
categorization problems, some words simply do not appear very often. Perhaps
the word “groovy” appears in exactly one training document, which is positive.
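A minimal sketch of one common feature-selection criterion for text, a chi-squared test that keeps only the k terms most correlated with the class labels (the toy corpus is made up):

```python
# A minimal sketch of feature selection for text with a chi-squared
# test: keep only the k terms most correlated with the labels.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

docs = ["groovy great film", "terrible boring plot",
        "great acting", "boring and terrible"]
labels = [1, 0, 1, 0]   # 1 = positive, 0 = negative

vec = CountVectorizer()
X = vec.fit_transform(docs)
selector = SelectKBest(chi2, k=3).fit(X, labels)

# terms that survive the selection; rare one-off words like "groovy"
# tend to be dropped in larger corpora
mask = selector.get_support()
print([w for w, keep in zip(vec.get_feature_names_out(), mask) if keep])
```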
Rule-based classifiers are another type of classifier that makes the class
decision by applying various "if...else" rules.
These rules are easily interpretable and thus these classifiers are generally used
to generate descriptive models. The condition used with “if” is called the
antecedent and the predicted class of each rule is called the consequent.
Properties of rule-based classifiers:
• The decision boundaries they create are piecewise linear, but the overall
boundary can be much more complex than a decision tree's, because many rules
may be triggered for the same record.
Example:
A typical dataset for classifying mushrooms as edible or poisonous lists each
mushroom's attributes (such as odor or gill size) together with its class label.
(The table from the original notes is not reproduced here.)
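A minimal rule-based classifier sketch in this spirit; the attributes and rules are hypothetical, chosen only to show the antecedent/consequent structure:

```python
# A minimal rule-based classifier sketch for the mushroom example.
# The attributes and rules below are hypothetical, just to illustrate
# the antecedent ("if" condition) / consequent (predicted class) idea.
def classify_mushroom(mushroom):
    if mushroom["odor"] in ("foul", "pungent"):            # antecedent
        return "poisonous"                                 # consequent
    if mushroom["odor"] == "none" and mushroom["gill_size"] == "broad":
        return "edible"
    return "poisonous"                                     # default rule

print(classify_mushroom({"odor": "none", "gill_size": "broad"}))  # edible
```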
Maximum Entropy Classifier:
The Max Entropy classifier is a probabilistic classifier which belongs to the class
of exponential models.
Unlike the Naive Bayes classifier, Max Entropy does not assume that the
features are conditionally independent of each other.
MaxEnt is based on the Principle of Maximum Entropy: from all the models that
fit our training data, it selects the one with the largest entropy.
The Max Entropy classifier can be used to solve a large variety of text
classification problems such as language detection, topic classification,
sentiment analysis and more.
Because it makes minimal assumptions, we regularly use the Maximum Entropy
classifier when we know nothing about the prior distributions and when it is
unsafe to make any such assumptions.
Moreover, the Maximum Entropy classifier is used when we cannot assume that the
features are conditionally independent. This is particularly true in text
classification problems, where our features are usually words, which are
obviously not independent.
Max Entropy requires more time to train compared to Naive Bayes, primarily due
to the optimization problem that must be solved in order to estimate the
parameters of the model.
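As a sketch, scikit-learn's LogisticRegression is a maximum entropy classifier when trained on bag-of-words features (the toy corpus is made up):

```python
# A minimal MaxEnt sketch: multinomial logistic regression on
# bag-of-words features is a maximum entropy text classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["what a great movie", "awful plot and bad acting",
        "great acting and great plot", "bad and boring"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(docs, labels)                 # solves the optimization problem
print(clf.predict(["great plot"]))    # -> ['pos']
```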
Word Clustering:
Two main types of similarity have been used, which can be characterized as
follows: syntagmatic similarity, where two words are similar because they
frequently co-occur together in text, and paradigmatic (substitutional)
similarity, where two words are similar because they can occur in the same
contexts.
For example, in the context I read the book, the word book can be replaced
by magazine with no violation of the semantic well-formedness of the
sentence, and therefore the two words can be said to be paradigmatically
similar. (A toy clustering sketch based on shared contexts follows.)
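Here is the promised toy sketch of distributional word clustering: each word is represented by counts of its immediate neighbours, and the vectors are clustered with KMeans (a crude stand-in for proper word clustering methods such as Brown clustering):

```python
# A minimal distributional word-clustering sketch: build each word's
# context-count vector, then cluster the vectors with KMeans.
import numpy as np
from sklearn.cluster import KMeans

corpus = "i read the book i read the magazine the cat chased the dog".split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}

# context vector = counts of immediate left/right neighbours
vectors = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            vectors[index[w], index[corpus[j]]] += 1

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(vectors)
for w in vocab:
    print(w, km.labels_[index[w]])   # words with shared contexts cluster
```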
Transformer Architecture:
Transformers are specifically designed to comprehend context and meaning by
analyzing the relationships between different elements, and they rely almost
entirely on a mathematical technique called attention to do so.
Architecture:
The task of the encoder, on the left half of the Transformer architecture, is to map
an input sequence to a sequence of continuous representations, which is then
fed into a decoder.
The decoder, on the right half of the architecture, receives the output of the
encoder together with the decoder output at the previous time step to generate
an output sequence.
3. The augmented embedding vectors are fed into the encoder block
consisting of the two sublayers explained above.
Since the encoder attends to all the words in the input sequence, irrespective
of whether they precede or succeed the word under consideration, the
Transformer encoder is bidirectional.
4. The decoder receives as input its own predicted output word at time-step
t-1.
6. The augmented decoder input is fed into the three sublayers comprising
the decoder block explained above. Masking is applied in the first sublayer
in order to stop the decoder from attending to the succeeding words.
At the second sublayer, the decoder also receives the output of the encoder,
which now allows the decoder to attend to all the words in the input
sequence.
7. The output of the decoder finally passes through a fully connected layer,
followed by a softmax layer, to generate a prediction for the next word of
the output sequence.
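A minimal numpy sketch of the scaled dot-product attention at the heart of these blocks, including the causal mask used in the decoder's first sublayer:

```python
# A minimal sketch of scaled dot-product attention:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V, mask=None):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # query-key similarities
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block masked positions
    return softmax(scores) @ V                 # weighted sum of values

# 4 tokens with 8-dimensional representations; the lower-triangular mask
# reproduces the decoder's "no attending to succeeding words" behaviour.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
causal_mask = np.tril(np.ones((4, 4), dtype=bool))
print(attention(x, x, x, causal_mask).shape)   # (4, 8)
```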
Various Transformer Models for NLP:
UNIT – 4
What is a Prompt?
Elements of a prompt:
3. Input Data - the input or question that we want to find a response
for
Tips for designing a prompt:
1. Be Specific: Clearly specify the format you want the AI's response in. If
you need a particular structure, include that information in your prompt.
2. Provide Context: Especially for complex tasks, offer context to guide the
AI. You can describe previous related interactions or the current situation.
Examples are incredibly helpful. Show the AI what a correct response
looks like.
5. Utilize Tools: There are helpful tools available, like the Prompt Perfect
plugin and prompt templates. These resources can streamline your
prompt creation process and boost creativity.
6. Write a script for an educational video that explores the Solar System and its
role in the Milky Way.
7. Compose a persuasive essay arguing for the importance of protecting national
parks.
9. Write a speech for accepting an acting award for "Best Supporting Actor."
10. Develop an email marketing strategy to generate leads and increase
conversion rates for a tractor company. Use personalized messaging and
automated workflows.
Working of AI Chatbots:
[Refer page 4(end) and 5 from ‘What is a chatbot and How… AirDroid.pdf’ *Notes pdf*]
Popular AI chatbots:
ChatGPT:
Working of ChatGPT:
ChatGPT now uses the GPT-3.5 model that includes a fine-tuning process for its
algorithm. ChatGPT Plus uses GPT-4, which offers a faster response time and
internet plugins. GPT-4 can also handle more complex tasks compared with
previous models, such as describing photos, generating captions for images and
creating more detailed responses up to 25,000 words.
Architecture of ChatGPT:
ChatGPT follows a similar architecture to the original GPT models, which is
based on the transformer architecture. It uses a transformer decoder block with
a self-attention mechanism.
ChatGPT uses deep learning, a subset of machine learning, to produce
humanlike text through transformer neural networks. The transformer predicts
text -- including the next word, sentence or paragraph -- based on the typical
sequences in its training data.
Training begins with generic data, then moves to more tailored data for a specific
task. ChatGPT was trained with online text to learn the human language, and
then it used transcripts to learn the basics of conversations.
Access ChatGPT:
Go to chat.openai.com or use the mobile app and sign in or sign up.
Get a Response:
ChatGPT generates an answer based on your question, and it appears below
your question.
Interact:
Once you have the answer to your prompt, you can do the following:
• Enter a new prompt.
• Regenerate the response.
• Copy the response.
• Share the response.
• Like or dislike the response.
Use cases of ChatGPT for various users:
5. Writing code - ChatGPT can write code for simple or repetitive tasks, such as
file I/O operations, data manipulation, and database queries.
However, it's important to note that its ability to write code is limited, and
the generated code may not always be accurate, optimized, or the desired output.
6. Debugging - ChatGPT’s bug fixing abilities can also be a valuable tool for
programmers. It can assist in debugging code by proposing possible causes of
errors and presenting solutions to resolve them.
In September 2023, OpenAI rolled out multi-modal capabilities to ChatGPT,
including image processing. So now it can also classify images and identify
objects in an image.
9. Grading - ChatGPT can help teachers with the grading of student essays by
evaluating the content, structure, and coherence of the written work. The AI can
offer feedback on grammar, spelling, punctuation, and syntax while also
assessing the quality of the argument or analysis presented.
Role of AI in Image Generation
Working of Midjourney:
Advantages and Disadvantages of Midjourney:
Use cases of Midjourney:
5. Enriching User Interfaces and User Experience - UI/UX plays a crucial role in
today's digital landscape. Midjourney AI Art can enhance UI/UX design by
providing visually engaging elements that captivate users and create a delightful
interactive experience.
6. Icon Design - In the digital realm, where apps are essential, a captivating icon
is crucial for user engagement. An icon’s shape influences recognition and
memorability. While some icons stick to traditional shapes like squares or
circles, others embrace unique and abstract forms.
8. Character Design - The realm of character design is elevated to new heights
with Midjourney AI Art. From humanoid robots to extraterrestrial beings, artists
can create unique and compelling characters that push the boundaries of
imagination.
This marks the end of this PDF; it is one of two or three PDFs…
We call it a new discovery at the end of the day and make ourselves even
more incompetent. It wouldn't be unfair to say that, with the use of AI,
our future generations are only going to be more incompetent than us, and
still, ignoring all these crucial aspects, we just go on with our lives,
making it a career…
There are far more problems with the overuse of AI than the net amount of
good that comes out of it.
Made by – Utkarsh M.