Detailed_Notes_on_Language_Models_and_NLP

Uploaded by

Tamilselvi

1. N-grams and Language Models

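- N-grams: Contiguous sequences of N items (usually words or characters) from a text; N = 1, 2, and 3 give unigrams, bigrams, and trigrams.

- Language Models: Statistical models that assign probabilities to word sequences; an N-gram model predicts each word from the N-1 words before it.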
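
As a minimal sketch of the idea, a bigram model can be estimated by counting. The function name `bigram_probabilities` and the toy sentence are assumptions made for illustration; the estimate is the plain maximum-likelihood one, P(w2 | w1) = count(w1 w2) / count(w1).

```python
from collections import Counter

def bigram_probabilities(tokens):
    """Maximum-likelihood bigram estimates P(w2 | w1) from a token list."""
    bigrams = list(zip(tokens, tokens[1:]))
    bigram_counts = Counter(bigrams)
    # Count each word only where it acts as a history (i.e. has a successor).
    history_counts = Counter(w1 for w1, _ in bigrams)
    return {
        (w1, w2): count / history_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }

# Toy corpus (illustrative only).
tokens = "the cat sat on the mat".split()
probs = bigram_probabilities(tokens)

print(probs[("the", "cat")])  # 0.5: "the" is followed by "cat" once out of twice
print(probs[("cat", "sat")])  # 1.0: "cat" is always followed by "sat"
```

Any bigram that never occurs in the corpus, such as ("cat", "mat"), is simply absent from the table, which amounts to a probability of zero.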

2. Smoothing

- Techniques that move some probability mass to N-grams unseen in training, so they are not assigned zero probability.

- Types: Additive Smoothing, Good-Turing, Backoff, Interpolation.
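
Additive smoothing is the simplest of these to sketch: add a constant k to every count before normalizing. The helper name `additive_bigram_prob`, the toy corpus, and the choice k = 1 (often called Laplace smoothing) are assumptions for illustration.

```python
from collections import Counter

def additive_bigram_prob(w1, w2, bigram_counts, history_counts, vocab_size, k=1.0):
    """Add-k estimate: (count(w1 w2) + k) / (count(w1) + k * |V|)."""
    return (bigram_counts[(w1, w2)] + k) / (history_counts[w1] + k * vocab_size)

# Toy corpus (illustrative only).
tokens = "the cat sat on the mat".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
history_counts = Counter(tokens[:-1])  # words that have a successor
vocab = sorted(set(tokens))
vocab_size = len(vocab)                # 5 distinct words

seen = additive_bigram_prob("the", "cat", bigram_counts, history_counts, vocab_size)
unseen = additive_bigram_prob("cat", "mat", bigram_counts, history_counts, vocab_size)

print(seen)    # (1 + 1) / (2 + 5) = 2/7
print(unseen)  # (0 + 1) / (1 + 5) = 1/6, no longer zero
```

The smoothed probabilities for a given history still sum to 1 over the vocabulary, because k is added once per vocabulary word in both numerator and denominator.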

3. Text Classification

- Categorizing text into predefined groups (e.g., Sentiment Analysis, Spam Detection).

4. Naïve Bayes Classifier

- A probabilistic classifier that applies Bayes' theorem under the assumption that features (e.g., words) are conditionally independent given the class.
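
The following is a minimal sketch of a multinomial Naïve Bayes text classifier, applied to a spam-detection toy problem. The function names, the four training messages, and the use of add-one smoothing with log probabilities are all assumptions chosen for illustration, not a reference implementation.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns log priors, per-class word counts, vocabulary."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    log_priors = {c: math.log(n / len(docs)) for c, n in label_counts.items()}
    return log_priors, word_counts, vocab

def predict(tokens, log_priors, word_counts, vocab):
    """Return the class with the highest log P(class) + sum of log P(word | class)."""
    scores = {}
    for label, log_prior in log_priors.items():
        total = sum(word_counts[label].values())
        score = log_prior
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing keeps unseen (word, class) pairs above zero.
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data (illustrative only).
train = [
    ("win free prize now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
]
model = train_naive_bayes(train)

print(predict("free prize".split(), *model))        # spam
print(predict("meeting tomorrow".split(), *model))  # ham
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on longer documents.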

5. Evaluation

- Metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC.

- Tools: Confusion Matrix, a table of predicted versus actual classes that shows which kinds of errors the classifier makes.
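
For a binary task, accuracy, precision, recall, and F1 can all be computed from the four cells of the confusion matrix. The sketch below assumes labels of 1 (positive) and 0 (negative); the function name `binary_metrics` and the example labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix cells and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1,
    }

# Example labels (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

print(metrics["tp"], metrics["fp"], metrics["fn"], metrics["tn"])  # 3 1 1 3
print(metrics["accuracy"], metrics["precision"], metrics["recall"], metrics["f1"])
# 0.75 0.75 0.75 0.75
```

ROC-AUC is not covered here because it needs the classifier's scores at many thresholds rather than a single set of hard predictions.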

6. Vector Semantics

- Representing words as vectors so that words with similar meanings lie close together, which lets relationships between words be measured geometrically.
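
One common way to compare word vectors is cosine similarity. In the sketch below, the three vectors are hypothetical co-occurrence counts with the context words "pet", "bark", and "engine", invented purely for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical counts of each word next to ("pet", "bark", "engine").
vectors = {
    "dog": [4, 5, 0],
    "cat": [5, 1, 0],
    "car": [0, 0, 6],
}

dog_cat = cosine_similarity(vectors["dog"], vectors["cat"])
dog_car = cosine_similarity(vectors["dog"], vectors["car"])
print(dog_cat > dog_car)  # True: "dog" shares contexts with "cat", none with "car"
```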

7. TF-IDF

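- Weights a word's importance to a document as Term Frequency (how often the word occurs in that document) multiplied by Inverse Document Frequency (how rare the word is across the whole collection).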
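
TF-IDF has several variants. The sketch below assumes one common choice: length-normalized term frequency and an unsmoothed idf of log(N / df). The function name `tf_idf` and the three toy documents are illustrative assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Toy collection (illustrative only).
docs = [
    "the cat sat".split(),
    "the dog barked".split(),
    "the cat purred".split(),
]
weights = tf_idf(docs)

print(weights[0]["the"])                      # 0.0: "the" appears in every document
print(weights[0]["sat"] > weights[0]["cat"])  # True: "sat" is rarer than "cat"
```

Libraries often use smoothed variants of the idf term, so their numbers will generally differ from this sketch.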

8. Word2Vec

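- Neural network-based model for learning word embeddings, with two architectures: CBOW predicts a word from its surrounding context, while Skip-Gram predicts the surrounding context from a word.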
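
The difference between the two architectures shows up in how training examples are built from a sliding window. The sketch below only generates those examples and does not train a network; the function names and the window size of 1 are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=1):
    """Skip-Gram: one (center, context_word) pair per neighbor inside the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=1):
    """CBOW: one (context_words, target) example per position."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, target))
    return examples

tokens = ["the", "cat", "sat"]

print(skipgram_pairs(tokens))
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
print(cbow_examples(tokens))
# [(['cat'], 'the'), (['the', 'sat'], 'cat'), (['cat'], 'sat')]
```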


9. Evaluating Vector Models

- Intrinsic evaluation tests the vectors directly (e.g., on word similarity tasks); extrinsic evaluation measures how much they improve downstream NLP applications.
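
A word similarity evaluation is often scored by rank correlation between the model's similarity scores and human ratings for the same word pairs. The sketch below uses Spearman's formula for data without ties; the human ratings and model scores are hypothetical numbers, not results from any real dataset or model.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical scores for four word pairs (illustrative only).
human_ratings = [9.0, 7.5, 3.0, 1.0]
model_scores = [0.82, 0.64, 0.70, 0.05]

rho = spearman(human_ratings, model_scores)
print(rho)  # 0.8: the model swaps the order of the two middle pairs
```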

10. Sequence Labeling

- Assigning labels to sequences (e.g., POS tagging, Named Entity Recognition).
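
In sequence labeling, each token receives one label. For entity-style tasks the labels are often written in the BIO scheme (B- begins a span, I- continues it, O is outside any span); that scheme is an assumption added here, as are the function name and the example sentence. The sketch converts per-token tags back into labeled spans.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (text, type) spans."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close any open span and treat this token as outside.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans

tokens = ["Barack", "Obama", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]

print(bio_to_spans(tokens, tags))
# [('Barack Obama', 'PER'), ('New York', 'LOC')]
```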

11. Part of Speech (POS)

- Grammatical categorization of words (noun, verb, adjective, etc.).

12. Named Entities

- Identification and classification of proper names in text, such as people, organizations, and locations.

13. Named Entity Tagging

- Techniques: Rule-based, Machine Learning-based, Pre-trained models (spaCy, BERT).
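
The rule-based approach can be sketched as a lookup against a list of known names (a gazetteer). The function name `gazetteer_tag`, the tiny gazetteer, the greedy longest-match strategy, and the BIO output format are all assumptions for illustration; real rule-based systems usually combine such lists with patterns and context rules.

```python
def gazetteer_tag(tokens, gazetteer):
    """Tag tokens by greedy longest-match lookup; returns one BIO tag per token."""
    tags = ["O"] * len(tokens)
    max_len = max((len(entry) for entry in gazetteer), default=0)
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "New York" wins over "York".
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in gazetteer:
                label = gazetteer[candidate]
                tags[i] = "B-" + label
                for j in range(i + 1, i + length):
                    tags[j] = "I-" + label
                i += length
                break
        else:
            i += 1  # no entry starts at this token
    return tags

# Hypothetical gazetteer keyed by token tuples (illustrative only).
gazetteer = {
    ("New", "York"): "LOC",
    ("York",): "LOC",
    ("Google",): "ORG",
}
tokens = "Google opened an office in New York".split()

print(gazetteer_tag(tokens, gazetteer))
# ['B-ORG', 'O', 'O', 'O', 'O', 'B-LOC', 'I-LOC']
```

A lookup like this cannot tag names missing from the list or resolve ambiguous ones, which is where the machine learning-based and pre-trained approaches come in.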
