Lec1 Intro-2
Lecture 1: Introduction I
Outline
▪ Class Introduction
▪ Introduction to NLP
▪ Brief Introduction to AI

CLASS INTRODUCTION
Lecturer
▪ Jee Eun Kim
– Contact Info
• e-mail (preferred)
– [email protected]
• Telephone
– 02-2173-3110 (O)
• Office
– Faculty Building #312
• Office Hour
– Upon Request/Appointment
» Through e-mail: preferred
03/05/2025 Computer & Linguistics 2025 2 03/05/2025 Computer & Linguistics 2025 4
2025-03-04
Books
▪ Textbook
– Speech and Language Processing (SLP)
• Daniel Jurafsky & James H. Martin. Prentice Hall.
» 2nd ed. 2008. (Published)
» 3rd ed. 2025. (Online draft version)
– Relevant chapters are uploaded to e-class
▪ Reference
– Language and Computers
• Markus Dickinson, Chris Brew & Detmar Meurers. Wiley-Blackwell, 2012.

Syllabus
▪ Mar. 05
– Introduction
▪ Mar. 12
– Introduction to NLTK
– Introduction to Text Processing
▪ Mar. 19 ~ Apr. 02
– Text Processing I-1
• Regular Expression
• Finite-State Automata
▪ Apr. 09 & 16
– Text Processing I-2
• Morphology & Subword Tokenization
Syllabus (cont’d)
▪ May 28
– Text Processing II-2
• Constituency Parsing II
– Student Presentations on the final assignment I
▪ Jun. 04
– Student Presentations on the final assignment II
▪ Jun. 11
– Make-up week: No class
▪ Jun. 18
– Final Exam (using e-class in the classroom)

INTRODUCTION TO NLP
▪ NLP
– a field of AI
• Focusing on the interaction between humans and computers using natural language
• Involving the development of algorithms and models that enable computers to understand, interpret, and generate human language

[Figure: NLP / Computational Lx shown among related fields: Logic, Philosophy, Language and Psychology, Linguistics, Phonetics, Artificial Intelligence]
▪ The ultimate objective of NLP
– To read, decipher, understand, and make sense of human languages in a manner that is valuable
▪ NLP techniques
– Mostly rely on machine learning to derive meaning from human languages

▪ The fusion of human cognition with AI, neural networks, and data flow
▪ NLP applications
Semantics
Parsing
• Three common steps
① Extracting features from texts
» Feature engineering
• Word type, surrounding words, capitalized, plural, etc.
② Using the feature representation to train a model
» Training data: a corpus with markup
» Training a model on the extracted features (parameters), followed by checking its fit on test data
③ Evaluating and refining the model
» Inference (applying model to test data)
• Characterized by finding most probable words, next word, best category, etc.

• Similar to "traditional" ML, but with a few differences
– Feature engineering is generally skipped
» Networks "learn" important features, which is one of the claimed big benefits of using NNs for NLP
– Streams of raw parameters ("words", actually vector representations of words), without engineered features, are fed into NNs
• Challenges
– Requiring substantial computational resources
» Including a very large training corpus
– The ongoing need to address biases in large training datasets
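
The three common steps of the feature-based approach (extract features, train a model, run inference and evaluate) can be sketched in plain Python. This is only an illustrative sketch: the toy corpus, the tag names, the feature names, and the simple count-based scoring are all assumptions made for this example and are not taken from the lecture.

```python
from collections import Counter, defaultdict

def extract_features(tokens, i):
    """Step 1: hand-engineered features for the token at position i."""
    word = tokens[i]
    return {
        "lower=" + word.lower(),                             # word type
        "capitalized=" + str(word[0].isupper()),             # capitalized?
        "ends_with_s=" + str(word.lower().endswith("s")),    # crude plural cue
        "prev=" + (tokens[i - 1].lower() if i > 0 else "<s>"),  # surrounding word
    }

def train(tagged_sentences):
    """Step 2: count how often each feature co-occurs with each tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        tokens = [word for word, _ in sentence]
        for i, (_, tag) in enumerate(sentence):
            for feature in extract_features(tokens, i):
                counts[feature][tag] += 1
    return counts

def predict(counts, tokens, i):
    """Step 3 (inference): choose the tag with the highest summed count."""
    scores = Counter()
    for feature in extract_features(tokens, i):
        scores.update(counts.get(feature, {}))
    # Fall back to a default tag if no feature was seen in training.
    return scores.most_common(1)[0][0] if scores else "NOUN"

# Hypothetical training corpus with markup (word, tag).
train_data = [
    [("Mary", "PROPN"), ("saw", "VERB"), ("dogs", "NOUN")],
    [("John", "PROPN"), ("fed", "VERB"), ("cats", "NOUN")],
    [("Dogs", "NOUN"), ("chase", "VERB"), ("cats", "NOUN")],
]
# Hypothetical held-out test sentence.
test_sentence = [("Anna", "PROPN"), ("saw", "VERB"), ("cats", "NOUN")]

counts = train(train_data)
tokens = [word for word, _ in test_sentence]
gold = [tag for _, tag in test_sentence]
predicted = [predict(counts, tokens, i) for i in range(len(tokens))]
accuracy = sum(p == g for p, g in zip(predicted, gold)) / len(gold)
print(predicted, accuracy)
```

Note that "Anna" never occurs in the training data, so its tag is predicted only from the engineered features (capitalization, sentence-initial position). A neural model would skip `extract_features` and learn its own representation from word vectors instead.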
▪ IBM
– NLP
• https://natural-language-classifier-demo.ng.bluemix.net/?cm_mc_uid=86657345270115215799647&cm_mc_sid_50200000=67404251549376437907&cm_mc_sid_52640000=20772841549376437918
– Watson Assistant: Chatbot
• https://www.ibm.com/cloud/watson-assistant/
– Watson Speech to Text (STT)
• https://www.ibm.com/cloud/watson-speech-to-text
– Watson Tone Analyzer
• https://www.ibm.com/cloud/watson-tone-analyzer
– Watson Knowledge Studio
• https://www.ibm.com/cloud/watson-knowledge-studio

▪ Google
– Natural Language Processing
• https://cloud.google.com/natural-language
– Entity Recognition
– Sentiment Analysis
– Syntactic Parsing
– Text Categorization
– Machine Translation
• Google Translate
– https://cloud.google.com/translate/#how-automl-translationbeta-works
▪ NLP has heavily benefited from recent advances in machine learning, especially from deep learning techniques
– Speech Processing (SP)
• The translation of spoken language into text
• The translation of text into spoken language
– Natural Language Understanding (NLU)
• The computer's ability to understand what we say
– Natural Language Generation (NLG)
• The generation of natural language by a computer

2. Deep Learning Models for Specialized Tasks
– Transformer models like GPT-4 and BERT are achieving excellent accuracy
• In 2025 they are expected to open up new possibilities
– These models can now handle niche tasks, like drafting legal contracts and analyzing patients' medical records, with close to human-like precision
– When fine-tuned, they can be customized for industries like finance and law
NLP Trends For 2025 (cont’d)
INTRODUCTION TO AI
• DeepMind
– https://deepmind.com/
▪ AI
– Enables the machine to think
▪ ML
– Uses statistical tools to explore and analyze data
• Supervised Learning
• Unsupervised Learning (Clustering)
• Reinforcement Learning
– Semi-supervised Learning
▪ DL
– Multi-layered neural network architecture
• ANN (Artificial Neural Network)
• CNN (Convolutional Neural Network): Transfer Learning
• RNN (Recurrent Neural Network)
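
To make the difference between supervised and unsupervised learning in the list above more concrete, here is a minimal sketch using only the Python standard library. The data (sentence lengths), the labels, and the function names are hypothetical and chosen purely for illustration. The supervised part uses a nearest-centroid rule and the unsupervised part a simple two-cluster k-means; both are stand-ins for the general idea, not methods named in the lecture.

```python
from statistics import mean

# Hypothetical 1-D data: sentence lengths in words.
lengths = [3, 4, 5, 18, 20, 22]
# Labels are available only in the supervised setting.
labels = ["short", "short", "short", "long", "long", "long"]

def fit_centroids(xs, ys):
    """Supervised: use the given labels to compute one centroid per class."""
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }

def classify(centroids, x):
    """Predict the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(xs, iterations=10):
    """Unsupervised: split xs into two clusters without using any labels."""
    low, high = min(xs), max(xs)  # initial cluster centers
    for _ in range(iterations):
        cluster_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        cluster_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(cluster_low), mean(cluster_high)
    return sorted(cluster_low), sorted(cluster_high)

centroids = fit_centroids(lengths, labels)
print(classify(centroids, 6))    # label predicted from labeled examples
print(two_means(lengths))        # groups discovered without labels
```

The supervised model can name its output ("short" or "long") because it was trained with labels, whereas the clustering only returns two unnamed groups that a person still has to interpret.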
[Figure: comparison of Rule-based System, Classic Machine Learning, Representation Learning, and Deep Learning]