Foundations of Large Language Models
How is Math the backbone of AI?
What are Vectors?
A vector is a mathematical object that represents data in a
format AI algorithms can process.
Vectors are arrays (or lists) of numbers, with each
number representing a specific feature or attribute of the
data.
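As a minimal illustration (the features and their values below are invented):

```python
# A data point encoded as a vector: each position is one feature.
# Here: [square_meters, bedrooms, age_in_years] for a house.
house = [120.0, 3.0, 15.0]
```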
Vector Representation
Let's say we have a 2D space where each axis
represents a taste characteristic: sweet and sour.
Each dish can then be plotted as a vector of its
sweetness and sourness scores.
As we add more dimensions (e.g., bitter, salty, savory),
we get a richer description of each dish.
The closer two vectors are, the more similar the dishes
they represent.
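A minimal sketch of this idea in Python (the dishes and their scores are invented for illustration):

```python
import numpy as np

# Each dish as a [sweet, sour] vector on a 0-10 scale (invented values).
lemonade   = np.array([7.0, 8.0])
candy      = np.array([9.0, 1.0])
lime_juice = np.array([2.0, 9.0])

# Smaller Euclidean distance = more similar taste profiles.
print(np.linalg.norm(lemonade - lime_juice))  # ~5.1: both are quite sour
print(np.linalg.norm(lemonade - candy))       # ~7.3: candy is barely sour
```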
What are Embeddings?
Embeddings convert words, phrases, or even images
into numerical vectors.
These vectors map discrete data into a continuous latent
space, capturing semantic relationships: items with
similar meanings land close together.
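As a sketch of how embeddings are produced in practice, here is one option using the sentence-transformers library (the model name is one common choice, not the only one):

```python
from sentence_transformers import SentenceTransformer

# Load a small pretrained embedding model (one common choice).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Each word or sentence becomes a fixed-length numerical vector.
vectors = model.encode(["king", "queen", "banana"])
print(vectors.shape)  # (3, 384): three inputs, 384 dimensions each
```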
What do Embeddings represent?
Embeddings allow mathematical operations on text,
making tasks like clustering, classification, and
regression possible.
By converting text to vectors, models can interpret,
compare, and manipulate words numerically, as
sketched below.
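For instance, clustering becomes ordinary vector math. A minimal sketch (the 2D "embeddings" below are hand-made toy values, not real model outputs):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2D "embeddings" for six words (invented values for illustration).
words = ["cat", "dog", "kitten", "car", "truck", "bus"]
embeddings = np.array([
    [0.9, 0.1], [0.8, 0.2], [0.85, 0.15],  # animal-like region
    [0.1, 0.9], [0.2, 0.8], [0.15, 0.85],  # vehicle-like region
])

# Because words are now vectors, standard ML algorithms apply directly.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
for word, label in zip(words, labels):
    print(word, "-> cluster", label)
```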
Cosine similarity for vector similarity
Cosine similarity is a measure that quantifies how
similar two vectors are.
It is defined as the cosine of the angle between two
non-zero vectors in an inner product space.
Source: Wikipedia
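Written out, with A and B the two vectors and θ the angle between them:

```latex
\cos(\theta)
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

A value of 1 means the vectors point in the same direction, 0 means they are orthogonal (unrelated), and -1 means they point in opposite directions.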
In LLMs, cosine similarity is a standard measure for
evaluating the semantic similarity between embeddings.
It powers semantic search, providing a robust and
efficient way to compare texts by meaning rather than
by exact wording, and it underlies many advances in
NLP and AI-driven applications.
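A minimal implementation of cosine-similarity ranking in Python (the query and document vectors are placeholders; in practice they would come from an embedding model like the one sketched earlier):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" (invented values for illustration).
query = np.array([0.9, 0.1, 0.3])
docs = {
    "doc_a": np.array([0.8, 0.2, 0.4]),   # similar direction to the query
    "doc_b": np.array([-0.5, 0.9, 0.1]),  # points elsewhere
}

# Semantic search: rank documents by similarity to the query.
for name, vec in sorted(docs.items(),
                        key=lambda kv: cosine_similarity(query, kv[1]),
                        reverse=True):
    print(name, round(cosine_similarity(query, vec), 3))
```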
To learn more, join us on 15 January at 10 AM PDT for
Simplifying Mathematics Behind AI.
Albar Wahab
Senior Data Scientist, Data Science Dojo