01-relationalmodel
Systems
Relational Model & Algebra
15-445/645 FALL 2024 PROF. ANDY PAVLO
COURSE LOGISTICS
Course Policies + Schedule: Course Web Page
Discussion + Announcements: Piazza
Homeworks + Projects: Gradescope
Final Grades: Canvas
Waitlist: Six open seats (as of 12pm today)
TODAY’S AGENDA
Database Systems Background
Relational Model
Relational Algebra
Alternative Data Models
Q&A Session
DATABASE
Organized collection of inter-related data that
models some aspect of the real-world.
DATABASE EXAMPLE
Create a database that models a digital music store
to keep track of artists and albums.
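As a rough illustration of what such a database could look like, the sketch below builds a tiny in-memory music store with Python's standard `sqlite3` module. The table names, column names, and `id` values are assumptions made for this example; only the artist and album names and years come from the slides.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table per entity, linked by artist_id.
conn.executescript("""
CREATE TABLE artist (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    year    INTEGER,
    country TEXT
);
CREATE TABLE album (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    artist_id INTEGER REFERENCES artist(id),
    year      INTEGER
);
""")

# The id values are made up for illustration.
conn.executemany(
    "INSERT INTO artist VALUES (?, ?, ?, ?)",
    [(1, "Wu-Tang Clan", 1992, "USA"), (2, "GZA", 1990, "USA")],
)
conn.executemany(
    "INSERT INTO album VALUES (?, ?, ?, ?)",
    [(1, "Liquid Swords", 2, 1995), (2, "Beneath the Surface", 2, 1999)],
)

# Which albums did GZA release, oldest first?
rows = conn.execute(
    """
    SELECT album.name
    FROM album JOIN artist ON album.artist_id = artist.id
    WHERE artist.name = ?
    ORDER BY album.year
    """,
    ("GZA",),
).fetchall()
print(rows)
```

Keeping artists and albums in separate tables is only one possible design; it is used here because the inter-related data can then be combined with a join.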
DATA MODELS
A data model is a collection of concepts for
describing the data in a database.
DATA MODELS
Relational ← Most DBMSs / This Course
Key/Value ← Simple Apps / Caching
Graph
Document / JSON / XML / Object ← NoSQL
Wide-Column / Column-family
Array (Vector, Matrix, Tensor) ← ML / Science
Hierarchical ← Obsolete / Legacy / Rare
Network ← Obsolete / Legacy / Rare
Semantic ← Obsolete / Legacy / Rare
Entity-Relationship ← Obsolete / Legacy / Rare
EARLY DBMSs
Early database applications were difficult to build
and maintain on available DBMSs in the 1960s.
→ Examples: IDS, IMS, CODASYL
→ Computers were expensive, humans were cheap.
EARLY DBMSs
Ted Codd (Edgar F. Codd) was a mathematician at
IBM Research in the late 1960s.
RELATIONAL MODEL
The relational model defines a database
abstraction based on relations to avoid maintenance
overhead.
Key tenets:
→ Store database in simple data structures (relations).
→ Physical storage left up to the DBMS implementation.
→ Access data through a high-level language; the DBMS
figures out the best execution strategy.
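The tenets above can be sketched with SQLite through Python's standard `sqlite3` module. The query text stays the same while the physical design changes (an index is added), and the DBMS picks a different execution strategy on its own. The table layout and the index name `idx_artist_name` are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (name TEXT, year INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO artist VALUES (?, ?, ?)",
    [
        ("Wu-Tang Clan", 1992, "USA"),
        ("Notorious BIG", 1992, "USA"),
        ("GZA", 1990, "USA"),
    ],
)

# The application only states WHAT it wants, not HOW to find it.
QUERY = "SELECT name, year, country FROM artist WHERE name = ?"


def plan(sql, params):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


plan_before = plan(QUERY, ("GZA",))
result_before = conn.execute(QUERY, ("GZA",)).fetchall()

# Change the physical design only; the query text is untouched.
conn.execute("CREATE INDEX idx_artist_name ON artist(name)")

plan_after = plan(QUERY, ("GZA",))
result_after = conn.execute(QUERY, ("GZA",)).fetchall()

print(plan_before)   # a full table scan
print(plan_after)    # a search that uses idx_artist_name
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than relying on a specific string.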
RELATIONAL MODEL
Structure: The definition of the database’s relations
and their contents independent of their physical
representation.
DATA INDEPENDENCE
Isolate the user/application from low-level data
representation.
→ The user only worries about high-level application logic.
→ DBMS optimizes the layout according to operating
environment, database contents, and workload.
→ DBMS can then re-optimize the database if/when these
factors change.
[Figure: layers from top to bottom: Applications;
External Schema (Views, SQL); Logical Schema (Schema,
Constraints…, SQL); Physical Schema (Pages, Files,
Extents…); Database Storage. Logical data independence
sits between the external and logical schemas; physical
data independence sits between the logical and physical
schemas.]
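One way to picture logical data independence is an application that reads only through a view. In the sketch below (Python's standard `sqlite3` module), the underlying table is renamed and restructured, the view is redefined, and the application's query runs unchanged. The names `artist_view`, `artist_v2`, `full_name`, and `formed` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical schema plus an external schema (a view) for the app.
conn.executescript("""
CREATE TABLE artist (name TEXT, year INTEGER, country TEXT);
INSERT INTO artist VALUES ('Wu-Tang Clan', 1992, 'USA'),
                          ('Notorious BIG', 1992, 'USA'),
                          ('GZA', 1990, 'USA');
CREATE VIEW artist_view AS SELECT name, year FROM artist;
""")

# The application is written against the view only.
APP_QUERY = "SELECT name, year FROM artist_view ORDER BY name"
before = conn.execute(APP_QUERY).fetchall()

# The logical schema changes: new table, new column names, added id.
# Redefining the view hides all of this from the application.
conn.executescript("""
DROP VIEW artist_view;
CREATE TABLE artist_v2 (
    id        INTEGER PRIMARY KEY,
    full_name TEXT,
    formed    INTEGER,
    country   TEXT
);
INSERT INTO artist_v2 (full_name, formed, country)
    SELECT name, year, country FROM artist;
DROP TABLE artist;
CREATE VIEW artist_view AS
    SELECT full_name AS name, formed AS year FROM artist_v2;
""")

after = conn.execute(APP_QUERY).fetchall()
print(before == after)
```

Physical data independence works the same way one layer down: adding an index or changing the storage layout does not require any change to the table definitions or the queries.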
RELATIONAL MODEL
A relation is an unordered set that contains the
relationship of attributes that represent entities.
A tuple is a set of attribute values (aka its
domain) in the relation.
→ Values are (normally) atomic/scalar.
→ The special value NULL is a member of every
domain (if allowed).
n-ary Relation = Table with n columns
Artist(name, year, country)
name           year  country
Wu-Tang Clan   1992  USA
Notorious BIG  1992  USA
GZA            1990  USA
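The definitions above can be mirrored in a few lines of Python, as a sketch that models the Artist relation as a `set` of named tuples. Using a Python `set` is an assumption chosen to make the "unordered, no duplicates" property visible; a real DBMS stores relations very differently.

```python
from collections import namedtuple

# Schema of the relation: Artist(name, year, country).
Artist = namedtuple("Artist", ["name", "year", "country"])

# A relation is a set of tuples: no inherent order, no duplicates.
artist = {
    Artist("Wu-Tang Clan", 1992, "USA"),
    Artist("Notorious BIG", 1992, "USA"),
    Artist("GZA", 1990, "USA"),
}

# Adding a tuple that is already present does not change the relation.
artist.add(Artist("GZA", 1990, "USA"))

# Listing the same tuples in another order yields the same relation.
same_relation = {
    Artist("GZA", 1990, "USA"),
    Artist("Wu-Tang Clan", 1992, "USA"),
    Artist("Notorious BIG", 1992, "USA"),
}

# n-ary relation = table with n columns; here n is 3.
arity = len(Artist._fields)

print(len(artist), artist == same_relation, arity)
```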
internal primary key if a table does […]
101  Wu-Tang Clan  1992  USA
RELATIONAL ALGEBRA
Fundamental operations to retrieve and manipulate
tuples in a relation.
→ Based on set algebra (unordered lists with no duplicates).
Each operator takes one or more relations as its
inputs and outputs a new relation.
→ We can “chain” operators together to create more
complex operations.
Operators:
σ Select
π Projection
∪ Union
∩ Intersection
– Difference
× Product
⋈ Join
(SELECT * FROM R)
INTERSECT
(SELECT * FROM S);
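To make the operators concrete, here is a minimal Python sketch in which a relation is a pair of attribute names and a set of value tuples. Every operator takes relations and returns a new relation, so calls can be chained. The `Relation` type, the function names, and the sample data for R and S are all assumptions for illustration, not part of any DBMS API.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    attrs: tuple      # attribute names, in positional order
    rows: frozenset   # set of value tuples: unordered, no duplicates


def relation(attrs, rows):
    return Relation(tuple(attrs), frozenset(tuple(r) for r in rows))


def select(r, pred):
    """σ: keep the tuples for which pred(attribute dict) is true."""
    keep = (t for t in r.rows if pred(dict(zip(r.attrs, t))))
    return Relation(r.attrs, frozenset(keep))


def project(r, attrs):
    """π: keep only the named attributes; duplicates collapse."""
    idx = [r.attrs.index(a) for a in attrs]
    return Relation(tuple(attrs), frozenset(tuple(t[i] for i in idx) for t in r.rows))


def union(r, s):
    assert r.attrs == s.attrs, "schemas must match"
    return Relation(r.attrs, r.rows | s.rows)


def intersection(r, s):
    assert r.attrs == s.attrs, "schemas must match"
    return Relation(r.attrs, r.rows & s.rows)


def difference(r, s):
    assert r.attrs == s.attrs, "schemas must match"
    return Relation(r.attrs, r.rows - s.rows)


def product(r, s):
    """×: every pairing of tuples; this sketch requires disjoint names."""
    assert not set(r.attrs) & set(s.attrs), "attribute names must differ"
    return Relation(r.attrs + s.attrs, frozenset(t + u for t in r.rows for u in s.rows))


def join(r, s):
    """⋈ (natural join): pair tuples that agree on all shared attributes."""
    shared = [a for a in r.attrs if a in s.attrs]
    s_only = [a for a in s.attrs if a not in shared]
    rows = set()
    for t in r.rows:
        td = dict(zip(r.attrs, t))
        for u in s.rows:
            ud = dict(zip(s.attrs, u))
            if all(td[a] == ud[a] for a in shared):
                rows.add(t + tuple(ud[a] for a in s_only))
    return Relation(r.attrs + tuple(s_only), frozenset(rows))


# Illustrative data only.
R = relation(["a_id", "b_id"], [("a1", 101), ("a2", 102), ("a3", 103)])
S = relation(["a_id", "b_id"], [("a3", 103), ("a4", 104), ("a5", 105)])

# Chaining: project the b_id column out of the selected tuples.
chained = project(select(R, lambda t: t["b_id"] >= 102), ["b_id"])
```

With identical schemas, the natural join of R and S matches on every attribute, so `join(R, S)` returns the same rows as `intersection(R, S)`.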
OBSERVATION
Relational algebra defines an ordering of the high-
level steps of how to compute a query.
→ Example: σ_{b_id=102}(R⋈S) vs. (R⋈(σ_{b_id=102}(S)))
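The two orderings in the example return the same tuples but do different amounts of work. The sketch below evaluates both in plain Python and records the size of the intermediate result each plan produces. The schemas R(a_id, b_id) and S(b_id, val), the generated data, and the helper names are assumptions for illustration.

```python
def join_on_b_id(r_rows, s_rows):
    """Nested-loop natural join of R(a_id, b_id) with S(b_id, val)."""
    return {
        (a_id, b_id, val)
        for (a_id, b_id) in r_rows
        for (s_b_id, val) in s_rows
        if b_id == s_b_id
    }


def select_eq(rows, pos, value):
    """σ: keep tuples whose attribute at position pos equals value."""
    return {t for t in rows if t[pos] == value}


# Made-up data: 1000 tuples in R, 10 in S, b_id values 100..109.
R = {(f"a{i}", 100 + i % 10) for i in range(1000)}
S = {(100 + j, f"v{j}") for j in range(10)}

# Plan A: σ_{b_id=102}(R ⋈ S), join everything and filter afterwards.
joined = join_on_b_id(R, S)
plan_a = select_eq(joined, pos=1, value=102)

# Plan B: R ⋈ σ_{b_id=102}(S), filter S first and join the remainder.
filtered_s = select_eq(S, pos=0, value=102)
plan_b = join_on_b_id(R, filtered_s)

print(plan_a == plan_b)               # same answer either way
print(len(joined), len(filtered_s))   # intermediate result sizes
```

Plan A materializes every joined tuple before discarding most of them, while Plan B carries a single S tuple into the join. A declarative language leaves this choice to the DBMS.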
DATA MODELS
Relational ← This Course
Key/Value
Graph
Document / JSON / XML / Object ← Leading Alternative
Wide-Column / Column-family
Array (Vector, Matrix, Tensor) ← New Hotness
Hierarchical
Network
Semantic
Entity-Relationship
Artist R1(id,…)
⨝
ArtistAlbum R2(artist_id,album_id)
⨝
Album R3(id,…)
Application Code
class Artist {
  int id;
  String name;
  int year;
  Album albums[];
}
class Album {
  int id;
  String name;
  int year;
}
Document
{
  "name": "GZA",
  "year": 1990,
  "albums": [
    {
      "name": "Liquid Swords",
      "year": 1995
    },
    {
      "name": "Beneath the Surface",
      "year": 1999
    }
  ]
}
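The relationship between the three relations (Artist, ArtistAlbum, Album) and the nested document can be sketched in Python: the function below performs the two joins by hand and embeds the matching albums inside the artist. The `id` values and the helper name `to_document` are hypothetical; the names and years match the document shown in the slides.

```python
import json

# Relational form: three relations linked by ids (ids are made up).
artists = [{"id": 1, "name": "GZA", "year": 1990}]
albums = [
    {"id": 10, "name": "Liquid Swords", "year": 1995},
    {"id": 11, "name": "Beneath the Surface", "year": 1999},
]
artist_album = [
    {"artist_id": 1, "album_id": 10},
    {"artist_id": 1, "album_id": 11},
]


def to_document(artist):
    """Embed an artist's albums in one nested document (Artist ⋈ ArtistAlbum ⋈ Album)."""
    album_ids = {
        link["album_id"]
        for link in artist_album
        if link["artist_id"] == artist["id"]
    }
    nested = [
        {"name": a["name"], "year": a["year"]}
        for a in albums
        if a["id"] in album_ids
    ]
    return {"name": artist["name"], "year": artist["year"], "albums": nested}


doc = to_document(artists[0])
print(json.dumps(doc, indent=2))
```

The document form maps directly onto the application's `Artist` object, at the cost of duplicating album data if an album belongs to more than one artist.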
[Figure: Album tuples (e.g., 22, St.Ides Mix Tape, 1994) are
passed through a Transformer to produce embeddings
(Id3 → [0.01, 0.18, 0.85, ...]), which are stored in a
vector index (HNSW, IVFFlat; e.g., Meta Faiss, Spotify Annoy).
Query: Find albums similar to "Liquid Swords".]
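The "find albums similar to" query can be illustrated with an exhaustive nearest-neighbor search over embeddings. This sketch compares the query vector against every stored vector using cosine similarity; it is not HNSW or IVFFlat, which are approximate indexes that avoid scanning everything. The three-dimensional vectors are invented for the example and were not produced by a real model.

```python
import math

# Hypothetical embeddings; a real system would get these from a model.
embeddings = {
    "Liquid Swords": [0.90, 0.10, 0.20],
    "Beneath the Surface": [0.80, 0.20, 0.30],
    "St.Ides Mix Tape": [0.01, 0.18, 0.85],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(title, k=2):
    """Rank every other album by similarity to the given one (brute force)."""
    query = embeddings[title]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in embeddings.items()
        if other != title
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


ranking = most_similar("Liquid Swords")
print(ranking)
```

A vector index trades a small loss in accuracy for not having to compute a similarity score against every stored embedding.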
CONCLUSION
Databases are ubiquitous.
NEXT CLASS
Modern SQL
→ Make sure you understand basic SQL before the lecture.