DBMS Unit 2 (1)

The document discusses formal query languages, specifically relational algebra and calculus, which serve as the foundation for database query languages like SQL. It outlines the basic operators of relational algebra, including selection, projection, and joins, while explaining their functionalities and applications. Additionally, it highlights the importance of query optimization and the expressive power of these query languages in efficiently accessing and manipulating data.

Uploaded by

khushi saxena
Copyright © All Rights Reserved

Relational Algebra and Calculus

Topics

• Formal query languages
• Preliminaries
• Relational algebra
• Relational calculus
• Expressive power of algebra and calculus

Relational Query Languages

• Relational model supports simple, powerful query languages
  - Allow manipulation and retrieval of data from a database
  - Allow for much optimization
  - Strong formal foundation based on logic
• Query languages ≠ programming languages
  - Query languages are not expected to be “Turing complete”
  - Query languages are not intended to be used for complex calculations
  - Query languages support easy, efficient access to large data sets

Formal Relational Query Languages

• Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for implementation
  - Relational Algebra
    - Describes a step-by-step procedure for computing the desired answer
    - Operational, useful for representing execution plans
  - Relational Calculus
    - Describes the desired answer, rather than how to compute it
    - Non-operational, declarative

Preliminaries

• A query is applied to relation instances, and the result of a query is also a relation instance
  - Schemas of input relations for a query are fixed
  - The schema for the result of a given query is also fixed
• Positional vs. named-field notation
  - Positional notation is easier for formal definitions; named-field notation is more readable
  - Both are used in SQL

Relational Algebra

• Selection
• Projection
• Set operations
• Renaming
• Joins
• Division

Operators

• Basic operators
  - Selection (σ): select a subset of rows from a relation
  - Projection (π): delete unwanted columns from a relation
  - Cross-product (×): combine two relations
  - Set-difference (−): tuples in relation 1 but not in relation 2
  - Union (∪): tuples in relation 1 or in relation 2 (or both)
• Additional operators
  - Intersection (∩), join (⋈), division (/), renaming (ρ)
  - Not essential, but very useful

Operators (Cont.)

• Each operator accepts relation instance(s) as arguments, and returns a relation instance as result
• Algebra expression
  - Composed of operators
  - Describes a procedure for computing the desired answer
  - Used by relational systems to represent query evaluation plans

Example Instances

Boats (bid: integer, bname: string, color: string)
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: date)

Instance S1 of Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Instance S2 of Sailors
sid  sname   rating  age
28   Yuppy   9       35.0
31   Lubber  8       55.5
44   guppy   5       35.0
58   Rusty   10      35.0

Instance R1 of Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

Projection π

• Deletes attributes that are not in the projection list
• Schema of result contains exactly the fields in the projection list, with the same names that they had in the single input relation
• Projection operator has to eliminate duplicates!

π_{sname, rating}(S2)
sname   rating
Yuppy   9
Lubber  8
guppy   5
Rusty   10

π_{age}(S2)
age
35.0
55.5

Selection σ

• Selects rows that satisfy the selection condition
• No duplicates in result
• Schema of result is identical to schema of single input relation
• Result relation can be the input for another relational algebra operation (operator composition)

σ_{rating > 8}(S2)
sid  sname  rating  age
28   Yuppy  9       35.0
58   Rusty  10      35.0

π_{sname, rating}(σ_{rating > 8}(S2))
sname  rating
Yuppy  9
Rusty  10

Selection σ (Cont.)

• Selection condition
  - A Boolean combination (∧, ∨) of terms
  - A term has one of the forms:
    - attribute op constant, or,
    - attribute1 op attribute2
    * op is one of: <, ≤, =, ≠, ≥, >
  - Example
    - (rating ≥ 8) ∨ (age < 50)
    - (sid1 = sid2) ∧ (bid1 ≠ bid2)

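Selection and projection can be sketched in a few lines of Python. This is only an illustration, assuming a relation is modelled as a set of named tuples; the helper names `select` and `project` are made up for this sketch and do not come from any library.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", ["sid", "sname", "rating", "age"])

# Instance S2 of Sailors, copied from the example instances.
S2 = {
    Sailor(28, "Yuppy", 9, 35.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(44, "guppy", 5, 35.0),
    Sailor(58, "Rusty", 10, 35.0),
}

def select(relation, condition):
    """sigma: keep the rows for which the condition holds."""
    return {row for row in relation if condition(row)}

def project(relation, fields):
    """pi: keep only the listed fields; building a set removes duplicates."""
    return {tuple(getattr(row, f) for f in fields) for row in relation}

# Operator composition: the output of a selection feeds a projection.
high_rated = select(S2, lambda s: s.rating > 8)
names_and_ratings = project(high_rated, ["sname", "rating"])
ages = project(S2, ["age"])
```

In this sketch `ages` holds two rows even though S2 has four, because three sailors share the age 35.0 and the set keeps one copy.
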
Union, Intersection, Set-Difference

• These 3 operators take 2 input relations, which must be union-compatible:
  - Have the same number of fields
  - Corresponding fields have the same types
• Result schema
  - That of the first relation

S1 ∪ S2
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0
44   guppy   5       35.0
28   Yuppy   9       35.0

S1 − S2
sid  sname   rating  age
22   Dustin  7       45.0

S1 ∩ S2
sid  sname   rating  age
31   Lubber  8       55.5
58   Rusty   10      35.0

Cross-Product ×

• R × S = {<r, s> | r ∈ R, s ∈ S}
  - Each row of R is paired with each row of S
  - Result schema has one field per field of R and S, with field names inherited if possible
  - The result fields have the same domains as the corresponding fields in R and S
  - Naming conflict: R and S contain field(s) with the same name
    - The corresponding fields in R × S are unnamed and referred to only by position
    - E.g., both S1 and R1 have a field sid

Cross-Product × (Cont.)

S1 × R1
(sid)  sname   rating  age   (sid)  bid  day
22     Dustin  7       45.0  22     101  10/10/96
22     Dustin  7       45.0  58     103  11/12/96
31     Lubber  8       55.5  22     101  10/10/96
31     Lubber  8       55.5  58     103  11/12/96
58     Rusty   10      35.0  22     101  10/10/96
58     Rusty   10      35.0  58     103  11/12/96

Renaming ρ

• ρ (R(F), E)
  E: a relational algebra expression
  R: a new relation
  F: a list of fields that are renamed
  - Takes E and returns an instance of R
  - R has the same tuples as the result of E
  - R has the same schema as E, but some fields are renamed

ρ ( C(1 → sid1, 5 → sid2), S1 × R1 )
ρ ( C, S1 × R1 )
ρ ( (1 → sid1, 5 → sid2), S1 × R1 )

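The set operations and the cross-product map directly onto Python's built-in sets. The sketch below assumes rows are plain tuples in positional notation, which also sidesteps the sid naming conflict because fields are referred to only by position.

```python
from itertools import product

# Rows are positional tuples: (sid, sname, rating, age) and (sid, bid, day).
S1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
S2 = {(28, "Yuppy", 9, 35.0), (31, "Lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "Rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

# S1 and S2 are union-compatible: same number of fields, same types.
union = S1 | S2
difference = S1 - S2
intersection = S1 & S2

# Cross-product: every row of S1 concatenated with every row of R1.
cross = {s + r for s, r in product(S1, R1)}
```

With these instances `cross` has 3 × 2 = 6 rows of 7 fields each, and fields 1 and 5 (counting from 1) both hold a sid.
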
Joins

• One of the most useful operations in relational algebra
• The most common way to combine information from two or more relations
• Defined as a cross-product followed by selections and projections
• Never has more tuples than the cross-product, and usually has far fewer
• Condition join, equijoin, natural join, etc.

Joins (Cont.)

• Condition Join
  R ⋈_c S = σ_c (R × S)
  - c: join condition; may refer to the attributes of both R and S
  - Result schema is same as that of cross-product
  - Result usually has fewer tuples than cross-product; might be able to compute more efficiently

S1 ⋈_{S1.sid < R1.sid} R1
(sid)  sname   rating  age   (sid)  bid  day
22     Dustin  7       45.0  58     103  11/12/96
31     Lubber  8       55.5  58     103  11/12/96

Joins (Cont.)

• Equijoin: a special case of condition join where the condition c contains only equalities
  - Equality is of form: R.name1 = S.name2
  - Result schema is similar to cross-product, but only one copy of fields for which equality is specified
• Natural Join: equijoin on all common fields

S1 ⋈ R1, or, S1 ⋈_{S1.sid = R1.sid} R1
sid  sname   rating  age   bid  day
22   Dustin  7       45.0  101  10/10/96
58   Rusty   10      35.0  103  11/12/96

Division

• Not a primitive operator, but useful for expressing queries like: “Find sailors who have reserved all boats”
• Let A have 2 fields, x & y; B have only field y
  A/B = { 〈x〉 | 〈x〉 ∈ π_x(A) ∧ ∀ 〈y〉 ∈ B ( 〈x, y〉 ∈ A ) }
  - i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A (reserves), or,
  - if the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, then x value is in A/B
• In general, x and y can be any lists of fields; y is the list of fields in B, and x∪y is the list of fields of A

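The definition of a condition join as a selection over a cross-product can be written almost literally in Python. This is a nested-loop sketch, assuming rows are dicts keyed by field name; `condition_join` and `natural_join` are illustrative names, and a real system would usually pick a more efficient join algorithm.

```python
from itertools import product

S1 = [
    {"sid": 22, "sname": "Dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "Lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "Rusty", "rating": 10, "age": 35.0},
]
R1 = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]

def condition_join(r, s, cond):
    """sigma_c(R x S): keep the pairs (a, b) for which cond(a, b) holds."""
    return [(a, b) for a, b in product(r, s) if cond(a, b)]

def natural_join(r, s):
    """Equijoin on all common fields, keeping one copy of each common field."""
    result = []
    for a, b in product(r, s):
        common = a.keys() & b.keys()
        if all(a[f] == b[f] for f in common):
            result.append({**a, **b})  # shared fields collapse into one key
    return result

less_than = condition_join(S1, R1, lambda a, b: a["sid"] < b["sid"])
natural = natural_join(S1, R1)
```

The condition join keeps both sid columns (as a pair of rows), while the natural join merges them into a single sid field, matching the result schemas described above.
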
Division (Cont.)

A
S#  P#
S1  P1
S1  P2
S1  P3
S1  P4
S2  P1
S2  P2
S3  P2
S4  P2
S4  P4

B1      B2      B3
P#      P#      P#
P2      P2      P1
        P4      P2
                P4

A/B1    A/B2    A/B3
S#      S#      S#
S1      S1      S1
S2      S4
S3
S4

Division (Cont.)

• Division is not an essential operation; just a useful shorthand
  - (Also true of joins, but joins are so common that systems implement joins specially)
• Expressing division using basic operators
  - Idea: for A/B, compute all x values that are not “disqualified” by some y value in B
  - x value is disqualified if: by attaching y value from B, we obtain an xy tuple that is not in A

Disqualified x values: π_x((π_x(A) × B) − A)
A / B: π_x(A) − all disqualified tuples

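The rewriting of division into basic operators can be checked against the example tables. The sketch below assumes A has exactly the two fields (x, y) and B the single field y, with rows as tuples; `divide` is a name chosen for this illustration.

```python
from itertools import product

A = {("S1", "P1"), ("S1", "P2"), ("S1", "P3"), ("S1", "P4"),
     ("S2", "P1"), ("S2", "P2"),
     ("S3", "P2"),
     ("S4", "P2"), ("S4", "P4")}
B1 = {("P2",)}
B2 = {("P2",), ("P4",)}
B3 = {("P1",), ("P2",), ("P4",)}

def divide(a, b):
    """A/B computed as pi_x(A) - pi_x((pi_x(A) x B) - A)."""
    xs = {(x,) for x, _ in a}                         # pi_x(A)
    candidates = {x + y for x, y in product(xs, b)}   # pi_x(A) x B
    disqualified = {(x,) for x, _ in candidates - a}  # x values missing some y
    return xs - disqualified

result_b1 = divide(A, B1)
result_b2 = divide(A, B2)
result_b3 = divide(A, B3)
```

For instance, S2 is disqualified for B2 because attaching P4 yields (S2, P4), which is not in A.
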

Examples

• Find the names of sailors who have reserved boat #103

Solution 1: π_{sname}((σ_{bid=103} Reserves) ⋈ Sailors)

Solution 2: ρ (Temp1, σ_{bid=103} Reserves)
            ρ (Temp2, Temp1 ⋈ Sailors)
            π_{sname}(Temp2)

Solution 3: π_{sname}(σ_{bid=103}(Reserves ⋈ Sailors))

Examples (Cont.)

• Find the names of sailors who have reserved a red boat
  - Information about boat color is only available in Boats; so need an extra join with Boats

Solution 1: π_{sname}((σ_{color='red'} Boats) ⋈ Reserves ⋈ Sailors)

Solution 2 (more efficient): π_{sname}(π_{sid}((π_{bid} σ_{color='red'} Boats) ⋈ Reserves) ⋈ Sailors)

A query optimizer can find the second solution, given the first one!

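Solutions 1 and 3 for the boat #103 query differ only in whether the selection runs before or after the join. The sketch below compares the two orders, assuming instance S1 stands in for Sailors and R1 for Reserves; `join_on_sid` is a hypothetical helper written for this example.

```python
# Assumption: S1 and R1 from the example instances play the roles of
# Sailors and Reserves.
SAILORS = [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)]
RESERVES = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

def join_on_sid(reserves, sailors):
    """Natural join of Reserves and Sailors on sid, their only common field."""
    return [(sid, bid, day, sname, rating, age)
            for (sid, bid, day) in reserves
            for (sid2, sname, rating, age) in sailors
            if sid == sid2]

# Solution 1: select bid = 103 first, then join the smaller input.
selected = [r for r in RESERVES if r[1] == 103]
solution1 = {row[3] for row in join_on_sid(selected, SAILORS)}

# Solution 3: join everything first, then select bid = 103.
joined = join_on_sid(RESERVES, SAILORS)
solution3 = {row[3] for row in joined if row[1] == 103}
```

Both orders give the same answer, but Solution 1 feeds only one Reserves row into the join, which is the kind of saving a query optimizer looks for.
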
Examples (Cont.)

• Find the names of sailors who have reserved a red or a green boat
  - Identify all red or green boats, then find sailors who have reserved one of these boats

ρ (Tempboats, (σ_{color='red' ∨ color='green'} Boats))
π_{sname}(Tempboats ⋈ Reserves ⋈ Sailors)

• Tempboats can also be defined using union
• What if “∨” is replaced by “∧” in this query?

Examples (Cont.)

• Find the names of sailors who have reserved a red and a green boat
  - Identify sailors who have reserved red boats, sailors who have reserved green boats, and then, find the intersection
  - Note that sid is a key for Sailors

ρ (Tempred, π_{sid}((σ_{color='red'} Boats) ⋈ Reserves))
ρ (Tempgreen, π_{sid}((σ_{color='green'} Boats) ⋈ Reserves))
π_{sname}((Tempred ∩ Tempgreen) ⋈ Sailors)

Examples (Cont.)

• Find the names of sailors who have reserved all boats
  - Uses division; schemas of the input relations to the division (/) must be carefully chosen

ρ (Tempsids, (π_{sid, bid} Reserves) / (π_{bid} Boats))
π_{sname}(Tempsids ⋈ Sailors)

• Find the names of sailors who have reserved all ‘Interlake’ boats

..... / π_{bid}(σ_{bname='Interlake'} Boats)

Summary

• The relational model has rigorously defined query languages that are simple and powerful
• Relational algebra is more operational; useful as internal representation for query evaluation plans
• There might be several ways of expressing a given query; a query optimizer should choose the most efficient version

Relational Calculus

• Domain relational calculus
• Tuple relational calculus

Relational Calculus

• Two variants of relational calculus
  - Tuple relational calculus (TRC): influenced SQL
  - Domain relational calculus (DRC): influenced QBE
• Calculus has variables, constants, comparison operators, logical connectives and quantifiers
  - TRC: variables range over tuples
  - DRC: variables range over domain elements (= field values)
  - Both TRC and DRC are simple subsets of first-order logic
• Calculus expressions are called formulas
• An answer tuple is an assignment of constants to variables that makes the formula evaluate to true

Domain Relational Calculus

• DRC query has the form
  { 〈x1, x2, …, xn〉 | p ( 〈x1, x2, …, xn〉 ) }
  - The answer to the query includes all tuples 〈x1, x2, …, xn〉 that make the formula p ( 〈x1, x2, …, xn〉 ) be true
  - DRC formula is recursively defined, starting with simple atomic formulas, and building bigger and better formulas using the logical connectives
  - Example: find all sailors with a rating above 7
    { 〈I, N, T, A〉 | 〈I, N, T, A〉 ∈ Sailors ∧ T > 7 }

Domain Relational Calculus (Cont.)

• DRC atomic formula
  - 〈x1, x2, …, xn〉 ∈ Rname, or,
  - X op Y, or,
  - X op constant
  * Rname is relation name; X, Y are domain variables; op is one of <, >, =, ≤, ≥, ≠
• DRC formula
  - an atomic formula, or,
  - ¬ p, p ∧ q, p ∨ q, where p and q are formulas, or,
  - ∃X (p(X)), where variable X is free in p(X), or,
  - ∀X (p(X)), where variable X is free in p(X)

Domain Relational Calculus (Cont.)

• Free and bound variables
  - ∃ and ∀ are quantifiers
  - The use of ∃X and ∀X is said to bind X
  - A variable that is not bound is free
• An important restriction on the definition of a DRC query
  { 〈x1, x2, …, xn〉 | p ( 〈x1, x2, …, xn〉 ) }
  - The variables x1, x2, …, xn that appear to the left of ‘|’ must be the only free variables in the formula p ( … )

DRC Query Examples

• Find all sailors with a rating above 7
  { 〈I, N, T, A〉 | 〈I, N, T, A〉 ∈ Sailors ∧ T > 7 }
  - The condition 〈I, N, T, A〉 ∈ Sailors ensures that the domain variables I, N, T and A are bound to fields of the same Sailors tuple
  - “|” should be read as “such that”
  - The term 〈I, N, T, A〉 to the left of “|” says that every tuple 〈I, N, T, A〉 that satisfies T > 7 is in the answer

DRC Query Examples (Cont.)

• Find the names of sailors rated > 7 who have reserved boat #103
  - ∃ Ir, Br, D (…) is a shorthand for ∃Ir (∃Br (∃D (…)))
  - ∃: to find a tuple in Reserves that “joins with” the Sailors tuple under consideration

  { 〈N〉 | ∃ I, T, A ( 〈I, N, T, A〉 ∈ Sailors ∧ T > 7
          ∧ ∃ Ir, Br, D ( 〈Ir, Br, D〉 ∈ Reserves ∧ Ir = I ∧ Br = 103 ) ) }

  { 〈N〉 | ∃ I, T, A ( 〈I, N, T, A〉 ∈ Sailors ∧ T > 7
          ∧ ∃ 〈Ir, Br, D〉 ∈ Reserves ( Ir = I ∧ Br = 103 ) ) }

DRC Query Examples (Cont.)

• Find sailors rated > 7 who have reserved a red boat
  - The parentheses control the scope of each quantifier’s binding

  { 〈I, N, T, A〉 | 〈I, N, T, A〉 ∈ Sailors ∧ T > 7 ∧
      ∃ Ir, Br, D ( 〈Ir, Br, D〉 ∈ Reserves ∧ Ir = I ∧
        ∃ B, BN, C ( 〈B, BN, C〉 ∈ Boats ∧ B = Br ∧ C = ‘red’ ) ) }

  { 〈I, N, T, A〉 | 〈I, N, T, A〉 ∈ Sailors ∧ T > 7 ∧
      ∃ 〈I, Br, D〉 ∈ Reserves ∧
      ∃ 〈Br, BN, ‘red’〉 ∈ Boats }

DRC Query Examples (Cont.)

• Find the names of sailors who have reserved all boats (solution 1)
  - Find all sailors 〈I, N, T, A〉 such that: for each 3-tuple 〈B, BN, C〉, either it is not a tuple in Boats, or there is a tuple in Reserves showing that sailor I has reserved B

  { 〈N〉 | ∃ I, T, A ( 〈I, N, T, A〉 ∈ Sailors ∧
      ∀ B, BN, C ( ¬ ( 〈B, BN, C〉 ∈ Boats ) ∨
        ( ∃ 〈Ir, Br, D〉 ∈ Reserves ( Ir = I ∧ Br = B ) ) ) ) }

DRC Query Examples (Cont.)

• Find the names of sailors who have reserved all boats (solution 2)
  - Simpler notation, same query (much clearer!)

  { 〈N〉 | ∃ I, T, A ( 〈I, N, T, A〉 ∈ Sailors ∧
      ∀ 〈B, BN, C〉 ∈ Boats
        ( ∃ 〈Ir, Br, D〉 ∈ Reserves ( Ir = I ∧ Br = B ) ) ) }

• To find the names of sailors who have reserved all red boats

  { 〈N〉 | ∃ I, T, A ( 〈I, N, T, A〉 ∈ Sailors ∧
      ∀ 〈B, BN, C〉 ∈ Boats ( C ≠ ‘red’ ∨
        ∃ 〈Ir, Br, D〉 ∈ Reserves ( Ir = I ∧ Br = B ) ) ) }

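The quantifiers in the "all boats" queries correspond closely to Python's `all` and `any`. The sketch below is illustrative only: the slides give no Boats instance, so the boats, the boat names, and the extra reservation are hypothetical data, and `reserved` is a helper invented for this example.

```python
# Hypothetical instances: the Boats rows and the second reservation by
# sailor 22 are made up for this sketch.
SAILORS = {(22, "Dustin", 7, 45.0), (58, "Rusty", 10, 35.0)}
BOATS = {(101, "Interlake", "blue"), (103, "Clipper", "red")}
RESERVES = {(22, 101, "10/10/96"), (22, 103, "10/11/96"), (58, 103, "11/12/96")}

def reserved(sid, bid):
    """Exists <Ir, Br, D> in Reserves with Ir = sid and Br = bid."""
    return any(ir == sid and br == bid for ir, br, _ in RESERVES)

# For every boat in Boats there is a matching reservation (solution 2).
all_boats = {n for i, n, t, a in SAILORS
             if all(reserved(i, b) for b, bn, c in BOATS)}

# For every boat: it is not red, or there is a matching reservation.
all_red_boats = {n for i, n, t, a in SAILORS
                 if all(c != "red" or reserved(i, b) for b, bn, c in BOATS)}
```

With this data only Dustin has reserved every boat, while both sailors have reserved the single red boat.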

Tuple Relational Calculus

• TRC query has the form
  { T | p (T) }
  - T is a tuple variable that takes on tuples of a relation as values
  - p(T) is a formula describing T
  - The answer to the query is the set of all tuples t that make p(T) be true when T = t
  - TRC formula is recursively defined
  - Example: find all sailors with a rating above 7
    { S | S ∈ Sailors ∧ S.rating > 7 }

Tuple Relational Calculus (Cont.)

• TRC atomic formula
  - R ∈ Rname, or,
  - R.a op S.b, or,
  - R.a op constant
  * Rname is relation name; R, S are tuple variables; a is an attribute of R, b is an attribute of S; op is one of <, >, =, ≤, ≥, ≠
• TRC formula
  - an atomic formula, or,
  - ¬ p, p ∧ q, p ∨ q, where p and q are formulas, or,
  - ∃R (p(R)), where R is a tuple variable, or,
  - ∀R (p(R)), where R is a tuple variable

TRC Query Examples

• Find the names and ages of sailors with a rating above 7
  { P | ∃ S ∈ Sailors ( S.rating > 7 ∧ P.name = S.sname ∧ P.age = S.age ) }
  - P is a tuple variable with two fields: name and age
  - P.name = S.sname and P.age = S.age give values to the fields of an answer tuple P
  * If a variable R does not appear in an atomic formula of the form R ∈ Rname, the type of R is a tuple whose fields include all (and only) fields of R that appear in the formula

TRC Query Examples (Cont.)

• Find the names of sailors who have reserved all boats
  { P | ∃ S ∈ Sailors ∀ B ∈ Boats
      ( ∃ R ∈ Reserves ( S.sid = R.sid ∧ R.bid = B.bid
        ∧ P.sname = S.sname ) ) }

• Find sailors who have reserved all red boats
  { S | S ∈ Sailors ∧ ∀ B ∈ Boats
      ( B.color ≠ ‘red’ ∨ ( ∃ R ∈ Reserves
        ( S.sid = R.sid ∧ R.bid = B.bid ) ) ) }

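A TRC query reads much like a set comprehension over tuple variables. As a rough sketch, assuming instance S2 stands in for Sailors, the names-and-ages query can be written as follows; the `Answer` type plays the role of P, whose fields are exactly those mentioned in the formula.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", ["sid", "sname", "rating", "age"])
Answer = namedtuple("Answer", ["name", "age"])  # the type of the variable P

# Assumption: instance S2 is used as the Sailors relation.
SAILORS = [
    Sailor(28, "Yuppy", 9, 35.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(44, "guppy", 5, 35.0),
    Sailor(58, "Rusty", 10, 35.0),
]

# { P | exists S in Sailors (S.rating > 7 and P.name = S.sname and P.age = S.age) }
answer = {Answer(name=s.sname, age=s.age) for s in SAILORS if s.rating > 7}
```

The comprehension states which tuples belong to the answer without fixing an evaluation order, which is the declarative reading of the calculus.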

Expressive Power of Algebra and Calculus

• Unsafe query
  - A syntactically correct calculus query that has an infinite number of answers
  - E.g., { S | ¬ ( S ∈ Sailors ) }
• Every query that can be expressed in relational algebra can be expressed as a safe query in DRC / TRC; the converse is also true
• Relational Completeness
  - Query language (e.g., SQL) can express every query that is expressible in relational algebra
• In addition, commercial query languages can express some queries that cannot be expressed in relational algebra

Summary

• Relational calculus is non-operational, and users define queries in terms of what they want, not in terms of how to compute it (declarativeness)
• Algebra and safe calculus have same expressive power, leading to the notion of relational completeness
