Relational Algebra
Join
Sanghita Bhattacharjee
Department of CSE
NIT Durgapur
References
• A Silberschatz, H F Korth and S Sudarshan, Database System Concepts, 5th Edition, 2006
• Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 3rd
Edition, Addison Wesley, 2000
• Video lectures:
(i) Database Management System by Prof. Partha Pratim Das
(ii) Introduction to database systems by Prof. P. Sreenivasa Kumar
(iii) Online DBMS tutorials
Join
• Binary operator
• Denoted by ⋈
• Join combines related tuples from two relations into single tuples
• Join is useful because it lets us process the relationships among relations
• Cartesian Product:
• All combinations of tuples are included in the result
• Only certain tuples in the result are meaningful
• Useful when a selection follows the Cartesian Product
• Join: only the combinations of tuples satisfying the join condition c appear in the
result
R ⋈_c S = σ_c (R × S)
Example of Join
Employee                  HOD
EID  Ename  Dept          HODID  Dept  Year
1    Smith  CS            1      CS    2020
2    David  IT            2      IT    2019
3    John   IT            4      EE    2019
4    Virat  EE
Suppose we want to retrieve the names of the HODs of the various departments. The name (Ename) is stored
only in Employee, so we have to join the two relations to retrieve it.
Here, the set of HODID values is a proper subset of the set of EID values.
π_Ename (Employee ⋈_{Employee.EID = HOD.HODID} HOD)
HODID is a foreign key (FK) referencing EID, the primary key (PK) of Employee. Referential integrity guarantees
that every HODID value matches an existing EID value, so the two join attributes stay consistent.
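The HOD query above can be sketched in Python as a selection over the Cartesian product followed by a projection. This is only an illustration: storing each relation as a list of dictionaries is an assumption, and the helper name `join` is hypothetical, not part of any library.

```python
from itertools import product

# Relations from the example, stored as lists of dicts (an assumed representation).
employee = [
    {"EID": 1, "Ename": "Smith", "Dept": "CS"},
    {"EID": 2, "Ename": "David", "Dept": "IT"},
    {"EID": 3, "Ename": "John", "Dept": "IT"},
    {"EID": 4, "Ename": "Virat", "Dept": "EE"},
]
hod = [
    {"HODID": 1, "Dept": "CS", "Year": 2020},
    {"HODID": 2, "Dept": "IT", "Year": 2019},
    {"HODID": 4, "Dept": "EE", "Year": 2019},
]

def join(r, s, condition):
    """Keep the pairs (t, u) of the Cartesian product r x s that satisfy the condition."""
    return [(t, u) for t, u in product(r, s) if condition(t, u)]

# Employee joined with HOD on Employee.EID = HOD.HODID, then projected on Ename.
pairs = join(employee, hod, lambda e, h: e["EID"] == h["HODID"])
hod_names = sorted({e["Ename"] for e, _ in pairs})
print(hod_names)  # ['David', 'Smith', 'Virat']
```

John (EID 3) does not appear because no HOD tuple has HODID 3.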
Types of Join
• Types of Join: Inner Join and Outer Join ( See later)
• Inner Join:
Theta join
Equi join
Natural join ( a variant of inner join)
• Outer join:
Left outer
Right outer
Full outer
Theta Join
• Let R(A1, A2, …, An) and S(B1, B2, …, Bm) be relations with no attribute names in
common (R ∩ S = ∅). The theta join combines the tuples of R and S that satisfy the
join condition Θ into a relation with (n + m) attributes
• Theta join of R and S: R ⋈_Θ S
The join condition Θ has the form c1 AND c2 AND c3 AND …
Each condition ci has the form Ai θ Bj, where Ai is an attribute of R, Bj is an attribute of
S, and Ai and Bj have the same domain
θ ∈ {<, >, ≤, ≥, =, ≠}
Theta Join Example
R                         S
SID  Sname  Section       Class  Course
101  Alex   3             2      CS01
102  Rohit  2             2      PH01
                          3      ME01
                          1      BIO01
Q = R ⋈_{R.Section = S.Class} S
SID Sname Section Class Course
101 Alex 3 3 ME01
102 Rohit 2 2 CS01
102 Rohit 2 2 PH01
• The result has fewer tuples than the cross-product, so it can often be computed more efficiently
• Tuples whose join attributes are null do not appear in the result. Thus join does
not preserve all the information of the input relations
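A minimal sketch of a theta join over the R and S relations of this example, assuming tuples are plain Python tuples and attributes are addressed by position. The function `theta_join` is a hypothetical helper written for this illustration.

```python
import operator
from itertools import product

R = [(101, "Alex", 3), (102, "Rohit", 2)]                  # (SID, Sname, Section)
S = [(2, "CS01"), (2, "PH01"), (3, "ME01"), (1, "BIO01")]  # (Class, Course)

def theta_join(r, s, i, theta, j):
    """Concatenate t in r and u in s whenever theta(t[i], u[j]) holds."""
    return [t + u for t, u in product(r, s) if theta(t[i], u[j])]

# R.Section = S.Class gives the three tuples of Q shown in the example.
q_eq = theta_join(R, S, 2, operator.eq, 0)

# Any other comparison operator is used the same way, e.g. R.Section < S.Class.
q_lt = theta_join(R, S, 2, operator.lt, 0)

print(q_eq)
print(q_lt)
```

Passing the comparison as a function keeps one implementation for all six operators in θ.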
Equi Join
• Equality is the only comparison operator used in the join condition
• Equi join is a special case of theta join in which θ is equality
Example: Enrolment (SID, Cno, Semester, Grade)
Course (CID, Cname, Credit, Hours)
Find the names and credits of the courses in which students have enrolled
T1 = π_{CID, Cname, Credit} (Course)
T2 = π_{SID, Cno} (Enrolment)
Resulting relation = π_{SID, Cname, Credit} (T1 ⋈_{T1.CID = T2.Cno} T2)
Equivalent SQL: SELECT DISTINCT SID, Cname, Credit FROM Course AS T1 INNER JOIN Enrolment AS T2 ON T1.CID =
T2.Cno
(DISTINCT is needed because relational algebra projection removes duplicate tuples, while SQL keeps them by default)
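The SQL form of this equi join can be tried with Python's built-in `sqlite3` module. This is a sketch only: the rows inserted below are hypothetical sample data, not taken from the slides, and DISTINCT is used so that the result matches the set semantics of relational algebra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (CID TEXT, Cname TEXT, Credit INTEGER, Hours INTEGER);
    CREATE TABLE Enrolment (SID INTEGER, Cno TEXT, Semester INTEGER, Grade TEXT);
    -- Hypothetical sample rows for illustration only.
    INSERT INTO Course VALUES ('CS01', 'Databases', 4, 40), ('PH01', 'Physics', 3, 30);
    INSERT INTO Enrolment VALUES (101, 'CS01', 3, 'A'), (102, 'CS01', 3, 'B'),
                                 (102, 'PH01', 3, 'A');
""")

# Equi join on Course.CID = Enrolment.Cno, projected on SID, Cname, Credit.
rows = conn.execute("""
    SELECT DISTINCT SID, Cname, Credit
    FROM Course AS T1 INNER JOIN Enrolment AS T2 ON T1.CID = T2.Cno
    ORDER BY SID, Cname
""").fetchall()
conn.close()

print(rows)  # [(101, 'Databases', 4), (102, 'Databases', 4), (102, 'Physics', 3)]
```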
Equi Join Example
R                         S
SID  Sname  Section       Class  Course
101  Alex   3             2      CS01
102  Rohit  2             2      PH01
                          3      ME01
                          1      BIO01
Q = R ⋈_{R.Section = S.Class} S
SID Sname Section Class Course
101 Alex 3 3 ME01
102 Rohit 2 2 CS01
102 Rohit 2 2 PH01
Why Natural Join
• The result of an equi join can contain one or more pairs of attributes that have
identical values in every tuple, because of the equality join condition
• In the previous slide, the values of the attributes Section and Class are identical
in every tuple of the resulting relation Q
• One attribute of each such pair is superfluous. Natural join was introduced to
remove these superfluous attributes from the result of an equi join
• The definition of natural join requires that the two join attributes (or each pair of
join attributes) have the same name in both relations
Natural Join
• Natural join has no explicit comparison operator: the condition is an implicit equality
on the common attributes. It does not concatenate tuples the way the Cartesian product does
• Natural join is performed on the attributes common to both relations. There must be
at least one, and the common attributes must have the same name and domain
• There can be a list of join attributes from each relation; each corresponding pair
must have the same name
• Suppose relations R and S have common attributes X1, X2, X3
• Join condition:
(R.X1 = S.X1) ∧ (R.X2 = S.X2) ∧ (R.X3 = S.X3)
that is, the values of all common attributes must be equal
• Schema of the result Q = R ∪ (S − {X1, X2, X3})
Only one copy of the common attributes is kept
• Notation for natural join: Q = R * S
Natural Join Example
R            S
A  B         B  C
X  Y         Z  U
X  Z         V  W
Y  Z         Z  V
Z  V

R * S
A  B  C
X  Z  U
X  Z  V
Y  Z  U
Y  Z  V
Z  V  W
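The natural join of this example can be reproduced with a short Python sketch. It assumes relations are lists of dictionaries keyed by attribute name; `natural_join` is a hypothetical helper, not a library function.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()           # attributes with the same name
            if all(t[a] == u[a] for a in common):  # equality on every common attribute
                merged = {**t, **u}                # one copy of each common attribute
                if merged not in result:           # relations are sets: no duplicates
                    result.append(merged)
    return result

R = [{"A": "X", "B": "Y"}, {"A": "X", "B": "Z"},
     {"A": "Y", "B": "Z"}, {"A": "Z", "B": "V"}]
S = [{"B": "Z", "C": "U"}, {"B": "V", "C": "W"}, {"B": "Z", "C": "V"}]

Q = natural_join(R, S)
for t in Q:
    print(t["A"], t["B"], t["C"])
```

The tuple (X, Y) of R has no partner in S, so it is lost in the result. When the two relations share no attribute, `common` is empty and every pair qualifies, so the function returns the Cartesian product.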
More on Natural Join
• If the joining attributes of the relations do not have the same name, a rename operation
is applied first and then the join
Q = R * (𝜌A (S))
• If the attributes on which the natural join is performed already have the same name in both
relations, renaming is unnecessary
• If no combination of tuples satisfies the join condition, the result of the join is an
empty relation with 0 tuples. If R has m tuples and S has n tuples, R * S has
between 0 and m*n tuples
• If there is no join condition, all combinations of tuples qualify and the join becomes a Cartesian Product
• The natural join or equi join can also be specified among multiple tables, leading
to an n-way join
((R *a S) *b Q)
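1. Let R (A, B, C) and S (B, D, E) be two relations on which the following functional dependencies hold:
B → A
A → C
R has 200 tuples and S has 100 tuples. What is the maximum size of the natural join
R * S ?
Answer: 100 tuples. B → A and A → C make B a key of R, so each tuple of S joins with at most one tuple of R.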
2. Consider two relations A (P, Q, R) and B ( R, S, T) with primary keys P and R
respectively. The relation A contains 200 tuples and B contains 250 tuples. What is
the maximum size of A * B ?
(i) 200 tuples
(ii) 250 tuples
(iii) 50000 tuples
(iv) 0 tuples
Q: What is the optimized version of the relational algebra expression
𝜋𝐴1 (𝜋𝐴2 (𝜎𝐹1 (𝜎𝐹2 (r) )))
where A1, A2 are sets of attributes in r with A1 ⊆ A2, and F1 and F2 are Boolean expressions
based on the attributes in r?
(i) 𝜋𝐴1 (𝜎𝐹1∧𝐹2 ( r ))
(ii) 𝜋𝐴1 (𝜎𝐹1∨𝐹2 ( r ))
(iii) 𝜋𝐴2 (𝜎𝐹1∧𝐹2 ( r ))
(iv) 𝜋𝐴2 (𝜎𝐹1∨𝐹2 ( r ))
Q. Which of the following query expressions are correct? r1, r2 are relations, c1, c2
are conditions and A1, A2 are attributes
(i ) 𝜎𝑐1 (𝜎𝑐2 ( r1)) → 𝜎𝑐2 (𝜎𝑐1 ( r1))
(ii) 𝜎𝑐1 ( r1 ∪ r2) → 𝜎𝑐1 (r1) ∪ 𝜎𝑐1 (r2)
(iii) 𝜎𝑐1 (𝜋𝐴1 ( r1)) → 𝜋𝐴1 (𝜎𝑐1 ( r1))
(iv) 𝜋𝐴1 (𝜎𝑐1 ( r1)) → 𝜎𝑐1 (𝜋𝐴1 ( r1))
Questions
• Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R * S ?
• Given R(A, B, C), S(D, E), what is R * S ?
• Given R(A, B), S(A, B), what is R * S ?
Complete Set of Operators
• {σ, π, ∪, −, ×} is known as a complete set of operators, because any
other relational algebra operation can be expressed as a sequence of operations
from this set
R ∩ S = (R ∪ S) − ((R − S) ∪ (S − R))
R ⋈_C S = σ_C (R × S)
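The intersection identity can be checked quickly with Python sets, which provide union (`|`), difference (`-`) and intersection (`&`). The sample sets are arbitrary, and a check on one instance is an illustration, not a proof.

```python
R = {1, 2, 3, 4}
S = {3, 4, 5}

# R ∩ S written with union and difference only, as in the identity above.
derived = (R | S) - ((R - S) | (S - R))

# A shorter expression that uses only difference also works.
shorter = R - (R - S)

print(derived, shorter, R & S)  # all three are {3, 4}
```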
Rename
• Unary operator
• Changes the schema, not the instance
• Notation: 𝜌 (B1,…,Bn) (R) // change column names or attributes to B1, B2, … Bn //
• Notation : 𝜌 S (R) // rename the relation name R to S //
• Example: ρ_(SID, Sname, Age) (Student)

Student                   ρ_(SID, Sname, Age) (Student)
RollNo  Name    Age       SID  Sname   Age
101     Rohit   18        101  Rohit   18
102     Ranbir  19        102  Ranbir  19
103     Sandy   18        103  Sandy   18
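A small sketch of the attribute form of rename, assuming a relation is a list of dictionaries whose key order is the attribute order. The helper `rename` is hypothetical. It changes only the attribute names (the schema); the values (the instance) are untouched.

```python
def rename(relation, new_names):
    """Return a copy of the relation with attributes renamed positionally."""
    return [dict(zip(new_names, row.values())) for row in relation]

student = [
    {"RollNo": 101, "Name": "Rohit", "Age": 18},
    {"RollNo": 102, "Name": "Ranbir", "Age": 19},
    {"RollNo": 103, "Name": "Sandy", "Age": 18},
]

renamed = rename(student, ["SID", "Sname", "Age"])
print(renamed[0])  # {'SID': 101, 'Sname': 'Rohit', 'Age': 18}
```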
Equivalence
1. σ_{c1 ∧ c2} (r) = σ_c1 (σ_c2 (r))
2. σ_c1 (σ_c2 (r)) = σ_c2 (σ_c1 (r))
3. If A ⊆ A1, π_A (r) = π_A (π_A1 (r))
4. π_A (σ_c (r)) = σ_c (π_A (r)) if attributes in c ⊆ attributes in A
5. If attributes in c ⊆ attributes in r, σ_c (r ⋈ s) = σ_c (r) ⋈ s
6. (r ⋈ s) ⋈ q = r ⋈ (s ⋈ q), where ⋈ denotes join
7. σ_c (r ∪ s) = σ_c (r) ∪ σ_c (s) (also holds for intersection and difference)
8. π_A (r ∪ s) = π_A (r) ∪ π_A (s)
9. If c involves only attributes in A of r and B of s, π_{A ∪ B} (r ⋈_c s) = π_A (r) ⋈_c π_B (s)
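Rules 1, 2 and 4 can be spot-checked on a small relation. The sketch below assumes relations are lists of dictionaries; `select` and `project` are hypothetical helpers and the sample tuples are made up for illustration. Agreement on one instance illustrates the rules but does not prove them.

```python
def select(rel, pred):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    """Projection: keep the listed attributes and drop duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

r = [  # made-up sample relation r(A, B, C)
    {"A": 1, "B": 10, "C": "x"},
    {"A": 2, "B": 20, "C": "y"},
    {"A": 3, "B": 20, "C": "x"},
    {"A": 4, "B": 30, "C": "y"},
]
c1 = lambda t: t["A"] > 1
c2 = lambda t: t["B"] == 20

# Rule 1: a conjunctive selection equals a cascade of selections.
rule1 = select(r, lambda t: c1(t) and c2(t)) == select(select(r, c2), c1)
# Rule 2: selections commute.
rule2 = select(select(r, c2), c1) == select(select(r, c1), c2)
# Rule 4: selection and projection commute when c2 uses only attributes in A.
rule4 = project(select(r, c2), ["A", "B"]) == select(project(r, ["A", "B"]), c2)

print(rule1, rule2, rule4)  # True True True
```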
Queries and RA # 1
Book = (BID, title, publisher, year)
Student = (SID, sname, age, major)
Author = (Aname, address)
Borrow= (BID, SID, date)
Has_written =( BID, Aname)
Describe = (BID, keyword)
Q. List year and title of each book
π_{title, year} (Book)
Q. List all information about the students whose major is 'CS'
σ_{major = 'CS'} (Student)
Q. List all students with books they can borrow
Student × Borrow
Q. List all books published by LPE before 1990
σ_{publisher = 'LPE' ∧ year < 1990} (Book)
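Q. List name of students who are older than 24 and not studying CS
π_sname (σ_{age > 24 ∧ major ≠ 'CS'} (Student))
Using set difference (valid when student names are unique):
π_sname (σ_{age > 24} (Student)) − π_sname (σ_{major = 'CS'} (Student))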
Q. List name of student who have borrowed a book and major is CS
𝜋𝑠𝑛𝑎𝑚𝑒 ( 𝜎𝑆𝑡𝑢𝑑𝑒𝑛𝑡.𝑆𝐼𝐷=𝐵𝑜𝑟𝑟𝑜𝑤.𝑆𝐼𝐷 (𝜎𝑚𝑎𝑗𝑜𝑟 =′𝐶𝑆 ′ (Student) × Borrow))
Q. List the books written by Korth
π_title (σ_{Has_written.BID = Book.BID} (σ_{Aname = 'Korth'} (Has_written) × Book))
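The last query can be traced step by step in Python: select, then Cartesian product, then the join condition, then project. The rows below are hypothetical sample data and the list-of-dicts representation is an assumption made for this sketch.

```python
from itertools import product

# Hypothetical sample rows, for illustration only.
book = [
    {"BID": 1, "title": "Title One", "publisher": "LPE", "year": 1988},
    {"BID": 2, "title": "Title Two", "publisher": "LPE", "year": 1995},
]
has_written = [
    {"BID": 1, "Aname": "Korth"},
    {"BID": 1, "Aname": "Navathe"},
    {"BID": 2, "Aname": "Navathe"},
]

# Inner selection: Aname = 'Korth' on Has_written.
korth = [h for h in has_written if h["Aname"] == "Korth"]

# Cartesian product with Book, then the condition Has_written.BID = Book.BID.
pairs = [(h, b) for h, b in product(korth, book) if h["BID"] == b["BID"]]

# Projection on title (a set, so duplicates disappear).
titles = sorted({b["title"] for _, b in pairs})
print(titles)  # ['Title One']
```

Applying the selection on Aname before the product keeps the intermediate result small, which is the reason the query is written in this order.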
Queries in Join
Employee (eid, ename, street, city)
Works_on (eid, cname, salary)
Company (CID, cname, city)
Manage( eID, manager_name)
Q. Find names and cities of all employees who work for RS company
𝜋𝑒𝑛𝑎𝑚𝑒,𝑐𝑖𝑡𝑦 (𝜎𝑐𝑛𝑎𝑚𝑒=′𝑅𝑆′ ( Works_on) *Employee)
Q. Find name, street, city of all employees who work for RS and earn more than
20000
T1= 𝜎𝑐𝑛𝑎𝑚𝑒=′𝑅𝑆′ ∧ 𝑠𝑎𝑙𝑎𝑟𝑦>20000 ( Works_on)
Result = 𝜋𝑒𝑛𝑎𝑚𝑒,𝑠𝑡𝑟𝑒𝑒𝑡,𝑐𝑖𝑡𝑦 (T1 * Employee)
Queries in Join
Q. Companies are located in different cities. Find all companies located in every city in
which the RS company is located
T1 = π_city (σ_{cname = 'RS'} (Company))
T2 = π_{cname, city} (Company) ÷ T1
Result = π_cname (T2)
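The division used in this query can be sketched in Python over (cname, city) pairs. The data is hypothetical and `divide` is a helper written for this illustration, not a library function.

```python
def divide(r, s):
    """r: set of (x, y) pairs, s: set of y values.
    Return every x that is paired in r with all y values in s."""
    xs = {x for x, _ in r}
    return {x for x in xs if all((x, y) in r for y in s)}

# Hypothetical (cname, city) pairs, i.e. Company projected on cname and city.
company = {
    ("RS", "Kolkata"), ("RS", "Delhi"),
    ("AB", "Kolkata"), ("AB", "Delhi"), ("AB", "Mumbai"),
    ("CD", "Delhi"),
}

# T1: the cities in which RS is located.
t1 = {city for cname, city in company if cname == "RS"}

# T2: the companies located in every city of T1.
result = divide(company, t1)
print(sorted(result))  # ['AB', 'RS']
```

RS itself always qualifies. CD is excluded because it is not located in Kolkata.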