Cheatsheet

The document covers key concepts in data modeling and databases, including functional dependencies, multivalued dependencies, and various normal forms such as BCNF and 4NF. It outlines Armstrong's rules for deriving functional dependencies, decomposition methods, and relational algebra operations. Additionally, it discusses SQL basics, advanced SQL features, and the importance of safety in Datalog rules.

Uploaded by

jjlkoppe
Copyright
© © All Rights Reserved

Data modeling and databases (2ID50)

by Lea Faris

Functional dependencies (FDs)
A → B iff any two tuples that have the same value in A also have the same value in B, i.e. t1[A] = t2[A] ⇒ t1[B] = t2[B].

Armstrong's rules
1. Reflexivity – If α ⊆ R ∧ β ⊆ α, then α → β
2. Augmentation – If α → β ∧ γ ⊆ R, then γα → γβ
3. Transitivity – If α → β ∧ β → γ, then α → γ
4. Union – If α → β ∧ α → γ, then α → βγ
5. Decomposition – If α → βγ, then α → β ∧ α → γ
6. Pseudotransitivity – If α → β ∧ γβ → δ, then αγ → δ
The first three are enough to be sound and complete.
Soundness: they generate only true FDs, i.e. F+ ⊆ F∗.
Completeness: they generate all FDs, i.e. F∗ ⊆ F+.
Here F∗ is everything logically implied by F and the definition of an FD.

Armstrong relation
An instance of R in which every FD in F+ is satisfied and every FD not in F+ is violated.
1. Compute the closure of every set in the powerset of R.
2. Construct a table with each attribute as a header.
3. For each set in the powerset, using the set as a subscript:
(a) Add a row of 1s.
(b) Add a row with 0s for attributes it does not determine and 1s for attributes it does determine.

Closure
F+ denotes the closure of F.
For every FD A → B in F:
1. Add a line labelled A.
2. Use Armstrong's rules to calculate what is derivable from it.
The closure is the set of generated FDs.

Canonical Cover
Reduce F to a minimal set of FDs by repeatedly:
1. Applying the union rule of FDs.
2. Removing extraneous attributes from either α or β in α → β.

Extraneous attributes
For an FD α → β in F:
• An attribute A ∈ α is extraneous if (α − {A})+ contains β.
• An attribute B ∈ β is extraneous if α+ contains B, using only the FDs in (F − {α → β}) ∪ {α → (β − {B})}.

Multi-valued dependencies (MVDs)
A multivalued dependency exists when there are at least three attributes (like X, Y and Z) in a relation and, for a value of X, there is a well-defined set of values of Y and a well-defined set of values of Z. However, the set of values of Y is independent of the set of values of Z and vice versa.

Inference Rules
1. Complementation – If α ↠ β, then α ↠ (R − β) − α
2. Multivalued augmentation – If α ↠ β and γ ⊆ R ∧ δ ⊆ γ, then γα ↠ δβ
3. Multivalued transitivity – If α ↠ β and β ↠ γ, then α ↠ γ − β
4. Replication – If α → β, then α ↠ β
5. Coalescence – If α ↠ β and γ ⊆ β ∧ ∃δ : δ ⊆ R ∧ δ ∩ β = ∅ ∧ δ → γ, then α → γ

More helper rules
1. Multivalued union – If α ↠ β and α ↠ γ, then α ↠ βγ
2. Intersection – If α ↠ β and α ↠ γ, then α ↠ β ∩ γ
3. Difference – If α ↠ β and α ↠ γ, then α ↠ β − γ and α ↠ γ − β

Restriction of MVDs
The restriction of D to Ri is Di, containing:
• All FDs in D+ that include only attributes of Ri
• All MVDs of the form α ↠ (β ∩ Ri) where α ⊆ Ri and α ↠ β is in D+.

Normal Forms

1st Normal Form
No null values, sub-tables, or multiple values per cell.

2nd Normal Form
No non-key attribute depends on only part of the primary key.

3rd Normal Form
A dependency-preserving 3NF decomposition always exists. The condition is the same as for BCNF (A → B is trivial or A is a superkey), OR every attribute b in B − A is part of a candidate key. Checking 3NF is expensive; the decomposition is computed through the canonical cover.

Decomposition (3NF)
To decompose a relation R into R1, R2, ... with a set of functional dependencies F:
1. Determine the canonical cover Fc of F.
2. Make relations R1, R2, ... with all attributes of each FD in Fc.
3. If no relation contains a candidate key for R, add another relation containing a candidate key.
4. Delete redundant relations (if a schema is contained in another).

Boyce-Codd Normal Form (BCNF)
"One thing, one table": for every A → B ∈ F+, either A → B is trivial or A is a superkey.

Decomposition (BCNF)
Always lossless, not necessarily dependency preserving. Put each violating dependency in its own table.
1. Identify a dependency that violates the BCNF definition and call it X → A.
2. Decompose the relation R into XA and R − A.
3. Check whether both resulting relations are in BCNF. If not, re-apply the algorithm to the one that is not in BCNF.

4th Normal Form
For every MVD α ↠ β, either it is trivial, or α is a superkey.

Decomposition (4NF)
Same as BCNF decomposition, except that you identify dependencies that violate 4NF. Compute the closure of D to make this easier.

More Decomposition

Lossless join
A decomposition of R into R1, R2 is lossless if R1 ∩ R2 contains a key for R1 or for R2, i.e. R1 ∩ R2 → R1 or R1 ∩ R2 → R2 is in F+. If R is decomposed into more than two relations, check all of them by joining the first two, then the result with the next one, etc.

Dependency preservation
A decomposition is dependency preserving if (F1 ∪ F2 ∪ ... ∪ Fn)+ = F+, where Fi is the restriction of F+ to Ri.

Relational Algebra

Basic RA
• Select (σ): Selects tuples that satisfy a given predicate.
• Project (Π): Reduces a relation to contain only certain attributes.
• Union (∪): Combines the tuples of two relations and eliminates duplicate tuples.
• Set Difference (−): Yields the tuples that are in one relation but not in the other.
• Cartesian Product (×): Combines tuples from two relations in every possible way.
• Rename (ρ): Changes the attribute names of a relation.
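The superkey test in the BCNF definition and the lossless-join check both reduce to computing the closure of an attribute set, the same α+ used in the definition of extraneous attributes. Below is a minimal Python sketch under stated assumptions: attributes are single letters, FDs are pairs of frozensets, and the helper names (fd, closure, is_superkey, bcnf_violations, decompose_once, is_lossless) are hypothetical, not taken from the course. It checks only the FDs listed in F, which is enough to decide whether the original relation R is in BCNF; for a decomposed relation you would need the restriction of F+ instead.

```python
def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure covers the whole schema."""
    return closure(attrs, fds) >= frozenset(schema)


def bcnf_violations(schema, fds):
    """FDs in F that are non-trivial and whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not is_superkey(lhs, schema, fds)]


def decompose_once(schema, violation):
    """Split R on a violating FD X -> A into XA and R - A."""
    x, a = violation
    a = a - x  # keep X itself in both parts
    return (x | a, frozenset(schema) - a)


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: R1 ∩ R2 determines all of R1 or all of R2."""
    reach = closure(frozenset(r1) & frozenset(r2), fds)
    return reach >= frozenset(r1) or reach >= frozenset(r2)


# Illustrative schema R(A, B, C) with F = {A -> B, B -> C}.
R = "ABC"
F = [fd("A", "B"), fd("B", "C")]
violations = bcnf_violations(R, F)         # B -> C: B is not a superkey
R1, R2 = decompose_once(R, violations[0])  # {B, C} and {A, B}
```

In this example A is a key, so only B → C violates BCNF, and splitting on it gives relations BC and AB whose shared attribute B is a key of BC.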
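The canonical cover procedure described earlier (apply the union rule, drop extraneous attributes, repeat) can be sketched in Python as follows. This is one possible reading of the steps, not the course's reference algorithm; FDs are assumed to be pairs of frozensets of single-letter attributes, and the function names are hypothetical. Attributes are tried in alphabetical order, so a different order may yield a different but equivalent cover.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def merge_same_lhs(fds):
    """Union rule: combine FDs with the same left side; drop empty right sides."""
    merged = {}
    for lhs, rhs in fds:
        merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return [(lhs, rhs) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Remove a single extraneous attribute in place; return True if one was found."""
    for i, (lhs, rhs) in enumerate(fds):
        for a in sorted(lhs):
            # A in the left side is extraneous if (lhs - {A})+ still contains rhs.
            if rhs <= closure(lhs - {a}, fds):
                fds[i] = (lhs - {a}, rhs)
                return True
        for b in sorted(rhs):
            # B in the right side is extraneous if lhs+ contains B once
            # lhs -> rhs has been replaced by lhs -> (rhs - {B}).
            others = fds[:i] + [(lhs, rhs - {b})] + fds[i + 1:]
            if b in closure(lhs, others):
                fds[i] = (lhs, rhs - {b})
                return True
    return False


def canonical_cover(fds):
    """Alternate the union rule and extraneous-attribute removal until stable."""
    fds = merge_same_lhs(fds)
    while remove_one_extraneous(fds):
        fds = merge_same_lhs(fds)
    return fds


# Illustrative input: F = {A -> BC, B -> C, A -> B, AB -> C}.
F = [
    (frozenset("A"), frozenset("BC")),
    (frozenset("B"), frozenset("C")),
    (frozenset("A"), frozenset("B")),
    (frozenset("AB"), frozenset("C")),
]
cover = canonical_cover(F)
```

For this input the cover reduces to A → B and B → C: C is extraneous in A → BC, and A is extraneous in AB → C.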
Advanced RA
• Assignment (←): Assigns the result of a relational algebra expression to a new relation.
• Set Intersection (∩): Yields only the tuples that are present in both relations.
• Natural Join (⋈): Combines tuples from two relations based on common attribute values.
• Division (÷): Returns the values in one relation that are paired with every tuple of a second relation.

Relation R:
A B
1 2
1 3
1 4
2 2
2 3
3 3
3 4

Relation S:
B
2
3

Result of R ÷ S:
A
1
2

Because only A = 1 and A = 2 are paired in R with both B = 2 and B = 3. Division can be used to express the statement "for all".

SQL

Basic SQL
• SELECT: Retrieves data from a database.
• FROM: Specifies the table to retrieve data from.
• AS: Renames a column or table with an alias.
• WHERE: Filters the results to include only those that satisfy a condition.
  SELECT column AS col FROM table WHERE condition;
• UNION: Combines the result-sets of two or more SELECT statements.
• INTERSECT: Returns the intersection of the result-sets of two SELECT statements.
• EXCEPT: Returns the difference between the result-set of one SELECT statement and another.
  SELECT column FROM table1 EXCEPT SELECT column FROM table2;
• EXISTS, NOT EXISTS: Tests for the existence of any record in a subquery.
  SELECT column FROM table WHERE EXISTS (subquery);
• DISTINCT: Removes duplicate rows from the result.
  SELECT DISTINCT column FROM table;

More features in basic SQL:
• Tuple variables: Variables that represent rows in a table.
  SELECT t1.* FROM table AS t1 WHERE condition;
• String operations: Operations to manipulate strings.
  SELECT CONCAT(column1, ' ', column2) AS full_name FROM table;
• Subqueries: Queries nested inside other queries.
  SELECT * FROM table WHERE column IN (SELECT column FROM table2);

Advanced SQL

Query Composition and Management
• Derived Relations: Relations defined by a subquery in the FROM clause.
  SELECT * FROM (SELECT * FROM table1) AS derived_table;
• Views: Virtual tables based on the result-set of an SQL statement.
  CREATE VIEW view_name AS SELECT column FROM table WHERE condition;
• WITH (Common Table Expressions): Defines a temporary named result set that can be used within a SELECT, INSERT, UPDATE, or DELETE statement.
  WITH CTE_name AS (SELECT * FROM table) SELECT * FROM CTE_name;

Data Control
• Triggers: Stored procedures that are automatically executed or fired when certain events occur.
  CREATE TRIGGER trigger_name BEFORE INSERT ON table FOR EACH ROW EXECUTE procedure_name;

Data Manipulation and Analysis
• Aggregation Functions: Used to perform a calculation on a set of values and return a single value.
  -- Examples of aggregation functions: AVG, SUM, MIN, MAX, COUNT
  SELECT AVG(column) FROM table;
• Forming Groups: Groups rows that share a property so that aggregate functions can be applied to each group.
  SELECT column, COUNT(*) FROM table GROUP BY column;
• HAVING: Specifies conditions on the groups formed by the GROUP BY clause.
  SELECT column, COUNT(*) FROM table GROUP BY column HAVING COUNT(*) > 1;
• Ordering Results: Specifies the order of rows returned from a query.
  SELECT * FROM table ORDER BY column ASC;

Handling Special Cases
• Null Values: Handles fields with no value.
  SELECT * FROM table WHERE column IS NULL;
• Effect of NULL on Aggregation: NULL values are ignored by aggregate functions (except COUNT(*), which counts rows).
  SELECT AVG(column) FROM table;
• Three-Valued Logic: Deals with TRUE, FALSE, and UNKNOWN values to handle NULLs in conditional expressions.

Tuple Calculus
An expression {t | P(t)} in the TRC is safe if every value for every attribute of t in the query result appears in one of the relations, tuples, or constants that appear in P.
Example (loan numbers of borrowers who live in Eindhoven):
{t | ∃s ∈ borrower (t[loanNumber] = s[loanNumber] ∧ ∃c ∈ customer (c[custName] = s[custName] ∧ c[custCity] = "Eindhoven"))}
I hope you did well in Logic and Set Theory! Make Bas Luttik proud.

Datalog
A positive literal has the form p(t1, t2, ..., tn).
A negative literal has the form not p(t1, t2, ..., tn).
Comparison operators and arithmetic operations are treated as positive predicates.

Basic Datalog
interest(A, I) :- perryridge_account(A, B), interest_rate(A, R), I = B · R / 100
perryridge_account(A, B) :- account(A, "Perryridge", B)
interest_rate(A, 5) :- account(A, L, B), B < 10000
interest_rate(A, 6) :- account(A, L, B), B ≥ 10000
?interest_rate

Safety
A rule is safe if it does not generate an infinite number of results.
1. Every variable that appears in the head of a rule also appears in a non-arithmetic positive literal in the body of the rule.
2. Every variable appearing in a negative literal in the body of the rule also appears in some positive literal in the body of the rule.

Advanced (Recursive) Datalog
A view V is said to be monotonic if, for any two sets of facts I1 and I2 with I1 ⊆ I2, we have E_V(I1) ⊆ E_V(I2), where E_V is the expression used to define V. Relational algebra without "−" is monotonic, and Datalog without negation is monotonic.
Say view V depends positively/negatively on view W if W appears positively/negatively in a rule defining V. If there are no cycles in this dependency graph (where each IDB is a node) involving a negative dependence, then the Datalog program is stratified. Only stratified Datalog is permitted.

ER Diagrams
• Entity set (boxes, sets of objects)
• Relationship set (diamonds, sets of associations)
– Can have attributes
• Cardinality constraints, participation, weak entities
• Generalization, specialization, aggregation

Tips and Tricks
• Check the DB schema for query questions! There may be intentional syntax errors that make the answer 'none of the above.'
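The division example from the Advanced RA section (R with attributes A and B, S with attribute B) can be reproduced with Python sets. This is a small sketch: divide states the "for all" reading directly, and divide_basic rewrites it with only projection, Cartesian product and set difference. Both function names are hypothetical.

```python
def divide(r, s):
    """R(A, B) ÷ S(B): the A-values that are paired in R with every B-value in S."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


def divide_basic(r, s):
    """The same result from basic operators: Π_A(R) − Π_A((Π_A(R) × S) − R)."""
    proj_a = {a for a, _ in r}                       # Π_A(R)
    all_pairs = {(a, b) for a in proj_a for b in s}  # Π_A(R) × S
    missing = all_pairs - r                          # required pairs that R lacks
    return proj_a - {a for a, _ in missing}


R = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (3, 3), (3, 4)}
S = {2, 3}
result = divide(R, S)
```

A = 3 drops out because the pair (3, 2) is missing from R, which leaves {1, 2}.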
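The notes on NULL handling (aggregates ignore NULLs, conditions follow three-valued logic) can be checked with Python's built-in sqlite3 module. The table t, its column x, and the sample values below are made up for illustration; the behaviour shown is standard SQL, but only SQLite is exercised here.

```python
import sqlite3


def null_demo():
    """Run a few NULL-related queries on a throwaway in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t (x) VALUES (?)", [(10,), (None,), (20,)])

    def scalar(sql):
        # Return the single value produced by a one-row, one-column query.
        return conn.execute(sql).fetchone()[0]

    results = {
        # AVG and COUNT(x) skip the NULL row; COUNT(*) counts every row.
        "avg": scalar("SELECT AVG(x) FROM t"),
        "count_x": scalar("SELECT COUNT(x) FROM t"),
        "count_star": scalar("SELECT COUNT(*) FROM t"),
        # x = NULL evaluates to UNKNOWN for every row, so no row is selected.
        "eq_null": scalar("SELECT COUNT(*) FROM t WHERE x = NULL"),
        "is_null": scalar("SELECT COUNT(*) FROM t WHERE x IS NULL"),
        # NOT UNKNOWN is still UNKNOWN, so the NULL row is filtered out here too.
        "not_gt_15": scalar("SELECT COUNT(*) FROM t WHERE NOT (x > 15)"),
    }
    conn.close()
    return results


results = null_demo()
```

The average is taken over the two non-NULL values only, and the NULL row never satisfies a comparison, whether or not the comparison is negated.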
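To see what the Basic Datalog interest program computes, each rule can be read as a set comprehension over facts. The sketch below evaluates the rules bottom-up in Python; the account facts are hypothetical sample data, and the translation is an illustration rather than a Datalog engine.

```python
# Hypothetical EDB facts: account(account_number, branch_name, balance).
account = {
    ("A-101", "Perryridge", 5000),
    ("A-102", "Perryridge", 20000),
    ("A-201", "Brighton", 900),
}

# perryridge_account(A, B) :- account(A, "Perryridge", B)
perryridge_account = {
    (a, b) for (a, branch, b) in account if branch == "Perryridge"
}

# interest_rate(A, 5) :- account(A, L, B), B < 10000
# interest_rate(A, 6) :- account(A, L, B), B >= 10000
interest_rate = {(a, 5) for (a, _, b) in account if b < 10000} | {
    (a, 6) for (a, _, b) in account if b >= 10000
}

# interest(A, I) :- perryridge_account(A, B), interest_rate(A, R), I = B * R / 100
# The shared variable A becomes an equality test between the two relations.
interest = {
    (a, b * r / 100)
    for (a, b) in perryridge_account
    for (a2, r) in interest_rate
    if a == a2
}
```

Every variable in a rule head is bound by a positive literal in the body, which is why each comprehension ranges over a finite set and the program is safe.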
