Definitions to memorize
Memorize everything in bold.
Relation on sets A, B:
Subset of the Cartesian product A x B
- That is, a collection of tuples (a, b) where a is in A and b is in B.
- If there are more sets, e.g. A, B, C, D, then use longer tuples, e.g. (a, b, c, d)
- We can also refer to a relation as a table.
- The sets A, B, … are also called columns.
- The tuples (a, b, …) are also called rows.
Relation schema on sets A, B, C,… :
A collection of constraints on the possible tuples
- The most common type of constraint is a uniqueness constraint, stating that the values from a given column or collection of columns must be unique.
- Another common type of constraint is a foreign key constraint, which is defined later. It is actually a constraint that applies simultaneously to a combination of two relation schemas.
- Formally, a relation schema is a subset of all possible relations on the sets A, B, C,…
Superkey of a relation schema:
Any collection of columns whose values are guaranteed to be unique
- Could be a superset of unique columns, e.g. if A contains unique values then AB is a superkey (and A is also a superkey).
Key of a relation schema:
A minimal superkey
- i.e. Deleting any column loses uniqueness
Candidate key of a relation schema:
Same as a key
- The word candidate emphasizes that there can be more than one key
Primary key (PK) of a relation schema:
A single key designated as the primary key by the database designer
- In a relational database, every table should have a primary key (although there are a few rare exceptions).
Alternate key of a relation schema:
A key that is not the primary key
Foreign key of a relation R referring to relation S:
Short definition (memorize this): A in R is a foreign key referring to B in S if the values of A are a subset of B.
Formal definition (understand this): Let R and S be relation schemas, such that:
- A is a column in R;
- B is a column in S;
- B has a uniquess constraint.
Then A is a foreign key referring to B if the values of A are constrained to be a subset of B. There is some fine print, which you can ignore.
Prime column of a relation schema:
A column that is part of a key.
Functional dependency in a relation schema:
A–>B is a functional dependency if the mapping from A to B is always a function.
- That is, in every tuple (a,b), the value of a uniquely determines the value of b.
Boyce-Codd normal form (BCNF) for a relation schema:
The left hand side of every functional dependency is a superkey.
- This rules out a lot of redundancy because it means that any functional dependency in a table can’t have repeated rows—the left hand side is a superkey so its values are unique.
Third normal form (3NF) for a relation schema:
For every functional dependency, either:
- the left hand side is a superkey; or
- the right hand side is a prime column.
Important fact: Any relation schema can be converted to an equivalent collection of relation schemas in 3NF. BCNF is preferable, but in certain rare cases it’s not possible (e.g. the ZIP code example).