The fine print about foreign keys

Recall our formal definition of foreign keys:

Let R and S be relation schemas, such that:

A is a column in R;
B is a column in S;
B has a uniquess constraint. Then A is a foreign key referring to B if the values of A are constrained to be a subset of B.

Ouch—if you think about it carefully enough, you’ll realize this definition doesn’t even make sense. In our formal definition of relation schema, we say R is a subset of possible relations on one set of columns and S is a subset of possible relations on another set of columns. There is no room in this definition for the two schemas to interact via a constraint. Don’t worry about it. We are trying to avoid too much abstract mathematics. BUT, if you care about it, here is a way to define a foreign key constraint properly. First, we add another level of abstraction. Given relation schemas R, S, …, we say a database schema on R, S, … is a relation schema on R, S, … — yes, a database schema is just a schema of schemas. It’s a collection of all possible tuples of tables. A single element of the database schema is a particular choice of rows for each table; call this a realization of the database schema. We can easily define a foreign key constraint for the database schema stating that, given a realization with tables r in R and s in S, column A in r must be a subset of column B in s. As I said, it’s best not to worry about it.