Why IDENTITY Is Not A Good Primary Key Solution
|
(Joe Celko) The fact that this question is being considered is a sign that we have a lot of really bad SQL programmers. By definition a relational key is a subset of attributes of an entity or relationship being modeled in the schema. This is absolute foundations of the Relational Model kind of stuff. |
|
| Related Articles - For Members.
|
|
Key (Please note):
(R) - registration may be required for access at the target site
($) - target site may require paid membership for access to this or other content
Reader Comments: Post Your Comments/Feedback
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 9:07:58 AM) Dude, you state: "UNIQUE constraint on her SSN to prevent this, thus making the IDENTITY redundant." But since the IDENTITY is a "surrogate" key, which by very definition IS redundant since it is standing in for another key. Can't do that without the other key.
"But we were assuming that we use only IDENTITY as a key, so we are screwed." This is the problem, and anyone who does this IS confused about what a key is and is headed down the path of relational database destruction…
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 9:22:17 AM) All of your statements are predicated on two unstated assumptions: 1. the dba knows enough about the business problem to guarantee that the chosen primary key really will uniquely identify a record. 2. all of the information in the primary key will be available in every case when records are entered. In the real world this is not the case. Databases should be designed to accept data, not just the data that matches the initial ideas of the dba.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 11:03:59 AM) I have an Employee table. Every Employee has a unique Employee ID used as the Primary Key. I have 30k employees. Each has a Title. There are 3,500 distinct titles. Titles aren’t related to a combination of entities. I can’t say "this employee has X, Y and Z so title = 'Sr. Drudge’". I built a Titles table with an Identity as the Primary Key. The Identity is stored in the Employee table.
Apparently this breaks with theory. If anyone can show me a better model that follows theory, please do.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 11:22:30 AM) I am confused. You are saying that is ok to use an External sequence generator as a primary key but it is not ok to use an internal sequence generator. No-one has yet to explain why the federal government is the only entity that is eligible to assign a sequence identification number. For example, SSN, Tax ID, Medical Provider ID, etc. Frankly these are JUST data which can be wrong and can change. I want a KEY THAT IS ALWAYS UNIQUE AND CONSTANT.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 7:35:06 PM) This is Dale, and obviously I agree with you, Joe. I was given an example of a tree and using its circumference and type as an idenitfier but a year later, you can't find the tree because it has grown, therefore the identity column is required. That is wrong. I will explain in the next comment. (500 character limit again)
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 7:39:05 PM) Just as you stated, it is the database designer's responsibility to 'discover' the proper unique identifier to use as a primary key. Try finding tree 23489 in the park. It is not possible. You have two options: First, use existing data such as exact coorinates or distance from a fixed object (oh wait! We forgot plate tectonics!) or, the second option (continued)
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 7:42:53 PM) The second option is to generate a tree number and stamp, engrave, carve, or otherwise permanently affix it to the tree. Now, when we look for a tree by its number we can find the tree. The identification must exist outside of the database or it does not identify an entity.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 7:53:42 PM) In this case, justifying the identity column on the basis of the tree circumference changing over time just means that the designer did not discover the proper unique identifier to use as a primary key, that's all.
Oh, and by the way. Great article.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/26/2005 10:29:57 PM) http://www.dalepreston.com/Blog/2005/08/identity-crisis-cured_26.html
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/29/2005 10:43:58 AM) >> Since the IDENTITY is a "surrogate" key, which by very definition IS redundant since it is standing in for another key. <<
The problem is that IDENTITY is not surogate because it is exposed and has to be manipoulated by the user. The purpose of a surrogate is to speed up PHYSICAL access (indexing, hashing, bit vectors, row_ids, ADABASE numbers, etc) and let the RDBMS engine handle the details to align the relational (LOGICAL) key with the PHYSICAL locator.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/29/2005 10:46:17 AM)
>> All of your statements are predicated on two unstated assumptions: ... <<
Let me paraphrase you: Going into surgery is predicated on two unstated assumptions: 1) The doctor knows enough about the disease to do the job. 2) The doctor has studied enough medicine to the body parts when he gets inside you. An imcompetent doctor or DB designer is another problem.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/29/2005 10:47:33 AM) >> I have an Employee table. Every Employee has a unique Employee ID ..Each has a Title.. <<
Actually, you should have a Personnel table which contains employee data – tables are sets so you use collective or plural names. Your choice of names says you have only one guy in the company. I would favor using the DOT (Dictionary of Occupational Titles) codes; it free, standard and required by the feds for certain reports and for contracts.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/29/2005 10:55:20 AM) >> why the federal government is the only entity that is eligible to assign .. identification umber. ... can be wrong and can change. I want a KEY THAT IS ALWAYS UNIQUE AND CONSTANT. <<
They have the law and the guns on their side :) I hate to tell you, but all data can be wrong. And you want a unique key, which can be validated via some procedure, verified via a trusted source and changed in an orderly fashion. VIN numbers are up for a re-design --want to stop making automobiles?
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/29/2005 10:57:03 AM) >> example of a tree and using its circumference and type as an <<
That tree example is taken from a short description of ISO-11179 data element naming rules. It was not meant to describe a key at all! The guy missed the whole point about key and non-key attributes. I honestly have no idea how commercial tree farms identify a particular tree, but I know that cattle now have an ISO defined RFID ear tag. Maybe they use something like that instead of the old spray paint codes.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 8/31/2005 10:47:08 AM) Totally agree with Joe (I've been an aficionado for years). In fact, I get such a kick through the years when he re-releases this argument on the sophomoric masses.
Having been guilty of the "identity crisis" in the past (several years ago), I strongly suggest reading Joe's books for enlightenment.
RE: Why IDENTITY Is Not A Good Primary Key Solution (posted: 9/5/2005 4:27:38 AM) I agree with Joe. Much better to accept an independently generated key that applies universally than to use a locally generated ID or GUID.
|