Editorials

Natural Keys

Today we are going to return to the topic of a natural key. There are two kinds of keys which have become popular as the unique identifier in relational database tables. The first is a system assigned unique key. This is most often a sequential integer value. It may be a GUID or other randomly assigned value. The second kind of key is a natural key. It may be a GUID, a string, or even a number. The difference is that the natural key comes from an authoritative source external to your database.

System assigned keys are popular because they are guaranteed unique within the system in which they are created. This is a feature of any unique record identifier that must be true. Each instance of a key, used to uniquely identify a record in a table, must by unique. When a key comes from an external source, you do not have the control to assure uniqueness within your own database.

In order to use a natural key, a value must be available for every record in your table. I have seen databases where USA federally assigned Social Security Numbers are used to identify people. This is not a good external, or natural key. The reason is that not all persons who may need to be a part of your database are required to have a Social Security Number. There are often individuals from other countries, children who have not yet been assigned an SSN, and other situations where an SSN may not be known. If a SSN is not available, then you need some other unique identifier for people. Instead, make an SSN a property in your table.

A good example of a natural key would be a key that is assigned by another system you maintain, and is an authoritative source. A CRM system may be that kind of external source. If you are extending the capabilities of your CRM, it makes sense to continue to use a unique identifier assigned by the CRM. By doing so, you assure that the data contained in your tables must also exist in your CRM system. Where this breaks down is when you expand your extension to contain persons not in your CRM database. At that point, it doesn’t make sense to add people into your CRM system simply to create a unique identifier.

Part numbers may be a better example. Especially if you are using a packaged system for managing your bill of materials, etc. Again, if the external system is always authoritative, and no data may exist in your database that doesn’t first exist in the external system , then it is a good candidate for a natural key.

Your own database may be the generator of natural keys. Perhaps you have a purchase order table with a system assigned Purchase Order Number. You have a Purchase Order Line Items table with a record for each row in the purchase order. The Line number is sequential, and unique for each purchase order. So, you could use a composite primary key (Natural Key) of PurchaseOrderId, Line Number. This way you are assured that a line number occurs only once within a single purchase order. You also have one less column in your table because you didn’t create a System Assigned integer.

Natural keys are to be preferred over s
ystem assigned keys, if they are assured to be unique. If you still must have a system assigned value, make it an alternate key. Then your related tables have the added benefit of including all the columns of your natural key.

Cheers,

Ben