Why Use Foreign keys and Referential Integrity
|
Why would you implement referential integrity? Is it beneficial to implement it all the time? When to make the most of it in your applications. |
|
|
Key (Please note):
(R) - registration may be required for access at the target site
($) - target site may require paid membership for access to this or other content
Reader Comments: Post Your Comments/Feedback
RE: Why Use Foreign keys and Referential Integrity by Manuela Fortenberry (posted: 11/21/2008 7:59:18 AM) Someone should have edited this article before publishing it online. There are serious gramatical errors.
RE: Why Use Foreign keys and Referential Integrity by Edward Joell (posted: 11/21/2008 8:43:10 AM) With regard to the grammatical errors, some of us read articles for content rather than nitpicking style. Reminds me of a professor I had in college which knocked my term paper down two grades because I didn't include the Naval Academy graduating class of Admirals King, Halsey and Nimitz when referring to them.
I thought this was a very interesting article, but I be interested in hearing more about the author's ideas on using replication for updating reporting databases.
RE: Why Use Foreign keys and Referential Integrity by Dan Clark (posted: 11/21/2008 10:27:46 AM) In general, the article provides good (but standard) information. It's well known that physical FKs should not be used in a reporting DB or OLAP system. There are some exceptions, but generally this is correct. However...
The author goes on to comment... "Coming back to the topic, will you proceed with maintaining referential integrity on the reporting database? My personal opinion would be that you shouldnt, as the OLTP has already checked for the accuracy of the data. You will not benefit by doing the same again in the reporting database. This is will be the same explanation for not maintaining foreign keys in a OLAP db."
I strongly disagree with this point - you SHOULD maintain RI in a reporting or OLAP DB. The difference is that you do it at the dataset level and not at the transactional level. In an reporting/OLAP DB, you are very concerned about dataset consistency and not transactional consistency.
For example, because of ETL and latency issues in reporting/OLAP DB, you may have to define whether child data (facts) should be excluded if the parent data (dimensional member) is not available. Or you could spoof the dimensional data until the next load period.
Regarding replication... It can work in some circumstances. In the real world, it tends not to be used for a variety of reasons that include:
- Source data not in a compatible DB for replication.
- Reporting/OLAP/DW team has no control over production OLTP system. I.e. the OLTP team tells the OLAP team to "go pound sand" when they request a replication.
- Replication does not provide the granularity of control necessary for a production OLAP ETL system. ETL process metadata data is minimal to non-existent with replication.
So...
While I agree with some of the overall comments about using physical FKs in an OLAP DB (don't us 'em), the statements about relational integrity in a reporting/OLAP systems are misleading - they address OLAP relational integrity issues from an OLTP developer's perspective. OLAP and OLTP are apples and oranges different. Each has it complexities. Do NOT oversimplify the "other side".
Regards,
Dan.
RE: Why Use Foreign keys and Referential Integrity by Jim Drewe (posted: 11/21/2008 11:33:41 AM) I will probably express the minority view, but here goes.
The real question isn't if you will have RI, but who will provide it. If you have a free-wheeling ad hoc environment on production, then make sure you have all the ***database*** RI you can get. If on the other hand only stored procedures update data (and there is rigorous testing before migrating them to production), then what is the value of database RI? You have already established the RI within the application code. I think it boils down to what type of production discipline you have at your shop.
Regarding the performance overhead of database RI, I think it would be beneficial to explain this more fully. Repeatable logical-child inserts with database RI is a horror story. Within the SP, you can interrogate the logical-parent one time and be free to continue inserting. There is one other little known impact with database RI: it is the area of locking. You may think you are only locking on the logical-child during an insert. Not true. You are also locking on the parent (in case someone else deletes the parent row before your unit-of-work completes. If there is a heavy volume of SQL writes going on (DELETE, INSERT, UPDATE), this could be costly in overall system performance.
BTW, if you need database RI in your OLTP environment, you need it in your reporting environment. True reporting doesn't invoke RI. On the other hand, consider what would happen if your dimension table doesn't have the matching value to your fact table. Your SQL won't get a hit. Someone high up in the business might not appreciate your particular physical design.
RE: Why Use Foreign keys and Referential Integrity by Natasha Cain (posted: 12/3/2008 9:09:46 AM) Wouln't querying be optimized by having FK relationships on your tables? Especially if these keys are indexed?
RE: Why Use Foreign keys and Referential Integrity by Jim Drewe (posted: 12/9/2008 12:07:46 PM) I haven't found that declarative FK relationships improve performance any better than just manually defining the proper indexes (on the implied Foreign Keys). Perhaps there might be a nuance where this is true, but I haven't experienced it.
|