Thursday, December 08, 2005
During my college years, I was taught about RDBMS and keys. The keys, also known as constraints in some DBMS, are primary keys (PK), foreign keys (FK) and unique keys. When we draw entity relationship diagrams, we were also advised to lookout for keys and identify them properly as proper primary keys and foreign keys will ensure referential integrity. I was taught to look out for unique keys in a table and use that as a primary key. If I can find a single column that is unique, I can combine multiple columns (composite key) and make them unique. I can group them together and name it as a primary key. Now, after years of development, I still wonder if this is the correct way to do things. There are a lot of disadvantage to the way I was taught. As an example, say you have identified that in a manufacturing department, a product code is unique. Theoretically, we can use this as a primary key. As a primary key, other tables as foreign keys will also reference this. Now, imagine this product code is being used as keys and are stored in product table as PK, invoice table as FK. One day, they decide to change the product code. How do you proceed to do that? As primary key, the code cannot be changed. Even if the database allows it, it will be difficult to change, as you will need to ensure that it is changed at all referencing table. Isn't this a headache? Therefore, I feel that it is better if we maintain a unique identifier, which could be randomly generated or sequentially generated as a primary key instead. This key should have no meaning to the business record. It should be mainly a database table level key only. Now if we apply this to our product table, the product code is simply a unique column on its own with the primary key being another column. The referencing tables such as invoice will use the unique identifier as the foreign key. Any changes to the product code just require changes to one table only and your referential integrity is intact. You also do not need to worry about using composite key as your primary key. Isn't this a cleaner solution? However, selecting data from tables design in this way could lead to slower response, as we will need to join the product table each time we search for a particular product from the invoice table. Compared to using the product code as the key, which we could directly use the product code as a criteria in the invoice table, it will definitely be slower. By how much, I would not be able to say. So, there is this question again, do we strife for better manageability or better performance? For me, I would like better and easier design as we could always scale our hardware. Hardware is cheaper than cracking your head finding expertise to maintain your database.