Well, well, well… you jacked up your database, didn’t you?
You didn’t pay attention to your data types and you set up your primary key as a decimal, didn’t you? Why are your foreign keys null? You say this is supposed to be released next week? Ha! You’ll be lucky if you can DTS this over, brother.
There’s nothing worse than programming some API around a poorly designed database, one where you have to jump through hoops just to get a little query to return a few values. If your database is set up correctly and you actually take the time to tune it like an engine, you’ll have a nice, scalable solution.
Ensure that you’re doing the right thing when you’re building a database – use these best practices and you won’t go wrong. Now, now – never fear… we’re going to step through some of the common mistakes that database designers make as well as some tips and tricks to optimizing your database to be the screaming beast you originally envisioned.
Be Normal
Look, relational databases are just that, relational. That means that you use Primary and Foreign keys to link’ fields from tables – relating’ the data from one table to the next. And hey, NEWS FLASH – SQL allows the programmer to create join’ statements that return back the data from these related’ tables.
When you are designing a relational database, keep in mind how you’ll be linking’ tables – sometimes you’ll have what’s known as a
one-to-one relationship
and sometimes you’ll have a
one-to-many relationship.
Learn what the difference is between these relationship types… and if you don’t know, get someone else to design your database, plain and simple.
Think Thin
If know that your data in a column will never exceed say, four bytes, use a CHAR data type – you can set up a fixed length field and be done with it. This is effective design and saves space. If the data coming in can vary from 1 to 100 characters, use the VARCHAR. Because the VARCHAR is variable in length, and not a fixed length like CHAR, you may actually save space by choosing VARCHAR. Even though a VARCHAR data type is a bit more expensive in the overhead department than the CHAR, you’ll save on all of those deadly I/O reads on you
Before you slap a few TEXT data types in your table make sure that the length of your input data won’t exceed 8,000 characters – if it will, go for it. You’ll need it. If you’re never going to use more than 8,000 characters, use the VARCHAR data type.
Think thin, especially when deciding on the length of your data types. For example, if you just need to hold a number from 1 to 10, use the TINYINT data type. Don’t specify larger lengths than you actually need! Sure, build in a little buffer’ room once in a while when you need to, but make sure you’re not getting sloppy.
FK MOFO
My ultimate pet peeve: Foreign Keys shouldn’t be nullable, you fools! Ok, ok, I’ll get down off my high horse – but please, if a key is meant to be a key, don’t make it a null field… you’re just asking for trouble.
Don’t get Foreign Key happy. Usually developers will denormalize’ a database, that is to say, they add extra’ foreign keys in tables to ease the pain of the SQL statements needed to get out information. When you’ve go to do it because joins are causing a performance hit, you’re well in your right to denormalize your tables… If you’re just loathe to write join statements… well, that’s just laziness.
Flat on your Back
If you’re not going to use your database for online transaction processing (OLTP), you would be wise to flatten out your data into a wide table that holds as much information as possible. In other words – denormalize, denormalize, denormalize. Don’t go hog wild, obviously, keep any supporting lookup tables that you need as relational tables, but try and pare down the girth of your database into something a little more compact and query-able. Once you’ve flattened your 20 table OLTP setup into a mere 5 tables, the indexer will breath a sigh of relief.
Having a very small and compact database makes for the creation of a data access layer, quick and easy stored procedures and a compact API for database queries.
This is a great strategy to use when you’re looking for high scalability – fine tuning the database into the compact, lean machine that you know it can be.
Table Bloat
Whoa! How many tables do you have in that database? Oh, they’re temporary/dev/whatever tables, eh? You’re just bogging down your performance with all of those nasty, half-completed tables out there… Do yourself a favor and clean those up… Sheesh, I almost sound like your mother!
Indexing
Indexing Wizard? Bah. Look at your SQL statements – you’ll know what foreign keys to index. If you wrote the stored procedures, you’ll know what keys need indexing and which don’t on any particular table.
Don’t rely on wizards to do your job for you – you’re better in the long run if you know what’s indexed and why. Again, don’t go crazy on the Indexes… more is NOT better. Efficiency is the key, you’re not trying to index everything under the sun, just the fields you most commonly join or where clause off of.
Add too many Indexes and your database will be screaming for weight-watchers. It’ll be so fat and slow that when it sits around the house, it really… well, you get the idea.
Data Types
Here’s where a lot of programmers and DBAs get into trouble – they slap data types onto fields without thinking. I’ve seen the decimal datatype used as a primary key, if you can believe that! Listen, before you go hog wild and make everything a string, or use Ntext for that address column, check out what data types are most appropriate for your field.
Null
Nulls are used and abused – they are the bastard children of database columns everywhere, leading to abandoned and incomplete data. Major glitches abound when DBAs and Programmers decide “Uh… I don’t know, I think maybe we should make this nullable”… Don’t do it, unless you absolutely have to make a column null.
Select * From…
This one is real simple, a no-brainer. Don’t “select *” unless you’re going to use every column. Isn’t this pitfall the first thing you learn not to do when programming SQL?
Yet, how many of you still begin your queries with “select *”?
Yeah.
Get rid of it.
Final Word
If you’re doing both the programming and the data design, you’ve got some work ahead of you. Knowing how to effectively design your database and also program the DAL or API that interacts with your data is a valuable skill that many developers don’t have. Make sure that you’re up to speed on designing effective databases and that you know your datatypes – don’t just do it’ – think through your design.
Originally appeared at http://www.tiberi.us.
Tiberius OsBurn is a freelance technology author and speaker based in Omaha,
Nebraska. His book, “Hardcore Development”, will be released in the summer
of 2003. Mr. OsBurn has extensive experience in VB, VB.NET, C#, SQL Server,
ASP.NET and various other web technologies. Be sure to visit his site,
http://www.tiberi.us, for his latest articles of interest to .NET developers.