百度空间 | 百度首页 
 
查看文章
 
SQL Server Database Coding Conventions, 1
2007-05-23 04:10 P.M.
by Vyas Kondreddi


Databases are the heart and soul of many enterprise applications, and it is very essential to pay special attention to database programming. I've seen in many occasions where database programming is overlooked, thinking that it's something easy that be done by anyone. This is wrong.

For a better performing database you need a real DBA and a specialist database programmer, let it be for Microsoft SQL Server, Oracle, Sybase, DB2 or whatever! If you don't use database specialists during your development cycle, databases often end up becoming the performance bottleneck. I decided to write this article in order to put together some of the database programming best practices so that my fellow DBAs and database developers can benefit!

Here are some programming guidelines and best practices, keeping quality, performance and maintainability in mind. This list many not be complete at this moment, and will be constantly updated. BTW, special thanks to Tibor Karaszi (SQL Server MVP) and Linda (lindawie) for taking time to read this article and providing suggestions.

  • Decide upon a database naming convention, standardize it across your organization, and be consistent in following it. It helps make your code more readable and understandable. Click here to see the database object naming convention that I follow.

  • Make sure you normalize your data at least to the 3rd normal form. At the same time, do not compromise on query performance. A little bit of denormalization helps queries perform faster.

  • Write comments in your stored procedures, triggers and SQL batches generously, whenever something is not very obvious. This helps other programmers understand your code clearly. Don't worry about the length of the comments, as it won't impact the performance, unlike interpreted languages like ASP 2.0.

  • Do not use SELECT * in your queries. Always write the required column names after the SELECT statement, like:

SELECT CustomerID, CustomerFirstName, City

This technique results in reduced disk I/O and better performance.

  • Try to avoid server side cursors as much as possible. Always stick to a 'set-based approach' instead of a 'procedural approach' for accessing and manipulating data. Cursors can often be avoided by using SELECT statements instead.

If a cursor is unavoidable, use a WHILE loop instead. I have personally tested and concluded that a WHILE loop is always faster than a cursor. But for a WHILE loop to replace a cursor you need a column (primary key or unique key) to identify each row uniquely. I personally believe every table must have a primary or unique key. Click here to see some examples of using a WHILE loop.

  • Avoid the creation of temporary tables while processing data as much as possible, as creating a temporary table means more disk I/O. Consider using advanced SQL, views, SQL Server 2000 table variable, or derived tables, instead of temporary tables.

  • Try to avoid wildcard characters at the beginning of a word while searching using the LIKE keyword, as that results in an index scan, which defeats the purpose of an index. The following statement results in an index scan, while the second statement results in an index seek:

    SELECT LocationID FROM Locations WHERE Specialities LIKE '%pples'
    SELECT LocationID FROM Locations WHERE Specialities LIKE 'A%s'


    Also avoid searching using not equals operators (<> and NOT) as they result in table and index scans.

  • Use 'Derived tables' wherever possible, as they perform better. Consider the following query to find the second highest salary from the Employees table:

    SELECT MIN(Salary)
    FROM Employees
    WHERE EmpID IN
    (
    SELECT TOP 2 EmpID
    FROM Employees
    ORDER BY Salary Desc
    )


    The same query can be re-written using a derived table, as shown below, and it performs twice as fast as the above query:

    SELECT MIN(Salary)
    FROM
    (
    SELECT TOP 2 Salary
    FROM Employees
    ORDER BY Salary DESC
    ) AS A


    This is just an example, and your results might differ in different scenarios depending on the database design, indexes, volume of data, etc. So, test all the possible ways a query could be written and go with the most efficient one.

  • While designing your database, design it keeping "performance" in mind. You can't really tune performance later, when your database is in production, as it involves rebuilding tables andindexes, re-writing queries, etc. Use the graphical execution plan in Query Analyzer or SHOWPLAN_TEXT or SHOWPLAN_ALL commands to analyze your queries. Make sure your queries do an "Index seek" instead of an "Index scan" or a "Table scan." A table scan or an index scan is a very bad thing and should be avoided where possible. Choose the right indexes on the right columns.

  • Prefix the table names with the owner's name, as this improves readability and avoids any unnecessary confusion. Microsoft SQL Server Books Online even states that qualifying table names with owner names helps in execution plan reuse, further boosting performance.

  • Use SET NOCOUNT ON at the beginning of your SQL batches, stored procedures and triggers in production environments, as this suppresses messages like '(1 row(s) affected)' after executing INSERT, UPDATE, DELETE and SELECT statements. This improves the performance of stored procedures by reducing network traffic.

  • Use the more readable ANSI-Standard Join clauses instead of the old style joins. With ANSI joins, the WHERE clause is used only for filtering data. Where as with older style joins, the WHERE clause handles both the join condition and filtering data. The first of the following two queries shows the old style join, while the second one shows the new ANSI join syntax:

    SELECT a.au_id, t.title
    FROM titles t, authors a, titleauthor ta
    WHERE
    a.au_id = ta.au_id AND
    ta.title_id = t.title_id AND
    t.title LIKE '%Computer%'

    SELECT a.au_id, t.title
    FROM authors a
    INNER JOIN
    titleauthor ta
    ON
    a.au_id = ta.au_id
    INNER JOIN
    titles t
    ON
    ta.title_id = t.title_id
    WHERE t.title LIKE '%Computer%'

  • Do not prefix your stored procedure names with "sp_". The prefix sp_ is reserved for system stored procedure that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then it looks for any qualifiers (database, owner) provided, then it tries dbo as the owner. So you can really save time in locating the stored procedure by avoiding the "sp_" prefix.

  • Views are generally used to show specific data to specific users based on their interest. Views are also used to restrict access to the base tables by granting permission only on views. Yet another significant use of views is that they simplify your queries. Incorporate your frequently required, complicated joins and calculations into a view so that you don't have to repeat those joins/calculations in all your queries. Instead, just select from the view.

  • Use User Defined Datatypes if a particular column repeats in a lot of your tables, so that the datatype of that column is consistent across all your tables.

  • Do not let your front-end applications query/manipulate the data directly using SELECT or INSERT/UPDATE/DELETE statements. Instead, create stored procedures, and let your applications access these stored procedures. This keeps the data access clean and consistent across all the modules of your application, and at the same time centralizing the business logic within the database.

  • Try not to use TEXT or NTEXT datatypes for storing large textual data. The TEXT datatype has some inherent problems associated with it. For example, you cannot directly write or update text data using the INSERT or UPDATE statements. Instead,   you have to use special statements like READTEXT, WRITETEXT and UPDATETEXT. There are also a lot of bugs associated with replicating tables containing text columns. So, if you don't have to store more than 8KB of text, use CHAR(8000) or VARCHAR(8000) datatypes instead.

  • If you have a choice, do not store binary or image files (Binary Large Objects or BLOBs) inside the database. Instead, store the path to the binary or image file in the database and use that as a pointer to the actual binary file stored elsewhere on a server. Retrieving and manipulating these large binary files is better performed outside the database, and after all, a database is not meant for storing files.

  • Use the CHAR data type for a column only when the column is non-nullable. If a CHAR column is nullable, it is treated as a fixed length column in SQL Server 7.0+. So, a CHAR(100), when NULL, will eat up 100 bytes, resulting in space wastage. So, use VARCHAR(100) in this situation. Of course, variable length columns do have a very little processing overhead over fixed length columns. Carefully choose between CHAR and VARCHAR depending up on the length of the data you are going to store.

  • Avoid dynamic SQL statements as much as possible. Dynamic SQL tends to be slower than static SQL, as SQL Server must generate an execution plan every time at runtime. IF and CASE statements come in handy to avoid dynamic SQL. Another major disadvantage of using dynamic SQL is that it requires users to have direct access permissions on all accessed objects, like tables and views. Generally, users are given access to the stored procedures which reference the tables, but not directly on the tables. In this case, dynamic SQL will not work. Consider the following scenario where a user named 'dSQLuser' is added to the pubs database and is granted access to a procedure named 'dSQLproc', but not on any other tables in the pubs database. The procedure dSQLproc executes a direct SELECT on titles table and that works. The second statement runs the same SELECT on titles table, using dynamic SQL and it fails with the following error:

    Server: Msg 229, Level 14, State 5, Line 1
    SELECT permission denied on object 'titles', database 'pubs', owner 'dbo'.

类别:Sql | 添加到搜藏 | 浏览() | 评论 (0)
 
最近读者:
 
网友评论:
发表评论:
姓 名:
网址或邮箱: (选填)
内 容:
验证码: 请点击后输入四位验证码,字母不区分大小写
      

     

©2009 Baidu