
Questionnaire

Category - Database Design

1) What is denormalization and when would you go for it?


- As the name indicates, denormalization is the reverse process of normalization. It's the controlled introduction of redundancy into
the database design. It helps improve query performance as the number of joins can be reduced.

2) How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?
- One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships.
- One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.
- Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite
primary key of the junction table.

3) What's the difference between a primary key and a unique key?


- Both primary key and unique key enforce uniqueness of the column on which they are defined. But by default, primary key creates a
clustered index on the column, whereas unique key creates a nonclustered index by default. Another major difference is that a primary key
doesn't allow NULLs, but a unique key allows one NULL only.

4) What are user defined datatypes and when should you go for them?
- User defined datatypes let you extend the base SQL Server datatypes by providing a descriptive name, and format to the database.
Take for example, in your database, there is a column called Flight_Num which appears in many tables. In all these tables it should be
varchar(8). In this case you could create a user defined datatype called Flight_num_type of varchar(8) and use it across all your tables.
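For example, a minimal sketch using sp_addtype (the type name is the one from the example above; the table is illustrative):

EXEC sp_addtype Flight_num_type, 'varchar(8)', 'NOT NULL'
GO
-- The new type can now be used like any base datatype:
CREATE TABLE flight_schedules
(
Flight_Num Flight_num_type,
departure_time datetime
)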

5) What is bit datatype and what's the information that can be stored inside a bit column?
- Bit datatype is used to store boolean information like 1 or 0 (true or false). Until SQL Server 6.5, bit datatype could hold either a 1 or
a 0 and there was no support for NULL. But from SQL Server 7.0 onwards, bit datatype can represent a third state, which is NULL.

6) Define candidate key, alternate key, composite key.


- A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the
table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys.
- A key formed by combining two or more columns is called a composite key.

7) What are defaults? Is there a column to which a default can't be bound?


- A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns
and timestamp columns can't have defaults bound to them. See CREATE DEFAULT in books online.

Category - SQL Server Architecture

1) What is a transaction and what are ACID properties?


- A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for Atomicity, Consistency,
Isolation, Durability. These are the properties of a transaction.
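For example, here's a minimal sketch of an atomic transaction (the accounts table and its columns are hypothetical):

BEGIN TRAN
UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1
IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RETURN END
UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2
IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RETURN END
COMMIT TRAN -- both updates are persisted together, or neither is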

2) Explain different isolation levels


- An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level
is Read Committed. Here are the other isolation levels (in the ascending order of isolation): Read Uncommitted, Read Committed,
Repeatable Read, Serializable.
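For example, a sketch of changing the isolation level for the current connection (the SELECT is illustrative):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT * FROM authors -- shared locks are now held until the transaction ends
COMMIT TRAN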

3) CREATE INDEX myIndex ON myTable(myColumn) - What type of Index will get created after executing the above statement?
- Non-clustered index. Important thing to note: By default a clustered index gets created on the primary key, unless specified
otherwise.

4) What's the maximum size of a row?


- 8060 bytes. Don't be surprised with questions like 'what is the maximum number of columns per table' (1024 in SQL Server 7.0/2000).

5) Explain some cluster configurations


- Two of the clustering configurations are Active/Active and Active/Passive.

6) What is lock escalation?


- Lock escalation is the process of converting many low level locks (like row locks, page locks) into higher level locks (like table
locks). Every lock is a memory structure, so too many locks would mean more memory being occupied by locks. To prevent this from
happening, SQL Server escalates the many fine-grain locks to fewer coarse-grain locks. The lock escalation threshold was definable in
SQL Server 6.5, but from SQL Server 7.0 onwards it's dynamically managed by SQL Server.

7) What's the difference between DELETE TABLE and TRUNCATE TABLE commands?
- DELETE TABLE is a logged operation, so the deletion of each row gets logged in the transaction log, which makes it slow.
- TRUNCATE TABLE also deletes all the rows in a table, but it won't log the deletion of each row; instead it logs the deallocation of the
data pages of the table, which makes it faster. Of course, TRUNCATE TABLE can be rolled back.
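A quick sketch demonstrating that TRUNCATE TABLE can be rolled back inside a transaction (mytable is a hypothetical table):

BEGIN TRAN
TRUNCATE TABLE mytable -- table is now empty
ROLLBACK TRAN -- the page deallocations are undone and all rows are back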
8) Explain the storage models of OLAP
- MOLAP, ROLAP and HOLAP

9) What are constraints? Explain different types of constraints.


- Constraints enable the RDBMS to enforce the integrity of the database automatically, without needing you to create triggers, rules or
defaults.
- Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY
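A minimal sketch showing these constraint types on one table (the tables and columns are illustrative, and customers is assumed to exist):

CREATE TABLE orders
(
order_id int NOT NULL PRIMARY KEY,
order_code varchar(10) UNIQUE,
quantity int CHECK (quantity > 0),
cust_id int FOREIGN KEY REFERENCES customers(cust_id)
)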

10) What is an index? What are the types of indexes? How many clustered indexes can be created on a table? I create a separate index
on each column of a table. What are the advantages and disadvantages of this approach?

- Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker.

- Indexes are of two types: clustered indexes and non-clustered indexes. When you create a clustered index on a table, all the rows in
the table are stored in the order of the clustered index key. So, there can be only one clustered index per table. Non-clustered indexes
have their own storage separate from the table data storage. Non-clustered indexes are stored as B-tree structures (so are clustered
indexes), with the leaf level nodes having the index key and its row locator. The row locator could be the RID or the clustered index
key, depending upon the absence or presence of a clustered index on the table.

- If you create an index on each column of a table, it improves the query performance, as the query optimizer can choose from all the
existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT,
UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another
disadvantage is that indexes need disk space: the more indexes you have, the more disk space is used.

Category - Database Administration

1) What is RAID and what are different types of RAID configurations?


- RAID stands for Redundant Array of Inexpensive Disks, used to provide fault tolerance to database servers. There are six RAID
levels, 0 through 5, offering different levels of performance and fault tolerance.

2) What are the steps you will take to improve performance of a poor performing query?
- This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general
issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of
stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins,
too much normalization, excess usage of cursors and temporary tables.

- Some of the tools/ways that help you troubleshoot performance problems are: SET SHOWPLAN_ALL ON, SET
SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT/2000 Performance Monitor, graphical
execution plan in Query Analyzer.

3) What are the steps you will take, if you are tasked with securing an SQL Server?
- Again this is another open ended question. Here are some things you could talk about: preferring NT authentication, using server,
database and application roles to control access to the data, securing the physical database files using NTFS permissions, using an
unguessable SA password, restricting physical access to the SQL Server, renaming the Administrator account on the SQL Server
computer, disabling the Guest account, enabling auditing, using multiprotocol encryption, setting up SSL, setting up firewalls,
isolating SQL Server from the web server etc.

4) What is a deadlock and what is a live lock? How will you go about resolving deadlocks?
- Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece.
Each process would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server
detects deadlocks and terminates one user's process.

- A livelock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps
interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read
transactions monopolize a table or page, forcing a write transaction to wait indefinitely.

5) What is blocking and how would you troubleshoot it?


- Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type.
This forces the second connection to wait, blocked on the first.

6) How to restart SQL Server in single user mode? How to start SQL Server in minimal configuration mode?
- SQL Server can be started from the command line, using SQLSERVR.EXE. This EXE has some very important parameters with
which a DBA should be familiar. -m is used for starting SQL Server in single user mode and -f is used to start SQL Server in
minimal configuration mode.
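For example (run from the Binn folder of the SQL Server installation):

SQLSERVR.EXE -m (starts SQL Server in single user mode)
SQLSERVR.EXE -f (starts SQL Server in minimal configuration mode)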

7) As a part of your job, what are the DBCC commands that you commonly use for database maintenance?
- DBCC CHECKDB, DBCC CHECKTABLE, DBCC CHECKCATALOG, DBCC CHECKALLOC, DBCC SHOWCONTIG, DBCC
SHRINKDATABASE, DBCC SHRINKFILE etc. But there are a whole load of DBCC commands which are very useful for DBAs.

8) What are statistics, under what circumstances they go out of date, how do you update them?
- Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is high,
as opposed to an index with non-unique values. The query optimizer uses these statistics in determining whether to choose an index or not
while executing a query.

Some situations under which you should update statistics:


a) If there is significant change in the key values in the index
b) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has
changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
c) Database is upgraded from a previous version
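A quick sketch of updating statistics manually (titles is an illustrative table name):

UPDATE STATISTICS titles
-- or, to update statistics on all tables in the current database:
EXEC sp_updatestats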

9) What are the different ways of moving data/databases between servers and databases in SQL Server?
- There are lots of options available; you have to choose your option depending upon your requirements. Some of the options you have
are: BACKUP/RESTORE, detaching and attaching databases, replication, DTS, BCP, log shipping, INSERT...SELECT,
SELECT...INTO, creating INSERT scripts to generate data.

10) Explain different types of BACKUPs available in SQL Server. Given a particular scenario, how would you go about choosing a
backup plan?
- Types of backups you can create in SQL Server 7.0+ are full database backup, differential database backup, transaction log backup and
filegroup backup. Check out the BACKUP and RESTORE commands in SQL Server books online.

11) What is database replication? What are the different types of replication you can set up in SQL Server?
- Replication is the process of copying/moving data between databases on the same or different servers. SQL Server supports the
following types of replication scenarios:

a) Snapshot replication
b) Transactional replication (with immediate updating subscribers, with queued updating subscribers)
c) Merge replication

12) How to determine the service pack currently installed on SQL Server?
- The global variable @@Version stores the build number of the sqlservr.exe, which is used to determine the service pack installed.
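For example:

SELECT @@VERSION
-- On SQL Server 2000 and later, SERVERPROPERTY reports the service pack level directly:
SELECT SERVERPROPERTY('ProductLevel')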

Category - Database Programming

1) What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?

Cursors allow row-by-row processing of the resultsets.

Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, whereas a normal SELECT
query makes only one roundtrip, however large the resultset is. Cursors are also costly because they require more resources and
temporary storage (resulting in more IO operations). Further, there are restrictions on the SELECT statements that can be used with
some types of cursors.

Most of the time, set based operations can be used instead of cursors. Here is an example:

If you have to give a flat hike to your employees using the following criteria:

Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike

In this situation many developers tend to use a cursor, determine each employee's salary and update his salary according to the above
formula. But the same can be achieved by multiple update statements or can be combined in a single UPDATE statement as shown
below:

UPDATE tbl_emp SET salary =
CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
WHEN salary BETWEEN 55000 AND 65000 THEN salary + 9000
END

Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row
meets certain condition. You don't have to use cursors for this. This can be achieved using WHILE loop, as long as there is a unique
key to identify each row. For examples of using WHILE loop for row by row processing, check out the 'My code library' section of my
site or search for WHILE.
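Here's a minimal sketch of that technique, assuming tbl_emp has a unique empid column and usp_ProcessEmployee is a hypothetical procedure called when a condition is met:

DECLARE @empid int
SELECT @empid = MIN(empid) FROM tbl_emp
WHILE @empid IS NOT NULL
BEGIN
IF EXISTS (SELECT * FROM tbl_emp WHERE empid = @empid AND salary > 65000)
EXEC usp_ProcessEmployee @empid
SELECT @empid = MIN(empid) FROM tbl_emp WHERE empid > @empid
END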

2) Write down the general syntax for a SELECT statements covering all the options.
Here's the basic syntax: (Also checkout SELECT in books online for advanced syntax).

SELECT select_list
[INTO new_table]
FROM table_source
[WHERE search_condition]
[GROUP BY group_by_expression]
[HAVING search_condition]
[ORDER BY order_expression [ASC | DESC] ]

3) What is a join and explain different types of joins.

Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data
from another table.

Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS,
RIGHT OUTER JOINS and FULL OUTER JOINS.

4) Can you have a nested transaction?

Yes, very much. Check out BEGIN TRAN, COMMIT, ROLLBACK, SAVE TRAN and @@TRANCOUNT
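A quick sketch of how @@TRANCOUNT behaves with nested transactions:

BEGIN TRAN -- @@TRANCOUNT is 1
BEGIN TRAN -- @@TRANCOUNT is 2
COMMIT TRAN -- decrements @@TRANCOUNT to 1, but persists nothing yet
COMMIT TRAN -- @@TRANCOUNT is 0 and the work is actually committed
-- Note: a ROLLBACK at any level rolls back all levels, not just the innermost one.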

5) What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?

An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services
(ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement. See books
online to learn how to create extended stored procedures and how to add them to SQL Server.

Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.
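A minimal sketch (the ProgID is illustrative; by default only members of the sysadmin role can run the sp_OA procedures):

DECLARE @object int, @hr int
EXEC @hr = sp_OACreate 'SQLDMO.SQLServer', @object OUTPUT
IF @hr <> 0 PRINT 'Could not create the COM object'
-- ... use sp_OAMethod / sp_OAGetProperty / sp_OASetProperty on @object here ...
EXEC sp_OADestroy @object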

6) What is the system function to get the current user's user id?

USER_ID(). Also check out other system functions like USER_NAME(), SYSTEM_USER, SESSION_USER, CURRENT_USER,
USER, SUSER_SID(), HOST_NAME().

7) What are triggers? How many triggers can you have on a table? How to invoke a trigger on demand?

Triggers are a special kind of stored procedure that gets executed automatically when an INSERT, UPDATE or DELETE operation
takes place on a table.

In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one for DELETE. From SQL
Server 7.0 onwards, this restriction is gone, and you can create multiple triggers per action. But in 7.0 there's no way to control
the order in which the triggers fire. In SQL Server 2000 you can specify which trigger fires first or fires last using sp_settriggerorder.

Triggers can't be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, DELETE) happens on
the table on which they are defined.

Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks,
but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster.

Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are called post triggers. But in
SQL Server 2000 you can create INSTEAD OF triggers (pre triggers) also.
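A minimal sketch of a trigger that audits inserts (emp is the sample table used in the self join question below; emp_audit is a hypothetical audit table):

CREATE TRIGGER trg_emp_insert ON emp
FOR INSERT
AS
INSERT INTO emp_audit (empid, audited_on)
SELECT empid, GETDATE() FROM inserted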

8) There is a trigger defined for INSERT operations on a table, in an OLTP system. The trigger is written to instantiate a COM object
and pass the newly inserted rows to it for some custom processing. What do you think of this implementation? Can this be
implemented better?

Instantiating COM objects is a time consuming process and since you are doing it from within a trigger, it slows down the data
insertion process. Same is the case with sending emails from triggers. This scenario can be better implemented by logging all the
necessary data into a separate table, and having a job which periodically checks this table and does the needful.

9) What is a self join? Explain it with an example.

Self join is just like any other join, except that two instances of the same table will be joined in the query. Here is an example: an
Employees table contains rows for normal employees as well as managers. So, to find out the manager of each employee,
you need a self join.

CREATE TABLE emp
(
empid int,
mgrid int,
empname char(10)
)

INSERT emp SELECT 1,2,'Vyas'
INSERT emp SELECT 2,3,'Mohan'
INSERT emp SELECT 3,NULL,'Shobha'
INSERT emp SELECT 4,2,'Shridhar'
INSERT emp SELECT 5,2,'Sourabh'

SELECT t1.empname [Employee], t2.empname [Manager]
FROM emp t1, emp t2
WHERE t1.mgrid = t2.empid

Here's an advanced query using a LEFT OUTER JOIN that even returns the employees without managers (super bosses)

SELECT t1.empname [Employee], COALESCE(t2.empname, 'No manager') [Manager]
FROM emp t1
LEFT OUTER JOIN emp t2
ON t1.mgrid = t2.empid

SQL Servers

• What is a major difference between SQL Server 6.5 and 7.0 platform wise?
SQL Server 6.5 runs only on Windows NT Server. SQL Server 7.0 runs on Windows NT Server, workstation and
Windows 95/98.

• Is SQL Server implemented as a service or an application?


It is implemented as a service on Windows NT server and workstation and as an application on Windows 95/98.

• What is the difference in Login Security Modes between v6.5 and 7.0?
7.0 doesn't have Standard Mode, only Windows NT Integrated mode and Mixed mode that consists of both Windows NT
Integrated and SQL Server authentication modes.

• What is a traditional Network Library for SQL Servers?


Named Pipes

• What is a default TCP/IP socket assigned for SQL Server?


1433

• If you encounter this kind of an error message, what you need to look into to solve this problem? "[Microsoft][ODBC SQL
Server Driver][Named Pipes]Specified SQL Server not found."
1. Check if the MS SQL Server service is running on the computer you are trying to log into
2. Check the Client Configuration utility. Client and server have to be in sync.

• What are the two options the DBA has to assign a password to sa?
a) to use a SQL statement:
Use master
EXEC sp_password NULL, 'new_password', 'sa'
b) to use the Query Analyzer utility

• What is the new philosophy for database devices in SQL Server 7.0?
There are no devices anymore in SQL Server 7.0. It uses the file system now.

• When you create a database how is it stored?


It is stored in two separate files: one file contains the data, system tables and other database objects; the other file stores the
transaction log.

• Let's assume you have data that resides on SQL Server 6.5. You have to move it to SQL Server 7.0. How are you going to do
it?
You have to use the transfer command.

DirectConnect
• Have you ever tested 3 tier applications?

• Do you know anything about DirectConnect software? Who is a vendor of the software?
Sybase.

• What platform does it run on?


UNIX.

• How did you use it? What kind of tools have you used to test connection?
SQL Server or Sybase client tools.

• How to set up a permission for 3 tier application?


Contact System Administrator.

• What UNIX command do you use to connect to UNIX server?


FTP Server Name

• Do you know how to configure DB2 side of the application?


Set up an application ID, create RACF group with tables attached to this group, attach the ID to this RACF group.
• Differences between SET and SELECT!
Are standards important to you? If your answer is 'yes', then you should be using SET. This is because SET is the ANSI
standard way of assigning values to variables, and SELECT is not.

Another fundamental difference between SET and SELECT is that you can use SELECT to assign values to more than one
variable at a time, while SET allows you to assign data to only one variable at a time. Here's how:

/* Declaring variables */
DECLARE @Variable1 AS int, @Variable2 AS int

/* Initializing two variables at once */
SELECT @Variable1 = 1, @Variable2 = 2

/* The same can be done using SET, but two SET statements are needed */
SET @Variable1 = 1
SET @Variable2 = 2

What is the output of the following SQL query?

select datepart(yy,'2/2/50'),datepart(yy,'12/31/49'),datepart(yy,getdate())
Note: Current year is assumed as 2005
Select Answer:
1. 2050 2049 2005
2. 2050 1949 2005
3. 1950 1940 2005
4. 1950 1940 1995
5. 1950 2049 2005
(Answer: 5. SQL Server's default two digit year cutoff is 2049, so '50' is interpreted as 1950 and '49' as 2049.)

What is the purpose of clustering? Allowing a physical server to take over tasks in case of failure of another server
New terminology used in SQL Server 7.0 for DUMP, LOAD: BACKUP, RESTORE
What is the default ScriptTimeout for the Server object? 20 sec
How many machine.config files are possible in a system? Only 1
How many web.config files are possible in one system? Depends on the number of web applications on the system
How many Global.asax files are possible in one application? 1
What is the use of the following statement: Response.Expires=120? The page will be removed from the cache after 120 minutes
What should the developer use in order to have an Active Server Page (ASP) invoke a stored procedure on a SQL
Server database? ADO
Write the query to get the full details of the currently installed SQL Server on the machine:
execute master..xp_msver
Which choice is NOT an ADO collection? Records
Which is the default scripting language on the client side in ASP? JavaScript
What is the query to get the version of the SQL Server? select @@version
Which version of SQL Server is not supported by System.Data.SqlClient? SQL Server 6.5
What is the use of SQL-DMO? It is a set of programmable objects to programmatically administer the database
What does SQL-DMO stand for? SQL Distributed Management Objects
What is the use of BCP? It is used to export and import large amounts of data into and out of SQL Server
Expand BCP? Bulk Copy Program
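A sketch of a typical bcp command line (the server name, login and output file are illustrative):

bcp pubs..authors out authors.txt -c -SMYSERVER -Usa -Psecret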
1. How do you read transaction logs?

2. How do you reset or reseed the IDENTITY column?

3. How do you persist objects, permissions in tempdb?

4. How do you simulate a deadlock for testing purposes?

5. How do you rename an SQL Server computer?

6. How do you run jobs from T-SQL?

7. How do you restore single tables from backup in SQL Server 7.0/2000? In SQL Server 6.5?

8. Where to get the latest MDAC from?

9. I forgot/lost the sa password. What do I do?

10. I have only the .mdf file backup and no SQL Server database backups. Can I get my database back into SQL Server?

11. How do you add a new column at a specific position (say at the beginning of the table or after the second column) using
ALTER TABLE command?

12. How do you change or alter a user defined data type?

13. How do you rename an SQL Server 2000 instance?

14. How do you capture/redirect detailed deadlock information into the error logs?

15. How do you remotely administer SQL Server?

16. What are the effects of switching SQL Server from ‘Mixed mode’ to ‘Windows only’ authentication mode? What are the
steps required, to not break existing applications?

17. Is there a command to list all the tables and their associated filegroups?

18. How do you ship the stored procedures, user defined functions (UDFs), triggers, views of my application, in an encrypted
form to my clients/customers? How do you protect intellectual property?

19. How do you archive data from my tables? Is there a built-in command or tool for this?

20. How do you troubleshoot ODBC timeout expired errors experienced by applications accessing SQL Server databases?

21. How do you restart SQL Server service automatically at regular intervals?

22. What is the T-SQL equivalent of IIF (immediate if/ternary operator) function of other programming languages?

23. How do you programmatically find out when the SQL Server service started?

24. How do you get rid of the time part from the date returned by GETDATE function?

25. How do you upload images or binary files into SQL Server tables?

26. How do you run an SQL script file that is located on the disk, using T-SQL?

27. How do you get the complete error message from T-SQL while error handling?

28. How do you get the first day of the week, last day of the week and last day of the month using T-SQL date functions?

29. How do you pass a table name, column name etc. to the stored procedure so that I can dynamically select from a table?

30. Error inside a stored procedure is not being raised to my front-end applications using ADO. But I get the error when I run
the procedure from Query Analyzer.

31. How do you suppress error messages in stored procedures/triggers etc. using T-SQL?
32. How do you save the output of a query/stored procedure to a text file?

33. How do you join tables from different databases?

34. How do you join tables from different servers?

35. How do you convert timestamp data to date data (datetime datatype)?

36. Can I invoke/instantiate COM objects from within stored procedures or triggers using T-SQL?

37. Oracle has a rownum to access rows of a table using row number or row id. Is there any equivalent for that in SQL Server?
Or How do you generate output with row number in SQL Server?

38. How do you specify a network library like TCP/IP using ADO connect string?

39. How do you generate scripts for repetitive tasks like truncating all the tables in a database, changing owner of all the
database objects, disabling constraints on all tables etc?

40. Is there a way to find out when a stored procedure was last updated?

41. How do you find out all the IDENTITY columns of all the tables in a given database?

42. How do you search the code of stored procedures?

43. How do you retrieve the generated GUID value of a newly inserted row? Is there an @@GUID, just like @@IDENTITY?

How do you delete rows which are duplicates (without deleting both copies)?
SET ROWCOUNT 1
DELETE yourtable
FROM yourtable a
WHERE (SELECT COUNT(*) FROM yourtable b WHERE b.name1 = a.name1 AND b.age1 = a.age1) > 1
WHILE @@rowcount > 0
DELETE yourtable
FROM yourtable a
WHERE (SELECT COUNT(*) FROM yourtable b WHERE b.name1 = a.name1 AND b.age1 = a.age1) > 1
SET ROWCOUNT 0

Find top salary among two tables


SELECT TOP 1 sal
FROM (SELECT MAX(sal) AS sal
FROM sal1
UNION
SELECT MAX(sal) AS sal
FROM sal2) a
ORDER BY sal DESC

How to find nth highest salary?


SELECT TOP 1 salary
FROM (SELECT DISTINCT TOP n salary
FROM employee
ORDER BY salary DESC) a
ORDER BY salary

How to know how many tables contain "Col1" as a column in a database?


SELECT COUNT(*) AS Counter
FROM syscolumns
WHERE (name = 'Col1')

SQL Server Performance Enhancement Tips

SQL Server 2000 has many features which, when used wisely, will improve the performance of queries and stored procedures
considerably.
Index Creation

Table indexing will boost the performance of the queries a lot. SQL Server can perform a table scan, or it can use an index
scan. When performing a table scan, SQL Server starts at the beginning of the table, goes row by row in the table, and
extracts the rows that meet the criteria of the query. When SQL Server uses an index, it finds the storage location of the rows needed
by the query and extracts only the needed rows.

Avoid creating indexes on small tables since it will take more time for SQL Server to perform an index scan than to perform a
simple table scan.

If a table has too many transactions happening in it (INSERT/UPDATE/DELETE), keep the number of indexes minimal. For each
transaction, indexes created in the table are re-organized by SQL Server, which reduces performance.

Index Tuning Wizard, which is available in the SQL Server Enterprise Manager, is a good tool to create optimized indexes. You can
find it in Tools->Wizards->Management->Index Tuning Wizard

Avoid using “Select *”

Select only the columns that are necessary from the table. Select * always degrades performance of SQL queries.

Avoid using UNION and try to use UNION All wherever possible

When a UNION operation is performed, SQL Server combines the resultsets of the participating queries and then performs a “Select
DISTINCT” on these to eliminate the duplicate rows.

Instead, if UNION ALL is used, the “Select DISTINCT” step is not performed, which improves performance a lot. So if you know that your
participating queries will not return duplicate rows, then use UNION ALL.
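A quick sketch (customers and suppliers are hypothetical tables that both have a city column):

-- UNION eliminates duplicates, which costs an extra DISTINCT step:
SELECT city FROM customers
UNION
SELECT city FROM suppliers

-- UNION ALL returns all rows as-is and is faster:
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers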

Usage of stored procedures instead of large complex queries

If stored procedures are used instead of very large queries, the advantage is that network traffic is reduced significantly,
since only the stored procedure name and its parameters are passed from the application to SQL Server,
rather than the entire query.

Further, when a stored procedure is executed, SQL Server creates an execution plan and keeps it in memory. Any further
execution of this stored procedure will be very fast since the execution plan is readily available. If queries are used
instead of stored procedures, each time an execution plan will be generated by SQL Server, which makes the process slow.

Restrict number of rows by using Where clause

Whenever the number of rows can be restricted, use a WHERE clause to reduce the network traffic and enhance
performance.

Instead of Temporary Tables use Table Variables

Table variables require less locking and logging, so they are more efficient than temp tables (# tables).
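A minimal sketch (table variables are available from SQL Server 2000 onwards; emp is the sample table used earlier):

DECLARE @top_emps TABLE (empid int, empname char(10))
INSERT INTO @top_emps
SELECT empid, empname FROM emp WHERE mgrid IS NULL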

Avoid using Temp Tables inside stored procedures

If temporary tables are used inside stored procedures, SQL Server may not reuse the execution plan each time the stored procedure is
called. So this will reduce performance.

• What Stored Procedure means

A stored procedure is a database object that contains one or more SQL statements. In this article you will get an idea of how to create
and use stored procedures.

The first time a stored procedure is executed, each SQL statement it contains is compiled and executed to create an execution plan.
The procedure is then stored in compiled form within the database. For each subsequent execution, the SQL statements are executed
without compilation, because they are precompiled. This makes the execution of a stored procedure faster than the execution of an
equivalent SQL script.

To execute a stored procedure you can use the EXEC statement.


CREATE PROC spGetAuthors
AS
SELECT * FROM AUTHORS

When you run this script in Pubs database you will get the following message in Query Analyzer.
The Command(s) completed successfully.

Now you are ready to call/execute this procedure from Query Analyzer.

EXEC spGetAuthors

This stored procedure creates a result set and returns to client.

You can call a stored procedure from within another stored procedure. You can even call a stored procedure from within itself; this
technique is called a recursive call in programming. One of the advantages of using stored procedures is that application programmers
and end users don't need to know the structure of the database or how to code SQL. Another advantage is that they can restrict and
control access to a database.
Nowadays everyone is familiar with SQL injection attacks; I think stored procedures are one way this malicious attack can be prevented.

How to Create a Stored Procedure

When the CREATE PROCEDURE statement is executed, the syntax of the SQL statements within the procedure is checked. If you
have made a coding error the system responds with an appropriate message and the procedure is not created.

The Syntax of the CREATE PROCEDURE statement


CREATE {PROC|PROCEDURE} Procedure_name
[Parameter_declaration]
[WITH {RECOMPILE|ENCRYPTION|RECOMPILE, ENCRYPTION}]
AS sql_statements

You can use the CREATE PROCEDURE statement to create a stored procedure in the database. The name of the stored procedure can be
up to 128 characters and is typically prefixed with the letters sp.
The options AS, RECOMPILE and ENCRYPTION each have a specific meaning.
The AS clause contains the SQL statements to be executed by the stored procedure; a stored procedure must consist of a single
batch.
RECOMPILE is used when you want to compile the stored procedure every time you call it. This comes into the picture when you
don't want to cache the execution plan of the stored procedure in the database. ENCRYPTION implies that you want to hide the code so that
no one can see it. This is very important when you want to distribute the code across the globe or sell it to
other vendors. But make sure you keep the original copy; once it is encrypted no one can decrypt it.

Apart from stored procedures stored in the database as permanent entities, you can create stored procedures scoped to your session. That
means as long as the session is alive, the stored procedure is available in the database; once the session ends the stored procedure
vanishes. This actually depends on what type of stored procedure you have chosen to create.

Stored procedures provide for two different types of parameters: input parameters and output parameters. An input parameter is passed
to the stored procedure from the calling program. An output parameter is returned to the calling program from the stored procedure.
You identify an output parameter with the OUTPUT keyword. If this keyword is omitted the parameter is assumed to be an input
parameter.
You can declare an input parameter so that it requires a value or so that its value is optional. The value of a required parameter must be
passed to the stored procedure from the calling program or an error occurs. The value of an optional parameter doesn't need to be
passed from the calling program. You identify an optional parameter by assigning a default value to it. Then, if a value isn't passed
from the calling program, the default value is used. You can also use output parameters as input parameters; that is, you can pass a
value from the calling program to the stored procedure through an output parameter. However, it is not advisable to pass input values
through output parameters.

The syntax for declaring the parameters

@Parameter_name_1 data_type [= default] [OUTPUT]
[, @Parameter_name_2 data_type [= default] [OUTPUT]] ...

Parameter declarations:
@FirstName varchar(50) -- Input parameter that accepts a string.
@LastName varchar(50) OUTPUT -- Output parameter that returns a string.
A Create Procedure statement that uses an input and an output parameter:

CREATE PROC spGetAuthors
@FirstName varchar(50),
@LastName varchar(50) OUTPUT
AS
SELECT @LastName = ln_Name
FROM AUTHORS
WHERE fn_name = @FirstName

A Create Procedure statement that uses an optional parameter:

CREATE PROC spGetAuthors
@LastName varchar(50) OUTPUT,
@FirstName varchar(50) = 'vijay'
AS
SELECT @LastName = ln_Name
FROM AUTHORS
WHERE fn_name = @FirstName

A stored procedure can declare up to 2100 parameters. If you declare two or more parameters, the declarations must be separated by
commas.

Calling stored procedure with Parameters

To pass parameter values to a stored procedure, you code the values in the EXEC statement after the procedure name. You can pass the
parameters either by position or by name.

Passing parameters by Name:

Write the following code in Query Analyzer


DECLARE @LN VARCHAR(100)
EXEC spGetAuthors @FirstName = 'krishna', @LastName = @LN OUTPUT

Passing parameters by Position:


DECLARE @LN VARCHAR(100)
EXEC spGetAuthors @LN OUTPUT, 'krishna'

In fact you can use both notations to pass parameters to stored procedures when you are calling. To pass parameters by position, list
them in the same order as they appear in the CREATE PROCEDURE statement and separate them with commas. When you use this
technique, you can omit optional parameters only if they are declared after any required parameters.

To use an output parameter in the calling program, you must declare a variable to store its value. Then you use the name of the
variable in the EXEC statement and you code the OUTPUT keyword after it to identify it as an output parameter.

Handling error in stored procedure

In addition to passing output parameters back to the calling program, stored procedures also pass back a return value. By default, this
value is zero. If an error occurs during the execution of a stored procedure you may want to pass a value back to the calling
environment that indicates the error that occurred. To do that you use the RETURN statement and the @@ERROR function.

The @@ERROR system function returns the error number that’s generated by the execution of the most recent SQL statement. If the
value is zero, it means that no error has occurred. The stored procedure listed below uses this function to test whether a DELETE
statement that deletes a row from authors table is successful.
CREATE PROC spDeleteAuthors @FirstName varchar(50)
AS
DECLARE @ErrorVar int
DELETE FROM AUTHORS WHERE fn_name = @FirstName
SET @ErrorVar = @@ERROR
IF @ErrorVar <> 0
BEGIN
PRINT 'An Unknown Error Occurred'
RETURN @ErrorVar
END

The RETURN statement immediately exits the procedure and returns an optional integer value to the calling environment. If you don't
specify the value in this statement the return value is zero.

How to delete or change a stored procedure

You use the DROP PROC statement to delete one or more stored procedures from the database. To redefine a stored procedure you use
ALTER PROC.

The syntax of the DROP PROC statement


DROP {PROC|PROCEDURE} Procedure_name [, …]

The syntax of the ALTER PROC statement


ALTER {PROC|PROCEDURE} Procedure_name
[Parameter_declaration]
[WITH {RECOMPILE|ENCRYPTION|RECOMPILE, ENCRYPTION}]
AS sql_statements

When you delete a procedure, any security permissions assigned to it are also deleted. If you want to preserve the permissions,
use the ALTER PROC statement to modify the procedure instead.

• Stored procedures can be divided into four types:


System stored procedures
Local stored procedures
Temporary stored procedures
Extended stored procedures

System Stored Procedures:

System stored procedures are stored in the Master database and are typically named with an sp_ prefix. They can be used to perform a
variety of tasks: supporting external application calls for data in the system tables, general system
procedures for database administration, and security management functions.
For example, you can view the contents of the stored procedure by calling
sp_helptext [StoredProcedure_Name].

There are hundreds of system stored procedures included with SQL Server. For a complete list of system stored procedures, refer to
"System Stored Procedures" in SQL Server Books Online.

Local stored procedures

Local stored procedures are usually stored in a user database and are typically designed to complete tasks in the database in which
they reside. While coding these procedures, don't use the sp_ prefix for your stored procedures, as it creates a performance bottleneck: when
you call any procedure prefixed with sp_, SQL Server first looks in the master database and only then in the local user
database.

Temporary stored procedures

A temporary stored procedure is almost equivalent to a local stored procedure, but it exists only as long as SQL Server is running or
until the connection that created it is closed. The stored procedure is deleted at connection termination or at server shutdown. This
is because temporary stored procedures are stored in the TempDB database, and TempDB is re-created when the server is restarted.

There are three types of temporary stored procedures: local, global, and stored procedures created directly in TempDB.
A local temporary stored procedure always begins with #, and a global temporary stored procedure always begins with ##. The
execution scope of a local temporary procedure is limited to the connection that created it. All users who have connections to the
database, however, can see the stored procedure in Query Analyzer. There is no chance of name collision between other connections
that are creating temporary stored procedures. To ensure uniqueness, SQL Server appends the name of a local temporary stored
procedure with a series of underscore characters and a connection number unique to the connection. Privileges cannot be granted to
other users for the local temporary stored procedure. When the connection that created the temporary stored procedure is closed, the
procedure is deleted from TempDB.

Any connection to the database can execute a global temporary stored procedure. This type of procedure must have a unique name,
because all connections can execute the procedure and, like all temporary stored procedures, it is created in TempDB. Permission to
execute a global temporary stored procedure is automatically granted to the public role and cannot be changed. A global temporary
stored procedure is almost as volatile as a local temporary stored procedure. This procedure type is removed when the connection used
to create the procedure is closed and any connections currently executing the procedure have completed.

Temporary stored procedures created directly in TempDB differ from local and global temporary stored procedures in the
following ways:

You can configure permissions for them.


They exist even after the connection used to create them is terminated.
They aren't removed until SQL Server is shut down.
Because this procedure type is created directly in TempDB, it is important to fully qualify the database objects referenced by
Transact-SQL commands in the code. For example, you must reference the Authors table, which is owned by dbo in the Pubs database, as
pubs.dbo.authors.

--create a local temporary stored procedure.


CREATE PROCEDURE #tempAuthors
AS
SELECT * from [pubs].[dbo].[authors]

--create a global temporary stored procedure.


CREATE PROCEDURE ##tempAuthors
AS
SELECT * from [pubs].[dbo].[authors]

--create a temporary stored procedure that is local to tempdb.


CREATE PROCEDURE directtemp
AS
SELECT * from [pubs].[dbo].[authors]

Extended Stored Procedures

An extended stored procedure uses an external program, compiled as a 32-bit dynamic link library (DLL), to expand the capabilities of
a stored procedure. A number of system stored procedures are also classified as extended stored procedures. For example, the
xp_sendmail program, which sends a message and a query result set attachment to the specified e-mail recipients, is both a system
stored procedure and an extended stored procedure. Most extended stored procedures use the xp_ prefix as a naming convention.
However, there are some extended stored procedures that use the sp_ prefix, and there are some system stored procedures that are not
extended and use the xp_ prefix. Therefore, you cannot depend on naming conventions to identify system stored procedures and
extended stored procedures.
Use the OBJECTPROPERTY function to determine whether a stored procedure is extended or not. OBJECTPROPERTY returns a
value of 1 for IsExtendedProc, indicating an extended stored procedure, or returns a value of 0, indicating a stored procedure that is
not extended.
USE Master
SELECT OBJECTPROPERTY(object_id('xp_sendmail'), 'IsExtendedProc')
1. To see current user name
SQL> show user;

2. Change SQL prompt name


SQL> set sqlprompt "Manimara > "
Manimara >
Manimara >

3. Switch to DOS prompt


SQL> host

4. How do I eliminate the duplicate rows ?


SQL> delete from table_name where rowid not in (select max(rowid) from table_name group by duplicate_values_field_name);
or
SQL> delete from table_name ta where rowid > (select min(rowid) from table_name tb where
ta.duplicate_values_field_name = tb.duplicate_values_field_name);
Example.
Table Emp
Empno Ename
101 Scott
102 Jiyo
103 Millor
104 Jiyo
105 Smith
delete from emp a where rowid > (select min(rowid) from emp b where a.ename = b.ename);
The output looks like:
Empno Ename
101 Scott
102 Jiyo
103 Millor
105 Smith

5. How do I display row number with records?


To achieve this, use the rownum pseudocolumn with the query, like:
SQL> select rownum, ename from emp;
Output:
1 Scott
2 Millor
3 Jiyo
4 Smith

6. Display the records between two ranges


select rownum, empno, ename from emp where rowid in
(select rowid from emp where rownum <=&upto
minus
select rowid from emp where rownum<&Start);
Enter value for upto: 10
Enter value for Start: 7

ROWNUM EMPNO ENAME


--------- --------- ----------
1 7782 CLARK
2 7788 SCOTT
3 7839 KING
4 7844 TURNER

7. I know the nvl function only allows the same data type (i.e. number or char or date, Nvl(comm, 0)). If commission is null,
the text "Not Applicable" should display instead of blank space. How do I write the query?

SQL> select nvl(to_char(comm),'NA') from emp;

Output :

NVL(TO_CHAR(COMM),'NA')
-----------------------
NA
300
500
NA
1400
NA
NA

8. Oracle cursor : Implicit & Explicit cursors


Oracle uses work areas called private SQL areas to execute SQL statements.
The PL/SQL construct used to identify and work with each of these areas is called a cursor.
For SQL queries returning a single row, PL/SQL declares an implicit cursor.
For queries that return more than one row, the cursor needs to be explicitly declared.

9. Explicit Cursor attributes


There are four cursor attributes used in Oracle
cursor_name%Found, cursor_name%NOTFOUND, cursor_name%ROWCOUNT, cursor_name%ISOPEN

10. Implicit Cursor attributes


Same as explicit cursor but prefixed by the word SQL

SQL%Found, SQL%NOTFOUND, SQL%ROWCOUNT, SQL%ISOPEN

Tips: 1. Here SQL%ISOPEN is false, because Oracle automatically closes the implicit cursor after executing SQL statements.
2. %FOUND, %NOTFOUND and %ISOPEN are Boolean attributes; %ROWCOUNT returns a number.

11. Find out nth highest salary from emp table


SELECT DISTINCT (a.sal) FROM EMP A WHERE &N = (SELECT COUNT (DISTINCT (b.sal)) FROM EMP B WHERE
a.sal<=b.sal);
Enter value for n: 2
SAL
---------
3700

12. To view installed Oracle version information


SQL> select banner from v$version;

13. Display the number value in Words


SQL> select sal, (to_char(to_date(sal,'j'), 'jsp'))
from emp;
The output looks like:

SAL (TO_CHAR(TO_DATE(SAL,'J'),'JSP'))
--------- -----------------------------------------------------
800 eight hundred
1600 one thousand six hundred
1250 one thousand two hundred fifty
If you want to add some text like,
Rs. Three Thousand only.
SQL> select sal "Salary ",
(' Rs. ' || (to_char(to_date(sal,'j'), 'Jsp')) || ' only.')
"Sal in Words" from emp
/
Salary Sal in Words
------- ------------------------------------------------------
800 Rs. Eight Hundred only.
1600 Rs. One Thousand Six Hundred only.
1250 Rs. One Thousand Two Hundred Fifty only.

14. Display Odd/ Even number of records


Odd number of records:
select * from emp where (rowid,1) in (select rowid, mod(rownum,2) from emp);
1
3
5
Even number of records:
select * from emp where (rowid,0) in (select rowid, mod(rownum,2) from emp)
2
4
6

15. Which date function returns number value?


months_between

16. Any three PL/SQL Exceptions?


Too_many_rows, No_Data_Found, Value_Error, Zero_Divide, Others

17. What are PL/SQL Cursor Exceptions?


Cursor_Already_Open, Invalid_Cursor

18. Another way to replace a null value in query results with text:
SQL> Set NULL 'N/A'
To reset: SQL> Set NULL ''

19. What are the more common pseudo-columns?


SYSDATE, USER, UID, CURRVAL, NEXTVAL, ROWID, ROWNUM

20. What is the output of SIGN function?


1 for positive value,
0 for Zero,
-1 for Negative value.

21. What is the maximum number of triggers that can be applied to a single table?
12 triggers.
Q. I want to use a Primary key constraint to uniquely identify all employees, but I also want to check that the values entered
are in a particular range, and are all non-negative. How can I set up my constraint to do this?

A. It’s a simple matter of setting up two constraints on the employee id column: the primary key constraint to guarantee unique
values, and a check constraint to set up the rule for a valid, non-negative range; for example:

Create Table Employee ( Emp_Id Integer Not Null Constraint PkClustered Primary Key
Constraint ChkEmployeeId Check ( Emp_Id Between 1 And 1000) )

Or, if you want to create table-level constraints (where more than one column defines the primary key):

Create Table Employee ( Emp_Id Int Not Null , Emp_Name VarChar(30) Not Null,
Constraint PkClustered Primary Key (Emp_Id),
Constraint ChkEmployeeId Check ( Emp_Id Between 0 And 1000) )

Q. What’s a quick way of outputting a script to delete all triggers in a database (I don’t want to automatically delete, just
review the scripts, and selectively delete based on the trigger name)?

A. There are probably a number of ways, but this may be a job for a cursor. Try the following Transact SQL script:

Declare @Name VarChar (50)


Declare Cursor_ScriptTriggerDrop Cursor
For Select name from sysObjects where type = 'Tr'

Open Cursor_ScriptTriggerDrop
Fetch Next From Cursor_ScriptTriggerDrop Into @Name
While @@Fetch_Status = 0
Begin
Print 'If Exists (Select name From sysObjects Where name = ' + Char(39) + @Name
+ Char(39) + ' and type = ' + Char(39) + 'Tr' + Char(39) + ')'
Print 'Drop Trigger ' + @Name
Print 'Go'
Fetch Next From Cursor_ScriptTriggerDrop Into @Name
End
Close Cursor_ScriptTriggerDrop
DeAllocate Cursor_ScriptTriggerDrop

Q. Is there a function I can use to format dates on the fly as they’re added to a resultset? I need to support a large range of
formatted outputs.

A. Take a deep breath and build it yourself. Here’s a useful function to give you sophisticated formatting abilities:

Use Pubs
Go
If Object_Id ( 'fn_FormatDate' , 'Fn' ) Is Not Null Drop Function fn_FormatDate
Go
Create Function fn_FormatDate
(
@Date DateTime --Date value to be formatted
,@Format VarChar(40) --Format to apply
)
Returns VarChar(40)
As
Begin
-- Insert Day:
Set @Format = Replace (@Format , 'DDDD' , DateName(Weekday , @Date))
Set @Format = Replace (@Format , 'DDD' , Convert(Char(3) , DateName(Weekday , @Date)))
Set @Format = Replace (@Format , 'DD' , Right(Convert(Char(6) , @Date , 12) , 2))
Set @Format = Replace (@Format , 'D1' , Convert(VarChar(2) , Convert(Integer , Right(Convert(Char(6) , @Date , 12) , 2))))

-- Insert Month:
Set @Format = Replace (@Format , 'MMMM' , DateName(Month , @Date))
Set @Format = Replace (@Format , 'MMM' , Convert(Char(3) , DateName(Month , @Date)))
Set @Format = Replace (@Format , 'MM' , Right(Convert(Char(4) , @Date , 12) , 2))
Set @Format = Replace (@Format , 'M1' , Convert(VarChar(2) , Convert(Integer , Right(Convert(Char(4) , @Date , 12) , 2))))

-- Insert the Year:
Set @Format = Replace (@Format , 'YYYY' , Convert(Char(4) , @Date , 112))
Set @Format = Replace (@Format , 'YY' , Convert(Char(2) , @Date , 12))

-- Return function value:
Return @Format
End
Go

-- Examples:

Set NoCount On

Select dbo.fn_FormatDate(Ord_Date , 'dddd, mmmm d1, yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date , 'mm/dd/yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date , 'mm-dd-yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date , 'yyyymmdd') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date , 'mmm-yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'

Set NoCount Off

Q. If SQL Server does on-the-fly caching and parameterisation of ordinary SQL statements do I need to build libraries of
stored procedures with input parameters that simply wrap the SQL statement itself?

A. I’m sure this question has come up before, relating to SQL Server 7, but it remains a good question. The answer’s ‘yes and no’
(though more yes than no). It’s true that both SQL Server 7 and 2000 do ad hoc caching/ parameterization of SQL statements, but
there are a couple of ‘gotchas’ attached. First of all the following two statements will generate an executable plan in SQL Server’s
cache with a usage count of two (i.e. the plan is put in cache and reused by the second statement):

Select Title From Pubs.Dbo.Titles Where Price = £20


Select Title From Pubs.Dbo.Titles Where Price = £7.99

However if the currency indicator is dropped, one plan is marked with an Integer parameter, the other with a Numeric parameter (as
each is auto-parameterized). Worse still, if the case of any part of the column name(s), database, owner, or object name(s) changes, in
a following SQL statement, SQL Server fails to recognize the structural identity of the next statement and generates another plan in
cache.

To check this out, run both statements (with currency marker and identical case) along with the following call:

Select CacheObjType , UseCounts , Sql From Master..sysCacheObjects

(If possible run Dbcc FreeProcCache before the three SQL statements, but beware: it drops all objects in cache).

You should find that the executable plan has been found and reused (the UseCounts value will be ‘2’). Now, either drop the currency
indicators or change the case in one of the ‘Select Title … ‘ statements and note that a fresh executable (and compiled) plan is
generated for the second statement.
Now create a stored procedure with Price as an input parameter, run it with or without currency markers, changing the case of the call
at will, and the procedure will be found and reused – check the UseCounts column value.

Q. How can I get a quick list of all the options set for a particular session?

A. Here’s a stored procedure which will give you a list of all options set in a session. If you create it in Master, with the ‘sp_’ prefix
it’ll be available within any database:

Create Procedure sp_GetDBOptions


As
Set NoCount On
/* Create temporary table to hold values */
Create Table #Options
(OptId Integer Not Null , Options_Set VarChar(25) Not Null )

If @@Error <> 0
Begin
Raiserror('Failed to create temporary table #Options',16,1)
Return(@@Error)
End

/* Insert values into the Temporary table */


Insert Into #Options Values (0,'NO OPTIONS SET')
Insert Into #Options Values (1,'DISABLE_DEF_CNST_CHK')
Insert Into #Options Values (2,'IMPLICIT_TRANSACTIONS')
Insert Into #Options Values (4,'CURSOR_CLOSE_ON_COMMIT')
Insert Into #Options Values (8,'ANSI_WARNINGS')
Insert Into #Options Values (16,'ANSI_PADDING')
Insert Into #Options Values (32,'ANSI_NULLS')
Insert Into #Options Values (64,'ARITHABORT')
Insert Into #Options Values (128,'ARITHIGNORE')
Insert Into #Options Values (256,'QUOTED_IDENTIFIER')
Insert Into #Options Values (512,'NOCOUNT')
Insert Into #Options Values (1024,'ANSI_NULL_DFLT_ON')
Insert Into #Options Values (2048,'ANSI_NULL_DFLT_OFF')

If @@Options <> 0
Select Options_Set
From #Options
Where (OptId & @@Options) > 0
Else
Select Options_Set
From #Options
Where Optid = 0

Set NoCount Off

Q. Is it possible to get a list of all of the system tables and views that are in Master only?

A. Perfectly easy, I’ll even order the output by type and name:

Select type, name From Master..sysObjects Where Type In ('S', 'V')


AND name Not In (Select name From Model..sysObjects) order by type, name

Q. Is there an easy way to get a list of all databases a particular login can access?

A. Yes, there’s an undocumented procedure called sp_MsHasDbAccess which gives precisely the information you want. Use the
SetUser command to impersonate a user (SetUser loginname) and run sp_MsHasDbAccess with no parameters. To revert to your
default status, run SetUser with no parameters.
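
For example (janedoe is a hypothetical login name):

SetUser 'janedoe'      -- impersonate the login
Exec sp_MsHasDbAccess  -- returns one row per database this login can access
SetUser                -- revert to your own login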

Other recommended sources for Microsoft SQL FAQs:


http://support.microsoft.com/support/default.asp?PR=sql&FR=0&SD=MSDN&LN=EN-US (for FAQ and Highlights for SQL)
http://support.microsoft.com/view/dev.asp?id=hl&pg=sql.asp (for Microsoft technical advice)
http://support.microsoft.com/support/sql/70faq.asp (UK based)
http://www.mssqlserver.com/faq/ (an independent support company)
http://www.sqlteam.com/ (a good source for answers to SQL questions)
http://www.sql-server-performance.com/default.asp (for SQL performance issues)
http://www.sql-server.co.uk/frmMain.asp (UK SQL user group - need to do free registration)
http://www.advisor.com/wHome.nsf/w/MMB (a useful US based VB/SQL Magazine)
http://www.advisor.com/wHome.nsf/w/JMSS (a useful US based SQL Magazine)
http://secure.duke.com/nt/sqlmag/newsub.cfm?LSV=NunLjpaxugA-hA9hauyikjkarnlkTKgEAQ&Program=9 (another useful SQL magazine)

SQL FAQ
Details of "Frequently Asked Questions" (FAQ) dealing with common SQL Server 7.0 problems.

SQL Server 2000 Frequently Asked Questions

Q. Does SQL Server still use Transact SQL?

A. Yes, Transact SQL is still the language engine for SQL Server. There are a number of extensions to support the new features of SQL
Server 7, but routines written for earlier versions should still run as before.

SQL Server 7 does automate a number of administrative chores however, so you should check to see if it’s still necessary to run all of
your TSQL scripts.

Q. Does SQL Server support the ANSI standard?

A. Yes, SQL Server 7 complies with the entry level of the ANSI SQL-92 standard. There are, of course, a number of proprietary extensions which provide
extra functionality – for example the multi-dimensional RollUp and Cube extensions.

Q. Can I still run SQL Server from DOS and Windows 3.x?

A. Yes. Both can still act as client operating systems. Of course there are a number of features that won’t be available (for example,
the Enterprise Manager, which requires a Windows 9.x or NT client operating system).

Remember that you can run SQL Server 7’s server components on a Windows 9.x machine, but security is less strictly applied on
Windows 9.x systems.

Q. What’s happened to the SQL Executive service?

A. SQL Agent now replaces the SQL Executive service which controlled scheduling operations under earlier versions of SQL Server.
SQL Agent is far more flexible, allowing Jobs (formerly Tasks) to be run as a series of steps.

MSDB remains the scheduling database, storing all of SQL Agent's scheduling information.

Q. ISQL seems to have been replaced by OSQL. What’s the difference, if any?

A. ISQL is still there, and it continues to use ODBC to connect to SQL Server. However SQL Server 7 has been designed to use OLE
DB as its data access interface. OLE DB is designed to be faster and more flexible than ODBC.

While ODBC allows access to SQL data only, OLE DB can work with both structured and unstructured data. OLE DB is a C-like
interface, but developers using Visual Studio have access to a code wrapper known as ActiveX Data Objects (ADO).

Q. I’m presently using the RDO (Remote Data Object) library to work with SQL Server data. Should I migrate to ADO or
wait and see?

A. RDO is a tried and tested data access technology that's been tweaked and tuned over two versions. If you're developing a new
system from scratch, consider using ADO, as its object model is flatter and more streamlined than RDO's, but if you're supporting a
large, complex suite of applications using RDO to interface with SQL Server, adopt a more cautious, incremental approach. Consider
migrating portions of your application suite to ADO, perhaps encapsulating particular functions in classes.

Q. How has the Distributed Management Framework changed in SQL Server 7?


A. The Distributed Management Framework uses SQL Server’s own object library to make property and method calls to SQL Server
components. SQL Server 6.5 used the SQLOLE object library (SqlOle65.dll); SQL Server 7 uses a new version, the SQLDMO library
(SqlDmo.Enu).

Certain collections are no longer supported (notably the Devices collection) and others have changed their name. You can still choose
to manipulate SQL Server using the new, snap-in Enterprise Manager (which makes the SQLDMO calls for you) or make the calls
yourself from an ActiveX-compliant client.

Q. How can I test the Authentication mode of a particular SQL Server? I don’t want to pass a user Id and password across the
network if I don’t have to.

A. If you can call SQLDMO functions directly, you’ll find an Enumeration (a group of SQL Server constants) called
SQLDMO_LOGIN_TYPE. This Enumeration supports three login modes: (1) SQLDMOLogin_Standard, (2)
SQLDMOLogin_NTUser, and (3) SQLDMOLogin_NTGroup.

You need to create an instance of the enumeration, test the login type and pass the appropriate values.

Q. I have a group of TSQL scripts that query SQL Server’s system tables. Can I continue to use them with SQL Server 7?

A. Yes, but be aware this isn't good practice. If Microsoft changes the structure of the system tables in a later release, your scripts may fail
or provide inaccurate information. Consider instead using the new information schema views (for example
information_schema.tables). These are system table independent, but still give access to SQL Server metadata.
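
For example, to list the user tables in the current database without querying sysObjects directly:

Select Table_Name
From Information_Schema.Tables
Where Table_Type = 'BASE TABLE'
Order By Table_Name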

Q. I’ve tried to load SQL Server 7, but the setup insists on loading Internet explorer 4.01 "or later". I’m perfectly happy with
IE3; do I have to upgrade to IE4?

A. Unfortunately, yes. All operating systems require IE4.01 or later to be loaded before SQL Server 7 can be installed.

Q. Can I speed up SQL Server sort operations by choosing a non-default Sort Order?

A. In short, yes. The default remains Dictionary Order, Case Insensitive. A binary sort will be faster, but consider that you may not get
result sets back in the order you expect. It's up to you to check the sequence in which character data is returned.

Note also that changing the sort order after installation requires that you rebuild all your databases, and you can’t carry out backups
and restores between SQL Servers with differing Sort Orders.

Q. Do I need to set the same Sort Order for both Unicode and non-Unicode data?

A. It’s recommended that you adopt the same Sort Order for both Unicode and non-Unicode data as you may experience problems
migrating non-Unicode data to Unicode. In addition Unicode and non-Unicode data may sort differently.

Q. I’m having problems sending Email notifications to other SQL Servers. What could be the problem?

A. It could be a host of problems, but the most common is running under a LocalSystem account rather than a Domain User account.
Under the LocalSystem account you can't communicate with another server expecting a Trusted Connection.

Q. Can I still use the Net Start command to start SQL Server in single-user mode or with a minimal configuration?

A. Yes. To start SQL Server in single-user mode the syntax is net start mssqlserver –m typed at the Command Prompt. To start SQL
Server with the minimum set of drivers loaded, type net start mssqlserver –f at the Command Prompt.

Q. After registering a SQL Server 7 I noticed the system databases weren’t visible in the Enterprise Manager. Have they been
removed from the Enterprise Manager interface?

A. No, they’re still available to manage via the Enterprise Manager, but by default they’re not displayed.

Q. Under SQL Server 6.5 I managed Permissions using Grant and Revoke statements. How does the new Deny statement
work?
A. A Revoke used to be the opposite of a Grant – removing a Permission to do something. Under SQL Server 7 a permission is
removed by a Deny statement and the Revoke statement has become 'neutral'. In other words, you can Revoke a Grant or Revoke a
Deny.
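
A quick sketch (Mary is a hypothetical user, Titles a hypothetical table):

Grant Select On Titles To Mary     -- Mary may now SELECT from Titles
Deny Select On Titles To Mary      -- Mary may not SELECT, whatever else she has been granted
Revoke Select On Titles From Mary  -- neutral: removes whichever of the above was in place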

Other points to note:

SQL Server 7 places an icon in the System Tray on your TaskBar allowing a visual check of the current status of the Server,
SQL Agent, or the Distributed Transaction Co-ordinator.

SQL Server, SQL Agent or the Distributed Transaction Co-ordinator can all be started, paused, and stopped using a right mouse-click
on the System Tray icon.

The SQL Server Distributed Management Object (SQLDMO) library can be directly programmed to carry out almost any SQL Server,
SQL Agent, or DTC task by manipulating properties and methods.

To give a visual display of the Execution Plan adopted by SQL Server, type the query into the Microsoft SQL Server Query Analyser
and select the toolbar icon 'Display SQL Execution Plan'.

To output statistics from particular parts of the Execution Plan generated by the SQL Server Query Analyser, pause the pointer over
the appropriate icon. For instance Index statistics, including the individual and total subtree costs, can be displayed by selecting the
icon captioned with the index name.

The output from the SQL Server Query Analyser can be run into a grid, or output in raw format to the results pane. To generate the
output in a grid, compose the query and click the toolbar icon 'Execute Query into Grid'.

When creating a new SQL Server database you have the option to increase the database size by fixed increments, or by a percentage of
the current size of the database.

SQL Server automatically performs query parallelism with multi-processor computers. Operators provide process management, data
redistribution, and flow control for queries which would benefit from query parallelism.

To determine who's currently logged on, use the system procedure sp_who.


SQL Server 7 automatically adjusts its memory requirements according to the current load on the operating system.

SQL Server 7 removes the administrative chore of periodically updating statistics by a combination of table, row and index sampling.

Connections to SQL Server should now be routed through Microsoft ActiveX Data Objects (ADO). SQL Server 7 provides the ADO 2.0
library, which supersedes ODBC. Connection is made by calling an ADO Provider.

SQL Server 7 now provides full, cost-based locking: automatically de-escalating to a single-row lock as well as escalating to a Table
lock.

Server Roles allow extensive privileges in one area of SQL Server Administration to be assigned, e.g. Process Administration, while
implicitly denying all other Administration rights. The task of System Administrator can now be compartmentalised into a number of
discrete, mutually-exclusive roles.

Single queries can now reference 32 tables, and the number of internal work tables used by a query is no longer limited to 16 as in
earlier versions.

SQL Server tables now support multiple triggers of the same type - e.g. several Insert triggers on the same table. This allows greater
flexibility of Business Rule processing.

The SQL Server Page size has increased to 8K removing the upper limit of 1960 bytes per data row.

SQL Server Extents are now 64K in size (increasing by a factor of eight). Multiple objects can now share the same Extent until they
grow large enough to be allocated their own Extent. 'Uniform' extents are owned by a single object, all eight pages in the extent can
only be used by the owning object. 'Mixed' extents are shared by up to eight objects. A new table or index is allocated pages from
mixed extents. When the table or index grows to the point that it has eight pages, it is switched to a uniform extent.

SQL Server Tables now support up to 1024 columns.

The SQL Server ODBC Driver now supports the ODBC 3.5 API. The driver has also been enhanced to support the bulk copy
functions originally introduced in DB-Library, and now possesses the ability to get metadata for linked tables used in distributed
queries.

The SQL Server Distributed Management Objects (SQLDMO) library can now be called using either of the two major scripting
languages: VBScript or JScript.
SQL Server 7 now supports Unicode data types: nchar, nvarchar, ntext. These DataTypes are exactly the same as char, varchar, and
text, except for the wider range of characters supported and the increased storage space used. Data can be stored in multiple languages
within one database, getting rid of the problem of converting characters and installing multiple code pages.

The maximum length of the char, varchar, and varbinary data types has been increased to 8000 bytes, substantially greater than the
previous limit of 255 bytes in SQL Server 6.0/6.5. This growth in size allows Text and Image datatypes to be reserved for very large
data values.

SQL Server 7 allows the SubString function to be used in the processing of both text and image data values.

GUIDs (128-bit Globally Unique Identifiers) can now be generated automatically via the UniqueIdentifier DataType.

Most of SQL Server's functionality is now supported on Windows 95/98. Exceptions are features like Symmetric Multiprocessing,
Asynchronous I/O, and integrated security, which are supported only on NT platforms.

SQL Server's new Replication Wizard automates the task of setting up a distributed solution. Replication is now a simple task, and is
significantly easier to set up, administer, deploy, monitor, and troubleshoot.

SQL Server 7 has introduced a new Replication model, 'Merge Replication', allowing 'update anywhere' capability. Use Merge
Replication with care, however, as it does not guarantee transactional consistency the way traditional transactional replication does.

SQL Server Tasks have now become multistep Jobs, allowing the Administrator to schedule the job, manage job step flow, and store
job success or failure information in one central location.

Indexes are substantially more efficient in SQL Server 7. In earlier versions of SQL Server, nonclustered indexes used physical record
identifiers (page number, row number) as row locators, even if a Clustered index had been built. Now, if a table has a clustered index
(and thus a clustering key), the leaf nodes of all nonclustered indexes use the clustering key as the row locator rather than the physical
record identifier. Of course, if a table does not have a clustered index, nonclustered indexes continue to use the physical record
identifiers to point to the data pages.

Setting up SQL Server to use its LocalSystem account restricts SQL Server to local processes only. The LocalSystem account has no
network access rights at all.

When setting up replication, it's sensible to set up the Publisher and all its Subscribers to share the same account.

Ensure you develop the appropriate standard for SQL Server's character set. If you need to change it later, you must rebuild the
databases and reload the data. Server-to-server activities may fail if the character set is not consistent across servers within your
organisation.

The SQL Server Upgrade Wizard can be run either following SetUp or later, at your convenience.

If appropriate, you can choose to Autostart any or all of the three SQL Server processes: the MSSQLServer itself, SQLAgent, or the
Distributed Transaction Co-ordinator.

Rather than backing up data to a device such as disk or tape, SQL Server backs up data through shared memory to a virtual device.
The data can then be picked up from the virtual device and backed up by a custom application.

Ensure you select the correct sort order when you first install SQL Server. If you need to change sort orders after installation, you must
rebuild your databases and reload your data.

The simplest and fastest sort order is 'Binary'. The collating sequence for this sort order is based on the numeric value of the characters
in the installed character set. Using the binary sort order, upper-case Z sorts before lower-case a because the character Z precedes the
character a in all character sets.

Remember that SQL Server's options for dictionary sort orders (Case-sensitive, Case-Insensitive) carry with them a trade-off in
performance.

Since SQL Server 6.0, login passwords have been encrypted. To check this, look in Master's syslogins table. What appears to be
garbled text is actually a textual representation of the binary, encrypted password.

When using Windows 95/98 clients, client application and server-to-server connections must be over TCP/IP Sockets instead of
Named Pipes. Named Pipes is not an available protocol in Windows 95/98.

To help trap the cause of SQL Server errors, while the error dialog is still showing, look at Sqlstp.log in the \Windows or \WinNT
directory. Check the last few events recorded in the log to see if any problems occurred before the error message was generated.

When upgrading from an earlier version of SQL Server, the Upgrade Wizard estimates only the disk space required. It cannot give an
exact requirement.
When upgrading from SQL Server 6.x, remember to set TempDb to at least 25 MB in your SQL Server 6.x installation.

Forget memory tuning! Unlike SQL Server 6.x, SQL Server 7.0 can dynamically size memory based on available system resources.

SQL Server's Upgrade Wizard allows the selection of databases for upgrade to SQL Server 7 databases. If you run the SQL Server
Upgrade Wizard again after databases have been upgraded, previously updated databases will default to the excluded list. If you want
to upgrade a database again, move it to the included list.

All tasks scheduled by SQL Executive, for a SQL Server 6.x environment are transferred and upgraded so that SQL Server 7.0 can
schedule and run the tasks in SQL Server Agent.

SQL Server's Quoted_Identifier setting determines what meaning SQL Server gives to double quotation marks. With Set
Quoted_Identifier Off, double quotation marks delimit a character string, in the same way that single quotation marks do. With Set
Quoted_Identifier On, double quotation marks delimit an identifier, such as a column name.
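
A quick way to see the difference, runnable in any database:

Set Quoted_Identifier Off
Select "hello"   -- returns the character string: hello
Set Quoted_Identifier On
Select "hello"   -- fails: "hello" is now treated as an identifier (Invalid column name 'hello')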

SQL Server 7 can be installed side-by-side with SQL Server 6.x on the same computer, however only one version can be run at one
time. When the SQL Server Upgrade Wizard is complete, SQL Server 7 is marked as the active version of SQL Server. If you have
enough disk space, it is a good idea to leave SQL Server 6.x on your computer until you are sure the version upgrade to SQL Server
7.0 was successful.

Failover Support for SQL Server 7 is designed to work in conjunction with Microsoft Cluster Server (MSCS). Managing Failover
Support provides the ability to install a virtual SQL Server that is managed through the MSCS Cluster Administrator. MSCS monitors
the health of the primary (active) and secondary (idle) nodes, the SQL Server application, and shared disk resources. Upon failure of
the primary node, services will automatically 'fail over' to the secondary node, and uncommitted transactions will be rolled back prior
to reconnection of clients to the database.

You can launch any Windows NT-based application from SQL Server's Enterprise Manager. External applications can be added and
run from the Tools menu.

With SQL Server 7 exactly the same database engine can be used across platforms ranging from laptop computers running Microsoft
Windows 95/98 to very large, multiprocessor servers running Microsoft Windows NT, Enterprise Edition.

SQL Server Performance Monitor allows the SQL Server Administrator to set up SQL Server-specific counters in the NT Performance
Monitor, allowing monitoring and graphing of the performance of SQL Server with the same tool used to monitor Windows NT
Servers.

In addition to a programmable library of SQLDMO (SQL Server Distributed Management Objects), SQL Server 7 also offers access to
the Object Library for DTS (Data Transformation Services).

The Windows 95/98 operating systems do not support the server side of the trusted connection API. When SQL Server is running on
Windows 95 or 98, it does not support an Integrated Security model.

When running an application on the same computer as SQL Server, you can refer to the SQL Server using the machine name,
'(local)', or '.'.

SQL Server 7 uses a new algorithm for comparing fresh Transact-SQL statements with the Transact-SQL statements that created
existing execution plans. If SQL Server 7 finds that a new Transact-SQL statement matches an existing execution plan, it
reuses the plan. This reduces the relative performance benefit of pre-compiling stored procedures into execution plans.

To check Server Role membership, use sp_helpsrvrole; to extract the specific permissions for each role execute sp_srvrolepermission.
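
For example:

Exec sp_helpsrvrole                  -- lists the fixed server roles
Exec sp_srvrolepermission 'sysadmin' -- lists the permissions of one role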

Every user in a SQL Server database belongs to the Public role. If you want everyone in a database to be able to have a specific
permission, assign the permission to the public role. If a user has not been specifically granted permissions on an object, they use the
permissions assigned to the Public role.

At the top of every SQL Server 8K Page is a 96 byte header used to store system information such as the type of page, the amount of
free space on the page, and the object ID of the object owning the page.

Rows still can't span pages in SQL Server. In SQL Server 7, the maximum amount of data contained in a single row is 8060 bytes,
excluding text, ntext, and image data, which are held in separate pages.

SQL Server 7 will automatically shrink databases that have a large amount of free space. Only those databases where the AutoShrink
option has been set to true will be shrunk. The server checks the space usage in each database periodically. If a database is found with
a lot of empty space and it has the AutoShrink option set to true, SQL Server will reduce the size of the files in the database.

SQL Server uses a Global Allocation Map (GAM) to record what extents have been allocated. Each GAM covers 64,000 extents, or
nearly 4 GB of data. The GAM has one bit for each extent in the interval covered. If the bit is 1, the extent is free; if the bit is 0, the
extent is allocated.
SQL Server Index Statistics are stored as a long string of bits across multiple pages in the same way that Image data is stored. The
column sysindexes.statblob points to this distribution data. You can use the DBCC SHOW_STATISTICS statement to get a report on
the distribution statistics for an index.
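
For example, against the Pubs sample database (whose Authors table carries a nonclustered index named aunmind):

Use Pubs
DBCC Show_Statistics (Authors, aunmind)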

Because Non-Clustered indexes store Clustered index keys as their pointers to data rows, it is important to keep Clustered index keys
as small as possible. Do not choose large columns as the keys to clustered indexes if a table also has Non-Clustered indexes.

In SQL Server 7, individual text, ntext, and image pages are not limited to holding data for only one occurrence of a text, ntext, or
image column. A text, ntext, or image page can hold data from multiple rows; the page can even have a mix of text, ntext, and image
data.

In SQL Server 7, Text and Image data pages are logically organized in a b-tree structure, while in earlier versions of SQL Server they
were linked together in a page chain.

Because SQL Server 7 can store small amounts of text, ntext, or image data in the same Page, you can insert 20 rows that each have
200 bytes of data in a text column, with the data and all the root structures fitting onto the same 8K page.

User Connections are cheaper in SQL Server 7. Under SQL Server 6.5 each connection 'cost' 44K of memory; each connection under
SQL Server 7 costs only 24K of memory.

If you need to maintain existing processes Pause rather than Stop your SQL Server. Pausing SQL Server prevents new users from
logging in and gives you time to send a message to current users asking them to complete their work and log out before you Stop the
server. If you stop SQL Server without Pausing it, all server processes are terminated immediately. Stopping SQL Server prevents new
connections and disconnects current users. Note that you can't pause SQL Server if it was started by running sqlservr. Only SQL
Server services started as NT services can be paused.

If you need to start SQL Server in minimal configuration to correct configuration problems, stop the SQLServerAgent service before
connecting to SQL Server. Otherwise, the SQLServerAgent service uses the connection and blocks your connection to SQL Server.

When specifying a trace flag with the /T option, make sure you use an uppercase "T" to pass the trace flag number. A lowercase "t" is
accepted by SQL Server, but this sets other internal trace flags that are required only by SQL Server support engineers.

Be careful when stopping your SQL Server. If you stop SQL Server using Ctrl+C at the command prompt, it does not perform a
CHECKPOINT in every database. Therefore, the recovery time is increased the next time the server is started.

To shutdown your SQL Server immediately, issue 'SHUTDOWN WITH NOWAIT'. This stops the server immediately, but it requires
more recovery time the next time the server is started because no CHECKPOINT is issued against any databases.

If you prefer to avoid the command line, data can be transferred into a SQL Server table from a data file using the 'Bulk Insert'
statement. The Bulk Insert statement provides the bulk copy functionality of the bcp utility from within a Transact-SQL
statement: e.g. Bulk Insert pubs..publishers FROM 'c:\publishers.txt' With (DataFileType = 'char').

When running the Bulk Copy Program (BCP), use the -n parameter where possible. Storing information in native format is useful
when information is to be copied from one computer running SQL Server to another. Using native format saves time and space,
preventing unnecessary conversion of data types to and from character format. However, a data file in native format cannot be read by
any program other than BCP.

Native format BCP can now be used to bulk copy data from one computer running SQL Server to another running with a different
processor architecture. This was impossible with earlier versions of SQL Server.

When running the Bulk Copy Program (BCP), the new -6 parameter, when used in conjunction with either native format (-n) or
character format (-c), uses SQL Server 6/6.5 data types. Use this parameter when using data files generated by BCP in native or
character format from SQL Server 6/6.5, or when generating data files to be loaded into SQL Server 6/6.5. Note that the -6 parameter
is not applicable to the Bulk Insert statement.

With BCP, the SQL Server Char DataType is always stored in the data file as the full length of the defined column. For example, a
column defined as Char(10) always occupies 10 characters in the data file regardless of the length of the data stored in the column;
spaces are appended to the data as padding. Note that any pre-existing space characters are indistinguishable from the padding
characters added by BCP.

When running the Bulk Copy Program (BCP), choose terminators with care to ensure that their pattern does not appear in any of the
data. For example, when using tab terminators with a field that contains tabs as part of the data, bcp does not know which tab
represents the end of the field. The bcp utility always looks for the first possible character(s) that matches the terminator it expects.
Using a character sequence with characters that do not occur in the data avoids this conflict.

When running the Bulk Copy Program (BCP), you may decide to drop the indexes on the table prior to loading a large amount of data.
Conversely, if you are loading a small amount of data relative to the amount of data already in the table, dropping the indexes may not
be necessary because the time taken to rebuild the indexes can be longer than performing the bulk copy operation.
If, for any reason, a BCP operation aborts before completion, the entire transaction is rolled back, and no new rows are added to the
destination table.

Following a BCP operation, it's necessary to identify any rows that violate constraints or triggers. To do this run queries or stored
procedures that test the constraint or trigger conditions, such as: UPDATE pubs..authors SET au_fname = au_fname. Although this
query does not change data to a different value, it causes SQL Server to update each value in the au_fname column to itself. This
causes any constraints or triggers to fire, testing the validity of the inserted rows.

During BCP, users often try to load an ASCII file in native format. This leads to misinterpretation of the hexadecimal values in the
ASCII file and the generation of an "unexpected end of file" error message. The correct method of loading the ASCII file is to
represent all fields in the data file as a character string (i.e character format BCP), and let SQL Server do the data conversion to
internal data types as rows are inserted into the table.

During BCP In, a hidden character in an ASCII data file can cause problems, generating an "unexpected null found" error message.
Many utilities and text editors display hidden characters which can usually be found at the bottom of the data file. Finding and
removing these characters should resolve the problem.

With BCP, it's possible to specify the number of rows to load from the data file rather than loading the entire data file. For example, to
load only the first 150 rows from a 10,000 row data file, specify the -L last_row parameter when loading the data.

After a bulk load using BCP from a data file into a table with an index, execute 'Update Statistics' so that SQL Server can continue to
optimise queries made against the table.
Deleting Duplicate Records
graz on 3/26/2001 in DELETEs
Seema writes "There is a Table with no key constraints. It has duplicate records. The duplicate records have to be deleted (eg
there are 3 similar records, only 2 have to be deleted). I need a single SQL query for this." This is a pretty common question so I
thought I'd provide some options.
First, I'll need some duplicates to work with. I use this script to create a table called dup_authors in the pubs database. It selects a
subset of the columns and creates some duplicate records. At the end it runs a SELECT statement to identify the duplicate records:
select au_lname, au_fname, city, state, count(*)
from dup_authors
group by au_lname, au_fname, city, state
having count(*) > 1
order by count(*) desc, au_lname, au_fname
The easiest way I know of to identify duplicates is to do a GROUP BY on all the columns in the table. It can get a little
cumbersome if you have a large table. My duplicates look something like this:
au_lname        au_fname   city                 state
--------------- ---------- -------------------- ----- -----------
Smith           Meander    Lawrence             KS        3
Bennet          Abraham    Berkeley             CA        2
Carson          Cheryl     Berkeley             CA        2
except there are thirteen additional duplicates identified.
Second, backup your database. Third, make sure you have a good backup of your database.
Temp Table and Truncate
The simplest way to eliminate the duplicate records is to SELECT DISTINCT into a temporary table, truncate the original table
and SELECT the records back into the original table. That query looks like this:
select distinct *
into #holding
from dup_authors

truncate table dup_authors

insert dup_authors
select *
from #holding

drop table #holding


If this is a large table, it can quickly fill up your tempdb. This also isn't very fast. It makes a copy of your data and then makes
another copy of your data. Also while this script is running, your data is unavailable. It may not be the best solution but it certainly
works.
Rename and Copy Back
The second option is to rename the original table to something else, and copy the unique records into the original table. That looks
like this:
sp_rename 'dup_authors', 'temp_dup_authors'

select distinct *
into dup_authors
from temp_dup_authors

drop table temp_dup_authors


This has a couple of benefits over the first option. It doesn't use tempdb and it only makes one copy of the data. On the downside,
you'll need to rebuild any indexes or constraints on the table when you're done. This one also makes the data unavailable during
the process.
Create a Primary Key
Our last option is more complex. It has the benefit of not making a copy of the data and only deleting the records that are
duplicates. Its main drawback is that we have to alter the original table and add a sequential record number field to uniquely
identify each record. That script looks like this:
-- Add a new column
-- In real life I'd put an index on it
Alter table dup_authors add NewPK int NULL
go

-- populate the new Primary Key
declare @intCounter int
set @intCounter = 0
update dup_authors
SET @intCounter = NewPK = @intCounter + 1
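
With NewPK populated, the remaining step is to delete every row that is not the first of its duplicate group, then drop the helper column (a sketch, assuming the same four columns define a duplicate):

-- keep the lowest NewPK in each group of identical rows
delete dup_authors
where NewPK not in
  (select min(NewPK)
   from dup_authors
   group by au_lname, au_fname, city, state)

-- remove the helper column afterwards
alter table dup_authors drop column NewPK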
Handling Errors in Stored Procedures
Garth on 2/5/2001 in Stored Procs
The following article introduces the basics of handling errors in stored procedures. If you are not familiar with the difference
between fatal and non-fatal errors, the system function @@ERROR, or how to add a custom error with the system stored
procedure sp_addmessage, you should find it interesting.
The examples presented here are specific to stored procedures as they are the desired method of interacting with a database. When
an error is encountered within a stored procedure, the best you can do (assuming it’s a non-fatal error) is halt the sequential
processing of the code and either branch to another code segment in the procedure or return processing to the calling application.
Notice that the previous sentence is specific to non-fatal errors. There are two types of errors in SQL Server: fatal and non-fatal.
Fatal errors cause a procedure to abort processing and terminate the connection with the client application. Non-fatal errors do not
abort processing a procedure or affect the connection with the client application. When a non-fatal error occurs within a procedure,
processing continues on the line of code that follows the one that caused the error.
The following example demonstrates how a fatal error affects a procedure.
USE tempdb
go
CREATE PROCEDURE ps_FatalError_SELECT
AS
SELECT * FROM NonExistentTable
PRINT 'Fatal Error'
go
EXEC ps_FatalError_SELECT
--Results--
Server: Msg 208, Level 16, State 1, Procedure ps_FatalError_SELECT, Line 3
Invalid object name 'NonExistentTable'.
The SELECT in the procedure references a table that does not exist, which produces a fatal error. The procedure aborts processing
immediately after the error and the PRINT statement is not executed.
To demonstrate how a non-fatal error is processed, I need to create the following table.
USE tempdb
go
CREATE TABLE NonFatal
(
Column1 int IDENTITY,
Column2 int NOT NULL
)
This example uses a procedure to INSERT a row into NonFatal, but does not include a value for Column2 (defined as NOT
NULL).
USE tempdb
go
CREATE PROCEDURE ps_NonFatal_INSERT
@Column2 int =NULL
AS
INSERT NonFatal VALUES (@Column2)
PRINT 'NonFatal'
go
EXEC ps_NonFatal_INSERT
--Results--
Server: Msg 515, Level 16, State 2, Procedure ps_NonFatal_INSERT, Line 4
Cannot insert the value NULL into column 'Column2', table 'tempdb.dbo.NonFatal';
column does not allow nulls. INSERT fails.
The statement has been terminated.
NonFatal
The last line of the results ('NonFatal') demonstrates that the error did not affect the processing of the procedure: the PRINT
statement executed.
You might be wondering what actions cause fatal errors. Unfortunately, the actions that cause a fatal error are not well
documented. Each error has an associated severity level that is a value between 0–25. The errors with a severity level of 20 or
above are all fatal, but once you get below this value there is no well-defined rule as to which errors are fatal. In truth, though,
worrying about which errors are fatal is a bit useless because there is no code you can implement that will allow you to handle
them gracefully. Of course, you can use proactive coding to make sure fatal errors do not occur. For example, if your application
allows users to type in the name of the table on which a query is based, you can verify its existence before referencing it with
dynamic SQL.
@@ERROR
The @@ERROR system function is used to implement error handling code. It contains the error ID produced by the last SQL
statement executed during a client’s connection. When a statement executes successfully, @@ERROR contains 0. To determine if
a statement executes successfully, an IF statement is used to check the value of the function immediately after the target statement
executes. It is imperative that @@ERROR be checked immediately after the target statement, because its value is reset as soon as the
next statement executes.
Let’s alter ps_NonFatal_INSERT to use @@ERROR with the following.
Transact-SQL Query
SQL Server Performance Tuning Tips

This tip may sound obvious to most of you, but I have seen professional developers, in two major SQL Server-based applications used
worldwide, not follow it. And that is to always include a WHERE clause in your SELECT statement to narrow the number of rows
returned. If you don't use a WHERE clause, then SQL Server will perform a table scan of your table and return all of the rows. In some
cases you may want to return all rows, and not using a WHERE clause is appropriate then. But if you don't need all the rows
returned, use a WHERE clause to limit the number of rows returned.

By returning data you don't need, you are causing SQL Server to perform I/O it doesn't need to perform, wasting SQL Server resources. In
addition, it increases network traffic, which can also lead to reduced performance. And if the table is very large, a table scan will lock the
table during the time-consuming scan, preventing other users from accessing it, hurting concurrency.

Another negative aspect of a table scan is that it will tend to flush out data pages from the cache with useless data, which reduces SQL
Server's ability to reuse useful data in the cache, which increases disk I/O and hurts performance. [6.5, 7.0, 2000] Updated 4-17-2003

*****

To help identify long running queries, use the SQL Server Profiler Create Trace Wizard to run the "TSQL By Duration" trace. You can
specify the length of the long running queries you are trying to identify (such as over 1000 milliseconds), and then have these recorded in
a log for you to investigate later. [7.0]

*****

When using the UNION statement, keep in mind that, by default, it performs the equivalent of a SELECT DISTINCT on the final result
set. In other words, UNION takes the results of two like recordsets, combines them, and then performs a SELECT DISTINCT in order to
eliminate any duplicate rows. This process occurs even if there are no duplicate records in the final recordset. If you know that there are
duplicate records, and this presents a problem for your application, then by all means use the UNION statement to eliminate the duplicate
rows.

On the other hand, if you know that there will never be any duplicate rows, or if there are, and this presents no problem to your
application, then you should use the UNION ALL statement instead of the UNION statement. The advantage of UNION ALL is that it
does not perform the SELECT DISTINCT function, which saves unnecessary use of SQL Server resources. [6.5, 7.0,
2000] Updated 10-30-2003

*****

Sometimes you might want to merge two or more sets of data resulting from two or more queries using UNION. For example:

SELECT column_name1, column_name2
FROM table_name1
WHERE column_name1 = some_value
UNION
SELECT column_name1, column_name2
FROM table_name1
WHERE column_name2 = some_value

This same query can be rewritten, like the following example, and when doing so, performance will be boosted:

SELECT DISTINCT column_name1, column_name2
FROM table_name1
WHERE column_name1 = some_value OR column_name2 = some_value

And if you can assume that neither criteria will return duplicate rows, you can even further boost the performance of this query by
removing the DISTINCT clause. [6.5, 7.0, 2000] Added 6-5-2003

*****

Carefully evaluate whether your SELECT query needs the DISTINCT clause or not. Some developers automatically add this clause
to every one of their SELECT statements, even when it is not necessary. This is a bad habit that should be stopped.
The DISTINCT clause should only be used in SELECT statements if you know that duplicate returned rows are a possibility, and that
having duplicate rows in the result set would cause problems with your application.

The DISTINCT clause creates a lot of extra work for SQL Server, and reduces the physical resources that other SQL statements have at
their disposal. Because of this, only use the DISTINCT clause if it is necessary. [6.5, 7.0, 2000] Updated 10-30-2003

*****

In your queries, don't return column data you don't need. For example, you should not use SELECT * to return all the columns from a
table if you don't need all the data from each column. In addition, using SELECT * prevents the use of covered indexes, further potentially
hurting query performance. [6.5, 7.0, 2000] Updated 6-21-2004

*****

If your application allows users to run queries, but you are unable in your application to easily prevent users from returning hundreds, even
thousands, of unnecessary rows of data they don't need, consider using the TOP operator within the SELECT statement. This way, you
can limit how many rows are returned, even if the user doesn't enter any criteria to help reduce the number of rows returned to the client.
For example, the statement:

SELECT TOP 100 fname, lname FROM customers
WHERE state = 'mo'

limits the results to the first 100 rows returned, even if 10,000 rows actually meet the criteria of the WHERE clause. When the specified
number of rows is reached, all processing on the query stops, potentially saving SQL Server overhead, and boosting performance.

The TOP operator works by allowing you to specify a specific number of rows to be returned, like the example above, or by specifying a
percentage value, like this:

SELECT TOP 10 PERCENT fname, lname FROM customers
WHERE state = 'mo'

In the above example, only 10 percent of the available rows would be returned.

In SQL Server 2005, a new argument has been added for the TOP statement. Books Online specifies:

[
TOP (expression) [PERCENT]
[ WITH TIES ]
]

Example:

USE AdventureWorks
GO
SELECT TOP(10) PERCENT WITH TIES
EmployeeID, Title, DepartmentID, Gender, BaseRate
FROM HumanResources.Employee
ORDER BY BaseRate DESC

What the WITH TIES option does is allow more than the specified number or percent of rows to be returned if the values of the last
group of rows are identical. If you don't use this option, then any tied rows will be arbitrarily dropped so that only the exact number
of rows specified by the TOP statement is returned.

In addition to the above new feature, SQL Server 2005 allows the TOP statement to be used with DML statements, such as DELETE,
INSERT and UPDATE. Also, the TOP statement cannot be used in conjunction with UPDATE and DELETE statements on partitioned
views.
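
For example, in SQL Server 2005 the following sketch (reusing the customers table from the earlier examples) deletes no more than 100 matching rows:

DELETE TOP (100)
FROM customers
WHERE state = 'mo'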

No changes were made to the SET ROWCOUNT statement in SQL Server 2005, and usually the SET ROWCOUNT value overrides the
SELECT statement TOP keyword if the ROWCOUNT is the smaller value.

Keep in mind that using this option may prevent the user from getting the data they need. For example, the data they are looking for may be
in record 101, but they only get to see the first 100 records. Because of this, use this option with discretion. [7.0, 2000] Updated 4-7-2005

*****

You may have heard of a SET command called SET ROWCOUNT. Like the TOP operator, it is designed to limit how many rows are
returned from a SELECT statement. In effect, the SET ROWCOUNT and the TOP operator perform the same function.
While in most cases using either option works equally efficiently, there are some instances (such as rows returned from an unsorted heap)
where the TOP operator is more efficient than using SET ROWCOUNT. Because of this, using the TOP operator is preferable to using
SET ROWCOUNT to limit the number of rows returned by a query. [6.5, 7.0, 2000] Updated 10-30-2003
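
For comparison, the SET ROWCOUNT equivalent of the earlier TOP example looks like this (remember to reset the value to 0 when you are done):

SET ROWCOUNT 100
SELECT fname, lname FROM customers WHERE state = 'mo'
SET ROWCOUNT 0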

*****

In a WHERE clause, the various operators used directly affect how fast a query is run. This is because some operators lend
themselves to speed over other operators. Of course, you may not have any choice of which operator you use in your WHERE clauses, but
sometimes you do.

Here are the key operators used in the WHERE clause, ordered by their performance. Those operators at the top will produce results faster
than those listed at the bottom.

• =

• >, >=, <, <=

• LIKE

• <>
The lesson here is to use = as much as possible, and <> as little as possible. [6.5, 7.0, 2000] Added 5-30-2003
*****
In a WHERE clause, the various operands used directly affect how fast a query is run. This is because some operands lend
themselves to speed over other operands. Of course, you may not have any choice of which operand you use in your WHERE clauses, but
sometimes you do.
Here are the key operands used in the WHERE clause, ordered by their performance. Those operands at the top will produce results faster
than those listed at the bottom.

• A single literal used by itself on one side of an operator

• A single column name used by itself on one side of an operator, a single parameter used by itself on one side of an operator

• A multi-operand expression on one side of an operator

• A single exact number on one side of an operator

• Other numeric number (other than exact), date and time

• Character data, NULLs


The simpler the operand, the better the performance; operands using exact numbers perform best of all. [6.5, 7.0, 2000] Added 5-30-2003
*****
If a WHERE clause includes multiple expressions, there is generally no performance benefit gained by ordering the various
expressions in any particular order. This is because the SQL Server Query Optimizer does this for you, saving you the effort. There are
a few exceptions to this, which are discussed on this web site. [7.0, 2000] Added 5-30-2003
*****
Don't include code that doesn't do anything. This may sound obvious, but I have seen this in some off-the-shelf SQL Server-based
applications. For example, you may see code like this:
SELECT column_name FROM table_name
WHERE 1 = 0
When this query is run, no rows will be returned. Obviously, this is a simple example (most of the cases where I have seen this done
have been very long queries); a query like this (or part of a larger query) doesn't perform anything useful and shouldn't be run. It
is just wasting SQL Server resources. In addition, I have seen more than one case where such dead code actually causes SQL Server to
throw errors, preventing the code from even running. [6.5, 7.0, 2000] Added 5-30-2003
*****
By default, some developers, especially those who have not worked with SQL Server before, routinely include code similar to this in their
WHERE clauses when they make string comparisons:
SELECT column_name FROM table_name
WHERE LOWER(column_name) = 'name'
In other words, these developers are making the assumption that the data in SQL Server is case-sensitive, which it generally is not. If your
SQL Server database is not configured to be case sensitive, you don't need to use LOWER or UPPER to force the case of text to be
equal for a comparison to be performed. Just leave these functions out of your code. This will speed up the performance of your query,
as any use of text functions in a WHERE clause hurts performance.
But what if your database has been configured to be case-sensitive? Should you then use the LOWER and UPPER functions to ensure that
comparisons are properly compared? No. The above example is still poor coding. If you have to deal with ensuring case is consistent for
proper comparisons, use the technique described below, along with appropriate indexes on the column in question:
SELECT column_name FROM table_name
WHERE column_name = 'NAME' or column_name = 'name'
This code will run much faster than the first example. [6.5, 7.0, 2000] Added 5-30-2003
*****
Try to avoid WHERE clauses that are non-sargable. The term "sargable" (which is in effect a made-up word) comes from the pseudo-
acronym "SARG", which stands for "Search ARGument," which refers to a WHERE clause that compares a column to a constant value. If
a WHERE clause is sargable, this means that it can take advantage of an index (assuming one is available) to speed completion of the
query. If a WHERE clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot take advantage of an index,
instead performing a table/index scan, which may cause the query's performance to suffer.
Non-sargable search arguments in the WHERE clause, such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN",
"NOT LIKE", and "LIKE '%500'" generally prevent (though not always) the query optimizer from using an index to perform a search. In
addition, expressions that include a function on a column, expressions that have the same column on both sides of the operator, or
comparisons against a column (not a constant), are not sargable.
But not every WHERE clause that has a non-sargable expression in it is doomed to a table/index scan. If the WHERE clause includes both
sargable and non-sargable clauses, then at least the sargable clauses can use an index (if one exists) to help access the data quickly.
In many cases, if there is a covering index on the table, which includes all of the columns in the SELECT, JOIN, and WHERE clauses in a
query, then the covering index can be used instead of a table/index scan to return a query's data, even if it has a non-sargable WHERE
clause. But keep in mind that covering indexes have their own drawbacks, such as producing very wide indexes that increase disk I/O
when they are read.
In some cases, it may be possible to rewrite a non-sargable WHERE clause into one that is sargable. For example, the clause:
WHERE SUBSTRING(firstname,1,1) = 'm'
can be rewritten like this:
WHERE firstname like 'm%'
Both of these WHERE clauses produce the same result, but the first one is non-sargable (it uses a function) and will run slow, while the
second one is sargable, and will run much faster.
WHERE clauses that perform some function on a column are non-sargable. On the other hand, if you can rewrite the WHERE clause so
that the column and function are separate, then the query can use an available index, greatly boosting performance. For example:
Function Acts Directly on Column, and Index Cannot Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE DATEDIFF(yy,dateofbirth,GETDATE()) > 21
Function Has Been Separated From Column, and an Index Can Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE dateofbirth < DATEADD(yy,-21,GETDATE())
Each of the above queries produces the same results, but the second query will use an index because the function is not performed directly
on the column, as it is in the first example. The moral of this story is to try to rewrite WHERE clauses that have functions so that the
function does not act directly on the column.
WHERE clauses that use NOT are not sargable, but can often be rewritten to remove the NOT from the WHERE clause, for example:
WHERE NOT column_name > 5
to
WHERE column_name <= 5
Each of the above clauses produce the same results, but the second one is sargable.
If you don't know if a particular WHERE clause is sargable or non-sargable, check out the query's execution plan in Query Analyzer.
Doing this, you can very quickly see if the query will be using index lookups or table/index scans to return your results.
With some careful analysis, and some clever thought, many non-sargable queries can be written so that they are sargable. Your goal for
best performance (assuming it is possible) is to get the left side of a search condition to be a single column name, and the right side an easy
to look up value. [6.5, 7.0, 2000] Updated 6-2-2003
*****
If you run into a situation where a WHERE clause is not sargable because of the use of a function on the column side of an equality
sign (and there is no other way to rewrite the WHERE clause), consider creating an index on a computed column instead. This way, you
avoid the non-sargable WHERE clause altogether, using the results of the function in your WHERE clause instead.
Because of the additional overhead required for indexes on computed columns, you will only want to do this if you need to run this same
query over and over in your application, thereby justifying the overhead of the indexed computed column. [2000] Updated 6-21-2004
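
A sketch, reusing the members table from the earlier examples (the column and index names are illustrative): because UPPER(last_name) is deterministic, it can live in a computed column, be indexed, and then be referenced directly in the WHERE clause:

-- requires the standard SET options for indexed computed columns
ALTER TABLE members ADD upper_last_name AS UPPER(last_name)
CREATE INDEX IX_members_upper_last_name ON members (upper_last_name)

SELECT member_number, first_name, last_name
FROM members
WHERE upper_last_name = 'SMITH'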
*****
If you currently have a query that uses NOT IN, which offers poor performance because the SQL Server optimizer has to use a nested
table scan to perform this activity, try one of the following options instead, all of which offer better performance (see the sketch after this list):

• Use EXISTS or NOT EXISTS

• Use IN
• Perform a LEFT OUTER JOIN and check for a NULL condition
[6.5, 7.0, 2000] Updated 10-30-2003
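
For example, with hypothetical customer and orders tables, a NOT IN query can be recast either way:

-- NOT IN (slow):
SELECT customer_number FROM customer
WHERE customer_number NOT IN (SELECT customer_number FROM orders)

-- NOT EXISTS:
SELECT c.customer_number FROM customer c
WHERE NOT EXISTS
  (SELECT * FROM orders o WHERE o.customer_number = c.customer_number)

-- LEFT OUTER JOIN with a NULL check:
SELECT c.customer_number
FROM customer c
LEFT OUTER JOIN orders o ON c.customer_number = o.customer_number
WHERE o.customer_number IS NULL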
*****
When you have a choice of using the IN or the EXISTS clause in your Transact-SQL, you will generally want to use the EXISTS
clause, as it is usually more efficient and performs faster. [6.5, 7.0, 2000] Updated 10-30-2003
*****
If you find that SQL Server uses a TABLE SCAN instead of an INDEX SEEK when you use an IN or OR clause as part of your
WHERE clause, even when those columns are covered by an index, consider using an index hint to force the Query Optimizer to use the
index.
For example:
SELECT * FROM tblTaskProcesses WHERE nextprocess = 1 AND processid IN (8,32,45)
takes about 3 seconds, while:

SELECT * FROM tblTaskProcesses (INDEX = IX_ProcessID) WHERE nextprocess = 1 AND processid IN (8,32,45)
returns in under a second. [7.0, 2000] Updated 6-21-2004 Contributed by David Ames
*****
If you use LIKE in your WHERE clause, try to use one or more leading characters in the clause, if at all possible. For example, use:
LIKE 'm%'
not:
LIKE '%m'
If you use a leading character in your LIKE clause, then the Query Optimizer has the ability to potentially use an index to perform the
query, speeding performance and reducing the load on SQL Server.
But if the leading character in a LIKE clause is a wildcard, the Query Optimizer will not be able to use an index, and a table scan must be
run, reducing performance and taking more time.
The more leading characters you can use in the LIKE clause, the more likely the Query Optimizer will find and use a suitable index. [6.5,
7.0, 2000] Updated 10-30-2003
*****
If your application needs to retrieve summary data often, but you don't want the overhead of calculating it on the fly every
time it is needed, consider using a trigger that updates the values in a summary table after each transaction.
While the trigger has some overhead, overall it may be less than having to calculate the data every time the summary data is needed. You
may have to experiment to see which method is fastest for your environment. [6.5, 7.0, 2000] Updated 10-30-2003
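
A minimal sketch of the trigger approach (all table and column names here are hypothetical, and the summary rows are assumed to already exist):

CREATE TRIGGER tr_orders_summary ON orders
FOR INSERT
AS
-- add the newly inserted amounts to the matching summary rows
UPDATE s
SET s.order_total = s.order_total + i.amt
FROM order_summary s
JOIN (SELECT customer_number, SUM(amount) AS amt
      FROM inserted
      GROUP BY customer_number) i
  ON s.customer_number = i.customer_number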
*****
If your application needs to insert a large binary value into an image data column, perform this task using a stored procedure, not using
an INSERT statement embedded in your application.
The reason for this is that the application must first convert the binary value into a character string (which doubles its size, thus
increasing network traffic and taking more time) before it can be sent to the server. And when the server receives the character string, it
then has to convert it back to the binary format (taking even more time).
Using a stored procedure avoids all this because all the activity occurs on the SQL Server, and little data is transmitted over the network.
[6.5, 7.0, 2000] Updated 10-30-2003
*****
When you have a choice of using the IN or the BETWEEN clauses in your Transact-SQL, you will generally want to use the
BETWEEN clause, as it is much more efficient. For example:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number in (1000, 1001, 1002, 1003, 1004)
is much less efficient than this:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number BETWEEN 1000 and 1004
Assuming there is a useful index on customer_number, the Query Optimizer can locate a range of numbers much faster (using
BETWEEN) than it can find a series of numbers using the IN clause (which is really just another form of the OR clause). [6.5, 7.0, 2000]
Updated 10-30-2003
*****
If possible, try to avoid using the SUBSTRING function in your WHERE clauses. Depending on how it is constructed, using the
SUBSTRING function can force a table scan instead of allowing the optimizer to use an index (assuming there is one). If the substring you
are searching for does not include the first character of the column you are searching for, then a table scan is performed.
If possible, you should avoid using the SUBSTRING function and use the LIKE condition instead, for better performance.
Instead of doing this:
WHERE SUBSTRING(column_name,1,1) = 'b'
Try using this instead:
WHERE column_name LIKE 'b%'
If you decide to make this choice, keep in mind that you will want your LIKE condition to be sargable, which means that you cannot place
a wildcard in the first position. [6.5, 7.0, 2000] Updated 6-4-2003
*****
Where possible, avoid string concatenation in your Transact-SQL code, as it is not a fast process, contributing to overall slower
performance of your application. [6.5, 7.0, 2000] Updated 10-30-2003
*****
Generally, avoid using optimizer hints in your queries. This is because it is generally very hard to outguess the Query Optimizer.
Optimizer hints are special keywords that you include with your query to force how the Query Optimizer runs. If you decide to include a
hint in a query, this forces the Query Optimizer to become static, preventing the Query Optimizer from dynamically adapting to the current
environment for the given query. More often than not, this hurts, not helps performance.
If you think that a hint might be necessary to optimize your query, be sure you do all of the following first:

• Update the statistics on the relevant tables.

• If the problem query is inside a stored procedure, recompile it.

• Review the search arguments to see if they are sargable, and if not, try to rewrite them so that they are sargable.

• Review the current indexes, and make changes if necessary.


If you have done all of the above, and the query is not running as you expect, then you may want to consider using an appropriate
optimizer hint.
If you haven't heeded my advice and have decided to use some hints, keep in mind that as your data changes, and as the Query Optimizer
changes (through service packs and new releases of SQL Server), your hard-coded hints may no longer offer the benefits they once did. So
if you use hints, you need to periodically review them to see if they are still performing as expected. [6.5, 7.0, 2000] Updated 6-21-2004
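For reference, the first two items on the checklist above translate into statements like these (the table and procedure names are hypothetical):
-- Refresh the statistics the Query Optimizer relies on
UPDATE STATISTICS orders
-- Mark the stored procedure for recompilation on its next execution
EXEC sp_recompile 'dbo.usp_get_orders'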
*****
If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them
from left to right in the order they are written. This assumes that no parentheses have been used to change the order of execution.
Because of this, you may want to consider one of the following when using AND:

• Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately,
saving time.

• If both parts of an AND expression are equally likely to be false, put the least complex AND expression first. This way, if it is
false, less work will have to be done to evaluate the expression.
You may want to consider using Query Analyzer to look at the execution plans of your queries to see which is best for your situation. [6.5,
7.0, 2000] Updated 6-21-2004
*****
If you want to boost the performance of a query that includes an AND operator in the WHERE clause, consider the following:

• Of the search criteria in the WHERE clause, at least one of them should be based on a highly selective column that has an
index.

• If at least one of the search criteria in the WHERE clause is not highly selective, consider adding indexes to all of the
columns referenced in the WHERE clause.

• If none of the columns in the WHERE clause are selective enough to use an index on their own, consider creating a covering
index for this query.
[7.0, 2000] Updated 9-6-2004
*****
The Query Optimizer will perform a table scan or a clustered index scan on a table if the WHERE clause in the query contains an OR
operator and if any of the referenced columns in the OR clause are not indexed (or do not have a useful index). Because of this, if you
use many queries with OR clauses, you will want to ensure that each referenced column in the WHERE clause has a useful index. [7.0,
2000] Updated 9-6-2004
*****
A query with one or more OR clauses can sometimes be rewritten as a series of queries that are combined with a UNION ALL
statement, in order to boost the performance of the query. For example, let's take a look at the following query:

SELECT employeeID, firstname, lastname


FROM names
WHERE dept = 'prod' or city = 'Orlando' or division = 'food'
This query has three separate conditions in the WHERE clause. In order for this query to use an index, there must be an index on all
three columns found in the WHERE clause.

This same query can be written using UNION ALL instead of OR, like this example:

SELECT employeeID, firstname, lastname FROM names WHERE dept = 'prod'


UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE city = 'Orlando'
UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE division = 'food'

Each of these queries will produce the same results. If there is only an index on dept, but not on the other columns in the WHERE clause, then
the first version will not use any index and a table scan must be performed. But the second version of the query will use the index for
part of the query, though not for all of it.

Admittedly, this is a very simple example, but even so, it does demonstrate how rewriting a query can affect whether or not an index is
used. If this query was much more complex, then the approach of using UNION ALL might be much more efficient, as it allows you
to tune each part of the query individually, something that cannot be done if you use only ORs in your query.
Note that I am using UNION ALL instead of UNION. The reason for this is to prevent the UNION statement from trying to sort the data
and remove duplicates, which hurts performance. Of course, if there is the possibility of duplicates, and you want to remove them, then
you can use just UNION.

If you have a query that uses ORs and is not making the best use of indexes, consider rewriting it as a UNION ALL, and then testing
performance. Only through testing can you be sure that one version of your query will be faster than another. [7.0, 2000] Updated 9-6-
2004
*****
Don't use ORDER BY in your SELECT statements unless you really need to, as it adds a lot of extra overhead. For example, it may
be more efficient to sort the data at the client than at the server. In other cases, perhaps the client doesn't even need sorted data to
achieve its goal. The key here is to remember that you shouldn't automatically sort data, unless you know it is necessary. [6.5, 7.0, 2000]
Updated 9-6-2004
*****
Whenever SQL Server has to perform a sorting operation, additional resources have to be used to perform this task. Sorting often
occurs when any of the following Transact-SQL statements are executed:

• ORDER BY

• GROUP BY

• SELECT DISTINCT

• UNION

• CREATE INDEX (generally not as critical, as it happens much less often)


In many cases, these commands cannot be avoided. On the other hand, there are a few ways that sorting overhead can be reduced. These
include:

• Keep the number of rows to be sorted to a minimum. Do this by only returning those rows that absolutely need to be sorted.

• Keep the number of columns to be sorted to the minimum. In other words, don't sort more columns than required.

• Keep the width (physical size) of the columns to be sorted to a minimum.

• Sort columns with numeric datatypes instead of character datatypes.


When using any of the above Transact-SQL commands, try to keep the above performance-boosting suggestions in mind. [6.5, 7.0, 2000]
Added 6-5-2003
*****
If you have to sort by a particular column often, consider making that column the clustered index key. This is because the data is already
presorted for you and SQL Server is smart enough not to re-sort the data. [6.5, 7.0, 2000] Added 6-5-2003
*****
If your WHERE clause includes an IN operator along with a list of values to be tested in the query, order the list of values so that the
most frequently found values are placed at the front of the list, and the less frequently found values are placed at the end of the list. This can
speed performance because the IN option returns true as soon as any of the values in the list produce a match. The sooner the match is
made, the faster the query completes. [6.5, 7.0, 2000] Updated 4-6-2004
*****
If you need to use the SELECT INTO option, keep in mind that it can lock system tables, preventing other users from accessing the data
they need. If you do need to use SELECT INTO, try to schedule it when your SQL Server is less busy, and try to keep the amount of data
inserted to a minimum. [6.5, 7.0, 2000] Updated 4-6-2004
*****
If your SELECT statement contains a HAVING clause, write your query so that the WHERE clause does most of the work (removing
undesired rows) instead of making the HAVING clause remove them. Using the WHERE clause appropriately can
eliminate unnecessary rows before they get to the GROUP BY and HAVING clauses, saving some unnecessary work, and boosting
performance.
For example, in a SELECT statement with WHERE, GROUP BY, and HAVING clauses, here's what happens. First, the WHERE clause is
used to select the appropriate rows that need to be grouped. Next, the GROUP BY clause divides the rows into sets of grouped rows, and
then aggregates their values. And last, the HAVING clause then eliminates undesired aggregated groups. If the WHERE clause is used to
eliminate as many of the undesired rows as possible, this means the GROUP BY and the HAVING clauses will have less work to do,
boosting the overall performance of the query. [6.5, 7.0, 2000] Updated 4-6-2004
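As a simple illustration using the Northwind sample database, the first of the following two equivalent queries filters rows out in the
WHERE clause before any grouping takes place, while the second makes the HAVING clause do that work:
USE Northwind
SELECT ProductID, SUM(Quantity) AS TotalQuantity
FROM [Order Details]
WHERE ProductID < 10
GROUP BY ProductID
USE Northwind
SELECT ProductID, SUM(Quantity) AS TotalQuantity
FROM [Order Details]
GROUP BY ProductID
HAVING ProductID < 10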
*****
If your application performs many wildcard (LIKE %) text searches on CHAR or VARCHAR columns, consider using SQL Server's
full-text search option. The Search Service can significantly speed up wildcard searches of text stored in a database. [7.0, 2000] Updated
4-6-2004
*****
The GROUP BY clause can be used with or without an aggregate function. But if you want optimum performance, don't use the GROUP
BY clause without an aggregate function. This is because you can accomplish the same end result by using the DISTINCT option
instead, and it is faster.
For example, you could write your query two different ways:
USE Northwind
SELECT OrderID
FROM [Order Details]
WHERE UnitPrice > 10
GROUP BY OrderID
or
USE Northwind
SELECT DISTINCT OrderID
FROM [Order Details]
WHERE UnitPrice > 10
Both of the above queries produce the same results, but the second one will use less resources and perform faster. [6.5, 7.0, 2000]
Updated 11-15-2004
*****
The GROUP BY clause can be sped up if you follow these suggestions:

• Keep the number of rows returned by the query as small as possible.

• Keep the number of groupings as few as possible.

• Don't group redundant columns.

• If there is a JOIN in the same SELECT statement that has a GROUP BY, try to rewrite the query to use a subquery instead of
using a JOIN. If this is possible, performance will be faster. If you have to use a JOIN, try to make the GROUP BY column
from the same table as the column or columns on which the set function is used.

• Consider adding an ORDER BY clause to the SELECT statement that orders by the same column as the GROUP BY. This may
cause the GROUP BY to perform faster. Test this to see if it is true in your particular situation.
[7.0, 2000] Added 6-6-2003
*****
Sometimes perception is more important than reality. For example, which of the following two queries is faster:
• A query that takes 30 seconds to run, and then displays all of the required results.

• A query that takes 60 seconds to run, but displays the first screen full of records in less than 1 second.
Most DBAs would choose the first option as it takes fewer server resources and performs faster. But from many users' point of view, the
second one may be more palatable. By getting immediate feedback, the user gets the impression that the application is fast, even though in
the background, it is not.
If you run into situations where perception is more important than raw performance, consider using the FAST query hint. The FAST query
hint is used with the SELECT statement using this form:
OPTION(FAST number_of_rows)
where number_of_rows is the number of rows that are to be displayed as fast as possible.
When this hint is added to a SELECT statement, it tells the Query Optimizer to return the specified number of rows as fast as possible,
without regard to how long it will take to perform the overall query. Before rolling out an application using this hint, I would suggest you
test it thoroughly to see that it performs as you expect. You may find out that the query may take about the same amount of time whether
the hint is used or not. If this is the case, then don't use the hint. [7.0, 2000] Updated 11-15-2004
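For example, reusing the customer table from an earlier tip, a query that should paint the first screen quickly might look like this:
SELECT customer_number, customer_name
FROM customer
ORDER BY customer_name
OPTION (FAST 20)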
*****
Instead of using temporary tables, consider using a derived table. A derived table is the result of using a SELECT statement in
the FROM clause of an existing SELECT statement. By using derived tables instead of temporary tables, you can reduce I/O and boost
your application's performance. [7.0, 2000] Updated 11-15-2004
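Here is a simple sketch of a derived table, using the Northwind sample database; the inner SELECT takes the place of what might
otherwise be a temporary table:
USE Northwind
SELECT d.OrderID, d.TotalQuantity
FROM
(SELECT OrderID, SUM(Quantity) AS TotalQuantity
FROM [Order Details]
GROUP BY OrderID) AS d
WHERE d.TotalQuantity > 100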
*****
It is a fairly common request to write a Transact-SQL query to compare a parent table and a child table and find out if there are any
parent records that don't have a match in the child table. Generally, there are three ways this can be done:
Using a NOT EXISTS

SELECT a.hdr_key
FROM hdr_tbl a
WHERE NOT EXISTS (SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)
Using a LEFT JOIN
SELECT a.hdr_key
FROM hdr_tbl a
LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key
WHERE b.hdr_key IS NULL
Using a NOT IN

SELECT hdr_key
FROM hdr_tbl
WHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)
In each case, the above query will return identical results. But, which of these three variations of the same query produces the best
performance? Assuming everything else is equal, the best performing version through the worst performing version will be from top to
bottom, as displayed above. In other words, the NOT EXISTS variation of this query is generally the most efficient.
I say generally, because the indexes found on the tables, along with the number of rows in each table, can influence the results. If you are
not sure which variation to use, you can try them all and see which produces the best results in your particular circumstances. [7.0,
2000] Updated 11-15-2004
*****
Be careful when using OR in your WHERE clause; it is fairly simple to accidentally retrieve much more data than you need, which
hurts performance. For example, take a look at the query below:
SELECT companyid, plantid, formulaid
FROM batchrecords
WHERE companyid = '0001' and plantid = '0202' and formulaid = '39988773'
OR
companyid = '0001' and plantid = '0202'
As you can see from this query, the WHERE clause is redundant, as:
companyid = '0001' and plantid = '0202' and formulaid = '39988773'
is a subset of:
companyid = '0001' and plantid = '0202'
In other words, the first OR branch of this query is redundant. Unfortunately, the SQL Server Query Optimizer isn't smart enough to know this, and will do
exactly what you tell it to. What will happen is that SQL Server will have to retrieve all the data you have requested, then in effect do a
SELECT DISTINCT to remove redundant rows it unnecessarily finds.
In this case, if you drop this code from the query:
OR
companyid = '0001' and plantid = '0202'
then run the query, you will receive the same results, but with much faster performance. [6.5, 7.0, 2000] Updated 11-15-2004
*****
If you need to verify the existence of a record in a table, don't use SELECT COUNT(*) in your Transact-SQL code to identify it, which
is very inefficient and wastes server resources. Instead, use the Transact-SQL IF EXISTS to determine if the record in question exists, which
is much more efficient. For example:
Here's how you might use COUNT(*):
IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx') > 0
Here's a faster way, using IF EXISTS:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')
The reason IF EXISTS is faster than COUNT(*) is because the query can end immediately when the test is proven true, while COUNT(*)
must go through every record, whether there is only one, or thousands, before the test can be evaluated. [7.0, 2000] Updated 11-15-
2004
*****
Let's say that you often need to INSERT the same value into a column. For example, perhaps you have to perform 100,000 INSERTs a
day into a particular table, and that 90% of the time the data INSERTed into one of the columns of the table is the same value.
If this is the case, you can reduce network traffic (along with some SQL Server overhead) by creating this particular column with a default
value of the most common value. This way, when you INSERT your data, and the data is the default value, you don't INSERT any data
into this column, instead allowing the default value to automatically be filled in for you. But when the value needs to be different, you will
of course INSERT that value into the column. [6.5, 7.0, 2000] Updated 11-15-2004
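A minimal sketch of this technique, with hypothetical names, where 'A' is the value INSERTed 90% of the time:
CREATE TABLE orders
(
order_id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, customer_id INT NOT NULL
, status CHAR(1) NOT NULL DEFAULT 'A'
)
-- The common case: the status column is omitted and the default fills it in
INSERT INTO orders (customer_id) VALUES (42)
-- The exception: the value is supplied explicitly
INSERT INTO orders (customer_id, status) VALUES (43, 'X')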
*****
Performing UPDATEs takes extra resources. When performing an UPDATE, try to follow as many of the
following recommendations as you can in order to reduce the amount of resources required. The more of the
following suggestions you can apply, the faster the UPDATE will perform.

• If you are UPDATing a column of a row that has a unique index, try to only update one row at a time.

• Try not to change the value of a column that is also the primary key.

• When updating VARCHAR columns, try to replace the contents with contents of the same length.

• Try to minimize the UPDATing of tables that have UPDATE triggers.

• Try to avoid UPDATing columns that will be replicated to other databases.

• Try to avoid UPDATing heavily indexed columns.

• Try to avoid UPDATing a column that has a reference in the WHERE clause to the column being updated.
Of course, you may have very little choice when UPDATing your data, but at least give the above suggestions a thought. [6.5, 7.0, 2000]
Added 7-2-2003
*****
If you have created a complex transaction that includes several parts, one of which has a higher probability of rolling back the
transaction than the others, better performance will be provided if you locate the most-likely-to-fail part at the front of the
greater transaction. This way, if this more-likely-to-fail part has to roll back because of a failure, no resources have been wasted
on the other less-likely-to-fail parts. [6.5, 7.0, 2000] Added 7-2-2003

How do you determine the Nth row in a SQL Server database?

Answer

Consider the Pubs sample database that ships with SQL Server 2000. Our task is to determine the third most recent date on which an
employee joined the company. Several approaches are possible here. Let's first have a look at the different methods available to us,
before getting into a basic performance analysis.
1) Using TOP. This is probably the most intuitive one.

SELECT TOP 1
hire_date
FROM
employee
WHERE
hire_date
NOT IN(
SELECT TOP 2
hire_date
FROM
employee
ORDER BY
hire_date DESC)
ORDER BY
hire_date DESC

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

Not much explanation needed here. The NOT IN rules out the TOP 2 hire dates. And from the remaining rows we take the TOP 1 hire
date. This is a straightforward approach.

2) Here we use the SQL Server feature that assigns the value of the last row processed to a variable.

DECLARE @dt DATETIME


SELECT TOP 3
@dt = hire_date
FROM
employee
ORDER BY
hire_date DESC
SELECT @dt hire_date

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

The variable @dt is assigned a value for every row in the resultset. But since we force an ORDER BY, the last row processed contains the date
we are interested in, and @dt is assigned this value. We now only need to SELECT the variable to get the desired result.

3) Use a temporary table. Below, we show a generic approach that uses a stored procedure that accepts the desired row as an input
parameter and returns the corresponding hire date:

USE PUBS
GO
CREATE PROC dbo.GetNthLatestEntry (@NthLatest INT)
AS
SET NOCOUNT ON
BEGIN
CREATE TABLE #Entry
(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, Entry DATETIME NOT NULL
)
INSERT INTO #Entry (Entry) SELECT hire_date FROM employee ORDER BY hire_date DESC
SELECT
Entry hire_date
FROM
#Entry
WHERE
ID = @NthLatest
DROP TABLE #Entry
END
SET NOCOUNT OFF
GO
EXEC dbo.GetNthLatestEntry 3
DROP PROCEDURE dbo.GetNthLatestEntry

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

4) Until now we have always used one proprietary SQL Server feature or another: either TOP or the IDENTITY property. Now we try
to make this portable and use ANSI SQL.

SELECT
e1.hire_date
FROM
employee AS e1
INNER JOIN
employee AS e2
ON
e1.hire_date <= e2.hire_date
GROUP BY
e1.hire_date
HAVING COUNT(DISTINCT e2.hire_date) = 3

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

If you are interested in how this statement works, we suggest you have a look at the books by Joe Celko.

So, we now have four different methods to get the same result. Which should we choose? Well, the classical answer here: It depends!
If your goal is to make your SQL as portable as possible, you will surely choose the ANSI SQL method. If, however, you do not
care about portability, you still have three different methods to choose from. Let's now have a look at the output of SET
STATISTICS IO ON. The results below correspond to the four methods described above.

1. Table 'employee'. Scan count 4, logical reads 8, physical reads 0, read-ahead reads 0.

2. Table 'employee'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

3. Table 'employee'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

4. Table 'employee'. Scan count 44, logical reads 88, physical reads 0, read-ahead reads 0.
As you can see, one method clearly differs from all the others. This is the ANSI SQL method. Portability has its price. The first
method was the TOP method; it creates 4 times the IO of the other two methods. Though it is logical IO, it still is IO. So, the choice now
is between the temp table approach and the variable assignment approach. A choice here might depend on how busy your whole
system is. The use of temp tables might cause issues in tempdb. So, for such simple questions, using the variable assignment method
seems to be a fairly reasonable choice. Running Profiler to measure the duration here is not very meaningful, since the employee table
all in all has just 43 rows, so every method executes very fast. On larger tables it is good practice to build up a test scenario to see
how the different methods perform in your specific environment.

Why can it take so long to drop a clustered index?


Answer

Generally speaking, indexes can speed up queries tremendously. This comes at a cost, as changes to the underlying data have to be
reflected in the indexes when the index column(s) are modified.

Before we get into the reasons why dropping a clustered index can be time-consuming, we need to take a short look at the different
index structures in SQL Server.

Every table can have one, and only one, clustered index. A clustered index sorts the data physically according to its index keys. And
since there can only be one physically sorted order on a table at a time, this sounds pretty obvious. If a table does not have a clustered
index, it is called a heap.

The second index structure is the non-clustered index. You can create non-clustered indexes on tables with clustered indexes, heaps,
and indexed views.

The difference between both index structures is at the leaf level of the index. While the leaf level of a clustered index actually is the
table's data itself, you only find pointers to the data at the leaf level of a non-clustered index. Now we need to understand an important
difference:

• When a table has a clustered index created, the pointers contain the clustered index keys for that row.

• When a table does not have a clustered index, the pointers consist of the so-called RowID, which is a combination of
FileNumber:PageNumber:Slot.
When you understand this distinction, you can derive the answer to the original question yourself. When you drop a clustered index,
SQL Server will have to recreate all non-clustered indexes on that table (assuming there are any). During this recreation, the clustered
index keys are replaced by the RowID. It should be obvious that this is a time-consuming operation, especially on larger tables or
tables with many indexes.
One way around the long delay when dropping a clustered index (assuming there are also non-clustered indexes) is to first drop the
non-clustered indexes, and then drop the clustered index. Likewise, you should first create the clustered index and then the non-
clustered indexes. Sticking to these recommendations, you won't waste time and server resources that are better spent elsewhere.
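In SQL Server 2000 syntax, that drop order looks like this (the table and index names are hypothetical, and the indexes are assumed
not to be backing constraints):
-- Drop the non-clustered indexes first...
DROP INDEX orders.IX_orders_customer_id
DROP INDEX orders.IX_orders_order_date
-- ...then drop the clustered index, so the non-clustered
-- indexes do not have to be rebuilt with RowIDs
DROP INDEX orders.CIX_orders_order_id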

Is it possible to keep the database in memory?

Answer

In a very real sense, SQL Server automatically attempts to keep as much of the database in memory as it can.

By default, when SQL Server is using memory dynamically, it queries the system periodically to determine the amount of free
physical memory available. If there is more memory free, SQL Server recommits memory to the buffer cache, which SQL Server uses
to store data for ready access. SQL Server adds memory to the buffer cache only when its workload requires more memory; a server at
rest does not grow its buffer cache.

SQL Server allocates much of its virtual memory to a buffer cache and uses the cache to reduce physical I/O. Each instance of SQL
Server automatically caches execution plans in memory based upon available memory. Data is read from the database disk files into
the buffer cache. Multiple logical reads of the data can be satisfied without requiring that the data be physically read again.

By maintaining a relatively large buffer cache in virtual memory, an instance of SQL Server can significantly reduce the number of
physical disk reads it requires.

Another method of providing performance improvement is using DBCC PINTABLE, which is used to store tables in memory on a
more or less permanent basis. It works best for small tables that are frequently accessed. The pages for the small table are read into
memory one time, and then all future references to their data do not require a disk read. SQL Server keeps a copy of the page available
in the buffer cache until the table is unpinned using the DBCC UNPINTABLE statement. This option should be used sparingly as it
can reduce the amount of overall buffer cache available for SQL Server to use dynamically.
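Pinning the authors table in the pubs sample database, for example, would look like this:
DECLARE @db_id INT, @object_id INT
SET @db_id = DB_ID('pubs')
SET @object_id = OBJECT_ID('pubs..authors')
-- Pin the table's pages in the buffer cache
DBCC PINTABLE (@db_id, @object_id)
-- ...and later, release it again
DBCC UNPINTABLE (@db_id, @object_id)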

How can you use IIf in Transact-SQL?

Answer

This is a quite common question. It is usually asked by people arriving at SQL Server with a background in Microsoft Access. They
either want to use SQL Server as a backend for their Access project, or they are otherwise upsizing from Access to SQL Server. The
answer, however, is usually not much appreciated at first:
There is no IIf in SQL Server's Transact SQL language!

Like it or not, such queries have to be rewritten using the CASE expression. Let's look at a simple example:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, IIf([Country]="Germany","0049 " & [Phone],[Phone]) AS Telefon
FROM
Customers
This is a valid query in Access, which evaluates within Access' Northwind sample database whether a Customer is located in Germany
or not. If this is the case (pun intended!), it automatically adds the international telephone number for Germany in front of the phone
number. If you try to run this in SQL Server's Query Analyzer, you'll get:
Server: Msg 170, Level 15, State 1, Line 5
Line 5: Incorrect syntax near '='.
That's it. The query stops with this error message. So, as was mentioned above, the query has to be rewritten using the CASE
expression. That might look something like this:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, CASE
WHEN Country='Germany'
THEN '0049 ' + Phone
ELSE Phone
END AS Phone
FROM
Customers
This is a valid Transact-SQL query, which SQL Server can understand and execute.
CASE is one of the most powerful expressions in the Transact-SQL language. In contrast to IIf, where you only evaluate one logical
expression at a time, this limitation does not exist for CASE. Try, for example, to put this in one single IIf expression:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, CASE Country
WHEN 'Germany'
THEN '0049 ' + Phone
WHEN 'Mexico'
THEN 'Fiesta ' + Phone
WHEN 'UK'
THEN 'Black Pudding (Yuk!) ' + Phone
ELSE Phone
END AS Phone
FROM
Customers
Don't spend too much time on the meaning of this query, but you will get the idea of what is possible with CASE. And once you are
familiar with using CASE, you'll hardly miss IIf anymore.
What happens when my integer IDENTITY runs out of scope?

Answer

Before we actually look at the answer, let's recall some basics of the IDENTITY property and SQL Server's numerical data types.

You can define the IDENTITY property on columns of the integer data types (TINYINT, SMALLINT, INT, BIGINT) and on DECIMAL columns with a scale of 0. This gives you a range of:

TINYINT 0 to 255

SMALLINT -32,768 to 32,767

INT -2,147,483,648 to 2,147,483,647

BIGINT -2^63 to 2^63-1


When you decide to use the DECIMAL datatype you have a potential range from -10^38+1 to 10^38-1.

So, keeping this in mind, we're now ready to answer the original question here. What happens when an INTEGER IDENTITY value is
about to run out of scope?
CREATE TABLE id_overflow
(
col1 INT IDENTITY(2147483647,1)
)
GO
INSERT INTO id_overflow DEFAULT VALUES
INSERT INTO id_overflow DEFAULT VALUES
SELECT * FROM id_overflow
DROP TABLE id_overflow

(1 row(s) affected)

Server: Msg 8115, Level 16, State 1, Line 2


Arithmetic overflow error converting IDENTITY to data type int.
Arithmetic overflow occurred.
This script creates a simple table with just one column of type INT with the IDENTITY property defined on it. But
instead of actually adding more than 2 billion rows to the table, we simply set the seed value to the positive maximum value for an
INTEGER. The first row inserted is assigned that value. Nothing unusual happens. The second insert, however, fails with the above
error. Apparently SQL Server does not start all over again or try to fill any existing gaps in the sequence. Actually, SQL Server
does nothing automatically here; you have to handle it yourself. But what can you do in such a case?
Probably the easiest solution is to alter the data type of the column to BIGINT, or maybe even to DECIMAL(38,0), like so:
CREATE TABLE id_overflow
(
col1 INT IDENTITY(2147483647,1)
)
GO
INSERT INTO id_overflow DEFAULT VALUES
ALTER TABLE id_overflow
ALTER COLUMN col1 BIGINT
INSERT INTO id_overflow DEFAULT VALUES
SELECT * FROM id_overflow
DROP TABLE id_overflow

col1
--------------------
2147483647
2147483648

(2 row(s) affected)
If you know in advance that your table needs to keep that many rows, you can do:
CREATE TABLE bigint_t
(
col1 BIGINT IDENTITY(-9223372036854775808, 1)
)
GO
INSERT INTO bigint_t DEFAULT VALUES
SELECT * FROM bigint_t
DROP TABLE bigint_t

col1
--------------------
-9223372036854775808

(1 row(s) affected)
Or the DECIMAL(38,0) variation:
CREATE TABLE decimal_t
(
col1 DECIMAL(38,0) IDENTITY(-99999999999999999999999999999999999999, 1)
)
GO
INSERT INTO decimal_t DEFAULT VALUES
SELECT * FROM decimal_t
DROP TABLE decimal_t

col1
----------------------------------------
-99999999999999999999999999999999999999

(1 row(s) affected)
Those negative numbers might offend one's aesthetic taste, but it's a fact that one shouldn't have to worry about
running out of values for quite some time.

What is the difference between DELETE and TRUNCATE? Is one faster than the other?

Answer

DELETE logs the data for each row affected by the statement in the transaction log and physically removes the record from the file.
The recording of each affected row can let your transaction log grow massively. However, when you run your databases in full
recovery mode, this is necessary for SQL Server to be able to recover the database to the most recent state in case of a disaster. The
fact that each row is logged also explains why DELETE statements can be slow.

TRUNCATE is faster than DELETE due to the way TRUNCATE "removes" the data. Actually, TRUNCATE does not remove data,
but rather deallocates whole data pages and removes pointers to indexes. The data still exists until it is overwritten or the database is
shrunk. This action does not require great resources and is therefore very fast. It is a common mistake to think that TRUNCATE is not
logged. This is wrong. The deallocation of the data pages is recorded in the log file. Therefore, BOL refers to TRUNCATE operations
as "minimally logged" operations. You can use TRUNCATE within a transaction, and when this transaction is rolled-back, the data
pages are reallocated again and the database is again in its original, consistent state.

Some limitations do exist for TRUNCATE.

• You need to be db_owner, ddl_admin or owner of the table to be able to fire a TRUNCATE statement.

• TRUNCATE will not work on tables that are referenced by one or more FOREIGN KEY constraints.
So if TRUNCATE is so much faster than DELETE, why should one use DELETE at all? Well, TRUNCATE is an all-or-nothing
approach. You cannot specify that only the rows matching certain criteria be truncated. It's either all rows or none. You can, however, use
a workaround here. Suppose you want to delete more rows from a table than will remain. In this case you can export the rows that you
want to keep to a temporary table, run the TRUNCATE statement, and finally reimport the remaining rows from the temporary table.
If your table contains a column with the IDENTITY property defined on it, and you want to keep the original IDENTITY values, be
sure to enable IDENTITY_INSERT on the table before you reimport from the temporary table. Chances are good that this
workaround is still faster than a DELETE operation. You can also set the recovery mode to "Simple" before you start this workaround,
and then back to "Full" once it is done. However, keep in mind that in this case, you might only be able to recover to the last backup.
Ask yourself if this is good enough for you!
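A sketch of that workaround, with hypothetical table and column names:
-- Save the rows that should survive
SELECT * INTO #keep FROM orders WHERE order_date >= '20040101'
-- Deallocate everything in one fast, minimally logged operation
TRUNCATE TABLE orders
-- Reimport the saved rows, preserving their IDENTITY values
SET IDENTITY_INSERT orders ON
INSERT INTO orders (order_id, customer_id, order_date)
SELECT order_id, customer_id, order_date FROM #keep
SET IDENTITY_INSERT orders OFF
DROP TABLE #keep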

My application is very INSERT heavy. What can I do to speed up the performance of INSERTs?

Answer

Here are a variety of tips that can help speed up INSERTs.

1) Use RAID 10 or RAID 1, not RAID 5 for the physical disk array that stores your SQL Server database. RAID 5 is slow on
INSERTs because of the overhead of writing the parity bits. Also, get faster drives, a faster controller, and consider turning on write
caching on the controller if it is not already turned on (although this has its disadvantages, such as lost data if your hardware fails).

2) The fewer the indexes on the table, the faster INSERTs will be.

3) Try to avoid page splits. Ways to do this include having an appropriate fillfactor and pad_index, rebuilding indexes often, and
adding a clustered index on an incrementing key for the table (this forces pages to be added one after another, and page splits
are not an issue). See the index example below.

4) Keep the column widths as narrow as possible.

5) If data length in a column is consistent, use CHAR columns, or if data length varies a lot, use VARCHAR columns.

6) Try to batch INSERTs rather than INSERTing one row at a time. But this can also cause problems if the batch of INSERTs is too
large.

None of these suggestions will radically speed up your INSERTs by themselves, but put together, they all will contribute to overall
faster INSERTs.
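To illustrate suggestion 3), here is how an index might be built with free space left on each page to absorb new rows before a page
split becomes necessary (the table and index names are hypothetical):
CREATE INDEX IX_orders_customer_id
ON orders (customer_id)
WITH PAD_INDEX, FILLFACTOR = 80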

My SQL Server seems to take memory, but never releases it. Is this normal?

Answer

If you are running SQL Server 7.0 or SQL Server 2000, and have the memory setting set to dynamically manage memory (the default
setting), SQL Server will automatically take as much RAM as it needs (assuming it is available) from the available RAM. Assuming
that the operating system or other applications running on the same physical server don't need more RAM, SQL Server will keep
control of the RAM, even if it really doesn't need it. The reason for this is because it is more resource efficient for SQL Server to keep
holding the RAM (even if it doesn't currently need it) than to release and grab it over and over as memory needs change.

If your SQL Server is a dedicated SQL Server, it is very normal for SQL Server to take memory, but to never release it.

If you have set SQL Server to use a minimum amount of memory (not a default setting), once SQL Server grabs this amount, it will
not give it up until it is restarted. This can account for some instances of SQL Server not giving up memory.

If you have a non-dedicated SQL Server, and there are other applications running on the same physical server, SQL Server will give
up some of its memory if needed. But this may not happen instantly. For example, if SQL Server needs a specific amount of memory
to complete a current task, it won't give up that memory until that task is complete. In the meantime, your other application may cause
your server to have excessive paging, which can hurt performance. The best solution to the issue of SQL Server having to fight for
memory with other applications is to either add more RAM to the server, or to move the other applications off the server to another
server.
Is there any significant performance difference when joining tables across different databases on the same server?

Answer

This is very easy to test yourself. For example, make a copy of pubs and call it pubs2. Then create a query to JOIN two related tables
from within pubs. Create one JOIN that JOINs two tables in the same database, and create a second JOIN, but for one of the
JOINed tables, modify the query so that it points to the table in the other database. Then run both queries and examine their query
plans.

In virtually every case, the execution plans are identical, which tells you that the performance of the query, whether it is inside a single
database, or between two databases on the same server, is more or less identical.

On the other hand, if the databases are on separate servers, performance will suffer greatly due to network latency, etc.
Is there any performance difference between using SET or SELECT to assign values in Transact-SQL?

Answer

There is virtually no difference in performance between using SET or SELECT to assign values. In most cases, you will want to use
SET for single-value assignments, as recommended in the SQL Server Books Online; SELECT remains handy for assigning several variables at once. See the Books Online for the exact syntax of both options.
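Both assignment forms are shown below for comparison:
DECLARE @max_price MONEY
-- Assignment using SET
SET @max_price = 19.99
-- Assignment using SELECT (also handy for assigning from a query)
SELECT @max_price = MAX(UnitPrice) FROM Northwind..[Order Details]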

Which is faster when using SQL Server 2000, temp tables or the new table datatype?

Answer

Generally speaking, if the data you are dealing with is not large, then the table datatype will be faster than using a temp table. But if
the amount of data is large, then a temp table most likely will be faster. Which method is faster is dependent on the amount of RAM in
your server available to SQL Server, and this of course can vary from server to server. The greater the RAM in a server, the greater
number of records that can be efficiently stored in a table datatype. You may have to test both methods to determine which method is
best for your situation.

Here are some reasons why the table datatype, when used with reasonable amounts of data, is generally faster than using a temp
table:
• Records are stored in memory, not in a temp table in the tempdb database, so performance is much faster.

• Table variables act like local variables and have a well-defined scope. Whenever the batch, function, or stored procedure
that created the table variable goes away, the table variable is automatically cleaned up.

• When a table variable is used inside a stored procedure instead of a temp table, fewer recompilations occur, reducing server
overhead.

• Table variables require less locking and logging resources when compared to temporary tables, reducing server overhead
and boosting concurrency.
If you haven't learned how to use table variables yet, you need to take the time to do so as soon as you can. They can be powerful tools
in the correct situations.
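A short example of declaring and using a table variable:
DECLARE @recent_orders TABLE
(
order_id INT NOT NULL
, order_date DATETIME NOT NULL
)
INSERT INTO @recent_orders (order_id, order_date)
SELECT OrderID, OrderDate
FROM Northwind..Orders
WHERE OrderDate >= '19980101'
SELECT * FROM @recent_orders
-- No DROP is needed: the variable goes out of scope automatically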

1) Do we need to create a primary key constraint on a table that already has a clustered unique index? If so, why?

2) If we need a primary key constraint, should we delete the clustered unique index we created, and then add the primary
key constraint? My concern is that we will end up with two clustered unique indexes on the same column(s) that will
add to database load times, etc.

3) Is there an advantage to using the primary key constraint to automatically create the clustered unique index, as opposed
to just outright creating the clustered unique index, if they are on the same column?

Answer

Let's take a look at each of these three questions.


First Question: Technically speaking, a primary key is not required on a table for any reason. A primary key serves two purposes. First, it
acts to enforce the entity integrity of the table. What this means is that it ensures that there are no duplicate records in a table. Duplicate
records in a table can lead to all kinds of problems, and by adding a primary key to a table, you eliminate this possibility. Next, a
primary key can be used along with a foreign key to ensure that referential integrity is maintained between tables. Because of these
two reasons, it is generally recommended that all tables have a primary key.
By itself, a primary key does not have a direct effect on performance. But indirectly, it does. This is because when you add a primary
key to a table, SQL Server creates a unique index (clustered by default) that is used to enforce entity integrity. But as you have already
discovered, you can create your own unique indexes on a table, which have the same effect on performance. So, strictly speaking, a
primary key does not affect performance, but the index used by the primary key does.
Now to directly answer your question. You don't have to have primary keys on your tables if you don't care about the benefits that
arise from using them. If you like, you can keep the current indexes you have, and assuming they are good choices for the types of
queries you will be running against the table, performance will be enhanced by having them. Dropping your current indexes and
replacing them with primary keys will not, by itself, help performance.
Second Question: I think I answered most of this question above. But I do want to point out that you cannot have two clustered
indexes on the same table. So if you did want to add primary keys to the tables you currently have, you can, using one of two
techniques. First, you can choose to add a primary key using a non-clustered index, or second, you can drop your current indexes, and
then add a primary key using a clustered index.
Third Question: Here's what I do. I assign primary keys to every table because this is a best database design practice. But before I do, I
evaluate whether or not the primary key's index should be clustered or non-clustered, and then choose accordingly. Since you can only
have one clustered index, it should be chosen well. See my clustered tips webpage for more details. Next, I then evaluate the table for
additional indexes that may be needed, and proceed accordingly.
It is not a good idea, from a performance perspective, to accept the default of a clustered index on a primary key, as it may not be the
best choice for the use of a clustered index. In addition, it is not a good idea to "double up" on indexes. In other words, don't put a
Primary key non-clustered index on a column, and a clustered index on the same column (this is possible, although never a good idea).
Always think about indexes and why they exist. Only add indexes where they are needed, and nowhere else. Too many indexes can be
as bad for performance as too few indexes.

1) DELETE FROM emp and DELETE * FROM emp both work in Oracle, but
only DELETE FROM emp works in SQL Server.

2) SELECT * FROM employee WHERE emp_name LIKE '%C%' displays all records that contain a 'C' anywhere in emp_name, e.g.
ACF, CGT, TGC, etc.
SELECT * FROM employee WHERE emp_name LIKE '%C' displays all records whose emp_name ends with 'C',
e.g. ABC, VBC, etc.

3) To rename a table in SQL Server:

CREATE TABLE mytest (myname CHAR(20))
EXEC sp_rename @objname = 'mytest', @newname = 'Newmytest'

4) Getting the list of user tables from SQL Server:

SELECT *
FROM sysobjects
WHERE type = 'U'
ORDER BY name

5) I wrote a procedure in SQL Server (T-SQL) and I want to schedule it to run every day at 7:00 AM. How do I schedule it?

This is a simple task using the "Management" folder in Enterprise Manager:

• Choose the "Jobs" option, right-click, and select "New Job".
• Give the job a name.
• Select the "Steps" tab and click the "New" button.
• Click the "Open" button and choose the procedure you want to run.
• Give the step a name and choose the "OK" button.
• Select the "Schedules" tab and click the "New Schedule" button.
• Give the schedule a name.
• Click the "Change" button, choose the "Daily" option, and set the daily frequency to 7:00 AM.
• Click the "OK" button.
