
Crack SQL Performance Tuning Interview.

Q:-1. What is SQL Query Optimization?

Ans. Query optimization is the process of writing a query in a way that lets it execute
quickly. It is a significant step for any standard application.

Q:-2. What are some tips to improve the performance of SQL queries?

Ans. Optimizing SQL queries can have a substantial positive impact on performance. It also
depends on the level of RDBMS knowledge you have. Let's now go over some of the tips for
tuning SQL queries.

1. Prefer to use views and stored procedures instead of writing long queries. It'll also help in
minimizing network load.

2. It's better to introduce constraints instead of triggers. They are more efficient than triggers and
can increase performance.

3. Make use of table variables instead of temporary tables.

4. The UNION ALL clause responds faster than UNION. It doesn't look for duplicate rows,
whereas the UNION statement does so regardless of whether they exist or not.

5. Avoid unnecessary use of the DISTINCT and HAVING clauses.

6. Avoid excessive use of SQL cursors.

7. Make use of the SET NOCOUNT ON option while building stored procedures. It suppresses the
message reporting the number of rows affected by a T-SQL statement, which reduces network
traffic (see the sketch after this list).

8. It's a good practice to return only the required columns instead of all the columns of a table.

9. Prefer not to use complex joins and avoid disproportionate use of triggers.

10. Create indexes for tables and adhere to the standards.
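
As a small illustration of tips 7 and 8, here is a minimal sketch of such a stored procedure (the
dbo.Orders table and its columns are hypothetical):

CREATE PROCEDURE dbo.GetOrderSummary
    @CustomerID INT
AS
BEGIN
    -- Tip 7: suppress the "N rows affected" messages to cut network traffic
    SET NOCOUNT ON;

    -- Tip 8: return only the columns the caller needs, not SELECT *
    SELECT OrderID, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END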

Q:-3. What are the bottlenecks that affect the performance of a Database?

Ans. In a web application, the database tier can prove to be a critical bottleneck in achieving the
last mile of scalability. If a database has a performance leak, it can become a bottleneck and
is likely to cause issues. Some of the common performance issues are as follows.

1. Abnormal CPU usage is the most obvious performance bottleneck. However, you can fix it by
adding CPU units or upgrading to a more advanced CPU. It may look like a simple issue, but
abnormal CPU usage can lead to other problems.
2. Low memory is the next most common bottleneck. If the server isn't able to manage the peak
load, it poses a big question mark on the performance. For any application, memory is very
critical to performance, as it is much faster than persistent storage. Also, when free RAM drops
below a specific threshold, the OS turns to the swap memory, which makes the application
run very slowly.

You can resolve it by expanding the physical RAM, but that won't solve memory leaks if there are
any. In such a case, you need to profile the application to identify the potential leaks within its
code.

3. Too much dependency on external storage like SATA disks could also become a bottleneck. Its
impact becomes visible while writing large data to the disk. If output operations are very slow, it
is a clear indication that disk I/O is becoming the bottleneck.

In such cases, you need to scale up. Replace the existing drive with a faster one; try upgrading
to an SSD or something similar.

Q:-4. What are the steps involved in improving the SQL performance?

Ans.

Discover: First of all, find out the areas of improvement. Explore tools like Profiler, query
execution plans, SQL Tuning Advisor, dynamic views, and custom stored procedures.

Review: Analyze the available data to isolate the main issues.

Propose: Here is a standard approach one can adopt to boost the performance. However, you
can customize it further to maximize the benefits.

1. Identify fields and create indexes.


2. Modify large queries to make use of indexes created.
3. Refresh tables and views and update statistics.
4. Rebuild existing indexes and remove unused ones.
5. Look for deadlocks and resolve them.

Validate: Test the SQL performance tuning approach. Monitor the progress at regular intervals.
Also, track whether there is any adverse impact on other parts of the application.

Publish: Now, it's time to share the working solution with everyone in the team. Let them know
all the best practices so that they can use them with ease.


Q:-5. What is an explain plan?

Ans. It's a term used in Oracle: EXPLAIN PLAN is a SQL statement that displays the
execution plan the optimizer has chosen for executing SELECT/UPDATE/INSERT/DELETE
statements (see the sketch below).
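
For illustration, a minimal Oracle sketch (the orders table and its columns are hypothetical):

EXPLAIN PLAN FOR
SELECT * FROM orders WHERE customer_id = 100;

-- Display the plan just produced
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);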
Q:-6. How do you analyze an explain plan?

Ans. While analyzing the explain plan, check the following areas.

1. Driving Table
2. Join Order
3. Join Method
4. Unintentional Cartesian products
5. Nested loops, sort-merge, and hash joins
6. Full Table Scan
7. Unused indexes
8. Access paths

Q:-7. How do you tune a query using the explain plan?

Ans. The explain plan shows a complete breakdown of the query cost, including each subquery. The
cost is directly proportional to the query execution time. The plan also highlights the problem
steps in queries or sub-queries while fetching data.

Q:-8. What is Summary advisor and what type of information does it provide?

Ans. Summary Advisor is a tool that recommends materialized views. It can help in elevating
SQL performance by selecting the proper set of materialized views for a given workload. And
it also provides data about the materialized view recommendations.

Q:-9. What could most likely cause a SQL query to run as slow as 5 minutes?

Ans. Most probably, a sudden surge in the volume of data in a particular table could slow down
the output of a SQL query. So collect the required stats for the target table. Also, monitor any
change at the DB level or within the underlying object level.

Q:-10. What is a Latch Free Event? And when does it occur? Also, how does the system
handle it?

Ans. In Oracle, the Latch Free wait event occurs when a session requires a latch and attempts to
get it but fails because someone else has it.

So it sleeps waiting for the latch to get free, wakes up and tries again. The duration for which it
was inactive is the wait time for Latch Free. Also, there is no ordered queue for the waiters
on a latch, so the one who requests it first gets it.


Q:-11. What is Proactive tuning and Reactive tuning?


Ans.

Proactive tuning: The architect or the DBA determines which combination of system resources
and available Oracle features fulfills the criteria during design and development.

Reactive tuning: It is the bottom-up approach to discover and eliminate the bottlenecks. The
objective is to make Oracle respond faster.

Q:-12. What are Rule-based Optimizer and Cost-based Optimizer?

Ans. Oracle determines how to get the required data for processing a valid SQL statement. It uses
one of the following two methods to take this decision.

Rule-based Optimizer: When a server doesn't have internal statistics supporting the objects
referenced by the statement, the RBO method gets preference. However, Oracle will
deprecate this method in future releases.

Cost-based Optimizer: When there is an abundance of internal statistics, the CBO gets
precedence. It evaluates several possible execution plans and chooses the one with the lowest cost
based on the system resources.

Q:-13. What are several SQL performance tuning enhancements in Oracle?

Ans. Oracle provides many performance enhancements, some of them are:

1. Automatic Performance Diagnostic and Tuning Features


2. Automatic Shared Memory Management: It gives Oracle control of allocating memory within
the SGA.
3. Wait-model improvements: A number of views have been added to improve the wait model.
4. Automatic Optimizer Statistics Collection: Collects optimizer statistics using a scheduled job
called GATHER_STATS_JOB.
5. Dynamic Sampling: Enables the server to improve performance by sampling statistics on the fly.
6. CPU Costing: The basic cost model for the optimizer (CPU + I/O), with the cost unit expressed
as time.
7. Rule-Based Optimizer Obsolescence: The RBO is no longer used.
8. Tracing Enhancements: End-to-end tracing, which allows a client process to be identified via
the Client Identifier instead of the typical Session ID.

Q:-14. What are the tuning indicators Oracle proposes?

Ans. The following high-level tuning indicators are available to establish if a database is
experiencing bottlenecks or not:

1. Buffer Cache Hit Ratio.


It uses the following formula:

Hit Ratio = (Logical Reads - Physical Reads) / Logical Reads

Action: Increase the DB_CACHE_SIZE (DB_BLOCK_BUFFERS prior to 9i) to improve the hit
ratio (a query sketch follows this answer).

2. Library Cache Hit Ratio.

Action: Increase the SHARED_POOL_SIZE to increase the hit ratio.
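
As an illustration, one common way to compute the buffer cache hit ratio from V$SYSSTAT (a
sketch; logical reads = db block gets + consistent gets, matching the formula above):

SELECT 1 - (phy.value / (db.value + con.value)) AS buffer_cache_hit_ratio
FROM v$sysstat phy, v$sysstat db, v$sysstat con
WHERE phy.name = 'physical reads'
  AND db.name  = 'db block gets'
  AND con.name = 'consistent gets';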

Q:-15. What do you check first if there are multiple fragments in the SYSTEM tablespace?

Ans. First of all, check that users don't have the SYSTEM tablespace as their TEMPORARY or
DEFAULT tablespace assignment by verifying the DBA_USERS view (see the sketch below).
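
A minimal sketch of that check:

SELECT username, default_tablespace, temporary_tablespace
FROM dba_users
WHERE default_tablespace = 'SYSTEM'
   OR temporary_tablespace = 'SYSTEM';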


Q:-16. When would you add more Copy Latches? What are the parameters that control the
Copy Latches?

Ans. Add more copy latches when there is excessive contention for them, as indicated by the redo
copy latch hit ratio.

In such a case, increase the number of copy latches via the initialization parameter
LOG_SIMULTANEOUS_COPIES, typically setting it to twice the number of CPUs available.

Q:-17. How do you confirm if a tablespace has disproportionate fragmentation?

Ans. You can confirm it by checking the output of a SELECT against the dba_free_space view. If
it shows that the number of a tablespace's free extents is greater than the count of its data files,
that indicates excessive fragmentation (see the sketch below).
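
A minimal sketch of that comparison (free-extent counts vs. data-file counts per tablespace):

SELECT tablespace_name, COUNT(*) AS free_extents
FROM dba_free_space
GROUP BY tablespace_name;

SELECT tablespace_name, COUNT(*) AS data_files
FROM dba_data_files
GROUP BY tablespace_name;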

Q:-18. What can you do to optimize the %XYZ% queries?

Ans. First of all, set the optimizer to scan all the entries from the index instead of the table. You
can achieve it by specifying hints (see the sketch below).

Please note that scanning the smaller index takes less time than scanning the entire table.
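
For example, a sketch using an index fast full scan hint (the table, alias, and index names are
hypothetical; the index must contain all the columns the query references):

SELECT /*+ INDEX_FFS(e emp_name_idx) */ employee_id, last_name
FROM employees e
WHERE last_name LIKE '%XYZ%';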

Q:-19. Where does the I/O statistics per table exist in Oracle?
Ans. There is a report known as UTLESTAT which displays the I/O per tablespace. But it
doesn't help to find the table which has the most I/O.

Q:-20. When is the right time to rebuild an index?

Ans. First of all, select the target index and run the ANALYZE INDEX ... VALIDATE
STRUCTURE command. Every time you run it, a single row gets created in the
INDEX_STATS view.

But the row gets overwritten the next time you run the ANALYZE INDEX command, so it is
better to move the contents of the view to a local table. Thereafter, analyze the ratio of
DEL_LF_ROWS to LF_ROWS and see if you need to rebuild the index (see the sketch below).
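
A minimal sketch of that procedure (the index name is hypothetical; guard against LF_ROWS = 0
in real use):

ANALYZE INDEX emp_name_idx VALIDATE STRUCTURE;

SELECT name, lf_rows, del_lf_rows,
       ROUND(del_lf_rows / lf_rows * 100, 2) AS pct_deleted
FROM index_stats;

A common rule of thumb is to consider a rebuild when the deleted percentage exceeds roughly 20%.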

Q:-21. What exactly would you do to check the performance issue of SQL queries?

Ans. Mostly the database isn't slow; it's the worker session which drags the performance down.
And it's the abnormal session accesses which cause the bottlenecks.

1. Review the events that are in wait or listening mode.


2. Hunt down the locked objects in a particular session.
3. Check if the SQL query is pointing to the right index or not.
4. Launch SQL Tuning Advisor and analyze the target SQL_ID for any performance
recommendations.
5. Run the free command to check the RAM usage. Also, use the top command to identify any
process hogging the CPU.


Q:-22. What is the information you get from the STATSPACK Report?

Ans. We can get the following statistics from the STATSPACK report.

1. Wait events
2. Load profile
3. Instance Efficiency Hit Ratio
4. Latch Waits
5. Top SQL
6. Instance Activity
7. File I/O and Segment Stats
8. Memory allocation
9. Buffer Waits

Q:-23. What are the factors to consider for creating Index on Table? Also, How to select a
column for Index?

Ans. Creation of an index depends on the following factors:

1. Size of the table
2. Volume of data

If the table is large and queries need only a small subset of its rows, then it's better to create an index.

Regarding the column to be used for the index, as per the business rule, you should use a primary key
or a unique key for creating a unique index.

Q:-24. What is the main difference between Redo, Rollback, and Undo?

Ans.

Redo: A log that records all changes made to data, including both uncommitted and committed
changes.

Rollback: Segments that store the previous state of data before the changes.

Undo: Helpful in building a read-consistent view of data. The data gets stored in the undo
tablespace.

Q:-25. How do you identify the shared memory and semaphores of a particular DB instance
when multiple instances are running on the same server?

Ans. Use the following commands to distinguish between the in-memory resources of a DB
instance:

1. SETMYPID
2. IPC
3. TRACEFILE_NAME

Run them under the ORADEBUG utility (see the sketch below).
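
A minimal SQL*Plus sketch of that sequence (run with SYSDBA privileges):

ORADEBUG SETMYPID
ORADEBUG IPC
ORADEBUG TRACEFILE_NAME

SETMYPID attaches ORADEBUG to your own server process, IPC dumps the instance's shared
memory and semaphore details to a trace file, and TRACEFILE_NAME prints the path of that
trace file.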

Q. What are the bottlenecks that affect the performance of a Database /
Application?
Ans:
The top performance bottlenecks for OLTP applications are outlined as:

Database Design \ Database Code


Application Design \ Application Code
CPU bottleneck
Memory bottleneck
IO bottleneck
Blocking bottleneck
Network bottleneck
Server Hardware
Database Design \ Database Code
Too many indexes on frequently updated tables (inclusive of inserts, updates, and
deletes):
Such tables incur extra index maintenance overhead. Generally, OLTP database designs
should keep the number of indexes to a functional minimum, again due to the high
volumes of similar transactions combined with the cost of index maintenance.
Out-of-date or missing statistics
Excess use of cursors and temporary tables
Too much normalization
Use of conversion/system/user-defined functions in the WHERE clause
Unused indexes incur the cost of index maintenance for inserts, updates,
and deletes without benefiting any users:
Unused indexes should be eliminated. Any index that has been used (by select,
update, or delete operations) will appear in sys.dm_db_index_usage_stats. Thus,
any defined index not included in this DMV has not been used since the last
restart of SQL Server (see the query sketch below).
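
As a quick check, a minimal sketch of that DMV test (run in the target database):

-- Indexes with no usage recorded since the last restart
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 0            -- skip heaps
  AND s.index_id IS NULL;       -- no usage row recorded for this index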

Choose the Appropriate Data Types


Always choose the smallest appropriate data type. Avoid NCHAR/NVARCHAR unless
there is a need to store Unicode data.

Use Triggers Cautiously


Keep the code in your triggers to the very minimum to reduce overhead. The more
code that runs in the trigger, the slower each INSERT, UPDATE, and DELETE that
fires it will be.

Don't Access More Data Than You Need

Don't return more columns or rows of data to the client than absolutely necessary.
This just increases disk I/O on the server.

Avoid Using Cursors


Wherever possible, try to use alternative solutions such as temp tables,
derived tables, table variables, or recursive CTEs.
Always select the cursor with the least amount of overhead. The most efficient
cursor you can choose is the fast forward-only cursor.
When you are done using a cursor, don't just CLOSE it, DEALLOCATE it.
Use Joins Appropriately
If you have two or more tables that are frequently joined together, then the
columns used for the joins should have an appropriate index. If the columns
used for the joins are not naturally compact, then consider adding compact
surrogate keys to the tables in order to reduce the size of the keys,
thus decreasing read I/O during the join process and increasing overall
performance. You will learn more about indexing in the next section of this
article.
For best performance, the columns used in joins should be of the same data
types. And if possible, they should be numeric data types rather than character
types.
Avoid joining tables based on columns with few unique values. If columns used
for joining aren't mostly unique, then the SQL Server optimizer will perform a
table scan for the join, even if an index exists on the columns. For best
performance, joins should be done on columns that have unique indexes.
If you have to regularly join four or more tables to get the recordset you need,
consider denormalizing the tables so that the number of joined tables is
reduced. Often, by adding one or two columns from one table to another, joins
can be reduced.
Generally, frequent operations requiring 5 or more table joins should be
avoided by redesigning the database.
Encapsulate Your Code in Stored Procedures
Try to put all your T-SQL code in stored procedures, which reduces the network
traffic by just calling the proc from the application and reduces the I/O overhead
by using the compiled execution plan
Always use the option SET NOCOUNT ON
Design the procs to avoid deadlocks
Collect all inputs before the transaction begins
Keep transactions short, within a batch
Use the correct isolation levels
Try to use the NOLOCK option where dirty reads are acceptable
Application Design / Application code:
Application Design issues:
Perform as many data-centered tasks as possible on SQL Server in the form of
stored procedures. Avoid manipulating data at the presentation and business
services tiers.
Don't maintain state (don't store data from the database) in the business
services tier. Maintain state in the database as much as possible.
Don't create complex or deep object hierarchies. The creation and use of
complex classes or a large number of objects used to model complex business
rules can be resource intensive and reduce the performance and scalability of
your application. This is because the memory allocation when creating and
freeing these objects is costly.
Consider designing the application to take advantage of database connection
pooling and object pooling using Microsoft Transaction Server (MTS). MTS allows
both database connections and objects to be pooled, greatly increasing the
overall performance and scalability of your application.
If your application runs queries against SQL Server that by nature are long,
design the application to be able to run queries asynchronously. This way, one
query does not have to wait for the next before it can run. One way to build in
this functionality into your n-tier application is to use the Microsoft Message
Queue Server (MSMQ).
Application Code:
Use OLE DB to Access SQL Server:
You can access SQL Server data using either ODBC or OLE DB. For best
performance, always select OLE DB. OLE DB is used natively by SQL Server,
and is the most effective way to access any SQL Server data.
Use DSN-less in Connection String:
While creating an ADO connection to SQL Server, you can either use a DSN in
the connection string, or you can use a DSN-less connection. For optimal
performance, use DSN-less connections. Using them prevents the need for
the OLE DB driver to look up connection string information in the registry of
the client the application code is running on, saving some overhead.
Encapsulate your DML (Data Manipulation Language) in Stored Procedures
ADO allows you three different ways to SELECT, INSERT, UPDATE, or DELETE
data in a SQL Server database. You can use ADO's methods, you can use
dynamic SQL, or you can use stored procedures. For better performance,
prefer stored procedures.
Encapsulate Your ADO Code in COM Components
Put the ADO code that accesses SQL Server data into COM components. This
gives you all the standard benefits of COM components, such as object
pooling using MTS. And for ASP-based applications, it provides greater speed
because the ADO code in COM objects is already compiled, unlike ADO code
found in ASP pages. How you implement your data manipulation code in
COM components should be considered when the application is first
designed.
For optimum performance, COM objects should be compiled as in-process
DLLs (which is required if they are to run under MTS). You should always
employ early binding when referencing COM objects, and create them
explicitly, not implicitly.
CPU bottlenecks:
Signal waits > 25% of total waits.
(See sys.dm_os_wait_stats for Signal waits and Total waits. Signal waits measure
the time spent in the runnable queue waiting for CPU. High signal waits indicate a
CPU bottleneck.)
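
As a quick check, a minimal sketch against sys.dm_os_wait_stats (signal waits as a percentage
of all waits):

SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms)
            AS DECIMAL(5, 2)) AS pct_signal_waits   -- > 25 suggests CPU pressure
FROM sys.dm_os_wait_stats;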

Plan re-use < 90%.


(A query plan is used to execute a query. Plan re-use is desirable for OLTP
workloads because re-creating the same plan (for similar or identical transactions)
is a waste of CPU resources. Compare SQL Server SQL Statistics: batch
requests/sec to SQL compilations/sec. Compute plan re-use as follows: Plan re-use
= (Batch requests - SQL compilations) / Batch requests. Special exception to the
plan re-use rule: Zero cost plans will not be cached (not re-used) in SQL 2005 SP2.
Applications that use zero cost plans will have a lower plan re-use but this is not a
performance issue.)

Memory bottleneck:
Consistently low average page life expectancy. (MSSQL$Instance: Buffer
Manager\Page Life Expectancy:)
(See the Average Page Life Expectancy counter in the Perfmon object SQL
Server Buffer Manager (this represents the average number of seconds a page
stays in cache). For OLTP, an average page life expectancy of 300 is 5 minutes.
Anything less could indicate memory pressure, missing indexes, or a cache flush.)

Consistently low SQL Cache hit ratio. (MSSQL$Instance: Plan Cache\Cache Hit
Ratio:)
(OLTP applications (e.g. small transactions) should have a high cache hit ratio.
Since OLTP transactions are small, there should not be (1) big drops in SQL Cache
hit rates or (2) consistently low cache hit rates < 90%. Drops or low cache hit may
indicate memory pressure or missing indexes.)
IO bottleneck:
High average disk seconds per read.
(When the IO subsystem is queued, disk seconds per read increases. See Perfmon
Logical or Physical disk (disk seconds/read counter). Normally it takes 4-8ms to
complete a read when there is no IO pressure. When the IO subsystem is under
pressure due to high IO requests, the average time to complete a read increases,
showing the effect of disk queues. Periodic higher values for disk seconds/read
may be acceptable for many applications. For high performance OLTP applications,
sophisticated SAN subsystems provide greater IO scalability and resiliency in
handling spikes of IO activity. Sustained high values for disk seconds/read (>15ms)
do indicate a disk bottleneck.)

High average disk seconds per write.


(See Perfmon Logical or Physical disk. The throughput for high volume OLTP
applications is dependent on fast sequential transaction log writes. A transaction
log write can be as fast as 1ms (or less) for high performance SAN environments.
For many applications, a periodic spike in average disk seconds per write is
acceptable considering the high cost of sophisticated SAN subsystems. However,
sustained high values for average disk seconds/write are a reliable indicator of a
disk bottleneck.)

Big IOs such as table and range scans due to missing indexes.
Blocking bottleneck:
High average row lock or latch waits.
(The average row lock or latch waits are computed by dividing lock and latch wait
milliseconds (ms) by lock and latch waits. The average lock wait ms computed
from sys.dm_db_index_operational_stats represents the average time for each
block.)

Top wait statistics


High number of deadlocks.
(See Profiler Graphical Deadlock under Locks event to identify the statements
involved in the deadlock.)

Network bottleneck:
High network latency coupled with an application that incurs many round trips
to the database.
Network bandwidth is used up.
(See counters packets/sec and current bandwidth counters in the network
interface object of Performance Monitor. For TCP/IP frames, actual bandwidth is
computed as packets/sec * 1500 * 8 / 1,000,000 Mbps.)
Server Hardware:
Most slow applications are slow because of poor up-front design, not because of
slow hardware. Since the application's design can't be changed once it is
deployed to production, about the only thing you can try to help boost
performance is to throw hardware at it.
CPU: Always purchase a server with the ability to expand its number of CPUs.
Usually it goes for larger servers with four or more CPUs. Always leave room for
growth.
Memory: Try to get enough RAM to hold the largest table you expect to have,
and if you can afford it, get all the RAM your server can handle, which is often
2GB or more.
I/O Subsystem: At the very minimum, purchase hardware-based RAID for your
databases. As a rule of thumb, you will want to purchase more, smaller drives, not
fewer, larger drives in your array. The more disks that are in an array, the
faster I/O will be.
Network Connection: At the server, have at least one 100Mbps network card,
and it should be connected to a switch. Ideally, you should have two network
cards in the server connected to a switch in full-duplex mode.
For best performance on a server, SQL Server should be the only application
running on the server, other than management utilities. Don't try to save a few
bucks by putting your IIS server on the same server as SQL Server.

Q. What is the process of tuning the Performance?


Ans:
Identification: Use native tools like Profiler, Query Tuning Advisor, query
execution plans, Performance Monitor, system stored procedures, dynamic
management views, custom stored procedures, or third-party tools
Analysis: Analyze the data to determine the core problems
Providing a Solution:
Creating new indexes on appropriate columns
Altering the complex queries to make them use the existing indexes
Updating statistics for tables and views
Rebuilding and reorganizing indexes
Resolving blocking problems
Removing deadlocks
Testing: Test the various options to ensure they perform better and do not
cause worse performance in other portions of the application
Knowledge sharing: Share your experience with the team to ensure they
understand the problem and solution, so the issue does not occur again
Q. How to choose the correct (Clustered/ Non- Clustered) index on a column?
Ans:
Selecting Clustered Index:
Clustered indexes are ideal for queries that select by a range of values or
where you need sorted results. Examples of this include when you are using
BETWEEN, <, >, GROUP BY, ORDER BY, and aggregates such as MAX, MIN, and
COUNT in your queries.
Clustered indexes are good for queries that look up a record with a unique
value (such as an employee number) and when you need to retrieve most or all
of the data in the record.
Clustered indexes are good for queries that access columns with a limited
number of distinct values, such as columns that hold country or state data. But
if column data has little distinctiveness, such as columns with "yes" or "no", or
"male" or "female" values, then these columns should not be indexed at all.
Avoid putting a clustered index on columns that increment, such as an identity,
date, or similarly incrementing columns, if your table is subject to a high level
of INSERTS.
Selecting Non Clustered Index:
Non-clustered indexes are best for queries that return few rows (including just
one row) and where the index has good selectivity (above 95%).
If a column in a table is not at least 95% unique, then most likely the SQL
Server Query Optimizer will not use a non-clustered index based on that
column. For example, a column with "yes" or "no" as the data won't be at
least 95% unique.
Keep the width of your indexes as narrow as possible, especially when
creating composite (multi-column) indexes. This reduces the size of the index
and reduces the number of reads required to read the index, boosting
performance.
If possible, try to create indexes on columns that have integer values instead of
characters. Integer values have less overhead than character values.
If you know that your application will be performing the same query over and
over on the same table, consider creating a covering index on the table. A
covering index includes all of the columns referenced in the query.
An index is only useful to a query if the WHERE clause of the query matches the
column(s) that are leftmost in the index. So if you create a composite index,
such as (City, State), then a query such as WHERE City = 'Houston' will use
the index, but the query WHERE State = 'TX' will not use the index (see the sketch below).
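
A minimal sketch of that leftmost-prefix rule (the table, columns, and index are hypothetical):

CREATE INDEX IX_Customers_City_State ON dbo.Customers (City, State);

-- Can seek on the index: City is the leftmost key column
SELECT CustomerID FROM dbo.Customers WHERE City = 'Houston';

-- Cannot seek on the index: State is not the leftmost key column
SELECT CustomerID FROM dbo.Customers WHERE State = 'TX';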
Q. How to read the graphical execution plan?
Ans:
The plan should be read from right to left

Check the Graphical execution plan of a stored procedure / Query


Table Scan: an index is missing
Index Scan: proper indexes are not being used
Bookmark Lookup: limit the number of columns in the select list
Filter: remove any functions from the WHERE clause; may require additional
indexes
Sort: does the data really need to be sorted? Can an index be used to avoid
sorting? Can sorting be done at the client more efficiently?
DataFlow Arrow (high density): sometimes you find few rows in the output, but
the arrow line density indicates the query/proc is processing a huge number of rows
Cost: easily find out which table / operation is taking the most time
From the execution plan we can find the bottleneck and propose a possible
solution to avoid the latency
Q. Why the Actual and Estimated Execution Plans Might Differ
Ans

When Statistics are Stale:


The main cause of a difference between the plans is differences between
the statistics and the actual data. This generally occurs over time as data is added
and deleted.
When the Estimated plan is invalid:
When the batch contains temporary tables or T-SQL statements which refer to
objects that do not currently exist in the database but will be
created once the batch is run (e.g., a CREATE TABLE statement in the batch).

Q. What are the permissions required to view execution plans?


Ans:
Either the user must be mapped to sysadmin, db_owner, or db_creator, or he/she must
be granted the SHOWPLAN permission:

GRANT SHOWPLAN TO [username]

Q. What are the tools available for performance tuning/monitoring?


Ans:
Performance Studio: Acts as a central data repository; collects selected SQL
Server performance data and displays performance reports
Activity Monitor: Displays graphical information about processes, resource waits,
data file I/O, and recent expensive queries
Database Tuning Advisor (DTA): Recommends indexes
Profiler: Can run traces and find the expensive/long-running
queries/transactions
Execution Plans: There are three types: graphical, text, and XML
DMVs: Dynamic management views show the current state of the SQL Server instance
PerfMon: Windows-native tool to view/monitor the performance of both SQL Server
and Windows servers
Third Party: Redgate products
Q. How to identify the CPU bottlenecks and how to resolve it?
Ans:
Identifying CPU Bottlenecks:
Firstly, we have to confirm that SQL Server CPU utilization is high. Run the below
query:

SELECT timestamp, CONVERT(XML, record) AS XmlRecord
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
AND record LIKE '%<SystemHealth>%'
ORDER BY timestamp DESC;

One record is stored every minute up to a maximum of 256 records. Clicking on any
of the XML links will take you to the XML editor and will show an entry similar to
below
<Record id="434" type="RING_BUFFER_SCHEDULER_MONITOR" time="22398046">

<SchedulerMonitorEvent>

<SystemHealth>

<ProcessUtilization>55</ProcessUtilization>
<SystemIdle>35</SystemIdle>
<UserModeTime>228180000</UserModeTime>

<KernelModeTime>251812000</KernelModeTime>

<PageFaults>64252</PageFaults>

<WorkingSetDelta>21770240</WorkingSetDelta>

<MemoryUtilization>100</MemoryUtilization>

</SystemHealth>

</SchedulerMonitorEvent>

</Record>

Information from above XML:

ProcessUtilization: percentage of CPU utilized by SQL Server = 55%

SystemIdle: percentage of idle CPU = 35%

Other processes using CPU: 100 - (55 + 35) = 10%

Now find out the query/proc/process that is driving CPU utilization high:

SELECT TOP 20
    qst.sql_handle,
    qst.execution_count,
    qst.total_worker_time AS Total_CPU,
    total_CPU_inSeconds = qst.total_worker_time / 1000000,           -- converted from microseconds
    average_CPU_inSeconds = (qst.total_worker_time / 1000000) / qst.execution_count,
    qst.total_elapsed_time,
    total_elapsed_time_inSeconds = qst.total_elapsed_time / 1000000, -- converted from microseconds
    st.text AS Query,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qst
CROSS APPLY sys.dm_exec_sql_text(qst.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qst.plan_handle) AS qp
ORDER BY qst.total_worker_time DESC;

From the above script we can find the commands which are taking the most CPU
time along with the execution plan. By reviewing the execution plan you can see
what additional indexes need to be added to the database, which will improve
database performance and decrease the CPU load.

By adding missing indexes or by using the proper indexes we can decrease the load
on the CPU.

Other options:
Sp_monitor: Displays statistics, including CPU usage, I/O usage, and the
amount of time idle since sp_monitor was last executed. We can get
information about the CPU time (sec), I/O time (sec), count of
input/output packets, number of login attempts, errors in reading/writing
network packets, etc.
@@CPU_BUSY / @@IO_BUSY: Returns the time that SQL Server has spent
working since it was last started. The result is in CPU time increments, or ticks,
and is cumulative for all CPUs, so it may exceed the actual elapsed time.
Multiply by @@TIMETICKS to convert to microseconds. It may not be an
accurate value to rely on.
PerfMon
Profiler
Q. Can you tell me what the wait type LAZY WRITER is?
Ans:
The job of the lazy writer is to find dirty pages in the buffer pool, write them
out to disk, and drop those pages from the cache.

Q. Can we find performance bottleneck from sysprocesses?


Ans:
Yes. We may not confirm that it is the only bottleneck, but at least we can find the
bottleneck. The lastwaittype column along with waittime plays a vital role in identifying the
issue. This is a very interesting column because it can tell you what the offending
query is waiting for to complete (see the query sketch below).
NETWORK_IO: there is too much traffic on the network.
CXPACKET: your process is waiting on other parallel processes to complete.
SOS_SCHEDULER_YIELD: CPU bound; the box may not have enough CPU.
IO_COMPLETION: disk issue; there may not be enough disk space, or the disk
array may be corrupted.
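
A minimal sketch of that check against the legacy sysprocesses view:

SELECT spid, lastwaittype, waittime
FROM master..sysprocesses
WHERE waittime > 0
ORDER BY waittime DESC;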
Q. What Are SQL Server Waits?
Ans:
Instead of measuring activity of CPU, storage, or memory, why not ask what SQL
Server has been waiting on when executing queries?

In general there are three categories of waits that could affect any given request:

Resource waits are caused by a particular resource, perhaps a specific lock


that is unavailable when the request is submitted.
External waits occur when a SQL Server worker thread is waiting on an external
process.
Queue waits normally apply to internal background tasks, such as ghost
cleanup, which physically removes records that have been previously deleted.
Q. How could you know the statistics are outdated?
Ans:
If old statistics are your problem, you will likely experience this as a gradual decline
in SQL Server performance over many days or weeks, or you may have just
upgraded your platform (from 2000 to 2008) and forgotten to update the statistics.
Out-of-date statistics cause inaccurate execution plans (see the check sketched below).
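
A minimal sketch of checking statistics age with STATS_DATE (the dbo.Orders table is
hypothetical):

SELECT s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.Orders')
ORDER BY last_updated;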

Q. What are the main parameters we need to check when you are dealing with
memory performance?
Ans:
There are four significant memory-related settings in SQL Server.

Max server memory and Min server memory:


Use the two server memory options, min server memory and max server memory,
to reconfigure the amount of memory (in megabytes) that is managed by the SQL
Server Memory Manager for a SQL Server process used by an instance of SQL
Server. By default, min server memory is set to 0 and max server memory is set to
2147483647 MB (effectively unlimited). Never leave these two settings at their
defaults. Depending on the memory available and other applications running on the
Windows server, change these two settings.

For example we have 24 GB available and the settings can be like this:

Min Memory: 1 GB

Max Memory: 16 GB

Remember, the total max memory of all instances should not exceed the actual
physical memory available (a configuration sketch follows below).
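
A minimal sketch of applying the example values above with sp_configure:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory (MB)', 1024;    -- 1 GB
EXEC sp_configure 'max server memory (MB)', 16384;   -- 16 GB
RECONFIGURE;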

Priority boost: By default, the priority boost setting is 0, which causes SQL Server
to run at a normal priority. If you set priority boost to 1, the SQL Server process
runs at a high priority.
Lightweight pooling: Switch on this parameter when you want to make SQL Server
use the fiber mode facility. Unless there is a real need and a suitable environment (large
multi-processor servers), we should not use this option on production
servers.

Question Difficulty = Easy

Question 1: Name five different tools which can be used for performance
tuning and their associated purpose.
o Performance Monitor\System Monitor - Tool to capture macro level
performance metrics.
Additional information
Automate Performance Monitor Statistics Collection for SQL
Server and Windows
Windows Reliability and Performance Monitor to troubleshoot
SQL Server
o Profiler - Tool to capture micro level performance metrics based on the
statements issued by a login, against a database or from host name.
Additional information: Tip Category - Profiler and Trace
o Server Side Trace - System objects to write the detailed statement metrics to a
table or file, similar to Profiler.
Additional information: SQL Server Performance Statistics Using a
Server Side Trace
o Dynamic Management Views and Functions - SQL Server objects with low
level metrics to provide insight into a specific portion of SQL Server i.e. the
database engine, query plans, Service Broker, etc.
Additional information: Dynamic Management Views\Functions
o Management Studio's Built-In Performance Reports - Ability to capture point
in time metrics as pre-defined by Microsoft.
Additional information: Built-In Performance Reports in SQL Server
2005
o Custom scripts - Custom scripts can be developed to monitor performance,
determine IO usage, monitor fragmentation, etc. all in an effort to improve
performance.
o Third party applications - Performance monitoring and tuning applications
from vendors in the SQL Server community.
Additional information: SQL Server Performance Monitoring Tools

Question 2: Explain how the hardware running SQL Server can help or
hinder performance.
o Taxed CPUs will queue requests and hinder query performance.
o Insufficient memory could cause paging resulting in slow query performance.
o Incorrectly configured disk drives could exacerbate IO problems.
o Additional information: Hard Drive Configurations for SQL
Server and Hardware 101 for DBAs

Question 3: Why is it important to avoid functions in the WHERE clause?


o Because SQL Server will scan the index or the table as opposed to seeking the
data. The scan operation is a much more costly operation than a seek.
o Often a slightly different approach can be used to prevent using the function in
the WHERE clause yielding a favorable query plan and high performance.
o Additional information: Performance Tip: Avoid functions in WHERE clause
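
A minimal sketch of such a rewrite (the dbo.Orders table is hypothetical):

-- Non-sargable: the function on the column forces a scan
SELECT OrderID FROM dbo.Orders WHERE YEAR(OrderDate) = 2008;

-- Sargable: an equivalent range predicate lets the optimizer seek an index on OrderDate
SELECT OrderID FROM dbo.Orders
WHERE OrderDate >= '20080101' AND OrderDate < '20090101';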

Question 4: How is it possible to capture the IO and time statistics for your
queries?
o Use the SET STATISTICS IO and SET STATISTICS TIME settings in your
queries or enable the settings in your Management Studio session.
o Additional information: Getting IO and time stats for your queries
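
A minimal usage sketch (the query itself is hypothetical):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT OrderID FROM dbo.Orders WHERE CustomerID = 42;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;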

Question 5: True or False - It is possible to correlate the Performance


Monitor metrics with Profiler data in a single SQL Server native product?
o True - This functionality is possible with SQL Server Profiler.
o Additional information: Correlating Performance Monitor and Trace Data

Question Difficulty = Moderate

Question 1: How can I/O statistics be gathered and reviewed for individual
database files?
o By using the fn_virtualfilestats function to capture the metrics.
o This process can be automated with a script to determine the file usage with
numerous samples.
o Additional Information: Gathering I/O statistics down to the database file level
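
A minimal sketch of that function (passing NULL, NULL returns statistics for all databases and
files):

SELECT DB_NAME(DbId) AS database_name, FileId,
       NumberReads, NumberWrites, IoStallMS
FROM sys.fn_virtualfilestats(NULL, NULL)
ORDER BY IoStallMS DESC;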

Question 2: What is a query plan and what is the value from a performance
tuning perspective?
o A query plan is the physical breakdown of the code being passed to the SQL
Server optimizer.
o The value from a performance tuning perspective is that each component of
the query can be understood and the percentage of resource utilization can be
determined at a micro level. As query tuning is being conducted, the detailed
metrics can be reviewed to compare the individual coding techniques to
determine the best alternative.
o Additional Information:
Maximizing your view into SQL Query Plans
Capturing Graphical Query Plans with Profiler
Tip Category: Query Plans
Tip Category: Query Optimization

Question 3: True or False - It is always beneficial to configure TempDB with


an equal number of fixed sized files as the number of CPU cores.
o False - With always being the operative word in the question.
o Depending on the version of SQL Server, the disk subsystem, load, queries,
etc., a 1 to 1 ratio of files to cores may be necessary on high end SQL Servers
with intense processing.
o If you do not have that luxury, a starting point may be to have half the number
of tempdb files as compared to CPU cores.
o This is a configuration to load test and monitor closely depending on the type
of processing, load, hardware, etc. that your SQL Server is expected to
support.

Question 4: Explain the NOLOCK optimizer hint and some pros\cons of using
the hint.
o The NOLOCK query hint allows SQL Server to ignore the normal locks that
are placed and held for a transaction allowing the query to complete without
having to wait for the first transaction to finish and therefore release the locks.
o This is one short term fix to help prevent locking, blocking or deadlocks.
o However, when the NOLOCK hint is used, dirty data is read which can
compromise the results returned to the user.
o Additional information: Getting rid of some blocking issues with NOLOCK
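
A minimal sketch of the hint (the dbo.Orders table is hypothetical):

SELECT OrderID, Status
FROM dbo.Orders WITH (NOLOCK)   -- may read uncommitted (dirty) rows
WHERE CustomerID = 42;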

Question 5: Explain three different approaches to capture a query plan.


o SHOWPLAN_TEXT
o SHOWPLAN_ALL
o Graphical Query Plan
o sys.dm_exec_query_optimizer_info
o sys.dm_exec_query_plan
o sys.dm_exec_query_stats

Question Difficulty = Advanced

Question 1: True or False - A LEFT OUTER JOIN is always faster than a


NOT EXISTS statement.
o False - With always being the operative word. Depending on the situation the
OUTER JOIN may or may not be faster than a NOT EXISTS statement. It is
necessary to test the techniques, review the query plans and tune the queries
accordingly.

Question 2: Name three different options to capture the input (code) for a
query in SQL Server.
o DBCC INPUTBUFFER
o fn_get_sql
o sys.dm_exec_sql_text
o Additional information:
SQL Server statements currently running with fn_get_sql
SQL Server 2000 to 2005 Crosswalk - Code Identification

Question 3: Explain why the NORECOMPUTE option of UPDATE


STATISTICS is used.
o This command is used on a per table basis to prevent the table from having
statistics automatically updated based on the 'Auto Update Statistics' database
configuration.
o Taking this step will prevent UPDATE STATISTICS from running during an
unexpected time of the day and cause performance problems.
o After setting this option it is necessary to manually run UPDATE
STATISTICS on a regular basis (see the sketch below).
o Additional information: The NORECOMPUTE option of UPDATE
STATISTICS
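
A minimal sketch (the dbo.Orders table is hypothetical):

UPDATE STATISTICS dbo.Orders WITH FULLSCAN, NORECOMPUTE;  -- update now, suppress auto-updates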

Question 4: Explain a SQL Server deadlock, how a deadlock can be


identified, how it is a performance problem and some techniques to correct
deadlocks.
o A deadlock is a situation where 2 spids have data locked and cannot release
their lock until the opposing spid releases their lock. Depending on the
severity of the deadlock, meaning the amount of data that is locked and the
number of spids that are trying to access the same data, an entire chain of spids
can have locks and cause a number of deadlocks, resulting in a performance
issue.
o Deadlocks can be identified by Profiler in either textual, graphical or XML
format.
Additional information: Capturing Deadlock Information in XML
Format and How To: Graphical Deadlock Chain
o Deadlocks are a performance problem because they can prevent 2 or more
processes from being able to process data. A deadlock chain can occur and
impact hundreds of spids based on the data access patterns, number of users,
object dependencies, etc.
o Deadlocks could require a database design change, T-SQL coding change to
access the objects in the same order, separating reporting and OLTP
applications, including NOLOCK statements in SELECT queries that can
accept dirty data, etc.

Question 5: Please explain why SQL Server does not select the same query
plan every time for the same code (with different parameters) and how SQL
Server can be forced to use a specific query plan.
o The query plan is chosen based on the parameters and code being issued to the
SQL Server optimizer. Unfortunately, a slightly different query plan can
cause the query to execute much longer and use more resources than another
query with exactly the same code and only parameter differences.
o The OPTIMIZE FOR hint can be used to specify what parameter value we
want SQL Server to use when creating the execution plan. This is a SQL
Server 2005 and beyond hint.
o Additional information: Optimize Parameter Driven Queries with the
OPTIMIZE FOR Hint
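
A minimal sketch of the hint (the table, proc, and parameter are hypothetical):

CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerID INT
AS
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID
OPTION (OPTIMIZE FOR (@CustomerID = 42));  -- plan compiled as if @CustomerID = 42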

Sql server - What is an index?

Indexes help us to find data faster. It can be created on a single column or a combination of columns........

Sql server - Disadvantages of the Indexes


Use of indexes slows down data modification operations (such as INSERT, UPDATE, DELETE)........

Sql server - Define Clustered and Non-Clustered Index


A clustered index reorders the way records are stored. A non-clustered index is one in which the logical
order of the index does not match the physical stored order of the rows on disk......

Sql server - What is Unique Index?


Unique index is the index that is applied to any column of unique value........

Sql server - Difference between clustered and non-clustered index


Both are stored as B-tree structures. The leaf level of a clustered index is the actual data, whereas the leaf
level of a non-clustered index is a pointer to the data........

Sql server - Explain the 'Fillfactor' concept in Indexes


The fill factor option is provided for smoothing index data storage and performance........

Sql server - How do you maintain a fill factor in existing indexes?


Usually the data keeps getting added, deleted, or updated in the table, which is why the fill factor is
implemented during the index creation itself........

Sql server - Why is it unwise to create wide clustered index keys?


A clustered index is a good choice for searching over a range of values. After an indexed row is found, the
remaining rows.......

Sql server - What is full-text indexing?


Full text indexes are stored in the file system and are administered through the database.......

Sql server - What is fill factor and pad index?


A fill factor is a specification made during the creation of indexes so that a particular amount of space is
left on a leaf-level page........

Sql server - Describe important index characteristics


The characteristics of the indexes are: they speed up searching for a row. They are sorted by the key
values........

Sql server - What are the types of indexes?


Types of indexes: Clustered: it sorts and stores the data rows of the table or view in order based on the index
key.........

Sql server - Describe the purpose of indexes


Allow the server to retrieve requested data in as few I/O operations as possible.......

Sql server - Determine when an index is appropriate


When there is a large amount of data. For a faster search mechanism, indexes are appropriate........

Sql server - Syntax and an example for create, rename and delete index
CREATE INDEX index_name ON table_name (col_1,col_2..);........

Q. What are the primary differences between an index reorganization and an


index rebuild?
Ans:
Reorganization is an online operation by default; a rebuild is an offline
operation by default
Reorganization only affects the leaf level of an index
Reorganization swaps data pages in-place by using only the pages already
allocated to the index; a rebuild uses new pages/allocations
Reorganization is always a fully-logged operation; a rebuild can be a minimally-
logged operation
Reorganization can be stopped mid-process and all completed work is retained;
a rebuild is transactional and must be completed in entirety to keep changes
Q. During an index reorganization operation, if the index spans multiple files,
will pages be allowed to migrate between files?
Ans:
No, pages will not migrate between files during an index reorganization.

Q. If you need to REBUILD a non-clustered index that is 10GB in size and have
5GB of free data-file space available with no room to grow the data file(s), how
can you accomplish the task?
Ans:
When rebuilding an existing non-clustered index, you typically require free space
that is approximately 2.2 times the size of the existing index, because
during a rebuild operation the existing index is kept until the rebuilt structure is
complete, and approximately 20% additional free space is needed for temporary
sorting structures used during the rebuild operation.

In this case, you would require at least an additional 10+ GB of free space for
the rebuild operation to succeed, since the index itself is 10GB in size.
Using SORT_IN_TEMPDB would not suffice in this case, since only the temporary
sort tables are stored in tempdb in this case, and at least 10 GB of free space
would still be required in the database data files.
Your possibilities are different for SQL Server 2000 vs. SQL Server 2005/2008:

In SQL Server 2000, you only have one option in this case: drop the index and
recreate it.
In SQL Server 2005/2008, you can do the same as you did with SQL Server
2000 (drop, recreate), but you also have another option. If you first disable the
index (via the ALTER INDEX ... DISABLE statement) the existing space consumed
by the index will be freed. Then running a simple ALTER INDEX ... REBUILD
command will allow the build operation to use the now 15 GB of free space to
build the index (see the sketch below).
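
A minimal sketch of that disable-then-rebuild sequence (the index and table names are
hypothetical):

ALTER INDEX IX_BigIndex ON dbo.BigTable DISABLE;  -- frees the space the index occupies
ALTER INDEX IX_BigIndex ON dbo.BigTable REBUILD;  -- rebuild using the freed space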
Q. What is the Covering Index?
Ans:
If you know that your application will be performing the same query over and over
on the same table, consider creating a covering index on the table. A covering
index, which is a form of a composite non clustered index, includes all of the
columns referenced in SELECT, JOIN, and WHERE clauses of a query. Because of
this, the index contains the data you are looking for and SQL Server doesn't have
to look up the actual data in the table, reducing logical and/or physical I/O, and
boosting performance.
On the other hand, if the covering index gets too big (has too many columns), this
could actually increase I/O and degrade performance.

Q. What is Online Indexing?


Ans:
Online Indexing is a new feature available in SQL Server 2005. In SQL Server 2005,
DBAs can create, rebuild, or drop indexes online. The index operations on the
underlying table can be performed concurrently with update or query operations.
This was not possible in previous versions of SQL Server. In the past, indexing
operations (reorganizing or rebuilding) were usually performed as a part of other
maintenance tasks running during off-peak hours. During these offline operations,
the indexing operations hold exclusive locks on the underlying table and associated
indexes. During online index operations, SQL Server 2005 eliminates the need for
exclusive locks (see the sketch below).

The Online indexing feature is very helpful for environments that run 24 hours a
day, seven days a week. The Online Indexing feature is available only in the
Enterprise Edition of SQL Server 2005/2008.
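
A minimal sketch of an online rebuild (the index and table names are hypothetical; Enterprise
Edition only):

ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (ONLINE = ON);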

Q. How Online Indexing works?


Ans:
The online index operation can be divided into three phases:

Preparation
Build
Final
The Build phase is the longest phase of all. It is in this phase that the creation,
dropping, or rebuilding of indexes takes place. The duration of the Build phase
depends on the size of the data and the speed of the hardware. Exclusive locks are
not held in this phase, so concurrent DML operations can be performed during this
phase. The Preparation and Final phases are for shorter durations. They are
independent of the size factor of the data. During these two short phases, the
table or the indexed data is not available for concurrent DML operations.

Q. How to find unused indexes in a table?

Ans:
By using a DMV we'll find the unused index details. We have a DMV called
sys.dm_db_index_usage_stats which retrieves the usage statistics of indexes; if an
index id is not in that DMV then we can say that the index has not been used for a long
time (since the last restart).

Q. What are the index limitations?


Ans:
Index type: limit on 32-bit / 64-bit

Non-clustered indexes per table: 999 / 999

XML indexes per table: 249 / 249

Columns per index key: 16 / 16 (index key limited to 900 bytes)

Q. What are the different indexes available?


Ans:
Index Types
Clustered Index: A clustered index sorts and stores the data rows of the table
or view in order based on the clustered index key. The clustered index is
implemented as a B-tree index structure that supports fast retrieval of the
rows, based on their clustered index key values.
Non Clustered Index: A nonclustered index can be defined on a table or view
with a clustered index or on a heap. Each index row in the nonclustered index
contains the nonclustered key value and a row locator. This locator points to
the data row in the clustered index or heap having the key value. The rows in
the index are stored in the order of the index key values, but the data rows are
not guaranteed to be in any particular order unless a clustered index is created
on the table.
Unique Index: An index that ensures the uniqueness of each value in the
indexed column. If the index is a composite, the uniqueness is enforced across
the columns as a whole, not on the individual columns. For example, if you
were to create an index on the FirstName and LastName columns in a table, the
names together must be unique, but the individual names can be duplicated. A
unique index is automatically created when you define a primary key or unique
constraint:
Primary key: When you define a primary key constraint on one or more
columns, SQL Server automatically creates a unique, clustered index if a
clustered index does not already exist on the table or view. However, you
can override the default behavior and define a unique, nonclustered index
on the primary key.
Unique: When you define a unique constraint, SQL Server automatically
creates a unique, nonclustered index. You can specify that a unique
clustered index be created if a clustered index does not already exist on the
table.
Index with Included Columns: A nonclustered index that is extended to include
nonkey columns in addition to the key columns. They can be data types not
allowed as index key columns. They are not considered by the Database Engine
when calculating the number of index key columns or index key size. Need to
use INCLUDE clause while creating index.
Full Text Indexes: A special type of token-based functional index that is built
and maintained by the Microsoft Full-Text Engine for SQL Server. It provides
efficient support for sophisticated word searches in character string data.
Spatial Indexes: A spatial index provides the ability to perform certain
operations more efficiently on spatial objects (spatial data) in a column of
the geometry data type. The spatial index reduces the number of objects on
which relatively costly spatial operations need to be applied.
Filtered Index: An optimized nonclustered index, especially suited to cover
queries that select from a well-defined subset of data. It uses a filter predicate
to index a portion of rows in the table. A well-designed filtered index can
improve query performance, reduce index maintenance costs, and reduce index
storage costs compared with full-table indexes. Use where clause in create
index statement.
Composite index: An index that contains more than one column. In both SQL
Server 2005 and 2008, you can include up to 16 columns in an index, as long as
the index doesn't exceed the 900-byte limit. Both clustered and nonclustered
indexes can be composite indexes.
XML Indexes: A shredded, and persisted, representation of the XML binary
large objects (BLOBs) in the xml data type column.

Q. I have a situation where I need to create an index on 20 columns. Is it
practically possible? If yes, justify.
Ans:
Yes! We can create an index on 20 columns using the INCLUDE clause (see the sketch below).

You can include nonkey columns in a nonclustered index to avoid the current index
size limitations of a maximum of 16 key columns and a maximum index key size of
900 bytes. The SQL Server Database Engine does not consider nonkey columns
when calculating the number of index key columns or the total size of the index
key columns. In a nonclustered index with included columns, the total size of the
index key columns is restricted to 900 bytes. The total size of all nonkey columns is
limited only by the size of the columns specified in the INCLUDE clause; for
example, varchar(max) columns are limited to 2 GB. The columns in the INCLUDE
clause can be of all data types, except text, ntext, and image.
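
A minimal sketch of such an index (the table and columns are hypothetical):

CREATE NONCLUSTERED INDEX IX_Orders_Customer
ON dbo.Orders (CustomerID, OrderDate)        -- key columns: subject to the 16-column/900-byte limits
INCLUDE (ShipAddress, Status, TotalAmount);  -- nonkey columns: not counted against those limits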

Q. On what basis do we create composite indexes?
Ans:
For composite indexes, take into consideration the order of the columns in the
index definition. Columns that will be used in comparison expressions in the
WHERE clause (such as WHERE FirstName = 'Charlie') should be listed first.
Subsequent columns should be listed based on the uniqueness of their values, with
the most unique listed first.
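
A short hedged illustration of that ordering (hypothetical Person table), where
the equality-filtered, more selective column leads the key:

-- LastName is compared with '=' and is more selective than FirstName,
-- so it is placed first in the composite key.
CREATE NONCLUSTERED INDEX IX_Person_LastName_FirstName
ON dbo.Person (LastName, FirstName);

-- A query this index serves well:
-- SELECT * FROM dbo.Person WHERE LastName = 'Smith' AND FirstName = 'Charlie';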
Q. How to design indexes for a database? Or
what are the index creation best practices?
Ans:
Understand the characteristics of the database (OLTP / OLAP)
Understand the characteristics of the most frequently used queries.
Understand the characteristics of the columns used in the queries.
Choose the right index at the right place. For example, creating a clustered
index on an existing large table would benefit from the ONLINE index option.
Determine the optimal storage location for the index. A nonclustered index can
be stored in the same filegroup as the underlying table, or on a different
filegroup. If those filegroups are on different physical drives, it can improve
performance (see the sketch after this list).
Database Considerations:
Avoid over indexing
Use many indexes to improve query performance on tables with low update
requirements, but large volumes of data.
Indexing small tables may not be optimal
Indexes on views can provide significant performance gains when the view
contains aggregations and/or table joins. The view does not have to be
explicitly referenced in the query for the query optimizer to use it
Column considerations:
Keep the length of the index key short for clustered indexes. Additionally,
clustered indexes benefit from being created on unique or nonnull columns.
Columns that are of the ntext, text, image, varchar(max), nvarchar(max), and
varbinary(max) data types cannot be specified as index key columns. However,
varchar(max), nvarchar(max), varbinary(max), and xml data types can
participate in a nonclustered index as nonkey index columns
Consider using filtered indexes on columns that have well-defined subsets
Consider the order of the columns if the index will contain multiple columns.
The column that is used in the WHERE clause in an equal to (=), greater than
(>), less than (<), or BETWEEN search condition, or participates in a join, should
be placed first.
Index Characteristics: determining the right index depends on the business need
Clustered versus nonclustered
Unique versus nonunique
Single column versus multicolumn
Ascending or descending order on the columns in the index
Full-table versus filtered for nonclustered indexes
Determine the Fill Factor
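A hedged sketch of the storage-location and ONLINE points above (the filegroup
and table names are hypothetical):

-- Build the index online so the table stays available during creation,
-- and place it on a separate filegroup, ideally on a separate drive.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
WITH (ONLINE = ON)
ON [IndexFileGroup];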
Q. Can we create indexes on table variables?
Ans:
Yes, implicitly.
Creating an index on a table variable can be done implicitly within the declaration
of the table variable by defining a primary key and creating unique constraints.
The primary key will be backed by a clustered index, and the unique constraint by
a nonclustered index.

DECLARE @Users TABLE
(
    UserID INT PRIMARY KEY,   -- implicit clustered index
    UserName varchar(50),
    UNIQUE (UserName)         -- implicit nonclustered index
)
The drawback is that the indexes (or constraints) need to be unique. One potential
way to circumvent this, however, is to create a composite unique constraint:

DECLARE @Users TABLE
(
    UserID INT PRIMARY KEY,
    UserName varchar(50),
    FirstName varchar(50),
    -- UserID is unique on its own, so the pair stays unique
    -- even when UserName has duplicates.
    UNIQUE (UserName, UserID)
)
You can also create the equivalent of a clustered index. To do so, just add the
CLUSTERED keyword:

DECLARE @Users TABLE
(
    -- The primary key is declared NONCLUSTERED here, because a table
    -- can have only one clustered index.
    UserID INT PRIMARY KEY NONCLUSTERED,
    UserName varchar(50),
    FirstName varchar(50),
    UNIQUE CLUSTERED (UserName, UserID)
)

Q. What is a heap table?
Ans:
HEAP

Data is not stored in any particular order
Specific data cannot be retrieved quickly, unless there are also non-clustered
indexes
Data pages are not linked, so sequential access needs to refer back to the index
allocation map (IAM) pages
Since there is no clustered index, additional time is not needed to maintain the
index
Since there is no clustered index, there is no need for additional space to
store the clustered index tree
These tables have an index_id value of 0 in the sys.indexes catalog view
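
That last point gives a quick way to list the heaps in a database; a minimal
sketch:

-- Heaps are exactly the rows in sys.indexes with index_id = 0.
SELECT OBJECT_NAME(object_id) AS table_name
FROM sys.indexes
WHERE index_id = 0;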
Q. How to get the index usage information?
Q. Query to retrieve the indexes that are being used?
Ans:
sys.dm_db_index_usage_stats joined with sys.objects
This view gives you information about overall access methods to your indexes.
There are several columns that are returned from this DMV, but here are some
helpful columns about index usage:

user_seeks - number of index seeks
user_scans - number of index scans
user_lookups - number of index lookups
user_updates - number of insert, update, or delete operations

sys.dm_db_index_operational_stats joined with sys.indexes
This view gives you information about insert, update and delete operations that
occur on a particular index. In addition, this view also offers data about locking,
latching and access methods. There are several columns that are returned from
this view, but these are some of the more interesting columns:

leaf_insert_count - total count of leaf level inserts
leaf_delete_count - total count of leaf level deletes
leaf_update_count - total count of leaf level updates
SELECT DB_NAME(DATABASE_ID) AS DATABASENAME,
       OBJECT_NAME(B.OBJECT_ID) AS TABLENAME,
       INDEX_NAME = (SELECT NAME
                     FROM SYS.INDEXES A
                     WHERE A.OBJECT_ID = B.OBJECT_ID
                       AND A.INDEX_ID = B.INDEX_ID),
       USER_SEEKS,
       USER_SCANS,
       USER_LOOKUPS,
       USER_UPDATES
FROM SYS.DM_DB_INDEX_USAGE_STATS B
INNER JOIN SYS.OBJECTS C
        ON B.OBJECT_ID = C.OBJECT_ID
WHERE DATABASE_ID = DB_ID(DB_NAME())
  AND C.TYPE <> 'S'   -- exclude system objects

Q. What is fill factor? How to choose the fill factor while creating an index?
Ans:
The fill factor specifies the percentage of fullness of the leaf-level pages of an
index. When an index is created or rebuilt, the leaf-level pages are filled up to
the fill factor value and the remainder of each page is left blank for future
usage. This is the case when a value other than 0 or 100 is specified. For
example, if a fill factor value of 70 is chosen, the index pages are all written
70% full, leaving 30% of space for future usage.
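
A minimal sketch of specifying the fill factor (index and table names
hypothetical):

-- Leave 30% free space on each leaf page for future inserts.
CREATE NONCLUSTERED INDEX IX_Emp_Name
ON dbo.Emp (Name)
WITH (FILLFACTOR = 70);

-- The same option applies when rebuilding:
ALTER INDEX IX_Emp_Name ON dbo.Emp REBUILD WITH (FILLFACTOR = 70);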

Q. When to choose a high or low fill factor value?
Ans:
You might choose a high fill factor value when there is very little or no change in
the underlying table's data, such as a decision support system where data
modification is not frequent but happens on a regular and scheduled basis. Such a
fill factor value would be better, since it creates an index smaller in size, and
hence queries can retrieve the required data with fewer disk I/O operations, as
fewer pages have to be read.

On the other hand if you have an index that is constantly changing you would want
to have a lower value to keep some free space available for new index entries.
Otherwise SQL Server would have to constantly do page splits to fit the new values
into the index pages.

Q. What methods are available for removing fragmentation of any kind on an
index in SQL Server?
Ans:
SQL Server 2000:

DBCC INDEXDEFRAG
DBCC DBREINDEX
CREATE INDEX ... WITH DROP_EXISTING (clustered)
DROP INDEX; CREATE INDEX

SQL Server 2005 and above: the same processes as SQL Server 2000, only with
different syntax:

ALTER INDEX ... REORGANIZE
ALTER INDEX ... REBUILD
CREATE INDEX ... WITH DROP_EXISTING (clustered)
DROP INDEX; CREATE INDEX
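
Hedged examples of the 2005+ syntax (index and table names are hypothetical);
reorganize for light fragmentation, rebuild for heavy:

-- Light fragmentation (a common rule of thumb: roughly 5-30%): reorganize.
ALTER INDEX IX_Emp_Name ON dbo.Emp REORGANIZE;

-- Heavy fragmentation (roughly above 30%): rebuild.
ALTER INDEX IX_Emp_Name ON dbo.Emp REBUILD;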
Q. What page verification options are available in SQL Server and how do they
work?
Ans:
SQL Server 2000: Only Torn Page Detection is available in SQL Server 2000.
SQL Server 2005: Both Torn Page Detection and Checksum page verification
options exist in SQL Server 2005. Page verification checks help to discover
damaged database pages caused by disk I/O path errors. Disk I/O path errors can
be the cause of database corruption problems and are generally caused by power
failures or disk hardware failures that occur at the time the page is being written
to disk.
TORN PAGE DETECTION:
Works by saving a specific bit for each 512-byte sector in the 8-kilobyte (KB)
database page and is stored in the database page header when the page is written
to disk. When the page is read from disk, the torn bits stored in the page header
are compared to the actual page sector information. Unmatched values indicate
that only part of the page was written to disk. In this situation, error message 824
(indicating a torn page error) is reported to both the SQL Server error log and the
Windows event log. Torn pages are typically detected by database recovery if it is
truly an incomplete write of a page. However, other I/O path failures can cause a
torn page at any time.

CHECKSUM:
Works by calculating a checksum over the contents of the whole page and stores
the value in the page header when a page is written to disk. When the page is read
from disk, the checksum is recomputed and compared to the checksum value
stored in the page header. If the values do not match, error message 824
(indicating a checksum failure) is reported to both the SQL Server error log and the
Windows event log. A checksum failure indicates an I/O path problem. To
determine the root cause requires investigation of the hardware, firmware drivers,
BIOS, filter drivers (such as virus software), and other I/O path components.
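
A minimal sketch of enabling checksum verification (the database name is
hypothetical):

-- CHECKSUM is the stronger of the two options and the usual choice on 2005+.
ALTER DATABASE MyDatabase SET PAGE_VERIFY CHECKSUM;

-- Inspect the current setting for every database:
SELECT name, page_verify_option_desc FROM sys.databases;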

Q. What are the primary differences between an index reorganization and an
index rebuild?
Ans:
Reorganization is an online operation by default; a rebuild is an offline
operation by default
Reorganization only affects the leaf level of an index
Reorganization swaps data pages in-place by using only the pages already
allocated to the index; a rebuild uses new pages/allocations
Reorganization is always a fully-logged operation; a rebuild can be a minimally-
logged operation
Reorganization can be stopped mid-process and all completed work is retained;
a rebuild is transactional and must be completed in entirety to keep changes
Q. During an index reorganization operation, if the index spans multiple files,
will pages be allowed to migrate between files?
Ans:
No, pages will not migrate between files during an index reorganization.

Q. How to force a query to use an index? Or what table hint needs to be
specified to forcefully use an index in a query?
Ans:
We can specify the INDEX table hint in a query to forcefully use an index.

Ex: SELECT Emp_ID, Name FROM Emp WITH (INDEX(NIX_NAME_EMP))

Q. What are index statistics?
Ans:
Index statistics contain information about the distribution of index key values. By
distribution, I mean the number of rows associated with each key value. SQL
Server uses this information to determine what kind of execution plan to use when
processing a query.

Q. When are index statistics updated?
Ans:
The AUTO_UPDATE_STATISTICS database setting controls when statistics are
automatically updated. Once statistics have been created, SQL Server then
determines when to update them based on how out-of-date they might be. SQL
Server identifies out-of-date statistics based on the number of inserts, updates,
and deletes that have occurred since the last time statistics were updated, and
then recreates the statistics once a threshold is crossed. The threshold is
relative to the number of records in the table. (Enable the "auto create
statistics" and "auto update statistics" properties for OLTP databases.)

Q. Explain the database options Auto Update Statistics and Auto Update
Statistics Asynchronous?
Ans:

Auto Update Statistics: If there is an incoming query but statistics are stale,
SQL Server first updates the statistics before building the execution plan.
Auto Update Statistics Asynchronous: If there is an incoming query but statistics
are stale, SQL Server uses the stale statistics, builds the execution plan, and
then updates the statistics in the background.
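
A short sketch of setting these options (the database name is hypothetical):

ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS ON;
-- Let queries run on stale statistics while the refresh happens in the background:
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;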
Q. How to update statistics manually?
Ans:
If you want to manually update statistics, you can use either sp_updatestats or
UPDATE STATISTICS <table name> [<index or statistics name>]
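
For example (the table and index names are hypothetical):

-- Refresh every statistics object in the current database:
EXEC sp_updatestats;

-- Or target one table / one statistics object, forcing a full scan:
UPDATE STATISTICS dbo.Emp IX_Emp_Name WITH FULLSCAN;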

Q. What are the various types of statistics available?
Ans:
There are three different types of statistics available:

Statistics created due to index creation
Statistics created by the optimizer
User-defined statistics created with CREATE STATISTICS
Q. What are the various options to be considered while designing a new
execution plan?
Ans:
There are a number of factors that influence the decision in choosing the proper
execution plan. One of the most important is cardinality estimation, the
process of calculating the number of qualifying rows that are likely after filtering
operations are applied. A query execution plan selected with inaccurate
cardinality estimates can perform several orders of magnitude slower than one
selected with accurate estimates. These cardinality estimates also influence plan
design options, such as join order and parallelism. Even memory allocation, the
amount of memory that a query requests, is guided by cardinality estimates.

Other factors that influence the optimizer's choice of execution plan include the
available indexes, the freshness of statistics, and server configuration settings
such as the maximum degree of parallelism.
Q. How is a histogram built?
Ans:
As we all know, when we index a column, SQL Server does two things:

1. Sorts the values of the column in an internal data structure called an index.
This data structure contains the sorted values and a pointer to each value's
respective row in the table.
2. Generates a histogram of values.
The histogram shows the distribution of values for the sorted column. For example,
if we have the following values in an integer column - 1,2,2,3,4,1,4,5,4 - then a
typical histogram will be similar to this:

Value   Rows matched
1       2
2       2
3       1
4       3
5       1
Suppose you are searching for all rows having column values between 3 and 5. By
looking at the above histogram you can estimate that you will get 1+3+1 = 5 rows
in total. If your table contains just 9 rows overall, then 5 rows means
approximately 55% of all rows will match the criteria. Based on this calculation
the optimizer may elect to scan the table directly instead of first reading the
index, fetching the pointers of the matching rows, and then reading the actual
rows. Although this is a very simple example and there are other factors involved
in deciding whether to use an index or not, it demonstrates the concept in very
simple terms.
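
You can inspect the actual histogram SQL Server keeps for a statistics object
with DBCC SHOW_STATISTICS (the table and index names are hypothetical):

-- Show only the histogram steps for the statistics behind an index:
DBCC SHOW_STATISTICS ('dbo.Emp', 'IX_Emp_Name') WITH HISTOGRAM;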
Q. How to find out when statistics were last updated?
Ans:
A simple check is to run the query and observe the values for both estimated rows
and actual rows; if they are close to each other, you need not worry about the
statistics. If you find a big difference between them, then you need to think
about updating the statistics.

In general, we can find out when statistics were last updated from the query below:

select object_name(object_id) as table_name
      ,name as stats_name
      ,stats_date(object_id, stats_id) as last_update
from sys.stats
where objectproperty(object_id, 'IsUserTable') = 1
order by last_update

Q. What is a RID Lookup / Bookmark Lookup?
Ans:
A RID lookup will be seen when you use non-clustered indexes with join queries.

In order to understand the RID lookup we first need to understand how non-
clustered indexes work with clustered indexes and heap tables.
Non-clustered indexes also use the fundamental B-tree structure to search data. In
a non-clustered index the leaf node holds a row locator which points to different
things depending on two scenarios:

Scenario 1: If the table that owns the primary key in the join has a clustered
index on the join key, then the leaf-node row locator points to the index key of
the clustered index, hence a clustered index seek happens.
Scenario 2: If the table that owns the primary key does not have a clustered
index, then the non-clustered index leaf-node row locator points to the actual row
on the heap table. As the data is stored in a heap, SQL Server uses a lookup
(i.e. a RID lookup) to get to the actual row, hence an index seek (non-clustered)
with a RID lookup happens.
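
A hedged repro sketch (all names hypothetical): a heap with a non-clustered
index, where a query seeks on the indexed column but selects other columns,
typically shows an Index Seek plus a RID Lookup in the actual execution plan:

-- A heap: no clustered index and no primary key.
CREATE TABLE dbo.EmpHeap
(
    Emp_ID INT,
    Name   varchar(50),
    Salary money
);

CREATE NONCLUSTERED INDEX NIX_EmpHeap_Name ON dbo.EmpHeap (Name);

-- Seek on Name, then a RID lookup to fetch Salary from the heap:
SELECT Name, Salary FROM dbo.EmpHeap WHERE Name = 'Charlie';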
