Contents
June 2010, Vol. 12, No. 6
www.sqlmag.com

COVER STORY
16 SQL Server 2008 R2 New Features
—Michael Otey
FEATURES
21 Descending Indexes
—Itzik Ben-Gan
Learn about special cases of SQL Server index B-trees and their use cases related to backward index ordering, as well as when to use descending indexes.

27 Troubleshooting Transactional Replication
—Kendal Van Dyke
Find out how to use Replication Monitor, tracer tokens, and alerts to stay ahead of replication problems, as well as how to solve three specific transactional replication problems.

33 Maximizing Report Performance with Parameter-Driven Expressions
—William Vaughn
Want your users to have a better experience and decrease the load on your server? Learn to use parameter-driven expressions.

39 Getting Started with Parallel Data Warehouse
—Rich Johnson
The SQL Server 2008 R2 Parallel Data Warehouse (PDW) Edition is Microsoft's first offering in the Massively Parallel Processor (MPP) data warehouse space. Here's a peek at what PDW is and what it can do.

Editor's Tip
We're resurfacing our most popular articles in the SQL Server classics column in the July issue. Which SQL Mag articles are your favorites? Let me know at mkeller@sqlmag.com.
—Megan Keller, associate editor
IN EVERY ISSUE
5 Editorial: Readers Weigh In on Microsoft's Support for Small Businesses
—Michael Otey

7 Reader to Reader

13 Kimberly & Paul: SQL Server Questions Answered
Your questions are answered regarding dropping clustered indexes and changing the definition of the clustering key.

48 The Back Page: SQL Server 2008 LOB Data Types
—Michael Otey

PRODUCTS
43 Product Review: Panorama NovaView Suite
—Derek Comingore
Panorama's NovaView Suite offers all the OLAP functions you could ask for, most of which are part of components installed on the server.

45 Industry News: Bytes from the Blog
Derek Comingore compares two business intelligence suites: Tableau Software's Tableau 5.1 and Microsoft's PowerPivot.

47 New Products
Check out the latest products from Lyzasoft, Attunity, Aivosto, Embarcadero Technologies, and HiT Software.

The Smart Guide to Building World-Class Applications

Technology Group
Senior Vice President, Technology Media Group: Kim Paulsen, kpaulsen@windowsitpro.com

Editorial
Editorial and Custom Strategy Director: Michele Crockett, crockett@sqlmag.com
Technical Director: Michael Otey, motey@sqlmag.com
Executive Editor, IT Group: Amy Eisenberg
Executive Editor, SQL Server and Developer: Sheila Molnar
Group Editorial Director: Dave Bernard, dbernard@windowsitpro.com
DBA and BI Editor: Megan Bearly Keller
Editors: Karen Bemowski, Jason Bovberg, Anne Grubb, Linda Harty, Caroline Marwitz, Chris Maxcer, Lavon Peters, Rita-Lyn Sanders, Zac Wiggy, Brian Keith Winstead
Production Editor: Brian Reinholz

Contributing Editors
Itzik Ben-Gan, IBen-Gan@SolidQ.com
Michael K. Campbell, mike@sqlservervideos.com
Kalen Delaney, kalen@sqlserverinternals.com
Brian Lawton, brian.k.lawton@redtailcreek.com
Douglas McDowell, DMcDowell@SolidQ.com
Brian Moran, BMoran@SolidQ.com
Michelle A. Poolet, mapoolet@mountvernondatasystems.com
Paul Randal, paul@sqlskills.com
Kimberly L. Tripp, kimberly@sqlskills.com
William Vaughn, vaughnwilliamr@gmail.com
Richard Waymire, rwaymi@hotmail.com

Art & Production
Senior Graphic Designer: Matt Wiebe
Production Director: Linda Kirchgesler

Advertising Sales
Publisher: Peg Miller
Director of IT Strategy and Partner Alliances: Birdie Ghiglione, 619-442-4064, birdie.ghiglione@penton.com
Online Sales and Marketing Manager: Dina Baird, dina.baird@penton.com, 970-203-4995
kim.eck@penton.com

Reprints
Reprint Sales: Diane Madzelonka, 888-858-8851, 216-931-9268, diane.madzelonka@penton.com

Circulation & Marketing
IT Group Audience Development Director: Marie Evans
Customer Service: service@sqlmag.com

Chief Executive Officer: Sharon Rowlands, Sharon.Rowlands@penton.com
Chief Financial Officer/Executive Vice President: Jean Clifton, Jean.Clifton@penton.com

Copyright
Unless otherwise noted, all programming code and articles in this issue are copyright 2010, Penton Media, Inc., all rights reserved. Programs and articles may not be reproduced or distributed in any form without permission in writing from the publisher. Redistribution of these programs and articles, or the distribution of derivative works, is expressly prohibited. Every effort has been made to ensure examples in this publication are accurate. It is the reader's responsibility to ensure procedures and techniques used from this publication are accurate and appropriate for the user's installation. No warranty is implied or expressed. Please back up your files before you run a new procedure or program or make significant changes to disk files, and be sure to test all procedures and programs before putting them into production.

List Rentals
Contact MeritDirect, 333 Westchester Avenue, White Plains, NY or www.meritdirect.com/penton.
Readers Weigh In on Microsoft's Support for Small Businesses
—Michael Otey
LISTING 2: Query That Includes All Days

CREATE TABLE dbo.MySalesTable2
( Store_ID INT, TransactionDate SMALLDATETIME )
GO

-- Populate the table.
INSERT INTO dbo.MySalesTable2
SELECT 100, '2009-10-05' UNION
SELECT 200, '2009-10-05' UNION
SELECT 200, '2009-10-06' UNION
SELECT 300, '2009-10-01' UNION
SELECT 300, '2009-10-07' UNION
SELECT 400, '2009-10-04' UNION
SELECT 400, '2009-10-06' UNION
SELECT 500, '2009-10-01' UNION
SELECT 500, '2009-10-02' UNION
-- Transaction for October 03, 2009, not inserted.
-- SELECT 500, '2009-10-03' UNION
SELECT 500, '2009-10-04' UNION
SELECT 500, '2009-10-05' UNION
SELECT 500, '2009-10-06' UNION
SELECT 500, '2009-10-07'
GO

DECLARE @BofWeek datetime = '2009-10-01 00:00:00'
SELECT st2.Store_ID, st2.Day_of_Week
FROM
  (SELECT st.Store_ID, DATES.Day_of_Week
   FROM (
     VALUES
     (CONVERT(varchar(35),@BofWeek,101)),
     (CONVERT(varchar(35),dateadd(DD,1,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,2,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,3,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,4,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,5,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,6,@BofWeek),101))
   ) DATES(Day_of_Week)
   CROSS JOIN
   (SELECT DISTINCT Store_ID FROM dbo.MySalesTable2) st
  ) AS st2
LEFT JOIN dbo.MySalesTable2 st3
  ON st3.Store_ID = st2.Store_ID AND
     st3.TransactionDate = st2.Day_of_Week
WHERE st3.TransactionDate IS NULL
ORDER BY st2.Store_ID, st2.Day_of_Week
GO

LISTING 3: Query That Includes All Stores

CREATE TABLE [dbo].[MyStoresTable](
  [Store_ID] [int] NOT NULL,
  CONSTRAINT [PK_MyStores] PRIMARY KEY CLUSTERED ([Store_ID] ASC) )
GO

INSERT INTO dbo.MyStoresTable (Store_ID)
VALUES (100),(200),(300),(400),(500),(600),(700)

DECLARE @BofWeek datetime = '2009-10-01 00:00:00'
SELECT st2.Store_ID, st2.Day_of_Week
FROM
  (SELECT st.Store_ID, DATES.Day_of_Week
   FROM (
     VALUES
     (CONVERT(varchar(35),@BofWeek,101)),
     (CONVERT(varchar(35),dateadd(DD,1,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,2,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,3,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,4,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,5,@BofWeek),101)),
     (CONVERT(varchar(35),dateadd(DD,6,@BofWeek),101))
   ) DATES(Day_of_Week)
   CROSS JOIN
   (SELECT Store_ID FROM dbo.MyStoresTable) st
  ) AS st2
LEFT JOIN dbo.MySalesTable2 st3
  ON st3.Store_ID = st2.Store_ID AND
     st3.TransactionDate = st2.Day_of_Week
WHERE st3.TransactionDate IS NULL
ORDER BY st2.Store_ID, st2.Day_of_Week
GO

…the revamped query (QueryUsingIndexedTemporaryTables.sql) that uses an indexed temporary table. This code uses Radhakrisnan's original data (i.e., data that includes the October 03, 2009, transaction), which was created with MySalesTable.Table.sql. Like the queries in Listings 2 and 3, QueryUsingIndexedTemporaryTables.sql creates the @BofWeek variable, which defines the first day of the reporting period. Next, it uses the CREATE TABLE command to create the #StoreDate local temporary table, which has two columns: Store_ID and Transaction_Date. Using the INSERT INTO…SELECT clause, the code populates the #StoreDate temporary table with all possible Store_ID and Transaction_Date combinations. Finally, the code uses a CREATE INDEX statement to create an index for the #StoreDate temporary table.

You need to be cautious with solutions that use indexed temporary tables. A solution might work well in one environment but time out in another, killing your application. For example, I initially tested QueryUsingIndexedTemporaryTables.sql using a temporary table with 15,000 rows. When I changed the number of rows for the temporary table to 16,000, the query's response time increased more than four times—from 120ms to 510ms. So, you need to know your production system workload types, SQL instance configuration, and hardware limitations if you plan to use indexed temporary tables.

Another way to optimize the performance of queries is to use the EXCEPT and INTERSECT operators, which were introduced in SQL Server 2005. These set-based operators can increase efficiency when you need to work with large data sets. I created a version of the revamped query (QueryUsingEXCEPTOperator.sql) that uses the EXCEPT operator. Once again, this code uses Radhakrisnan's original data. QueryUsingEXCEPTOperator.sql provides the fastest and most stable performance. It ran five times faster than Radhakrisnan's original query. (A table with a million rows was used for the tests.)

You can download the solutions I discussed (as well as MySalesTable.Table.sql) from the SQL Server Magazine website. I've provided two versions of the code. The first set of listings is compatible with SQL Server 2008. (These are the listings you see here.) The second set can be executed in a SQL Server 2005 environment.
—Gennadiy Chornenkyy, data architect, ADP Canada
InstantDoc ID 125130
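A rough sketch of the EXCEPT approach described above, using the tables from Listings 2 and 3. This is my own minimal illustration, not the downloadable QueryUsingEXCEPTOperator.sql:

```sql
-- Sketch only: EXCEPT returns the store/day combinations that have
-- no matching sales row. Tables are those created in Listings 2 and 3.
DECLARE @BofWeek datetime = '2009-10-01 00:00:00';

-- All possible store/day combinations for the week...
SELECT s.Store_ID,
       CONVERT(varchar(35), dateadd(DD, n.d, @BofWeek), 101) AS Day_of_Week
FROM dbo.MyStoresTable AS s
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6)) AS n(d)

EXCEPT

-- ...minus the combinations that actually had transactions.
SELECT Store_ID, CONVERT(varchar(35), TransactionDate, 101)
FROM dbo.MySalesTable2;
```

Because EXCEPT is set based, the optimizer is free to pick an efficient anti-join instead of the LEFT JOIN ... IS NULL pattern, which is consistent with the performance the author reports.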
SP_WhoIsActive
Get detailed information about the sessions running on your SQL Server system

To say I like SP_WhoIsActive is an understatement. This is probably the most useful and effective stored procedure I've ever encountered for activity monitoring. The purpose of the SP_WhoIsActive stored procedure is to give DBAs and developers as much performance and workload data about SQL Server's internal workings as possible, while retaining both flexibility and security. It was written by Boston-area consultant and writer Adam Machanic, who is also a long-time SQL Server MVP, a founder of SQLBlog.com, and one of the elite individuals qualified to teach the Microsoft Certified Master classes.

Adam, who has exhaustive knowledge of SQL Server internals, knew that he could get more detailed information about SQL Server performance than what was offered natively through default stored procedures, such as SP_WHO2 and SP_LOCK, and SQL Server Management Studio (SSMS). Therefore, he wrote the SP_WhoIsActive stored procedure to quickly retrieve information about users' sessions and activities. Let's look at SP_WhoIsActive's most important features.

Key Parameters
SP_WhoIsActive does almost everything you'd expect from an activity-monitoring stored procedure, such as displaying active SPIDs, transactions, locking, and blocking, but it also does a variety of things that you aren't typically able to do unless you buy a commercial activity-monitoring solution. One key feature of the script is flexibility, so you can enable or disable (or even specify different levels of information for) any of the following parameters:
• Online help is available by setting the parameter @help = 1, which enables the procedure to return commentary and details regarding all of the input parameters and output column names.
• Aggregated wait stats, showing the number of each kind of wait and the minimum, maximum, and average wait times, are controlled using the @get_task_info parameter with input values of 0 (don't collect), the default of 1 (lightweight collection mode), and 2 (collect all current waits, with the minimum, maximum, and average wait times).
• Query text is available that includes the statements that are currently running, and you can optionally include the outer batch by setting @get_outer_command = 1. In addition, SP_WhoIsActive can pull the execution plan for the active session statement using the @get_plans parameter.
• Deltas of numeric values between the last run and the current run of the script can be assigned using the @delta_interval = N (where N is seconds) parameter.
• Filtered results are available on session, login, database, host, and other columns using simple wildcards similar to the LIKE clause. You can filter to include or exclude values, as well as exclude sleeping SPIDs and system SPIDs so that you can focus on user sessions.
• Transaction details, such as how many transaction log entries have been written for each database, are governed by the @get_transaction_info parameter.
• Blocks and locks are easily revealed using parameters such as @find_block_leaders, which, when combined with sorting by the [blocked_session_count] column, puts the lead blocking sessions at top. Locks are similarly revealed by setting the @get_locks parameter.
• Long-term data collection is facilitated via a set of features designed for data collection, such as defining a schema for output or a destination table to hold the collected data.

SP_WhoIsActive is the epitome of good T-SQL coding practices. I encourage you to spend a little time perusing the code. You'll note, from beginning to end, the strong internal documentation, intuitive and readable naming of variables, and help-style comments describing all parameters and output columns. The procedure is completely safe against SQL injection attacks as well, since it parses input parameter values to a list of allowable and validated values.

System Requirements
Adam releases new versions of the stored procedure at regular intervals at http://tinyurl.com/WhoIsActive. SP_WhoIsActive requires SQL Server 2005 SP1 or later. Users of the stored procedure need VIEW SERVER STATE permissions, which can be granted via a certificate to minimize security issues.
InstantDoc ID 125107

Kevin Kline (kevin.kline@quest.com) is the director of technology for SQL Server Solutions at Quest Software and a founding board member of the international PASS. He is the author of SQL in a Nutshell, 3rd edition (O'Reilly).

Editor's Note: We want to hear your feedback on the Tool Time discussion forum at sqlforums.windowsitpro.com/web/forum/categories.aspx?catid=169&entercat=y.

SP_WhoIsActive
BENEFITS: It provides detailed information about all of the sessions running on your SQL Server system, including what they're doing and how they're impacting server behavior.
SYSTEM REQUIREMENTS: SQL Server 2005 SP1 and later; users need VIEW SERVER STATE permissions.
HOW TO GET IT: You can download SP_WhoIsActive from sqlblog.com/tags/Who+is+Active/default.aspx.
Kimberly & Paul's new blog, Kimberly & Paul: SQL Server Questions Answered, is on the SQL Mag website at www.sqlmag.com.

I've learned that my clustering key (i.e., the columns on which I defined my clustered index) should be unique, narrow, static, and ever-increasing. However, my clustering key is on a GUID. Although a GUID is unique, static, and relatively narrow, I'd like to change my clustering key, and therefore change my clustered index definition. How can I change the definition of a clustered index?

This question is much more complex than it seems, and the process you follow is going to depend on whether the clustered index is enforcing a primary key constraint. In SQL Server 2000, the DROP_EXISTING clause was added to let you change the definition of the clustered index without causing all the nonclustered indexes to be rebuilt twice. The first rebuild is because when you drop a clustered index, the table reverts to being a heap, so all the lookup references in the nonclustered indexes must be changed from the clustering key to the row identifier (RID), as I described in the answer to the previous question. The second nonclustered index rebuild is because when
June 2010 SQL Server Magazine • www.sqlmag.com
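The DROP_EXISTING clause mentioned above can be sketched like this; the table, index, and column names are my placeholders, not from the question:

```sql
-- Hedged sketch: recreate a clustered index with a new definition in a
-- single operation, avoiding the double nonclustered-index rebuild that
-- a separate DROP INDEX + CREATE INDEX would cause. Names are placeholders.
CREATE UNIQUE CLUSTERED INDEX CIX_MyTable
ON dbo.MyTable (NewKeyColumn)
WITH (DROP_EXISTING = ON);
```

Note that this path applies when the clustered index isn't enforcing a primary key constraint; the constraint case is exactly what the rest of the answer works through.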
SQL SERVER QUESTIONS ANSWERED

2. Disable any foreign key constraints. This is where you want to be careful if there are users using the database. In addition, this is also where you might want to use the following query to change the database to be restricted to only DBO use:

ALTER DATABASE DatabaseName
SET RESTRICTED_USER
WITH ROLLBACK AFTER 5

The ROLLBACK AFTER n clause at the end of the ALTER DATABASE statement lets you terminate user connections and put the database into a restricted state for modifications. As for automating the disabling of foreign key constraints, I leveraged some of the code from sp_fkeys and significantly altered it to generate the DISABLE command (similarly to how we did this in step 1 for disabling nonclustered indexes), which Listing 2 shows. Use the DISABLE_STATEMENT column to disable the foreign key constraints, and keep the remaining information handy because you'll need it to reenable and recheck the data, as well as verify the foreign key constraints after you've recreated the primary key as a unique nonclustered index.

LISTING 2: Code to Generate the DISABLE Command

SELECT
  DISABLE_STATEMENT =
      N'ALTER TABLE '
    + QUOTENAME(convert(sysname, schema_name(o2.schema_id)), N']')
    + N'.'
    + QUOTENAME(convert(sysname, o2.name), N']')
    + N' NOCHECK CONSTRAINT '
    + QUOTENAME(convert(sysname, object_name(f.object_id)), N']')
, ENABLE_STATEMENT =
      N'ALTER TABLE '
    + QUOTENAME(convert(sysname, schema_name(o2.schema_id)), N']')
    + N'.'
    + QUOTENAME(convert(sysname, o2.name), N']')
    + N' WITH CHECK CHECK CONSTRAINT '
    + QUOTENAME(convert(sysname, object_name(f.object_id)), N']')
, RECHECK_CONSTRAINT =
      N'SELECT OBJECTPROPERTY (OBJECT_ID('
    + QUOTENAME(convert(sysname, object_name(f.object_id)), N'''')
    + N'), ''CnstIsNotTrusted'')'
FROM
  sys.objects AS o1,
  sys.objects AS o2,
  sys.columns AS c1,
  sys.columns AS c2,
  sys.foreign_keys AS f
  INNER JOIN sys.foreign_key_columns AS k
    ON (k.constraint_object_id = f.object_id)
  INNER JOIN sys.indexes AS i
    ON (f.referenced_object_id = i.object_id
      AND f.key_index_id = i.index_id)
WHERE
  o1.[object_id] = object_id('tablename')
  AND i.name = 'Primary key Name'
  AND o1.[object_id] = f.referenced_object_id
  AND o2.[object_id] = f.parent_object_id
  AND c1.[object_id] = f.referenced_object_id
  AND c2.[object_id] = f.parent_object_id
  AND c1.column_id = k.referenced_column_id
  AND c2.column_id = k.parent_column_id
ORDER BY 1, 2, 3

3. Drop the constraint-based clustered index using the following query:
SQL Server 2008 R2 New Features
What you need to know about the new BI and relational database functionality
Michael Otey (motey@sqlmag.com) is technical director for Windows IT Pro and SQL Server Magazine and author of Microsoft SQL Server 2008 New Features (Osborne/McGraw-Hill).

SQL Server 2008 R2 is Microsoft's latest release of its enterprise relational database and business intelligence (BI) platform, and it builds on the base of functionality established by SQL Server 2008. However, despite the R2 moniker, Microsoft has added an extensive set of new features to SQL Server 2008 R2. Although the new support for self-service BI and PowerPivot has gotten the lion's share of attention, SQL Server 2008 R2 includes several other important enhancements. In this article, we'll look at the most important new features in SQL Server 2008 R2.

New Editions
Some of the biggest changes with the R2 release of SQL Server 2008 are the new editions that Microsoft has added to the SQL Server lineup. SQL Server 2008 R2 Datacenter Edition has been added to the top of the relational database product lineup and brings the SQL Server product editions in-line with the Windows Server product editions, including its Datacenter Edition. SQL Server 2008 R2 Datacenter Edition provides support for systems with up to 256 processor cores. In addition, it offers multiserver management and a new event-processing technology called StreamInsight. (I'll cover multiserver management and StreamInsight in more detail later in this article.)

The other new edition of SQL Server 2008 R2 is the Parallel Data Warehouse Edition. The Parallel Data Warehouse Edition, formerly code-named Madison, is a different animal than the other editions of SQL Server 2008 R2. It's designed as a Plug and Play solution for large data warehouses. It's a combination hardware and software solution that's available only through select OEMs such as HP, Dell, and IBM. OEMs supply and preconfigure all the hardware, including the storage to support the data warehouse functionality. The Parallel Data Warehouse Edition uses a shared-nothing Massively Parallel Processing (MPP) architecture to support data warehouses from 10TB to hundreds of terabytes in size. As more scalability is required, additional compute and storage nodes can be added. As you would expect, the Parallel Data Warehouse Edition is integrated with SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), and SQL Server Reporting Services (SSRS). For more in-depth information about the SQL Server 2008 R2 Parallel Data Warehouse Edition, see "Getting Started with Parallel Data Warehouse," page 39, InstantDoc ID 125098.

The SQL Server 2008 R2 lineup includes
• SQL Server 2008 R2 Parallel Data Warehouse Edition
• SQL Server 2008 R2 Datacenter Edition
• SQL Server 2008 R2 Enterprise Edition
• SQL Server 2008 R2 Developer Edition
• SQL Server 2008 R2 Standard Edition
• SQL Server 2008 R2 Web Edition
• SQL Server 2008 R2 Workgroup Edition
• SQL Server 2008 R2 Express Edition (Free)
• SQL Server 2008 Compact Edition (Free)

More detailed information about the SQL Server 2008 R2 editions, their pricing, and the features that they support can be found in Table 1. SQL Server 2008 R2 supports upgrading from SQL Server 2000 and later.
Multiserver Management
Some of the most important additions to SQL
Server 2008 R2 on the relational database side
are the new multiserver management capabili-
ties. Prior to SQL Server 2008 R2, the multi-
server management capabilities in SQL Server
were limited. Sure, you could add multiple serv-
ers to SQL Server Management Studio (SSMS),
but there was no good way to perform similar
tasks on multiple servers or to manage multiple
servers as a group. SQL Server 2008 R2 includes
a new Utility Explorer, which is part of SSMS,
to meet this need. The Utility Explorer lets you
create a SQL Server Utility Control Point where
you can enlist multiple SQL Server instances to
be managed, as shown in Figure 2. The Utility
Explorer can manage as many as 25 SQL Server
instances.
The Utility Explorer displays consolidated performance, capacity, and asset information for all the registered servers. However, only SQL Server 2008 R2 instances can be managed with the initial release; support for earlier SQL Server versions is expected to be added with the first service pack. Note that multiserver management is available only in SQL Server 2008 R2 Enterprise Edition and Datacenter Edition. You can find out more about multiserver management at www.microsoft.com/sqlserver/2008/en/us/R2-multi-server.aspx.

Figure 1: Creating a PowerPivot chart and PowerPivot table for data analysis
StreamInsight
StreamInsight is a near real-
time event monitoring and pro-
cessing framework. It’s designed
to process thousands of events
per second, selecting and writ-
ing out pertinent data to a SQL
Server database. This type of
high-volume event processing
is designed to process manufac-
turing data, medical data, stock
exchange data, or other process-
control types of data streams
where your organization wants to capture real-time data for data mining or reporting.

Figure 3: The Report Designer 3.0 design surface

StreamInsight is a programming framework and doesn't have a graphical interface. It's available only in SQL Server 2008 R2 Datacenter Edition. You can read more about SQL Server 2008 R2's StreamInsight technology at www.microsoft.com/sqlserver/2008/en/us/R2-complex-event.aspx.

Report Builder 3.0
Not all businesses are diving into the analytical side of BI, but almost everyone has jumped onto the SSRS train. With SQL Server 2008 R2, Microsoft has released a new update to the Report Builder portion of SSRS. Report Builder 3.0 (shown in Figure 3) offers several improvements. Like Report Builder 2.0, it sports the Office Ribbon interface. You can integrate geospatial data into your reports using the new Map Wizard, and Report Builder 3.0 includes support for adding sparklines and data bars to your reports. In addition, you can create Shared Datasets and Report Parts, which are reusable report items stored on the server, so that queries can be reused in multiple reports. You can then incorporate these Shared Datasets and Report Parts in the other reports that you create.

Other Important Enhancements
Although SQL Server 2008 R2 had a short two-year development cycle, it includes too many new features to list in a single article. The following are some other notable enhancements included in SQL Server 2008 R2:
• The installation of slipstream media containing current hotfixes and updates
• The ability to create hot standby servers with database mirroring
• The ability to connect to and manage SQL Azure instances
• The addition of SSRS support for SharePoint zones
• The ability to create Report Parts that can be shared between multiple reports
• The addition of backup compression to the Standard Edition

You can learn more about the new features in SQL Server 2008 R2 at msdn.microsoft.com/en-us/library/bb500435(SQL.105).aspx.

To R2 or Not to R2?
SQL Server 2008 R2 includes a tremendous amount of new functionality for an R2 release. Although the bulk of the new features, such as PowerPivot and the Parallel Data Warehouse, are BI oriented, there are also several significant new relational database enhancements, including multiserver management and Master Data Services. However, it remains to be seen how quickly businesses will adopt SQL Server 2008 R2. All current Software Assurance (SA) customers are eligible for the new release at no additional cost, but other customers will need to evaluate if the new features make the upgrade price worthwhile. Perhaps more important than price are the resource demands needed to roll out new releases of core infrastructure servers such as SQL Server.

That said, PowerPivot and self-service BI are potentially game changers, especially for organizations that have existing BI infrastructures. The value these features bring to organizations heavily invested in BI makes SQL Server 2008 R2 a must-have upgrade.
InstantDoc ID 125003
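One of the enhancements listed above, backup compression, is exposed through a simple WITH option on the BACKUP statement. The database name and file path below are my placeholders:

```sql
-- Illustrative backup using the compression option noted above.
-- Database name and destination path are placeholders.
BACKUP DATABASE AdventureWorks
TO DISK = N'C:\Backups\AdventureWorks.bak'
WITH COMPRESSION, STATS = 10;  -- STATS prints progress every 10 percent
```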
Descending Indexes
Index ordering, parallelism, and ranking calculations
C
ertain aspects of SQL Server index B-trees of SQL
QL SServer 2008 SP1 C
Cumulative
l i U Update
d 6)6)—not
and their use cases are common knowledge, because there’s a technical problem or engineering
but some aspects are less widely known difficulty with supporting the option, but simply
because they fall into special cases. In this article because it hasn’t yet floated as a customer request.
I focus on special cases related to backward index My guess is that most DBAs just aren’t aware of
ordering, and I provide guidelines and recommenda- this behavior and therefore haven’t thought to ask
tions regarding when to use descending indexes. All for it. Although performing a backward scan gives
my examples use a table called Orders that resides you the benefit of relying on index ordering and
in a database called Performance. Run the code in therefore avoiding expensive sorting or hashing, the
Itzik Ben-Gan
Listing 1 to create the sample database and table and query plan can’t benefit from parallelism. If you (Itzik@SolidQ.com) is a mentor with Solid
Quality Mentors. He teaches, lectures, and
populate it with sample data. Note that the code in find a case in which parallelism is important, you
consults internationally. He’s a SQL Server MVP
Listing 1 is a subset of the source code I prepared for need to arrange an index that allows an ordered and is the author of several books about
my book Inside Microsoft SQL Server 2008: T-SQL forward scan. T-SQL, including Microsoft SQL Server 2008:
Queryingg (Microsoft Press, 2009), Chapter 4, Query Consider the following query as an example: T-SQL Fundamentalss (Microsoft Press).
Tuning. If you have the book and already created the
Performance database in your system, you don’t need USE Performance;
to run the code in Listing 1.
One of the widely understood aspects of SQL
Server indexes is that the leaf level of an index enforces
SELECT *
FROM dbo.Orders
MORE on the WEB
M
Download the listing at
bidirectional ordering through a doubly-linked list. WHERE orderid <= 100000 InstantDoc ID 125090.
This means that in operations that can potentially rely ORDER BY orderdate;
on index ordering—for example, filtering (seek plus
partial ordered scan), grouping (stream aggregate), There’s a clustered index defined on the table with
presentation ordering (ORDER BY)—the index can orderdate ascending as the key. The table has
be scanned either in an ordered forward or ordered 1,000,000 rows, and the number of qualifying rows
backward fashion. So, for example, if you have a in the query is 100,000. My laptop has eight logical
query with ORDER BY col1 DESC, col2 DESC, SQL Server can rely on index ordering both when you create the index on a key list with ascending ordering (col1, col2) and with the exact reverse ordering (col1 DESC, col2 DESC).

So when do you need to use the DESC index key option? Ask SQL Server practitioners this question, and most of them will tell you that the use case is when there are at least two columns with opposite ordering requirements. For example, to support ORDER BY col1, col2 DESC, there's no escape from defining one of the keys in descending order—either (col1, col2 DESC) or the exact reverse order (col1 DESC, col2). Although this is true, there's more to the use of descending indexes than what's commonly known.

Index Ordering and Parallelism
As it turns out, SQL Server's storage engine isn't coded to handle parallel backward index scans (as

CPUs. Figure 1 shows the graphical query plan for this query. Here's the textual plan:

|--Parallelism(Gather Streams, ORDER BY:([orderdate] ASC))
     |--Clustered Index Scan(OBJECT:([idx_cl_od]),
          WHERE:([orderid]<=(100000)) ORDERED FORWARD)

Figure 1: Parallel query plan

As you can see, a parallel query plan was used. Now try the same query with descending ordering:

SELECT *
FROM dbo.Orders
WHERE orderid <= 100000
ORDER BY orderdate DESC;

Figure 2 shows the graphical query plan for this query. Here's the textual plan:

|--Clustered Index Scan(OBJECT:([idx_cl_od]),
     WHERE:([orderid]<=(100000)) ORDERED BACKWARD)

Figure 2: Serial query plan

Note that although an ordered scan of the index was used, the plan is serial because the scan ordering is backward. If you want to allow parallelism, the index must be scanned in an ordered forward fashion. So in this case, the orderdate column must be defined with DESC ordering in the index key list.

This reminds me that when descending indexes were introduced in SQL Server 2000 RTM, my friend Wayne Snyder discovered an interesting bug. Suppose you had a descending clustered index on the Orders table and issued the following query:

Wayne discovered this bug and reported it, and it was fixed in SQL Server 2000 SP1.

Index Ordering and Ranking Calculations
Back to cases in which descending indexes are relevant, it appears that ranking calculations—particularly ones that have a PARTITION BY clause—need to perform an ordered forward scan of the index in order to avoid the need to sort the data. Again, this is the case only when the calculation is partitioned. When the calculation isn't partitioned, both a forward and backward scan can be utilized. Consider the following example:
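The example itself falls outside this excerpt. As a hedged sketch only, a nonpartitioned ranking calculation in the spirit of the surrounding text (reusing the article's dbo.Orders columns) might look like this:

```sql
-- Hypothetical sketch of a nonpartitioned ranking calculation.
-- With no PARTITION BY clause, the ordering requirement can be
-- satisfied by scanning an index on (orderdate, orderid) either
-- forward or backward, so no DESC keys are strictly needed.
SELECT
  ROW_NUMBER() OVER(ORDER BY orderdate DESC,
                             orderid DESC) AS RowNum,
  orderid, orderdate, custid, filler
FROM dbo.Orders;
```

The query shape is an assumption based on the index definitions discussed later; the article's actual example may differ.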
June 2010 • SQL Server Magazine • www.sqlmag.com
DESCENDING INDEXES FEATURE

Figure 3: Query plan with sort for nonpartitioned ranking calculation

|--Sequence Project(DEFINE:([Expr1004]=row_number))
     |--Segment
          |--Parallelism(Gather Streams, ORDER BY:([orderdate] DESC, [orderid] DESC))
               |--Sort(ORDER BY:([orderdate] DESC, [orderid] DESC))
                    |--Clustered Index Scan(OBJECT:([idx_cl_od]))

Indexing guidelines for queries with nonpartitioned ranking calculations are to have the ranking ordering columns in the index key list, either in specified order or exactly reversed, plus include the rest of the columns from the query in the INCLUDE clause for coverage purposes. With this in mind, to support the previous query you can define the index with all the keys in ascending order, like so:

CREATE UNIQUE INDEX idx_od_oid_i_cid_filler
  ON dbo.Orders(orderdate, orderid)
  INCLUDE(custid, filler);

Rerun the query, and observe in the query execution plan in Figure 4 that the index was scanned in an ordered backward fashion. Here's the textual form of the plan:

|--Sequence Project(DEFINE:([Expr1004]=row_number))
     |--Segment
          |--Index Scan(OBJECT:([idx_od_oid_i_cid_filler]), ORDERED BACKWARD)

However, when partitioning is involved in the ranking calculation, it appears that SQL Server is strict about the ordering requirement—it must match the ordering in the expression. For example, consider the following query:

SELECT
  ROW_NUMBER() OVER(PARTITION BY custid
                    ORDER BY orderdate DESC,
                             orderid DESC) AS RowNum,
  orderid, orderdate, custid, filler
FROM dbo.Orders;

When partitioning is involved, the indexing guidelines are to put the partitioning columns first in the key list, and the rest is the same as the guidelines for nonpartitioned calculations. Now try to create an index following these guidelines, but have the ordering columns appear in ascending order in the key list:

CREATE UNIQUE INDEX idx_cid_od_oid_i_filler
  ON dbo.Orders(custid, orderdate, orderid)
  INCLUDE(filler);

Observe in the query execution plan in Figure 5 that the optimizer didn't rely on index ordering but instead sorted the data. Here's the textual form of the plan:

Figure 5: Query plan with sort for partitioned ranking calculation

|--Parallelism(Gather Streams)
     |--Index Insert(OBJECT:([idx_cid_od_oid_i_filler]))
          |--Sort(ORDER BY:([custid] ASC, [orderdate] ASC, [orderid] ASC) PARTITION ID:([custid]))
               |--Index Scan(OBJECT:([idx_od_oid_i_cid_filler]))

CREATE UNIQUE INDEX idx_cid_odD_oidD_i_filler
  ON dbo.Orders(custid, orderdate DESC, orderid DESC)
  INCLUDE(filler);

Examine the query execution plan in Figure 6, and observe that the index was scanned in an ordered forward fashion and a sort was avoided. Here's the textual plan:

|--Sequence Project(DEFINE:([Expr1004]=row_number))
     |--Segment
          |--Index Scan(OBJECT:([idx_cid_odD_oidD_i_filler]), ORDERED FORWARD)

When you're done, run the following code for cleanup:

DROP INDEX dbo.Orders.idx_od_oid_i_cid_filler;
DROP INDEX dbo.Orders.idx_cid_od_oid_i_filler;
DROP INDEX dbo.Orders.idx_cid_odD_oidD_i_filler;

One More Time
In this article I covered the usefulness of descending indexes. I described the cases in which index ordering can be relied on in both forward and backward linked-list order, as opposed to cases that support only forward direction. I explained that partitioned ranking calculations can benefit from index ordering only when an ordered forward scan is used, and therefore to benefit from index ordering you need to create an index in which the key column ordering matches that of the ORDER BY elements in the ranking calculation. I also explained that even when backward scans in an index are supported, using them prevents parallelism; so even in those cases there might be benefit in arranging an index that matches the ordering requirements exactly rather than in reverse.

InstantDoc ID 125090
FEATURE

Troubleshooting Transactional Replication
3 common transactional replication problems solved

Kendal Van Dyke (kendal.vandyke@gmail.com) is a senior DBA in Celebration, FL. He has worked with SQL Server for more than 10 years and managed high-volume replication topologies for nearly seven years. He blogs at kendalvandyke.blogspot.com.

Transactional replication is a useful way to keep schema and data for specific objects synchronized across multiple SQL Server databases. Replication can be used in simple scenarios involving a few servers or can be scaled up to complex, multi-datacenter distributed environments. However, no matter the size or complexity of your topology, the number of moving parts involved with replication means that occasionally problems will occur that require a DBA's intervention to correct.

In this article, I'll show you how to use SQL Server's native tools to monitor replication performance, receive notification when problems occur, and diagnose the cause of those problems. Additionally, I'll look at three common transactional replication problems and explain how to fix them.

shows the name, current status, and number of Subscribers for each publication on the Publisher; Subscription Watch List, which shows the status and estimated latency (i.e., time to deliver pending commands) of all Subscriptions to the Publisher; and Agents, which shows the last start time and current status of the Snapshot, Log Reader, and Queue Reader agents, as well as various automated maintenance jobs created by SQL Server to keep replication healthy.

Expanding a Publisher node in the treeview shows its publications. Selecting a publication displays four tabbed views in the right pane: All Subscriptions, which shows the current status and estimated latency of the Distribution Agent for each Subscription; Tracer Tokens, which shows the status of recent tracer tokens for the publication (I'll discuss tracer tokens in more
will display a context menu with options that include stopping and starting the agent, viewing the agent's profile, and viewing the agent's job properties. Double-clicking an agent will open a new window that shows specific details about the agent's status.

Distribution Agent windows have three tabs: Publisher to Distributor History, which shows the status and recent history of the Log Reader agent for the publication; Distributor to Subscriber History, which shows the status and recent history of the Distribution Agent; and Undistributed Commands, which shows the number of commands at the distribution database waiting to be applied to the Subscriber and an estimate of how long it will take to apply them. Log Reader and Snapshot Reader agent windows show only an Agent History tab, which displays the status and recent history of that agent.

When a problem occurs with replication, such as when a Distribution Agent fails, the icons for the Publisher, Publication, and agent will change depending on the type of problem. Icons overlaid by a red circle with an X indicate an agent has failed, a white circle with a circular arrow indicates an agent is retrying a command, and a yellow caution symbol indicates a warning. Identifying the problematic agent is simply a matter of expanding in the treeview the Publishers and Publications that are alerting to a condition, selecting the tabs in the right pane for the agent(s) with a problem, and double-clicking the agent to view its status and information about the error.

Measuring the Flow of Data
Understanding how long it takes for data to move through each step is especially useful when troubleshooting latency issues and will let you focus your attention on the specific segment that's problematic. Tracer tokens were added in SQL Server 2005 to measure the flow of data and actual latency from a Publisher all the way through to Subscribers (the latency values shown for agents in Replication Monitor are estimated). Creating a tracer token writes a special marker to the transaction log of the Publication database that's read by the Log Reader agent, written to the distribution database, and sent through to all Subscribers. The time it takes for the token to move through each step is saved in the Distribution database.

Tracer tokens can be used only if both the Publisher and Distributor are on SQL Server 2005 or later. Subscriber statistics will be collected for push subscriptions if the Subscriber is running SQL Server 7.0 or later and for pull subscriptions if the Subscriber is running SQL Server 2005 or higher. For Subscribers that don't meet these criteria (non–SQL Server Subscribers, for example), statistics for tracer tokens will still be gathered from the Publisher and Distributor. To add a tracer token you must be a member of the sysadmin fixed server role or db_owner fixed database role on the Publisher.

To add a new tracer token or view the status of existing tracer tokens, navigate to the Tracer Tokens tab in Replication Monitor. Figure 2 shows an example of the Tracer Tokens tab showing latency details for a previously inserted token. To add a new token, click Insert Tracer. Details for existing tokens can be viewed by selecting from the drop-down list on the right.

Know When There Are Problems
Although Replication Monitor is useful for viewing replication health, it's not likely (or even reasonable) that you'll keep it open all the time waiting for an error to occur. After all, as a busy DBA you have more to do than watch a screen all day, and at some point you have to leave your desk.

However, SQL Server can be configured to raise alerts when specific replication problems occur. When a Distributor is initially set up, a default group of alerts for replication-related events is created. To view the list of alerts, open SSMS and make a connection to the Distributor in Object Explorer, then expand the SQL Server Agent and Alerts nodes in the treeview. To view or configure an alert, open the Alert properties window by double-clicking the alert, or right-click the alert and choose the Properties option from the context menu. Alternatively, alerts can be configured in Replication Monitor by selecting a Publication in the left pane, viewing the Warnings tab in the right pane, and clicking the Configure Alerts button. The options the alert properties window offers for response actions, notification, etc., are the same as for a SQL Server Agent job alert. Figure 3 shows the Warnings tab in Replication Monitor.

There are three alerts that are of specific interest for transactional replication: Replication: Agent failure, Replication: Agent retry, and Replication Warning:

Figure 2: Tracer Tokens tab showing latency details for a token
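The tracer-token workflow described above can also be driven from T-SQL rather than Replication Monitor. Here's a hedged sketch using the replication stored procedures; the publication name is a placeholder, and the first statement runs in the publication database on the Publisher:

```sql
-- Post a tracer token to the publication (placeholder name) and
-- capture its ID; latency history accumulates in the distribution
-- database as the token flows through the Log Reader and
-- Distribution agents.
DECLARE @tokenID int;

EXEC sys.sp_posttracertoken
  @publication = N'MyPublication',   -- placeholder
  @tracer_token_id = @tokenID OUTPUT;

-- After allowing time for the token to propagate, report
-- Publisher-to-Distributor and Distributor-to-Subscriber latency.
EXEC sys.sp_helptracertokenhistory
  @publication = N'MyPublication',   -- placeholder
  @tracer_id = @tokenID;
```

This is convenient for scheduled latency checks, since the same calls can run from a SQL Server Agent job.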
part of a batch wrapped by a transaction) and uses @@rowcount to verify that only one row was affected. The primary key is used to check for which row needs to be inserted, updated, or deleted; for inserts, if a row with the primary key already exists at the Subscriber, the command will fail because of a primary key constraint violation. For updates or deletes, if no matching primary key exists, @@rowcount returns 0 and an error will be raised that causes the Distribution Agent to fail.

Solution: If you don't care which command is failing, you can simply change the Distribution Agent's profile to ignore the errors. To change the profile, navigate to the Publication in Replication Monitor, right-click the problematic Subscriber in the All Subscriptions tab, and choose the Agent Profile menu option. A new window will open that lets you change the selected agent profile; select the check box for the Continue on data consistency errors profile, and then click OK. Figure 4 shows an example of the Agent Profile window with this profile selected. The Distribution Agent needs to be restarted for the new profile to take effect; to do so, right-click the Subscriber and choose the Stop Synchronizing menu option. When the Subscriber's status changes from Running to Not Running, right-click the Subscriber again and select the Start Synchronizing menu option.

This profile is a system-created profile that will skip three specific errors: inserting a row with a duplicate key, constraint violations, and rows missing from the Subscriber. If any of these errors occur while using this profile, the Distribution Agent will move on to the next command rather than failing. When choosing this profile, be aware that the data on the Subscriber is likely to become out of sync with the Publisher.

If you want to know the specific command that's failing, the sp_browsereplcmds stored procedure can be executed at the Distributor. Three parameters are required: an ID for the Publisher database, a transaction sequence number, and a command ID. To get the Publisher database ID, execute the code in Listing 1 on your Distributor (filling in the appropriate values for Publisher, Subscriber, and Publication). To get the transaction sequence number and command ID, navigate to the failing agent in Replication Monitor, open its status window, select the Distributor to Subscriber History tab, and select the most recent session with an Error status. The transaction sequence number and command ID are contained in the error details message. Figure 5 shows an example of an error message containing these two values.

Finally, execute the code in Listing 2 using the values you just retrieved to show the command that's failing at the Subscriber. Once you know the command that's failing, you can make changes at the Subscriber for the command to apply successfully.

Figure 4: Continue on data consistency errors profile selected in the Distribution Agent's profile
Figure 5: An error message containing the transaction sequence number and command ID

LISTING 2: Code to Show the Command That's Failing at the Subscriber

EXECUTE distribution.dbo.sp_browsereplcmds
  @xact_seqno_start = '0x0000001900001926000800000000',
  @xact_seqno_end = '0x0000001900001926000800000000',
  @publisher_database_id = 29,
  @command_id = 1

Distribution Agent fails with the error message Could not find stored procedure 'sp_MSins_.
Cause: The Publication is configured to deliver INSERT, UPDATE, and DELETE commands using stored procedures, and the procedures have been dropped from the Subscriber. Replication stored procedures aren't considered to be system stored procedures and can be included when using schema comparison tools. If the tools are used to move changes from a non-replicated version of a Subscriber database to a replicated version (e.g., migrating schema changes from a local development environment to a test environment), the procedures could be dropped because they don't exist in the non-replicated version.
Solution: This is an easy problem to fix. In the published database on the Publisher, execute the sp_scriptpublicationcustomprocs stored procedure to generate the INSERT, UPDATE, and DELETE stored procedures for the Publication. This procedure takes only one parameter—the name of the Publication—and returns a single nvarchar(4000) column as the result set. When executing it in SSMS, make sure to output results to text (press Ctrl+T, or choose Query, Results To, Results to Text) and that the maximum number of characters for results to text is set to at least 8,000. You can set this value by selecting Tools, Options, Query Results, Results to Text, Maximum number of characters displayed in each column. After executing the stored procedure, copy the scripts that were generated into a new query window and execute them in the subscribed database on the Subscriber.

Distribution Agents won't start or don't appear to do anything.
Cause: This typically happens when a large number of Distribution Agents are running on the same server at the same time; for example, on a Distributor that handles more than 50 Publications or Subscriptions. Distribution Agents are independent executables that run outside of the SQL Server process in a non-interactive fashion (i.e., no GUI). Windows Server uses a special area of memory called the non-interactive desktop heap to run these kinds of processes. If Windows runs out of available memory in this heap, Distribution Agents won't be able to start.

Solution: Fixing the problem involves making a registry change to increase the size of the non-interactive desktop heap on the server experiencing the problem (usually the Distributor) and rebooting. However, it's important to note that modifying the registry can result in serious problems if it isn't done correctly. Be sure to perform the following steps carefully and back up the registry before you modify it:
1. Start the Registry Editor by typing regedt32.exe in a run dialog box or command prompt.
2. Navigate to the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems key in the left pane.
3. In the right pane, double-click the Windows value to open the Edit String dialog box.
4. Locate the SharedSection parameter in the Value data input box. It has three values separated by commas, and should look like the following:

SharedSection=1024,3072,512

Increasing the third value (i.e., making it a value of 768 or 1,024) should be sufficient to resolve the issue. Click OK after modifying the value. Rebooting will ensure that the new value is used by Windows. For more information about the non-interactive desktop heap, see "Unexpected behavior occurs when you run many processes on a computer that is running SQL Server" (support.microsoft.com/kb/824422).

Monitoring Your Replication Environment
When used together, Replication Monitor, tracer tokens, and alerts are a solid way for you to monitor your replication topology and understand the source of problems when they occur. Although the techniques outlined here offer guidance about how to resolve some of the more common issues that occur with transactional replication, there simply isn't enough room to cover all the known problems in one article. For more tips about troubleshooting replication problems, visit the Microsoft SQL Server Replication Support Team's REPLTalk blog at blogs.msdn.com/repltalk.

InstantDoc ID 104703
Maximizing Report Performance with Parameter-Driven Expressions
Speed up your reports

William Vaughn (billva@betav.com) is an expert on Visual Studio, SQL Server, Reporting Services, and data access interfaces. He's coauthor of the Hitchhiker's Guide series, including Hitchhiker's Guide to Visual Studio and SQL Server, 7th ed. (Addison-Wesley).

MORE on the WEB: Download the example and read about more optimization techniques at InstantDoc ID 125092.

Only static reports saved as pre-rendered images—snapshot reports—can be loaded and displayed (almost) instantly, so users are accustomed to some delay when they ask for reports that reflect current data. Some reports, however, can take much longer to generate than others. Complex or in-depth reports can take many hours to produce, even on powerful systems, while others can be built and rendered in a few seconds. Parameter-driven expressions, a technique that I expect is new to many of you, can help you greatly in speeding up your reports.

Visit the online version of this article at www.sqlmag.com, InstantDoc ID 125092, to download the example I created against the AdventureWorks2008 database. If you're interested in a more general look at improving your report performance, see the web-exclusive sidebar "Optimizing SSRS Operations," which offers more strategies for creating reports that perform well. In that sidebar, I walk you through the steps your system goes through when it creates a report. I also share strategies, such as using a stored procedure to return the rowsets used in a report so that the SQL Server query optimizer can reuse a cached query plan, eliminating the need to recompile the procedure.

The concepts I'll discuss here aren't dependent on any particular version of SQL Server Reporting Services (SSRS), but I'll be using the 2008 version for the examples. Once you've installed the AdventureWorks2008 database, you'll start Visual Studio (VS) 2008 and load the ClientSide Filtering.sln project. (This technique will work with VS 2005 business intelligence—BI—projects, but I built the example report using VS 2008 and you can't load it in VS 2005 because the Report Definition Language—RDL—format is different.) Open the Shared Data Source and the Project Properties to make sure the connection string points to your SSRS instance.

Figure 1: The report's Design pane

The example report captures parameters from the user to focus the view on a specific class of bicycles, such as mountain bikes. Once the user chooses a specific bike from within the class, a subreport is generated to show details, including a photograph and other computed information. By separating the photo and computed information from the base query, you can help the report processor generate the base report much more quickly.

In this example, my primary goal is to help the user focus on a specific subset of the data—in other words, to help users view only information in which they're interested. You can do this several ways, but typically you either add a parameter-driven WHERE clause to the initial query or parameter-driven filters to the report data regions. I'll do the latter in this example.

Because the initial SELECT query executed by the report processor in this example doesn't include a WHERE clause, it makes sense to capture several parameters that the
report processor can use to narrow the report's focus. (There's nothing to stop you from further refining the initial SELECT to include parameter-driven WHERE clause filtering.) I'll set up some report parameters—as opposed to query parameters—to accomplish this goal.

1. In VS's Business Intelligence Development Studio (BIDS) Report Designer, navigate to the report's Design pane, which Figure 1 shows. Note that report-centric dialog boxes such as the Report Data window only appear when focus is set to the Design pane.

2. Use the View menu to open the Report Data dialog box, which is new in the BIDS 2008 Report Designer. This box names each of the columns returned by the dataset that's referenced by (the case-sensitive) name. If you add columns to the dataset for some reason, make sure these changes are reflected in the Report Data dialog box as well as on your report. Don't expect to be able to alter the RDL (such as renaming the dataset) based on changes in the Report Data dialog box. When you rename the query or change the columns being fetched, the designer doesn't keep in sync with the RDL very well. Be prepared to open the RDL to make changes from time to time.

3. Right-click the Parameters folder in the Report Data dialog and choose Add Parameter to open the Report Parameter Properties dialog box, which Figure 2 shows. Here's where each of the parameters used in the filter expressions or the query parameters are defined. The query parameters should be generated automatically if your query references a parameter in the WHERE clause once the dataset is bound to a data region on the report (such as a Tablix report element).

4. Use the Report Parameter Properties dialog box to name and define the prompt and other attributes for each of the report parameters you'll need to filter the report. Figure 2 shows how I set the values, which are used to provide the user with a drop-down list of available Product Lines from which to choose. Note that the Value setting is un-typed—you can't specify a length or data type for these values. This can become a concern when you try to compare the supplied value with the data column values in the Filters expression I'm about to build, especially when the supplied parameter length doesn't match the length of the data value being tested.

5. I visited the Default Value tab of the Report Parameter Properties dialog box to set the parameter default to M (for Mountain Bikes). If all of your parameters have default values set, the report processor doesn't wait before rendering the report when first invoked. This can be a problem if you can't determine a viable default parameter configuration that makes sense.

6. The next step is to get the report processor to filter the report data based on the parameter value. On the Report Designer Design pane, click anywhere on the Table1 Tablix control and right-click the upper left corner of the column-chooser frame that appears. The trick is to make sure you've selected the correct element before you start searching for the properties, and you should be sure to choose the correct region when selecting a report element property page. It's easy to get them confused.

7. Navigate to the Tablix Properties menu item, as Figure 3 shows. Here, you should add one or more parameter-driven filters to focus the report's data. Start by clicking Add to create a new Filter.

8. Ignore the Dataset column drop-down list because it will only lead to frustration. Just click the fx expression button. This opens an expression editor where we're going to build a Boolean-returning expression that tests for the filter values. It's far easier to write your own expressions that

Figure 2: The Report Parameter Properties dialog box
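For contrast with the report-side filter approach, the other option mentioned earlier, a parameter-driven WHERE clause, would push the filtering into the dataset query itself. Here's a hedged sketch against AdventureWorks2008; the query and parameter name are illustrative, not the article's actual dataset:

```sql
-- Illustrative parameter-driven WHERE clause: @ProductLine would be
-- bound to the report parameter, so only the chosen product line's
-- rows ever reach the report processor.
SELECT p.ProductID, p.Name, p.ProductLine, p.ListPrice
FROM Production.Product AS p
WHERE p.ProductLine = @ProductLine;
```

The trade-off: query-side filtering re-runs the query whenever the parameter changes, whereas report-side filters can be reapplied to data that's already been retrieved.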
PARAMETER-DRIVEN EXPRESSIONS FEATURE
SSRS instance and save the report to the SSRS catalog. Deploying the report could take 30 seconds or longer the first time, as I've already discussed, so be patient.

Creating a Snapshot Report
Now that the report is deployed, you need to navigate to it with Report Manager so that you can generate a snapshot. Use Report Manager to open the deployed report's properties pages.

1. Using Internet Explorer (I haven't had much luck with Firefox), browse to your SSRS Folder.aspx page, as Figure 7 shows. Your report should appear in the report folder where you deployed it.

2. Find your report and click it. The report should render (correctly) in the browser window, and this is how your end users should see the report. Report users shouldn't be permitted to see or alter the report parameters, however—be sure to configure the report user rights before going into production. The following operations assume that you have the appropriate rights.

3. Using the Parameters tab of the report property dialog box, you can modify the default values assigned to the report, hide them, and change the prompt strings. More importantly, you'll need to set up the report to use stored login credentials before you can create a snapshot.

4. Navigate to the Data Sources tab of the report properties page. I configured the report to use a custom data source that has credentials that are securely stored in the report server. This permits the report to be run unattended if necessary (such as when you set up a schedule to regenerate the snapshot). In this case, I've created a special SQL Server login that has been granted very limited rights to execute the specific stored procedure being used by the report and query against the appropriate tables, but nothing else.

5. Next, navigate to the Execution tab of the report property dialog box. Choose Render this report from a report execution snapshot. Here's where you can define a schedule to rebuild the snapshot. This makes sense for reports that take a long time to execute, because when the report is requested, the last saved version is returned, not a new execution.

6. Check the Create a report snapshot when you click the Apply button on this page box and click Apply. This starts the report processor on the designated SSRS instance and creates a snapshot of the report. The next time the report is invoked from a browser, the data retrieved when the snapshot was built is reused—no additional queries are required to render the report. It also means that as the user changes the filter parameters, it's still not necessary to re-run the query. This can save considerable time—especially in cases where the queries require a long time to run.

Figure 6: The project's properties page
Figure 7: Browsing to the SSRS Folder.aspx page
T
his summerr Microsoft will release the SQL Q orchestration of a single
g server called the control node.
Server 2008 R2 Parallel Data Warehouse The control node accepts client query requests, then
(PDW) Edition, its first product in the creates an MPP execution plan that can call upon one
Massively Parallel Processor (MPP) data ware- or more compute nodes to execute different parts of
house space. PDW uniquely combines MPP software the query, often in parallel. The retrieved results are
acquired from DATAllegro, parallel compute nodes, sent back to the client as a single result set.
commodity servers, and disk storage. PDW lets you
scale out enterprise data warehouse solutions into the hundreds of terabytes and even petabytes of data for the most demanding customer scenarios. In addition, because the parallel compute nodes work concurrently, it often takes only seconds to get the results of queries run against tables containing trillions of rows. For many customers, the large data sets and the fast query response times against those data sets are game-changing opportunities for competitive advantage.

The simplest way to think of PDW is a layer of integrated software that logically forms an umbrella over the parallel compute nodes. Each compute node is a single physical server that runs its own instance of the SQL Server 2008 relational engine in a shared-nothing architecture. In other words, compute node 1 doesn't share CPU, memory, or storage with compute node 2.

Figure 1 shows the architecture for a PDW data rack. The smallest PDW will take up two full racks of space in a data center, and you can add storage and compute capacity to PDW one data rack at a time. A data rack contains 8 to 10 compute servers from vendors such as Bull, Dell, HP, and IBM, and Fibre Channel storage arrays from vendors such as EMC2, HP, and IBM. The sale of PDW includes preconfigured and pretested software and hardware that's tightly configured to achieve balanced throughput and I/O for very large databases. Microsoft and these hardware vendors provide full planning, implementation, and configuration support for PDW.

The collection of physical servers and disk storage arrays that make up the MPP data warehouse is often referred to as an appliance. Although the appliance is often thought of as a single box or single database server, a typical PDW appliance is comprised of dozens of physical servers and disk storage arrays all working together, often in parallel and under the

MORE on the WEB: Download the code at InstantDoc ID 125098.

Rich Johnson (richjohn@microsoft.com) is a business intelligence architect with Microsoft working in the US Public Sector Services organization. He has worked for Microsoft since 1996 as a development manager and architect of SQL Server database solutions for OLTP and data-warehousing implementations.

Taking a Closer Look
Let's dive deeper into PDW's architecture in Figure 1. As I mentioned previously, PDW has a control node that clients connect to in order to query a PDW database. The control node has an instance of the SQL Server 2008 relational engine for storing PDW metadata. It also uses this engine for storing intermediate query results in TempDB for some query types. The control node can receive intermediate query results from multiple compute nodes for a single query, store those results in SQL Server temporary tables, then merge those results into a single result set for final delivery to the client.

The control node is an active/passive cluster server. Plus, there's a spare compute node for redundancy and failover capability.

A PDW data rack contains 8 to 10 compute nodes and related storage nodes, depending on the hardware vendor. Each compute node is a physical server that runs a standalone SQL Server 2008 relational engine instance. The storage nodes are Fibre Channel-connected storage arrays containing 10 to 12 disk drives.

You can add more capacity by adding data racks. Depending on disk sizes, a data rack can contain in the neighborhood of 30TB to 40TB of useable disk space. (These numbers can grow considerably if 750GB or larger disk drives are used by the hardware vendor.) The useable disk space is all RAID 1 at the hardware level and uses SQL Server 2008 page compression. So, if your PDW appliance has 10 full data racks, you have 300TB to 400TB of useable disk space and 80 to 100 parallel compute nodes. As of this writing, each compute node is a two-socket server with each CPU having at least four cores. In our example, that's 640 to 800 CPU cores and lots of Fibre Channel
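As a sanity check on these capacity figures, the arithmetic can be sketched directly. This is a back-of-envelope calculation only; the per-rack node count, useable space, and core counts are the ranges quoted above, not fixed PDW specifications:

```python
# Back-of-envelope PDW appliance capacity math using the ranges quoted
# in the text: 8 to 10 compute nodes and roughly 30TB to 40TB of useable
# disk per data rack, and two-socket nodes with at least 4 cores per CPU.

def appliance_capacity(racks, nodes_per_rack, tb_per_rack, cores_per_node=8):
    """Scale per-rack figures up to a whole appliance."""
    nodes = racks * nodes_per_rack
    return {
        "compute_nodes": nodes,
        "useable_tb": racks * tb_per_rack,
        "cpu_cores": nodes * cores_per_node,
    }

# A 10-rack appliance, low end and high end of the quoted ranges:
low = appliance_capacity(racks=10, nodes_per_rack=8, tb_per_rack=30)
high = appliance_capacity(racks=10, nodes_per_rack=10, tb_per_rack=40)

print(low["compute_nodes"], low["useable_tb"], low["cpu_cores"])     # 80 300 640
print(high["compute_nodes"], high["useable_tb"], high["cpu_cores"])  # 100 400 800
```

The low and high ends reproduce the article's figures: 80 to 100 compute nodes, 300TB to 400TB of useable space, and 640 to 800 CPU cores.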
my_DB that has 16TB of distributed data space, 1TB of replicated table space, and 800GB of log file space on a PDW appliance with eight compute nodes:

CREATE DATABASE my_DB
WITH ( AUTOGROW = ON
  ,REPLICATED_SIZE = 1024 GB
  ,DISTRIBUTED_SIZE = 16384 GB
  ,LOG_SIZE = 800 GB )

A total of 8TB of usable disk space (1,024GB × 8 compute nodes) will be consumed by replicated tables because each compute node needs enough disk space to contain a copy of each replicated table. Two terabytes of usable disk space will be consumed by each of the 8 compute nodes (16,384GB / 8 compute nodes) for distributed tables. Each compute node will also consume 100GB of usable disk space (800GB / 8 compute nodes) for log files. As a general rule of thumb, the overall log-file space for a user database should be estimated at two times the size of the largest data file being loaded.

LISTING 1: Code that Creates a Replicated Table

CREATE TABLE DimAccount
(
  AccountKey int NOT NULL,
  ParentAccountKey int NULL,
  AccountCodeAlternateKey int NULL,
  ParentAccountCodeAlternateKey int NULL,
  AccountDescription nvarchar(50),
  AccountType nvarchar(50),
  Operator nvarchar(50),
  CustomMembers nvarchar(300),
  ValueType nvarchar(50),
  CustomMemberOptions nvarchar(200)
)
WITH (CLUSTERED INDEX(AccountKey),
  DISTRIBUTION = REPLICATE);

LISTING 2: Code that Creates a Distributed Table

CREATE TABLE FactSales
(
  StoreIDKey int NOT NULL,
  ProductKey int NOT NULL,
  DateKey int NOT NULL,
  SalesQty int NOT NULL,
  SalesAmount decimal(18,2) NOT NULL
)
WITH (CLUSTERED INDEX(DateKey),
  DISTRIBUTION = HASH(StoreIDKey));
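The space accounting above follows directly from how the three sizes are apportioned: REPLICATED_SIZE is consumed in full on every compute node, while DISTRIBUTED_SIZE and LOG_SIZE are divided across the nodes. A small sketch of that arithmetic (illustrative only, not PDW code):

```python
# Apportioning the CREATE DATABASE sizes for my_DB across 8 compute
# nodes, per the rules described in the text: each node holds a full
# copy of the replicated space; distributed and log space are divided.

def per_node_footprint(replicated_gb, distributed_gb, log_gb, nodes):
    return {
        "replicated_gb": replicated_gb,            # full copy on every node
        "distributed_gb": distributed_gb / nodes,  # evenly divided
        "log_gb": log_gb / nodes,                  # evenly divided
    }

node = per_node_footprint(replicated_gb=1024, distributed_gb=16384,
                          log_gb=800, nodes=8)

print(node["replicated_gb"] * 8)  # 8192 GB (8TB) of replicated space appliance-wide
print(node["distributed_gb"])     # 2048.0 GB (2TB) of distributed space per node
print(node["log_gb"])             # 100.0 GB of log space per node
```

The printed values match the article's accounting: 8TB total for replicated tables, 2TB of distributed space per node, and 100GB of log space per node.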
When creating a new user database, you won't be able to create file groups. PDW does this automatically during database creation because file group design is tightly configured with the storage to achieve overall performance and I/O balance across all compute nodes.

After the database is created, you use the CREATE TABLE command to create both replicated and distributed tables. PDW's CREATE TABLE command is very similar to a typical SQL Server CREATE TABLE command and even includes the ability to partition distributed tables as well as replicated tables. The most visible difference in this command on PDW is the ability to create a table as replicated or as distributed.

As a general rule of thumb, replicated tables should be 1GB or smaller in size. Listing 1 contains a sample CREATE TABLE statement that creates a replicated table named DimAccount. As you can see, the DISTRIBUTION argument is set to REPLICATE.

Generally speaking, distributed tables are used for transaction or fact tables that are often much larger than 1GB in size. In some cases, a large dimension table (for example, a 500-million-row customer account table) is a better candidate for a distributed table. Listing 2 contains code that creates a distributed table named FactSales. (You can download the code in Listing 1 and Listing 2 by going to www.sqlmag.com, entering 125098 in the InstantDoc ID text box, clicking Go, then clicking the Download the Code Here button.) As I mentioned previously, a single-attribute column must be chosen as the distribution key so that data loading can be hash distributed as evenly as possible across all the compute nodes and their distributions.

For a retailer with a large point of sale (POS) fact table and a large store-inventory fact table, a good candidate for the distribution key might be the column that contains the store ID. By hash distributing both fact tables on the store ID, you might create a fairly even distribution of the rows across all compute nodes. Also, PDW will co-locate related rows on the same distribution (i.e., rows from the POS fact table and rows from the store-inventory fact table for the same store ID). Co-located data is related, so queries that access POS data and related store inventory data should perform very well.

To take full advantage of PDW, designing databases and tables for the highest-priority queries is crucial. PDW excels at scanning and joining large distributed tables, and often queries against these large tables are mission critical. A good database design on PDW often takes a lot of trial and error. What you learned in the single-server database world isn't always the same in the MPP data warehouse world. For instance, clustered indexes can work well for large distributed tables, but nonclustered indexes can degrade query performance in some cases because of the random I/O patterns they create on the disk storage. PDW is tuned and configured to achieve high rates of sequential I/O against large tables. For many queries, sequential I/O against a distributed table can be faster than using nonclustered indexes, especially under concurrent workloads. In the MPP data warehouse world, this is known as an index-light design.
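The co-location behavior described above can be illustrated with a toy hash-distribution sketch. This is conceptual only and does not reproduce PDW's actual hash function or distribution layout; the point is that because both fact tables hash the same store ID through the same function, their rows for a given store land on the same compute node, so a join on store ID needs no cross-node data movement:

```python
# Toy illustration of hash distribution and co-location (not PDW's
# actual algorithm): rows hashed on the same distribution key value
# always land on the same compute node.
import hashlib

COMPUTE_NODES = 8

def node_for(distribution_key):
    # Stable hash of the key -> compute node number in [0, COMPUTE_NODES).
    digest = hashlib.md5(str(distribution_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % COMPUTE_NODES

# 100 stores' worth of rows from the two fact tables in Listing 2's schema.
pos_rows = [{"StoreIDKey": s, "SalesAmount": 9.99} for s in range(1, 101)]
inventory_rows = [{"StoreIDKey": s, "OnHandQty": 12} for s in range(1, 101)]

# Rows from both fact tables with the same StoreIDKey are co-located.
for pos, inv in zip(pos_rows, inventory_rows):
    assert node_for(pos["StoreIDKey"]) == node_for(inv["StoreIDKey"])

# Count how the 100 stores spread across the 8 nodes.
counts = [0] * COMPUTE_NODES
for row in pos_rows:
    counts[node_for(row["StoreIDKey"])] += 1
print(counts)  # per-node row counts; a good key spreads these fairly evenly
```

A skewed key (say, a region ID with a few huge regions) would pile rows onto a few nodes, which is why the article stresses choosing a distribution key that spreads rows evenly.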
PRODUCT REVIEW

recommended minimum of 4 CPUs and roughly two cores per 100 users) and that enough physical RAM be present to run all the pnSessionHost.exe processes (with a recommended minimum of 4GB).

On the software side, NovaView requires Windows Server 2008 or Windows Server 2003, IIS 6 or higher, .NET Framework 2.0 or higher, and Microsoft Visual J# 2.0. If you're going to source data from SQL Server Analysis Services (SSAS) cubes, you'll need a separate server installation with SSAS. NovaView can also work with many other mainstream enterprise data sources.

NovaView Server provides infrastructure services for the entire NovaView suite of client tools. It supports a wide variety of data sources, including the SQL Server relational engine, SSAS, Oracle, SAP, flat files, and web services. NovaView Server is a highly scalable piece of software that can support up to thousands of users and terabytes of data, according to reports from Panorama Software's established customers.

The NovaView Dashboards, NovaView Visuals, and NovaView GIS Framework components provide the next layer of business intelligence (BI) delivery, including basic analytics and other visualizations. NovaView Dashboards provides a mature, modern-day dashboarding product that lets you create complex views from NovaView Server. Both Key Performance Indicators (KPIs) and charts are easily created and combined to form enterprise dashboards.

NovaView Visuals provides advanced information visualization options for NovaView Dashboards. For example, capabilities equivalent to ProClarity's Decomposition Tree are included as part of NovaView Visuals. With one click, you can map out analytics that come from NovaView Analytics.

NovaView SharedViews represents a joint venture between Panorama Software and Google. This
source from the perspective of the NovaView suite.

You might wonder why anyone would want to use another BI tool with PowerPivot. PowerPivot is an outstanding self-service BI product, but it's also a version-one product. There are a few areas of PowerPivot that Microsoft has left to improve upon that NovaView will provide, including complex hierarchies, data security, and additional data visualization options.

NovaView offers end-to-end BI delivery, and it does it quite well. Panorama has clearly used its deep knowledge of OLAP and MDX to produce some of the very best delivery options on the market today. Businesses that are looking to extend their existing Microsoft data warehouse and BI solutions or make PowerPivot enterprise-ready should strongly consider NovaView. Given the sheer breadth and depth of the suite, it's obvious that not all customers will need all of its components. Small-to-midsized businesses might find NovaView's relatively high cost prohibitive.

PANORAMA NOVAVIEW SUITE
Pros: All user-facing components are browser based; supports both OLAP and non-OLAP data sources; components are tightly bound; supports core needs of both business and IT users
Cons: High price; additional server components required; neither edition is as graphically rich or fluid as alternatives such as Tableau Software's client
Rating:
Price: Server licenses range from $12,000 to $24,000, depending on configuration; client licenses range from $299 to $14,000
Recommendation: If you're in the market for a third-party toolset to add functionality to Microsoft's BI tools, your search is over. But if you only need a few of the suite's functions, its cost could be prohibitive.
Contact: info@panorama.com • 1-416-545-0990 • www.panorama.com
InstantDoc ID 104648
PowerPivot vs. Tableau
This article is a summarized version of Derek Comingore's original blog. To read the full article, go to sqlmag.com/go/SQLServerBI.

Derek Comingore (dcomingore@bivoyage.com) is a principal architect with B.I. Voyage, a Microsoft Partner that specializes in business intelligence services and solutions. He's a SQL Server MVP and holds several Microsoft certifications.

Microsoft PowerPivot 2010
PowerPivot is composed of Desktop (PowerPivot for Excel) and Server (PowerPivot for SharePoint) components. The client experience is embedded directly within Microsoft Excel, so its authoring experience is Excel. Users can create custom measures, calculated fields, subtotals, grand totals, and percentages. It uses a language called Data Analysis eXpressions (DAX) to create custom measures and calculated fields. PowerPivot ships in x64 and leverages a column-oriented in-memory data store, so you can work with massive volumes of data efficiently. DAX provides an extensive expression language to build custom measures and fields with. PowerPivot for Excel supports practically any data source available. PowerPivot for SharePoint offers a wealth of features from SharePoint. On the downside, PowerPivot for Excel can be confusing and DAX is very complex. PowerPivot for Excel does not support parent/child hierarchies either.

Tableau 5.1
Tableau Desktop provides an easy-to-use, drag-and-drop interface letting anyone create pivot tables and data visualizations. Dashboard capabilities are also available in Tableau Desktop by combining multiple worksheets into a single display.

Tableau's strength lies in the product's visualization and easy-to-use authoring capabilities. The formula language is easy enough to use to build custom measures and calculated fields with. Both Tableau Server and Desktop installations are extremely easy to perform. However, working with massive volumes of data can be painful. Tableau's formula language is impressive in its simplicity but isn't as extensive as PowerPivot's DAX. Tableau Server is a good server product but cannot offer the wealth of features found in SharePoint Server.

If your company is looking for an IT-oriented product that is geared for integration with corporate BI solutions, PowerPivot is a no-brainer. If your company is looking for a business-user-centric platform with little IT usage or corporate BI integration, Tableau should be your choice.
BUSINESS INTELLIGENCE
Lyzasoft Enhances Data Collaboration Tool
Lyzasoft has announced Lyza 2.0. This version lets groups within an enterprise synthesize data from many sources, visually analyze the data, and compare their findings among the workgroup. New features include micro-tiling, improved user format controls, ad hoc visual data drilling, n-dimensional charting, advanced sorting controls, and a range of functions for adding custom fields to charts. Lyza 2.0 also introduces new collaboration features, letting users interact with content in the form of blogs, charts, tables, dashboards, and collections. Lyza costs $400 for a one-year subscription and $2,000 per user for a perpetual license. To learn more, visit www.lyzasoft.com. —Brian Reinholz, production editor

Editor's Tip: Got a great new product? Send announcements to products@sqlmag.com.
DATABASE ADMINISTRATION
Attunity Updates Change Data Capture Suite
Attunity announced a new release of its change data capture and operational data replication software,
Attunity, with support for SQL Server 2008 R2. Attunity now tracks log-based changes across all versions of
SQL Server, supports data replication into heterogeneous target databases, and fully integrates with SQL Server
Integration Services (SSIS) and Business Intelligence Development Studio (BIDS). To learn more or download
a free trial, visit www.attunity.com.
DATABASE ADMINISTRATION
Easily Design PDF Flow Charts
Aivosto has released Visustin 6.0, a flow chart generator that converts
T-SQL code to flow charts. The latest version can create PDF flow
charts from 36 programming languages; the new version adds support for JCL, Matlab, PL/I, Rexx, and SAS code. Visustin reads source code and visualizes each function as a flow chart, letting you see how the functions operate. With the software, you can easily view two charts side
by side, making for easy comparisons. The standard edition costs $249
and the pro edition costs $499. To learn more, visit www.aivosto.com.
DATABASE ADMINISTRATION
Embarcadero Launches Multi-platform DBArtisan XE
Embarcadero Technologies introduced DBArtisan XE, a solution that lets DBAs maximize the performance
of their databases regardless of type. DBArtisan XE helps database administrators manage and optimize the
schema, security, performance, and availability of all their databases to diagnose and resolve database issues.
DBArtisan XE is also the first Embarcadero product to include Embarcadero ToolCloud as a standard feature.
ToolCloud provides centralized licensing and provisioning, plus on-demand tool deployment, to improve tool
manageability for IT organizations with multiple DBArtisan users. DBArtisan starts at $1,100 for five server
connections. To learn more or download a free version, visit www.embarcadero.com.
DATABASE ADMINISTRATION
DBMoto 7 Adds Multi-server Synchronization
HiT Software has released version 7 of DBMoto, the
company’s change data capture software. DBMoto 7
offers multi-server synchronization, letting organiza-
tions keep multiple operational databases synchronized
across their environment, including special algorithms for
multi-server conflict resolution. Other features include
enhanced support for remote administration, new gran-
ular security options, support for XML data types, and
new transaction log support for Netezza data warehouse
appliances. To learn more or download a free trial, visit
www.hitsw.com.
SQL Server Magazine, June 2010. Vol. 12, No. 6 (ISSN 1522-2187). SQL Server Magazine is published monthly by Penton Media, Inc., copyright 2010, all rights reserved. SQL Server is a registered trademark of Microsoft Corporation, and SQL Server Magazine is used by Penton Media, Inc., under license from owner. SQL Server Magazine is an independent publication not affiliated with Microsoft Corporation. Microsoft Corporation is not responsible in any way for the editorial policy or other contents of the publication. SQL Server Magazine, 221 E. 29th St., Loveland, CO 80538, 800-621-1544 or 970-663-4700. Sales and marketing offices: 221 E. 29th St., Loveland, CO 80538. Advertising rates furnished upon request. Periodicals Class postage paid at Loveland, Colorado, and additional mailing offices. Postmaster: Send address changes to SQL Server Magazine, 221 E. 29th St., Loveland, CO 80538. Subscribers: Send all inquiries, payments, and address changes to SQL Server Magazine, Circulation Department, 221 E. 29th St., Loveland, CO 80538. Printed in the U.S.A.