MCT USE ONLY.

STUDENT USE PROHIBITED


OFFICIAL MICROSOFT LEARNING PRODUCT

20467A
Designing Business Intelligence Solutions
with Microsoft SQL Server 2012

Information in this document, including URL and other Internet Web site references, is subject to change
without notice. Unless otherwise noted, the example companies, organizations, products, domain names,
e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with
any real company, organization, product, domain name, e-mail address, logo, person, place or event is
intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the
user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in
or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the express written permission of
Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property
rights covering subject matter in this document. Except as expressly provided in any written license
agreement from Microsoft, the furnishing of this document does not give you any license to these
patents, trademarks, copyrights, or other intellectual property.

The names of manufacturers, products, or URLs are provided for informational purposes only and
Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding
these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a
manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links
may be provided to third party sites. Such sites are not under the control of Microsoft and Microsoft is not
responsible for the contents of any linked site or any link contained in a linked site, or any changes or
updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission
received from any linked site. Microsoft is providing these links to you only as a convenience, and the
inclusion of any link does not imply endorsement by Microsoft of the site or the products contained
therein.
© 2012 Microsoft Corporation. All rights reserved.

Microsoft and the trademarks listed at
http://www.microsoft.com/about/legal/en/us/IntellectualProperty/Trademarks/EN-US.aspx are trademarks of
the Microsoft group of companies. All other trademarks are property of their respective owners.

Product Number: 20467A

Part Number (if applicable):

Released: xx/20xx
MICROSOFT LICENSE TERMS
OFFICIAL MICROSOFT LEARNING PRODUCTS
MICROSOFT OFFICIAL COURSE Pre-Release and Final Release Versions

These license terms are an agreement between Microsoft Corporation and you. Please read them. They apply to
the Licensed Content named above, which includes the media on which you received it, if any. These license
terms also apply to any updates, supplements, internet based services and support services for the Licensed
Content, unless other terms accompany those items. If so, those terms apply.

BY DOWNLOADING OR USING THE LICENSED CONTENT, YOU ACCEPT THESE TERMS. IF YOU DO NOT ACCEPT
THEM, DO NOT DOWNLOAD OR USE THE LICENSED CONTENT.

If you comply with these license terms, you have the rights below.

1. DEFINITIONS.

a. “Authorized Learning Center” means a Microsoft Learning Competency Member, Microsoft IT Academy
Program Member, or such other entity as Microsoft may designate from time to time.

b. “Authorized Training Session” means the Microsoft-authorized instructor-led training class using only
MOC Courses that are conducted by a MCT at or through an Authorized Learning Center.

c. “Classroom Device” means one (1) dedicated, secure computer that you own or control that meets or
exceeds the hardware level specified for the particular MOC Course located at your training facilities or
primary business location.

d. “End User” means an individual who is (i) duly enrolled for an Authorized Training Session or Private
Training Session, (ii) an employee of a MPN Member, or (iii) a Microsoft full-time employee.

e. “Licensed Content” means the MOC Course and any other content accompanying this agreement.
Licensed Content may include (i) Trainer Content, (ii) software, and (iii) associated media.

f. “Microsoft Certified Trainer” or “MCT” means an individual who is (i) engaged to teach a training session
to End Users on behalf of an Authorized Learning Center or MPN Member, (ii) currently certified as a
Microsoft Certified Trainer under the Microsoft Certification Program, and (iii) holds a Microsoft
Certification in the technology that is the subject of the training session.

g. “Microsoft IT Academy Member” means a current, active member of the Microsoft IT Academy
Program.

h. “Microsoft Learning Competency Member” means a Microsoft Partner Network Program Member in
good standing that currently holds the Learning Competency status.

i. “Microsoft Official Course” or “MOC Course” means the Official Microsoft Learning Product instructor-
led courseware that educates IT professionals or developers on Microsoft technologies.
j. “Microsoft Partner Network Member” or “MPN Member” means a silver or gold-level Microsoft Partner
Network program member in good standing.

k. “Personal Device” means one (1) device, workstation or other digital electronic device that you
personally own or control that meets or exceeds the hardware level specified for the particular MOC
Course.

l. “Private Training Session” means the instructor-led training classes provided by MPN Members for
corporate customers to teach a predefined learning objective. These classes are not advertised or
promoted to the general public and class attendance is restricted to individuals employed by or
contracted by the corporate customer.

m. “Trainer Content” means the trainer version of the MOC Course and additional content designated
solely for trainers to use to teach a training session using a MOC Course. Trainer Content may include
Microsoft PowerPoint presentations, instructor notes, lab setup guide, demonstration guides, beta
feedback form and trainer preparation guide for the MOC Course. To clarify, Trainer Content does not
include virtual hard disks or virtual machines.

2. INSTALLATION AND USE RIGHTS. The Licensed Content is licensed not sold. The Licensed Content is
licensed on a one copy per user basis, such that you must acquire a license for each individual that
accesses or uses the Licensed Content.

2.1 Below are four separate sets of installation and use rights. Only one set of rights applies to you.

a. If you are an Authorized Learning Center:


i. If the Licensed Content is in digital format, for each license you acquire you may either:
1. install one (1) copy of the Licensed Content in the form provided to you on a dedicated, secure
server located on your premises where the Authorized Training Session is held for access and
use by one (1) End User attending the Authorized Training Session, or by one (1) MCT teaching
the Authorized Training Session, or
2. install one (1) copy of the Licensed Content in the form provided to you on one (1) Classroom
Device for access and use by one (1) End User attending the Authorized Training Session, or by
one (1) MCT teaching the Authorized Training Session.
ii. You agree that:
1. you will acquire a license for each End User and MCT that accesses the Licensed Content,
2. each End User and MCT will be presented with a copy of this agreement and each individual
will agree that their use of the Licensed Content will be subject to these license terms prior to
their accessing the Licensed Content. Each individual will be required to denote their
acceptance of the EULA in a manner that is enforceable under local law prior to their accessing
the Licensed Content,
3. for all Authorized Training Sessions, you will only use qualified MCTs who hold the applicable
competency to teach the particular MOC Course that is the subject of the training session,
4. you will not alter or remove any copyright or other protective notices contained in the
Licensed Content,
5. you will remove and irretrievably delete all Licensed Content from all Classroom Devices and
servers at the end of the Authorized Training Session,
6. you will only provide access to the Licensed Content to End Users and MCTs,
7. you will only provide access to the Trainer Content to MCTs, and
8. any Licensed Content installed for use during a training session will be done in accordance
with the applicable classroom set-up guide.

b. If you are an MPN Member:


i. If the Licensed Content is in digital format, for each license you acquire you may either:
1. install one (1) copy of the Licensed Content in the form provided to you on (A) one (1)
Classroom Device, or (B) one (1) dedicated, secure server located at your premises where
the training session is held for use by one (1) of your employees attending a training session
provided by you, or by one (1) MCT that is teaching the training session, or
2. install one (1) copy of the Licensed Content in the form provided to you on one (1)
Classroom Device for use by one (1) End User attending a Private Training Session, or one (1)
MCT that is teaching the Private Training Session.
ii. You agree that:
1. you will acquire a license for each End User and MCT that accesses the Licensed Content,
2. each End User and MCT will be presented with a copy of this agreement and each individual
will agree that their use of the Licensed Content will be subject to these license terms prior
to their accessing the Licensed Content. Each individual will be required to denote their
acceptance of the EULA in a manner that is enforceable under local law prior to their
accessing the Licensed Content,
3. for all training sessions, you will only use qualified MCTs who hold the applicable
competency to teach the particular MOC Course that is the subject of the training session,
4. you will not alter or remove any copyright or other protective notices contained in the
Licensed Content,
5. you will remove and irretrievably delete all Licensed Content from all Classroom Devices and
servers at the end of each training session,
6. you will only provide access to the Licensed Content to End Users and MCTs,
7. you will only provide access to the Trainer Content to MCTs, and
8. any Licensed Content installed for use during a training session will be done in accordance
with the applicable classroom set-up guide.

c. If you are an End User:


You may use the Licensed Content solely for your personal training use. If the Licensed Content is in
digital format, for each license you acquire you may (i) install one (1) copy of the Licensed Content in
the form provided to you on one (1) Personal Device and install another copy on another Personal
Device as a backup copy, which may be used only to reinstall the Licensed Content; or (ii) print one (1)
copy of the Licensed Content. You may not install or use a copy of the Licensed Content on a device
you do not own or control.
d. If you are an MCT:
i. For each license you acquire, you may use the Licensed Content solely to prepare and deliver an
Authorized Training Session or Private Training Session. For each license you acquire, you may
install and use one (1) copy of the Licensed Content in the form provided to you on one (1) Personal
Device and install one (1) additional copy on another Personal Device as a backup copy, which may
be used only to reinstall the Licensed Content. You may not install or use a copy of the Licensed
Content on a device you do not own or control.

ii. Use of Instructional Components in Trainer Content. You may customize, in accordance with the
most recent version of the MCT Agreement, those portions of the Trainer Content that are logically
associated with instruction of a training session. If you elect to exercise the foregoing rights, you
agree: (a) that any of these customizations will only be used for providing a training session, (b) any
customizations will comply with the terms and conditions for Modified Training Sessions and
Supplemental Materials in the most recent version of the MCT agreement and with this agreement.
For clarity, any use of “customize” refers only to changing the order of slides and content, and/or
not using all the slides or content, it does not mean changing or modifying any slide or content.

2.2 Separation of Components. The Licensed Content components are licensed as a single unit and you
may not separate the components and install them on different devices.

2.3 Reproduction/Redistribution Licensed Content. Except as expressly provided in the applicable


installation and use rights above, you may not reproduce or distribute the Licensed Content or any portion
thereof (including any permitted modifications) to any third parties without the express written permission
of Microsoft.

2.4 Third Party Programs. The Licensed Content may contain third party programs or services. These
license terms will apply to your use of those third party programs or services, unless other terms accompany
those programs and services.

2.5 Additional Terms. Some Licensed Content may contain components with additional terms,
conditions, and licenses regarding its use. Any non-conflicting terms in those conditions and licenses also
apply to that respective component and supplement the terms described in this Agreement.

3. PRE-RELEASE VERSIONS. If the Licensed Content is a pre-release (“beta”) version, in addition to the other
provisions in this agreement, then these terms also apply:

a. Pre-Release Licensed Content. This Licensed Content is a pre-release version. It may not contain the
same information and/or work the way a final version of the Licensed Content will. We may change it
for the final version. We also may not release a final version. Microsoft is under no obligation to
provide you with any further content, including the final release version of the Licensed Content.

b. Feedback. If you agree to give feedback about the Licensed Content to Microsoft, either directly or
through its third party designee, you give to Microsoft without charge, the right to use, share and
commercialize your feedback in any way and for any purpose. You also give to third parties, without
charge, any patent rights needed for their products, technologies and services to use or interface with
any specific parts of a Microsoft software, Microsoft product, or service that includes the feedback. You
will not give feedback that is subject to a license that requires Microsoft to license its software,
technologies, or products to third parties because we include your feedback in them. These rights
survive this agreement.

c. Term. If you are an Authorized Training Center, MCT or MPN, you agree to cease using all copies of the
beta version of the Licensed Content upon (i) the date which Microsoft informs you is the end date for
using the beta version, or (ii) sixty (60) days after the commercial release of the Licensed Content,
whichever is earliest (“beta term”). Upon expiration or termination of the beta term, you will
irretrievably delete and destroy all copies of same in the possession or under your control.

4. INTERNET-BASED SERVICES. Microsoft may provide Internet-based services with the Licensed Content,
which may change or be canceled at any time.

a. Consent for Internet-Based Services. The Licensed Content may connect to computer systems over an
Internet-based wireless network. In some cases, you will not receive a separate notice when they
connect. Using the Licensed Content operates as your consent to the transmission of standard device
information (including but not limited to technical information about your device, system and
application software, and peripherals) for internet-based services.

b. Misuse of Internet-based Services. You may not use any Internet-based service in any way that could
harm it or impair anyone else’s use of it. You may not use the service to try to gain unauthorized access
to any service, data, account or network by any means.

5. SCOPE OF LICENSE. The Licensed Content is licensed, not sold. This agreement only gives you some rights
to use the Licensed Content. Microsoft reserves all other rights. Unless applicable law gives you more
rights despite this limitation, you may use the Licensed Content only as expressly permitted in this
agreement. In doing so, you must comply with any technical limitations in the Licensed Content that only
allows you to use it in certain ways. Except as expressly permitted in this agreement, you may not:
• install more copies of the Licensed Content on devices than the number of licenses you acquired;
• allow more individuals to access the Licensed Content than the number of licenses you acquired;
• publicly display, or make the Licensed Content available for others to access or use;
• install, sell, publish, transmit, encumber, pledge, lend, copy, adapt, link to, post, rent, lease or lend,
make available or distribute the Licensed Content to any third party, except as expressly permitted
by this Agreement.
• reverse engineer, decompile, remove or otherwise thwart any protections or disassemble the
Licensed Content except and only to the extent that applicable law expressly permits, despite this
limitation;
• access or use any Licensed Content for which you are not providing a training session to End Users
using the Licensed Content;
• access or use any Licensed Content that you have not been authorized by Microsoft to access and
use; or
• transfer the Licensed Content, in whole or in part, or assign this agreement to any third party.

6. RESERVATION OF RIGHTS AND OWNERSHIP. Microsoft reserves all rights not expressly granted to you in
this agreement. The Licensed Content is protected by copyright and other intellectual property laws and
treaties. Microsoft or its suppliers own the title, copyright, and other intellectual property rights in the
Licensed Content. You may not remove or obscure any copyright, trademark or patent notices that
appear on the Licensed Content or any components thereof, as delivered to you.
7. EXPORT RESTRICTIONS. The Licensed Content is subject to United States export laws and regulations. You
must comply with all domestic and international export laws and regulations that apply to the Licensed
Content. These laws include restrictions on destinations, End Users and end use. For additional
information, see www.microsoft.com/exporting.

8. LIMITATIONS ON SALE, RENTAL, ETC. AND CERTAIN ASSIGNMENTS. You may not sell, rent, lease, lend or
sublicense the Licensed Content or any portion thereof, or transfer or assign this agreement.

9. SUPPORT SERVICES. Because the Licensed Content is “as is”, we may not provide support services for it.

10. TERMINATION. Without prejudice to any other rights, Microsoft may terminate this agreement if you fail
to comply with the terms and conditions of this agreement. Upon any termination of this agreement, you
agree to immediately stop all use of and to irretrievably delete and destroy all copies of the Licensed
Content in your possession or under your control.

11. LINKS TO THIRD PARTY SITES. You may link to third party sites through the use of the Licensed Content.
The third party sites are not under the control of Microsoft, and Microsoft is not responsible for the
contents of any third party sites, any links contained in third party sites, or any changes or updates to third
party sites. Microsoft is not responsible for webcasting or any other form of transmission received from
any third party sites. Microsoft is providing these links to third party sites to you only as a convenience,
and the inclusion of any link does not imply an endorsement by Microsoft of the third party site.

12. ENTIRE AGREEMENT. This agreement, and the terms for supplements, updates and support services are
the entire agreement for the Licensed Content.

13. APPLICABLE LAW.


a. United States. If you acquired the Licensed Content in the United States, Washington state law governs
the interpretation of this agreement and applies to claims for breach of it, regardless of conflict of laws
principles. The laws of the state where you live govern all other claims, including claims under state
consumer protection laws, unfair competition laws, and in tort.

b. Outside the United States. If you acquired the Licensed Content in any other country, the laws of that
country apply.

14. LEGAL EFFECT. This agreement describes certain legal rights. You may have other rights under the laws of
your country. You may also have rights with respect to the party from whom you acquired the Licensed
Content. This agreement does not change your rights under the laws of your country if the laws of your
country do not permit it to do so.

15. DISCLAIMER OF WARRANTY. THE LICENSED CONTENT IS LICENSED "AS-IS," "WITH ALL FAULTS," AND "AS
AVAILABLE." YOU BEAR THE RISK OF USING IT. MICROSOFT CORPORATION AND ITS RESPECTIVE
AFFILIATES GIVE NO EXPRESS WARRANTIES, GUARANTEES, OR CONDITIONS UNDER OR IN RELATION TO
THE LICENSED CONTENT. YOU MAY HAVE ADDITIONAL CONSUMER RIGHTS UNDER YOUR LOCAL LAWS
WHICH THIS AGREEMENT CANNOT CHANGE. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAWS,
MICROSOFT CORPORATION AND ITS RESPECTIVE AFFILIATES EXCLUDE ANY IMPLIED WARRANTIES OR
CONDITIONS, INCLUDING THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NON-INFRINGEMENT.
16. LIMITATION ON AND EXCLUSION OF REMEDIES AND DAMAGES. TO THE EXTENT NOT PROHIBITED BY
LAW, YOU CAN RECOVER FROM MICROSOFT CORPORATION AND ITS SUPPLIERS ONLY DIRECT
DAMAGES UP TO USD$5.00. YOU AGREE NOT TO SEEK TO RECOVER ANY OTHER DAMAGES, INCLUDING
CONSEQUENTIAL, LOST PROFITS, SPECIAL, INDIRECT OR INCIDENTAL DAMAGES FROM MICROSOFT
CORPORATION AND ITS RESPECTIVE SUPPLIERS.

This limitation applies to


o anything related to the Licensed Content, services made available through the Licensed Content, or
content (including code) on third party Internet sites or third-party programs; and
o claims for breach of contract, breach of warranty, guarantee or condition, strict liability, negligence,
or other tort to the extent permitted by applicable law.

It also applies even if Microsoft knew or should have known about the possibility of the damages. The
above limitation or exclusion may not apply to you because your country may not allow the exclusion or
limitation of incidental, consequential or other damages.

Please note: As this Licensed Content is distributed in Quebec, Canada, some of the clauses in this agreement
are provided below in French.

Remarque : Ce contenu sous licence étant distribué au Québec, Canada, certaines des clauses dans ce
contrat sont fournies ci-dessous en français.

EXONÉRATION DE GARANTIE. Le contenu sous licence visé par une licence est offert « tel quel ». Toute
utilisation de ce contenu sous licence est à votre seule risque et péril. Microsoft n’accorde aucune autre garantie
expresse. Vous pouvez bénéficier de droits additionnels en vertu du droit local sur la protection des
consommateurs, que ce contrat ne peut modifier. Là où elles sont permises par le droit local, les garanties
implicites de qualité marchande, d’adéquation à un usage particulier et d’absence de contrefaçon sont exclues.

LIMITATION DES DOMMAGES-INTÉRÊTS ET EXCLUSION DE RESPONSABILITÉ POUR LES DOMMAGES. Vous


pouvez obtenir de Microsoft et de ses fournisseurs une indemnisation en cas de dommages directs uniquement
à hauteur de 5,00 $ US. Vous ne pouvez prétendre à aucune indemnisation pour les autres dommages, y
compris les dommages spéciaux, indirects ou accessoires et pertes de bénéfices.
Cette limitation concerne:
• tout ce qui est relié au contenu sous licence, aux services ou au contenu (y compris le code)
figurant sur des sites Internet tiers ou dans des programmes tiers ; et
• les réclamations au titre de violation de contrat ou de garantie, ou au titre de responsabilité
stricte, de négligence ou d’une autre faute dans la limite autorisée par la loi en vigueur.

Elle s’applique également, même si Microsoft connaissait ou devrait connaître l’éventualité d’un tel dommage.
Si votre pays n’autorise pas l’exclusion ou la limitation de responsabilité pour les dommages indirects,
accessoires ou de quelque nature que ce soit, il se peut que la limitation ou l’exclusion ci-dessus ne s’appliquera
pas à votre égard.

EFFET JURIDIQUE. Le présent contrat décrit certains droits juridiques. Vous pourriez avoir d’autres droits prévus
par les lois de votre pays. Le présent contrat ne modifie pas les droits que vous confèrent les lois de votre pays
si celles-ci ne le permettent pas.

Revised December 2011



Acknowledgments
Microsoft Learning wants to acknowledge and thank the following for their contribution toward
developing this title. Their effort at various stages in the development has ensured that you have a good
classroom experience.

Graeme Malcolm – Lead Content Developer


Graeme Malcolm is a Microsoft SQL Server subject matter expert and professional content developer at
Content Master—a division of CM Group Ltd. As a Microsoft Certified Trainer, Graeme has delivered
training courses on SQL Server since version 4.2; as an author, Graeme has written numerous books,
articles, and training courses on SQL Server; and as a consultant, Graeme has designed and implemented
business solutions based on SQL Server for customers all over the world.

Chris Testa-O’Neill – Technical Reviewer


Chris Testa-O’Neill is a Senior Consultant at Coeo (www.coeo.com), a leading provider of SQL Server
Managed Support and Consulting in the UK and Europe. He is also a Microsoft Certified Trainer, a Microsoft
Most Valuable Professional for SQL Server, and lead author of the Microsoft E-Learning MCTS courses for SQL
Server 2008. Chris has spoken at a range of SQL Server events in the UK, Europe, Australia, and the United
States. He is also one of the organizers of SQLBits and SQLServerFAQ, and a UK Regional Mentor for SQLPASS.

Contents
Module 1: Planning a BI Solution
Lesson 1: Elements of a BI Solution page 2
Lesson 2: The Microsoft BI Platform page 10
Lesson 3: Planning a BI Project page 17
Lab: Planning a BI Solution page 24

Module 2: Planning SQL Server Business Intelligence Infrastructure


Lesson 1: Considerations for BI Infrastructure page 2
Lesson 2: Planning Data Warehouse Hardware page 10
Lab: Planning SQL Server Business Intelligence Infrastructure page 16

Module 3: Designing a Data Warehouse


Lesson 1: Data Warehouse Design Overview page 2
Lesson 2: Designing Dimension Tables page 8
Lesson 3: Designing Fact Tables page 15
Lab A: Designing a Data Warehouse Logical Schema page 18
Lesson 4: Designing a Data Warehouse Physical Implementation page 21
Lab B: Designing a Data Warehouse Physical Implementation page 33

Module 4: Designing an ETL Solution


Lesson 1: ETL Overview page 2
Lesson 2: Planning Data Extraction page 5
Lesson 3: Planning Data Transformation page 9
Lesson 4: Planning Data Loads page 13
Lab: Designing an ETL Solution page 20

Module 5: Designing Analytical Data Models


Lesson 1: Introduction to Analytical Data Models page 2
Lesson 2: Designing an Analytical Data Model page 8
Lab A: Designing Analytical Data Models page 14
Lesson 3: Designing Dimensions page 22
Lab B: Designing Dimensions and Hierarchies page 27
Lesson 4: Enhancing Data Models page 35
Lab C: Enhancing Data Models page 40

Module 6: Planning a BI Delivery Solution


Lesson 1: Considerations for Delivering BI page 2
Lesson 2: Common Reporting Scenarios page 6
Lesson 3: Choosing a Reporting Tool page 10
Lab: Planning a BI Delivery Solution page 14

Module 7: Designing a Reporting Services Solution


Lesson 1: Planning a Reporting Services Solution page 2
Lesson 2: Designing Reports page 9
Lesson 3: Planning Report Consistency page 15
Lab: Designing a Reporting Services Solution page 18

Module 8: Designing a Microsoft Excel-Based Reporting Solution


Lesson 1: Using Excel for Data Analysis and Reporting page 2
Lesson 2: PowerPivot for Excel page 6
Lesson 3: Power View for Excel page 9
Lab: Designing a Microsoft Excel-Based Reporting Solution page 11

Module 9: Planning a SharePoint Server BI Solution


Lesson 1: Introduction to SharePoint Server as a BI Platform page 2
Lesson 2: Planning Security for a SharePoint Server BI Solution page 8
Lesson 3: Planning Reporting Services Configuration page 13
Lesson 4: Planning PowerPivot Configuration page 16
Lesson 5: Planning for PerformancePoint Services page 20
Lab: Planning a SharePoint Server BI Solution page 25

Module 10: Monitoring and Optimizing a BI Solution


Lesson 1: Overview of BI Monitoring page 2
Lesson 2: Monitoring and Optimizing the Data Warehouse page 5
Lesson 3: Monitoring and Optimizing Analysis Services page 10
Lesson 4: Monitoring and Optimizing Reporting Services page 16
Lab: Monitoring and Optimizing a BI Solution page 20

Module 11: Operating a BI Solution


Lesson 1: Overview of BI Operations page 2
Lesson 2: ETL Operations page 6
Lesson 3: Data Warehouse Operations page 11
Lesson 4: Analysis Services Operations page 16
Lesson 5: Reporting Services Operations page 20
Lab: Operating a BI Solution page 23

Lab Answer Keys


Module 1 Lab: Planning a BI Solution page 1
Module 2 Lab: Planning SQL Server Business Intelligence Infrastructure page 1
Module 3 Lab A: Designing a Data Warehouse Logical Schema page 1
Module 3 Lab B: Designing a Data Warehouse Physical Implementation page 1
Module 4 Lab: Designing an ETL Solution page 1
Module 5 Lab A: Designing Analytical Data Models page 1
Module 5 Lab B: Designing Dimensions and Hierarchies page 1

Module 5 Lab C: Enhancing Data Models page 1


Module 6 Lab: Planning a BI Delivery Solution page 1
Module 7 Lab: Designing a Reporting Services Solution page 1
Module 8 Lab: Designing a Microsoft Excel-Based Reporting Solution page 1
Module 9 Lab: Planning a SharePoint Server BI Solution page 1
Module 10 Lab: Monitoring and Optimizing a BI Solution page 1
Module 11 Lab: Operating a BI Solution page 1

About This Course


This section provides a brief description of the course, audience, suggested prerequisites, and course
objectives.

Course Description

Note: This first release (‘A’) MOC version of course 20467A has been developed on pre-release
software. Microsoft Learning will release a ‘B’ version of this course with enhanced PowerPoint
slides, copy-edited content, and Course Companion content on the Microsoft Learning site.

This training course teaches database and business intelligence (BI) professionals how to plan and design
a BI solution that is based on Microsoft SQL Server 2012 and other Microsoft BI technologies.

Audience
This course is not designed for students who are new to SQL Server 2012 BI technologies; it is targeted at
BI professionals with experience of implementing solutions with the SQL Server Database Engine, SQL Server
Integration Services, SQL Server Analysis Services, and SQL Server Reporting Services.

Student Prerequisites
In addition to their professional experience, students who attend this training should already have the
following technical knowledge:

 A basic understanding of dimensional modeling (star schema) for data warehouses

 Basic server hardware knowledge

 The ability to create Integration Services packages that include control flows and data flows

 The ability to create a basic multidimensional cube with Analysis Services

 The ability to create a basic tabular model with PowerPivot and Analysis Services
 The ability to create Reporting Services reports with Report Designer

 The ability to implement authentication and permissions in the SQL Server database engine,
Analysis Services, and Reporting Services

 Familiarity with SharePoint Server and Microsoft Office applications – particularly Excel

Course Objectives
After completing this course, students will be able to:

 Plan a BI solution.

 Plan SQL Server BI infrastructure.

 Design a data warehouse.

 Design an extract, transform and load (ETL) solution.

 Design analytical data models.

 Plan a BI delivery solution.

 Design a Reporting Services solution.



 Design a Microsoft Excel-based reporting solution.

 Plan a SharePoint Server BI solution.

 Monitor and optimize a BI solution.

 Operate a BI solution.

Course Outline
The course outline is as follows:

Module 1, “Planning a BI Solution"

Module 2, “Planning SQL Server Business Intelligence Infrastructure"

Module 3, “Designing a Data Warehouse"

Module 4, “Designing an ETL Solution"

Module 5, “Designing Analytical Data Models"

Module 6, “Planning a BI Delivery Solution"

Module 7, “Designing a Reporting Services Solution"


Module 8, “Designing a Microsoft Excel-Based Reporting Solution"

Module 9, “Planning a SharePoint Server BI Solution"

Module 10, “Monitoring and Optimizing a BI Solution"


Module 11, “Operating a BI Solution"

Course Materials
The following materials are included with your kit:

 Course Handbook: a succinct classroom learning guide that provides the critical technical
information in a crisp, tightly-focused format, which is essential for an effective in-class learning
experience.

 Lessons: guide you through the learning objectives and provide the key points that are critical to
the success of the in-class learning experience.

 Labs: provide a real-world, hands-on platform for you to apply the knowledge and skills learned
in the module.

 Module Reviews and Takeaways: provide on-the-job reference material to boost knowledge
and skills retention.

 Lab Answer Keys: provide step-by-step lab solution guidance.


Course Companion Content: searchable, easy-to-browse digital content with integrated premium
online resources that supplement the Course Handbook.

 Modules: include companion content, such as questions and answers, detailed demo steps and
additional reading links, for each lesson. Additionally, they include Lab Review questions and
answers and Module Reviews and Takeaways sections, which contain the review questions and
answers, best practices, common issues and troubleshooting tips with answers, and real-world
issues and scenarios with answers.

 Resources: include well-categorized additional resources that give you immediate access to the
most current premium content on TechNet, MSDN®, or Microsoft® Press®.

Note: For this version of the Courseware on Prerelease Software, Companion Content is not
available. However, the Companion Content will be published when the next (B) version of this
course is released, and students who have taken this course will be able to download the
Companion Content at that time from the
http://www.microsoft.com/learning/companionmoc site. Please check with your instructor
when the ‘B’ version of this course is scheduled to release to learn when you can access
Companion Content for this course.

Student Course files: includes the Allfiles.exe, a self-extracting executable file that contains all
required files for the labs and demonstrations.

Note: For this version of the Courseware on Prerelease Software, the Allfiles.exe file is not available.
However, this file will be published when the next (B) version of this course is released, and students
who have taken this course will be able to download the Allfiles.exe at that time from the
http://www.microsoft.com/learning/companionmoc site.

 Course evaluation: at the end of the course, you will have the opportunity to complete an online
evaluation to provide feedback on the course, training facility, and instructor.

 To provide additional comments or feedback on the course, send an email to


support@mscourseware.com. To inquire about the Microsoft Certification Program, send an
email to mcphelp@microsoft.com.

Virtual Machine Environment


This section provides the information for setting up the classroom environment to support the business
scenario of the course.

Virtual Machine Configuration


In this course, you will use Microsoft® Hyper-V™ to perform the labs.

Important: At the end of each lab, you must close the virtual machine and must not save
any changes. To close a virtual machine (VM) without saving the changes, perform the
following steps:

1. On the virtual machine, on the Action menu, click Close.

2. In the Close dialog box, in the What do you want the virtual machine to do? list, click
Turn off and delete changes, and then click OK.

The following table shows the role of each virtual machine that is used in this course:

Virtual machine       Role
-----------------------------------------------------------------------
20467A-MIA-DC         Domain controller for the ADVENTUREWORKS domain
20467A-MIA-SQLBI      SQL Server and SharePoint Server

Software Configuration
The following software is installed:
 Microsoft Windows Server 2012

 Microsoft SQL Server 2012 SP1

 Microsoft Office 2013


 Microsoft SharePoint Server 2013

 Microsoft Visio 2013

Course Files
The files associated with the labs in this course are located in the D:\Labfiles folder on the
20467A-MIA-SQLBI virtual machine.

Classroom Setup
Each classroom computer will have the same virtual machines configured in the same way.

Course Hardware Level


To ensure a satisfactory student experience, Microsoft Learning requires a minimum equipment
configuration for trainer and student computers in all Microsoft Certified Partner for Learning Solutions
(CPLS) classrooms in which Official Microsoft Learning Product courseware is taught.

Hardware Level 6+

• Intel Virtualization Technology (Intel VT) or AMD Virtualization (AMD-V) processor
• Dual 120 GB hard disks, 7200 RPM SATA or better*
• 8 GB RAM or higher
• DVD drive
• Network adapter with Internet connectivity
• Super VGA (SVGA) 17-inch monitor
• Microsoft Mouse or compatible pointing device
• Sound card with amplified speakers
*Striped
In addition, the instructor computer must be connected to a projection display device that supports SVGA
1024 x 768 pixels, 16-bit color.

Module 1
Planning a BI Solution
Contents:
Module Overview 1-1

Lesson 1: Elements of a BI Solution 1-2

Lesson 2: The Microsoft BI Platform 1-10

Lesson 3: Planning a BI Project 1-17

Lab: Planning a BI Solution 1-24

Module Review and Takeaways 1-26

Module Overview
Business Intelligence (BI) is an increasingly important IT service in many businesses. In the past, BI
solutions were primarily the preserve of large corporations; but as data storage, analytical, and reporting
technologies have become more affordable, many small and medium-sized organizations are now able to take
advantage of BI solutions.

As a SQL Server database professional, you may be required to participate in, or perhaps even lead, a
project with the aim of implementing an effective BI solution. Therefore, it is important that you have a
good understanding of the various elements that comprise a BI solution, the business and IT personnel
typically involved in a BI project, and the Microsoft products that you can use to implement the solution.

Objectives
After completing this module, you will be able to:

 Describe the elements of a typical BI solution.

 Describe key considerations for planning a BI project.

 Select appropriate Microsoft technologies for a BI solution.



Lesson 1
Elements of a BI Solution
Although there’s no single definitive template for a BI solution, there are some common elements that are
typical across most BI implementations. Being familiar with these common elements will help you identify
the key components required for your specific BI solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the common elements in a typical BI solution.

 Describe the role of business data sources in a BI solution.

 Describe the role of a data warehouse in a BI solution.

 Describe the role played by an extract, transform, and load (ETL) process in a BI solution.
 Describe the role played by analytical models in a BI solution.

 Describe reporting and analysis in a BI solution.

Overview of a BI Solution
Fundamentally, all BI solutions are designed to
take data generated by business operations,
structure it into an appropriate format for
consistent analysis and reporting, and use the
information gained by examining the data to
improve business performance. No two BI
solutions are identical, but most include the
following elements:
 Business data sources. The data that will
ultimately provide the basis for business
decision making through the BI solution
usually resides in existing business
applications or external data sources (which may be commercially available data sets or data exposed
by business partner organizations).

 A data warehouse. To make it easier to analyze and report on the business as a whole, the business
data is typically consolidated into a data warehouse. Depending on the size of the organization, and
the specific BI methodology adopted, this may be a single, central database that is optimized for
analytical queries; or a distributed collection of data marts, each pertaining to a specific area of the
business.

 Extract, transform, and load (ETL) processes. To get the business data from the data sources into
the data warehouse, an ETL process periodically extracts data from the source systems, transforms the
structure and content of the data to conform to the data warehouse schema, and loads it into the
data warehouse. ETL processes are often implemented within a wider enterprise integration
management (EIM) framework that ensures the integrity of data across multiple systems through
master data management (MDM) and data cleansing.

 Analytical data models. The data warehouse schema is usually optimized for analytical querying and
in some cases you may decide to perform all analysis and reporting directly from the data warehouse
itself. However, it is common to build analytical data models on top of the data warehouse to abstract
the underlying data tables, add custom analytical values such as key performance indicators, and
aggregate the data for faster analytical processing.

 Reporting. Most BI solutions include a reporting element that enables business users to view reports
containing business information. Most reporting solutions provide a set of standard business reports
that are generated on a regular basis, and some also empower users to perform self-service reporting
in order to generate their own custom reports. Reports can be created directly from the data
warehouse or from the analytical data models built on it, depending on your specific business
requirements and constraints.

 Analytical Information Worker Tools. In addition to reports, most BI solutions deliver analytical
information to business users through information worker tools. These tools might be locally installed
applications, such as Microsoft Excel; or interactive dashboards in web-based applications, such as
Microsoft SharePoint Server.

Business Data Sources


Most businesses use software applications to
process business operations. For example, a retail
business might use a point-of-sale (POS) system
to process sales transactions as customers
purchase goods, and an inventory management
system to perform stock control operations, such
as ordering new stock as goods sell out. Most
organizations also use a human resources system
to manage employee records, and many sales-
oriented businesses use customer relationship
management (CRM) systems to manage customer
contact details and sales opportunities.
The systems used in an organization might be purpose-built applications or based on simple documents,
such as spreadsheets. In some cases, business operations might be automated by sensors or plant
machinery. Regardless of the specific implementation, all of these systems generate some form of business
data; and this data is the starting point for any BI solution.

Business data sources for a BI solution typically include some or all of the following:

 Application databases, often implemented as relational databases in systems such as SQL Server,
Oracle, or Microsoft Access.

 Proprietary data stores, such as those used by many commercial financial accounting applications.

 Documents such as Excel workbooks.

 Sensor readings emitted by plant machinery, which may be captured as a data stream using
technologies such as Microsoft SQL Server StreamInsight.

 External data sources such as cloud-based databases or web services.

 Master data hubs that contain definitive data values for core business entities.

One of the first tasks in any BI project is to audit the available data sources and try to identify:

 The specific data that is stored in each source.



 The volume of data currently stored and being generated by ongoing operations.

 The data types and range of values for important business data fields.

 Business-specific values used to indicate key information (for example, a POS system may use numeric
codes to indicate payment types, such as 0 for cash, 1 for credit, and so on).

 Common errors, reliability issues, and missing or null values in the data.

 Data duplication or inconsistencies across multiple systems.

 Existing data integration processes.

 Data source usage patterns and update periodicity.

 Technologies that can be used to extract the source data to a staging database.

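For example, a simple Transact-SQL profiling query can reveal several of the items in this list, such as data volumes, value ranges, and missing values. The following sketch assumes a hypothetical dbo.SalesOrders table in a source system; the table and column names are illustrative only and should be adapted to the systems you are auditing.

-- Profile a hypothetical source table to support a data source audit.
-- Returns row volume, date range, value ranges, and missing key values.
SELECT
    COUNT(*)                                             AS TotalRows,
    MIN(OrderDate)                                       AS EarliestOrder,
    MAX(OrderDate)                                       AS LatestOrder,
    MIN(OrderTotal)                                      AS MinOrderTotal,
    MAX(OrderTotal)                                      AS MaxOrderTotal,
    SUM(CASE WHEN CustomerID IS NULL THEN 1 ELSE 0 END)  AS MissingCustomerIDs,
    COUNT(DISTINCT PaymentTypeCode)                      AS DistinctPaymentCodes
FROM dbo.SalesOrders;
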
The Data Warehouse


The data warehouse is the central data repository
on which all reporting and analysis is based.
Typically, it contains numeric business measures
that are important to the business, such as
revenue, cost, or profit; and the key business
entities or dimensions by which those measures
can be aggregated, such as fiscal period,
customer, or product.

Kimball and Inmon Methodologies


Typically, a data warehouse is implemented as a
relational database in which the business data has
been denormalized into a star schema consisting
of fact tables that contain numeric measures and dimension tables that contain attribute hierarchies across
which the measures are aggregated. This approach reflects the dimensional model methodology
promoted by Ralph Kimball, and is the most common approach adopted by Microsoft SQL Server
customers. In the Kimball methodology, the data warehouse may consist of multiple data marts, each
dealing with a specific area of the business. The fact and dimension tables in these data marts are
conformed so that they share the same grain (granularity) and dimension attributes across all data marts.
This enables the data marts to be independent data stores that can be logically viewed as a single
enterprise data warehouse (EDW). The Kimball methodology is often referred to as a bottom-up approach.

An alternative data warehouse design, popularized by Bill Inmon, is the corporate information factory
(CIF) model. In the CIF model, the enterprise data warehouse stores the business data in a normalized
relational schema. This is then used to feed departmental data marts, in which specific subsets of the data
are exposed in a star schema. The dependency of the data marts on a central EDW leads many to refer to
the Inmon methodology as a top-down approach.
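To make the dimensional model concrete, the following Transact-SQL sketch shows a minimal star schema fragment consisting of one dimension table and one fact table. The object names (DimProduct, FactSales, and so on) are illustrative assumptions rather than a prescribed design; a real data warehouse would include additional dimensions, attributes, and indexing.

-- Minimal star schema sketch (illustrative names, not a prescribed design).
-- Dimension table: descriptive attributes with a surrogate key.
CREATE TABLE dbo.DimProduct
(
    ProductKey    INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    ProductAltKey NVARCHAR(25) NOT NULL,                  -- business (alternate) key
    ProductName   NVARCHAR(100) NOT NULL,
    Category      NVARCHAR(50) NULL,
    Subcategory   NVARCHAR(50) NULL
);

-- Fact table: numeric measures plus foreign keys to the dimensions.
CREATE TABLE dbo.FactSales
(
    DateKey       INT NOT NULL,          -- references a DimDate dimension
    ProductKey    INT NOT NULL REFERENCES dbo.DimProduct(ProductKey),
    CustomerKey   INT NOT NULL,          -- references a DimCustomer dimension
    SalesAmount   MONEY NOT NULL,
    OrderQuantity INT NOT NULL
);

The surrogate key on the dimension table decouples the data warehouse from source system keys and supports the handling of slowly changing dimensions discussed later in this module.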

Common Implementations
Although the Kimball and Inmon methodologies in their pure form are designed for BI solutions that
distribute the data across multiple departmental data marts, it is common for organizations to begin with
a Kimball-style data mart for a subset of the business that expands over time into a single, central data
warehouse database for the entire enterprise. The availability of inexpensive storage and the increasing
power of server hardware mean that a single data warehouse can support a huge volume of data and
heavy user workloads.

In very large enterprises, a federated approach is often used in which a hub-and-spoke architecture
synchronizes departmental data marts with a central enterprise data warehouse.

Note: SQL Server can be used to support both Kimball and Inmon style data warehouse
solutions. In response to the more common use of the Kimball methodology, the SQL Server
database engine has been designed to optimize star-join queries and most documentation about
data warehouse implementation in SQL Server assumes a dimensional model rather than a
normalized EDW. In deference to these facts, this course focuses on a Kimball style data
warehouse. However, you should investigate the details of both approaches and consider which
best suits your specific business requirements and constraints.

Extract, Transform, and Load Processes


If the data warehouse is the central “brain” of the
BI solution, the ETL system is the heart. It pumps
business data through the solution to keep the
data warehouse up to date and ensure that the BI
solution as a whole continues to deliver value to
the business. A significant proportion of the effort
to design and build an effective BI solution goes
into the ETL subsystem.

ETL Processes in a Data Warehousing


Solution
ETL is about more than just copying data from a
source to the data warehouse. ETL processes must
extract data efficiently, filtering extraction operations to include only new or changed rows wherever
possible. Typically, the extracted data is initially loaded into a staging area so that loads to the data
warehouse tables can be synchronized across data extracted from multiple sources and performed at the
most appropriate time. Additionally, the ETL process typically applies transformations to the data before it
is loaded, in order to ensure that the data values and schema conform to the data warehouse dimensional
model. Finally, when loading the data warehouse, an important part of the ETL process is to handle slowly
changing dimensions (dimension records that must be updated while retaining historical versions). When
loading large volumes of fact or dimension data, the ETL process must minimize the adverse impact on
data warehouse queries and ensure that the data is loaded in the least amount of time possible.
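As a simple illustration of incremental extraction, the following Transact-SQL sketch uses a high-water mark stored in a control table to stage only the rows added or changed since the previous run. The etl.ExtractLog, dbo.SalesOrders, and stg.SalesOrders objects are hypothetical names used for illustration; in a production solution, this logic would typically run inside a SQL Server Integration Services package so that scheduling, logging, and error handling are managed consistently.

-- Incremental extraction sketch using a high-water mark (hypothetical object names).
DECLARE @LastExtract datetime, @CurrentExtract datetime = GETDATE();

-- Read the high-water mark recorded by the previous ETL run.
SELECT @LastExtract = LastExtractDateTime
FROM etl.ExtractLog
WHERE TableName = 'SalesOrders';

-- Stage only the rows created or modified since the last run.
INSERT INTO stg.SalesOrders (OrderID, CustomerID, OrderDate, OrderTotal, ModifiedDate)
SELECT OrderID, CustomerID, OrderDate, OrderTotal, ModifiedDate
FROM dbo.SalesOrders
WHERE ModifiedDate > @LastExtract
  AND ModifiedDate <= @CurrentExtract;

-- Record the new high-water mark for the next run.
UPDATE etl.ExtractLog
SET LastExtractDateTime = @CurrentExtract
WHERE TableName = 'SalesOrders';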

Another consideration for ETL is the logging strategy that you will use to record ETL activity and provide
troubleshooting information in the event of a failure somewhere in the ETL process.

Enterprise Integration Management


ETL is a subset of a larger framework for managing data known as enterprise integration management
(EIM). Software vendors and database professionals differ on the specific details of the elements that
comprise an EIM solution, but in the SQL Server platform, EIM is generally considered to include:

 ETL capabilities provided by SQL Server Integration Services (SSIS).

 Data cleansing and matching capabilities provided by Data Quality Services (DQS).

 Master data management (MDM) capabilities provided by Master Data Services (MDS).

Using ETL for Application Data Integration


In addition to populating and refreshing the data warehouse, ETL processes can be used to synchronize
data across multiple business applications―for example, to refresh product data in an e-commerce

system from a centrally managed catalog. To accomplish this, you can use SSIS or other synchronization
technologies such as SQL Server replication. When planning a BI solution in environments where data is
transferred between source systems, it is important to understand the lineage of the data and to be aware
of the schedule on which these data transfers occur.

Managing Data Quality


In any system that relies on user input, there is a risk that the quality of the data will be impaired because
of erroneous or duplicate data entry. Although most applications perform a degree of input validation,
there is always a possibility that some data is invalid (for example, a user may enter “New Yrk” instead of
“New York”), inconsistent (for example, one user may enter “CA” and another user may enter “California”),
or duplicated (for example, an existing customer may re-register on an e-commerce site with a different
email address because the user had forgotten his or her password). Data quality technologies, such as
DQS, enable you to automate the identification of invalid or inconsistent column values and duplicate
rows. This makes it possible to cleanse and de-duplicate source data before loading it into a data
warehouse, improving the integrity of the business analysis and reporting provided by your BI solution.
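Before introducing a dedicated tool such as DQS, a quick Transact-SQL check can expose the scale of a data quality problem. The following sketch assumes a hypothetical dbo.Customers source table; it lists inconsistent state values and candidate duplicate registrations, which are the kinds of issues that a cleansing and matching process would then resolve.

-- Quick data quality checks against a hypothetical Customers source table.

-- List distinct state values to spot inconsistent entries ('CA' vs 'California') and typos.
SELECT StateProvince, COUNT(*) AS CustomerCount
FROM dbo.Customers
GROUP BY StateProvince
ORDER BY StateProvince;

-- Candidate duplicates: the same name and postal code registered more than once.
SELECT LastName, FirstName, PostalCode, COUNT(*) AS Registrations
FROM dbo.Customers
GROUP BY LastName, FirstName, PostalCode
HAVING COUNT(*) > 1;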

Master Data Management


In an organization with multiple business applications, it is possible for the same business entity to be
represented in multiple data sources. This presents the challenge of identifying the definitive version of
the entity. For example, an organization might store customer details in a CRM system, an order
processing system, and an e-commerce site profile management system. If the same customer exists in all
three systems, and the address data does not match, it is difficult to establish which of the addresses on
record is correct. By implementing a master data management system, the organization can establish a
definitive master record and use it to ensure the integrity of data across all systems.

Analytical Data Models


Most corporate BI solutions include analytical data
models that provide information workers with a
way to “slice and dice” the measures in the data
warehouse by aggregating them across the
dimensions. Often, these analytical models are
referred to as cubes. Technically, a cube is a
specific organization of measures and dimensions
in a multidimensional data model, but the word
“cube” is commonly used as a generic term to
refer to any data model that enables users to
aggregate measures by business entities.

Benefits of Analytical Data Models


It is possible to create BI solutions that support reporting and analysis directly from tables or views in the
data warehouse. However, in most scenarios, creating a separate analytical data model layer results in the
following benefits:

 The data model abstracts the underlying data warehouse tables, which enables you to create models
that reflect how business users perceive the business entities and measures regardless of the data
warehouse table schema. If necessary, you can modify or expand the underlying data warehouse
without affecting the data model used by business users to perform analysis.

 Because the data model reflects the users’ view of the business, data analysis is easier for information
workers with little or no understanding of database schema design. You can use meaningful names
for tables and fields and define hierarchies based on attributes in dimension tables that make the
data more intuitive for business users.

 You can add custom logic to a data model that adds business value when analyzing the data. For
example, you can define key performance indicators (KPIs) that make it easier to compare actual
business measures with targets.

 Although the SQL Server database engine can provide extremely high query performance, a data
warehouse typically contains a massive volume of data. Because most analysis involves aggregating
measures across multiple dimensions, the processing overhead for complex queries can result in
unacceptable response times―especially when many users access the data concurrently. A data
model typically pre-aggregates the data, which provides vastly superior performance for analytical
queries. (An example of the kind of aggregation query that benefits from this approach is shown after this list.)

 Data models are a common feature in BI solutions, and a number of standards have been established.
By creating a data model, you can expose your analytical data through a standard interface to be
consumed by client applications, such as Microsoft Excel or third-party analytical tools.
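To illustrate the performance point above, the following Transact-SQL sketch shows the kind of analytical
query that a data model typically pre-aggregates. The fact and dimension table names are hypothetical.
Running this query directly against a large data warehouse requires scanning and joining millions of fact
rows every time it executes, whereas an Analysis Services model can answer the same question from
pre-aggregated or in-memory data.

-- Total sales revenue by calendar year and product category.
SELECT d.CalendarYear,
       p.ProductCategory,
       SUM(f.SalesAmount) AS TotalSales
FROM dbo.FactSales AS f
JOIN dbo.DimDate AS d ON f.OrderDateKey = d.DateKey
JOIN dbo.DimProduct AS p ON f.ProductKey = p.ProductKey
GROUP BY d.CalendarYear, p.ProductCategory
ORDER BY d.CalendarYear, p.ProductCategory;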

Types of Analytical Data Models


SQL Server 2012 supports two kinds of analytical models:

 Multidimensional data models. Multidimensional data models have been supported in every
version of SQL Server Analysis Services since the release of SQL Server 7.0. You can use a
multidimensional data model to create an Analysis Services database that contains one or more
cubes, each of which provides aggregations of measures in measure groups across multiple
dimensions.

 Tabular data models. Tabular data models were first introduced with PowerPivot in SQL Server 2008
R2, and they are enhanced in SQL Server 2012. From the point of view of a user performing analysis,
tabular models provide similar functionality to a multidimensional model (in fact, in many cases, the
two models are indistinguishable from one another). For BI developers, tabular models do not require
as much online analytical processing (OLAP) modeling knowledge as multidimensional models,
because they are based on relationships between multiple tables of data.

Note: Multidimensional and tabular models are compared in more detail in Module 5:
Designing Analytical Data Models.

Reporting and Analysis


The primary purpose of a BI solution is to help
organizations track and improve business
performance through reporting and analysis.

Reporting
Reporting is the communication of information
gained from BI. Most organizations rely on reports
to summarize business performance and activities.
Consequently, most BI solutions include a
reporting element that generates these reports.
Typical reports include financial and management
reports (such as cash flow, profit and loss, balance
sheet, open orders, and other accounts-based
reports), and other reports, depending on the nature of the business (for example, a retail business might
require stock inventory reports, whereas a technical support call center might require a report that shows
call log data).

In some scenarios, users might need to view reports interactively in a web browser or custom application;
whereas in others, the reports might be required to be sent as email attachments in specific formats (such
as Excel workbooks or Word documents). In many cases, the reports might need to be printed (for
example, to send a physical report to customers or shareholders). When planning a reporting solution,
you must take into consideration the reports that are required, the audiences for those reports, and how
they will be delivered.

Regardless of the specific reports that are required, or how they will be distributed and consumed, there
are two common approaches to report generation in most BI solutions:

 IT-provided reports. Traditionally, standard business reports are created by a specialist report
developer and automatically generated with current data as requested or on a regular basis. Although
these reports may be developed by a business user with report development skills, they are generally
supported by IT and delivered through the organization’s reporting infrastructure.

 Self-service reporting. As business users have become more technically proficient and report
authoring tools have become easier to use, many organizations supplement standard reports with the
ability for users to create their own reports with no intervention from IT. For self-service reporting to
be effective, some initial work needs to be done to design and implement a suitable reporting
infrastructure; but after that is in place, the users can benefit from the ability to customize the reports
they use without placing an additional burden on the IT department.

Analysis
Analysis is the interpretation of business data delivered by the BI solution. For some business users,
notably business analysts, performing analysis is a discrete activity that involves using specialist analytical
tools to examine data in analytical models. For others, analysis is simply a part of everyday work and takes
the form of using reports or dashboards as a basis for business decision making.
In general, when planning a BI solution, you should consider the following kinds of analytical
requirements:

 Interactive analysis. Some BI solutions must support interactive “slice and dice” analysis in business
tools such as Microsoft Excel or specialist data analysis tools. The resulting information can then be
published as a report.

 Dashboards and scorecards. Commonly, analytical data can be summarized in a dashboard or scorecard
and embedded into business applications or portals, such as SharePoint Server sites. These kinds of
solutions might provide some interactivity to enable users to “drill down” into specific details, or they
may simply show important KPIs.

 Data mining. Most analysis and reporting concerns historical data, but a BI solution can also support
predictive analysis by using that historical data to determine trends and patterns.

Data Sources
You can access data for analysis and generate reports from virtually any data source, but in a BI
solution, reports are commonly based on one of the following data sources:

 Analytical data models. If you have created analytical data models in your BI solution, you can use
them as a source for analysis and reports. This approach enables you to take advantage of the
benefits of data models in your reporting solution as described in the previous topic.

 The data warehouse. You can create analytical reports directly from the data warehouse (or a
departmental data mart). This enables you to express queries in Transact-SQL, which may be more
familiar to a report developer than a data modeling query language (such as MDX or DAX).
Note: Considerations for designing a reporting solution are discussed in more depth later
in this course.
Lesson 2
The Microsoft BI Platform
Microsoft products provide the IT infrastructure for many organizations around the world.
Therefore, it makes sense for many of these organizations to consider using the Microsoft platform for BI
and benefiting from the close integration and common infrastructure capabilities of the various products
that can be used to deliver a BI solution.

As a Microsoft BI professional, you need to know which products can be used to implement the various
elements of a BI solution, and how those products can be integrated to work together.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the role of Windows Server in a BI solution.


 Describe the role of SQL Server in a BI solution.

 Describe the role of SharePoint Server in a BI solution.

 Describe the role of Office applications in a BI solution.


 Identify hardware appliances that can be used in a BI solution.

 Describe considerations for upgrading from previous versions of SQL Server.

Windows Server
Microsoft Windows Server 2012 is the foundation
for a Microsoft-based enterprise solution and
provides a number of core infrastructure services,
including:

 Network communication and management.


 Active Directory–based authentication and
security management.

 Core application services, such as the Microsoft .NET Framework and Internet Information Services (IIS).

 Storage management, including NTFS disk volumes and Storage Spaces.

 Failover Clustering.

 Virtualization.

Windows Server 2012 Editions


Windows Server 2012 is available in the following editions:

 Windows Server 2012 Datacenter. This edition provides all features of Windows Server and is
optimized for highly virtualized environments.

 Windows Server 2012 Standard. This edition provides all features of Windows Server and is
designed for physical or minimally virtualized environments.
 Windows Server 2012 Essentials. This edition is designed for small businesses with up to 25 users and
50 client devices.

 Windows Server 2012 Foundation. This edition is designed for environments with up to 15 users.

Note: Although Windows Server 2012 includes comprehensive infrastructure management tools, large
enterprises might consider also using Microsoft System Center 2012 products to manage enterprise
infrastructure.

SQL Server
Microsoft SQL Server 2012 provides the core data
services for a BI solution. These services include:

 The SQL Server database engine, which is used for application databases, operations databases, and
the data warehouse throughout the BI solution.
 SQL Server Integration Services (SSIS), which is
used as the primary platform for ETL
processes.
 Data Quality Services (DQS), which provides
data cleansing and matching capabilities.

 Master Data Services (MDS), which provides master data management capabilities.

 SQL Server Analysis Services (SSAS), which provides a storage and query processing engine for
multidimensional and tabular data models.

 SQL Server Reporting Services (SSRS), which provides a platform for publishing and delivering reports
that users can consume through a native web-based interface or have delivered by way of
subscriptions.

SQL Server 2012 Editions


SQL Server 2012 is available in the following core editions:

 SQL Server 2012 Enterprise. You should use this edition for data warehouses and BI solutions that
require advanced SSIS features, such as fuzzy logic and change data capture (CDC) components.

 SQL Server 2012 Business Intelligence. You should use this edition for servers hosting SSIS, DQS,
and MDS. You should also use this edition for SSRS and SSAS solutions that require more than 16
processor cores or if you need to support tabular data models, PowerPivot for SharePoint, Power
View for SharePoint, or advanced data mining.

 SQL Server 2012 Standard. You can use this edition for solutions that require basic SSRS reporting,
SSAS multidimensional models, and basic data mining.

Note: SQL Server 2012 is also available in Web and Express editions, but these are generally
not appropriate for BI solutions. A special edition of SQL Server named Parallel Data Warehouse
provides support for massively parallel processing (MPP) data warehouse solutions, but this
edition is only available pre-installed on an enterprise data warehouse appliance from selected
Microsoft hardware partners.
SharePoint Server
Microsoft SharePoint Server 2013 provides
enterprise information sharing services through
collaborative websites. SharePoint Server provides
the following BI capabilities:

 Excel Services. Users can view and interact with Excel workbooks that are shared in a SharePoint
document library through a web browser―including workbooks that use data connections to query data
in a data warehouse or Analysis Services data model.

 PowerPivot for SharePoint. Users can share and interact with Excel workbooks that contain a PowerPivot
tabular data model. This enables business users to create and share their own analytical data models.

 Integration with SSRS. You can deliver and manage reports and data alerts through SharePoint
document libraries instead of the native Report Manager interface provided with SSRS.

 Power View. Power View is an interactive data visualization technology through which users can
graphically explore a tabular data model in a web browser.
 PerformancePoint Services. PerformancePoint Services enables BI developers to create dashboards
and scorecards that deliver KPIs and reports through a SharePoint site.

Office Applications
Microsoft Office 2013 provides productivity
applications that business users can use to
consume and interact with BI data. These
applications include:

 Microsoft Excel. Excel is the most commonly used data analysis tool in the world, and can be used to:

o Import data from a data warehouse and use it to create charts and reports.

o Create interactive PivotTables and PivotCharts from analytical data models in SSAS or PowerPivot for
SharePoint.

o Create PowerPivot workbooks that contain tabular data models without requiring SSAS.

o Create Power View visualizations from tabular models in the workbook or external tabular
models.

 Microsoft Word. Word is a document authoring tool. In a BI scenario, users can export SSRS reports
in Word format and use Word’s editing and reviewing tools to enhance them.

 Microsoft PowerPoint. PowerPoint is a widely used presentation tool. Users can save Power View
visualizations as PowerPoint presentations, and present business data in a dynamic, interactive format.

 Microsoft Visio. Visio is a diagramming tool that can be used to visualize data mining analyses.
Upgrading to SQL Server 2012


If you have an existing BI solution that is based on
a previous version of SQL Server, you should
consider the following general guidelines before
upgrading to SQL Server 2012:

 Decide whether you want to upgrade in-place by using SQL Server setup, or use a side-by-side upgrade
approach in which a new instance of SQL Server is installed and databases, SSIS packages, reports, and
other objects are migrated to it from the existing instance. In-place upgrades are generally simpler to
perform, but side-by-side upgrades enable you to keep the existing instance operational during the
upgrade process and give you the option to revert to the existing instance in the event of a problem.

 Verify that the hardware and edition of Windows Server you plan to use are adequate for SQL Server
2012. SQL Server product documentation (Books Online) includes details of minimum hardware and
software requirements for each SQL Server component.

 Verify that upgrading is possible from your current installation, or plan to upgrade by installing a new
installation of SQL Server and migrating databases, SSIS packages, reports, and other objects. You can
upgrade 32-bit installations of previous versions of SQL Server to SQL Server 2012 on the 32-bit
subsystem (WOW64) of a 64-bit server, and 64-bit installations of previous versions must be
upgraded to SQL Server 2012 64-bit. You can upgrade from the following previous versions of SQL
Server (Books Online includes a table showing specific edition upgrade paths):

o SQL Server 2005 SP4 or later.


o SQL Server 2008 SP2 or later.

o SQL Server 2008 R2 SP1 or later.

o Some editions of SQL Server 2012 can be upgraded to a different edition.


 Verify that the SQL Server features you currently use are supported in SQL Server 2012. Books Online
contains a list of deprecated and discontinued features as well as breaking changes.

 Use Upgrade Advisor to prepare for upgrades. Upgrade Advisor is a tool provided on the SQL Server
2012 installation media that you can use to analyze an existing SQL Server installation and identify
any issues that could potentially prevent a successful upgrade.

Upgrading SQL Server Database Engine Instances


When planning to upgrade from a previous version of the database engine, consider using the Distributed
Replay Utility to run a typical database workload from your current solution on a proof-of-concept SQL
Server 2012 instance, and compare performance and behavior to ensure compatibility.

Upgrading SQL Server Analysis Services


SQL Server 2012 Analysis Services can be installed in one of two modes: multidimensional mode and
tabular mode. Because tabular mode did not exist in previous versions, existing Analysis Services instances
can only be upgraded to SQL Server 2012 in multidimensional mode.

Upgrading SQL Server Integration Services


Consider the following guidelines for upgrading to SQL Server 2012 Integration Services from previous
versions:
 Upgrading SSIS does not replace the previous instance of the SSIS service or tools. However, after
upgrading, you cannot use the old version of the tools to create, manage, or run SSIS packages. If you
have upgraded SSIS and want to use a command line utility such as DTExec.exe, you should enter the
full path to the required version of the utility.

 Upgrading SSIS does not upgrade existing packages to the new format used in SQL Server 2012. You
should upgrade these packages by using the SSIS Package Upgrade Wizard.

 Support for Data Transformation Services (DTS) packages created in SQL Server 2000 has been
discontinued in SQL Server 2012. If you have existing DTS packages, you must migrate them to SQL
Server 2005, 2008, or 2008 R2 Integration Services packages before they can be migrated to the SQL
Server 2012 Integration Services format. If this is not possible, you must recreate your packages in
SQL Server 2012 Integration Services after upgrading.

 Support for ActiveX script components in SSIS packages has been discontinued in SQL Server 2012.
ActiveX scripts in existing packages can be upgraded to Visual Studio Tools for Applications (VSTA) by
using the SSIS Package Upgrade Wizard.

Upgrading Master Data Services


Master Data Services was first introduced in SQL Server 2008 R2, and consists of a SQL Server database
and an IIS web application. There is no direct upgrade path for Master Data Services components, so you
must use the following technique to upgrade to SQL Server 2012:

1. Upgrade the existing SQL Server 2008 R2 database engine instance to SQL Server 2012, or install a
new instance of SQL Server 2012. If you install a new instance, you can move the existing Master Data
Services database to the new instance or continue to host it in the SQL Server 2008 R2 instance.

2. Add the Master Data Services feature to the SQL Server 2012 instance.

3. Use SQL Server 2012 Master Data Services Configuration Manager to upgrade the existing Master
Data Services database to the new version of the schema.

4. Use SQL Server 2012 Master Data Services Configuration Manager to create a new Master Data Services
web application and associate it with the upgraded database.

Upgrading SQL Server Reporting Services


When planning to upgrade to SQL Server 2012 Reporting Services, consider the following guidelines:

 Similar to other components of SQL Server, you can perform an in-place upgrade, or you can install a
new instance and migrate existing reports, data sets, report parts, and data sources by attaching the
existing report server database to the new server.

 Back up the report server encryption key before upgrading, and restore it to the new instance if you
are upgrading by installing a new instance.

 You cannot perform an in-place upgrade that changes the installation mode. For example, you
cannot use SQL Server Setup to upgrade SQL Server 2008 Reporting Services in native mode to SQL
Server 2012 Reporting Services in SharePoint Integrated mode.

 The format used for reports was updated in SQL Server 2008 R2. The compiled versions of reports are
automatically updated the first time they are run on an upgraded report server. The source report
definition language (RDL) files are not upgraded.

 You can perform an in-place upgrade from SQL Server 2008 R2 Reporting Services integrated with a
SharePoint Server 2010 farm to SQL Server 2012 or SQL Server 2012 SP1 Reporting Services with no
downtime.

 You can perform an in-place upgrade from SQL Server 2005 SP4 or 2008 SP2 Reporting Services
integrated with a SharePoint Server 2007 farm to SQL Server 2012 or SQL Server 2012 SP1 Reporting
Services with a SharePoint Server 2010 farm, but downtime is required because both SQL Server and
SharePoint must be upgraded. You should consider performing a new installation of both products
and migrating content and reports.

 SQL Server 2012 SP1 is required for integration with a SharePoint Server 2013 farm. In-place upgrade
from previous versions of SQL Server and SharePoint is not supported, but you can install a new
SharePoint Server 2013 farm, migrate the SharePoint content and configuration databases to the new
farm, install SQL Server 2012 Reporting Services in SharePoint integrated mode, and migrate the
existing reporting services objects.

Upgrading PowerPivot
If you have an existing SQL Server 2008 R2 version of PowerPivot installed in a SharePoint Server 2010
farm, consider the following guidelines for upgrading it:

 You must apply SQL Server 2008 R2 SP1 to the PowerPivot instance of SQL Server Analysis Services
before upgrading.

 You must apply SharePoint Server 2010 SP1 and the SharePoint Server 2010 August 2010 cumulative
update or later to all SharePoint servers in the farm before upgrading.

 Use SQL Server 2012 Setup to upgrade the PowerPivot instance of Analysis Services.

 Use the PowerPivot Configuration Tool or PowerShell cmdlets to upgrade the solutions and websites
in the farm.

 Remove the SQL Server 2008 R2 version of the PowerPivot add-in from all installations of Microsoft
Excel 2007 or Excel 2010 and replace it with the SQL Server 2012 version of the add-in. By default,
Excel 2013 includes the SQL Server 2012 PowerPivot add-in.

SQL Server Appliances


Microsoft has partnered with a number of
hardware vendors to create the following SQL
Server-based appliances that are optimized for
specific BI workloads:

 Enterprise data warehouse appliances. Enterprise data warehouse appliances include SQL Server Parallel
Data Warehouse edition, which uses massively parallel processing (MPP) technologies to support high
performance data warehouses that scale up to 500 TB of data or more.

 Departmental and mid-market data warehouse appliances. For organizations that don’t require the
scalability of an enterprise data warehouse, appliances for smaller data warehouses that need to store up
to 5 TB of data can provide a solution that is fast and simple to deploy.
 Business decision appliances. Installing and configuring an effective BI server that includes SharePoint
and PowerPivot requires significant product expertise and can be a time-consuming task. Business decision
appliances provide an out-of-the-box solution that can be used to provision a self-service BI server easily
and quickly.

When planning a SQL Server-based BI solution, you should consider the following potential benefits of
using appliances:
 Massive scalability of enterprise data warehouses that goes beyond what is possible to achieve with a
symmetric multi-processing (SMP) architecture.

 Rapid time to solution compared to designing and creating a custom server build.

 Pre-tested and optimized hardware and software configurations that are specifically designed for BI
workloads.

 Single-source purchasing and support.

 Additional value-add services, including installation, training, and consulting services.

Note: Microsoft partners with multiple hardware vendors to offer appliance-based BI solutions. For more
information about the appliances from specific vendors, see
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/appliances.aspx.
Lesson 3
Planning a BI Project
A surprisingly high number of BI projects in organizations throughout the world fail.
Often projects are abandoned before completion, fail to deliver all of the originally specified deliverables,
or simply do not deliver a solution that adds value to the business. In many cases, the fundamental cause
of failure is that the project was insufficiently envisioned or that key stakeholders were not included in the
planning.

Careful planning can help to ensure that a BI project runs smoothly with a successful outcome. By
applying some common best practices, you can increase the likelihood that your BI project will not be
added to the long list of BI project failures.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe key features of a BI project.

 Identify common project personnel in a BI project.


 Describe the role of business stakeholders in a BI project.

 Identify infrastructure commonly used in a BI project.

 Plan pilot solutions as part of a BI project.

BI Project Overview
There are numerous frameworks for planning and
managing IT projects, and many organizations
have a policy to use a specific approach when
implementing a new solution. Whichever
approach you use, a BI project must start with the
business requirements and use these to inform the
design of the overall technical architecture, the
data warehouse and ETL, and the reporting and
analysis that the solution will provide.

Business Requirements
The most important thing to consider when
planning a BI project is that the core purpose of
the project is to improve the business. More than any other type of IT project, BI projects are closely and
inseparably bound to business processes and goals. Most IT projects require a deep understanding of
technology, but in a BI project you must also have detailed knowledge of how various business processes
work and interact with one another, and what the commercial aims of the business are.

Understanding the overall structure, processes, and goals of the business makes it easier to gather,
interpret, and prioritize the business requirements for the BI solution. Typically, BI requirements are
fundamentally about being able to quantify core business metrics across various aspects of the business in
order to measure business performance and inform business decisions. For example, a requirement might
be that the solution enables sales managers to see monthly sales revenue by salesperson in order to
reward success and identify employees that need additional support or motivation. Or another
requirement might be to view quarterly order amounts by product line in order to plan more efficient
manufacturing based on demand trends. Only after you have identified the specific business requirements
for your BI solution can you start considering the design of the infrastructure, data warehouse and ETL
solution, and analytical reports.

Technical Architecture and Infrastructure Design


When you have a good understanding of the business requirements, you can start to plan the overall
solution architecture. Based on the business requirements, you can identify the required elements of the
BI solution (as described in the first lesson in this module) and consider the software products you want to
use to implement those elements (as described in the second lesson in this module).

After selecting the technologies you intend to use, you can start to design the infrastructure for the BI
solution, including server hardware and configuration, security, and high availability considerations.

Note: Server hardware and infrastructure considerations are discussed in Module 2: Planning BI
Infrastructure.

Data Warehouse and ETL Design


The business requirements determine the data that the BI solution must include―specifically, the numeric
measures that users need to aggregate (for example, revenue or profit), and the business dimensions
across which they must be aggregated (for example, salesperson or product line). When you have
identified the data that is required to meet the business requirements, you can start to design the data
warehouse in which that data will be stored and the ETL process that will be used to populate and refresh
the data warehouse from the business applications where the data currently resides.

Note: Data warehouse design is discussed in Module 3: Designing a Data Warehouse. ETL
design is discussed in Module 4: Designing an ETL Solution.

Reporting and Analysis Design


In many cases, the business requirements for a BI solution are actually expressed as specifications for
reports or analytical data sets. Business users often describe the reports they want to be able to consume,
the dashboards they want to be able to view, or the PivotTables they want to be able to create. Using the
business requirements to identify the information that users want to get out of the BI solution helps you
engage with the users to refine your understanding of how they want to consume or interact with that
information, which in turn enables you to design an appropriate solution for analysis and reporting.

Note: Considerations for designing analytical data models are discussed in Module 5:
Designing Analytical Data Models. Solutions for delivering analysis and reports are discussed in
Module 6: Planning a BI Delivery Solution, Module 7: Designing a Reporting Services Solution,
Module 8: Designing an Excel-Based Reporting Solution, and Module 9: Planning a SharePoint
Server BI Solution.

Monitoring and Optimization


Performance optimization is a consideration all through the design and implementation of the various
elements of the BI solution, and your project planning should include consideration of how performance
of the overall solution will be monitored and optimized as the volumes of data and users grow. In
particular, you must consider how you will measure performance, what expectations users have for
performance, and how you will identify performance degradation.
Note: Performance monitoring and optimization is discussed in Module 10: Monitoring and
Optimizing a BI Solution.

Operations and Maintenance


When planning a BI solution, it is easy just to focus on the functional requirements of the solution.
However, it is important to also consider operational requirements and factor them into the design.
Operational requirements for a BI solution include a number of ongoing tasks, such as index maintenance
in the data warehouse, backup procedures for all databases and data stores used in the solution,
scheduled automation of ETL tasks, data model processing, logging and auditing, and many others.

Note: Considerations for operations and maintenance are discussed in Module 11: Planning
BI Operations.

Project Infrastructure
It is easy to focus on the infrastructure
requirements of the solution you intend to build
and overlook the infrastructure required to
actually build it. In the same way that a
construction project to build an office building
requires a site office, parking facilities for the
construction crew, and so on, a BI project requires
hardware and software resources for the project
team to use during the development of the
solution.

Project Management Infrastructure


From the beginning, a project requires
infrastructure to enable team members to communicate, collaborate, and document project planning
information. Examples of this kind of infrastructure include:
 Office productivity applications.

 Microsoft Project.

 A SharePoint Server site for the project.

Design and Development Tools


When designing the BI solution, the team will need tools such as:

 Microsoft Visio to support diagrammatic design.


 SQL Server Data Tools to develop data models, reports, and SSIS packages.

 Other Microsoft Visual Studio components to develop custom application components.

 Team Foundation Server (TFS) to provide source control and issue tracking capabilities.

Development and Test Infrastructure


In addition to development tools, the project will require servers on which to develop and test the various
elements of the solution. These servers will require the same software used in the production
solution―for example, SQL Server and SharePoint Server; and the test servers should be configured as
similarly as possible to the envisioned production infrastructure. However, considering the complexity of
some enterprise-scale BI solutions, you may choose to simplify the test environment―for example, by
provisioning a single-server installation of SharePoint Server instead of a multi-server farm, combining
SQL Server components on a single server instead of provisioning dedicated servers, and using standalone
servers instead of failover clusters.

Project Personnel
A BI project involves several roles. These roles
typically include:

 A project manager. Coordinates project tasks and schedules and ensures that the project is completed on
time and within budget.

 A BI solution architect. Has overall responsibility for the technical design of the data warehousing
solution.

 A data modeler. Designs the data warehouse schema and analytical data models.
 A database administrator. Designs the physical architecture and configuration of the data
warehouse database. In addition, database administrators who have responsibility for data sources
that are used in the data warehousing solution must be involved in the project to provide access to
the data sources that the ETL process uses.

 An infrastructure specialist. Implements the server and network infrastructure for the data
warehousing solution.
 An ETL developer. Builds the ETL workflow for the data warehousing solution.

 A report developer. Creates the reporting elements of the BI solution.

 Business users. Provide requirements and help to prioritize the business questions that the data
warehousing solution will answer. Often, the team includes a business analyst as a full-time member
to help interpret the business questions and ensure that the solution design meets the needs of the
users.

 Testers. Verify the business and operational functionality of the solution as it is developed.

Note: The list in this topic is not exhaustive and represents roles that must be performed,
not necessarily individual people. In some cases, multiple roles may be performed by a single
person―though in general, you should avoid having testers validate their own development
work.

In addition to the technical project personnel listed here, the project team should include business
stakeholders from the very beginning of the planning phase. The roles performed by business
stakeholders are discussed in the next topic.
Business Stakeholders
The previous topic described the technical roles
required in a BI project. However, the project
team should also include representatives from key
areas of the business to help ensure that the
solution meets the business requirements and to
help promote user acceptance.

Executive Sponsor
The culture of each organization is unique, but in
almost all businesses, a BI project will face
personality clashes and political obstacles that
must be navigated to create a solution that is in
the best interests of the business as a whole.
Employees tend to focus on their own specific areas of the business, and they can often be resistant to
changes that affect their day-to-day activities or to what they see as external interference in their
responsibilities.
The challenge of obtaining “buy-in” from business users is easier to overcome if the project has an
executive sponsor who has aligned the project goals with the strategic aims of the business and can
champion the project at the highest level of the organization. When the BI project team meets resistance
or contradictory views from business users, the executive sponsor can use his or her influence to resolve
the issue.

Business Users
Although executive sponsorship is essential to drive the project forward, it is important to take into
account the input from business users. A solution that is enforced on users without consultation is unlikely
to gain acceptance, and in most cases it is unlikely that the primarily technical members of the project
team have sufficient knowledge of the business to create a useful solution even if users could be
persuaded to accept it.

Businesses are complex ecosystems in which many processes interact to achieve multiple objectives. In
some organizations, the business processes are formally defined and documented, but even when this is
the case, it is likely that day-to-day activities vary, often significantly, from “official” practices. Generally,
business users have a better insight into how the business processes actually work, what the various data
elements used in those processes actually mean, and how important they are than can be gained by a
technical architect examining existing systems and their documentation.

For example, suppose an existing system for processing sales includes a data field named SZ_Code with
values such as STD-140 and SPC-190. The usage of this field is not listed in the application
documentation, yet you see that it is used in approximately 75 percent of sales orders. Only a business
user who is familiar with the sales order process could tell you that the field represents a size code for
products that are available in multiple sizes, and that the value STD-140 represents a standard size of 140
centimeters, whereas SPC-190 means that the product was ordered in a special-order size of 190
centimeters that had to be custom made.
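With that knowledge from the business user, the ETL design can include an appropriate transformation.
The following Transact-SQL sketch shows one possible interpretation; the staging table name is
hypothetical, and the parsing logic is an assumption based on the two example values described above.

-- Derive a size type and a size in centimeters from the SZ_Code field.
SELECT SalesOrderID,
       SZ_Code,
       CASE LEFT(SZ_Code, 3)
           WHEN 'STD' THEN 'Standard size'
           WHEN 'SPC' THEN 'Special-order size'
           ELSE 'Unknown'
       END AS SizeType,
       TRY_CONVERT(int, SUBSTRING(SZ_Code, 5, 10)) AS SizeCentimeters
FROM Staging.SalesOrder
WHERE SZ_Code IS NOT NULL;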

Data Stewards
Some information workers have particularly detailed knowledge of the business processes and data in a
specific area of the business. By formally including these people in the BI project team, you can have them
adopt the role of data steward (sometimes referred to as data governor or data curator) for the data
elements used in their area of the business. A data steward can provide valuable services to the project,
including:
 Representing the interests of a specific business area while the BI solution is planned. For example,
ensuring that all of the data elements that are important to that business area are included in the
data warehouse design or that the reports required by that business area are considered.

 Validating and interpreting data values in the source systems that will be used to populate the data
warehouse, and helping to identify the appropriate transformations and exceptions that will need to
be implemented.

 Taking ongoing responsibility for maintaining a Data Quality Services knowledge base for the
business area, so that data values can be cleansed and matched effectively.

 Taking ongoing responsibility for maintaining relevant business entities in a Master Data Services
model to ensure consistency of data definitions across the organization.

Project Scope
From the very beginning of a project, it is
important to prioritize the business requirements
in terms of their value to the business, and the
feasibility of meeting them given specific
constraints, such as available data, budget, and
project deadlines. This enables you to scope the
project in a way that maximizes the chances of it
successfully delivering value to the business.

Initial Scoping
After the initial requirements gathering is
accomplished, the project team and business
stakeholders must negotiate the importance or
value of the requirements. At this stage, you may be able to judge the feasibility of meeting some
objectives, but others will require further investigation to identify suitable source data or to estimate the
effort required.

You can use a matrix to record the relative value and feasibility of each requirement as they are agreed by
the team members. It is likely that there will be some disagreements about the importance of some
objectives, and feasibility may not be easy to assess. In these cases, you should make a note of the issues
and move on. At this stage, it is important to get a comprehensive view of the potential project
scope―further iterations of the design process will gradually resolve prioritization conflicts and help
clarify feasibility.

Refining the Scope


After the initial scoping discussion, the feasibility of the requirements that were identified can be
investigated. Typically this investigation involves:

 Using the techniques for auditing data sources discussed in the first lesson of this module to
determine whether sufficient data is available and accessible to meet the requirements.
 Estimating the development effort and skills required for each of the requirements.

As the investigations reveal more information, the team should meet to refine the matrix created during
the initial scoping exercise.
Identifying a Pilot Scope


When the scope is considered to be well defined, the team should examine the requirements in the high
value, high feasibility quadrant of the matrix and further prioritize them to determine a set of
requirements that can be addressed in a pilot or proof-of-concept solution.

Using a pilot project enables you to reduce the time it takes for the BI project to add value to the
business. By prioritizing the requirements based on their value and feasibility, you can quickly
demonstrate the effectiveness of the BI initiative without losing the momentum the project has built up
during the initial scoping phase. In most cases, the pilot focuses on a related set of requirements, often in
a specific, high-profile area of the business. However, because you have used the scoping phase to
consider all requirements, you can design the pilot with extensibility in mind, ensuring that the design of
the pilot will support the addition of the other highly important business requirements at a later stage.

After scoping the pilot, you can start designing the solution. However, you must make sure that the
project team carefully considers the following questions:

 How will the pilot incorporate user acceptance testing (UAT)? Instead of delivering the solution to all
users in the affected area of the business, you may want to enroll a subset of users in the pilot
program with a particular focus on providing feedback on the usability and usefulness of the solution.
Often, these users can provide valuable feedback that results in improvements to the design of
reports, data models, dashboards, SharePoint document library structures, and other user-visible
aspects of the solution.

 How will you measure the success of the pilot? Other than qualitative measures based on feedback
from users, you should consider quantitative goals for the pilot. The criteria for success should
ultimately be aligned with the business goals, so you need to be able to measure the effects of the
solution in terms of revenue growth, increased profitability, reduced costs, increased customer
satisfaction survey scores, or whatever quantifiable goal the BI solution is intended to help the
business achieve. Therefore, you should determine a realistic time interval over which the success of
the project should be assessed.
Lab: Planning a BI Solution


Scenario
Adventure Works Cycles is a multinational corporation that manufactures and sells bicycles and cycling
accessories. The company sells its products through an international network of resellers, and in recent
years has developed a direct sales channel through an e-commerce website.

The company is financially sound and has a strong order book; however, sales volumes have remained
relatively static for the past few years. Senior management is under pressure from shareholders to develop
a strategy for growth that will drive increased revenue and profit. Management believes that a key factor
in their growth strategy is investment in technology that improves collaboration between the various
divisions of the company, and enables them to track and share key business performance metrics.

Objectives
After completing this lab, you will be able to:
 Identify and prioritize business requirements.

 Determine software for a BI solution.

Estimated Time: 45 Minutes


Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Scoping a BI Solution


Scenario
Adventure Works employees have identified a number of business areas that could benefit from greater
access to key business information through analytics and reporting. To meet these needs, the senior
management at Adventure Works has approved the scoping of an enterprise BI initiative to provide
decision makers with the information they need to make the business more efficient.
As a BI consultant, you have been asked to review the business requirements and assist in the scoping and
design of a pilot BI solution.

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Review Interview Transcripts.

3. Prioritize Business Requirements.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab01\Starter folder as Administrator.

 Task 2: Review Interview Transcripts


 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab01\Starter
folder.
 Task 3: Prioritize Business Requirements


 Collaborate with two or three other students.

 Use Requirements Matrix.docx in the D:\Labfiles\Lab01\Starter folder as a framework to identify and
prioritize business requirements based on the interviews.

Results: At the end of this exercise, you should have created a matrix that shows the relative value and
feasibility of the business requirements for the BI solution.

Exercise 2: Determining Software Requirements


Scenario
Having performed the initial requirements gathering and scoping exercises, you must now start
determining the required elements of the BI solution and suggest software that can be used to implement
them.
The main tasks for this exercise are as follows:

1. Suggest Software for the BI Solution

 Task 1: Suggest Software for the BI Solution


 Still working as a small group, discuss the business requirements you have identified and suggest
Microsoft software components that could be used to meet them.

 Document your software suggestions and the rationale for your choices in Software
Requirements.docx in the D:\Labfiles\Lab01\Starter folder.

Results: At the end of this exercise, you should have a list of suggested software components for the BI
solution.

Question: If you had been able to conduct discussions with real business stakeholders instead
of reading paper-based interview transcripts, how would you have elicited clearer, more specific
business requirements?

Question: In a real project, how would you have determined the feasibility of the requirements
you captured in the requirements matrix?
Module Review and Takeaways


Module Review
In this module, you have learned about the common elements of a BI solution and how Microsoft
software products can help you implement them. You have also learned about the key roles and scoping
tasks in a BI project.

Question: What challenges have you experienced when gathering, interpreting, and prioritizing
business requirements in previous projects? How did you overcome them? What tips can you
share with your fellow students?
Module 2
Planning SQL Server Business Intelligence Infrastructure
Contents:
Module Overview 2-1

Lesson 1: Considerations for BI Infrastructure 2-2

Lesson 2: Planning Data Warehouse Hardware 2-10

Lab: Planning SQL Server Business Intelligence Infrastructure 2-16

Module Review and Takeaways 2-19

Module Overview
The server and hardware infrastructure for a BI solution is a key consideration in any BI project. You must
balance the performance and scalability gains you can achieve by maximizing hardware specifications and
distributing the elements of your BI solution across multiple servers against hardware and software
licensing costs, and implementation complexity.

This module discusses considerations for selecting server hardware and distributing SQL Server services
across servers.

Note: This module focuses on SQL Server components. Considerations for including
SharePoint Server in a BI solution are discussed in Module 9: Planning a SharePoint Server BI
Solution.

Objectives
After completing this module, you will be able to:

 Describe key considerations for BI infrastructure.

 Plan data warehouse infrastructure.

 Plan ETL infrastructure.

 Plan Analysis Services infrastructure.

 Plan Reporting Services infrastructure.


Lesson 1
Considerations for BI Infrastructure
Planning server infrastructure for a SQL Server-based BI solution requires an understanding of how the
various SQL Server components work together, and how typical workloads for SQL Server components use
hardware resources.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe where infrastructure planning fits into the BI project framework.

 Describe considerations for characterizing the size of a BI solution.

 Describe key features of the workloads associated with SQL Server components.

 Describe high-level topology designs for BI solutions.


 Describe scale-out options for services in a BI solution.

 Describe high availability options for services in a BI solution.

Infrastructure Planning in a BI Project


Infrastructure planning is one of the core pillars of
a BI project. In common with all aspects of
designing a BI solution, the technical architecture
and infrastructure design depends on the business
requirements that the solution is intended to
address.

Planning a SQL Server-based BI infrastructure involves the following considerations:

 Assigning SQL Server components to physical servers.

 Specifying required hardware resources for the servers, storage, and network infrastructure.

 Selecting and designing appropriate scale-out infrastructure solutions.

 Selecting and designing appropriate high-availability infrastructure solutions.


System Sizing Considerations


Each BI solution has its own unique characteristics
that depend on the business requirements it is
designed to address. There is no such thing as a
“standard” BI solution. However, when initially
planning the infrastructure for a BI solution, it can
be useful to start by classifying the type of
solution that must be implemented into one of
three generic sizes: small, medium, or large. The
factors that determine the size of a BI solution are:

 Data volume. This is the amount of data that the data warehouse must store, and the size and frequency
of incremental loads of new data. The primary consideration is the number of rows in fact tables, but you
must also allow for dimension data, indexes, and data models that are stored on disk. (A query that you
can use to measure the size of an existing database is shown after this list.)

 Analysis and Reporting Complexity. This includes the number, complexity, and predictability of the
queries that will be used to analyze the data or produce reports. Typically, BI solutions must support a
mix of the following types of query:

o Simple. Relatively straightforward SELECT statements.


o Medium. Repeatedly executed queries that include aggregations or many joins.

o Complex. Unpredictable queries with complex aggregations, joins, and calculations.

 Number of Users. This is the total number of information workers who will access the system, and
how many of them will do so concurrently.

 Availability Requirements. These include when the system needs to be available, and how much planned
or unplanned downtime the business can tolerate.
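When characterizing data volume, it can help to measure the size of existing databases, such as a current
reporting database or a proof-of-concept data warehouse. The following Transact-SQL sketch uses the
sys.dm_db_partition_stats dynamic management view to report approximate row counts and reserved space
per table; it assumes that you run the query in the database you want to measure.

-- Row counts and reserved space (in MB) per table. Pages are 8 KB each.
SELECT t.name AS TableName,
       SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS TotalRows,
       SUM(ps.reserved_page_count) * 8 / 1024 AS ReservedMB   -- includes index pages
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables AS t ON ps.object_id = t.object_id
GROUP BY t.name
ORDER BY ReservedMB DESC;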
Although it is difficult to be precise when categorizing a solution, the following table suggests some
typical examples of the characteristics of small, medium, and large BI systems.

                                    Small                    Medium                        Large
Data Volume                         100s of GBs to 1 TB      1 to 10 TB                    10 TB to 100s of TBs
Analysis and Reporting Complexity   Over 50% simple          50% simple                    30-35% simple
                                    30% medium               30-35% medium                 40% medium
                                    Less than 10% complex    10-15% complex                20-25% complex
Number of Users                     100 total                1,000 total                   1,000s of concurrent users
                                    10 to 20 concurrent      100 to 200 concurrent
Availability Requirements           Business hours           1 hour of downtime per night  24x7 operations
BI Workloads
In addition to the size categorization of the BI
solution your infrastructure needs to support, it is
useful to understand the types of workload that
typically occur in a BI solution. Each workload uses
hardware resources, and it is important to assess
the total impact of all workloads on resource
utilization and identify potential for contention
between workloads with similar resource
requirements.

Extract, Transform, and Load (ETL) Workloads
The ETL subsystem of the BI solution performs the following workloads:

 Control flow tasks. SQL Server Integration Services (SSIS) packages often include control flow tasks
that require CPU processing time, memory, disk I/O, and network I/O. The specific resource
requirements for control flow tasks can vary significantly, so if your ETL solution includes a substantial
number of control flow tasks, you should monitor resource utilization on a test system to better
understand the workload profile.
 Data query and insert. Fundamentally, ETL involves querying data sources and inserting and
updating data into staging and data warehouse tables. This incurs I/O and query processing on data
sources, the staging databases, and the data warehouse―especially if data loads require rebuilding
indexes or partition management. (A simple example of this kind of load statement is shown after this list.)

 Network data transfer. Typically, ETL processes transfer large volumes of data from one server to
another. This incurs network I/O and can require significant network bandwidth.
 In-memory data pipeline. Data flow tasks in an SSIS package use an in-memory pipeline
architecture to process transformations. This places a demand on system memory resources. On
systems where there is contention between the SQL Server database engine and SSIS for memory
resources, data flow buffers might need to spill to disk, which reduces data flow performance and
incurs disk I/O.

 SSIS catalog or MSDB database activity. SSIS packages deployed in project mode are stored in an
SSIS Catalog database. Alternatively, packages deployed in package mode are stored either in the
MSDB database or on the file system. Whichever deployment model is used, the ETL process must
access the package storage to load packages and their configuration. If the SSIS catalog or MSDB
database is used, and it is located on a SQL Server instance that hosts other databases used in the BI
solution (such as the data warehouse, staging database, or Report Server catalog), there may be
contention for database server resources.
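As a minimal sketch of the data query and insert workload, the following Transact-SQL statement performs
an incremental load from a staging table into a dimension table. The table and column names are
hypothetical; in practice, a statement like this is typically executed from an SSIS package, and large loads
often also involve index rebuilds or partition switching.

-- Insert new customers and update changed ones from the staging table.
MERGE INTO dbo.DimCustomer AS tgt
USING Staging.Customer AS src
    ON tgt.CustomerAlternateKey = src.CustomerID
WHEN MATCHED AND (tgt.City <> src.City OR tgt.EmailAddress <> src.EmailAddress) THEN
    UPDATE SET tgt.City = src.City,
               tgt.EmailAddress = src.EmailAddress
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerAlternateKey, FirstName, LastName, City, EmailAddress)
    VALUES (src.CustomerID, src.FirstName, src.LastName, src.City, src.EmailAddress);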

Data Model Workloads


The SQL Server Analysis Services (SSAS) data models in a BI system require the following workloads:

 Processing. Data models contain aggregated values for the measures in the data warehouse, and
they must be processed to load the required data from the data warehouse tables into the model and
perform the necessary aggregation calculations. When data in the data warehouse is refreshed, the
data models must be partially or fully processed again to reflect the new and updated data in the
data warehouse.

 Aggregation storage. Data models must store the aggregated data in a structure that is optimized
for analytical queries. Typically, multidimensional models are stored on disk with some data cached in
memory for performance reasons. Tabular models are usually stored completely in memory;
therefore, they require sufficient memory resources.

 Query execution. When users perform analytical queries against a data model, the model must
process the query and generate results. This requires CPU processing time, memory, and potentially
disk I/O.

Reporting Workloads
Although some reporting can be performed by client applications directly against the data warehouse or
data models, many BI solutions include SQL Server Reporting Services (SSRS). An SSRS reporting solution
typically involves the following workloads:

 Handling client requests. Clients submit requests to Reporting Services over HTTP, so the report
server must listen for and process HTTP requests.

 Data source queries. Reports are based on datasets that must be retrieved by querying data sources.
Typically, the data sources for the reports in a BI solution are SSAS data models or the data
warehouse, so report execution incurs query processing overheads in those sources.
 Report rendering. After the data for a report has been retrieved, SSRS must render the report data
into the required format by using the report definition (RDL) and the specified rendering extension.
Depending on the size, format, and complexity of the report, report rendering can consume
substantial CPU and memory resources (the example query after this list shows how to measure
rendering time on an existing report server).

 Caching. To reduce query processing and rendering overheads, SSRS can cache datasets and reports
in the report server temporary database. Datasets and reports can be cached on first use or you can
use scheduled jobs to pre-cache objects at a regular interval.

 Snapshot execution. In addition to caching reports, you can create persisted report snapshots at a
scheduled interval and store them in the report server database.

 Subscription processing. You can configure SSRS to execute and deliver reports to file shares or
email addresses on a scheduled basis.

 Report Server catalog I/O. Report definitions and resources, such as images, are stored in the report
server catalog; and database I/O tasks are required to retrieve these when needed. Additionally,
database I/O is required to retrieve cached reports and datasets from the temporary database, and to
retrieve report snapshots from the catalog database. This database access may compete for resources
with other databases hosted in the same SQL Server instance.
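To quantify these workloads on an existing report server, you can query the report server execution log.
The following query is a minimal sketch that assumes a native-mode report server catalog named
ReportServer and uses its ExecutionLog3 view to compare the time spent retrieving data, processing, and
rendering each report over the last 30 days.

USE ReportServer;

-- Summarize where report execution time is spent, by report.
SELECT
    ItemPath,
    COUNT(*) AS Executions,
    AVG(TimeDataRetrieval) AS AvgDataRetrievalMs,
    AVG(TimeProcessing) AS AvgProcessingMs,
    AVG(TimeRendering) AS AvgRenderingMs
FROM dbo.ExecutionLog3
WHERE TimeStart >= DATEADD(DAY, -30, GETDATE())
GROUP BY ItemPath
ORDER BY Executions DESC;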

Operations and Maintenance


In addition to the fundamental BI activity, a BI solution must support ongoing system operations and
maintenance activity, such as:

 Operating system activity.

 Logging activity.

 SQL Server Agent jobs, including:

o Execution of SSIS packages for ETL processes.

o Index and statistics maintenance in databases, including the data warehouse.

o Database backup tasks.


Typical Server Topologies for a BI Solution


SQL Server provides a versatile service architecture
that enables you to choose a server topology that
best fits your needs. Selecting the right topology
for your solution usually involves balancing cost
and complexity against the need for high
scalability and flexibility.

Single Server BI Architecture


The simplest architecture for a SQL Server–based
BI solution is to install all elements of the solution
on a single server. Depending on the business
requirements of the BI solution, the components
on the server typically include:
 SQL Server Database Engine. Used to store the data warehouse, staging database, Reporting
Services databases, and SSIS Catalog database. Additionally, the SQL Server Agent may be used to
automate SSIS package execution and other operations by creating jobs and schedules, which are
stored in the MSDB system database.

 SQL Server Integration Services. Used to execute packages that encapsulate ETL tasks and data
flows to extract data from source systems into the staging database, and then load it into the data
warehouse.

 SQL Server Analysis Services. Used to provide analytical data models and data mining functionality.
Depending on business requirements, two instances of Analysis Services may be required: one for
multidimensional models and data mining, the other for tabular models.

 SQL Server Reporting Services. Used to provide report publishing and execution services. Reporting
Services in native mode consists of a web service application, a web-based management user
interface, and two SQL Server databases.

Depending on business requirements, you may also choose to install SQL Server Data Quality Services on
the server to support data cleansing and matching capabilities for staged data before it is loaded into the
data warehouse.

Note: If SharePoint Server is required, you can deploy the SharePoint farm and SQL Server
integration components for Reporting Services and PowerPivot on the BI server. This architecture
is not recommended for BI solutions that require significant scalability or performance.
SharePoint Server topologies for BI are discussed in Module 9: Planning a SharePoint Server BI
Solution.

A single server architecture is suitable for test and development environments, and can be used in
production environments that have minimal data volumes and scalability requirements.

Distributed BI Architecture
If your BI solution requires even moderate levels of scalability, it will benefit from expanding beyond a
single server architecture and distributing the BI workload across multiple servers. Typical approaches to
distributing SQL Server components in a BI solution include:

 Creating a dedicated report server. In many BI solutions, Analysis Services provides a data model
that contains most (or even all) of the data in the data warehouse, and all reporting is performed
against the data model. In scenarios like this, there is little database activity in the data warehouse
other than during ETL loading and data model processing (loading data from the data warehouse
tables into the model and aggregating it). The workloads on the server that compete for resources
most of the time are Analysis Services and Reporting Services; therefore, you can increase scalability
and performance by moving the reporting workload to a separate server. You can install the
Reporting Services database on either server. Leaving it on the data warehouse server keeps all of the
database engine elements on a single server but can result in I/O workloads competing for disk resources.
Installing the Reporting Services database on the report server requires the database engine to be
installed on both servers but results in a cleaner separation of workloads. In extremely large enterprise
solutions with intensive reporting requirements, you can add a third server as a dedicated host for the
Reporting Services database.

 Separating data warehouse and ETL workloads from analytical and reporting workloads. If the
data warehouse will be refreshed with new data frequently, or if it must support direct query access in
addition to processing data models, the database I/O activity might compete with Analysis Services
for disk, CPU, and memory resources. To prevent this, you can deploy Analysis Services on a separate
server. Depending on analytical and reporting workloads, you might choose to co-locate Analysis
Services and Reporting Services on the same server, or use a dedicated server for each service. If you
need to support tabular models and multidimensional or data mining models, and a single Analysis
Services server is not adequate to support both workloads, you could consider using separate servers
for the different types of Analysis Services model.

 Using a dedicated ETL server. If your ETL process requires frequent data extractions and loads, or
involves particularly resource-intensive transformations, overall performance might benefit from
moving Integration Services and the SSIS Catalog database to a dedicated server. Depending on the
specific transformation and load operations that your ETL process requires, you can choose to locate
the staging database on the ETL server or the data warehouse server, or use a two-phase staging
approach whereby extracted data is staged on the ETL server for transformation and cleansing, and
then loaded into staging tables on the data warehouse server before being inserted into the data
warehouse tables.
When designing a distributed architecture, the key goal is to eliminate contention for hardware resources.
Therefore, you gain the greatest benefits by identifying workloads with similar hardware utilization
profiles and separating them.

Scaling Out a BI Solution

A distributed architecture increases scalability by separating the discrete workloads in a BI solution
across multiple servers. To support the highest
possible level of scalability, you can use a scale-
out architecture to share the same workload
across multiple servers.
You can use a scale-out architecture for the
following components:

 SQL Server Reporting Services. Install the Reporting Services database on a single
database server, and then install the
Reporting Services report server service on multiple servers that all connect to the same Reporting
Services database. This approach separates the report execution and rendering workloads from the
database workloads required to store and retrieve report definitions, cached datasets and reports, and
snapshots.
 SQL Server Analysis Services. Create a read-only copy of a multidimensional database and connect
to it from multiple Analysis Services query servers. To accomplish this, an SSAS server processes the
cubes in the database. The database is then detached, copied to a standby location, and reattached
so that it can be used by applications that need write-back capabilities. The copied database is then
attached in read-only mode to multiple SSAS instances, which will provide query services to clients.
Client requests can be distributed across the query servers by using a load-balancing middle-tier, or
you can assign specific subsets of the client population to specific query servers.

 The data warehouse. You can scale out an extremely large data warehouse in several ways; typically,
this is done by partitioning the data across multiple database servers and using middle-tier logic to
direct queries to the appropriate server instance. SQL Server Parallel Data Warehouse edition, which is
provided in pre-configured enterprise data warehouse appliances, uses a massively parallel processing
(MPP) shared nothing architecture to scale out data warehouse queries across multiple independent
compute and storage nodes.

 SQL Server Integration Services. Although it is not a pure scale-out technique, you can use multiple
SSIS servers to each perform a subset of the ETL processes in parallel. This approach requires
extensive custom logic to ensure that all tasks are completed, and it should be considered only if your
ETL requirements cannot be met through a scale-up approach in which you add hardware resources
to a single SSIS server.

Additional Reading: For more information about designing a scale-out solution for
Reporting Services, review the content and links in the SQLCAT Reporting Services Scale-Out
Architecture technical notes at
http://sqlcat.com/sqlcat/b/technicalnotes/archive/2008/06/05/reporting-services-scale-out-
architecture.aspx. For more information about using read-only databases to implement a scale-
out solution for Analysis Services, see Scale-Out Querying for Analysis Services with Read-Only
Databases at http://msdn.microsoft.com/en-us/library/ff795582(v=SQL.100).aspx.

Planning for High Availability


If one or more aspects of the BI solution require
high availability, you can design a solution that
uses redundant servers with failover capabilities.
SQL Server 2012 includes a number of high
availability technologies that you can use to
protect critical services. These technologies
include:

 AlwaysOn Failover Clustering. Server-level protection that uses Windows Server cluster
services to create a virtual server with multiple
redundant failover nodes.

 AlwaysOn Availability Groups. Database-level protection that combines Windows Server cluster services with synchronization of database
transactions across multiple SQL Server instances.

 Database Mirroring. Database-level protection that synchronizes a database across two SQL Server
instances. Note that Database Mirroring is deprecated in this release of SQL Server and new solutions
should use AlwaysOn Availability Groups instead.
 Log Shipping. Database-level protection that copies transaction logs from a primary server to a
secondary server, where they can be applied to a secondary copy of the database.

Planning High Availability for the Data Warehouse


The data warehouse is fundamentally a SQL Server database; theoretically, it can be protected by using
any of the high availability technologies provided by SQL Server. However, considering the large volume
of data typical of data warehouses and the fact that most data warehouse activity consists of data reads or
non-logged bulk load operations, AlwaysOn Failover Clustering is the most appropriate technology to
consider.

Additionally, considering the importance of the data warehouse data, the database files should be stored
on a redundant array of independent disks (RAID) array that provides protection in the case of a disk
failure.

Planning High Availability for Analysis Services


Analysis Services supports AlwaysOn Failover Clustering, so you should use this technology if you require
high availability for an Analysis Services instance. If Analysis Services is to be installed on the same server
as the database engine instance for the data warehouse, both components can be installed in a single
cluster installation.

Planning High Availability for Integration Services


Integration Services is not cluster-aware, and it does not support failover from one cluster node to
another. When SSIS packages are deployed in project mode, you can use an AlwaysOn Availability Group
to protect the SSIS Catalog database. However, there are some additional considerations for using
AlwaysOn Availability groups with the SSIS catalog:

 On failover, you must open the SSISDB database master key and re-encrypt it with the service master
key of the new primary server before any packages that include encrypted sensitive data can be
executed (see the example after this list).

 Your SSIS packages must be able to recover the ETL process as a whole to a consistent state if
unplanned failover occurs. This means that you must include cleanup and failover logic in your
packages.
 If you must apply a patch or update that modifies the SSIS catalog schema, you must remove the SSIS
Catalog database from the availability group, patch it, and then re-establish the availability group.
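For example, after a failover you typically need to open the SSISDB database master key on the new
primary replica and re-encrypt it by using that server's service master key before packages that contain
encrypted sensitive data can run. The following statements are a minimal sketch; the password shown is a
placeholder for the password that was specified when the SSIS catalog was created.

USE SSISDB;

-- Open the database master key by using the password that was supplied
-- when the SSIS catalog was created (placeholder shown here).
OPEN MASTER KEY DECRYPTION BY PASSWORD = N'<catalog_password>';

-- Re-encrypt the database master key with the service master key of the
-- new primary replica so that it can be opened automatically.
ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY;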

Additional Reading: For more information about using AlwaysOn Availability Groups with
the SSIS catalog, see SSIS with AlwaysOn at
http://blogs.msdn.com/b/mattm/archive/2012/09/19/ssis-with-alwayson.aspx.

Planning High Availability for Reporting Services


As described earlier in this lesson, Reporting Services can be distributed across a multi-tier architecture in
which a report server processes requests and executes reports, and a separate database tier hosts the
report server catalog and temporary database. In this configuration, you can use network load balancing
(NLB) to distribute requests across report servers, and remove failed report servers from the NLB cluster to
maintain a highly available solution for managing report requests. To protect the database tier, you can
use either AlwaysOn Failover Clustering or an AlwaysOn Availability Group.
Lesson 2
Planning Data Warehouse Hardware
The data warehouse is the foundation for a BI solution, and there are a number of recommendations from
Microsoft and its hardware partners that you should consider when planning a data warehouse system.
Data warehousing is substantially different from other database workloads, and the conventional database
design approach for optimizing hardware to support the highest possible number of I/O operations per
second (IOPS) is not always appropriate for a data warehouse.

This lesson introduces Microsoft SQL Server Fast Track Data Warehouse reference architectures, and it
goes on to explain how some of the Fast Track design principles can be applied when planning data
warehouse hardware.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the key features of SQL Server Fast Track Data Warehouse reference architectures.

 Explain how a core-balanced system architecture supports data warehousing workloads.


 Determine processor and memory requirements for a data warehouse.

 Determine storage requirements for a data warehouse.

 Describe considerations for choosing a storage solution.

SQL Server Fast Track Reference Architectures


Data warehouse design is a specialist skill that
many organizations may not have in-house, so the
BI team must work with consultants and hardware
vendors to determine the required hardware and
its configuration. This adds cost and time to the
project, and it does not always guarantee that the
resulting infrastructure is appropriate for the data
warehousing workload.

To help organizations overcome these challenges, Microsoft has partnered with multiple hardware
vendors to create Fast Track Data Warehouse
reference architectures that reduce the time and
effort it takes to specify a data warehouse system. These reference architectures, together with hardware
appliances that can be purchased as out-of-the-box solutions, make it easier and quicker to implement an
effective data warehousing solution.

Pre-Tested Specifications and Guidance


Fast Track Data Warehouse reference architectures are designed by Microsoft SQL Server specialists and
consultants from Microsoft’s hardware partners; the reference architectures include specific hardware
components that can be combined in a prescribed configuration to create a SQL Server–based data
warehouse system. The architectures have been extensively tested with real-world data warehouse
workloads based on thousands of customer case studies, and they are certified to deliver specific
performance levels based on the size of your data warehouse solution.
Multi-Vendor Support
Microsoft has partnered with multiple hardware vendors to create pre-tested system designs that use
commodity hardware. If your organization has an existing supplier relationship with one of the Fast Track
hardware vendors, you can easily specify a system based on components from that supplier and use the
support and consulting services offered by the hardware vendor to create a data warehouse system that is
based on a proven design.

Support for a Range of Data Warehouse Sizes


The Fast Track program is not a “one size fits all” solution. It includes multiple reference architectures for
small to large data warehouse solutions, with flexibility within each design to support your specific
requirements. You can evaluate the size of your required solution by using the principles described in the
previous lesson, and then identify the reference architecture that best suits your needs.

System Specification Tools


Microsoft provides a generic Excel workbook that you can use to determine the Fast Track configuration
you require. Additionally, many of the Fast Track hardware vendors offer their own tools that make it easy
to create a system specification that meets your needs and includes the right mix of hardware
components for your solution.

Additional Reading: For more information about Fast Track Data Warehouse reference
architectures, see An Introduction to Fast Track Data Warehouse Architectures at
http://msdn.microsoft.com/en-us/library/dd459146(v=SQL.100).aspx and Fast Track Data
Warehouse Reference Guide for SQL Server 2012 at http://msdn.microsoft.com/en-
us/library/hh918452.aspx.

Core-Balanced System Architecture


SQL Server Fast Track Data Warehouse reference
architectures are based on a core-balanced system
design that reflects the performance
characteristics of a typical data warehouse
workload. Even if you don’t plan on using one of
the published Fast Track Data Warehouse
reference architectures, you will benefit from
understanding the principles on which they are
designed and applying them to your own system
specification.

The core-balanced system architecture is based on the fact that most data warehouse workloads
need to transfer large amounts of data, usually accessed by sequential read operations, across multiple
system components from where it is stored to the applications that request it. Each component through
which the data is transferred is a potential bottleneck that will limit the overall performance of the
system―the data can only flow to the requesting application at the rate of the slowest component. Any
components that can operate at a higher rate are underutilized, which unbalances the system and can
represent significant wasted cost.

The diagram on the slide shows a balanced system in which the I/O rates of each component in the
system are reasonably similar. This diagram represents a large-scale data warehousing system in which
data is stored in a storage area network with fiber channel connectivity and multiple storage enclosures,
each containing multiple disk arrays. However, the same principles apply to a smaller architecture.
The I/O rate of hardware components, such as hard disks, array storage processors, and fiber channel host
bus adapters (HBAs) can be established through careful testing and monitoring by using tools like SQLIO,
and many manufacturers (particularly those who participate in the Fast Track program) publish the
maximum rates. However, the initial figure that you need to start designing a data warehouse system is
the maximum consumption rate (MCR) of a single processor core combined with the SQL Server database
engine. After the MCR for the CPU core architecture you intend to use has been established, you can
determine the number of processors required to support your workload and the storage architecture
required to balance the system.

Note that MCR is specific to a combination of a CPU and motherboard, and SQL Server; it is not a
measure of pure processing speed or an indication of the performance you can expect for all queries in
your solution; instead, it is a system-specific benchmark measure of maximum throughput per-core for a
data warehouse query workload. Calculating MCR requires executing a query that can be satisfied from
cache while limiting execution to a single core, and reviewing the execution statistics to determine the
number of megabytes of data processed per second.
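The following query illustrates the general approach. It is a simplified sketch rather than a full Fast Track
benchmark procedure, and the table name is hypothetical; the MAXDOP hint limits execution to a single
core, and the statistics settings return the I/O and timing figures used in the MCR calculation described in
the demonstration that follows.

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the query twice so that the second execution is satisfied from cache
-- and the reported statistics exclude physical disk reads.
SELECT SUM(SalesAmount) AS TotalSales
FROM dbo.FactSalesBenchmark    -- hypothetical benchmark table
OPTION (MAXDOP 1);             -- limit execution to a single core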

Demonstration: Calculating Maximum Consumption Rate


This demonstration shows how to use a benchmark query to retrieve I/O statistics that can be used to
calculate the maximum consumption rate (MCR) of a system.

Demonstration Steps
Create tables for benchmark queries
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the SQL Server Management Studio icon.


3. When prompted, connect to the database engine on the localhost instance by using Windows
authentication.

4. On the File menu, point to Open, and then click File. Browse to the D:\Demofiles\Mod02\Starter
folder, select Create Benchmark DB.sql, and then click Open.

5. Click Execute, and wait for query execution to complete. This query creates a database containing
two tables, one with a clustered index and one without. Both tables contain a substantial number of
rows.

Execute a query to retrieve I/O statistics

1. On the File menu, point to Open, and then click File. In the D:\Demofiles\Mod02\Starter folder,
select Measure MCR.sql, and then click Open.

2. Click Execute, and wait for query execution to complete. The queries retrieve an aggregated value
from each table, and are performed twice. This ensures that on the second execution (for which
statistics are shown), the data is in cache so the I/O statistics do not include disk reads. Note that the
MAXDOP=1 clause ensures that only a single core is used to process the query.

Calculate MCR from the I/O statistics

1. In the results pane, click the Messages tab. This shows the statistics for the queries.

2. Add the logical reads value for the two queries together, and then divide the result by two to find
the average.

3. Add the CPU time value for the two queries together, and then divide the result by two to find the
average. Divide the result by 1,000 to convert it from milliseconds to seconds.
4. Calculate MCR by using the following formula:

(average logical reads / average CPU time) * 8 / 1024
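
For example (illustrative figures only), if the two cached executions average 512,000 logical reads and
20,000 ms (20 seconds) of CPU time, the MCR is (512,000 ÷ 20) x 8 ÷ 1024 = 200 MB per second per core.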

Determining Processor and Memory Requirements


After determining the MCR for the CPU cores you
intend to use, you can start to estimate the
number of cores that will be required to support
your anticipated query workload. You can also
make an initial assessment of memory
requirements.

Estimating CPU Requirements


MCR indicates the amount of data that can be
processed by a single core in a second. To
determine the number of cores required, you
must apply this rate to the following factors:

 The amount of data returned by an average query.

 The number of concurrent users you need to support.


 The target response time for a query.

The specific formula to apply is:

((Average query size in MB ÷ MCR) x Concurrent users) ÷ Target response time

For example, suppose the MCR of the CPU core you intend to use is 200 MB/s. If an average query is
expected to return 18,000 MB, the anticipated number of concurrent users is 10, and each query must
respond within 60 seconds; the calculation to find the number of cores required is:

((18000 ÷ 200) x 10) ÷ 60

This results in a requirement for 15 cores (which should be rounded up to 16 because no CPU architecture
includes exactly 15 cores).

Now that you know the number of cores required, you can make an initial determination of the number
of processors. For example, to provide 16 cores using quad-core processors, you would need 4 processors.
Alternatively, if dual-core processors are used, 8 CPUs would be required. However, keep in mind that you
need to balance the number of CPUs to closely match the number of storage arrays that will be used,
which in turn may depend on the volume of data your data warehouse must support.

Estimating RAM Requirements


Calculating the amount of RAM required is difficult because memory can be utilized by many workloads
to increase overall performance. You should generally consider a minimum figure for a small to medium
sized data warehouse system to be 4 GB per core, or 64 to 128 GB per CPU socket. If you intend to use
columnstore indexes or support tabular data models on the data warehouse server, you should favor the
higher end of these estimates.

Another way to estimate memory requirements is to consider that in an average data warehouse
workload, users regularly need to access approximately 20 percent of the data stored in the data
warehouse (for example, in a data warehouse that stores 5 years of sales records, users mostly query the
most recent year). Having enough memory to maintain approximately 20 percent of the data in cache will
significantly enhance performance.
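For example (illustrative figures only), for a data warehouse that is expected to hold approximately 500 GB
of data, this guideline suggests planning for roughly 100 GB of RAM (20 percent of 500 GB) so that the
most frequently queried data can be held in cache, in addition to the memory required by the operating
system and any other services on the server.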
Determining Storage Requirements


Before you can fully determine CPU, memory, and
storage hardware requirements, you must assess
the volume of data that the system must support.

Estimating Data Volumes for the Data Warehouse
Most data warehouses consist predominantly of
fact data. Determining the volume of fact data the
data warehouse must store is the most significant
factor in assessing overall storage requirements.

1. Estimate Initial Fact Data


To start estimating data warehouse data volumes,
determine the number of fact rows that will be initially loaded into the data warehouse and multiply that
by the average size of a fact row. If you don’t know the average fact row size at this stage, use a
conservative estimate such as 100 bytes per row. For example, a data warehouse that will contain
200,000,000 fact rows, each 100 bytes in length, will have an initial fact data volume of approximately 20
GB.

2. Allow for Indexes and Dimensions


After estimating the initial fact data, add approximately 30 to 40 percent to allow for indexes and
dimensions. So, to continue the example with 20 GB of fact data, you would add approximately 8 GB (40
percent of 20 GB), giving an initial data volume of approximately 28 GB.

3. Project Fact Data Growth


Most data warehouses are refreshed with new data on a regular basis, so to be sure that your storage
solution will support the data warehouse in the future (say, three years from now), you must factor in the
anticipated incremental data that will be loaded. For example, suppose the fact data in our data
warehouse represents individual items that have been ordered in sales transactions, and the company
typically sells 5,000,000 items a month, you can expect to load 5,000,000 rows (each containing 100 bytes
of data), or approximately 500 MB each month. That equates to a data growth rate of 6 GB per year, so in
three years, the example data warehouse would need to support the initial 28 GB of data plus another 18
GB (6 GB per year x 3 years), giving a total of 46 GB.

4. Factor In Compression
Finally, you should plan to compress the data in your data warehouse. Typically, SQL Server provides a
compression factor of approximately 3:1, so the 46 GB of data should compress to approximately 15.5 GB
on disk.
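
You can estimate and apply compression by using the following statements. This is a minimal sketch; the
dbo.FactSales table name is hypothetical, and page compression is shown because it typically yields the
highest compression ratio for fact data.

-- Estimate the space savings of page compression for a fact table.
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'FactSales',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Rebuild the table with page compression enabled.
ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);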

Other Storage Requirements


In addition to the data warehouse itself, you must include other data in your storage estimation.
Additional storage is required for:

 Configuration databases. If databases used by other BI services, including the SSIS Catalog and
Reporting Services databases, are to be installed on the data warehouse server, you should include
them in your storage estimate. Additionally, the SQL Server instance includes system databases,
though in practice, these are usually stored on separate storage from the data warehouse data files.

 Transaction log files. Each database requires a transaction log. Typically, data warehouses are
configured to use the simple recovery model and few transactions are actually logged.
 TempDB. Many data warehouse queries require temporary storage space, and it is generally
recommended to locate TempDB on a suitable storage volume and assign it a suitable initial size so
that it does not need to grow automatically during query execution (see the example after this list).

 Staging tables. Whether data is staged in a dedicated staging database, in tables within the data
warehouse database itself, or in a combination of both locations, you must allocate enough space to
allow for data staging during ETL processes.

 Backups. If you intend to back up the data warehouse and other databases to disk, you must ensure
that the storage design provides space for backup files.

 Analysis Services models. If you intend to host multidimensional Analysis Services data models on
the data warehouse server, you must allocate sufficient disk space for them.
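
For example, the following statements pre-size the primary tempdb data and log files instead of relying on
automatic growth. This is a minimal sketch; the sizes shown are placeholders that you would base on your
own workload testing, and tempdev and templog are the default logical file names.

-- Assign an appropriate initial size to tempdb so that it does not need to
-- grow automatically during query execution.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 20GB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 5GB);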

Considerations for Storage Hardware


The optimal storage hardware solution for a data
warehouse depends on a number of factors,
including the volume of data and the system MCR
that data throughput from the storage system
must support. When planning a storage solution,
consider the following guidelines.

 Use many smaller disks instead of fewer larger disks. Although it's possible to create a
data warehouse that stores all of its data on a
single, large hard disk, a better balance of
throughput (and therefore overall system
performance) can usually be achieved by
distributing the data across multiple small disks. This distribution enables multiple disk reads to be
performed in parallel and reduces wait times for I/O operations.
 Use the fastest disks you can afford. Disk technologies have advanced dramatically in recent years,
with the speed of mechanical disks increasing and the advent of solid state disks that have no moving
parts. Most data warehouse read operations are sequential scans instead of the random I/O patterns
of OLTP systems, so seek times are minimized, but regardless of this advantage, a faster disk means
greater throughput when reading data. Solid state disks are typically more expensive than mechanical
disks, but if disk performance is critical, you may decide that the additional cost is worth paying. The
lack of moving parts makes them particularly effective for random I/O data access, typical of queries
against multidimensional data models.

 Use RAID 10, or minimally RAID 5. RAID 10 (in which data is both mirrored and striped) provides
the best balance of read performance and protection from disk failure, and this should usually be the
first choice for a data warehouse. However, the requirement for a complete set of redundant disks per
array can make this an expensive option. As an alternative, you can use RAID 5, which provides
striping for high read performance and parity-based data redundancy to protect against disk failure.

 Consider a dedicated storage area network. Although you can build a data warehouse that uses
direct attached storage (DAS), using a storage area network (SAN) generally makes it easier to
manage disk array configuration and to add storage in the future as the data warehouse grows. If you
do decide to use a SAN, it is best to have one that is dedicated to the BI solution and not shared with
other business applications. Additionally, try to balance the number of enclosures, storage processors
per enclosure, and disk groups to achieve a consistent I/O rate that takes advantage of parallel core
processing and matches the MCR of the system.
Lab: Planning SQL Server Business Intelligence Infrastructure
Scenario
You are planning a BI solution for Adventure Works Cycles, and have been asked to specify the hardware
required for the solution. You have interviewed business stakeholders in order to understand the business
requirements that the solution must support, and now you must design a SQL Server-based solution that
provides the right balance of functionality, performance, and cost.

At this time, you have been informed that you should not consider using SharePoint Server in your
planned solution.

Objectives
After completing this lab, you will be able to:
 Plan server topology for a SQL Server–based BI solution.

 Plan data warehouse hardware for a SQL Server-based BI solution.

Estimated Time: 60 Minutes.


Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Planning Server Topology


Scenario
You plan to build a BI solution that uses SQL Server components. The solution must meet the business
requirements and provide an appropriate level of performance, scalability, high availability, and potential
for expansion in the future. However, the design must also be cost-effective, and you must avoid over-
specifying servers and components.
The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Review the Interview Transcripts.


3. Design a Server Infrastructure for the BI Solution.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab02\Starter folder as Administrator.

 Task 2: Review the Interview Transcripts


 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab02\Starter
folder.

 Task 3: Design a Server Infrastructure for the BI Solution


 Collaborate with two or three other students.
 Use Visio to document the servers required for the BI solution and the services on each server. Include
notes to justify your decisions.

 For the purposes of this exercise, you can ignore any requirements for:

o SharePoint Server.

o Data Quality Services.

o Master Data Services.

 Save your design as BI Topology.vsdx in the D:\Labfiles\Lab02\Starter folder.

Results: At the end of this exercise, you should have a Visio diagram that documents your server
infrastructure design.

Exercise 2: Planning Data Warehouse Hardware


Scenario
Now that you have planned your server infrastructure, you must create a hardware specification for the
data warehouse server. You will begin by calculating the MCR of the system you are currently using, but
then you will complete a planning worksheet for a new system with a published MCR figure.

The main tasks for this exercise are as follows:

1. Measure Maximum Consumption Rate.


2. Estimate Server Hardware Requirements.

 Task 1: Measure Maximum Consumption Rate


 Use the Create Benchmark DB.sql script in the D:\Labfiles\Lab02\Starter folder to create a
benchmark database on the localhost instance of SQL Server.
 Use the Measure MCR.sql script in the D:\Labfiles\Lab02\Starter folder to generate query
performance statistics.

 Use the statistics to calculate MCR for the database server.

 Use the calculated MCR figure to estimate the number of cores required to support the following
workload:

o Average data per query: 500 MB.

o Concurrent users: 10.

o Target response time: 20 s.

 Task 2: Estimate Server Hardware Requirements


 Use Microsoft Excel to open the DW Hardware Spec.xlsx workbook in the D:\Labfiles\Lab02\Starter
folder.

 Use the information in the workbook to:

o Calculate the number of cores required in the data warehouse server.

o Recommend the number and type (dual core or quad core) of processors to include in the data
warehouse server.

o Calculate the estimated volume of data in the data warehouse.

o Suggest a suitable amount of memory for the data warehouse server.


o Calculate the storage requirements for the data warehouse server, assuming a compression ratio
of 3:1.

 Use Microsoft Word to open the Storage options.docx document in the D:\Labfiles\Lab02\Starter
folder and review the available storage options.

 Based on the storage requirements you have identified, select a suitable storage option and record
your selection in the DW Hardware Spec.xlsx workbook.

Results: After this exercise, you should have a completed worksheet that specifies the required hardware
for your data warehouse server.

Question: Review BI Topology.vsdx in the D:\Labfiles\Lab02\Solution folder. How do the various suggested topologies (on multiple pages named A to F) compare to the topology
you designed in the lab?
Question: Review DW Hardware Spec.xlsx in the D:\Labfiles\Lab02\Solution folder. How
does the hardware specification in this workbook compare to the specification you created in
the lab?
Module Review and Takeaways


Module Review
This module has described some key considerations for planning the hardware infrastructure for a SQL
Server–based BI solution. You should use the information in this module as a starting point, and use the
“more information” references to learn more about the supported distributed and scale-out architectures
for SQL Server components, and about the design principles used in Fast Track Data Warehouse reference
architectures.

Question: In a growing number of organizations, virtualization has become a core platform for infrastructure. Hyper-V in Windows Server 2012 together with enterprise operations and
management software such as Microsoft System Center 2012 has enabled IT departments to
benefit from simpler provisioning, management, mobility, and recoverability of services.

What components of a BI infrastructure would you consider virtualizing, and why?



Module 3
Designing a Data Warehouse
Contents:
Module Overview 3-1

Lesson 1: Data Warehouse Design Overview 3-2

Lesson 2: Designing Dimension Tables 3-8

Lesson 3: Designing Fact Tables 3-15

Lab A: Designing a Data Warehouse Logical Schema 3-18

Lesson 4: Designing a Data Warehouse Physical Implementation 3-21

Lab B: Designing a Data Warehouse Physical Implementation 3-33


Module Review and Takeaways 3-36

Module Overview
The data warehouse is at the heart of most business intelligence (BI) solutions, and designing the logical
and physical implementation of the data warehouse is crucial to the success of the BI project. Although a
data warehouse is fundamentally a database, there are some significant differences between the design
process and best practices for an online transaction processing (OLTP) database and a data warehouse
that will support online analytical processing (OLAP) and reporting workloads.

This module describes key considerations for the logical design of a data warehouse, and then it discusses
best practices for the physical implementation of the data warehouse.

Objectives
After completing this module, you will be able to:

 Describe a process for designing a dimensional model for a data warehouse.

 Design dimensions tables for a data warehouse.

 Design fact tables for a data warehouse.

 Design and implement effective physical data structures for a data warehouse.
Lesson 1
Data Warehouse Design Overview
Before designing individual database tables and relationships, it is important to understand the key
concepts and design principles for a data warehouse. This lesson describes the dimensional model used in
most data warehouse designs and the process used to translate business requirements into a data
warehouse schema.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe where data warehouse design fits into an overall BI project.

 Describe the dimensional model used for most data warehouses.

 Apply an effective process for data warehouse design.


 Use a business process–based approach to dimensional modeling.

 Document dimensional model designs.

Data Warehouse Design in a BI Project


Data warehouse design is one of the core pillars of
the BI project framework, and is dependent on the
business requirements for the BI solution.

The key tasks that must be performed when designing the data warehouse include:
 Interpreting business requirements to identify
the required analytical and reporting data.

 Identifying the business processes where the required data is generated.
 Exploring the data generated by those business processes.
business processes.

 Creating a matrix of business processes and conformed dimensions by which the data must be
aggregated for analysis and reporting.

 Designing a dimensional model for each business process, including numeric facts and dimension
attributes and hierarchies.

 Translating the dimensional model designs into a database schema consisting of fact and dimension
tables.

 Designing appropriate physical data storage and structures to optimize data warehouse performance
and manageability.
The Dimensional Model


Although data warehouses can be implemented as
normalized, relational database schemas, most
data warehouse designs are based on the
dimensional model advocated by Ralph Kimball. In
the dimensional model, the numeric business
measures that are analyzed and reported are
stored in fact tables, which are related to multiple
dimension tables in which the attributes by which
the measures can be aggregated are stored. For
example, a fact table might store sales order
measures, such as revenue and profit, and be
related to dimension tables that represent
business entities such as product and customer. These relationships make it possible to aggregate the
sales order measures by the attributes of a product (for example, to find the total profit for a particular
product model) or a customer (for example, to find the total sales revenue for customers who live in a
particular country).

Ideally, a dimensional model can be implemented in a database as a star schema, in which each fact table
is directly related to its relevant dimension tables. However, in some cases, one or more dimensions may
be normalized into a collection of related tables to form a snowflake schema. Generally, you should avoid
creating snowflake dimensions, if possible, because in a typical data warehouse workload, the
performance benefits of a single join between fact and dimension tables outweigh the data redundancy
reduction benefits of normalizing the dimension data.
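
The following Transact-SQL sketch shows the general shape of a star schema, with a single fact table that
joins directly to its dimension tables. The table and column names are hypothetical and deliberately
simplified to illustrate the structure only.

CREATE TABLE dbo.DimProduct
(
    ProductKey   int IDENTITY(1,1) PRIMARY KEY,
    ProductName  nvarchar(50),
    Color        nvarchar(20),
    Size         nvarchar(10)
);

CREATE TABLE dbo.DimCustomer
(
    CustomerKey  int IDENTITY(1,1) PRIMARY KEY,
    CustomerName nvarchar(100),
    City         nvarchar(50),
    Country      nvarchar(50)
);

CREATE TABLE dbo.FactSalesOrder
(
    OrderDateKey int NOT NULL,    -- key of a date dimension row
    ProductKey   int NOT NULL REFERENCES dbo.DimProduct(ProductKey),
    CustomerKey  int NOT NULL REFERENCES dbo.DimCustomer(CustomerKey),
    ItemQuantity int,
    SalesAmount  money
);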

The query optimizer in the Enterprise edition of SQL Server 2012 includes logic that detects star schema
joins in queries and optimizes the way these joins are processed accordingly. Based on the selectivity of
the query (that is, the proportion of rows from the fact table that the query is likely to return), the query
optimizer uses bitmap filters to quickly eliminate non-qualifying rows from the fact table (which generally
accounts for the largest cost in a data warehouse query).

Additional Reading: For more detailed information about star join query optimization, see
Introduction to New Data Warehouse Scalability Features in SQL Server 2008 at
http://msdn.microsoft.com/en-us/library/cc278097(v=SQL.100).aspx and Data Warehouse Query
Performance at http://technet.microsoft.com/en-us/magazine/2008.04.dwperformance.aspx.

The Data Warehouse Design Process


Although every project has its unique
considerations, there is a commonly used process
for designing a dimensional data warehouse that
many BI professionals have found to be effective.
The approach is largely based on the data
warehouse design patterns identified and
documented by Ralph Kimball and the Kimball
Group, though some BI professionals may vary in
how they approach each task.

Additional Reading: For a detailed


exploration of how to apply the Kimball
dimensional modeling methodology to a SQL Server-based data warehouse design, read The
Microsoft Data Warehouse Toolkit (Wiley, 2011).

1. Determine analytical and reporting requirements


After you gather the business requirements for the BI solution, you must interpret these requirements in
terms of the analytical and reporting capabilities that the BI solution must provide. Typically, analytical
and reporting requirements support business requirements, so you will probably need to spend time
exploring each business requirement with the stakeholders who want them met in order to understand
the information that they need to meet it. For example, a sales executive might express a business
requirement as “We want to improve the sales performance of sales representatives in the most poorly
performing sales territories.” To meet this requirement, you need to understand how “sales performance”
is measured (for example, revenue, profitability, number of orders, or a combination of all three) and
against what aspects of the business it should be aggregated.

Typically, asking questions such as “how will you be able to tell if the business requirement is being met?”
leads the discussion toward the analytical and reporting requirements. For example, to determine whether
sales performance is improving, the sales executive might say that they need to be able to see “order
volume by territory” or “sales revenue by salesperson.” Requirements expressed like this make it easier to
determine the measures and dimensions the solution must include, because the requirement often takes
the form “measure by dimension”.
Additionally, most analytical and reporting requirements include a time-based aggregation. For example,
the sales executive might want to compare sales revenue by month or quarter.

2. Identify the business processes that generate the required data


Typically, the data required in the dimensional model is generated by an existing business process, so
after determining the data that you need to support analysis and reports, you must identify the business
processes that generate the source data.
For example, a business might include the following business processes:

 Order processing.

 Stock management.
 Order fulfillment.

 Manufacturing.

 Marketing and promotions.

 Financial accounting.

 Human Resources management.

Each of these processes generates data. This data includes numeric values and events that can be counted
(which can be sources for measures in a dimensional model) and information about key business entities
(which can be sources for dimension attributes).

3. Examine the source data for those business processes


In most organizations, each business process captures data in at least one system. For example, an order
processing business process might store details of orders, customers, and products in an order processing
system; and a financial accounting process typically stores details of accounts and balances in an accounts
application.

A significant part of the effort in designing a data warehouse solution involves exploring the data in these
source systems and interviewing the users, system administrators, and application developers who
understand it best. Initial exploration might simply take the form of running Transact-SQL queries to
determine distinct value counts, average numerical values, and row counts. You can then use the basic
information gathered from these initial queries and discussions with data specialists as a foundation for
deeper data profiling using tools such as the Data Profiling task in SQL Server Integration Services to
determine minimum and maximum field lengths, data sparseness and null counts, and the reliability of
relational dependencies.

At this stage, you do not need to perform a full audit of the data and start planning the extract, transform,
and load (ETL) solution, but you do need to identify if and where the measures and dimension attributes
you need to meet the reporting requirements are stored, what range of values exist for each required data
field, what data is missing or unknown, and at what granularity the data is available.
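
The following query is a minimal sketch of this kind of initial exploration, run against a hypothetical order
line table in a source system to gather row counts, distinct value counts, null counts, and basic value
ranges.

-- Basic profile of a source table before deeper data profiling.
SELECT
    COUNT(*) AS TotalRows,
    COUNT(DISTINCT ProductID) AS DistinctProducts,
    SUM(CASE WHEN UnitPrice IS NULL THEN 1 ELSE 0 END) AS NullUnitPrices,
    MIN(OrderDate) AS EarliestOrder,
    MAX(OrderDate) AS LatestOrder,
    AVG(Quantity * UnitPrice) AS AvgLineValue
FROM Sales.SalesOrderDetailSource;    -- hypothetical source table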

Note: For information about how to use the Data Profiling task in SSIS, attend course
10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012.

4. Conform dimensions across business processes


Identifying the business processes and exploring the data generated by each one will help you identify
some key business entities that are common across business processes. For example, the order processing,
manufacturing, and stock management business processes might all deal with a product entity.
Similarly, the order processing and order fulfillment processes might both deal with a customer entity.
Identifying dimensions that can be shared across multiple business processes is an important part of data
warehouse design because it enables you to define conformed dimensions that ensure that the same
definition of a business entity can be used to aggregate multiple facts and produce meaningful
comparisons. For example, by using a conformed product dimension, a data warehouse based on the
order processing, manufacturing, and stock management business processes will enable you to analyze a
specific product and compare the number ordered, the number manufactured, and the number held in
stock. If the product is perishable and has a limited shelf-life, this kind of comparison could provide
significant information for production planning and help reduce losses from spoiled, unsold products.

5. Prioritize business processes and define a dimensional model for each


Based on the business value of the identified reporting and analytical requirements, prioritize the business
processes and create a dimensional model for each one that is required to provide the necessary
analytical and reporting data. To do this, you must perform the following steps:
1. Identify the grain.

2. Select the required dimensions.

3. Identify the facts.


The details of these steps are discussed in the next topic.

6. Document and refine the models to determine the database logical schema
After you create initial dimensional models for each required business process, you can document the
models to show:

 The measures in the fact tables.

 The related dimension tables.

 The attributes and hierarchies required in the dimension tables.

You can then iteratively refine the model to design the fact and dimension tables that will be required in
the data warehouse database. Considerations for fact and dimension tables are discussed later in this
module.
7. Design the physical data structures for the database


After you complete the logical database design, you can consider the physical implementation of the
database, including data file placement, indexes, table partitions, and data compression. These topics are
discussed in more depth later in this module.

Dimensional Modeling
After you identify the business processes and
conformed dimensions, you can document them
in a matrix, as shown on the slide. This approach is
based on the bus matrix design technique
promoted by the Kimball Group.

Additional Reading: For more information about using a bus matrix as part of a data
warehouse design project, read The Microsoft
Data Warehouse Toolkit (Wiley, 2011).

You can then use the matrix to select each business process based on priority, and design a dimensional
model for the business process by performing the following steps:

1. Identify the grain. The grain of a dimensional model is the lowest level of detail at which you can
aggregate the measures. It is important to choose the level of grain that will support the most
granular of reporting and analytical requirements, so typically the lowest level of grain possible from
the source data is the best option. For example, an order processing system might record order data
at two levels: order-level data such as the order date, salesperson, customer, and shipping cost; and
line item–level data such as the products included in the order and their individual quantities, unit
costs, and selling prices. To support the most granular analysis and reporting, the grain should be
declared at the line item level (so the fact table will contain one row per line item).
2. Select the required dimensions. Next, determine which of the dimensions that are related to the
business process should be included in the model. The selection of dimensions depends on the
reporting and analytical requirements―specifically on the business entities by which the business
users need to aggregate the measures. Almost all dimensional models include a time-based
dimension, and the other dimensions generally become obvious as you review the requirements.
Additionally, at this stage, you might begin to identify specific attributes of the dimensions that will
be needed (such as the country, state, and city of a customer or the color and size of a product).

In the example on the slide, the Time, Customer, Product, and Salesperson dimensions are selected.

Note: The Time dimension in this example is used for both order date and ship date.
Although it would be possible to define an individual dimension for each type of date, it is more
common to create a single time dimension and use it for multiple roles. In an analytical model,
these multiple usages of the same dimension table are known as role-playing dimensions. This
technique is most commonly used for time tables, but it can be applied to any dimension that is
used in multiple ways―for example, a dimensional model for an airline flight scheduling business
process might use a single Airport dimension to support Origin and Destination role-playing
dimensions.

3. Identify the facts. Finally, identify the facts that you want to include as measures. These are numeric
values that can be expressed at the level of the grain chosen earlier and aggregated across the
selected dimensions. Some facts will be taken directly from source systems, and others might be
derived from the base facts. For example, you might choose Quantity and Unit Price facts from an
order processing source system and then calculate a total Sales Amount. Additionally, depending on
the grain you choose for the dimensional model and the grain of the source data, you might need to
allocate measures from a higher level of grain across multiple fact rows. For example, if the source
system for the order processing business process includes a Tax measure at the order level, but the
facts are to be stored at the line item level, you will need to decide how to allocate the tax amount
across the line items. Typically, tax is calculated as a percentage of selling price, so it should be
straightforward to apply the appropriate tax rate to each line item based on the sales amount.

In the example on the slide, the Item Quantity, Unit Cost, and Unit Price measures are taken from the
source system at the line item level. From these, the Total Cost and Sales Amount measures for each line
item can be calculated. Additionally, the Shipping Cost measure is defined at the order level in the source
system, so it must be allocated across the line items. You can do this by dividing it equally across each
row, or by applying a calculation that distributes the shared cost based on the quantity of each item
ordered, the total line item weight, or some other appropriate formula.
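
The following query shows one possible way to allocate an order-level shipping cost across line items in proportion to each line item's sales amount. It is a sketch only; the table and column names are illustrative assumptions rather than objects from a specific source system.

-- Allocate the order-level shipping cost across line items, weighted by sales amount.
SELECT li.OrderNo, li.LineItemNo, li.SalesAmount,
       o.ShippingCost * li.SalesAmount
         / SUM(li.SalesAmount) OVER (PARTITION BY li.OrderNo) AS AllocatedShippingCost
FROM SourceLineItems AS li
JOIN SourceOrders AS o ON li.OrderNo = o.OrderNo;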

Documenting Dimensional Models


After you design the initial dimensional models for
each business process, you can document them in
a simple diagram. A common format for this
documentation is a sun diagram, in which a fact
table is shown at the center of the dimensions to
which it is related.

As you refine the dimensional model, you can add
more detail to the sun diagram, including the
measures in the fact table and the attributes in the
dimension tables. In most cases, some or all of the
dimension attributes can be used to form a
hierarchy for drill-down analysis―for example,
enabling users to view aggregations of sales amount by year, month, and date or by country, state, and
city. You can add these hierarchies to the sun diagram to help you communicate and validate model
design with business stakeholders.

Eventually, the simple diagram will be refined to the point where it can be easily translated into a schema
design for database tables. At this stage, you can use a diagramming tool such as Microsoft Visio or a
specialist database modeling tool to start designing the logical schema of your data warehouse.

Lesson 2
Designing Dimension Tables
After you design the dimensional models for the data warehouse, you can translate the design into a
logical schema for the database. However, before you design dimension tables, it is important to consider
some common design patterns and apply them to your table specifications.

This lesson discusses some of the key considerations for designing dimension tables.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for dimension keys.

 Describe considerations for dimension attributes and hierarchies.

 Design dimensions that support values for None or Unknown.


 Design appropriate slowly changing dimensions for your business requirements.

 Design time dimension tables.

 Design self-referencing dimension tables.


 Include junk dimensions in a data warehouse design where appropriate.

Considerations for Dimension Keys


Each row in a dimension table represents an
instance of a business entity by which the
measures in the fact table can be aggregated. Like
other tables in a database, a key column uniquely
identifies each row in the dimension table. In
many scenarios, the dimension data is obtained
from a source system in which a key is already
assigned (sometimes referred to as the business
key); however it is standard practice when
designing a data warehouse to define a new
surrogate key that uses an integer value to identify
each row. A surrogate key is recommended for the
following reasons:

 The data warehouse might use dimension data from multiple source systems, so there is the
possibility that business keys are not unique.

 Some source systems use non-numeric keys, such as a globally unique identifier (GUID), or natural
keys, such as an email address to uniquely identify data entities. Integer keys are smaller and more
efficient to use in joins from fact tables.

 Each row in a dimension table represents a specific version of an instance of a business entity. If the
dimension table supports type 2 slowly changing dimensions, the table might need to contain
multiple rows that represent different versions of the same entity. These rows will have the same
business key, and without a surrogate key, they won’t be uniquely identifiable.

Typically, the business key is retained in the dimension table as an alternate key. Business keys that are
based on natural keys can be familiar to users analyzing the data―for example, a ProductCode business
key that users will recognize might be used as an alternate key in the Product dimension table. However,
the main reason for retaining a business key is to make it easier to manage slowly changing dimensions
when loading new data into the dimension table. The ETL process can use the alternate key as a lookup
column to determine whether an instance of a business entity already exists in the dimension table.
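
For example, the following Transact-SQL sketch defines a product dimension table with an integer surrogate key and the source system's product code retained as an alternate key. The table and column names are illustrative, not part of a specific lab database.

CREATE TABLE dbo.DimProduct
(
    ProductKey int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    ProductAltKey nvarchar(25) NOT NULL,               -- business key from the source system
    ProductName nvarchar(100) NOT NULL,
    Color nvarchar(20) NULL,
    Size nvarchar(10) NULL
);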

Dimension Attributes and Hierarchies


In addition to the surrogate and alternate key
columns, a dimension table includes a column for
each attribute of the business entity that is needed
to support the reporting and analytical
requirements. When designing a dimension table,
you need to identify attributes that will be used in
reports and analysis, and include them in the
table. Typically, dimension attributes are used in
one of the following three ways:
 Hierarchies. Multiple attributes can be
combined to form hierarchies that enable
users to “drill-down” into deeper levels of
detail. For example, the Customer table in the slide includes Country, State, and City attributes that
can be combined to form a natural geographical hierarchy. Business users can view aggregated fact
data at each level―for example, to see sales order revenue by country―and then drill into a specific
country to see a breakdown by state, before drilling further into a specific state to see sales revenue
by city.

 Slicers. Attributes do not need to form hierarchies to be useful in analysis and reporting. Business
users can group or filter data based on single-level hierarchies to create analytical sub-groupings of
data. For example, the Gender attribute in the Customer table can be used to compare sales revenue
for male and female customers.
 Drill-through detail. Some attributes have little value as slicers or members of a hierarchy. For
example, it may be unlikely that a business user will need to analyze sales revenue by customer phone
number. However, it can be useful to include entity-specific attributes to facilitate drill-through
functionality in reports or analytical applications. For example, in a sales order report that enables
users to drill-down to the individual order level, users might want to double-click an order and drill-
through to see the name and phone number of the customer who placed the order.

Note: The terminology for interacting with data in reports can be confusing, and is
sometimes used inconsistently. For clarity, in this course, the term “drill-down” means expanding
a hierarchy to see the next level of aggregation, and “drill-through” means viewing details
outside of the current hierarchy for a selected row. For example, while viewing sales revenue by
customer geography, you might view total revenue for a specific country in the hierarchy. You
might then “drill-down” to see a subtotal for each state within that country (the next level in the
hierarchy), or “drill-through” to see demographic details for that country.

In the example on the slide, note that the Name column contains the full name of the customer. In a data
warehouse table schema, it is not usually necessary to normalize the data to its most atomic level as is
common in OLTP systems. In this example, it is unlikely that users will want to group or filter data by
customer first name or last name, and the data only has drill-through value at the full name level of detail.
Therefore, the FirstName, MiddleName, and LastName columns in the source system have been
combined into a single Name field in the data warehouse.

Unknown and None


As a general rule, try to design your data
warehouse in a way that eliminates, or at least
minimizes, NULL values―particularly in fact table
key columns that reference dimension tables.
NULL values make it easy to accidentally eliminate
rows from reports and produce misleading totals.

Identifying the semantic meaning of NULL
When you explore the source data for your BI
solution, pay particular attention to how NULL
values are used in the source data. The semantic
meaning of NULL might be “None” or “Unknown,”
depending on the context, and only by examining the data and consulting the users, administrators, and
developers who are familiar with the source system will you be able to confidently determine which. In the
example on the slide, the source data includes a column named DiscountType in which two rows have a
missing, or NULL, value. The fact that these rows include a non-zero Discount value indicates that NULL
does not necessarily always mean “None”, and is more likely to mean “Unknown.” Additionally, on the
rows where the Discount value is zero, the DiscountType value is consistently “N/A,” implying that “N/A”
is used in this system to mean “None”.

To support these two cases, a row for each case is added to the dimension table with appropriate
surrogate keys (such as -1 for "Unknown" and 0 for "None"). If the source systems had been more
ambiguous, you could add a single row to the dimension table to represent “None or unknown”.
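
For example, the following code shows one way to add these two special rows to the discount dimension table. It assumes that DiscKey is an IDENTITY surrogate key column (which is why IDENTITY_INSERT is temporarily enabled) and that DiscountType is a descriptive attribute column; the -1 and 0 key values are a common convention rather than a requirement.

SET IDENTITY_INSERT dbo.DimDiscount ON;

INSERT INTO dbo.DimDiscount (DiscKey, DiscAltKey, DiscountType)
VALUES (-1, 'Unknown', 'Unknown'),
       (0, 'N/A', 'None');

SET IDENTITY_INSERT dbo.DimDiscount OFF;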

NULL equality
Depending on the settings in a SQL Server database, you might not be able to compare NULL values for
equality. In its strictest definition, NULL means unknown, so a “NULL = NULL” comparison is actually
asking if one unknown value is the same as another unknown value; and because both values are
unknown, the answer is also unknown (and therefore NULL). You should not use NULL as the alternate key
for the “Unknown” dimension row, because lookup queries during the ETL load process must compare
this key to the data being loaded to determine whether a dimension row already exists. Instead, use an
appropriate key value that is unlikely to be the same as an existing business key, and use the Transact-SQL
ISNULL function to compare source rows with dimension rows, as shown in the following code sample.

SELECT d.DiscKey, s.DiscountType
FROM DimDiscount AS d
JOIN SourceData AS s ON ISNULL(s.DiscountType, 'Unknown') = d.DiscAltKey

Designing Slowly Changing Dimensions


Slowly changing dimensions (SCDs) are a
significant consideration in the design of
dimension tables, and you should try to identify
requirements for maintaining historic dimension
attribute values as early as possible in the design
process.

There are three common techniques used to handle attribute value changes in SCDs:

 Type 1. Type 1 changes are the simplest type of SCD to implement. Attribute values are
updated directly in the existing dimension
table row and no history is maintained. This
makes Type 1 changes suitable for attributes that are used to provide drill-through details, but
unsuitable for analytical slicers or hierarchy members where historic comparisons must reflect the
attribute values as they were at the time of the fact event.
 Type 2. Type 2 changes involve the creation of a new version of the dimension entity in the form of a
new row. Typically, a bit column in the dimension table is used as a flag to indicate which version of
the dimension row is the current one. Additionally, datetime columns are often used to indicate the
start and end of the period for which a version of the row was (or is) current. Maintaining start and
end dates makes it easier to assign the appropriate foreign key value to fact rows as they are loaded
so that they are related to the version of the dimension entity that was current at the time the fact
occurred.

 Type 3. Type 3 changes are rarely used. In a type 3 change, the previous value (or sometimes a
complete history of previous values) is maintained in the dimension table row. This requires
modifying the dimension table schema to accommodate new values for each tracked attribute, and
can result in a complex and difficult-to-manage dimension table.

After you define the dimensional model for the data warehouse and are evolving your design from a sun
diagram to a database schema, it can be useful to annotate dimension attributes to indicate what kind of
SCD changes they must support. This will help you plan the metadata columns required for each
dimension table.
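
For example, a customer dimension table that must support type 2 changes might include metadata columns similar to the following sketch. The table, column names, and data types are illustrative.

CREATE TABLE dbo.DimCustomer
(
    CustomerKey int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key (one value per version)
    CustomerAltKey nvarchar(15) NOT NULL,               -- business key (shared by all versions)
    Name nvarchar(100) NOT NULL,
    Country nvarchar(50) NULL,                          -- type 2 attributes
    State nvarchar(50) NULL,
    City nvarchar(50) NULL,
    Phone nvarchar(25) NULL,                            -- type 1 attribute (drill-through detail)
    CurrentFlag bit NOT NULL DEFAULT 1,                 -- indicates the current version
    StartDate datetime NOT NULL,                        -- start of the period for which this version is valid
    EndDate datetime NULL                               -- NULL (or a far-future date) for the current version
);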

Time Dimension Tables


Most analysis and reporting requirements include
a need to aggregate values over time periods, so
almost every data warehouse includes a time
dimension table. When you design a time
dimension table, you must take into account the
following considerations:

 Surrogate key. Although best practice for surrogate keys in dimension tables is usually
to use a simple integer value with no semantic
meaning, time dimension tables can benefit
from using an integer representation of the
date or time that the row represents. Ideally,
the values should be in ascending order relative to the dates they represent, so the best approach is
to concatenate the integer values for each date part in descending order of scope. For example, use
the pattern YYYYMMDD to represent dates; so for January 31st 2013, use the value 20130131. This
ensures that the value used for the next sequential date (February 1st 2013) is a higher value of
20130201. The reason ascending values are recommended is that data warehouse queries typically
filter on a range of date or time values, and using an ascending numeric key enables you to use
indexes and partitions that store the fact data in chronological order, thereby enabling the query
optimizer to use sequential scans to read the data. Additionally, the actual datetime value for the row
is generally used as the alternate key to support datetime functions or client applications that can
apply datetime specific logic.

 Granularity. The level of granularity used for a time dimension table depends on the business
requirements. For many reporting and analysis scenarios, such as viewing details about sales orders,
the lowest level of granularity that is likely to be required is a day. However, in some scenarios, users
might need to aggregate facts by hours, minutes, or seconds, or even smaller increments. The lower
the level of granularity used, the more rows will exist in the dimension table, and storing a row for
increments of less than a day can result in extremely large tables. An alternative approach is to create
a date dimension table that contains a row for each day, and a second time dimension table that
stores a row for each required time increment in a 24-hour period. Fact tables that are used for
analysis of measures at the day level or higher can be related to the date dimension table only; and
facts that are measured at smaller time increments can be related to both the date and time
dimension tables.
 Range. Typically, a time dimension table stores a row for each increment between a start point and
an end point with no gaps. For example, a time dimension in a data warehouse used to analyze
sales orders might have a row for each day between the first ever order and the most recent
order―even if no orders were placed on some of the intervening days. In reality, the start and end
dates are typically based on key calendar dates. For example, the start date might be January 1st of
the year the company started trading, or the start date of the company’s first fiscal year. The end date
is usually some future point, such as the end of the current year; and more rows are added
automatically as the end date gets closer to maintain a buffer of future dates. If the data warehouse
will be used to create and store projections or budget figures for future operations, you will need to
choose an end date that is far enough into the future to accommodate these values.

 Attributes and hierarchies. You need to include attributes for each time period by which data will
be aggregated, for example year, quarter, month, week, and day. These attributes tend to form
natural hierarchies. Additionally, you can add attributes to be used as slicers, such as weekday (which
for example, would enable users to compare typical sales volumes for each day of the week). In
addition to numeric values, you might want to include attributes for date element names, such as
month names and day names. This enables more user-friendly reports (for example, enabling users to
compare sales in March and April instead of month 3 and month 4), but you should also include the
numeric equivalents so that client applications can use them to sort the data into the correct
chronological order (for example, sorting months by month number instead of month name).

 Multiple calendars. Many organizations need to support multiple calendars, for example a normal
calendar year that runs from January to December, and a fiscal calendar, which might run from April
to March. If this is the case in your data warehouse, you can either create a separate time dimension
table for each calendar, or more preferably, include attributes for all alternative calendar values in a
single time dimension table. For example, a time dimension table might include a Calendar Year
attribute and a Fiscal Year attribute.

 Unknown values. In common with other dimension tables, you might need to support facts for
which a date or time value is unknown. Instead of requiring a NULL value in the fact table, consider
creating a row in the time dimension table for unknown values. You can use an obvious surrogate key
value (such as 00000000) for this row, but because the alternate key must be a valid date, you should
use a date outside of the normal range of business operations, such as January 1st 1753 or December
31st 9999 (these are the minimum and maximum values supported by the datetime data type).
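
As an illustration of the surrogate key and attribute guidelines above, the following query shows one way to derive a YYYYMMDD integer key and some typical attributes from a single date value. It is a sketch only; a real date dimension load would generate a row for every date in the chosen range, and the column names are illustrative.

DECLARE @Date date = '20130131';

SELECT CAST(CONVERT(nvarchar(8), @Date, 112) AS int) AS DateKey, -- 20130131
       @Date AS DateAltKey,
       DATEPART(year, @Date) AS CalendarYear,
       DATEPART(quarter, @Date) AS CalendarQuarter,
       DATEPART(month, @Date) AS MonthNumber,
       DATENAME(month, @Date) AS MonthName,
       DATEPART(weekday, @Date) AS WeekdayNumber,
       DATENAME(weekday, @Date) AS WeekdayName;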

Self-Referencing Dimension Tables


A common requirement in a data warehouse is to
support dimensions with parent-child hierarchies.
For example, an employee dimension might
consist of managers, each of whom has employees
reporting to him or her, who in turn might have
reports of their own.

Typically, parent-child hierarchies are implemented as self-referencing tables, in which a column in each
row is used as a foreign key reference
to a primary key value in the same table. Some
client applications, including SQL Server Analysis
Services, are aware of self-joins and can
automatically handle parent-child hierarchies in a dimension. For other applications, you might need to
implement some custom, recursive logic to enable analysis and reporting of these hierarchies.
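
For example, the following sketch defines an employee dimension in which each row references the surrogate key of its manager's row in the same table. The table and column names are illustrative.

CREATE TABLE dbo.DimEmployee
(
    EmployeeKey int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    EmployeeAltKey nvarchar(20) NOT NULL,               -- business key
    EmployeeName nvarchar(100) NOT NULL,
    ParentEmployeeKey int NULL                          -- manager's surrogate key (NULL at the top of the hierarchy)
        REFERENCES dbo.DimEmployee (EmployeeKey)
);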

When you implement a self-referencing dimension table in a data warehouse, you should think about the
following considerations:
 Like all dimension load operations, when records are to be loaded into the dimension table, the ETL
process must look up the alternate key to determine whether a record already exists for the entity.
However, the alternate key of the parent record must also be looked up to determine the correct
surrogate key to use in the foreign key column.

 You may have to deal with a situation where you need to load a record for which the parent record
has not yet been loaded.
 Supporting type 2 SCDs in a self-referencing dimension table can be complex. In a worst case
scenario, you might perform a type 2 change that results in a new row (and therefore a new surrogate
key), and then need to cascade that type 2 change to create new rows for all descendants of the
entity, even if the change does not alter the parent-child relationships.

Junk Dimensions
In some reporting and analytical requirements,
there are attributes that are useful for grouping or
filtering facts but which do not belong in any of
the dimensions defined in the dimensional model.
When these attributes have low cardinality (that is,
there are only a few discrete values), you can
group them into a single dimension table that
contains miscellaneous analytical values. This kind
of dimension table is generally referred to as a
junk dimension and is used to avoid creating
multiple, very small dimension tables.

For example, a sales orders dimensional model might include true/false indicators for orders where goods were out of stock or where free shipping was
provided, or there may be a column that stores “credit” or “debit” to indicate the payment method.
Instead of creating a dimension table for each of these attributes, you could combine them in every
possible combination in a junk dimension table.
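
For example, the following code shows one way to generate every combination of these low-cardinality attributes in a junk dimension table. The table name, column names, and values are illustrative.

SELECT ROW_NUMBER() OVER (ORDER BY o.OutOfStock, f.FreeShipping, p.PaymentMethod) AS OrderFlagsKey,
       o.OutOfStock, f.FreeShipping, p.PaymentMethod
INTO dbo.DimOrderFlags
FROM (VALUES (0), (1)) AS o(OutOfStock)
CROSS JOIN (VALUES (0), (1)) AS f(FreeShipping)
CROSS JOIN (VALUES ('Credit'), ('Debit')) AS p(PaymentMethod);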

Lesson 3
Designing Fact Tables
Fact tables contain the numeric measures that can be aggregated across the dimensions in your
dimensional model. Fact tables can grow to be extremely large, and it is important to design them
carefully with reporting and analytical requirements, performance, and manageability in mind.

This lesson discusses common considerations for fact table design.

Lesson Objectives
After completing this lesson, you will be able to:

 Determine appropriate columns for a fact table.

 Design a fact table at an appropriate level of grain.

 Describe the types of measure that are stored in a fact table.


 Describe common types of fact table.

Fact Table Columns


A fact table usually consists of the following kinds
of columns:

 Dimension keys. Fact tables reference dimension tables by storing the surrogate key
for each related dimension. In this way, a row
in a fact table is conceptually an intersection
between the dimension tables that it relates
to―for example, recording a sales order
placed on a specific date, for a specific
product, by a specific customer. You can add
foreign key constraints to these columns,
which will help the SQL Server query
optimizer detect star joins. However, constraints can slow down data load operations, and because
the surrogate keys are generated during a controlled ETL process, they do little to enforce referential
integrity.

 Measures. In most cases, a fact table is primarily used to store numeric measures that can be
aggregated by the related dimensions. For example, a row in a fact table that records sales orders
might include a column for the sales amount, which can then be aggregated by the dimensions to
show sales amount by date, by product, or by customer. In some cases, a fact table contains no
measures and is simply used to indicate that an intersection between the related dimensions
occurred. For example, a fact table in a manufacturing dimensional model might record a single row
each time a product assembly is completed, indicating the product and date dimension keys. The fact
table can then be used to calculate the number of times an assembly of each product was completed
per time period by simply counting the distinct rows. A fact table with no numeric measure columns
is sometimes referred to as a factless fact table.

 Degenerate dimensions. Sometimes, a fact has associated attributes that are neither keys nor
measures, but which can be useful to group or filter facts in a report or analysis. You can include this
column in the fact table where client applications can use it as a degenerate dimension by which the
fact data can be aggregated. In effect, including degenerate dimension columns in a fact table
enables it to also be used as a dimension table. Using degenerate dimensions can be a good
alternative to using a junk dimension if the analytical attributes are specific to a single fact table.

Note: Unlike most database tables, a fact table does not necessarily require a primary
key―in fact, unless you have a business requirement to uniquely identify each row in the fact
table, you should avoid creating a unique key column for the fact table and avoid defining a
primary key constraint. Facts are generally aggregated, and queries rarely need to individually
identify a fact row. In some cases, the combination of dimension keys can uniquely identify a fact
row, but this is not guaranteed―for example, the same customer could purchase the same
product twice in one day.
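
For example, a sales order fact table based on these kinds of columns might look like the following sketch. Note that no primary key is defined, and the table and column names are illustrative.

CREATE TABLE dbo.FactSalesOrder
(
    OrderDateKey int NOT NULL,         -- dimension keys
    ProductKey int NOT NULL,
    CustomerKey int NOT NULL,
    OrderNumber nvarchar(20) NOT NULL, -- degenerate dimension
    ItemQuantity int NOT NULL,         -- measures
    UnitPrice money NOT NULL,
    SalesAmount money NOT NULL
);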

Types of Measure
Fact tables can contain the following three kinds
of measure:

 Additive measures. Additive measures can be summed across all dimensions. For
example, you could use a SalesAmount
measure in a fact table with OrderDateKey,
ProductKey, and CustomerKey dimension
keys to calculate total sales amount by time
period (such as month), product, or customer.

 Semi-additive measures. Semi-additive measures can be summed by some dimensions, but not
others. Commonly, semi-additive measures cannot be summed by time dimensions. For example, the
number of items in stock at the end of each day might be recorded as a StockCount measure in a fact
table with DateKey and ProductKey dimension keys. This measure can be summed across products to
give a total stock count for all products on a given day, but summing the stock count across all the
days in a month does not produce a meaningful total for the month. To find out how many products
are in stock at the end of the month, you must use only the StockCount value for the last day of the
month.

 Non-additive measures. Non-additive measures cannot be summed by any dimension. For example,
a fact table for sales orders might include a ProfitMargin measure that records the profit margin for
each order. However, you cannot calculate the overall margin for any dimension by summing the
individual profit margins.
Generally, semi-additive and non-additive measures can be aggregated by using other functions. For
example, you could find the minimum stock count for a month or the average profit margin for a product.
Understanding the ways in which the measures can be meaningfully aggregated is useful when testing
and troubleshooting data warehouse queries and reports.

Types of Fact Table


Generally, data warehouses include fact tables that
are one of the following three types:

 Transaction fact tables. The most common kind of fact table is a transaction fact table, in
which each row records a transaction or event
at an appropriate grain. For example, a fact
table might record sales orders at the line
item grain, in which each row records the
purchase of a specific item. The measures in a
transaction fact table are usually additive.

 Periodic snapshot tables. Some fact tables record measure values at a specific point in
time, and these are known as periodic snapshot fact tables. For example, a fact table might record the
stock movement for each day, including the opening and closing stock count figures. Measures in a
periodic snapshot fact table are often semi-additive.
 Accumulating snapshot fact tables. In some scenarios, you might want to use a fact table to track
the progress of a business process through multiple stages. To do this, you can use an accumulating
snapshot fact table. For example, a fact table might track an order from initial purchase through to
delivery by including a date dimension key field for the order date, the shipping date, and the
delivery date. The ShipDate and DeliveryDate columns for orders that have been placed but not yet
shipped will contain the dimension key for an “Unknown” or ”None” row in the time dimension table,
and these will be updated to reflect the appropriate dimension key as the order is shipped and
delivered.

Lab A: Designing a Data Warehouse Logical Schema


Scenario
You are designing a BI solution for Adventure Works Cycles, and have conducted interviews to gather
information about current business processes and identify analytical and reporting requirements. Now
you must create dimensional models for the business processes and design a data warehouse database
schema to support the requirements.

Objectives
After completing this lab, you will be able to:

 Design and document a dimensional model.

 Design and document a database schema for a data warehouse.

Estimated Time: 90 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Identifying Business Processes and Dimensions


Scenario
The business stakeholders at Adventure Works Cycles have provided you with information about the
business processes in the company. Now you must use this information to create a dimensional matrix for
the business processes.

The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.


2. Review the Interview Transcripts.

3. Identify and Document Business Processes.

4. Examine Source Data.

5. Identify and Document Dimensions.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab03A\Starter folder as Administrator.

 Task 2: Review the Interview Transcripts


 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab03A\Starter
folder.

 Task 3: Identify and Document Business Processes


 Collaborate with two or three other students.

 Discuss the interviews, and identify the business processes in Adventure Works that generate the data
required to meet the analytical and reporting requirements.

 Prioritize the business processes by their importance to the business in terms of analytical and
reporting requirements.

 Record the business processes you identify in the Matrix.xlsx Excel workbook in the
D:\Labfiles\Lab03A\Starter folder. List them in descending order of priority.

 Task 4: Examine Source Data


 Still working as a small group, examine the data sources that were described in the interviews.

 You can use SQL Server Management Studio to connect to the SQL Server database engine instances
described in the interviews, and you can use Excel to view the sample accounts data.

 You can create a database diagram for each source database in SQL Server Management Studio―this
can be a useful way to familiarize yourself with the source database schemas.

 Query the tables in the source databases to view the data they contain. If desired, you can export data
from source databases to comma-delimited text files and explore them further in Excel.
 If you want to profile any of the data, you can use the Data Profiling task in SQL Server Integration
Services to do so.

 As you examine the data sources, note the potential measures, dimensions, and dimension attributes
you discover; as well as any potential quality or consistency issues in the data.

 Task 5: Identify and Document Dimensions


 Still working as a small group, discuss the data sources and identify the dimensions that can be
supported by the data and meet the analytical and reporting requirements.
 Document the dimensions in the Matrix.xlsx workbook you used earlier, and indicate which
dimensions relate to which business processes by entering “x” in the intersecting cell. A Time
dimension has already been defined in the matrix.
 Try to define conformed dimensions that can be related to multiple business processes.

Results: At the end of this exercise, you will have created a matrix of business processes and dimensions.

Exercise 2: Designing Dimension Models and Data Warehouse Tables


Scenario
You have identified the business processes that generate data for the analytical and reporting
requirements, and the conformed dimensions that can be used to analyze the numerical measures they
generate. Now you must create dimensional models for the business processes and use them to design a
data warehouse schema.

The main tasks for this exercise are as follows:

1. Create a Sun Diagram.

2. Design Data Warehouse tables.

 Task 1: Create a Sun Diagram


 Still working as a small group, discuss the measures and dimensions for the highest priority business
process.

 Identify an appropriate grain to use in the dimensional model for this business process.

 Create a sun diagram with the measures in appropriate fact tables with associated dimensions. You
can use any diagramming tool you want to, for example Visio, PowerPoint, Paint, or pen and paper.

 Add dimension attributes and hierarchies to the sun diagram based on the data attributes you
identified when examining the data and the analytical and reporting requirements gathered from the
interviews.

 As time permits, create sun diagrams for the remaining business processes in descending order of
priority.

 Task 2: Design Data Warehouse tables


 Still working as a small group, design a database schema for the dimensional models you have
created, starting with the highest priority business process.

 You can use Visio, PowerPoint, Paint, the SQL Server Management Studio table designer, or pen and
paper to create your design.

 As you create your design, include documentation about:

o Surrogate and alternate key columns.

o Relationships between tables.


o Slowly changing dimension attributes and metadata columns.

Results: At the end of this exercise, you will have a sun diagram showing the facts, measures,
dimensions, attributes, and hierarchies you have identified, and a database schema diagram showing
your design for dimension and fact tables.

Question: Use Excel to open Matrix.xlsx in the D:\Labfiles\Lab03A\Solution folder and compare it to the matrix your group created during the lab. What are the significant
differences between your solution and the suggested solution, and how would you justify
your choices in the lab?

Question: Use Visio to open Initial Sun Diagram.vsdx in the D:\Labfiles\Lab03A\Solution folder. How do the dimensional models in this document compare to your solution?

Question: Use Visio to open DW Schema.vsdx in the D:\Labfiles\Lab03A\Solution folder. How does the database schema design in this document compare to your solution?

Lesson 4
Designing a Data Warehouse Physical Implementation
After designing the logical schema for the data warehouse, you need to implement it as a physical
database. This requires careful planning for file placement, data structures such as partitions and indexes,
and compression. This lesson discusses considerations for all of these aspects of the physical database
design.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe typical data warehouse I/O activity.

 Plan file placement for a data warehouse.

 Plan partitioning for data warehouse tables.


 Design effective indexes for data warehouse queries.

 Plan data compression for a data warehouse.

 Design views in a data warehouse.

Data Warehouse I/O Activity


Before designing the physical database for the
data warehouse, it is useful to consider the types
of workload that it must support and the data it
must store. The database itself must store
potentially very large fact tables with millions of
rows, and dimension tables that are typically
related to fact tables by a single join. Typically, the
I/O activity in the database is generated by one of
the workloads described in this section or caused
by maintenance operations, such as backups.

ETL
ETL processes affect the data warehouse when
they load new or updated data into the data warehouse tables. In most cases, the inserts are performed as
bulk load operations to minimize logging and constraint checking. The load process may involve some
lookup operations to find alternate keys for slowly changing dimensions, and some update operations for
type 1 dimension changes or data modifications in fact tables where appropriate. Depending on the
design of the data structures, ETL load operations might also involve dropping and rebuilding indexes and
splitting partitions.

Data models
After each new load, any data models based on the data warehouse must be processed. This involves
reading data from the data warehouse tables into the data model and pre-aggregating measures to
optimize analysis queries. Depending on the size of the data warehouse and the time window for the
processing operation, the entire data model may be completely processed after each load, or an
incremental processing approach may be used in which only new or modified data is processed.

Because of the volume of data being loaded into the model, the I/O activity typically involves sequential
table scans to read entire tables―particularly when performing a full process of the data model.

Reports
In some scenarios, all reporting is performed against the data models, so reporting does not affect the
data warehouse tables. However, it is common for some reports to query the data warehouse directly. In
scenarios where IT-provided reports are supported, the queries are generally predictable and retrieve
many rows with range-based query filters―often on a date field.

User queries
If self-service reporting is supported, users may be able to execute queries in the data warehouse (or use
tools that generate queries on their behalf). Depending on the query expertise of the users, this can result
in complex, unpredictable queries.

Considerations for Database Files


In Module 2: Planning BI Infrastructure, some key
considerations for data warehouse storage
hardware were discussed and it was
recommended that storage be provided by
multiple disks configured as RAID 10 or RAID 5
arrays. This storage is presented to the data
warehouse server as multiple logical disks
(sometimes referred to as LUNs, though
technically a LUN is a logical unit number used to
identify a unit of SCSI-based storage). When
designing the file placement for your data
warehouse, you must decide how best to use
these logical disks.

Data files and filegroups


Data files are used to pre-allocate disk storage for database objects. When planning files for a data
warehouse, consider the following guidelines:
 Create files with an initial size based on the eventual size of the objects that will be stored on them.
This pre-allocates sequential disk blocks and helps avoid fragmentation.

 Disable autogrowth. If you begin to run out of space in a data file, it is more efficient to explicitly
increase the file size by a large amount than to rely on incremental autogrowth.

Because the logical disks for the database files are typically already configured as RAID 10 or RAID 5
arrays, you generally do not need to use filegroups to distribute tables across physical disk platters in
order to improve I/O performance. However, you should consider the following guidance for using
filegroups in a data warehouse:

 Create at least one filegroup in addition to the primary filegroup, and then set it as the default
filegroup so you can separate data tables from system tables.

 Consider creating dedicated filegroups for extremely large fact tables and using them to place those
fact tables on their own logical disks.
 If some tables in the data warehouse are loaded on a different schedule from others, consider using
filegroups to separate the tables into groups that can be backed up independently.

 If you intend to partition a large fact table, create a filegroup for each partition so that older, stable
rows can be backed up and then set as read-only.
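
The following sketch applies several of the preceding guidelines by creating a data warehouse database with a dedicated default filegroup for data tables, explicit initial file sizes, autogrowth disabled for the data files, and the simple recovery model. The file names, paths, and sizes are illustrative and should be based on your own capacity planning.

CREATE DATABASE DW
ON PRIMARY
    (NAME = 'DW_sys', FILENAME = 'D:\SQLData\DW_sys.mdf', SIZE = 100MB, FILEGROWTH = 0),
FILEGROUP FG_Data
    (NAME = 'DW_data', FILENAME = 'E:\SQLData\DW_data.ndf', SIZE = 50GB, FILEGROWTH = 0)
LOG ON
    (NAME = 'DW_log', FILENAME = 'L:\SQLLogs\DW_log.ldf', SIZE = 5GB, FILEGROWTH = 1GB);
GO
ALTER DATABASE DW MODIFY FILEGROUP FG_Data DEFAULT;
GO
ALTER DATABASE DW SET RECOVERY SIMPLE;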

Staging tables
Most data warehouses require staging tables to support incremental data loads from the ETL process. In
some cases, you might use a separate staging database as well as staging tables in the data warehouse
itself. Consider the following recommendations for staging tables.

 If a separate staging database is to be used, create it on a separate logical disk from the data
warehouse files.

 If the data warehouse will include staging tables, create a file and filegroup for the staging tables and
place them on a separate logical disk from the fact and dimension tables.

 An exception to the previous guideline is made for staging tables that will be switched with partitions
to perform fast loads. These must be created on the same filegroup as the partition with which they
will be switched.

TempDB
TempDB is used for temporary objects required for query processing. To avoid fragmentation of data files,
place it on a dedicated logical disk and set its initial size based on how much it is likely to be used. You
can leave autogrowth enabled, but set the growth increment to be quite large to ensure that performance
is not interrupted by frequent growth of TempDB. Additionally, consider creating multiple files for
TempDB to help minimize contention during page free space (PFS) scans as temporary objects are created
and dropped.

Transaction logs
Generally, the transaction mode of the data warehouse, staging database, and TempDB should be set to
Simple to avoid having to truncate transaction logs. Additionally, most of the inserts in a data warehouse
are typically performed as minimally logged bulk load operations. To avoid disk resource conflicts
between data warehouse I/O and logging, place the transaction log files for all databases on a dedicated
logical disk.

Backup files
You will need to implement a backup routine for the data warehouse, and potentially for a staging
database. In most cases, you will back up these databases to disk, so allocate a logical disk for this
purpose. You could allocate multiple logical disks and perform a mirrored backup, but because the disks
are already configured as RAID 5 or RAID 10 arrays, this would be of little benefit from a performance
perspective. Note that the backup files should be copied to offsite storage to provide protection in the
case of a complete storage hardware failure or natural disaster.

Table Partitioning
Partitioning a table distributes data across
partitions based on a partition function that
defines a range of values for each partition. A
partition scheme maps the partitions to filegroups,
and the table is partitioned by applying the
partition scheme to the values in a specified
column.

Note: For information about how to implement partitioning, see “Partitioned Tables and Indexes” in SQL Server Books Online.

Why use partitioning?


Partitioning a large table can result in the following benefits:
 Improved query performance. By partitioning a table across filegroups, you can place specific
ranges of data on different disk spindles, which can improve I/O performance. In most data warehouses,
however, the disk storage is already configured as a RAID 10 or RAID 5 array, so this usually has little
benefit. When you use a mix of fast solid state storage for recent, frequently accessed
data, and mechanical disks for older, less queried rows, you can use partitioning to balance disk
performance against storage costs. The biggest performance gain from partitioning in a data
warehouse is realized when queries return a range of rows that are filtered on the partitioning key. In
this case, the query optimizer can eliminate partitions that are not within the filter range, and
dramatically reduce the number of rows that need to be read.

 More granular manageability. When you partition a large table, you can perform some
maintenance operations at the partition level instead of on the whole table. For example, indexes can
be created and rebuilt on a per-partition basis, compression can be applied to individual partitions,
and by mapping partitions to filegroups, you can back up and restore partitions independently. This
enables you to back up older data once and then configure the backed up partitions as read-only.
Future backups can be limited to the partitions that contain new or updated data.
 Improved data load performance. The biggest benefit of using partitioning in a data warehouse is
that it enables you to load many rows very quickly by switching a staging table with a partition. This
technique dramatically reduces the time taken by ETL data loads, and with the right planning, it can
be achieved with minimal requirements to drop or rebuild indexes.

Best practices for partitioning in a data warehouse


When planning a data warehouse, consider the following best practices for partitioning.

 Partition large fact tables. Fact tables of around 50 GB or more should generally be partitioned for
the reasons described earlier. In general, fact tables benefit from partitioning more than dimension
tables.

 Partition on an incrementing date key. When defining a partition scheme for a fact table, use a
date key that reflects the age of the data as it is incrementally loaded by the ETL process. For
example, if a fact table contains sales order data, partitioning on the order date ensures that the most
recent orders are in the last partition and the earliest orders are in the first partition.

 Design the partition scheme for ETL and manageability. In a data warehouse, the query
performance gains realized by partitioning are small compared to the manageability and data load
performance benefits. Ideally, your partitions should reflect the ETL load frequency (monthly, weekly,
daily, and so on) because this simplifies the load process. However, you may want to merge partitions
periodically to reduce the overall number of partitions (for example, at the start of each year, you
could merge the monthly partitions for the previous year into a single partition for the whole year).

 Maintain an empty partition at the start and end of the table. You can use an empty partition at
the end of the table to simplify the loading of new rows. When a new set of fact table rows must be
loaded, you can place them in a staging table, split the empty partition (to create two empty
partitions), and then switch the staged data with the first empty partition (which loads the data into
the table and leaves the second empty partition you created at the end of the table, ready for the
next load). You can use a similar technique to archive or delete obsolete data at the beginning of the
table.

Note: Partitioning is only available in SQL Server Enterprise edition. In previous releases of
SQL Server Enterprise edition, the number of partitions per table was limited to 1,000. In SQL
Server 2012, this limit has been extended to 15,000. On 32-bit systems, you can create a table or
index with over 1,000 partitions, but this is not supported.
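
The following sketch shows the basic pattern of a partition function, a partition scheme, and a fact table partitioned on an integer date key. It assumes that the filegroups FG0000, FG2000, FG2001, and FG2002 have already been created (as they are in the demonstration that follows), and the table and column names are illustrative.

CREATE PARTITION FUNCTION pf_OrderDate (int)
AS RANGE RIGHT FOR VALUES (20000101, 20010101, 20020101);

CREATE PARTITION SCHEME ps_OrderDate
AS PARTITION pf_OrderDate TO (FG0000, FG2000, FG2001, FG2002);

CREATE TABLE dbo.FactOrder
(
    OrderDateKey int NOT NULL,
    ProductKey int NOT NULL,
    SalesAmount money NOT NULL
)
ON ps_OrderDate (OrderDateKey);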

Demonstration: Partitioning a Fact Table


This demonstration shows how to create and use a partitioned table.

Demonstration Steps
Create a Partitioned Table
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. Start SQL Server Management Studio and connect to the localhost instance of the database engine
by using Windows authentication.

3. Open Partitions.sql from the D:\Demofiles\Mod03 folder.

4. Select the code under the comment Create a database, and then click Execute. This creates a
database for the demonstration.

5. Select the code under the comment Create filegroups, and then click Execute. This creates four
filegroups in the demo database.
6. Select the code under the comment Create partition function and scheme, and then click Execute.
This creates a partition function that defines four ranges of values (less than 20000101, 20000101 to
20010100, 20010101 to 20020100, and 20020101 and higher), and a partition scheme that maps
these ranges to the FG0000, FG2000, FG2001, and FG2002 filegroups.

7. Select the code under the comment Insert data into the partitioned table, and then click Execute.
This inserts four records into the table.

View Partition Metadata

1. Select the code under the comment Query the table, and then click Execute. This retrieves rows
from the table and uses the $PARTITION function to show which partition the datekey value in each
row is assigned to. This function is useful for determining which partition of a partition function a
specific value belongs in.

2. Select the code under the comment View filegroups, partitions, and rows, and then click Execute.
This code uses system tables to show the partitioned storage and the number of rows in each
partition. Note that there are two empty partitions; one at the beginning of the table, and one at the
end.

Split a Partition

1. Select the code under the comment Add a new filegroup and make it the next used, and then
click Execute. This creates a new filegroup named FG2003 and adds it to the partition scheme as the
next used partition.

2. Select the code under the comment Split the empty partition at the end, and then click Execute.
This creates a new partition for values of 20030101 and higher and assigns it to the next used
filegroup (FG2003), leaving an empty partition for values between 20020101 and 20030100.

3. Select the code under the comment Insert new data, and then click Execute. This inserts two new
rows into the partitioned table.

4. Select the code under the comment View partition metadata, and then click Execute. This shows
that the two rows inserted in the previous step are in partition 4, and that partition 5 (on FG2003) is
empty.
Merge Partitions

1. Select the code under the comment Merge the 2000 and 2001 partitions, and then click Execute.
This merges the partition that contains the value 20010101 into the previous partition.
2. Select the code under the comment View partition metadata, and then click Execute. This shows
that partition 2 (on FG2000) now contains four rows, and that the partition that was previously on
FG2001 has been removed.

Considerations for Indexes


Most databases use indexes to maximize query
performance, and planning these indexes is an
important part of the database design process.
Before deciding which indexes to create, you need
to understand the workloads that the database
must support, and balance the need for improved
query performance against the effect that indexes
will have on data inserts and updates as well as
the overhead of maintaining indexes.

At first glance, a data warehouse seems to support mostly read operations; therefore, many
inexperienced BI professionals are tempted to
create many indexes on all tables to support queries. However, another significant workload in most data
warehouses is the regular ETL data load, which can often involve many inserts and updates. Too many
indexes can slow down the ETL load process, and the need to periodically reorganize or rebuild indexes
can create a significant maintenance overhead.
The first consideration for indexing in a data warehouse is to determine whether any indexes are required.
It may seem unconventional to consider not creating indexes, but if the volume of fact data is relatively
small, and all user access to the data is through a data model that is fully processed after each data load,
there may be little performance benefit in maintaining indexes in the data warehouse. However, if your
data warehouse does not match this restrictive description, you will probably need to consider creating
some indexes. As with any database, the indexes you should create depend on the specific queries your
data warehouse must support and the need to balance the performance of those queries against data
inserts and updates, and index maintenance. However, in most data warehouse scenarios, you should
consider the guidelines in this topic as a starting point for index design.

Dimension table indexes


When designing indexes for dimension tables, consider the following guidelines:

 Create a clustered index on the surrogate key column. This column is used to join the dimension table
to fact tables, and a clustered index will help the query optimizer minimize the number of reads
required to filter fact rows.

 Create a non-clustered index on the alternate key column and include the SCD current flag, start
date, and end date columns. This index will improve the performance of lookup operations during
ETL data loads that need to handle slowly changing dimensions.

 Create non-clustered indexes on frequently searched attributes, and consider including all members
of a hierarchy in a single index.
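
For example, the following sketch applies these guidelines to a customer dimension table, assuming that the primary key on the surrogate key column already provides the clustered index. The table and column names are illustrative.

-- Non-clustered index on the alternate key to support SCD lookups during ETL loads.
CREATE NONCLUSTERED INDEX nix_DimCustomer_AltKey
ON dbo.DimCustomer (CustomerAltKey)
INCLUDE (CurrentFlag, StartDate, EndDate);

-- Non-clustered index covering the members of the geography hierarchy.
CREATE NONCLUSTERED INDEX nix_DimCustomer_Geography
ON dbo.DimCustomer (Country, State, City);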

Fact table indexes


When designing indexes for a fact table, consider the following guidelines:
 Create a clustered index on the most commonly searched date key. Date ranges are the most
common filtering criteria in most data warehouse workloads, so a clustered index on this key should
be particularly effective in improving overall query performance.
 Create non-clustered indexes on other, frequently searched dimension keys.

Columnstore indexes
SQL Server 2012 introduces columnstore indexes, an in-memory indexing solution that uses xVelocity
compression technology to organize index data in a column-based format instead of the row-based
format used by traditional indexes. Columnstore indexes are specifically designed to improve the
performance of queries against large fact tables joined to smaller dimension tables in a star schema, and
can dramatically improve the performance of most data warehouse queries. In many cases, you can
achieve the same performance improvements or better by replacing the recommended fact table indexes
described previously with a single columnstore index that includes all of the columns in the fact table.
There are some queries that do not benefit from columnstore indexes (for example, queries that return an
individual row from a dimension table will generally perform better by using a conventional clustered or
non-clustered index), but for most typical data warehouse queries that aggregate many fact rows by one
or more dimension attributes, columnstore indexes can be very effective.
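
For example, a single columnstore index covering every column of a fact table might be created as shown below. This is a minimal sketch that assumes a FactOrder table with the columns listed; in SQL Server 2012, only non-clustered columnstore indexes are supported.

-- Replace the individual fact table indexes with one columnstore index
-- that includes all columns (column names are illustrative).
CREATE NONCLUSTERED COLUMNSTORE INDEX CSIX_FactOrder
ON dbo.FactOrder (OrderDateKey, CustomerKey, ProductKey, Quantity, SalesPrice);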

Note: For more information about columnstore indexes, see “Columnstore Indexes” in SQL
Server Books Online.

A consideration for using columnstore indexes is that a table with a columnstore index defined on any of
its columns is read-only. For most fact tables, this restriction does not affect user query workloads,
because data warehouses are designed to support reporting and analytical queries, not transaction
processing. However, when the ETL process needs to load new data or update existing fact rows, a
columnstore index must be dropped and recreated. For an unpartitioned fact table, the overhead of
recreating the columnstore index after each data load can be significant. However, if the table is
partitioned, you can use the ability to switch partitions and staging tables to load or update data without
dropping the columnstore index.

Note: Techniques for loading partitioned fact tables with columnstore indexes are
discussed in Module 4: Designing an ETL Solution.

Demonstration: Creating Indexes


This demonstration shows how to create indexes and assess their performance benefits.

Demonstration Steps
Create Indexes on Dimension Tables

1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. Start SQL Server Management Studio and connect to the localhost instance of the database engine
by using Windows authentication.

3. Open Indexes.sql from the D:\Demofiles\Mod03 folder.

4. Select the code under the comment Create the data warehouse, and then click Execute. This
creates a database for the demonstration.

5. Select the code under the comment Create the DimDate dimension table, and then click Execute.
This creates a time dimension table named DimDate.
6. Select the code under the comment Populate DimDate with values from 2 years ago until the end of
this month, and then click Execute. This adds rows to the DimDate table.

7. Select the code under the comment Create indexes on the DimDate table, and then click Execute.
This creates a clustered index on the surrogate key column, and non-clustered indexes on commonly
queried attribute columns.

8. Select the code under the comment Create the DimCustomer table, and then click Execute. This
creates a dimension table named DimCustomer and inserts some customer data.

9. Select the code under the comment Create indexes on the DimCustomer table, and then click
Execute. This creates a clustered index on the surrogate key column, and non-clustered indexes on
commonly queried attribute columns.

10. Select the code under the comment Create the DimProduct table, and then click Execute. This
creates a dimension table named DimProduct and inserts some product data.
11. Select the code under the comment Create indexes on the DimProduct table, and then click
Execute. This creates a clustered index on the surrogate key column, and non-clustered indexes on a
commonly queried attribute column.

View Index Usage and Execution Statistics

1. Select the code under the comment Create a fact table, and then click Execute. This creates a fact
table named FactOrder that contains more than 7.5 million rows from the existing data in the
dimension tables.

2. On the toolbar, click the Include Actual Execution Plan button.

3. Select the code under the comment View index usage and execution statistics, and then click
Execute. This enables statistics messages and queries the tables in the data warehouse to view orders
for the previous six months.

4. After query execution completes, in the results pane, click the Messages tab. Note the logical reads
from each table―the number from the FactOrder table should be considerably higher than the
dimension tables; and note the CPU time and elapsed time for the query.

5. Click the Execution Plan tab, which shows a visualization of the steps the query optimizer used to
execute the query. Scroll to the right and to the bottom, and note that a table scan was used to read

data from the FactOrder table. Then hold the mouse pointer over each of the Index Scan icons for
the dimension tables to see which indexes were used.

6. Execute the selected code again and compare the results when the data is cached.

Create Indexes on a Fact Table

1. Select the code under the comment Create traditional indexes on the fact table, and then click
Execute. This creates a clustered index on the date dimension key, and non-clustered indexes on the
other dimension keys (the operation can take a long time).

2. Select the code under the comment Empty the cache, and then click Execute. This clears any cached
data.

3. Select the code under the comment Test the traditional indexes, and then click Execute. This
executes the same query as earlier.

4. Click the Messages tab and compare the number of logical reads for the FactOrder table and the
CPU and elapsed time values with the previous execution. They should all be lower.

5. Click the Execution Plan tab and note that the clustered index on the date key in the fact table was
used.

6. Execute the selected code again and compare the results when the data is cached.

Create a Columnstore Index


1. Select the code under the comment Create a copy of the fact table with no indexes, and then click
Execute. This creates an un-indexed copy of the FactOrder table named FactOrderCS.

2. Select the code under the comment Create a columnstore index on the copied table, and then
click Execute. This creates a columnstore index on all columns in the FactOrderCS table.

3. Select the code under the comment Empty the cache again, and then click Execute. This clears any
cached data.
4. Select the code under the comment Test the columnstore index, and then click Execute. This
executes the same query as earlier.

5. Click the Messages tab and compare the number of logical reads for the FactOrderCS table and the
CPU and elapsed time values with the previous execution. They should all be lower.

6. Click the Execution Plan tab and note that the columnstore index on the fact table was used.

7. Execute the selected code again and compare the results when the data is cached.

Data Compression
SQL Server 2012 Enterprise edition supports data
compression at both page and row level. Row
compression stores all fields in a variable width
format and reduces the number of bytes used to
store each field if possible. Page compression
applies the same compression technique to rows
on a page and also identifies redundant values
and stores them only once per page. You can
apply compression to a table, an index, or a
partition.

Data compression in a data warehouse brings the following benefits:

 Reduced storage requirements. Although results vary, on average, most data warehouses can be
compressed at a ratio of 3.5 : 1, reducing the amount of disk space required to host the data files by
more than two thirds.

 Improved query performance. Compression can improve query performance in two ways. First,
fewer pages must be read from disk, so I/O is reduced; and second, more data can be stored on a
page, and therefore cached.

When page or row compression is used, data must be compressed and decompressed by the CPU, so the
performance gains resulting from compression must be balanced by the increase in CPU workload.
However, in most adequately specified data warehouse servers, the additional workload on CPU is
minimal compared to the benefits gained by compressing the data.

Best practices for data compression in a data warehouse


When planning tables, partitions, and indexes in a data warehouse, consider the following best practices
for data compression:
 Use page compression on all dimension tables and fact table partitions.

 If performance is CPU-bound, revert to row compression on frequently accessed partitions.
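
The following sketch shows how these practices might be applied, assuming a dbo.FactOrder table and an IX_FactOrder_ProductKey index; the names are illustrative. The sp_estimate_data_compression_savings system stored procedure can be used to estimate the benefit before compressing.

-- Estimate the space saving for page compression on the fact table.
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'FactOrder',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Apply page compression to the table (and its clustered index, if one exists).
ALTER TABLE dbo.FactOrder
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Apply page compression to a non-clustered index.
ALTER INDEX IX_FactOrder_ProductKey ON dbo.FactOrder
REBUILD WITH (DATA_COMPRESSION = PAGE);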

Demonstration: Implementing Data Compression


This demonstration shows a comparison between an uncompressed data warehouse and an identical
compressed one.

Demonstration Steps
Create Uncompressed Tables and Indexes

1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. Use Windows Explorer to view the contents of the D:\Demofiles\Mod03 folder, and set the folder
window to Details view and resize it if necessary so that you can see the Size column.

3. Start SQL Server Management Studio and connect to the localhost instance of the database engine
by using Windows authentication.

4. Open Compression.sql from the D:\Demofiles\Mod03 folder.

5. Select the code under the comment Create the data warehouse (from line 2 to line 113 in the script),
and then click Execute. This creates a database with uncompressed tables.

6. While the script is still executing, view the contents of the D:\Demofiles\Mod03 folder and note the
increasing size of DemoDW.mdf. This is the data file for the database.

Note: The log file (DemoDW.ldf) will also be growing, but you can ignore this.

7. When execution is complete (after approximately 3 minutes), view the final size of DemoDW.mdf and
return to SQL Server Management Studio.

Estimate Compression Savings

1. Select the code under the comment Estimate size saving (line 119 in the script), and then click
Execute. This uses the sp_estimate_data_compression_savings system stored procedure to
compress a sample of the FactOrder table (which consists of a clustered index and two non-clustered
indexes).

2. View the results returned by the stored procedure, noting the current size and estimated compressed
size of each index.

Create Compressed Tables and Indexes

1. Select the code under the comment Create a compressed version of the data warehouse (from
line 125 to line 250 in the script), and then click Execute. This creates a database with compressed
tables and indexes.

2. While the script is still executing, view the contents of the D:\Demofiles\Mod03 folder and note the
increasing size of CompressedDemoDW.mdf. This is the data file for the database.

Note: The log file (CompressedDemoDW.ldf) will also be growing, but you can ignore this.

3. When execution is complete (after approximately 3 minutes), compare the final size of
CompressedDemoDW.mdf with DemoDW.mdf (the file for the compressed database should be
smaller) and return to SQL Server Management Studio.

Compare Query Performance

1. Select the code under the comment Compare query performance (from line 255 to line 277 in the
script), and then click Execute. This executes an identical query in the compressed and uncompressed
databases and displays execution statistics.

2. When execution is complete, click the Messages tab and compare the statistics for the two queries.
The execution time statistics (the second and third set of figures labeled “SQL Server Execution Time”)
should be similar, and the second query (in the compressed database) should have used considerably
fewer logical reads for each table than the first.

Using Views to Abstract Base Tables


You can create views in a data warehouse to
abstract the underlying fact and dimension tables.
Although views are not always necessary, you
should consider the following guidelines for views
when planning a data warehouse:

 Create a view for each dimension and fact


table, and use the NOLOCK query hint in the
view definition. You can then use these views
for all data access from clients instead of the
base tables, which will eliminate locking
overhead and optimize concurrency.

 Create views with user-friendly view and


column names. Often, a naming convention (such as prefixing dimension tables with “dim” and fact
tables with “fact”) is used when creating the tables in a data warehouse. Naming conventions such as
this are useful for database designers and administrators, but they can confuse business users.
Creating a layer of views with user-friendly names makes it easier for users to create their own data
models and reports from the data warehouse.

 Do not include metadata columns in views. Some columns are used for ETL operations or other
administrative tasks, and can be omitted from views that will be consumed by business users. For
example, SCD current flag, start date, and end date columns may not be required for end user
reporting or data models, so you can create views that do not include them.

 Create views to combine snowflake dimension tables. If you have included snowflake dimensions in
your dimensional model, create a view for each set of related dimension tables to create a single
logical dimension table.

 Partition-align indexed views. SQL Server supports indexed views, which can be partitioned using the
same partition scheme as the underlying table. If you use indexed views, you should partition-align
them to support partition switching that does not require the indexes on the views to be dropped
and recreated.

 Use the SCHEMABINDING option. This ensures that the underlying tables cannot be dropped or
modified in such a way as to invalidate the view unless the view itself is dropped first. The
SCHEMABINDING option is a requirement for indexed views.
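
The following sketch illustrates several of these guidelines for a hypothetical dbo.DimCustomer table; the view, table, and column names are assumptions for illustration only.

-- User-friendly view over a dimension table. SCHEMABINDING prevents breaking schema
-- changes, the NOLOCK hint eliminates locking overhead for read-only reporting queries,
-- and the ETL metadata columns (CurrentFlag, StartDate, EndDate) are omitted.
CREATE VIEW dbo.Customer
WITH SCHEMABINDING
AS
SELECT CustomerKey    AS [Customer Key],
       CustomerAltKey AS [Customer ID],
       Name           AS [Customer Name],
       City,
       Country
FROM dbo.DimCustomer WITH (NOLOCK);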

Lab B: Designing a Data Warehouse Physical


Implementation
Scenario
You have designed a logical schema for a data warehouse, and now you must design the physical
implementation of the database. You have been provided with a database server to host the data
warehouse and related files.

Objectives
After completing this lab, you will be able to:

 Design storage for a data warehouse solution.

 Design data warehouse data structures.

Estimated Time: 90 Minutes


Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Designing File Storage


Scenario
Before you design the physical implementation of the data warehouse database, you must assess the
available storage and decide how you will use the disks available to you. You have been assigned a
database server and can use logical drives E to N for data warehouse storage. For the purposes of this lab,
you should assume that each volume is configured as a RAID 10 array on a storage area network (SAN),
and that there are no size restrictions on the volumes (in reality, the logical drives are partitions on a
virtual disk, each with approximately 11 GB of available disk space).

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Plan Data Warehouse Storage.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.
 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab03B\Starter folder as Administrator.

 Task 2: Plan Data Warehouse Storage


 Collaborate with two or three other students.

 Review the available storage (drives E, F, G, H, I, J, K, L, M, and N on MIA-SQLBI).

 Review your database schema design from the previous lab. If you did not complete the previous lab,
review DW Schema.vsdx in the D:\Labfiles\Lab03B\Starter folder. Note any fact tables that are likely to
become very large.

 Discuss and agree a storage plan that uses some or all of the available drives and includes
considerations for all aspects of the data warehouse.

 You can refer to the information in the “Considerations for Database Files” topic in the “Designing a
Data Warehouse Physical Implementation” lesson.

 Document your planned usage of the logical drives in the AWDataWarehouse.docx document in the
D:\Labfiles\Lab03B\Starter folder.

Results: At the end of this exercise, you should have a document that contains a table describing your
planned usage for each logical volume of the data warehouse server.

Exercise 2: Designing Warehouse Data Structures


Scenario
Now that you have planned the physical storage for the data warehouse, you must design the physical
implementation of the data warehouse database. This includes planning partitions, indexes, compression,
and views.

The main tasks for this exercise are as follows:

1. Plan Partitioning.
2. Plan Indexes.

3. Plan Compression.

4. Plan Views.

 Task 1: Plan Partitioning


 Still working as a small group, discuss the data warehouse schema you reviewed in the previous
exercise and decide which tables (if any) should be partitioned.

 If you plan to partition any tables, decide which column you will partition the table on, and the range
of data values to be allocated to each partition.

 If you want to experiment with your partitioned table design, create a test database in the localhost
instance of SQL Server.
Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about partitioned tables and indexes.

 Document your partitioning plan in the AWDataWarehouse.docx document you used in the previous
exercise, including justification for your decisions.

 Task 2: Plan Indexes


 Still working as a small group, discuss the indexes (if any) you want to include in the data warehouse.

 If you plan to include any indexes, decide the columns to be indexed in each table, and the types of
index to be used.

 If you want to experiment with indexes, create a test database in the localhost instance of SQL
Server.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about indexes.

 Document your index plan in the AWDataWarehouse.docx document you used in the previous
exercise, including justification for your decisions.

 Task 3: Plan Compression


 Still working as a small group, discuss the compression (if any) you want to include in the data
warehouse.

 If you plan to use compression, decide which tables and indexes you will compress, and the type of
compression to be used.

 If you want to experiment with compression, create a test database in the localhost instance of SQL
Server.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about compression.

 Document your compression plan in the AWDataWarehouse.docx document you used in the previous
exercise, including justification for your decisions.

 Task 4: Plan Views


 Still working as a small group, discuss the views (if any) you want to include in the data warehouse.
 If you plan to include any views, decide the views to be created and the settings used for each view.

 If you want to experiment with views, create a test database in the localhost instance of SQL Server.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about views.

 Document your view plan in the AWDataWarehouse.docx document you used in the previous
exercise, including justification for your decisions.

Results: At the end of this exercise, you will have a document that contains information about your
plans for partitions, indexes, compression, and views in the data warehouse.

Question: As in the previous lab, there is no definitive correct solution, but a sample
solution has been provided. To view the sample data warehouse implementation, run Setup
Solution.cmd in the D:\Labfiles\Lab03B\Solution folder as Administrator. Then, after the
script finishes running, use SQL Server Management Studio to connect to the MIA-SQLDW
instance of the database engine and examine the AWDataWarehouse database. You can use
the AWDataWarehouse.docx document in the D:\Labfiles\Lab03B\Solution folder as a guide
to the key features of the solution.
After spending some time reviewing the solution, what are the key aspects of the
implementation that differ from your design in the lab, and how else might you have
designed the solution?

Module Review and Takeaways


Module Review
This module has described considerations for translating business requirements and information about
business processes into a dimensional model, and then implementing that model as a data warehouse.
Every business is different, and each has its unique challenges and processes. You should use the
techniques and guidance in this module as a starting point, but be prepared to adapt typical data
warehouse schema elements to match particular business requirements.

Question: When designing a data warehouse, is it better or worse to have a strong


background in transactional database design?

Module 4
Designing an ETL Solution
Contents:
Module Overview 4-1

Lesson 1: ETL Overview 4-2

Lesson 2: Planning Data Extraction 4-5

Lesson 3: Planning Data Transformation 4-9

Lesson 4: Planning Data Loads 4-13

Lab: Designing an ETL Solution 4-20

Module Review and Takeaways 4-28

Module Overview
The extract, transform, and load (ETL) element of the business intelligence (BI) solution is what makes it
possible to provide up-to-date analytical and reporting data in the data warehouse. Although some ETL
processes can be simple, many BI professionals find the design and implementation of effective ETL
solutions to be the most challenging aspect of creating a BI solution.
This module describes some general considerations for designing an ETL solution, and then it discusses
specific considerations for planning data extraction, transformation, and load processes.

Objectives
After completing this module, you will be able to:

 Describe considerations for designing an ETL solution.

 Plan data extraction.

 Plan data transformations.

 Plan Data loads.



Lesson 1
ETL Overview
When planning an ETL solution, your first step is to gain a high level understanding of the data flows that
must be performed to copy data from source systems to the tables in the data warehouse. This lesson
introduces commonly used ETL architectures that you should consider, and some useful techniques for
planning and documenting data flows that will help you design and maintain your ETL solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe how ETL design fits into the BI project framework.

 Describe commonly used ETL architectures.

 Document high-level data flows.


 Create source to target mappings.

ETL in a BI Project
ETL design is closely related to data warehouse
design. Indeed, many data warehouse design
decisions, such as metadata columns for slowly
changing dimensions, indexes, and partitioning,
are made with ETL processes in mind. As with all
aspects of a BI solution, the ETL design is driven by
business requirements. These requirements dictate
constraints for the ETL design, including:

 The frequency with which data in the data


warehouse must be updated to reflect source
system activities.

 Which attributes must retain historical values


in slowly changing dimensions.

 Whether the ETL solution must support updates to existing fact records.

 How long data must be stored in the data warehouse for analysis and reporting before being deleted
or archived.

In addition to the business requirements, when planning an ETL solution, you need to consider the
following:

 What auditing and logging is required?

 What should be done with extracted rows that fail data validation requirements?

 How will exceptions in the data flow be handled, logged, and communicated?

Common ETL Data Flow Architectures


Fundamentally, ETL is concerned with data flow
from source systems to the data warehouse. The
data flow process can be performed directly from
source to target, or it can be performed in stages.
Factors that affect the choice of data flow
architecture include:

 The number of data sources.

 The volume of data to be transferred.

 The complexity of validation and


transformation operations to be applied to
the data.

 How frequently data is generated in source systems, and how long it is retained.
 Suitable times to extract source data while minimizing the impact on performance for users.

Single-stage ETL
In a very small BI solution with few data sources and simple data requirements, it may be possible to copy
data from data sources to the data warehouse in a single data flow. Basic data validations (such as
checking for NULL fields or specific value ranges) and transformations (such as concatenating multiple
fields into a single field, or looking up a value from a key) can either be performed during extraction (for
example, in the Transact-SQL statement used to retrieve data from a source database) or in-flight (for
example, by using transformation components in an SSIS data flow task).

Two-stage ETL
In many cases, a single-stage ETL solution is not suitable because of the complexity or volume of data
being transferred. Additionally, if multiple data sources are used, it is common to synchronize loads of the
data into the data warehouse to ensure consistency and integrity across fact and dimension data from
different sources, and to minimize the performance impact of the load operations on data warehouse
activity. If the data is not ready to extract from all systems at the same time, or if some sources are only
available at specific times when others are not available, a common approach is to stage the data in an
interim location before loading it into the data warehouse.

Typically, the structure of the data in the staging area is similar to the source tables, which minimizes the
extraction query complexity and duration in the source systems. When all source data is staged, it can
then be conformed to the data warehouse schema during the load operation―either as it is extracted
from the staging tables or during the data flow to the data warehouse.

Staging the data also provides a recovery point for data load failures and enables you to retain extracted
data for audit and verification purposes.

Three-stage ETL
A two-stage data flow architecture can reduce the extraction overhead on source systems and enable a
coordinated load of data from multiple sources. However, performing validation and transformations
during the data flow into the data warehouse can affect load performance, and cause the load to
negatively affect data warehouse activity. When large volumes of data must be loaded into the data
warehouse, it is important to minimize load times by preparing the data as much as possible before
performing the load operation.

For BI solutions that involve loading large volumes of data, a three-stage ETL process is recommended. In
this data flow architecture, the data is initially extracted to tables that closely match the source system
schemas―often referred to as a “landing zone.” From here, the data is validated and transformed as it is

loaded into staging tables that more closely resemble the target data warehouse tables. Finally, the
conformed and validated data can be loaded into the data warehouse tables.

Documenting High-Level Data Flows


An important part of designing an ETL solution is
to document the data flows you will need to
implement. The diagrams and notes that
document your data flows is commonly referred
to as “source-to-target” documentation, and it
commonly starts with a simple high-level diagram
for each table in the data warehouse. The diagram
shows the source tables from which the data
warehouse table fields originate and the validation
and transformations that must be applied during
the data flow.

As a general rule, use a consistent diagramming approach for each table, and include as much detail about validation rules, transformations, and potential
issues as you can. It is common for these high-level diagrams to start simple and be refined as the ETL
design evolves.

Creating Source to Target Mappings


As your ETL design is refined, you will start to
develop a clear idea of what fields will be
extracted, generated, and validated at each stage
of the data flow. To help you document the
lineage of the data as it flows from the source to
the data warehouse tables, you can create detailed
source-to-target mappings that show detailed
information for the fields at each stage.
A common way to create a source to target
mapping is to use a spreadsheet divided into a set
of columns for each stage in the data flow. Start
with the fields in the target table, and then work
backward to determine the required staging, landing zone, and source fields along with any validation
rules and transformations that must be applied. The goal is to create a single document in which the
origins of a field in the target table can be traced back across a row to its source.

As with high-level data flow diagrams, many BI professionals have adopted different variations of source-to-
target mapping. If the organization in which you are working does not have a standard format for this
kind of documentation, the important thing is to use a consistent format that is helpful during ETL design
and easy to understand for anyone who needs to troubleshoot or maintain the ETL system in the future.

Lesson 2
Planning Data Extraction
The first stage in any ETL process is to extract data from source systems. This lesson describes some
important considerations for planning and designing data extraction processes.

Lesson Objectives
After completing this lesson, you will be able to:

 Profile source systems.

 Describe techniques for identifying new and modified rows.

 Plan data extraction windows.

Profiling Source Systems


Before you start to plan specific data extraction
operations, you should start by considering some
more general factors that affect the design of the
data extraction solution. The following are some
common considerations:

 What data sources are there, and how will the


ETL solution connect to them?

An obvious starting point is to catalog the data sources that are required to populate the tables in
the data warehouse. You will also need to
determine the types of each data source, what
options are available to connect to them, and the
credentials that the ETL process will use to access them. Some data may be stored in proprietary
applications, for which it is difficult to obtain drivers or providers. Even when a provider is available, you
may find that you can achieve better performance by having the application export its data to text files
that can be consumed by the ETL process.

 What data types and formats are used in each source system?

Next, examine the data sources to determine data type compatibility with the target tables in the data
warehouse. In many cases, you will need to use transformations to cast or convert source data types into
compatible target data types. Common data type issues include:

o Numeric data that is stored in text format in source systems―for example, a text file exported
from an accounts system is likely to contain numeric values as text fields.

o Numeric data that is the wrong numeric type―for example, a decimal column in a source table
that must be mapped to an integer column in the target table.

o Numeric data that is in the right data type but the wrong size―for example, an integer in a
source system that must be mapped to a tinyint column in the target table.

o Variations in date and time formats―for example, a datetime column in the source that must be
mapped to a date column in the target table.

o Text data that may need to be truncated―for example, a source nvarchar(100) column mapped
to a target nvarchar(50) column.

o Variations in data encoding―a simple example is an ASCII source column mapped to a Unicode
target column (such as a varchar field mapped to an nvarchar field). More complex issues can
arise if, for example, mainframe data in EBCDIC or packed decimal encoding must be extracted.

 What data integrity and validation issues exist in the source data?

After documenting the source and target data types, examine the source data to identify any data
integrity issues. Common issues include:

o Columns with a high proportion of null values. You need to decide how to handle null values
in the data flow, by ignoring them, using ISNULL to assign an alternative value (such as
“unknown” or -1), or by redirecting them to an interim table for assessment and cleansing.

o Lookup columns with missing values. In most relational databases, referential integrity is
enforced between foreign keys and their corresponding primary key columns. However, you
cannot always rely on this, and you should try to identify cases where a lookup value in one table
does not have a matching row in the lookup table. If such rows exist, you may need to use an
OUTER join combined with an ISNULL expression to extract all rows.

o Columns with a specific range of valid values. For example, a Gender column might be
defined with a char(1) data type. All non-null values in this column should be either M or F. You
should examine the data source to determine whether this validation rule is enforced; and if it
isn’t, you should decide how you will find and handle invalid rows in the data flow.
o Data quality issues. Spend time with a business user who understands the data in context, and
identify potential data quality issues such as mistyped or misspelled values in free-form text entry
fields, or multiple values that mean the same thing.

Tools and techniques


When examining data sources, consider using the following tools and techniques:

 Use SQL Server Management Studio to examine table metadata and data in SQL Server data sources.
 Extract a sample of data to a text file by using the Import and Export Wizard or an SSIS package, and
examine the data in Excel.

 Use the Data Profiling task in an SSIS package to gather statistics about the source data.

If the source data has potential data quality issues, consider using Data Quality Services to identify and
resolve them.

Identifying New and Modified Rows


In addition to the initial data warehouse load, your
ETL processes should also minimize the amount of
data extracted to reduce load times and hardware
resource utilization. Ideally, in each data source,
the ETL process should identify rows that were
added, modified, or deleted since the previous
load cycle and limit the extraction to those rows.

Data modification time fields


When reviewing data sources, particularly
relational databases, you may find that tables
include a timestamp or datetime column that is
used to log changes to the row. In some cases, this
column can be compared to a logged extraction time and used to identify new or modified rows.
However, you should be careful to ensure that the column is used consistently by all applications that
modify data in the source system, and for all types of data modifications. It is not uncommon for these
types of column to be updated by triggers that are disabled during bulk load operations, in which case
they may not be reliable as an indicator for rows that should be extracted.
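
Where such a column can be trusted, the extraction query simply filters on the last logged extraction time. The following sketch assumes a Sales.Orders source table with a ModifiedDate column and an etl.ExtractLog watermark table; all names are illustrative.

-- Extract only the rows changed since the last successful extraction.
DECLARE @LastExtract datetime =
    (SELECT LastExtractTime FROM etl.ExtractLog WHERE TableName = 'Sales.Orders');

SELECT OrderID, OrderDate, CustomerID, Amount, ModifiedDate
FROM Sales.Orders
WHERE ModifiedDate > @LastExtract;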

Database modification tracking functionality


Alternatively, many data sources, such as SQL Server databases, can automatically track modifications and
expose metadata about modified rows. SQL Server supports two techniques for recording data
modifications: change tracking and change data capture (CDC). Additionally, SQL Server Integration
Services in SQL Server Enterprise Edition includes components that can be used to simplify extractions of
data that is tracked by CDC.

Custom change detection functionality


Some source systems may not provide any way to track data modifications. In this case, you will either
need to extract all rows during each ETL load cycle, or create a custom solution to identify modified
records. One approach to creating this kind of solution is to generate a hashed value from all updateable
columns in each source row and store the hashed values in the staging database. You can then compare
the stored hashed value against the rows during extraction and only extract rows where the stored hashed
value does not match the generated hash value. However, before using this approach in production, you
should verify that the overhead of generating and comparing hash values does not exceed that of simply
extracting and processing all rows.

Planning Extraction Windows


To help you determine when to perform the data
extraction process, consider the following
questions:

 How frequently is new data generated in the


source systems, and for how long is it
retained?

Some business applications generate only a few transactions per day, and store the transaction
details permanently. Others generate transient
feeds of data that must be captured in real-time.
The volume of changes and storage interval of the
source data will determine the frequency of
extraction required to support the business requirements.

 What latency between changes in source system and reporting is tolerable?

Another factor in planning data extraction timings is the requirement for the data warehouse to be kept
up to date with changes in the source systems. If real-time (or near real-time) reporting must be
supported, data must be extracted and loaded into the data warehouse as soon as possible after each
change. Alternatively, if all reporting and analysis is historical, you may be able to leave a significant
period of time (for example, a month) between data warehouse loads. However, note that you do not
need to match data extractions one-to-one with data loads. If less overhead is created in the data source
by a nightly extraction of the day’s changes than a monthly extraction, you might choose to stage the
data nightly, and then load it into the data warehouse in one load operation at the end of the month.

 How long does data extraction take?

Perform a test extraction and note the time taken to extract a specific number of rows. Then, based on
how many new and modified rows are created in a particular time period, estimate the time an extraction
would take if performed hourly, daily, weekly, or at any other interval that makes sense based on your
answers to the first two questions.
 During what time periods are source systems least heavily used?

Some data sources may be available only during specific periods, and others might be too heavily used
during business hours to support the additional overhead of an extraction process. You must work closely
with the administrators and users of the data sources to identify the ideal data extraction time periods for
each source.

After you consider these questions for all source systems, you can start to plan extraction windows for the
data. Note that it is common to have multiple sources with different extraction windows, so that the
elapsed time to stage all of the data might be several hours or even days.

Lesson 3
Planning Data Transformation
Most ETL processes require data transformations to conform source data to target table schemas. This
lesson describes some important considerations for planning and designing data transformations.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for choosing where in the data flow to perform transformations.

 Describe considerations for using Transact-SQL or SSIS data flow transformations.

 Describe considerations for handling invalid rows and errors.

 Plan data flow auditing.

Where to Perform Transformations


When planning an ETL solution, you must consider
the transformations that need to be applied to the
data to validate it and conform it to the target
table schemas, and where in the ETL process these
transformations should be applied.

Performing transformations on extraction
If the data sources support it, you can perform
transformations in the queries used to extract
data. For example, in a SQL Server data source,
you can use joins, ISNULL expressions, CAST and
CONVERT expressions, and concatenation
expressions in the SELECT query used to extract the data. In an enterprise BI solution, this technique can
be used during the following extractions:
 Extraction from the source system.

 Extraction from the landing zone.

 Extraction from the staging area.

Performing transformations in the data flow


You can use SSIS data flow transformations to transform data during the data flow. For example, you can
use lookups, derived column transformations, and custom scripts to validate and modify rows in a data
flow; and you can use merge and split transformations to combine or create multiple data flow paths. In
an enterprise BI solution, this technique can be used during the following data flows:

 Source to landing zone.

 Landing zone to staging.

 Staging to data warehouse.



Performing transformations in-place


In some cases, it might make sense to transfer data from sources into one or more database tables, and
then perform UPDATE operations to modify the data in-place before the next phase of the ETL data flow.
For example, you might extract data from one source and stage it, and then update coded values based
on data from another source that is extracted during a later extraction window. In an enterprise BI
solution, this technique can be used in the following locations:

 Landing zone tables.

 Staging tables.

Guidelines for choosing where to perform transformations


Although there is no single correct place in the data flow to perform transformations, consider the
following guidelines for designing your solutions:

 Minimize the extraction workload on source systems. This enables you to extract the data in the
shortest time possible with minimal adverse effect on business processes and applications that use the
data source.
 Perform validations and transformations in the data flow as soon as possible. This enables you to
remove or redirect invalid rows and unnecessary columns early in the extraction process and reduce
the amount of data being transferred across the network.
 Minimize the time it takes to load the data warehouse tables. This enables you to get the new data
into production as soon as possible and perform the load with minimal adverse effect on data
warehouse users.

Transact-SQL vs. Data Flow Transformations


You can use Transact-SQL statements to transform
or validate columns during extraction or in-place.
Alternatively, you can use SSIS data flow
transformations to modify the data during the
data flow. The following table lists some typical
validation and transformation scenarios, together
with information about how to use Transact-SQL
or data flow transformations to implement a
solution.

Scenario: Data type conversion
Transact-SQL: Use the CAST or CONVERT function.
Data flow transformations: Use the Data Conversion transformation.

Scenario: Concatenation
Transact-SQL: Concatenate fields in the SELECT clause of the query.
Data flow transformations: Use the Derived Column transformation.

Scenario: Replacing NULL values
Transact-SQL: Use the ISNULL function.
Data flow transformations: Use the Derived Column transformation with an expression containing the REPLACENULL function.

Scenario: Looking up related values where referential integrity is enforced
Transact-SQL: Use an INNER JOIN.
Data flow transformations: Use the Lookup transformation.

Scenario: Looking up related values where referential integrity is not enforced
Transact-SQL: Use an OUTER JOIN, and optionally use ISNULL to replace null values where no matching rows exist.
Data flow transformations: Use the Lookup transformation with the Ignore failure option, and then add a transformation later in the data flow to handle null values (either by replacing them with a Derived Column or redirecting them with a Conditional Split). Alternatively, use the Redirect rows to no match output option and handle the nulls before using a Merge transformation to return the fixed rows to the main data flow.

Note: Some people who are not familiar with SSIS make the erroneous assumption that the
data flow processes rows sequentially, and that transformations in a data flow are inherently
slower than set-based transformations performed with Transact-SQL. However, the SSIS pipeline
performs set-based operations on buffered batches of rows, and it is designed to provide high
performance transformation in data flows.

Handling Invalid Rows and Errors


When you design the data flow for a table, make
sure you include a plan to redirect invalid rows or
rows that cause an error to a file or table where
they can be examined later. This enables you to
troubleshoot data validation issues and errors
without failing the entire ETL process. Additionally,
include exception handling event handlers in SSIS
packages that clean up and log important
troubleshooting information in the event of an
exception.

Redirecting invalid rows in the data flow


There are a number of ways to detect and redirect
invalid rows in the data flow. Ideally, you should try to fix any validation issues (for example, by replacing
null values with a suitable alternative) and include them in the rest of the data flow, but in some cases,
you must reject a row that contains invalid data that cannot be fixed automatically. Common techniques
for detecting and redirecting invalid rows include:

 Using the Conditional Split transformation that uses expressions to validate column values, and
redirects rows to multiple outputs based on the results of the expression evaluations.

 Using the No Match Output of a Lookup transformation to redirect rows for which there is no
matching value in a related table.

 Using the Error Output of a source, transformation, or destination to redirect rows that cause data
flow errors or that will be truncated.

Handling errors
In addition to planning to redirect invalid rows, you should plan a consistent error handling and logging
solution to make it easier to troubleshoot problems with the ETL system. If you plan to deploy the SSIS
packages in project mode, the SSIS Catalog provides detailed event logging that can be used to
troubleshoot errors. However, you may want to consider creating a custom solution over which you have
greater control. A common approach is to create a table for generic error event information, and use
event handlers to log details of the package being executed, the date and time of the error, the
component that caused the exception, and any other useful diagnostic information. Additionally, you
should consider creating an error table for each target table in the ETL process, and log table-specific data
there.
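
The following sketch shows one possible structure for a generic error logging table that an SSIS event handler might populate; the schema, table, and column names are assumptions, not a prescribed design.

-- Generic error log table for the ETL solution (assumes an etl schema exists).
CREATE TABLE etl.PackageError
(
    ErrorID      int IDENTITY(1,1) PRIMARY KEY,
    ExecutionID  uniqueidentifier NOT NULL,              -- package execution ID
    PackageName  nvarchar(260)    NOT NULL,
    TaskName     nvarchar(260)    NULL,                  -- component that raised the error
    ErrorTime    datetime2        NOT NULL DEFAULT SYSDATETIME(),
    ErrorCode    int              NULL,
    ErrorMessage nvarchar(4000)   NULL
);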

Logging Audit Information


In addition to your ETL solution logging errors, it
should also include audit logging that makes it
easy to trace ETL activity and data movement. You
can use the built-in SSIS logging capabilities to
write auditing information, such as the package
execution ID, the date and time, the number of
rows extracted, the number of rows redirected
because of validation issues, and the number of
rows loaded. Additionally, for packages deployed
in project mode, the SSIS Catalog automatically
provides comprehensive logging and reporting.
In some data warehouse implementations, you
might want to implement an audit dimension. An audit dimension is a table in which the contextual
metadata for each insert or update into a data warehouse table is recorded. Every table in the data
warehouse includes a key column that references the row created in the audit dimension when the target
table was originally loaded, and a second key column that references the row created in the audit
dimension when the target table was last updated.
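
A minimal sketch of an audit dimension and the key columns that reference it is shown below; the table and column names are illustrative assumptions.

-- Audit dimension that records the context of each load.
CREATE TABLE dbo.DimAudit
(
    AuditKey      int IDENTITY(1,1) PRIMARY KEY,
    PackageName   nvarchar(260) NOT NULL,
    ExecutionID   uniqueidentifier NOT NULL,
    LoadDateTime  datetime2 NOT NULL DEFAULT SYSDATETIME(),
    RowsExtracted int NULL,
    RowsRejected  int NULL,
    RowsLoaded    int NULL
);

-- Each data warehouse table references the audit rows for its initial load
-- and its most recent update.
ALTER TABLE dbo.FactOrders
ADD InsertAuditKey int REFERENCES dbo.DimAudit (AuditKey),
    UpdateAuditKey int REFERENCES dbo.DimAudit (AuditKey);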

Lesson 4
Planning Data Loads
A key challenge in loading a data warehouse is minimizing the time it takes to load a large volume of
data. Data warehouse loads often involve tens or even hundreds of thousands of rows, and although
many organizations can support loading during periods of low use or inactivity, the load operation must
still be optimized to complete within the available load window.

This lesson describes considerations and techniques to help you plan an efficient data load process.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe techniques to minimize logging during data loads.

 Describe considerations for loading indexed tables.


 Load partitioned fact tables efficiently.

Minimizing Logging
One way in which load times can be reduced is to
minimize database transaction logging during the
load operations. SQL Server uses a write-ahead
transaction log to log transactional activity in the
database, and this logging overhead can affect
load performance. However, you can use
minimally logged operations, in which only extent
allocations and metadata changes are logged, to
reduce the adverse impact of logging. In most
cases, using the TABLOCK query hint causes the
database engine to use minimal logging if it is
supported by the operation being performed and
the destination table.

Set the data warehouse recovery mode to simple or bulk-logged


The first step in ensuring that logging is minimized is to set the recovery mode of the data warehouse
database to simple or bulk-logged. This ensures that minimal logging can be used for operations that
support it. Note that setting the recovery mode affects your ability to perform a transaction log backup,
so by making this configuration change, you are creating a constraint for your data warehouse backup
strategy. However, transactional activity is rare in a data warehouse, and in most cases, a backup regime
that includes full and differential backups is likely to be the most appropriate solution for a data
warehouse.

Consider enabling trace flag 610


Trace flag 610 was introduced in SQL Server 2008 and controls logging behavior for indexed tables. When
this trace flag is enabled, tables that contain data and have a clustered index can support minimal logging
for inserts, which can significantly improve load performance for high volumes of data. Additionally,
minimal logging is used for indexed tables wherever possible, even if the TABLOCK hint is not specified.
However, you should test performance and behavior with this trace flag before using it in a production
data warehouse.
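
For example, the recovery model can be set and the trace flag enabled as shown in the following sketch; the database name is illustrative, and trace flag 610 should be evaluated in a test environment first.

-- Use the simple recovery model so that minimally logged operations are possible.
ALTER DATABASE AWDataWarehouse SET RECOVERY SIMPLE;

-- Optionally enable trace flag 610 globally to allow minimal logging for inserts
-- into indexed tables that already contain data.
DBCC TRACEON (610, -1);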

Use a bulk load operation to insert data


To take advantage of minimal logging, you must use a bulk load operation to insert data.

SSIS data flow destinations and the fast load option


If you are loading data into the data warehouse table from an SSIS data flow, select the Fast Load option
in the data flow destination for the data warehouse table. This configures the INSERT statement used by
the destination to include the TABLOCK hint and causes the operation to be minimally logged if possible.
When choosing a data flow destination for a bulk load operation into a SQL Server database, consider the
following guidelines:

 Use a SQL Server Destination component when both the source data and the SSIS service are hosted
on the same server as the destination tables. This destination provides the fastest bulk load
performance for SQL Server and supports bulk load options to fine-tune load behavior. You can use
this destination only if the SSIS service is running on the same server as the data warehouse into
which the data is being loaded.

 If SSIS is hosted on a different computer from the data warehouse, use an OLE DB Destination
component. This destination supports bulk load, though some additional configuration may be
required to support bulk loading of ordered data into a clustered index. Ordered loads into clustered
indexes are discussed in the next topic.

The bulk copy program (BCP)


The bulk copy program (BCP) uses the SQL Server bulk application programming interface (API);
therefore, it supports bulk load operations. Additionally, using SQL Server native format files can improve
load performance.
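
For example, a command along the following lines bulk loads a native-format data file into a fact table by using a trusted connection and the TABLOCK hint; the server, file, and table names are illustrative.

bcp AWDataWarehouse.dbo.FactOrders in D:\Data\FactOrders.dat -S MIA-SQLDW -T -n -h "TABLOCK"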

The BULK INSERT statement


You can use the Transact-SQL BULK INSERT statement to load data from a text file. The statement runs
within the SQL Server database engine process and uses a bulk load operation to insert the data.

The INSERT … SELECT Statement


You can use a Transact-SQL INSERT statement with a SELECT clause to insert data into a target table from
a staging table. For example, the following code inserts data from a staging table named StagedOrders
into a fact table named FactOrders, looking up surrogate keys in dimension tables named DimDate and
DimProduct.

INSERT INTO FactOrders WITH (TABLOCK)


(OrderDateKey, ProductKey, Quantity, SalesPrice)
SELECT d.DateKey, p.ProductKey, s.Quantity, s.SalesPrice
FROM StagedOrders AS s
JOIN DimDate AS d
ON s.DateAltKey = d.DateAltKey
JOIN DimProduct AS p
ON s.ProductAltKey = p.ProductAltKey;

The SELECT INTO Statement


The SELECT INTO statement creates a new table from data retrieved by a query. Although this technique
can occasionally be useful for staging, it is not generally used to load data into a data warehouse because
it creates a new table instead of loading data into an existing table.

The MERGE Statement


The MERGE statement combines insert and update operations to merge changes from one table into
another. In a data warehouse loading scenario, it can be used to load new rows and perform type 1
updates to a dimension table, as shown in the following code example.

MERGE INTO DimProduct WITH (TABLOCK) AS tgt


USING
-- Query to return staged data
(SELECT ProductAltKey, ProductName, Color
FROM StagedProducts) AS src (ProductAltKey, ProductName, Color)
-- Match staged records to existing dimension records
ON (src.ProductAltKey = tgt.ProductAltKey)
-- If a record for this product already exists, update it
WHEN MATCHED THEN
UPDATE
SET ProductName = src.ProductName,
Color = src.Color
-- If not, insert a new record
WHEN NOT MATCHED THEN
INSERT (ProductAltKey, ProductName, Color)
VALUES (src.ProductAltKey, src.ProductName, src.Color);

Additionally, when combined with the OUTPUT clause, you can use the $action metadata column to
detect updates, and implement type 2 changes, as shown in this code example.

INSERT INTO DimCustomer WITH (TABLOCK)
(CustomerAltKey, Name, City, CurrentFlag, StartDate, EndDate)
SELECT CustomerAltKey, Name, City, 1, getdate(), NULL
FROM
(MERGE INTO DimCustomer AS tgt
USING
-- Query to return staged customer data
(SELECT CustomerAltKey, Name, City
FROM StagedCustomer)
AS src (CustomerAltKey, Name, City)
-- Match staged customers to existing (current) dimension records
ON (src.CustomerAltKey = tgt.CustomerAltKey AND tgt.CurrentFlag = 1)
-- If a current record for this customer already exists, mark it as a type 2 change
WHEN MATCHED THEN
UPDATE
SET tgt.CurrentFlag = 0, tgt.EndDate = getdate()
-- If not, insert a new record
WHEN NOT MATCHED THEN
INSERT (CustomerAltKey, Name, City, CurrentFlag, StartDate, EndDate)
VALUES (src.CustomerAltKey, src.Name, src.City, 1, getdate(), NULL)
-- Now output the records you've inserted or updated
OUTPUT $action, src.CustomerAltKey, src.Name, src.City)
AS Type2Changes(MergeAction, CustomerAltKey, Name, City)
-- and filter them so you insert new records for the type 2 updates.
WHERE MergeAction = 'UPDATE';

Additional Reading: For more information about non-logged and minimally logged
operations, see “The Data Loading Performance Guide” at http://msdn.microsoft.com/en-
us/library/dd425070(v=sql.100).

Loading Indexed Tables


When loading tables on which indexes have been
defined, there are some additional considerations
that your ETL process must take into account.

Consider dropping and recreating indexes for large volumes of new data
If you must load a large volume of data into an
indexed table, you might find that dropping and
recreating the indexes incurs less overhead than
inserting data into the table with the indexes in
place. Use the following table as a guideline for
deciding whether to consider dropping indexes
before a bulk insert. If the volume of new data
relative to the existing data in the table exceeds the threshold for the provided index scenario, consider
dropping the indexes before loading the data.

Indexes New data relative to existing table size

Clustered index only 30%

Clustered index plus one non-clustered index 25%

Clustered index plus two non-clustered indexes 25%

Single non-clustered index only 100%

Two non-clustered indexes 60%

Additional Reading: For more information about when to drop an index to optimize a
data load, see “Guidelines for Optimizing Bulk Import” at http://msdn.microsoft.com/en-
us/library/ms177445.aspx. This article is the source of the information in the preceding table.
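
For example, when the volume of new data exceeds the relevant threshold, the load pattern is to drop the index, bulk load the data, and then re-create the index, as in the following sketch. The index, table, and file names are hypothetical.

-- 1. Drop the nonclustered index before the load
DROP INDEX IX_FactOrders_ProductKey ON dbo.FactOrders;

-- 2. Bulk load the new data
BULK INSERT dbo.FactOrders FROM 'D:\Data\NewOrders.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

-- 3. Re-create the index after the load completes
CREATE NONCLUSTERED INDEX IX_FactOrders_ProductKey ON dbo.FactOrders (ProductKey);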

Sort data by the clustering key and specify the ORDER hint
When using the BULK INSERT statement, if the table you are loading has a clustered index that you do not
intend to drop, and the data to be loaded is already sorted by the clustering column, specify the ORDER
hint, as shown in the following example. This eliminates the internal sort operation that usually occurs
when inserting data into a clustered index.

BULK INSERT Accounts FROM 'D:\Data\Accounts.csv'
WITH (
FIELDTERMINATOR = ','
, ROWTERMINATOR = '\n'
, TABLOCK
, ORDER(TransactionDate)
)

If you are using the INSERT … SELECT statement, you cannot specify the ORDER hint, but the database
engine detects that the data is ordered by the clustering key and optimizes the insert operation
accordingly. If you are using an SSIS SQL Server destination, you can specify bulk load options, including
order columns on the Advanced tab of the SQL Destination Editor dialog box. For OLE DB destinations,
you must view the Component Properties tab of the Advanced Editor for the destination, and add the
ORDER(ColumnName) hint to the FastLoadOptions property.

Columnstore indexes make the table read-only


If you have designed tables with columnstore indexes, you should bear in mind that a table with a
columnstore index cannot be modified. To load data into a table with a columnstore index, you must
either drop the index before loading the new data, or load the data into a new table with the same
schema as the indexed table and use a UNION clause to present the indexed historical rows and non-
indexed updateable rows as a single logical table to client applications. If you decide to use two tables,
you should periodically drop the columnstore index and merge the updateable data into the historical
table.
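
The following sketch illustrates the two-table approach. It assumes a columnstore-indexed historical table named FactOrdersHistory and an updateable table named FactOrdersCurrent that share the same schema; both names are hypothetical, and UNION ALL is used because the two tables contain distinct sets of rows.

-- Present the read-only (columnstore-indexed) historical table and the
-- updateable current table as a single logical fact table.
CREATE VIEW dbo.vFactOrders
AS
SELECT OrderDateKey, ProductKey, Quantity, SalesPrice
FROM dbo.FactOrdersHistory      -- has a columnstore index, so it is read-only
UNION ALL
SELECT OrderDateKey, ProductKey, Quantity, SalesPrice
FROM dbo.FactOrdersCurrent;     -- no columnstore index, so new rows can be loaded here
GO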

Alternatively, consider partitioning the table and using the partition switch technique described in the
next topic to load new data.

Loading Partitioned Fact Tables


In Module 3: Designing a Data Warehouse, you
learned how fact tables can be partitioned to
simplify management and data loads. When using
partitioned fact tables, you should consider the
following guidelines that will affect your ETL
design:
 Switch loaded tables into partitions.

 Partition-align indexed views

Switch loaded tables into partitions


After you create partitioned fact tables, you can
optimize data load operations by switching a
loaded table into an empty partition. This technique can be used to load a partition from a table that:
 Has the same schema as the partition, including column names and data types.

 Has the same indexes as the partition, including columnstore indexes.

 Has the same compression setting as the partition.

 Has a check constraint that uses the same criteria as the partition function.

 Is stored on the same filegroup as the partition.

To use this technique to load new data into a partition, maintain an empty partition at the end of the
table. The lower bound of the partition range for the empty partition should be the date key value for the
next set of data to be loaded. The basic technique to load new data into a partition uses the following
procedure (a Transact-SQL sketch follows the numbered steps):

1. If each partition is stored on its own filegroup, add a filegroup to the database and set it as the next
used filegroup for the partition scheme.

2. Split the empty partition at the end of the table, specifying the key for the upper bound of the data to
be loaded. This should create an empty partition for the new data, and a second empty partition to
be maintained at the end of the table for the next load cycle.

3. Create a table on the same filegroup as the second-to-last (empty) partition, with the same columns
and data types as the partitioned table. For fastest load performance, create this table as a heap (a
table with no indexes).

4. Bulk insert the staged data into the load table you created in the previous step.

5. Add a check constraint to the load table that ensures the partitioning key column values are within
the range of the target partition.

6. Add indexes to the load table that match the indexes on the partitioned table.

7. Switch the partition and the load table.

8. Drop the load table.
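
The following Transact-SQL sketch outlines these steps. It assumes a database named DW, a fact table named dbo.FactOrders that is partitioned on OrderDateKey by a RANGE RIGHT partition function named PF_FactOrders and a partition scheme named PS_FactOrders, staged data in a table named stg.FactOrders, and a load covering keys from 20030101 up to (but not including) 20040101. All of these names, paths, and values are illustrative assumptions rather than objects used elsewhere in this module.

-- 1. Add a filegroup (and file) for the next load cycle and mark it as next used
ALTER DATABASE DW ADD FILEGROUP FG_NextCycle;
ALTER DATABASE DW ADD FILE
  (NAME = 'FG_NextCycle_Data', FILENAME = 'D:\Data\FG_NextCycle.ndf', SIZE = 100MB)
  TO FILEGROUP FG_NextCycle;
ALTER PARTITION SCHEME PS_FactOrders NEXT USED FG_NextCycle;

-- 2. Split the empty partition at the end of the table. The existing empty partition
--    now covers the data to be loaded, and a new empty partition is created for the next cycle.
ALTER PARTITION FUNCTION PF_FactOrders() SPLIT RANGE (20040101);

-- 3. Create a heap load table on the same filegroup as the target (empty) partition
CREATE TABLE dbo.LoadFactOrders
( OrderDateKey int NOT NULL,
  ProductKey int NOT NULL,
  Quantity int NOT NULL,
  SalesPrice money NOT NULL )
ON FG_TargetPartition;   -- hypothetical: the filegroup that holds the target partition

-- 4. Bulk insert the staged data into the load table
INSERT INTO dbo.LoadFactOrders WITH (TABLOCK)
SELECT OrderDateKey, ProductKey, Quantity, SalesPrice
FROM stg.FactOrders
WHERE OrderDateKey >= 20030101 AND OrderDateKey < 20040101;

-- 5. Add a check constraint that matches the partition boundaries
ALTER TABLE dbo.LoadFactOrders WITH CHECK
ADD CONSTRAINT CK_LoadFactOrders_Range
CHECK (OrderDateKey >= 20030101 AND OrderDateKey < 20040101);

-- 6. Add indexes that match the indexes on the partitioned table (for example, a columnstore index)
CREATE COLUMNSTORE INDEX CS_LoadFactOrders
ON dbo.LoadFactOrders (OrderDateKey, ProductKey, Quantity, SalesPrice);

-- 7. Switch the load table into the target partition ($PARTITION returns its partition number)
ALTER TABLE dbo.LoadFactOrders
SWITCH TO dbo.FactOrders PARTITION $PARTITION.PF_FactOrders(20030101);

-- 8. Drop the (now empty) load table
DROP TABLE dbo.LoadFactOrders;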

This technique works best when the table is partitioned on a date key that reflects the data warehouse
load cycle, so each new load is performed into a new partition. However, it can also be used when
partitions do not match load intervals:

 When partitions are based on more frequent intervals than load cycles (for example, each partition
holds a week’s worth of data, but the data is loaded monthly), you can switch multiple load tables
into multiple partitions.

 When partitions are based on less frequent intervals than load cycles (for example, each partition
holds a month’s worth of data, but the data is loaded daily), you can:
o Create a new partition for the load and then merge it with the previous partition.

o Switch out a partially loaded partition, drop the indexes on the partially populated load table,
insert the new rows, recreate the indexes, and switch the partition back in. This technique can
also be used for late arriving facts (rows that belong in partitions that have previously been
loaded) and updates.

Additional Reading: For more information about loading partitioned fact tables, see
“Loading Bulk Data into a Partitioned Fact Table” at http://technet.microsoft.com/en-
us/library/cc966380.aspx.

Partition-align indexed views


If you plan to use indexed views in the data warehouse, align the indexes to the partitions on the
underlying table. When indexed views are partition-aligned, you can switch partitions without having to
drop and recreate the indexes on the views.
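
For example, the following sketch creates an indexed view whose unique clustered index is created on the same partition scheme and partitioning column as the underlying fact table, which makes the view partition-aligned. The object names are hypothetical, and SalesPrice is assumed to be declared NOT NULL (a requirement for using SUM in an indexed view).

CREATE VIEW dbo.vSalesByDateProduct
WITH SCHEMABINDING
AS
SELECT OrderDateKey, ProductKey,
       SUM(SalesPrice) AS TotalSales,
       COUNT_BIG(*) AS RowCnt       -- required when an indexed view uses GROUP BY
FROM dbo.FactOrders
GROUP BY OrderDateKey, ProductKey;
GO

-- Creating the unique clustered index on the same partition scheme and partitioning
-- column as dbo.FactOrders keeps the indexed view aligned with the table's partitions.
CREATE UNIQUE CLUSTERED INDEX IX_vSalesByDateProduct
ON dbo.vSalesByDateProduct (OrderDateKey, ProductKey)
ON PS_FactOrders (OrderDateKey);
GO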

Demonstration: Loading a Partitioned Fact Table


This demonstration shows how to use the partition switching technique to load a fact table.

Demonstration Steps
Split a Partition

1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. Start SQL Server Management Studio and connect to the localhost instance of the database engine
by using Windows authentication.
3. Open Partitions.sql from the D:\Demofiles\Mod04 folder.

4. Select the code under the comment Create a database (from line 1 to line 45), and then click
Execute. This creates a database with a partitioned fact table, on which a columnstore index has been
created.

5. Select the code under the comment View partition metadata, and then click Execute. This shows
the partitions in the table with their starting and ending range values, and the number of rows they
contain. Note that the partitions are shown once for each index (or for the heap if no clustered index
exists). Note that the final partition (4) is for key values of 20020101 or higher and currently contains
no rows.

6. Select the code under the comment Add a new filegroup and make it the next used, and then
click Execute. This creates a filegroup, and configures the partition scheme to use it for the next
partition to be created.

7. Select the code under the comment Split the empty partition at the end, and then click Execute.
This splits the partition function to create a new partition for keys with the value 20030101 or higher.

8. Select the code under the comment View partition metadata again, and then click Execute. This
time the query is filtered to avoid including the same partition multiple times. Note that the table
now has two empty partitions (4 and 5).

Create a Load Table


1. Select the code under the comment Create a load table, and then click Execute. This creates a table
on the same filegroup as partition 4, with the same schema as the partitioned table.

2. Select the code under the comment Bulk load new data, and then click Execute. This inserts the
data to be loaded into the load table (in a real solution, this would typically be bulk loaded from
staging tables).

3. Select the code under the comment Add constraints and indexes, and then click Execute. This adds
a check constraint to the table that matches the partition function criteria, and a columnstore index
that matches the index on the partitioned table.

Switch a Partition
1. Select the code under the comment Switch the partition, and then click Execute. This switches the
load table with the partition on which the value 20020101 belongs. Note that the required partition
number is returned by the $PARTITION function.
2. Select the code under the comment Clean up and view partition metadata, and then click Execute.
This drops the load table and returns the metadata for the partitions. Note that partition 4 now
contains two rows―these are the rows that were inserted into the load table.

Lab: Designing an ETL Solution


Scenario
You have designed a data warehouse for Adventure Works Cycles and must now design the ETL processes
that will be used to load data from source systems into the data warehouse. You have decided to focus
your design on the Reseller Sales and Internet Sales dimensional models in the data warehouse, so you
can ignore the financial accounts and marketing campaigns fact tables and their related dimension tables.
The source data is in a number of sources, and you must examine each one to determine the columns and
data types and discover any data validation or quality issues. Then you must design the ETL data flows for
the tables involved in the Reseller Sales and Internet Sales dimensional models. Finally, you must design
SSIS packages to load data into the partitioned fact tables.

Objectives
After completing this lab, you will be able to:
 Prepare for ETL design.

 Create source-to-target documentation.

 Use SSIS to load a partitioned fact table.


Estimated Time: 120 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log onto 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Preparing for ETL Design


Scenario
The data for the Reseller Sales and Internet Sales dimensional models currently exists in multiple SQL
Server source databases. Before designing the ETL process, you must explore these data sources and
gather information about the data that must be extracted from them.
The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Examine the Data Warehouse Schema.

3. Explore Data Sources.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab04\Starter folder as Administrator.

 Task 2: Examine the Data Warehouse Schema


 Collaborate with two or three other students.

 Use Microsoft Visio to open the DW Schema.vsdx diagram in the D:\Labfiles\Lab04\Starter folder,
and then examine the Reseller Sales and Internet Sales dimensional models.

 Note that these diagrams indicate the columns in the dimension and fact tables, and the slowly
changing dimension (SCD) type for historical dimension attributes.

 Use SQL Server Management Studio to examine the columns and data types in the following tables in
the AWDataWarehouse database in the MIA-SQLDW instance of the database engine:

o dbo.DimCustomer

o dbo.DimDate

o dbo.DimProduct

o dbo.DimPromotion

o dbo.DimReseller

o dbo.DimSalesperson

o dbo.DimSalesTerritory

o dbo.FactInternetSales

o dbo.FactResellerSales

 Task 3: Explore Data Sources


 In SQL Server Management Studio, explore the following tables in the InternetSales database:
o dbo.SalesOrderHeader

o dbo.SalesOrderDetail

o dbo.Customer
o dbo.StateOrProvince

o dbo.Country

 These tables provide the source data for the following tables in the data warehouse:
o dbo.FactInternetSales

o dbo.DimCustomer

Note: Total product cost for a sales order is calculated by multiplying the unit cost for each order line
item by the ordered quantity. Similarly, a sales amount is calculated by multiplying the unit price by the
quantity.

 Explore the following tables in the ResellerSales database:

o dbo.SalesOrderHeader

o dbo.SalesOrderDetail

o dbo.Reseller

o dbo.BusinessType

o dbo.SalesEmployee

o dbo.SalesTerritory

o dbo.SalesRegion

o dbo.StateOrProvince

o dbo.Country

 These tables provide the source data for the following tables in the data warehouse:

o dbo.FactResellerSales

o dbo.DimReseller

o dbo.DimSalesperson

o dbo.DimSalesTerritory

Note: Total cost and sales amount for reseller orders are calculated the same way as for Internet orders.
The sales territory for a sales order is determined by the sales territory where the reseller placing the order
is located, not by the sales territory assigned to the salesperson. Sales territories are often reassigned
between salespeople, but resellers stay within a single sales territory.

 Explore the dbo.Promotions table in the Marketing database. This table provides the source data
for the DimPromotion table in the data warehouse.

Note: The MarketingPromotion column in the SalesOrderHeader table in the InternetSales database
contains the PromotionID value from this table when an order is placed in response to a promotion.
When no promotion is associated with the order, the MarketingPromotion column contains a NULL
value.
 Explore the following views in the ProductsMDS database:

o mdm.Product

o mdm.ProductSubcategory
o mdm.ProductCategory

 These views provide the source data for the DimProduct table in the data warehouse.

Note: This database represents a master data hub for the product data. This data is replicated to the
InternetSales and ProductSales databases, but the ProductsMDS database contains the master version
of the data.

Results: At the end of this exercise, you will have examined the data sources for the ETL process.

Exercise 2: Creating Source to Target Documentation


Scenario
Now that you have examined the source data, you can start to design the ETL data flows. Initially, you
need to create some high-level data flow diagrams. Then you will create more detailed source to target
mappings.

The main tasks for this exercise are as follows:

1. Review High-Level Data Flow Diagrams.

2. Create High-Level Data Flow Diagrams.

3. Review Detailed Source to Target Maps.

4. Create a Detailed Source to Target Map.

 Task 1: Review High-Level Data Flow Diagrams


 Still working as a small group, use Microsoft Visio to open the Source to Target Diagrams.vsdx
document in the D:\Labfiles\Lab04\Starter folder.

 On the DimCustomer page, view the data flow for the DimCustomer table, noting the following
details:

o The data flow is shown from the Customer table in the InternetSales database to the
DimCustomer table (which is in the AWDataWarehouse database).

o The steps that need to be performed during the data flow are documented next to the data flow.

o Data from the StateOrProvince and Country tables is added to the data flow during lookup
steps.

o The details of the SCD columns are shown next to the relevant steps.

 Review the other pages in the Visio document, noting the details documented for each data flow.

 Task 2: Create High-Level Data Flow Diagrams


 In the Source to Target Diagrams.vsdx Visio document, add a page named DimReseller that contains
a data flow diagram for the DimReseller table.

 The diagram should include details of:

o Source tables, including lookup operations.


o Steps that need to be performed during the data flow.

o SCD attributes.

 Add another page to document the high-level data flow for the FactResellerSales table.

 Task 3: Review Detailed Source to Target Maps


 Use Microsoft Excel to open the Source to Target Mapping.xlsx workbook in the
D:\Labfiles\Lab04\Starter folder.

 On the DimCustomer worksheet, scroll to the right to view the Data Warehouse section of the map,
and note that it contains the columns in the DimCustomer table. Each row documents a data flow
from a source column to a column in the DimCustomer table.

 Scroll back to the left, and note that the Source section of the worksheet contains details of the
source fields that are extracted from tables in the InternetSales database.

 Examine the Landing Zone section of the worksheet, and note that it contains details of the tables
that the source data is initially extracted to, together with any validation rules or transformations that
are applied during the extraction.

 Examine the Staging section of the worksheet, and note that it contains details of the staging tables
that are created from the extracted data in the landing zone, together with any validation rules or
transformations that must be applied to the data.

 Examine the FactInternetSales worksheet, and note that it documents the data flow for each column
in the FactInternetSales table.

 Task 4: Create a Detailed Source to Target Map


 In the Source to Target Mapping.xlsx workbook, on the FactResellerSales worksheet, note that the
columns in the Data Warehouse section have already been entered.

 Complete the source to target map for the FactResellerSales table.


 You should design a data flow in which the source data is initially extracted into appropriately named
landing zone tables and then transformed and loaded into staging tables before being loaded into
the data warehouse.

Results: At the end of this exercise, you will have a Visio document that contains high-level data flow
diagrams and an Excel workbook that contains detailed source-to-target documentation.

Exercise 3: Using SSIS to Load a Partitioned Fact Table


Scenario
A colleague has implemented an SSIS package to load staged Internet sales fact data into the partitioned
FactInternetSales table. Now you must implement a similar package to load the FactResellerSales table.

The table is partitioned on the ShipDateKey column, and each data load contains only records for orders
that were shipped after the previous data load. The data for the current year is partitioned by month. Data
loads are performed on the last night of the month.

The main tasks for this exercise are as follows:

1. View FactInternetSales Partitions.


2. View Staged Internet Sales Data.

3. Review an SSIS Package to Load FactInternetSales.

4. Create an SSIS Package to Load FactResellerSales.

 Task 1: View FactInternetSales Partitions


 Use SQL Server Management Studio to open and execute the View FactInternetSales Partitions.sql
script in the D:\Labfiles\Lab04\Starter folder.

 Review the results, which include the following information:


o The partition scheme and partition function used to partition the FactInternetSales table.

o The partitions in the table and the filegroups on which they are stored.

o The start and end key values for each partition.


o The number of rows in each partition.

 Make a note of the details for the last partition in the table (which should currently contain no rows).

 Task 2: View Staged Internet Sales Data


 Use SQL Server Management Studio to view the tables in the Staging database, noting that it
contains the following tables (and some others):

o dbo.ExtractLog. A metadata table used to log data extractions.

o lz.InternetSalesOrderDetails. A landing zone table that contains data extracted from the
SalesOrderDetails table in the InternetSales database.

o lz.InternetSalesOrderHeader. A landing zone table that contains data extracted from the
SalesOrderHeader table in the InternetSales database.

o stg.FactInternetSales. A staging table that contains transformed data from the landing zone
tables that is ready to be loaded into the FactInternetSales table.

 View the contents of the stg.FactInternetSales table, noting the number of records it contains. Also
note that it contains alternate (business) keys for the dimension members that have already been
loaded into the dimension tables in the data warehouse.

 Task 3: Review an SSIS Package to Load FactInternetSales


 Use SQL Server Data Tools (Visual Studio) to open the LoadPartition.sln solution in the
D:\Labfiles\Lab04\Starter folder.

 Open the LoadFactInternetSales.dtsx package and view the Control Flow tab.

Note: The control flow for this package is deliberately simplified to focus on the task required to load
data into a partitioned table. A production package would also include tasks to log audit information,
validate the data, and manage exceptions.

 View the variables in this package. They include the following user variables:

o Filegroup. Used to store the name of the filegroup containing the last (currently empty) partition
of the table (into which the staged data will be loaded).

o LastBoundary. Used to store the starting key value for the last partition currently in the table.

o NextBoundary. Used to store the key value to be used for the start of the new partition that will
be created to maintain an empty partition at the end of the table.

o PartitionNumber. Used to store the partition number of the last (currently empty) partition in
the table (into which the staged data will be loaded).

o SQL_AddConstraintAndIndex. A Transact-SQL statement to create a constraint and an index on
the table into which the data will be loaded before it is switched into the partitioned table. The
statement includes LastBoundary and NextBoundary placeholder text that will be replaced by
expressions at run time.

o SQL_AddPartition. A Transact-SQL script that adds a filegroup and a file to the database,
configures the PS_FactInternetSales partition scheme to use the new filegroup for the next
partition, and then splits the PF_FactInternetSales partition function to create a new partition
that starts at the NextBoundary value.
o SQL_CreateLoadTable. A Transact-SQL script that creates a table with the same structure and
compression as FactInternetSales on the filegroup that contains the last (empty) partition
(indicated by a file_group placeholder value, which will be replaced at run time).
o SQL_SwitchPartition. A Transact-SQL script that switches the loaded table with the empty
partition indicated by a partitionnumber placeholder value, which will be replaced at run time.

 Double-click the Get Partition Info task to view its editor, and note the following details:
o On the General tab, the result set is a single row returned through the MIA-
SQLDW.AWDataWarehouse connection manager from a direct input query that retrieves the
partition number, filegroup, and boundary values for the last partition in FactInternetSales.

o On the Result Set tab, the PartitionNumber, FileGroup, and LastBoundary values returned in
the result set are mapped to the User::PartitionNumber, User::Filegroup, and
User::LastBoundary variables, respectively.

 View the editor for the Get Next Boundary task, and note that it retrieves the LastExtract value for
the InternetSales data source from the dbo.ExtractLog table in the Staging database (converted to
a varchar with style 112 to format it as YYYYMMDD, and assigned the alias NextBoundary). The
NextBoundary value in the result set is then mapped to the User::NextBoundary variable.

 View the editor for the Transact-SQL to Add Filegroup task, and note that it contains an expression
to replace the placeholder text in the User::SQL_AddPartition variable with the contents of the
User::NextBoundary variable. You can click Evaluate Expression to see the resulting Transact-SQL
code when the default variable values are used.

 View the editor for the Add next filegroup task, and note that it uses the MIA-
SQLDW.AWDataWarehouse connection manager to execute the query in the
User::SQL_AddPartition variable (which was set in the previous task).

 View the editor for the Transact-SQL to Create Load Table task, and note that it contains an
expression to replace the placeholder text in the User::SQL_CreateLoadTable variable with the
contents of the User::Filegroup variable. You can click Evaluate Expression to see the resulting
Transact-SQL code when the default variable values are used.

 View the editor for the Create Load Table task, and note that it uses the MIA-
SQLDW.AWDataWarehouse connection manager to execute the query in the User::SQL_
CreateLoadTable variable (which was set in the previous task).

 Double-click the Load Staged Data task to view its data flow. Note that this data flow extracts data
from the stg.FactInternetSales table in the staging database, and then it uses a series of lookup
tasks to find the surrogate dimension keys for each alternate key in the source data, before loading
the data into the load table created by the Create Load Table task examined previously.

Note: To simplify the lab, the data flow does not redirect rows with non-matching alternate keys. In a
production system, you should design the data flow to include non-match outputs from the lookup tasks
or a conditional split to redirect any rows that have no matching dimension records to a holding table for
reconciliation.

 Click the Data Flow design surface and press F4 to view the Properties pane. Note that the Delay
Validation property for the Load Staged Data task is set to True. This is necessary because the table
referenced in the Load Table destination does not exist when package execution starts―it is created
dynamically at run time.

Note: To create the Load Table destination, the developer used a Transact-SQL script to create a table
named LoadInternetSales in the AWDataWarehouse database and used it as the destination table at
design time. After the data flow implementation was complete and the DelayValidation property set to
True, the table was dropped from the AWDataWarehouse database.
 Return to the Control Flow design surface, and view the editor for the Transact-SQL to Add
Constraint and Index task. Note that it contains an expression to replace the placeholder text in the
User::SQL_AddConstraintAndIndex variable with the contents of the User::LastBoundary and
User::NextBoundary variables. You can click Evaluate Expression to see the resulting Transact-SQL
code when the default variable values are used.

 View the editor for the Add Constraint and Index task, and note that it uses the MIA-
SQLDW.AWDataWarehouse connection manager to execute the query in the User::
SQL_AddConstraintAndIndex variable (which was set in the previous task).

 View the editor for the Transact-SQL to Switch Partition task and note that it contains an
expression to replace the placeholder text in the User::SQL_SwitchPartition variable with the
contents of the User::PartitionNumber variable. You can click Evaluate Expression to see the
resulting Transact-SQL code when the default variable values are used.
 View the editor for the Switch Partitions task, and note that it uses the MIA-
SQLDW.AWDataWarehouse connection manager to execute the query in the
User::SQL_SwitchPartition variable (which was set in the previous task).

 View the editor for the Drop Load Table task and note that it executes a DROP TABLE statement to
drop the dbo.LoadInternetSales table in the AWDataWarehouse database.

 Run the package and observe the control flow as it executes. After execution completes, stop
debugging.

 Use SQL Server Management Studio to re-execute the View FactInternetSales Partitions.sql script
you ran earlier, and note that the staged rows have been loaded into what was the last partition, and
that a new empty partition has been added to the end of the table.

 Task 4: Create an SSIS Package to Load FactResellerSales


 Use SQL Server Management Studio to open and execute the View FactResellerSales Partitions.sql
script in the D:\Labfiles\Lab04\Starter folder, and note partitions that currently exist in the
FactResellerSales table.

 View the lz.ResellerSalesOrderHeader, lz.ResellerSalesOrderDetails, and stg.FactResellerSales tables in
the Staging database to see the staged data for the FactResellerSales table.

 In the LoadPartition SSIS project, add an SSIS package named Load FactResellerSales.dtsx and
implement the load process for the partitioned FactResellerSales table.

 When developing the package, note the following tips:

o The package should be very similar in design to the Load FactInternetSales.dtsx package.

o The D:\Labfiles\Lab04\Starter\Code Snippets folder contains text files from which you can copy
and paste suitable code for the variables and tasks you will need to create.

o To create a load table to use during development, use SQL Server Management Studio to
generate a CREATE TABLE script from the FactResellerSales table, and then change the table
name to LoadResellerSales before executing it. Remember to drop the table after you complete
the data flow implementation.

o If you test your package and it fails, you can re-run the Setup.cmd batch file in
D:\Labfiles\Lab04\Starter to reset the databases to the starting point.

Results: At the end of this exercise, you will have an SSIS package that loads data into the
FactResellerSales table by using the partition switching technique.

Question: Compare the source-to-target documentation in the D:\Labfiles\Lab04\Solution


folder with your own documentation. What significant differences are there in the suggested
solutions compared to your own, and how would you justify your own solutions?

Question: How might your design of the SSIS package that loads the FactResellerSales
table have differed if the table was partitioned on OrderDateKey instead of ShipDateKey?

Module Review and Takeaways


Module Review
This module has described considerations and techniques for planning an ETL process. The ETL process is
one of the most challenging aspects of designing a BI solution, and you should use the techniques
described in this module together with the references to further reading as a starting point when
designing your own solutions.

Question: In what scenarios would you consider using Transact-SQL for transformations, and
in what scenarios are SSIS data flow transformations appropriate?

Module 5
Designing Analytical Data Models
Contents:
Module Overview 5-1

Lesson 1: Introduction to Analytical Data Models 5-2

Lesson 2: Designing an Analytical Data Model 5-8

Lab A: Designing Analytical Data Models 5-14

Lesson 3: Designing Dimensions 5-22

Lab B: Designing Dimensions and Hierarchies 5-27

Lesson 4: Enhancing Data Models 5-35


Lab C: Enhancing Data Models 5-40

Module Review and Takeaways 5-45

Module Overview
SQL Server Analysis Services (SSAS) enables you to store and analyze data. Microsoft SQL Server 2012
provides two different Analysis Services data models that can be used within an organization:
multidimensional and tabular. Each technology is encapsulated within the single framework of the
Business Intelligence Semantic Model (BISM).

Multidimensional solutions use Online Analytical Processing (OLAP) modeling constructs such as cubes,
measures, and dimensions; and they can also be used to implement data mining solutions. Tabular data
models enable self-service BI solutions by using familiar relational modeling constructs such as tables and
relationships for modeling data. Additionally, the xVelocity in-memory analytics engine is used for storing
and calculating large volumes of data. Understanding the capabilities and differences between each
analytical data model will help you design an appropriate analytical model for your organization.

Objectives
After completing this module, you will be able to:

 Describe the options for analytical data models.

 Design an initial multidimensional or tabular data model.


 Design dimensions in a multidimensional or tabular data model.

 Design measures and cubes in a multidimensional or tabular data model.



Lesson 1
Introduction to Analytical Data Models
Analytical data models add value to the underlying data in a data warehouse. There are many similarities
between the two different kinds of analytical data model supported by SQL Server Analysis Services, and
the user experience when using them to analyze data can be almost indistinguishable. However, there are
some fundamental differences in their capabilities and in the way that they are designed and
implemented. Understanding the capabilities of each analytical data model will help you choose which
data model to use, and help you understand the considerations that will be required when implementing
a specific data model.

Lesson Objectives
After completing this lesson, you will be able to:
 Describe how analytical model design fits into a BI project.

 Describe the BI Semantic Model architecture.

 Describe key features of multidimensional and tabular data models.


 Compare data model capabilities.

 Describe the considerations for choosing an analytical data model.

Analytical Model Design in a BI Project


Typically, the analytical model design uses a data
warehouse as a source for the creation of a
multidimensional cube, a tabular data model, or
both. The purpose of the analytical model is to
add value to the data that is provided by the data
warehouse and provide meaningful information
that the business has requested. Additionally, an
analytical data model can offer capabilities that
are not provided by a data warehouse, including
key performance indicators (KPIs), calculated
measures, and user-defined hierarchies.

From a performance point of view, both types of
model can handle large volumes of data, and complex results can be retrieved in a more efficient manner
than if queries were to be executed directly against the data warehouse. To provide high performance,
SSAS, in both multidimensional and tabular data models, provides support for partitioning. Additionally,
both can support very large models that consist of many hundreds of tables or dimensions.

Analytical modeling in a BI project is based on the business requirements for the BI solution. Before you
start building an analytical data model, it is important to ensure that you understand the business
requirements in order to:

 Establish the measures and dimensions required within the data model.

 Identify additional objects, such as KPIs, that are required to add value to the data model.

 Determine which types of model support the analytical capabilities that the business requires.

 Determine whether performance of the data model is likely to be improved by partitioning.

 Establish how many instances of SQL Server Analysis Services to install and in what mode.

By considering these factors, you can determine the appropriate data model to design. Initially, this may
involve creating an analytical data model for both a multidimensional cube and a tabular data model.
From this, it can be established which data model can be used to meet the business requirement as
defined within the BI project scope.

Note: In the early adoption of SQL Server 2012, it has been common for organizations to
adopt both a multidimensional cube and a tabular data model for the same solution. This has
been observed with organizations that originally had a multidimensional cube in an earlier
version of SQL Server and want to perform a parallel migration of the solution to a tabular data
model.

BI Semantic Model Architecture


The BI Semantic Model (BISM) is a single unified
Analysis Services platform that provides the
flexibility to create solutions for any business
analysis requirement, from small-scale desktop BI
to enterprise-scale applications.
The BI Semantic Model consists of two types of
data models:

 Multidimensional data models. These have
been supported in all versions of SQL Server
Analysis Services, up to and including SQL
Server 2012 Analysis Services.
Multidimensional data models expose data
through dimensions and cubes. They use the Multidimensional Expressions (MDX) language to
implement business logic, and can provide access to data through relational online analytical
processing (ROLAP) storage or multidimensional online analytical processing (MOLAP) storage.

 Tabular data models. These are new in SQL Server 2012 Analysis Services. Tabular data models
expose data in a relational format. They use the Data Analysis Expressions (DAX) language to
implement business logic, and can provide access to data by using the in-memory xVelocity engine
(previously known as VertiPaq) or, subject to limitations in the design of the model, DirectQuery
mode, which provides access to data in the underlying data source.
The BI Semantic Model provides a single logical model for all end-user experiences by exposing the same
set of application programming interfaces (APIs) to client tools, regardless of the underlying data model
(tabular or multidimensional). Business users can connect to a tabular or multidimensional BI Semantic
Model by using Power View, Excel, PowerPivot or any other client tool. This approach benefits
organizations in various ways, including:

 It ensures the maximum flexibility for businesses, enabling them to choose client tools without having
to consider compatibility with a specific data model.

 It eliminates the risk for organizations that have already invested in a particular client technology of
becoming locked into that technology because it depends on a particular data model.

An Overview of Analytical Data Models


The two analytical data models that can be
implemented within SSAS have many similarities
and some differences.

Multidimensional data models


Typically, multidimensional data models consume
data from a data warehouse and store it in an
OLAP database as one or more cubes. Cubes can
store detailed data; however, a key strength of
multidimensional data models is the ability to pre-aggregate numeric measures and persist the
aggregations within the cube. This allows analysis to be performed far more efficiently than querying the
same data in a transactional system. Data can also be stored within dimensions; for example, dimensions
can include contextual information about customers, employees, or orders, which is organized into
hierarchies for improved readability. Measures can be intersected by dimensions to provide context for
the aggregated data, and MDX can be used to create calculated measures and queries.

Tabular data models


Tabular data models consist of tables and relationships, much like a relational database. They are flexible
and quick to create, but they lack some of the more advanced features that multidimensional models
have. Tabular models offer four principal advantages to organizations:
 The relational model is very widely understood and relatively intuitive, so the barrier to entry for
relational database developers and information workers who want to develop analysis solutions is
lower for tabular models than for multidimensional models. This helps companies to minimize costs
by leveraging their existing expertise to create data analysis solutions.

 Tabular data models are generally simpler in design than multidimensional models, so companies can
achieve a faster time-to-deployment for BI applications.
 Business users can use Power View to create interactive visualizations from tabular data models.

 Business users can use PowerPivot in Excel to create and share their own tabular data models. If a
user-created PowerPivot workbook becomes heavily used and needs to be brought under centralized
management, it can be easily imported into a tabular Analysis Services database.

Creators of tabular models can use DAX to create measures and calculated columns and to implement
security. DAX is similar to the formulae used in Excel workbooks, so information workers who already use
Excel should find it relatively easy to learn and use. Tabular models are suitable for a wide range of BI
scenarios, from a personal desktop BI application developed in Excel to departmental or even larger
solutions, depending on the complexity of the application.

Data Model Capabilities


Understanding the capabilities of each data model
provides the primary driver for choosing a specific
data model to implement within the BI project.
The following table summarizes feature availability
in each analytical data model.

Feature                            Multidimensional    Tabular
Calculated measures                Yes                 Yes
Distinct count                     Yes                 Yes (via DAX)
Hierarchies                        Yes                 Yes
KPIs                               Yes                 Yes (not trends)
Actions                            Yes                 No
Linked objects                     Yes                 No
Many-to-many relationships         Yes                 No
Parent-child hierarchies           Yes                 Yes (via DAX)
Partitions                         Yes                 Yes
Perspectives                       Yes                 Yes
Semi-additive measures             Yes                 Yes
Translations                       Yes                 No
User-defined hierarchies           Yes                 Yes
Custom assemblies                  Yes                 No
Custom rollups                     Yes                 No
Writeback                          Yes                 No
Data mining                        Yes                 No
Power View support                 No                  Yes

The preceding table provides a clear indication of the current capabilities of each analytical data model.
Multidimensional data models are a mature technology available since SQL Server 7.0. If you require data
mining, writeback, or translations within a data model, this can only be facilitated by multidimensional
data models. Conversely, if Power View is the main driver for using a data model, only tabular data
models provide this support.

Additional Reading: For more information about the different features supported by
multidimensional and tabular models, see “Choosing a Tabular or Multidimensional Modeling
Experience in SQL Server 2012 Analysis Services” at http://msdn.microsoft.com/en-
us/library/hh994774.

Considerations for Choosing an Analytical Data Model


There are many considerations that affect the
decision to use a particular analytical data model.
The main consideration should be which
technology will facilitate meeting the overall
business requirements defined within the BI
project. Additional factors to consider include:

 Hardware. The storage of data for a
multidimensional model is disk based,
whereas tabular data models use the xVelocity
engine to store data in memory. Insufficient
memory in a server may favor installing a
multidimensional and data mining instance of
SQL Server Analysis Services. High memory on a server will be more conducive to installing a tabular
instance of SQL Server Analysis Services. This is also determined by the volume of data that will be
consumed by the data model.

 Complexity of calculations. Multidimensional and tabular data models provide expression
languages to implement business logic. Multidimensional data models use MDX and tabular data
models use DAX, which is similar to Excel formulas and provides a combination of functions,
operators, and values. Most standard calculations, including defining aggregate expressions and time
intelligence, can be performed by both languages. However, account-based intelligence is built into
MDX, unlike DAX, which requires manual calculation. Additionally, DAX expressions are supported
only within PowerPivot for Excel, whereas MDX can be used in SQL Server Reporting Services.
 Relationships. Relationships are defined within an analytical data model to determine how the data
between tables relate to each other. Both multidimensional and tabular data models support the
implementation of one-to-many relationships. Many-to-many relationships are supported within a
multi-dimensional model through the creation of a bridging table that creates two one-to-many
relationships between three tables. In a multidimensional model, the bridging table is defined within
the data source view. In tabular data models, DAX expressions must be created to define many-to-
many relationships.

 Hierarchies. Multidimensional data models provide support for balanced, ragged, and parent-child
hierarchies. Tabular data models simplify the support for balanced hierarchies because attribute
relationships do not need to be defined within a tabular data model. Ragged hierarchies are not
supported by tabular data models, and parent-child hierarchies can be implemented only by using
DAX expressions.

 Self-service BI. Multidimensional data models are presented through client applications including
SQL Server Reporting Services and Microsoft Excel, though typically the model itself is created by a
data modeling specialist. Business users can create tabular data models by using PowerPivot for Excel.
The table and relationship paradigm used by tabular models makes it easier for business users to
create a data model. If this requires additional help from the IT department, they can import the
tabular data model from PowerPivot into SQL Server Data Tools to further refine the model. This
improves collaboration between the business and IT in delivering BI to the organization.

Lesson 2
Designing an Analytical Data Model
The initial steps to create an analytical data model vary, depending on which kind of model you want to
create. However, many of the key design considerations are the same for both models.

Lesson Objectives
After completing this lesson, you will be able to:

 Compare features of data models.

 Enumerate the data sources available for multidimensional and tabular models.

 Describe considerations and guidelines for data sources.

 Describe considerations for attributes.

Creating an Initial Data Model


Although the end result of creating a cube is
similar for both a multidimensional and a tabular
data model, the way that the cube is created
differs subtly between the two models.

Multidimensional data models


There are three steps required to create a basic cube within a multidimensional data model:
cube within a multidimensional data model

1. Define data sources.


Data sources provide the connection
information that will be used by SQL Server
Analysis Services to connect to the source of
the information that will be used in either creating a cube or creating a data mining solution. You can
use the Data Source Wizard to create a data source.

2. Create a data source view.

Data source views provide a logical data model of related tables, views, and queries from one or more
data sources. You can define a subset of the source data in a data source view, typically when you
want to expose a data mart from a data warehouse that will be used as a basis for creating a cube.
The data source view contains metadata that is cached so that you can develop SQL Server Analysis
Services solutions without being permanently connected to the source data. Additionally, because the
data source view is a logical model, changes can be made to the data source view without changing
the underlying schema in the data source.
Typically, you use the Data Source View Wizard to create a data source view. In this wizard, you define
the data source on which the data source view is based, and select the tables or views from the data
source that you want to include in the data source view. You can use the Data Source View Designer
to make modifications to the data source view, such as adding and renaming columns, or adding
logical relationships between tables. You can also create multiple diagrams that show different
subsets of the data source view.

3. Create a cube.

You can use the Cube Wizard to create a cube. The cube can be based on tables in an existing data source
view, can generate tables in the data source, or can be created as an empty cube. The Cube Wizard provides a useful
starting point, although additional work is required in the Cube Designer to refine the cube
properties. When you create a cube from an existing data source view, you specify which tables
contain measures (usually the fact tables in the data warehouse), and which measures should be
included in the cube. The selected measures are collectively grouped together within a measure
group for each fact table, and aggregations for the measures are automatically included in the cube.

Tabular data models


Tabular data models apply a different approach to creating a basic cube. The steps involved include:

1. Define data sources and import tables.

Tabular data models provide support for a much wider range of data sources than multidimensional
data models. You can include data from databases, files, cloud services, and even paste tables from
the clipboard. You use the Table Import Wizard to import the tables you want to include in your
model from one or more data sources, and a worksheet is created for each table in the project. After
you import the tables, you can view a graphical representation of the model and define relationships
between tables if necessary.

2. Create measures.
Unlike in a multidimensional model, the wizard used to import the tables does not identify fact tables
and create measures. Each table worksheet contains a data grid that shows detailed information
from the underlying table and a measure grid that enables you to define DAX calculations to
calculate and summarize data. You must use the measure grid in the table worksheets to create
measures for each column that you want to aggregate in the cube. Typically, you do this by entering
a DAX formula to aggregate the data under the corresponding numeric column. The tabular model
development environment in SQL Server Data Tools includes automatic formulae that you can apply
(such as a SUM calculation).

3. Name the cube.


By default, all cubes in a tabular data model project are named model. To change this to a more
meaningful name, you must edit the Cube name property of the tabular data model project.

Data Sources for Analytical Models


The following data sources are supported in
multidimensional and tabular data models.

Multidimensional data models support the following data sources:

 Access databases
 SQL Server relational databases
 SQL Server Parallel Data Warehouse (PDW)
 Oracle relational databases
 Teradata relational databases
 Informix relational databases
 IBM DB2 relational databases
 Sybase relational databases
 Other OLEDB or ODBC data sources

Tabular data models support the following data sources:

 Access databases
 SQL Server relational databases
 SQL Server Parallel Data Warehouse (PDW)
 Data feeds, including Reporting Services reports and Microsoft Azure Marketplace DataMarket
 Microsoft Excel files
 PowerPivot workbooks
 Analysis Services cubes
 Oracle relational databases
 Teradata relational databases
 Informix relational databases
 IBM DB2 relational databases
 Sybase relational databases
 Text files
 The clipboard

Note: Setup does not install the providers that are listed for each data source. Some
providers might already be installed with other applications on your computer; in other cases,
you will need to download and install the provider.

In multidimensional solutions, it is possible to refer to multiple data sources through a data source view.
For example, this can be achieved by retrieving data from a SQL Server data source and an Oracle data
source within a data source view whose primary data source is SQL Server, in which a named query
uses the OPENROWSET function to retrieve data from the Oracle source. This will have a performance
impact that may prolong cube processing. As a best practice, use an ETL solution to centralize the data
from all of your data sources in a SQL Server data warehouse, and base the data model on the data
warehouse.
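
For example, a named query in the data source view might use an ad hoc OPENROWSET query similar to the following sketch. The provider name, connection string, and Oracle table are illustrative assumptions, and the Ad Hoc Distributed Queries option must be enabled on the SQL Server instance for this to work.

SELECT ProductCode, ProductName
FROM OPENROWSET(
    'OraOLEDB.Oracle',                                     -- OLE DB provider for Oracle (must be installed)
    'Data Source=OracleServer;User Id=hr;Password=pwd;',   -- hypothetical connection string
    'SELECT ProductCode, ProductName FROM Products'        -- query executed on the Oracle source
);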

When tabular data models process data from a data source, the following data types are supported:

 Whole Number.

 Decimal Number.

 Boolean.

 Text.

 Date.

 Currency.

When importing data or using a value in a formula, even if the original data source contains a different
data type, the data is converted to one of the preceding data types. Consider the source data type before
importing data into the appropriate data model and consider explicit conversions through the use of
views to control the data type that is used within the tabular data source.

Considerations and Guidelines for Data Sources


Consider the following guidelines when planning
data sources for an analytical data model:

 Tabular data models support both relational
and non-relational data sources;
multidimensional data models support
relational data.

 In most BI solutions, it is best practice to
collate data from multiple data sources into a
central data warehouse using technologies
such as SQL Server Integration Services.
Querying non-Microsoft data sources or
querying multiple data sources through a
data source view can introduce a performance overhead, which can slow down cube processing
because queries must retrieve the data from the source systems before the data is processed into
the cube.

 Consider consuming data through views in the data warehouse instead of directly importing base
tables. This means that you will probably need to manually define relationships in the data model, but
it enables you to use query hints in the underlying views to optimize concurrency and abstract the
underlying table structures.

 Select only the columns required to produce an analytical data model to reduce the time it takes to
process the data.

 When using multidimensional data models, define data source views against a single data source.
Multiple heterogeneous data sources slow down the processing of data.

 When using tabular data models, consider placing filters on the source data to reduce the data used
during development. You can remove the filters before the tabular data model is deployed to a
production server.

 When using multidimensional data models, the default aggregation for measures is a Sum function.

Considerations for Attributes


Dimension attributes are used by business users to
aggregate and filter measures. One of the first
tasks you must perform after creating an initial
data model is to define which attributes will be
available for users to use in their analysis.

Multidimensional model attributes


When you create a multidimensional cube, the
columns in each of the dimension tables are
available for you to use as dimension attributes. By
default, only key columns are included as
attributes in each dimension, but you can easily
add the other columns to the list of dimension
attributes in the Dimension Designer.

There are many attribute properties that you can configure in a multidimensional data model; the most
important properties are shown in the following table.

Category Property

Basic  Basic attribute properties include Name, Description, and ID. Additional
properties include:
 Type. The Type property associates a special meaning with the attribute, based on
the selected dimension type. The property provides a drop-down list of preset
types, and each dimension type makes additional attribute types available. For
example, if you select a dimension Type of Account, you can optionally add
attribute types such as AccountName and AccountNumber.
 Usage. When you define dimension attributes, you can set how the attribute will
be used within the dimension. A setting of Key means that the attribute is a key
column to which other attributes relate. A setting of Parent means that the
attribute is part of a parent-child relationship that exists within the dimension.

Advanced There are additional properties that you can use to configure attribute behavior.
Important properties include:
 AttributeHierarchyEnabled. When set to True, an attribute hierarchy is created
for the attribute. Each attribute hierarchy acts as a flat hierarchy with one level
that shows all member values within the attribute.
 AttributeHierarchyVisible. Determines whether the attribute hierarchy is visible.
If set to True, the attribute hierarchy is visible. If set to False, it is not visible, but
the attribute can still be used within MDX statements or user-defined hierarchies.
If an attribute is exposed only through a user-defined hierarchy, consider setting
this property to False.
 IsAggregatable. Specifies whether the values of the attribute members can be
aggregated.
 OrderBy. Controls how attributes are sorted in an attribute hierarchy. If the
NameColumn setting is defined, attributes are ordered by the values specified
in the NameColumn property.

Parent-Child  MembersWithData. In a parent-child dimension, this property controls whether


the parent should display the non-leaf level members. The default setting of
NonLeafDataHidden will hide the data that is not directly associated with the
member. The setting of NonLeafDataVisible will display the data that is both
directly and indirectly associated with the member.
 NamingTemplate. The naming template allows you to define custom names for
each level of the parent-child hierarchy. In an Employee dimension, the first
level may be Executives, the second level Senior Management, and so on.

Tabular model attributes


In a tabular model, each table is potentially a dimension by which measures can be aggregated and
filtered―including fact tables. By default, all columns in all tables are exposed as attributes in the model.
Tabular data models have fewer options for configuring attributes, and you can control their behavior by
setting properties. Basic properties include:

 Column Name.
 Data Format.

 Data Type.

 Description.
 Hidden.

 Sort By Column.

Additionally, you can use reporting properties to define the behavior of the tabular data model within
client reporting tools including:

 Default Image.

 Default Label.
 Image URL.

 Keep Unique Rows.

 Row Identifier.
 Summarize By.

 Table Detail Position.



Lab A: Designing Analytical Data Models


Scenario
You have created a data warehouse for Adventure Works Cycles, and now you need to create analytical
data models for business users to consume in client applications and reports. You have decided to create
prototypes for the analytical cube in both multidimensional and tabular models so that you can evaluate
both options.

Objectives
After completing this lab, you will be able to:

 Design a multidimensional data model.

 Design a tabular data model.

Estimated Time: 60 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log onto 20467A-MIA-SQLBI as


ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Creating a Multidimensional Data Model


Scenario
You have decided to start by creating an initial multidimensional model that includes the required
measures and dimension attributes based on views in the data warehouse.
The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Create a Multidimensional Analysis Services Project.


3. Create a Data Source.

4. Create a Data Source View.

5. Create Relationships.

6. Create a Cube.

7. Configure Measures.

8. Configure Attributes.

9. Deploy and Browse the Model.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab05A\Starter folder as Administrator.

 Task 2: Create a Multidimensional Analysis Services Project


 Use SQL Server Data Tools to create an Analysis Services Multidimensional and Data Mining project
named AWSalesMD in the D:\Labfiles\Lab05A\Starter folder.

 Task 3: Create a Data Source


 Create a data source named AW Data Warehouse that uses the following credentials to connect to
the AWDataWarehouse database on the MIA-SQLDW instance of SQL Server.

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

 Task 4: Create a Data Source View


 Create a data source view named AW Data Warehouse.

 The data source view should include the following views from the AW Data Warehouse data source:

o dw_views.Customer

o dw_views.Date

o dw_views.InternetSales

o dw_views.Product

o dw_views.Reseller
o dw_views.ResellerSales

o dw_views.SalesPerson

o dw_views.SalesTerritory

 Task 5: Create Relationships


 In the data source view, drag foreign key fields from the fact tables to primary key fields in the
dimension tables to create the following relationships.

Fact Table Foreign Key Dimension Table Primary Key

ResellerSales OrderDateKey Date DateKey

ResellerSales ProductKey Product ProductKey

ResellerSales ShipDateKey Date DateKey

ResellerSales ResellerKey Reseller ResellerKey

ResellerSales SalesPersonKey Salesperson EmployeeKey

ResellerSales SalesTerritoryKey SalesTerritory SalesTerritoryKey

InternetSales OrderDateKey Date DateKey

InternetSales ProductKey Product ProductKey

InternetSales ShipDateKey Date DateKey

InternetSales CustomerKey Customer CustomerKey

4. After all the relationships are created, save the data source view.

 Task 6: Create a Cube


 Use the Cube Wizard to create a new cube from the existing tables in the AW Data Warehouse data
source view.

 Use the InternetSales and ResellerSales tables as the measure group tables.

 Include the following measures:

o Internet Sales;
 Order Quantity.
 Unit Price.
 Product Unit Cost.
 Total Product Cost.
 Sales Amount.
o Reseller Sales;
 Order Quantity - Reseller Sales.
 Unit Price - Reseller Sales.
 Product Unit Cost - Reseller Sales.
 Total Product Cost - Reseller Sales.
 Sales Amount - Reseller Sales.
 Include all dimension tables.

 Name the cube Sales.

 Task 7: Configure Measures


 In the Sales cube, rename the following measures in the Internet Sales measure group:

o Order Quantity: Internet Quantity.

o Unit Price: Internet Unit Price.


o Product Unit Cost: Internet Unit Cost.

o Total Product Cost: Internet Cost.

o Sales Amount: Internet Revenue.

 Rename the following measures in the Reseller Sales measure group:

o Order Quantity - Reseller Sales: Reseller Quantity.

o Unit Price - Reseller Sales: Reseller Unit Price.

o Product Unit Cost - Reseller Sales: Reseller Unit Cost.

o Total Product Cost - Reseller Sales: Reseller Cost.

o Sales Amount - Reseller Sales: Reseller Revenue.

 Task 8: Configure Attributes


 Edit the Date.dim dimension and set its Type property to Time.

 Create attributes in the Date dimension from the following columns in the Date table (note that
spaces are automatically added to the attribute names to make them more readable):

o DateAltKey

o MonthName

o CalendarYear

o FiscalQuarter

o FiscalYear

 Rename the Date Alt Key attribute to Date.

 Set the AttributeHierarchyVisible property of the DateKey attribute to False.

 Modify the following properties of the Month Name attribute so that month names are sorted into
month number order:

o Key Columns: Replace MonthName with MonthNumber.

o Name Column: MonthName.

o OrderBy: Key.
 Configure the dimensions described in the following table, setting the AttributeHierarchyVisible
property of the key attribute to False in each dimension.

Dimension Type Column: Attribute

Product.dim Regular  ProductName: Product


 ProductSubcategoryName: Subcategory
 ProductCategoryName: Category

Customer.dim Regular  Name: Customer


 City: City
 StateOrProvince: State Or Province
 Country: Country

Reseller.dim Regular  BusinessType: Business Type


 ResellerName: Reseller
 City: City
 StateOrProvince: State Or Province
 Country: Country

SalesTerritory.dim Regular  SalesTerritoryName: Sales Territory


 Country: Country
 Region: Region

 After you finish, save all dimensions.

 Task 9: Deploy and Browse the Model


 View the project properties and verify that the project will be deployed to the localhost Analysis
Services server to a database named AWSalesMD.

 Deploy the project, replacing the AWSalesMD database if it already exists.


 In the Cube Designer, on the Browser tab, click the Analyze in Excel button (or click Analyze in
Excel on the Cube menu) to open the cube in Excel. Enable data connections if prompted.

 Use Excel to view the Internet Revenue and Reseller Revenue measures by the Category attribute
of the Product dimension.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this exercise, you will have a multidimensional data model named AWSalesMD.

Exercise 2: Creating a Tabular Data Model


Scenario
You have created a multidimensional data model, and now you plan to create a tabular model that
implements the same analytical functionality.

The main tasks for this exercise are as follows:


1. Create a Tabular Analysis Services Project.

2. Import Tables from the Data Warehouse.

3. Create Relationships.
4. Create Measures.

5. Configure Attributes.

6. Deploy and Browse the Model.

 Task 1: Create a Tabular Analysis Services Project


 Use SQL Server Data Tools to create an Analysis Services tabular project named AWSalesTab in the
D:\Labfiles\Lab05A\Starter folder.

 Use the localhost\SQL2 instance of Analysis Services as the workspace server, and set the
compatibility level of the project to SQL Server 2012 SP1 (1103).

 Task 2: Import Tables from the Data Warehouse


 Open the Model.bim model, and import the following tables from the AWDataWarehouse database
on the MIA-SQLDW instance of SQL Server (use the friendly names in parentheses):

o Customer (Customer).

o Date (Date).

o InternetSales (Internet Sales).

o Product (Product).

o Reseller (Reseller).

o ResellerSales (Reseller Sales).


o Salesperson (Salesperson).

o SalesTerritory (Sales Territory).

 Task 3: Create Relationships


 Switch the model view to Diagram View, and drag foreign key fields from the fact tables to primary
key fields in the dimension tables to create the following relationships.

Fact Table Foreign Key Dimension Table Primary Key

ResellerSales OrderDateKey Date DateKey

ResellerSales ProductKey Product ProductKey

ResellerSales ShipDateKey Date DateKey

ResellerSales ResellerKey Reseller ResellerKey

ResellerSales SalesPersonKey Salesperson EmployeeKey

ResellerSales SalesTerritoryKey SalesTerritory SalesTerritoryKey

InternetSales OrderDateKey Date DateKey

InternetSales ProductKey Product ProductKey

InternetSales ShipDateKey Date DateKey

InternetSales CustomerKey Customer CustomerKey

5. After all the relationships are created, save the model.

 Task 4: Create Measures


 Switch the model view back to data view, and view the Internet Sales table.

 In the measures grid, under the appropriate columns, add the following measures.

Column DAX Expression for Measure

OrderQuantity Internet Quantity:=SUM([OrderQuantity])

UnitPrice Internet Unit Price:=SUM([UnitPrice])

ProductUnitCost Internet Unit Cost:=SUM([ProductUnitCost])

TotalProductCost Internet Cost:=SUM([TotalProductCost])

SalesAmount Internet Revenue:=SUM([SalesAmount])

 In the Reseller Sales table, add the following measures.

Column DAX Expression for Measure

OrderQuantity Reseller Quantity:=SUM([OrderQuantity])

UnitPrice Reseller Unit Price:=SUM([UnitPrice])

ProductUnitCost Reseller Unit Cost:=SUM([ProductUnitCost])


TotalProductCost Reseller Cost:=SUM([TotalProductCost])

SalesAmount Reseller Revenue:=SUM([SalesAmount])

 After you create the measures, save the model.

 Task 5: Configure Attributes


 Switch the model view to Diagram View, and hide and rename columns in each of the tables so that
only the columns listed in the following table are visible.

 To hide multiple columns in a table, click the columns you want to hide while holding the Ctrl key,
and then right-click any selected column and click Hide from Client Tools.

 To rename a column, right-click it, and then click Rename.

Table Visible Columns

Customer  Name (renamed to Customer)


 City
 StateOrProvince (renamed to State Or Province)
 Country

Internet Sales  Internet Quantity


 Internet Unit Price
 Internet Unit Cost
 Internet Cost
 Internet Revenue

Date  DateAltKey (renamed to Date)


 MonthName (renamed to Month Name)
 CalendarYear (renamed to Calendar Year)
 FiscalQuarter (renamed to Fiscal Quarter)
 FiscalYear (renamed to Fiscal Year)

Product  ProductName (renamed to Product)


 ProductSubcategoryName (renamed to Subcategory)
 ProductCategoryName (renamed to Category)

Reseller Sales  Reseller Quantity


 Reseller Unit Price
 Reseller Unit Cost
 Reseller Cost
 Reseller Revenue

Reseller  BusinessType (renamed to Business Type)


 ResellerName (renamed to Reseller)
 City
 StateOrProvince (renamed to State Or Province)
 Country

Sales Territory  SalesTerritoryName (renamed to Sales Territory)


 Country
 Region

 After you configure the columns, save the model.

 Task 6: Deploy and Browse the Model


 View the project properties and verify that the project will be deployed to the localhost\SQL2
Analysis Services server to a database named AWSalesTab.

 Change the Cube name project property to Sales.


 Deploy the project, replacing the AWSalesTab database if it already exists.

 On the Model menu, click Analyze in Excel and open the default perspective in Excel by using the
credentials of the current Windows user.
 Use Excel to view the Internet Revenue and Reseller Revenue measures by the Category attribute
of the Product dimension.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this exercise, you will have a tabular data model named AWSalesTab.

Question: Which model did you find easiest to create?



Lesson 3
Designing Dimensions
Dimensions provide the analytical business factors by which the measures in the data model can be
aggregated.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for attributes and balanced hierarchies.

 Describe considerations for creating role-playing dimensions.

 Describe considerations for creating parent-child hierarchies.

Considerations for Balanced Hierarchies


Hierarchies represent how users will explore data
through dimensions within a cube. These
requirements should be established early in the BI
project when establishing the business
requirements. This information can be captured
through numerous methods, including interviews
and focus groups. Ultimately, business analysts will
capture user stories that reflect the reports that
the users desire from their data warehouse. These
requirements form the reports that are required
and the hierarchy of information as it should be
viewed from a business perspective. By
understanding the hierarchy of information, hierarchies can be created within Analysis Services to fulfill
the user experience of browsing the data as expected.

A balanced hierarchy is the most common type of hierarchy. A hierarchy is balanced when all branches of
the hierarchy descend to the same level, and each member's logical parent is in the level immediately
above the member. Balanced hierarchies can be implemented in both multidimensional and tabular data
models. A popular example is a calendar hierarchy that contains levels such as Calendar Year, Calendar
Quarter, and Calendar Month.

Creating natural or balanced hierarchies in tabular data models is a simple process of creating a hierarchy,
naming the hierarchy, and then dragging an attribute onto the hierarchy, with the top level dragged first,
followed by each lower level.

Creating a hierarchy in a multidimensional data model


In multidimensional solutions, the Dimension Designer provides a hierarchy pane that enables you to
create a hierarchy and drag each level into the hierarchy, with the top level dragged first, followed by
each lower level. However, an optional but important aspect of creating hierarchies is defining attribute
relationships, which help determine where aggregates should be created and stored, and which improve
users' browsing experience when they drill down through a hierarchy. Although attribute relationships are
not mandatory in multidimensional data models, it is highly recommended that you define them.

When a dimension is created, by default a one-to-many relationship is created between the key attribute
and the remaining attributes within the dimension. Although additional attribute relationships do not
need to be defined, creating them can improve browsing performance by providing information about
the relationships between attributes, which results in the selection of more effective aggregates when the
cube is processed. As a result, cube processing time can be reduced. Defining relationships between levels
in a hierarchy enables Analysis Services to define more useful aggregations that increase query
performance, and can also save memory during processing, which can be important with large or
complex cubes.

When you create relationships between attributes, you must also set the KeyColumns and
NameColumns properties of the attributes to ensure that each member in the hierarchy can be uniquely
identified by a combination of columns. For example, a hierarchy might include attributes and levels for
Calendar Month, Calendar Quarter, and Calendar Year. In this case, the Calendar Month attribute
should have a KeyColumns property that includes Calendar Month, Calendar Quarter, and Calendar
Year (because a specific month is unique within a quarter and year). Similarly, the Calendar Quarter
attribute should have a KeyColumns property that includes Calendar Quarter and Calendar Year,
because a quarter is unique within a specific year. If the members within the hierarchies you have created
cannot be uniquely identified, an error will occur when you try to process the dimension.

Creating hierarchies in a tabular data model


Creating hierarchies in a tabular model is much simpler than in a multidimensional model. To create a
hierarchy, view the model in Diagram View, add a hierarchy to the dimension table, and then drag the
columns you want to use as levels into the hierarchy.

Role-Playing Dimensions
Some dimensions can be reused for multiple
relationships with the same measure group or fact
table. For example, an Order table might be
related to a date dimension on both OrderDate
and ShipDate, so users can choose to aggregate
orders by the date on which they were placed, the
date on which they were shipped, or both.
Dimensions that can be used for multiple
relationships in this way are known as role-playing
dimensions.

Another example can be found in the insurance


industry, where a fact table containing the value of
insurance policies might contain a BuyerID column and a SellerID column. These IDs can relate to an
insurance broker dimension table on an InsuranceBrokerID column. In this case, an insurance broker
may be either the buyer or the seller of an insurance policy.

Role-playing dimensions in multidimensional data models


In multidimensional data modeling, SQL Server Analysis Services creates a role-playing dimension for each
relationship that is defined in the logical model. In the date example described earlier, two dimensions
would be created:

 Date (Order Date).

 Date (Ship Date).



An important point to note is that, although there are two dimensions in the cube, in reality, only one
physical dimension exists, and any changes you make to the underlying dimension (such as defining
hierarchies) will be applied to all of the role-playing dimensions that are based on it.

Role-playing dimensions in tabular data models


In tabular data models, although multiple relationships between a dimension table and a fact table can be
defined, only one relationship can be active at any one time. Other relationships are marked as inactive
and cannot be used to aggregate the measures in the fact table. Additionally, in a client tool such as
Microsoft Excel, only the name of the dimension table is shown, so users cannot determine which of the
relationships is being used to aggregate the data.

To implement multiple role-playing dimensions in a tabular data model, import the dimension table
multiple times and apply a friendly name to each copy that reflects the role that the table plays.

Parent-Child Hierarchies
A parent-child hierarchy is a user-defined hierarchy
that is derived from a single attribute that is
defined as a parent attribute. A parent attribute
describes a self-referencing relationship, or
self-join, within a dimension table. As a result, the
hierarchy appears to contain multiple, potentially
ragged levels. These levels are produced by the
relationship between the parent attribute and the
key attribute.

An example of a parent-child hierarchy is an


Employee dimension that consists of an
EmployeeID attribute as a key attribute and a ManagerID attribute as the parent attribute. A recursive
hierarchy can be created as a result of the relationship between the key and the parent attribute.

Parent-child hierarchies in multidimensional data models


In multidimensional data modeling, if relationships are defined within the underlying data source, or
logical relationships are defined within a data source view, SQL Server Analysis Services will detect a parent-child
dimension when the dimension is created. The parent-child hierarchy must then be created within the
Hierarchy pane of the Dimension Structure tab in the Dimension Designer of a particular dimension.

Parent-child hierarchies in tabular data models


There is no native support for parent-child hierarchies within a tabular data model; instead, you must use
DAX expressions to define them. Understanding the available DAX functions is important because
parent-child hierarchies are commonplace in analytical data models. To create a parent-child hierarchy in
a tabular data model, you use the PATH, LOOKUPVALUE, and PATHITEM DAX functions.

The PATH function


The PATH function returns a delimited text value that identifies the chain of parent values for the key
value on which the PATH statement is based, starting at the top of the hierarchy. A calculated column can
be created to store this path value. In the example of an Employee table, an additional column named
EmployeeLevels could be created to store this value. The PATH function has the following syntax:

PATH(<id_columnName>, <parent_columnName>)

 <id_columnName> refers to the key column in a table. For example in an Employees table, the key
column might be EmployeeID.

 <parent_columnName> refers to the parent column for a key column in a table. In the example of an
Employees table, this might be ManagerID.
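
Based on the Employees example, the EmployeeLevels calculated column could be defined with an
expression such as the following:

=PATH([EmployeeID], [ManagerID])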

Note: There are additional DAX functions that are similar to PATHITEM, including
PATHITEMREVERSE, PATHLENGTH, and PATHCONTAINS. For more information, see SQL
Server Books Online.

The LOOKUPVALUE and PATHITEM functions


To return a specific value from the result of the PATH function, the LOOKUPVALUE and PATHITEM
functions can be used together to return more meaningful information within a parent-child hierarchy.
The LOOKUPVALUE function returns the value to display, and the PATHITEM function defines the level
from which the value should be returned. For example, instead of returning an EmployeeID and a
ManagerID in a parent-child hierarchy, both functions could be used together to return the FirstName
and LastName attributes of an employee.
The LOOKUPVALUE function has the following syntax:

LOOKUPVALUE( <result_columnName>, <search_columnName>, <search_value>[,


<search_columnName>, <search_value>]…)

 <result_columnName> is the name of an existing column that contains the value you want to return.

 <search_columnName> is the name of an existing column on which the lookup is performed.

 <search_value> is a value that can be used to provide a filter for the LOOKUPVALUE function. This
can include a string literal value or another function, such as PATHITEM, to filter the data.
 Optionally, additional <search_columnName> and <search_value> parameters can be defined.

The PATHITEM function has the following syntax:

PATHITEM(<path>, <position>[, <type>])

 <path> refers to a column that contains the results of the PATH function. In the example of an
Employees table, this could be a column named EmployeeLevel.

 <position> is an integer expression referring to the position of the item to be returned.

 <type> is an optional parameter that can be used to determine the data type that the result should
be returned in. A value of 0 is text―which is the default―and a value of 1 is an integer data type.

You can use the LOOKUPVALUE and PATHITEM functions together to create a parent-child hierarchy. An
additional calculated column named Level1 can be created in the table, containing the following code to
populate the FirstName value of the employee at the first (top) level of the hierarchy:

LOOKUPVALUE ([FirstName], [EmployeeId], PATHITEM ([EmployeeLevels], 1))

The calculated column for the second level of the employee hierarchy, named Level2, is shown in the
following code example:

LOOKUPVALUE ([FirstName], [EmployeeId], PATHITEM ([EmployeeLevels], 2))



When the new calculated columns are defined to represent the different levels of the employee hierarchy,
you can use the Create Hierarchy button in SQL Server Data Tools to create a hierarchy, and then click
and drag each level into the new hierarchy.

Design considerations
When you design parent-child hierarchies in analytical data models, consider the following points:

 Ensure that the parent key and the child key are of compatible data types.

 Ensure that a self-join relationship exists between the parent key and the child key for best query and
processing performance.

Lab B: Designing Dimensions and Hierarchies


Scenario
You have created multidimensional and tabular models, and provided access to a small number of
business users to provide feedback. These users have suggested some additional ways in which they’d like
to be able to analyze the data, and you must now enhance the design of the dimensions in your models
to meet these requirements. Specifically, users want to be able to:

 View reseller sales performance by reseller geography; drilling down from countries, to states, to
cities, and to individual resellers.

 View reseller sales performance by business type, drilling down to individual resellers.

 View reseller sales performance by sales territory, drilling down from region, to country, and to
individual sales territory.

 View Internet sales by customer geography; drilling down from countries, to states, to cities, and to
individual customers.

 View both Internet and reseller sales by product; drilling down from category, to subcategory, to
individual product.
 View both Internet and reseller sales by calendar date; drilling down from year, to month, to date
based on both order dates and ship dates.

 View both Internet and reseller sales by fiscal date; drilling down from fiscal year, to fiscal quarter, to
month, to date based on both order dates and ship dates.

 View reseller sales performance by salesperson, drilling down through the sales management
structure from senior sales managers to individual sales representatives.

Objectives
After completing this lab, you will be able to:

 Design dimensions and hierarchies in a multidimensional data model.


 Design dimensions and hierarchies in a tabular data model.

Estimated Time: 60 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log onto 20467A-MIA-SQLBI as


ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Creating Dimensions in a Multidimensional Data Model


Scenario
After reviewing feedback from the business users, you are now ready to enhance the multidimensional
model you previously created to meet the new requirements for dimensions and hierarchies. A colleague
has already implemented some of the requested hierarchies.

The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Explore Dimension Hierarchies.

3. Create a Balanced Hierarchy.

4. Implement a Role-Playing Time Dimension.



5. Implement a Parent-Child Hierarchy.

6. Deploy and Browse the Model.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab05B\Starter folder as Administrator.

 Task 2: Explore Dimension Hierarchies


 Use SQL Server Data Tools to open the AWSalesMD.sln solution in the D:\Labfiles\Lab05B\Starter
folder.

 Examine the Customer.dim dimension, and note that a hierarchy named Customers By Geography
has been created.

 Examine the attribute relationships defined for this dimension. These relationships are optional, but
they can significantly improve aggregation performance when processing the dimension.

 Examine the properties of the attributes in the dimension, and note the following:

o Each attribute that is included in the Customers By Geography hierarchy is uniquely identified
by a combination of multiple columns in the KeyColumns property. For example, the City
attribute has a KeyColumns property value that includes the City, StateOrProvince, and
Country columns. This ensures that a city in the hierarchy is uniquely identified based on the city
name, the state or province, and the country―for example, two instances of Paris in the Seine
region of France are known to be the same city, while Paris in Texas, USA is a different city.

o Attributes with multiple KeyColumn values have the NameColumn and ValueColumn property
set to reflect the column that should be used for the attribute’s name and value.

o All attributes have the AttributeHierarchyVisible property set to False, so the only way to
browse the dimension is through the Customers By Geography hierarchy.
 Process the dimension, deploying the database if necessary. Then browse the Customers By
Geography hierarchy in the Browser tab of the Dimension Designer.

 Examine the Reseller.dim dimension (which has two hierarchies) and the SalesTerritory.dim
dimension, and note that the attributes in these dimensions have been similarly configured.

 Task 3: Create a Balanced Hierarchy


 Edit the Product.dim dimension to create a hierarchy named Products By Category that includes
the following attributes (in order):

o Category

o Subcategory

o Product
 Create the following attribute relationships:

o A flexible relationship between Product and Subcategory.

o A flexible relationship between Subcategory and Category.


 Make the existing relationship between ProductKey and Product rigid.

 Edit the properties of the attributes in the hierarchy so that the following statements are true:

o None of the attributes is visible other than in the Products By Category hierarchy.

o The product attribute is uniquely identified by the ProductKey and ProductName columns (in
that order).

o The name and value of the Product attribute are based on the ProductName column.

o The Subcategory attribute is uniquely identified by the ProductCategoryName and


ProductSubcategoryName columns (in that order).

o The name and value of the Subcategory attribute are based on the ProductSubcategoryName
column.

o The name and value of the Category attribute are based on the ProductCategoryName
column.

 After you create the hierarchy and configure the attributes, save all files in the project.

 Process the dimension, deploying the database if necessary. Then browse the Products By Category
hierarchy in the Browser tab of the Dimension Designer.

 Task 4: Implement a Role-Playing Time Dimension


 In the Cube Designer, open Sales.cube, and on the Dimension Usage tab, note that the Date
dimension is listed twice―once to represent the Order Date, and once to represent the Ship Date.
Both versions of the dimension are related to the Internet Sales and Reseller Sales measure groups
by the Date Key column.
The Date dimension is a role-playing dimension with two roles (Order Date and Ship Date). These roles
were automatically created by the Cube Wizard when the cube was created based on the two
relationships between the fact tables and the Date table in the data source view.

 Edit the Date.dim dimension, and create the following hierarchies:

o Calendar Date;
 Calendar Year.
 Month Name.
 Date.
o Fiscal Date;
 Fiscal Year.
 Fiscal Quarter.
 Month Name.
 Date.
 Create the following attribute relationships, and change the existing relationship type between
DateKey and Date to Rigid.

Source Attribute Related Attribute Relationship Type

Date Month Name Rigid

Month Name Calendar Year Rigid

Month Name Fiscal Quarter Rigid


Fiscal Quarter Fiscal Year Rigid

 Set the following properties for the attributes in the dimension.

Attribute Property Values

Calendar Year  AttributeHierarchyVisible: False

Date  AttributeHierarchyVisible: False


 KeyColumns: DateKey, and DateAltKey (in that order)
 NameColumn: DateAltKey
 ValueColumn: DateAltKey

Fiscal Quarter  AttributeHierarchyVisible: False


 KeyColumns: FiscalYear, and FiscalQuarter (in that order)
 NameColumn: FiscalQuarter
 ValueColumn: FiscalQuarter

Fiscal Year  AttributeHierarchyVisible: False

Month Name  AttributeHierarchyVisible: False


 KeyColumns: CalendarYear, and MonthNumber (in that order)
 NameColumn: MonthName
 ValueColumn: MonthName

 Set the OrderBy property of the Month Name attribute to Key. Note that because you used the
MonthNumber column in the attribute’s KeyColumns property instead of MonthName, the month
names will be displayed, but they will be sorted by the month number.
 Set the Type property of the dimension to Time by clicking the Date dimension icon above the
attributes in the Attributes pane, and changing the property in the Properties pane. Specifying a
type of Time enables time intelligence for the dimension (so Analysis Services can calculate time
intervals between values).

 Set the Type property of the following attributes:

o Calendar Year: Date > Calendar > Years.

o Date: Date > Calendar > Date.

o Fiscal Quarter: Date > Fiscal > FiscalQuarters.

o Fiscal Year: Date > Fiscal > FiscalYears.

 Save all files in the project.

 Process the dimension, deploying the database if necessary. Then browse the Calendar Year and
Fiscal Year hierarchies in the Browser tab of the Dimension Designer.

 Task 5: Implement a Parent-Child Hierarchy


 Edit the Salesperson.dim dimension, and create a new attribute from the ParentEmployeeKey
column in the Salesperson table.

 Rename the Parent Employee Key attribute to Salesperson, and set its Usage property to Parent.

 Set the AttributeHierarchyVisible property of the Employee Key attribute to False.

 Process the dimension, deploying the database if necessary, and then browse the Salesperson
hierarchy in the Browser tab of the Dimension Designer.

 Task 6: Deploy and Browse the Model


 Deploy the project, replacing the AWSalesMD database if it already exists. If you are prompted for
credentials, use the ADVENTUREWORKS\ServiceAcct user account with the password Pa$$w0rd.

 In the Cube Designer for Sales.cube, on the Browser tab, click the Analyze in Excel button (or click
Analyze in Excel on the Cube menu) to open the cube in Excel. Enable data connections if
prompted.

 Use Excel to view the Reseller Revenue measure by the Products By Category hierarchy, the Order
Date.Calendar Year hierarchy, and the Salesperson hierarchy, verifying that the hierarchies behave
as expected.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this exercise, you will have a multidimensional model that includes balanced
hierarchies, a role-playing dimension, and a parent-child dimension.

Exercise 2: Creating Dimensions in a Tabular Data Model


Scenario
Having created the required dimensions and hierarchies in a multidimensional model, you want to apply
the same changes to the tabular data model.

The main tasks for this exercise are as follows:

1. Explore Dimension Hierarchies.


2. Create a Balanced Hierarchy.

3. Implement a Role-Playing Time Dimension.

4. Implement a Parent-Child Hierarchy.


5. Deploy and Browse the Model.

 Task 1: Explore Dimension Hierarchies


 Use SQL Server Data Tools to open the AWSalesTab.sln solution in the D:\Labfiles\Lab05B\Starter
folder.
Note: If you reverted the virtual machine after the previous lab, you may be prompted to select a
workspace server. If so, use the localhost\SQL2 instance of Analysis Services as the workspace server, and
set the compatibility level of the project to SQL Server 2012 SP1 (1103).

 Process all of the tables in the model, using the user name ADVENTUREWORKS\ServiceAcct and
the password Pa$$w0rd to connect to the data source.

 In Diagram View, examine the Customers table, and note the following details:

o A hierarchy named Customers By Geography has been created.



o The Customers By Geography hierarchy contains the Country, State Or Province, City, and
Customer attributes.

o The Country, State Or Province, City, and Customer attributes in the table (not in the
hierarchy) have been hidden from client tools.

 Note that hierarchies have also been created in the Reseller and Sales Territory tables.

 Task 2: Create a Balanced Hierarchy


 In Diagram View, maximize the Product table by clicking the Maximize button on its title bar.

 Use the Create Hierarchy button in the title bar of the maximized product table to create a hierarchy
named Products By Category.

 Add the Category, Subcategory, and Product attributes to the Products By Category hierarchy.

 Hide the Category, Subcategory, and Product attributes that are not in the hierarchy from client
tools.
 Restore the Product table to its previous size.

 After you finish creating the hierarchy, save the model.

 Task 3: Implement a Role-Playing Time Dimension


 In Diagram View, note that the relationships between ShipDateKey in the Internet Sales and
Reseller Sales tables, and DateKey in the Date table are shown as dotted lines. This is because only
one relationship can be active between the same two tables at any time, and the OrderDateKey
relationships are currently active (which is why they are shown as a solid line).
 Delete the dotted relationships so that the only relationship between the Internet Sales and Reseller
Sales tables and the Date table is based on the OrderDateKey columns.

 Rename the Date table to Order Date.


 Use the existing SqlServer MIA-SQLDW AWDataWarehouse connection in the model to import the
Date view again, this time with the friendly name Ship Date. Note the following:

o You can open existing connections on the Model menu.


o If you are prompted for impersonation credentials, use ADVENTUREWORKS\ServiceAcct with
the password Pa$$w0rd.

 Rename the columns in the Ship Date table to match the equivalent columns in the Order Date
table.

 Create relationships between the Reseller Sales and Internet Sales tables and the Date table by
linking the ShipDateKey column in the Internet Sales and Reseller Sales tables to the DateKey
column in the Ship Date table.

 Create the following hierarchies in both the Order Date and Ship Date tables:

o Calendar Date;
 Calendar Year.
 Month Name.
 Date.
o Fiscal Date;
 Fiscal Year.
 Fiscal Quarter.

 Month Name.
 Date.
 Hide all columns in the Order Date and Ship Date tables (other than the hierarchies) from client
tools.

 Mark the Order Date and Ship Date tables as date tables:

o View each table in data view.

o On the Table menu, point to Date, and click Mark as Date Table.

o Use the Date column when marking the tables as date tables.

 Sort the Month Name column in the Order Date and Ship Date tables by the MonthNumber
column:

o In Data View, select the Month Name column header, and then on the Column menu, point to
Sort, and click Sort by Column.

 After you finish, save the model.

 Task 4: Implement a Parent-Child Hierarchy


 In Data View, add a column named Path to the Salesperson table. Use the following DAX formula to
calculate its value:

=PATH([EmployeeKey], [ParentEmployeeKey])

 Add a column named Level1 to the Salesperson table. Use the following DAX formula to calculate its
value:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 1, 1))

 Add a column named Level2 to the Salesperson table. Use the following DAX formula to calculate
its value:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 2, 1))

 Add a column named Level3 to the Salesperson table. Use the following DAX formula to calculate its
value:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 3, 1))

 In Diagram View, create a hierarchy named Salesperson in the Salesperson table, and add the
Level1, Level2, and Level3 attributes to the hierarchy.

 Hide all columns in the Salesperson table (other than the hierarchy) from client tools.

 After you finish, save the model.

 Task 5: Deploy and Browse the Model


 Deploy the project, replacing the AWSalesTab database if it already exists.

 On the Model menu, click Analyze in Excel and open the default perspective in Excel by using the
credentials of the current Windows user.

 Use Excel to view the Reseller Revenue measure by the Products By Category hierarchy, the Order
Date.Calendar Year hierarchy, and the Salesperson hierarchy, verifying that the hierarchies behave
as expected.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this exercise, you will have a tabular model that includes balanced hierarchies, a
role-playing dimension, and a parent-child dimension.

Question: How do the two models compare when designing dimensions and hierarchies?

Lesson 4
Enhancing Data Models
After you create a data model and define its dimensions and hierarchies, business users can use it to
perform analytics. However, you can enhance a data model to add more value to the aggregated
measures and improve the business analytical experience.

Lesson Objectives
After completing this lesson, you will be able to:

 Design calculated measures in multidimensional and tabular data models.

 Design key performance indicators in multidimensional and tabular data models.

 Design perspectives in multidimensional and tabular data models.

 Design measure storage in multidimensional and tabular data models.

Custom Calculations
You can extend data models by adding custom
calculations to create measures that are not
available in the data source. For example, a data
model might include measures for sales revenue
and cost, and you could create a calculated
member to calculate gross profit by subtracting
cost from sales revenue. When designing custom
calculations for a data model, there are typically
two kinds of custom calculation that you can
create:
 Calculated columns and measures. You can
use an MDX expression in a multidimensional
model or a DAX expression in a tabular model to create a calculated column in a table. Typically, the
column uses values from other columns in the same table to create a row-level value. For example, if
your data model included a table that contains a Unit Price column that contains the cost of a single
product, and an Order Quantity column that contains the number of units ordered, you could create
a calculated Sales Amount column by multiplying the unit price by the order quantity. You could
then create an aggregated measure based on the calculated column (by adding it to the measure
group for the table in a multidimensional model, or by creating a DAX calculation that uses an
aggregate function in the measure grid in a tabular model).

 Global calculated members. In addition to using calculated columns to create measures, you can
use MDX or DAX expressions to create global calculated members that exist independently of a
measure group or table. A calculated member can perform a calculation that spans multiple fact
tables―for example, by adding a Costs measure in a marketing fact table to a Costs measure in a
manufacturing fact table to generate a total cost across both areas of the business. In a
multidimensional model, you create calculated members in the Cube Designer and define a named
folder in which client tools can browse them. In tabular data models, you can create calculated
measures in the measure grid of any table, but when browsed by a client tool, the calculated
measures will appear as members of the table in which they were created. To provide the custom
folder browsing experience of a multidimensional calculated member in a tabular model, you can
paste an empty table into the model from the clipboard and use that as a folder for global calculated
measures.

Considerations for custom calculations in multidimensional data models


You can create named calculations, which are calculated columns in tables in a data source view. These
values are calculated for each row when the cube is processed and stored in the cube. If a measure group
exists for the table that contains the named calculation, you can aggregate the calculated column as a
measure in the cube. You can also create measure expressions, which are MDX expressions for which the
result is stored in the cube.
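
For example, a named calculation that defines a Sales Amount column in a data source view could use a
simple SQL expression such as the following (the column names are illustrative):

UnitPrice * OrderQuantity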

Additionally, you can use the Calculations tab in the Cube Designer to create calculated members. A
calculated member is a custom MDX expression that defines a member in any hierarchy in the cube. No
data is stored for a calculated member; the MDX expression is evaluated at run time. Calculations are
solved in the order listed in the Script Organizer pane. For best performance with cell calculations,
specify only a single member when possible. The Non-empty Behavior option stores the names of
measures used to resolve NON EMPTY queries in MDX. The script view enables you to add additional
settings to the calculated members that are not available through the form-based calculated member
user interface.

Calculated measures and calculated members can involve simple calculations of existing measures, or the
use of a wide range of MDX functions to perform advanced analytics. The following example uses tuples
to subtract the discount amount from the sales amount:

[Measures].[SalesAmount] - [Measures].[Discount Amount]

The following example uses the ParallelPeriod function to return the order quantity for the parallel period
in the previous calendar year. The measure, dimension, and hierarchy names are illustrative and are based
on the cube created in the labs in this module:
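
// Illustrative names based on the lab cube; adjust the references to match your own model.
([Measures].[Internet Quantity],
 ParallelPeriod([Order Date].[Calendar Date].[Calendar Year], 1,
                [Order Date].[Calendar Date].CurrentMember))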

Considerations for custom calculations in tabular data models


Calculated columns enable you to create a DAX formula that defines row-level values for a column. The
DAX expression can be based on measures and other calculated columns. You can concatenate values,
perform addition, extract substrings, or compare the values in other fields. It is typical to create calculated
columns and measures earlier in the development process of tabular data models because they are
required when browsing the data model within Microsoft Excel.

Note: Calculated columns are not supported in Direct Query mode.

The following example of a calculated column named Profit uses a DAX expression to subtract the cost
from the sales amount, and is evaluated for every row in the table:

=[SalesAmount] - [Cost]

Like any other numeric column, you can create calculated measures from calculated columns. Calculated
measures are aggregated based on a filter or slicer that the user applies in a reporting client, such as a
PivotTable. Measures can be based on standard aggregation functions, such as DISTINCTCOUNT, COUNT,
or SUM, or you can define your own formula by using DAX. The following example calculates the sum of
the Profit column described earlier for a given filter or slicer:

Total Profit:=SUM([Profit])

The formula for a calculated column can be more resource-intensive than the formula used for a measure
because the result for a calculated column is always calculated for each row in a table, whereas a measure
is only calculated for the cells defined by the filter.

You can create a calculated measure that uses a DAX expression that references values outside of the
current table. This enables you to create global calculated measures that do not naturally belong in any
one table. When you need to create this kind of global calculated member, you should consider adding a
global calculations table to the data model to centralize the storage of global calculations and make
browsing more intuitive for business users. An easy way to accomplish this is to copy an Excel worksheet
that contains only a single column header to the clipboard, and paste it into the tabular data model as an
empty table. You can then use DAX expressions to define measures in the measure grid of the new table.
When users browse the data model, the measures will be listed under the table you pasted instead of
under one of the fact tables or dimension tables that contain data rows.
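
For example, a global measure that combines revenue from two fact tables could be defined in the
measure grid of such a table by using a DAX expression like the following sketch (the table and column
names are based on the tabular model created in the labs in this module):

Total Revenue:=SUM('Internet Sales'[SalesAmount]) + SUM('Reseller Sales'[SalesAmount])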

Note: MDX and DAX are rich expression languages that contain a wide range of functions
and expressions. For more information about these expression languages, see SQL Server Books
Online.

Key Performance Indicators


To provide visualization of key business
information, data can be presented from within
SQL Server Analysis Services by using key
performance indicators (KPIs). This feature enables
business users to assess performance over a period
of time using visual indicators and comparative
values.

KPIs in multidimensional data models


Multidimensional Expressions (MDX) provides
KPI-specific functions that enable you to reference
KPI values programmatically. However, the
Cube Designer provides a KPI designer that
enables you to easily create KPIs in a graphical user interface; behind the scenes, each element of the KPI
is defined by an MDX expression. The following MDX KPI functions are used:
 KPIVALUE. An MDX expression that calculates the current value of the KPI, which is compared to the
result of the KPIGOAL expression.

 KPIGOAL. An MDX expression that calculates the target of the KPI, to which the KPIVALUE expression
is compared.

 KPISTATUS. An MDX expression that compares the KPIVALUE expression to the KPIGOAL expression;
the result is applied to a status indicator.

 KPITREND. An MDX expression that calculates a trend over time; the result is applied to a trend
indicator.
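
After a KPI has been defined, you can use these functions in MDX queries. The following sketch retrieves
the value and goal of a hypothetical KPI named Revenue KPI from a cube named Sales:

SELECT
  { KPIValue("Revenue KPI"), KPIGoal("Revenue KPI") } ON COLUMNS
FROM [Sales]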

KPIs in tabular data models


In a tabular model, a KPI is defined against a calculated measure that you first create in the measure grid
of the tabular data model, and its value is compared to a target that is defined either by another measure
or by a static value. In the measure grid, right-click a calculated measure, and then click Create KPI. A user
interface is then displayed in which you can set the target value, use a slider to define the status thresholds
for the KPI, and select the type of visualization to use in the KPI. Tabular data models do not support trend
analysis.
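As a minimal sketch, a KPI that tracks gross margin against a 40 percent target might be based on measures such as the following. Gross Margin is the measure used in the lab in this module; the explicit target measure is an optional assumption, because the KPI interface also accepts an absolute target value such as 0.4.

Gross Margin:=[Total Profit] / [Total Revenue]
Gross Margin Target:=0.4

You would then right-click the Gross Margin measure, click Create KPI, select either the Gross Margin Target measure or the absolute value 0.4 as the target, and use the slider to set status thresholds (for example, 0.3 and 0.4).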

Perspectives
Perspectives enable you to organize data model objects to make it easier for users to browse a data model. Perspectives are especially useful with very large cubes that contain many dimensions, measure groups, KPIs, and other objects, which can make it difficult and confusing for users to find the objects that they need.

The implementation of perspectives is very similar in multidimensional and tabular data models, and the user interface is functionally the same. In multidimensional models, you can create perspectives on the Perspectives tab in SQL Server Data Tools by clicking the New Perspective button in the Cube Designer. In tabular data models, you can create perspectives from the Model menu of the Data Model Designer.
In both types of data model, click the New Perspective button, type a name for the perspective, and then select the check box next to each object that you want to display in the perspective.

After the data model is deployed, when a user connects to the data model from Microsoft Excel, the available perspectives are listed together with the deployed cubes.

Note: Perspectives are not used to apply security to analytical data models.

Considerations for Measures Storage


Measures are the values that are stored within a cube. They consist of detailed information and can also include aggregated information across the dimensions that make up the axes of the cube. There are several properties that you can use to control how measures are aggregated and stored, and choosing the right settings can have a significant effect on cube processing and query performance.

Designing measure storage for multidimensional data models

Typically, measures map to numeric columns in the fact table of a data warehouse, but you can also use calculated columns as the basis for a measure's value. Measures have properties that enable you to define how they function and how they appear to users. These properties can be configured on the Cube Structure tab of the Cube Designer, and include AggregateFunction, DataType, DisplayFolder, MeasureExpression, Visible, Description, FormatString, ID, Name, and Source.

In multidimensional data models, measures are grouped into measure groups, which define the granularity of the data and contain information about how measures relate to dimensions. You can use the properties of a measure group to control how it handles errors, its processing mode, and the storage mode for the cube.

The following storage modes can be defined for measure groups in multidimensional data models. The choice of storage mode can dramatically affect performance and cube size.

 MOLAP. Setting the storage mode to MOLAP (Multidimensional Online Analytical Processing) causes measures to be stored in the OLAP database. This speeds up query performance for data within the OLAP database, but it also increases the size of the cube and the time it takes to process the data.

 ROLAP. Setting the storage mode to ROLAP (Relational Online Analytical Processing) causes measures to be stored in the source system, typically the fact table in the data warehouse. SQL Server Analysis Services creates additional tables in the data source for any pre-aggregated data.

 HOLAP. Setting the storage mode to HOLAP (Hybrid Online Analytical Processing) is a compromise between the MOLAP and ROLAP storage modes. With HOLAP, pre-aggregated data is stored in the OLAP database, and detailed data is stored in the relational data source. As a result, queries for detailed data in the cube are slower than queries for pre-aggregated data.

When using SQL Server 2012 Enterprise edition, you can partition a measure group. Partitioning enables users to query the cube more efficiently and reduces the time it takes to process the cube, because individual partitions can be processed instead of the entire cube. For example, if you have a cube that stores 10 years of historical data, you could create two partitions. The first partition could store the data for the two most recent years, which is the most frequently queried time period, in MOLAP storage mode. The second partition could store the remaining eight years of data in ROLAP storage mode.

Designing measure storage for tabular data models


Measure configuration for tabular data models is simpler than for multidimensional data models. By default, tabular data models load all data into memory and do not use disk-based storage structures in the same way that multidimensional data models do. If memory is limited on the server on which the tabular instance is installed, you can change the storage mode to DirectQuery. In this mode, queries are sent to the data source instead of the data being loaded into memory. When DirectQuery mode is enabled, many DAX operators and functions are not supported.

Partitioning is also supported in tabular data models, providing the ability to spread the data across
multiple partitions. You can also copy, merge, and delete partitions.

Lab C: Enhancing Data Models


Scenario
You have created the hierarchies requested by the business users and asked for feedback. The users are
happy with the data model, but have asked for the following enhancements:

 The CFO would like to be able to view profit for Internet and reseller sales, and also grand totals for
cost, revenue, and profit across both Internet and reseller sales.

 The CEO also wants an easy way to quickly see how profit margin is performing against a target of 40
percent.

 Sales analysts have requested simplified cubes, specifically for Internet sales and reseller sales.

Objectives
After completing this lab, you will be able to:

 Create custom calculations, KPIs, and perspectives in a multidimensional data model.

 Create custom calculations, KPIs, and perspectives in a tabular data model.


Estimated Time: 60 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Enhancing a Multidimensional Data Model


Scenario
You have reviewed the feedback from business users and have decided to enhance the multidimensional
data model with new functionality to improve the user analytical experience.

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Create Named Calculations.

3. Create Calculated Members.

4. Create a KPI.

5. Create Perspectives.

6. Deploy and Browse the Model.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab05C\Starter folder as Administrator.

 Task 2: Create Named Calculations


 Use SQL Server Data Tools to open the AWSalesMD.sln solution in the D:\Labfiles\Lab05B\Starter
folder.

 In the AW Data Warehouse.dsv data source view, add a named calculation to the InternetSales
table:

o Name the calculation Internet Profit.

o Use the following expression:

[SalesAmount] - [TotalProductCost]

 Add a similar named calculation, Reseller Profit, to the ResellerSales table using the same
expression.

 In the Sales cube, add the Internet Profit named calculation as a measure in the Internet Sales measure group,
and the Reseller Profit named calculation as a measure in the Reseller Sales measure group.

 Task 3: Create Calculated Members


 On the Calculations tab of the Cube Designer, add the following calculated members.

o Total Revenue
   Expression: [Measures].[Internet Revenue] + [Measures].[Reseller Revenue]
   Format String: "$#,##0.00;($#,##0.00)"
   Display Folder: Total Sales

o Total Cost
   Expression: [Measures].[Internet Cost] + [Measures].[Reseller Cost]
   Format String: "$#,##0.00;($#,##0.00)"
   Display Folder: Total Sales

o Total Profit
   Expression: [Measures].[Internet Profit] + [Measures].[Reseller Profit]
   Format String: "$#,##0.00;($#,##0.00)"
   Display Folder: Total Sales

o Gross Margin
   Expression: ([Measures].[Internet Profit] + [Measures].[Reseller Profit]) / ([Measures].[Internet Revenue] + [Measures].[Reseller Revenue])
   Format String: "Percent"
   Display Folder: Total Sales

 After you add the calculated members, save all files in the project.

 Process the cube, redeploying the AWSalesMD database if necessary. If you are prompted for
credentials, use the ADVENTUREWORKS\ServiceAcct user account with the password Pa$$w0rd.

 Task 4: Create a KPI


 In the Cube Designer, on the KPIs tab, click the Reconnect button.

 Create a new KPI named Gross Margin with the following settings:
o Value Expression:

[Measures].[Gross Margin]

o Goal Expression:

0.4

o Status indicator: Shapes.

o Status Expression:

CASE
WHEN([Measures].[Gross Margin]) < 0.3 THEN -1
WHEN ([Measures].[Gross Margin]) > 0.4 THEN 1
ELSE 0
END

 After you create the KPI, save all files in the project.

 Process the cube, redeploying the AWSalesMD database if necessary. If you are prompted for
credentials, use the ADVENTUREWORKS\ServiceAcct user account with the password Pa$$w0rd.

 Task 5: Create Perspectives


 On the Perspectives tab of the Cube Designer, create a perspective named Internet Sales that
includes only the following objects:
o The Internet Quantity, Internet Cost, Internet Revenue, and Internet Profit measures.

o The Order Date, Product, Ship Date, and Customer dimensions.

 Create a perspective named Reseller Sales that includes only the following objects:
o The Reseller Quantity, Reseller Cost, Reseller Revenue, and Reseller Profit measures.

o The Order Date, Product, Ship Date, Reseller, Sales Territory, and Salesperson dimensions.

 After you create the perspectives, save all files in the project.

 Task 6: Deploy and Browse the Model


 Deploy the project, replacing the AWSalesMD database if it already exists. If you are prompted for
credentials, use the ADVENTUREWORKS\ServiceAcct user account with the password Pa$$w0rd.

 In the Cube Designer for Sales.cube, on the Browser tab, click the Analyze in Excel button (or click
Analyze in Excel on the Cube menu), and then select the Sales perspective (which is the complete
cube). Enable data connections if prompted.

 Use Excel to view the Total Revenue, Total Cost, Total Profit, and Gross Margin measures and the
Status of the Gross Margin KPI by the Products By Category hierarchy.

 Close Excel without saving the workbook, and then on the Browser tab, click the Analyze in Excel
button (or click Analyze in Excel on the Cube menu), and then select the Internet Sales perspective.
Enable data connections if prompted, and verify that only four measures from the Internet Sales
measure group, and the Customer, Order Date, Product, and Ship Date dimensions are available.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this lab, you will have a multidimensional model that contains custom
calculations, a KPI, and perspectives.

Exercise 2: Enhancing a Tabular Data Model


Scenario
You have enhanced the multidimensional data model, and now you want to create the same
enhancements in the tabular data model.

The main tasks for this exercise are as follows:

1. Create Calculated Measures.

2. Add a Table for Global Calculated Measures.

3. Create a KPI.

4. Create Perspectives.

5. Deploy and Browse the Model.

 Task 1: Create Calculated Measures


 Use SQL Server Data Tools to open the AWSalesTab.sln solution in the D:\Labfiles\Lab05C\Starter
folder.

Note: If you reverted the virtual machine after the previous lab, you may be prompted to select a
workspace server. If so, use the localhost\SQL2 instance of Analysis Services as the workspace server, and
set the compatibility level of the project to SQL Server 2012 SP1 (1103).
 Process all of the tables in the model, using the user name ADVENTUREWORKS\ServiceAcct and
the password Pa$$w0rd to connect to the data source.

 In Data View, add a column named Profit to the Internet Sales table. Use the following DAX formula
to calculate its value:

=[SalesAmount] - [TotalProductCost]

 Hide the new Profit column from client tools.

 In the measures grid, under the Profit column, add the following measure:

Internet Profit:=SUM([Profit])

 Perform the same steps in the Reseller Sales table to create a hidden column named Profit, and an
aggregated measure named Reseller Profit.

 After you create the measures, save the model.

 Task 2: Add a Table for Global Calculated Measures


 Use Excel to create a new blank workbook, and in cell A1, type Total Sales. Copy cell A1 to the
clipboard, and then switch back to the AWSalesTab project in Visual Studio.

 Click any table tab, and then on the Edit menu, click Paste to paste the data from the clipboard as a
new table.

 Name the new table Total Sales.

 Add the following measures to the Total Sales table, and set the Format property of each measure as indicated.

o Total Revenue:=[Internet Revenue] + [Reseller Revenue] (Format: Currency)

o Total Cost:=[Internet Cost] + [Reseller Cost] (Format: Currency)

o Total Profit:=[Internet Profit] + [Reseller Profit] (Format: Currency)

o Gross Margin:=[Total Profit] / [Total Revenue] (Format: Percentage)

 After you create the measures, save the model.

 Task 3: Create a KPI


 In the Total Sales table, add a KPI to the Gross Margin measure.

 The KPI should use an absolute value of 0.4 as the target, and have status thresholds of 0.3 and 0.4.

 After you create the KPI, save the model.

 Task 4: Create Perspectives


 In the model, create a perspective named Internet Sales that includes only the following objects:

o The Internet Quantity, Internet Cost, Internet Revenue, and Internet Profit measures.

o The Order Date, Product, Ship Date, and Customer dimensions.


 Create a perspective named Reseller Sales that includes only the following objects:

o The Reseller Quantity, Reseller Cost, Reseller Revenue, and Reseller Profit measures.

o The Order Date, Product, Ship Date, Reseller, Sales Territory, and Salesperson dimensions.
 After you create the perspectives, save the model.

 Task 5: Deploy and Browse the Model


 Deploy the project, replacing the AWSalesTab database if it already exists.

 On the Model menu, click Analyze in Excel and open the default perspective in Excel by using the
credentials of the current Windows user.

 Use Excel to view the Total Revenue, Total Cost, Total Profit, and Gross Margin measures and the
Status of the Gross Margin KPI by the Products By Category hierarchy.
 Close Excel without saving the workbook, and then on the Model menu, click Analyze in Excel, and
then open the Internet Sales perspective in Excel by using the credentials of the current Windows
user to verify that only four measures from the Internet Sales measure group, and the Customer,
Order Date, Product, and Ship Date dimensions are available.

 After you finish browsing the cube, close Excel without saving the workbook, and then close Visual
Studio.

Results: At the end of this exercise, you will have a tabular model that contains calculated measures, a
KPI, and perspectives.

Question: How did the experience of creating cube enhancements vary between the two
models?

Module Review and Takeaways


Module Review
This module provided a comparison between multidimensional and tabular data models, with the
intention of enabling you to learn more about the specific capabilities of each and the development tasks
that must be performed to implement both types of model. Familiarity with both models will help you
determine the most appropriate one to use for a particular business scenario.

Question: Now that you are familiar with both models, how would you decide which to use
in a particular business scenario?

Module 6
Planning a BI Delivery Solution
Contents:
Module Overview 6-1

Lesson 1: Considerations for Delivering BI 6-2

Lesson 2: Common Reporting Scenarios 6-6

Lesson 3: Choosing a Reporting Tool 6-10

Lab: Planning a BI Delivery Solution 6-14

Module Review and Takeaways 6-15

Module Overview
The primary goal of any business intelligence (BI) solution is to deliver meaningful data to business users,
and empower them to make informed business decisions. A well-designed data warehouse and
comprehensive analytical data model are of no use without a way for users to consume the information
they contain. When planning your BI solution, you must consider how it will deliver data to users, and
choose the most appropriate reporting tools for the business requirements.
This module describes common reporting scenarios and the Microsoft tools that can be used to support
them.

Objectives
After completing this module, you will be able to:

 Describe considerations for delivering BI to users.

 Describe the key features of common reporting scenarios.

 Choose an appropriate reporting tool for a particular reporting scenario.



Lesson 1
Considerations for Delivering BI
The reports and analytical interfaces that business users consume are the visible “face” of the BI solution.
For most users, the reporting tools and the information they make available represent the entire BI
solution, and its success or failure will largely be judged on their effectiveness.

Therefore, in addition to the detailed planning that is required for the server infrastructure, the data
warehouse, the extract, transform, and load (ETL) processes, and the analytical data models on which the
BI solution is based, you must give careful consideration to how the BI data will be delivered to business
users in the form of reports and analytical interfaces.

Lesson Objectives
After completing this lesson, you will be able to:
 Describe how reporting and analysis design fits into the overall BI project.

 Describe types of data source commonly used for reporting and analysis.

 Describe the Microsoft tools that are commonly used for reporting and analysis.
 Describe how Microsoft SharePoint Server can be used as a platform for delivering BI.

Reporting and Analysis Design in Context


Reporting and analysis design is based on the
business requirements for the BI solution. Often,
the business requirements are expressed as
specifications for reports or analytical activities,
and the design of the reporting and analysis
interfaces is really a further iteration of the
requirements gathering process in which report
requirements are used to generate mockups,
which evolve into prototypes, and eventually into
reports.

The design of the reporting and analysis elements of the solution heavily influences the data
warehouse and ETL design (to help ensure that the data required in the reports is present in the data
warehouse), and the technical architecture and infrastructure design (so that the server and client software
necessary to support the reporting interfaces is present). Therefore, it is critical in a BI project that there is
strong communication between the subteams working in each key area of the overall solution.

Data Sources for BI


Most corporate reporting and analysis is based on
data in the data warehouse (or a departmental
data mart based on a subset of the data
warehouse) or on analytical data models.
However, although it is desirable to centralize all
data for reporting and analysis, most enterprises
include data that, for practical reasons, cannot be
included in the data warehouse but which is
required for some reports. Typical examples of this
kind of data include financial data in proprietary
accounts systems, unstructured data such as files
and email messages, and data in SharePoint lists.
Additionally, some types of reporting and analytical activity include data from external sources, such as
commercially available data services or Internet feeds.

Before you start planning the detailed design of reports, you should consider the required data sources
and determine how the reporting tools will access the data they provide. In some cases, tools can
consume the data directly from the data source, whereas in others you might need to consider
developing an ETL process that retrieves the required data into an intermediate format for reporting.

Reporting Tools
The Microsoft BI platform includes a wide range of
tools and technologies for reporting and analysis,
each with its own set of capabilities and
restrictions.

SQL Server Reporting Services


SQL Server Reporting Services offers a wide range
of reporting functionality and report authoring
environments, including:

 Report Designer. A professional report authoring environment in SQL Server Data Tools.

 Report Builder. A self-service report authoring environment that can be used by business users to
create reports.

 Power View. An interactive data visualization environment in a SharePoint site that can be used to
create graphical data elements from tabular data models.

 Report Server. A platform for publishing and delivering reports.

Microsoft Excel
Excel is a commonly used application in most organizations, and provides a rich, mature environment for
all kinds of data analysis and reporting. Excel is a comprehensive spreadsheet tool that you can use to
create tables of data, including complex calculations and lookups, and apply rich formatting, including
conditional formats that highlight meaningful data values. The key to the value of Excel as a BI reporting
tool is its built-in support for importing data from a wide range of data sources, including relational
databases and analytical data models in SQL Server Analysis Services. Additionally, you can install the
Excel Add-in for the Windows Azure data market, and import data from a wide range of data services. You
can use the data in your spreadsheet as a source for a wide range of charts and visualizations, including
common charts such as bar, line, and pie charts, as well as inline sparklines, data bars, and indicators.

Excel functionality for BI reporting includes:

 PivotTables and PivotCharts. Excel is commonly used as an interactive interface for exploring
analytical cubes provided by multidimensional and tabular data sources. The built-in PivotTable and
PivotChart capabilities make it easy to aggregate and filter measures by the dimensions in a cube,
and create related tables and charts. Additionally, data analysis tools such as slicers and timelines
make it easy to filter analytical data.

 PowerPivot for Excel. In addition to consuming data models from SQL Server Analysis Services, users
can use PowerPivot for Excel to create their own tabular data models. PowerPivot is an add-in that is
provided with Excel 2013, and is available as a download for earlier versions of Excel. Users can use
PowerPivot to create a tabular model that is saved with the Excel workbook, and which can be
published to SharePoint for use by other business stakeholders.

 Power View. In Excel 2013, users can create Power View visualizations from tabular models in an
Excel workbook, using the same intuitive graphical Power View interface as Reporting Services
provides in SharePoint.

 Data Mining add-ins. With the SQL Server Data Mining add-ins for Excel, users can create and use
data mining models in an Analysis Services instance to analyze tables of data in an Excel spreadsheet.

PerformancePoint Services
PerformancePoint Services is a component of SharePoint Server that provides business analysis
functionality. With PerformancePoint Services, you can use the built-in Dashboard Designer to create:

 Data connections to a wide variety of corporate data sources, including relational databases and data
models in SQL Server Analysis Services.

 Reports that show charts or other business performance visualizations.

 Key performance indicators (KPIs) that compare business performance metrics to targets.

 Scorecards that tabulate KPIs.


 Dashboards that include reports and scorecards.

SharePoint Server as a BI Delivery Platform


Regardless of the tools you decide to use to create
reports and enable user-driven data analysis, you
can use SharePoint Server as a platform to deliver
the reports and enable collaborative data
exploration. You can use SharePoint to deliver
reports and analyses in the following ways:

 SQL Server Reporting Services integration.


You can deploy Reporting Services in native
mode; in which case, it provides its own web-
based interface for managing and viewing
reports. Alternatively, you can install
Reporting Services in SharePoint Integrated
mode; in which case, report management and browsing functionality is integrated into the SharePoint
environment. The integration with SharePoint makes it easier to manage user authentication and
permissions consistently across Reporting Services reports and other resources in SharePoint, and
enables some features of Reporting Services that are not available in native mode, including Power
View and data alerts.

 Excel Services. SharePoint Server includes Excel Services, which provide an interactive Excel interface
in a web browser. By publishing Excel workbooks and PowerPivot data models in a SharePoint site,
you can make them available to users across the enterprise, even when Excel is not installed on their
computers. When combined with Excel Services in SharePoint Server, Excel provides a comprehensive
BI reporting tool for a variety of scenarios.

 PowerPivot for SharePoint. PowerPivot for SharePoint is built on Excel Services, and enables
business users to publish Excel workbooks that contain PowerPivot tabular data models to a
SharePoint site, where other users can view them and use them as data sources for their own
interactive analysis in Excel and Power View.

 Power View. When SQL Server Reporting Services is installed in SharePoint Integrated mode, users
can use Power View to create interactive visualizations of tabular models in PowerPivot workbooks or
Analysis Services databases in a browser, and save their Power View reports in SharePoint.
 PerformancePoint Services Content. PerformancePoint Services content such as KPIs, scorecards,
and dashboards can be delivered only in a SharePoint site.

 Business Intelligence Center. You can enable individual BI functionality in any SharePoint site, but it
can be more effective to use the built-in SharePoint Server Business Intelligence Center template to
create a BI portal. You can create a site or subsite based on this template, which includes all of the
necessary components for PerformancePoint Services content. Then, you can add a document library
for Reporting Services reports and PowerPivot Gallery for PowerPivot and Power View reports to the
site for a complete BI delivery solution.

Note: Considerations for using SharePoint as a BI delivery platform are discussed in detail
in Module 9: Planning a SharePoint Server BI Solution.

Lesson 2
Common Reporting Scenarios
Although the business requirements in every organization are different, there are some reporting
scenarios that are commonly required. Not every BI solution requires all of these scenarios, but it is useful
to be able to map business requirements to these scenarios and plan their implementation accordingly.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the key characteristics of formal reports.

 Describe the key characteristics of data exploration and analysis.

 Describe the key characteristics of analytical data mashups.

 Describe the key characteristics of scorecards.


 Describe the key characteristics of dashboards.

Formal Reports
Most corporate environments include a
requirement for formal reporting. Formal reports
are a traditional way for business users to view key
information about the business, and to
communicate business results to external
stakeholders such as shareholders and partners.
Typically, a formal reporting solution has the
following characteristics:

 Reports are produced at regular intervals; for example, a monthly sales report.

 Each report is designed to communicate specific information for a specific audience.

 Reports are based on managed corporate data sources, such as a data warehouse or a financial
accounts system.

 Reports are formally structured with limited interactivity, such as expanding summarized values to
reveal details.

 Reports are often distributed in document formats, such as Microsoft Word, PDF, or as a Microsoft
Excel workbook.

 Reports are often designed to be printed.



Data Exploration and Analysis


In addition to formal reports, a BI solution can
empower users to perform interactive exploration
and analysis of data. Usually, this analysis takes the
form of “slicing and dicing” data by summarizing,
filtering, and pivoting key measures across
different dimensions. Analysis often explores
business data in more depth than formal reporting
to uncover relationships or patterns in the data.
For example, a monthly sales report might show
the amount of revenue generated by different
sales regions, and a business user might use this as
a starting point to explore different features of
sales orders for the last year to identify trends in sales performance across regions, or to break down the
orders by customer characteristics such as gender or marital status to better understand the profile of the
company’s customers.
Typical characteristics of data exploration and analysis scenarios include:

 The analysis is performed by users who are familiar with the business measures and dimensions, and
can interpret the findings in context.
 The data for the analysis is provided by a data warehouse (or a departmental data mart), often
exposed through an analytical data model.

 The analyst might have a specific goal or hypothesis in mind, but the analytical activity is unstructured
and as unconstrained as possible within the limitations of the available data, the data model, and the
analytical tool used.

 Users often want to be able to present or communicate their findings to colleagues and managers.

Analytical Data Mashups


Analytical data mashups are really an extension of the data exploration and analysis scenario discussed in the previous topic. The key difference is that in addition to using data models as a basis for analysis of corporate data, a data mashup incorporates other sources of data to provide a wider analytical context.

For example, in an organization that is planning to expand internationally, a business analyst might start by exploring corporate data showing sales revenue and customer demographics for the geographical regions where the company already has a presence. This analysis can then be extended to include data from an external data service that provides population statistics and socio-economic indicators for the countries under consideration for the expansion. Additionally, the geographic data could be visualized on mapping data from a service such as Bing Maps.
Typical characteristics of an analytical data mashup scenario include:

 The analysis is based on a combination of managed corporate data and data from other, often
external, sources.

 The users performing the analysis are familiar with advanced data access techniques and can acquire,
cleanse, and restructure the data they need with minimal IT support.

 The conclusions of the analysis might be shared with colleagues and managers, but the activity is not
a regular part of business operations and periodic distribution of the results is not required.

Scorecards
In every business, there are key metrics that are
considered to be indicators of the overall health
and performance of the company. Many of these
indicators are financial measures, such as revenue
and profit, but others might include figures
related to employee turnover, manufacturing
output, the number of customers, or any other
measurable aspect of the company that is
considered important to its success. It is common
practice to track key performance indicators (KPIs)
for these metrics, and measure them against
targets or performance in previous periods.
The KPIs for a particular area of the business, or the business as a whole, can be combined to create a
scorecard that enables business stakeholders to measure overall performance. A scorecard shows an
indicator for each important metric, and enables users to drill-down into the individual areas of the
business that contribute to the overall score. Additionally, the individual KPIs can be weighted to reflect
their relative importance in terms of measuring business performance.

Typically, a scorecard:
 Shows a score for the KPIs that are considered to reflect the performance of the business, or a
business area.

 Enables users to see overall performance at a glance, and drill-down into a particular KPI to see the
contributory scores for that indicator.

 Is created by a BI specialist developer based on specific business requirements.

Dashboards
A dashboard provides a high-level overview of
performance for a business or a business area by
showing a combination of related indicators,
charts, and other data elements. By bringing
together top-level business performance
information into a single view, a dashboard
provides a way for business managers to get an
overall view of how the business is performing at a
glance, and quickly identify any areas of concern
that require further exploration.

Typically, a dashboard:

 Is hosted in a web portal, and consumed in a browser, or embedded in a business application.

 Includes a mixture of different data visualizations that convey related high-level business information
in a way that can be quickly understood.

 Provides interactive filtering, so that selecting a data value in one element of the dashboard filters the
data shown in the other dashboard elements.

 Is used as a primary means of business performance monitoring that makes it easy to identify areas of
concern that require further exploration.

 Is created by a BI specialist developer based on specific business requirements and priorities provided
by stakeholders.

Lesson 3
Choosing a Reporting Tool
The common reporting scenarios can all be addressed by multiple reporting tools. However, some tools
are more appropriate for some scenarios than others. This lesson explores the reporting tools available
from Microsoft, and their suitability for the reporting scenarios described in the previous lesson.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe how SQL Server Reporting Services supports common reporting scenarios.

 Describe how Microsoft Excel supports common reporting scenarios.

 Describe how PerformancePoint Services supports common reporting scenarios.

 Determine appropriate tools for specific reporting scenarios.

SQL Server Reporting Services


Reporting Services can be used in each of the
common reporting scenarios described in the
previous lesson, with the following restrictions and
considerations:

 Formal reports. Reporting Services was originally designed as a platform for creating and delivering
formal reports, and provides report authoring tools for BI professionals as well as business users. Its
support for a wide range of data sources, including relational databases and analytical data models,
combined with its rich formatting, data layout, and charting capabilities, enables you to easily implement
complex reports that can be rendered in multiple formats and printed. Additionally, because Reporting
Services includes a platform for publishing and distributing reports, you can easily deliver formal reports
on a scheduled or on-demand basis.

 Data exploration and analysis. When deployed in SharePoint Integrated mode, Reporting Services
includes Power View, which provides a simple, intuitive interface for graphical data exploration.
Users can create comprehensive visualizations of data from tabular data models and save them in a
SharePoint site for others to view, or export them to Microsoft PowerPoint for presentation to
colleagues, managers, or other business stakeholders.

 Analytical data mashups. Report Designer and Report Builder support a wide range of data sources,
but they offer limited interactive analytical capabilities. Power View provides a high level of
interactivity, but it is limited to the data in the underlying tabular data model. To use Power View in a
SharePoint site as a tool for data mashups, business users would need to use PowerPivot for Excel to
create their own tabular data models with all of the data they need.

 Scorecards. Report Designer and Report Builder both enable report authors to visualize comparative
data measures by using indicators. By creating and publishing a report that shows indicators for key
business metrics, you could use Reporting Services to create a scorecard. However, for complex
scorecards, other KPI visualization tools are generally easier to use than Reporting Services.

 Dashboards. You can create a report that shows summarized data in tables and matrices, charts,
indicators, data bars, sparklines, and gauges. You can then publish this report as a dashboard.
Reporting Services supports drill-down aggregation, and actions such as links to reports or other
online resources, so you could use Reporting Services to create reasonably interactive dashboards.
However, it is difficult to incorporate inter-element filtering in a report, and other tools might provide
a more flexible solution for your particular dashboard requirements.

Note: Considerations for using Reporting Services in a BI solution are discussed in detail in
Module 7: Designing a Reporting Services Solution.

Microsoft Excel
You can use Excel in the following reporting
scenarios:

 Formal reports. Excel provides comprehensive formatting and charting capabilities, which you can use to create
formal reports that can be distributed as Excel
workbooks, printed, or saved in PDF format.
You can distribute Excel workbooks by
publishing them to a SharePoint site, where
users can download and view or print them;
or you can send them by email. Excel can
provide a suitable reporting solution for
organizations that require the distribution of only a few reports that can be manually created and
refreshed; however, when multiple reports must be regularly distributed to multiple audiences, a
platform such as Reporting Services provides a more manageable solution.

 Data exploration and analysis. Excel provides a comprehensive solution for user-driven analysis and
data exploration. The ability to connect to a data model and create PivotTables and PivotCharts
makes data exploration easy, and the addition of PowerPivot and Power View means that
sophisticated business users can create their own data models and visualizations in a familiar tool.

 Analytical data mashups. The ability to import data from multiple data sources, including data
services in the Windows Azure data market, makes Excel a powerful tool for creating data mashups.
Even when a connection to a particular data source can’t be created, Excel makes it easy to import
data from text files and XML documents, and in many cases, users can simply paste data from the
clipboard. Users can use the built-in functionality of Excel to filter and cleanse the data before
incorporating it into a PowerPivot tabular model for inclusion in analysis.
 Scorecards. Excel provides built-in support for graphical indicators based on KPIs defined in
multidimensional or tabular data models, and it can be used to create a spreadsheet-based scorecard.
You can then publish the workbook that contains the KPIs to SharePoint, where users can view them
in Excel Services. In some organizations, particularly those where Excel is commonly used for
reporting, this approach can be effective. However, comprehensive scorecards that include
hierarchical rollups and weightings can be complex to create in Excel.

 Dashboards. You can use all of the data elements in Excel, including PivotTables, PivotCharts, slicers,
timelines, and indicators to create a spreadsheet-based dashboard, which you can publish to
SharePoint for interactive viewing in Excel Services. However, you must be careful to ensure that the
data in the Excel workbook on which the dashboard is based is frequently refreshed. Additionally, the
Excel environment provides a rich data analysis tool, but it is not designed for online viewing at a
glance. You can design the default layout of the spreadsheet to be easily viewed in a browser, but as
users interact with it, the elements can be resized and repositioned.

Note: Considerations for using Excel in a BI solution are discussed in Module 8: Designing
an Excel-Based Reporting Solution.

PerformancePoint Services
You can use PerformancePoint Services in the
following reporting scenarios:

 Formal reports. Although it is possible to create reports in PerformancePoint, they are typically not
optimized for printing or distribution. In general, Reporting Services or Excel provides a better solution
for formal reporting.

 Data exploration and analysis. PerformancePoint Services reports and scorecards support some
interactivity, but the scope for detailed data exploration and analysis is limited when compared to Excel.
 Analytical data mashups. Similarly, PerformancePoint Services typically does not offer enough
flexibility to be used as an effective tool for creating data mashups.

 Scorecards. PerformancePoint Services provides comprehensive support for scorecard authoring and
publishing, and makes it easy to define hierarchical KPIs with weighted scorecard values.

 Dashboards. PerformancePoint Services makes it easy to create dashboards that contain related data
visualizations with inter-element filtering, and because PerformancePoint Services is a component of
SharePoint Server, it is easy to create dashboards that are seamlessly embedded into the SharePoint
site.

Note: Considerations for using PerformancePoint Services as a BI solution are discussed in


detail in Module 9: Planning a SharePoint Server BI Solution.

Guidelines for Reporting Tool Choice


The choice of reporting tool depends on various
factors, including the complexity of the reports
and analyses to be created, the ability to deploy
and support the required services, and the tool-
specific skills of the users developing the reports.
All of the reporting tools can be used, with some
limitations, to implement most common reporting
requirements. However, you should consider the
following guidelines for choosing a reporting tool:

 Formal reports. SQL Server Reporting Services is the preferred tool for implementing a formal reporting solution.
Business users with limited knowledge of query syntax can use Report Builder to create reports,
particularly when a self-service reporting infrastructure including shared data sources, datasets, and
report parts has been put in place. Alternatively, formal reports can be created by professional BI
specialists using Report Designer. The Reporting Services platform can then be used to publish and
distribute reports.

 Data exploration and analysis. Excel is the most comprehensive and flexible platform for self-service
data analysis and exploration. Most business users are familiar with Excel, and can easily connect to
data sources and create PivotTables and PivotCharts or create Power View visualizations. More
advanced Excel users can use PowerPivot to create their own tabular data models or data mining
tools to apply predictive analytical models to data.

 Analytical data mashups. Similarly, Excel is the most suitable tool for analytical data mashups. Its
support for importing data from multiple corporate and external data sources, and its range of data editing
and filtering capabilities, make it a flexible tool for user-driven analysis across data sources.

 Scorecards. Generally, PerformancePoint Services is the best choice for scorecards in environments
where SharePoint Server is used to deliver BI services. You can create complex scorecards that span
business areas and apply custom weightings and threshold values to the KPIs in the scorecards more
easily than in Excel or Reporting Services, and you can integrate PerformancePoint scorecards with
other elements of a PerformancePoint dashboard. In most cases, scorecards are developed by BI
specialists.

 Dashboards. PerformancePoint Services provides the best solution for dashboards in a SharePoint-
based environment. The ability to create interactive dashboards that are closely integrated into the
SharePoint Server user experience helps you embed BI into everyday business operations. Like
scorecards, dashboards are usually created by BI specialists.

Regardless of which tools are used to implement specific reports, all reporting and analysis solutions can
be delivered through SharePoint Server.

Additional Reading: For more information about reporting scenarios and the Microsoft
tools that can be used to support them, see “How to Choose the Right Reporting and Analysis
Tools to Suit Your Style” at http://msdn.microsoft.com/en-us/library/jj129615.aspx.

Lab: Planning a BI Delivery Solution


Scenario
You are developing a BI solution for Adventure Works Cycles. You have created an initial design for the
data warehouse and analytical data models, and now you need to decide which tools you will use to meet
the reporting and analytical requirements.

Objectives
After completing this lab, you will be able to:

 Assess reporting requirements.

Estimated Time: 45 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Assessing Reporting Requirements


Scenario
You have interviewed business stakeholders to gather reporting requirements. Now you must identify the
different kinds of reporting scenario that the BI solution must support and select appropriate tools to
support them.

The main tasks for this exercise are as follows:


1. Review Reporting Requirements.

2. Plan Reporting Tools.

 Task 1: Review Reporting Requirements


 Read the lab and exercise scenarios.
 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab06\Starter


folder.

 Task 2: Plan Reporting Tools


 Collaborate with two or three other students.

 Discuss the reporting requirements in the interviews and agree on appropriate tools to support them.

 Document your decisions in the Reporting Requirements.docx document in the


D:\Labfiles\Lab06\Starter folder.

Results: At the end of this exercise, you will have a reporting requirements document that lists the
reporting scenarios that the BI solution must support, and the tools that you plan to use.

Question: Review Reporting Requirements in the D:\Labfiles\Lab06\Solution folder. How


do the recommendations in this document compare to the decisions you made in the lab?

Module Review and Takeaways


Module Review
This module described common reporting scenarios and the Microsoft reporting tools that you can use to
support them. By recognizing common types of reporting scenario, you can identify the most appropriate
tool to support it for your specific business environment, and plan the infrastructure, data warehouse, and
analytical models accordingly.

Question: How does the inclusion of a requirement for self-service BI influence the choice of
data tools?

Module 7
Designing a Reporting Services Solution
Contents:
Module Overview 7-1

Lesson 1: Planning a Reporting Services Solution 7-2

Lesson 2: Designing Reports 7-9

Lesson 3: Planning Report Consistency 7-15

Lab: Designing a Reporting Services Solution 7-18

Module Review and Takeaways 7-25

Module Overview
Microsoft SQL Server Reporting Services provides a scalable, versatile platform for reporting solutions. If
your Microsoft-based business intelligence (BI) solution requires formal reporting, it is likely to include
Reporting Services. When planning your BI solution, you must consider how Reporting Services will affect
the overall infrastructure requirements and design, and how you will use it to deliver the reports required
to support the business requirements.

Objectives
After completing this module, you will be able to:

 Plan a Reporting Services solution.


 Design reports.

 Plan report consistency.



Lesson 1
Planning a Reporting Services Solution
Before you start designing reports, you must consider the business requirements and plan the Reporting
Services implementation. This lesson describes some key considerations and guidelines that will help you
plan a successful reporting solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe how Reporting Services affects overall BI project planning.

 Identify report requirements.

 Choose a suitable Reporting Services deployment mode.

 Plan report delivery.


 Plan security.

 Plan report navigation.

Reporting Services in a BI project


Reporting Services is a core part of the Microsoft
BI platform, and is commonly used in BI solutions.
When undertaking a BI project, you need to plan
and design two aspects of the Reporting Services
implementation:
 Reporting Services infrastructure.

 Reports and report items.

Reporting Services infrastructure


Designing a Reporting Services infrastructure
involves planning the server topology for the
components of Reporting Services, which include
the report server and the report server databases as well as integration with a SharePoint Server farm.
Additionally, Reporting Services can use network resources such as file shares and email servers, so the
infrastructure design must consider authentication and network connectivity to those resources.

The team members who are responsible for designing Reporting Services infrastructure should include
someone with knowledge of Reporting Services architecture and experience with Reporting Services
component installation and configuration. Additionally, the team should include IT personnel with
knowledge of the network and domain environment into which Reporting Services will be deployed. If
Reporting Services is to be deployed in SharePoint Integrated mode, a SharePoint Server specialist should
also be involved.

Reports and report items


In addition to the infrastructure for Reporting Services, a BI project requires someone with the necessary
skills to determine the reports that are required, the data that is required to support them, and the visual
design and layout for the reports. Additionally, consideration must be given to reusable reporting
elements, such as datasets and charts, and the rendering formats and delivery channels that will be used
to deliver the reports to business users.

Typically, team members involved in designing reports and report items include a database administrator
or developer who is familiar with the data warehouse and other data sources; a specialist report designer
who can translate user requirements into report designs; and business users who can articulate and clarify
report data and structure requirements, determine report format and interactivity requirements, and
provide feedback on report mockups, prototypes, and drafts.

Identifying Report Requirements


Typically, the business requirements for the BI
solution include some details of the reports that
are required. When planning a Reporting Services
solution, you must start with these business
requirements and determine how they will affect
the design of the reports, report items, and the
Reporting Services infrastructure required to
support them. Key considerations for reports
include:

 What reports are required?

 What data sources are required to support the


reports?

 Who are the audiences for the reports?

 How and when will reports be delivered?


 What render formats are required?

 Is there a requirement for self-service reporting?

What reports are required?


Perhaps the most obvious consideration when planning a reporting solution is the set of reports that are
required. Often, the report requirements are a key driver for the entire BI project, and most business users
have a set of standard reports that they want the BI solution to generate. When designing a reporting
solution, start with the business requirements you have gathered, and list the reports that you can identify
from them. After that, work with business stakeholders to check, consolidate, and prioritize the list to
identify discrete requirements that can be met with the same report and remove reports that are
impractical and provide little business value.

Be aware that in most business environments, requirements can change over time. It is not uncommon to
start a BI project with a list of required reports, and after the data warehouse and data models required to
support them are in place and users have started using them, learn that new reporting requirements
emerge because of the insights gained from the BI solution. Therefore, you need to consider extensibility
when planning data sources, datasets, and reporting folder structures.

What data sources are required to support the reports?


In most enterprise BI solutions, formal reporting is based on either the data warehouse or on analytical
data models. However, there may be a requirement to include data from other sources, and you should
identify these as early as possible.

After identifying the data sources, you can start to consider the specific queries that will be required to
retrieve the reporting data. In some cases, you might find that similar datasets can be used to support
multiple reports. For example, a monthly sales report might include the same columns as a quarterly sales
report, thereby enabling you to use parameters to filter the rows returned by the dataset, depending on
the report being rendered.

Who are the audiences for the reports?


After you identify the initial set of reports that must be created and the data sources required to support
them, map the reports to the audiences that will consume them. Some reports may be specifically
designed for a single business user, or a small group such as an executive board. Others might be broader
in nature and be designed for the entire workforce. In many cases, the reporting requirements include
reports for external consumers, such as customers, partners, or shareholders. It can be useful to create a
matrix that shows which reports are required by which groups of users.

How and when will reports be delivered?


Reporting Services provides various options for delivering reports. The simplest solution is to use a pull
delivery model in which users view the reports on-demand in a browser, from where they can print or
export them if required. Alternatively, you can use a push delivery model in which reports are sent to users
as an email attachment or deposited in a file share or SharePoint document library on a scheduled basis.
When planning your solution, you should identify the ways in which users would prefer to consume the
reports and extend the matrix of reports and user groups to include details of how the reports should be
delivered. Bear in mind that in some scenarios, users might want to receive regular reports by email and
view them on-demand.

Even in scenarios where reports will be viewed on-demand, you should determine when the reports
should be refreshed with new data. In some cases, users want to see the very latest data from the sources
when they view the report, but in other cases the reports might only need to reflect data for a specific
time period. Understanding this requirement enables you to plan the use of caching or snapshots to
optimize report rendering performance and reduce the workload on data sources.

What render formats are required?


You can use Reporting Services to deliver reports in a variety of formats, including HTML, PDF, image, Microsoft Word, and Microsoft Excel. The rendering architecture is extensible, so you can also use third-party or custom rendering extensions if required. Reporting Services renders reports in each format as closely to the source design and layout as possible, but the significant differences between some formats mean that anything other than the simplest reports will render differently in each format. Therefore, it is helpful to determine the required render formats when planning the reporting solution so that the report designs can be optimized for the target formats.

Is there a requirement for self-service reporting?


Until now, you have considered the reports that must be created by BI developers as part of the BI
solution. However, it has become common for organizations to require self-service reporting, in which
business users can take advantage of the BI infrastructure and data to create their own reports. Early in
the planning of a Reporting Services solution, you should determine whether support for self-service
reporting is required, and identify the users who will take advantage of it. Based on the query language
and report design expertise of the self-service reporting users, you can then identify the folder structures,
shared data sources, shared datasets, and report parts that will be required to empower these users to
create useful reports with minimal support from IT.

Reporting Services Modes


A significant factor in designing the infrastructure
for a Reporting Services solution is the choice
between native mode and SharePoint integrated
mode deployment. You can install Reporting
Services in either of these modes depending on
the user and management experience you want to
provide.

Native mode
In native mode, Reporting Services provides a
report server that runs as a service on a Windows
server. This service uses SQL Server databases to store the report catalog and temporary objects, and provides user and administrative access through a web-based tool named Report Manager. Additionally, the report server configuration is managed through the dedicated Reporting Services Configuration Manager tool.
Reports are organized into folders and secured based on permissions applied to Reporting Services roles.
Administrators can create and manage roles and permissions in Report Manager.

In native mode, users can use a special My Reports folder to manage their own set of reports. They can
also subscribe to reports to have them delivered by email or to a file share. Additionally, administrators
can create data-driven subscriptions that deliver reports to a collection of users based on a predefined set
of delivery and format preferences.

SharePoint Integrated mode


In SharePoint Integrated mode, Reporting Services is installed as an application service in a SharePoint
farm. Users access reports and other reporting objects in SharePoint document libraries, and report item
properties are also managed in the SharePoint Server interface. All configuration of the report server is
performed in the SharePoint Central Administration site.

The reports in document libraries can be organized into folders, and permissions on these items are
applied to SharePoint users just as they are for any other document in a SharePoint library. As in native mode, users and administrators can create subscriptions for scheduled delivery of reports. Additionally, in SharePoint Integrated mode, users can subscribe to data alerts, which automatically send notifications when the source data for a report changes. However, there is no support for the My Reports folder in SharePoint Integrated mode.

Considerations for choosing a mode


The most obvious factor in deciding between native and SharePoint Integrated mode is the presence of a
SharePoint farm in the environment where the reporting solution is to be used. If users are familiar with
the SharePoint interface, it makes sense to integrate reporting into that interface. The lack of a My
Reports folder in SharePoint Integrated mode is a consideration if you have an existing Reporting
Services solution in which that feature is used, or if it is critical to the business requirements, but this is
rarely the case.
In environments where SharePoint Server is not used, or in small, departmental reporting scenarios, native
mode provides a highly usable and manageable platform for report delivery. Additionally, even when
reports will be deployed to a production report server in SharePoint Integrated mode, native mode can
still be used effectively by report developers and user acceptance testers.

Planning Report Delivery


Based on the reporting requirements you
gathered earlier, you must plan the delivery of
reports to the business users who need them. In
many scenarios, users will consume reports by
browsing to them in Report Manager or a
SharePoint document library. However, it is a
common requirement for reports to be delivered
automatically on a scheduled basis through
subscriptions, most commonly by email.

If you intend to use subscriptions for report delivery, your planning must include the following considerations:

 Configuration settings for the email server. You must determine the Simple Mail Transfer Protocol (SMTP) server that will be used to relay reports, and configure Reporting Services to use it in Reporting Services Configuration Manager or the SharePoint Central Administration site.
 The types of subscription to be supported. Reporting Services supports standard subscriptions,
which can be defined by individual users; and data-driven subscriptions, which must be defined by an
administrator. Empowering users to create their own subscriptions enables them to manage their own
report delivery to suit their own requirements, but it reduces the level of control the administrator has
over the subscriptions being created. In an environment where a large number of users are creating
subscriptions that run on multiple schedules, the workload on the report server can increase
dramatically. If data-driven subscriptions are to be created, you must determine which reports need
to be delivered to which users, when, in what format, and with what parameter values. Additionally,
data-driven subscriptions are only available in SQL Server 2012 Enterprise and Business Intelligence
editions.

 Data source credentials. Subscriptions require that the credentials used to retrieve report data from
data sources are stored on the report server.

Planning Security
Reports can contain sensitive data, and security is
a primary concern in many reporting
environments. When planning report security, you
need to consider how to secure access to the
reports themselves, and also how to secure access
to the data sources that provide the data for the
reports.

Report security
The specific steps for configuring report security
depend on whether Reporting Services is
deployed in native mode or SharePoint Integrated
mode. However, the planning considerations for
securing reports are the same, regardless of the deployment mode. The basic principle to apply when
planning security for reports is the same as for any resource:

1. For each resource or group of related resources, determine the different levels of access required (for
example, read, view definition, or manage). This enables you to identify the required roles or groups.

2. Identify the users who need to access the resources and the level of access required. This enables you
to determine the membership of the roles or groups.

You should use the matrix of reports and user audiences that you compiled during the requirements
gathering phase as a starting point for determining the required permissions. For each report, group the
users who need access, and then identify reports with common audiences. From this information, you can
determine the roles that are required and the users who need to be members of those roles.

Note: In an enterprise environment, it is normal to assign membership of resource-specific


security roles to Windows groups, not to individual Windows user accounts. This reduces the
need to manage role membership after the initial roles are defined. To achieve this level of
abstraction, you will need to work with a Windows domain administrator to determine what local
Windows groups need to be created, and which domain groups should be added to them.

Data source security


If users will always consume reports by browsing to them in Report Manager or SharePoint Server, you
can use Windows Integrated authentication or user-provided credentials to pass the user’s identity to the
data source. This approach ensures that any auditing in the data source reflects the identity of the user
consuming the report. Depending on the location of the data sources, you may need to configure
Kerberos authentication to enable delegation of user credentials across multiple server connections.

If your reporting solution supports subscriptions, caching, or snapshots, you cannot use Windows
Integrated authentication or an interactive prompt for credentials, and you must store a single set of
credentials on the report server to be used when accessing the data source.

Planning Report Navigation


If users will browse reports in Report Manager or a
SharePoint Server document library, you must give
some consideration to the organization of reports
into folders. The folder structure should be
designed to:
 Provide an intuitive navigation structure that
enables users to find reports easily.

 Simplify management of reports and related items.

Factors that affect the design of the reporting folder structure include:

 Security. By arranging reports and items into folders based on security requirements, you can easily
apply the same security settings to a group of related reports by setting permissions at the folder
level. You can take advantage of permission inheritance through a hierarchy of folders.

 Item types. In addition to reports, a reporting solution includes items such as data sources, shared
datasets, and report parts. Often, it makes sense to keep items of the same type together, because
this makes it easier for report developers, administrators, and self-service report authors to find and
manage them.

 Audience. In some BI solutions, the overriding organizational principle for reports is the audience
that will consume them. For example, a BI solution might include reports for multiple groups of users,
such as executives, sales managers, and production engineers. If each audience requires a discrete set
of reports, it makes sense to organize the reports into folders that reflect the audiences.

 Business areas. Another way to organize reports is by the area of the business that the reports relate
to. For example, a BI solution might include reports that show information about sales, financial
accounts, and production. In this case, it might be logical to organize the reports so that all of the
sales-related reports are stored in one folder, the financial reports in a second folder, and the
production reports in a third folder.

Lesson 2
Designing Reports
After you plan the reporting solution as a whole, you can focus on the specific reports that must be
created. This lesson describes some key considerations and guidelines for designing reports.

Lesson Objectives
After completing this lesson, you will be able to:

 Design datasets.

 Describe considerations for printed reports.

 Describe considerations for interactive reports.

 Describe considerations for graphical reports.

 Describe considerations for report rendering.


 Describe considerations for using data models as sources for reports.

Designing Data Sets


After creating data sources, the first step in
designing a report is to create the datasets that
contain the data for the report. When designing
datasets for reports, consider the following
guidelines.

Include only the data needed in the report
Retrieving more data than is required places
unnecessary stress on data sources and network
bandwidth, and can increase memory
consumption on the report server. To minimize
dataset size, you should:

 Include only the columns that are required for display, sorting, or grouping in the report. For
example, avoid using a SELECT * statement, and instead list only the columns that are actually
required.

 Include only the rows that are required for the detail level of the report. For example, if you are
creating a report to show sales performance for each product, grouped by category, there is no need
to include the individual order rows. Instead, use a query with a GROUP BY clause that aggregates the
orders by product and category so that the product-level aggregation in the query is the detail level
of the report.

 Use parameters to filter data in the query. You can apply filters at the report level, or in the query
used to retrieve the data. Generally, it is inefficient to retrieve all of the data from the data source and
then filter in the report. Most business reports are scoped by a specific time period, such as month,
quarter, or year, so most reports should have time-based parameters that restrict the number of rows
retrieved.

Note: In some scenarios, such as when using snapshots or cached datasets, you might
choose to retrieve a larger volume of data than is needed by a single user of the report or
dataset, and then use report-level filters to generate reports at the required scope. In this
scenario, the performance overhead of the initial data retrieval is compensated for by a reusable
dataset or report, which minimizes the need for subsequent requests to the data source.

 Define restrictive default parameter values. Define default parameters to return a minimal number
of rows based on common usage of the report. For example, a sales report might be used to view
sales by month, quarter, or year based on parameters for the start and end of a time period range. By
setting the default values of these parameters to the shortest time period that is commonly used, you
can reduce unnecessary data retrieval. A common technique for applying minimal time period
parameters across multiple reports is to create a shared dataset that retrieves a range of commonly
used date values, and to use that dataset as the source for default and available parameter values for
all time-filtered reports. For example, the following query could be used to return values for:

o The first day of the current year.

o The first day of the previous month.


o The last day of the previous month.

o The first day of the current month.

o The current day.

SELECT DATEFROMPARTS(YEAR(GETDATE()), 1, 1) AS FirstOfThisYear,
       DATEADD(d, 1, EOMONTH(DATEADD(m, -2, GETDATE()))) AS FirstOfLastMonth,
       EOMONTH(DATEADD(m, -1, GETDATE())) AS LastOfLastMonth,
       DATEADD(d, 1, EOMONTH(DATEADD(m, -1, GETDATE()))) AS FirstOfThisMonth,
       CAST(GETDATE() AS Date) AS Today;

Sort data in the query


Although you can sort data in a data region in a report, it is generally more efficient to sort the data in the
query by using an ORDER BY clause.

Filter on the date key


As previously stated, most business reports are filtered by a range of dates. In a data warehouse, the fact
table is commonly related to a time dimension table by an integer key, and query performance can be
improved by specifying the integer key value in the WHERE clause of the query instead of using a join to
the dimension table.
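
For example, a dataset query along the following lines draws several of these guidelines together: it aggregates to the detail level of the report with a GROUP BY clause, filters on the integer date key by using the report's Date/Time parameters, and sorts the results with an ORDER BY clause. This is only a sketch; the table and column names, and the assumption of a yyyymmdd surrogate date key, are illustrative and will differ in your own data warehouse.

-- Aggregate sales to the product level, filtered on the integer date key (yyyymmdd format assumed).
SELECT p.ProductCategory,
       p.ProductName,
       SUM(f.SalesAmount) AS TotalSalesAmount
FROM dbo.FactResellerSales AS f
JOIN dbo.DimProduct AS p ON f.ProductKey = p.ProductKey
WHERE f.OrderDateKey BETWEEN CAST(CONVERT(char(8), @StartDate, 112) AS int)  -- style 112 = yyyymmdd
                         AND CAST(CONVERT(char(8), @EndDate, 112) AS int)
GROUP BY p.ProductCategory, p.ProductName   -- product level is the detail level of the report
ORDER BY p.ProductCategory, p.ProductName;  -- sort in the query rather than in the data region

The CONVERT expressions (style 112) translate the parameter values into the yyyymmdd integer format of the date key, so the filter is applied directly to the fact table without joining to the time dimension table.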

Use views instead of base tables


If the data warehouse has been designed to include a layer of views on top of the base fact tables and
dimension tables, use the views in the FROM clause of queries instead of the base tables. Typically, views
in a data warehouse use more user-friendly names for columns and the view itself than the base tables,
and they often omit metadata columns that are not required for reporting. Additionally, view definitions
often include query hints to optimize locking and performance.
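
As an illustration, a reporting view might look similar to the following sketch. The base table, column names, and the NOLOCK hint are hypothetical examples rather than a prescribed design; follow the conventions already established in your own data warehouse.

CREATE VIEW dbo.[Reseller Sales]
AS
-- Friendly column names; audit and lineage columns from the base table are omitted.
SELECT OrderDateKey     AS [Order Date Key],
       ProductKey       AS [Product Key],
       SalesAmount      AS [Sales Amount],
       TotalProductCost AS [Total Product Cost]
FROM dbo.FactResellerSales WITH (NOLOCK); -- example hint: report queries do not block data loads (reads uncommitted data)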

Consider encapsulating queries in stored procedures


If you have sufficient access to a SQL Server data source, consider using stored procedures to encapsulate
dataset queries. Stored procedures can be precompiled, and they often offer better performance than
submitting a Transact-SQL query. However, be aware that some additional programming is required to
support datasets with multi-value parameters.
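
One common approach, sketched below with hypothetical object and parameter names, is for the report to combine the multi-value parameter into a single delimited string (for example, by using an expression such as =Join(Parameters!Regions.Value, ",") for the dataset parameter value), and for the stored procedure to match each row against that string:

CREATE PROCEDURE dbo.GetSalesByRegion
    @StartDate date,
    @EndDate   date,
    @Regions   nvarchar(1000)  -- comma-separated list built by the report, for example 'Europe,Pacific'
AS
BEGIN
    SET NOCOUNT ON;
    -- Match each row's region against the delimited list (assumes region names contain no commas).
    SELECT SalesRegion,
           SUM(SalesAmount) AS TotalSalesAmount
    FROM dbo.RegionalSales
    WHERE OrderDate BETWEEN @StartDate AND @EndDate
      AND ',' + @Regions + ',' LIKE '%,' + SalesRegion + ',%'
    GROUP BY SalesRegion;
END;

This pattern assumes that the individual parameter values contain no delimiter characters; a more robust alternative is to split the string into a table by using a user-defined function.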

Considerations for Printed Reports


Reports are often printed for physical distribution.
If your BI solution includes reports that are
intended to be consumed in hard-copy format,
consider the following guidelines:

 Determine the required page size and configure the report properties
appropriately. If the report will be printed in
multiple geographical areas that use different
paper size standards, consider creating
multiple versions of the report. For example,
you might create a US Letter sized report for
printing in the United States and an A4 sized
version of the same report for printing in the United Kingdom.

 Define pagination. Reporting Services provides fine-grained control over pagination behavior for
reports, including the ability to force page breaks before or after data regions and groupings, and to
repeat column headers on each new page. Use these settings to ensure that the report is easy to
consume when spread across multiple physically printed pages.

 Include page numbers in the report header or footer. Multipage reports often contain many
similar-looking pages. If the report is printed on a printer that does not collate the pages, or the
printed report pages are dropped, it can be difficult to sort the pages into the correct order. Including
page numbers makes it easier to do this.
 Include the report execution time. When a printed report is distributed on a regular schedule, it
can be difficult to identify the current version. By including the date and time of the report execution
in the report, users can be sure that they are looking at the most recent data.
 Include parameter values. When viewing a printed report, the user has no way to determine the
parameters that were used to filter the data, unless the parameter values are included in the report.

 Optimize for monochrome documents. Although color printers have become increasingly
prevalent in business environments, it is common for reports to be printed on monochrome printers
or to be photocopied on a black and white photocopier. Therefore, you should avoid relying on color
to indicate specific meaning in the report, and try to use colors that are easily distinguishable when
viewed as greyscale. If the report is intended primarily for consumption as a printed document,
consider using only black and grey fonts on a white background.

Considerations for Interactive Reports


In contrast to printed reports, some reporting
scenarios require the ability for users to
interactively browse reports, often to explore data
at multiple levels of detail. When designing a
report that is primarily intended for online,
interactive consumption, consider the following
guidelines:

 Maximize the InteractiveSize property of the report to avoid pagination.

 Use color and font formatting to indicate the interactive elements and areas of the report.

 Include tooltips to guide user interactions.


 Choose the most appropriate option for parent-child navigation:

o Avoid including subreports in groups where a report includes a large number of parent groups.

o Use the Visibility property to hide child data and provide drill-down functionality where there
are a moderate number of detail rows in the report.

o Use link actions to provide drill-through functionality to parameterized child data reports where
there are a large number of detail rows.
 Avoid complex expressions in headers and footers. If an expression includes anything other than a
simple field value, it is assumed that the expression may include a reference to the TotalPages global
variable. This means that the first page of the report cannot be rendered until after the entire report
is processed and paginated.

Considerations for Graphical Reports


Reports often include charts or other graphics that
make it easy to see high-level information in a
visual format. When designing a report that
includes graphical elements, consider the
following guidelines:

 Use an appropriate graphical representation for the information being
conveyed. In addition to a wide range of
chart types, including bar charts, line charts,
scatter charts, and pie charts, Reporting
Services provides a range of data visualization
elements that can be used effectively. These
include:

o Gauges. Used to indicate a level of performance within a range of possible values. For example, a
gauge might show the revenue to date against a sales target.

o Data bars and sparklines. Used to provide graphical comparisons of values in multiple rows. For
example, a data bar could show comparative sales volumes across multiple salespeople, or a
sparkline could compare monthly sales levels across regions.

o Indicators. Used to show how a specific data value compares to a target. For example, you could
use indicators to create a scorecard report that shows how monthly sales revenue compares to
target or to the same period in the previous year.

o Maps. Used to show business data that is geographical in nature. For example, you could use a
map to show countries color-coded to indicate the level of sales in each country.

 Consider creating charts and other graphical elements as parameterized subreports or report
parts. Often, the same chart is useful in multiple reports, and you can use subreports or report parts
to create a library of reusable data visualizations that can be incorporated into any report with a
compatible dataset and appropriate parameters. For example, the dashboard shown on the slide
consists of a report that contains the title and execution time, and four subreports that each use
parameter values from the parent report to filter the graphical data.

 Avoid overcomplicating charts. Charts are most useful when they show a clear visualization of a key
data element. You can create charts that include multiple categories and series, and show many data
points; but the more detail you add to a chart, the more difficult it can be to interpret the information
that the chart is designed to convey. Additionally, be careful when selecting color palettes for charts,
because some color combinations can be difficult to distinguish.

Considerations for Report Rendering


When reports are exported or delivered by
subscriptions, consider the render formats that
must be supported. For example, business users
might export reports in Microsoft Word format, or
have them delivered as Microsoft Excel
workbooks. For simple reports with no interactive
elements, the rendered report in each format is
likely to look similar. However, more complex
reports might include visual elements or
functionality that works better in some formats
than in others. Typical causes of problems when
rendering reports in multiple formats include:

 Page size and pagination settings can significantly affect the way a report is rendered.

 Interactive functionality, such as drill-down expansion of hidden groups, is not supported in all
formats.

 Background images are not supported in all renderers, and may be displayed above or below the
data region to which they are applied instead of behind it.
 The CanGrow property of text boxes and the AutoSize property of images can cause problems in
some formats. Some renderers are more efficient if the size of text boxes and images is fixed. You can
use rectangles in reports to fix the position of text boxes and images.

To overcome these problems, you can create a version of the same report for each target format; for
example, an interactive report that includes a hyperlink to a static version for printing. However, this can
entail a lot of additional work to develop and manage multiple versions of each report. An alternative is to
design adaptive reports that modify their behavior, depending on the rendering extension being used
when the report is rendered. To help you accomplish this, Reporting Services supports the following
global variables:

 RenderFormat.IsInteractive. You can use this variable to determine if the render format supports
interactivity, such as drill-down functionality to show hidden groups.

 RenderFormat.Name. You can use this variable to determine the specific rendering extension being
used and apply format-specific settings.

For example, the following expression could be used to set the Hidden property of a group in a Tablix
data region. The group would then be hidden when rendered to an interactive format, but visible in
formats that do not support interactivity.

=iif(Globals!RenderFormat.IsInteractive, True, False)

Considerations for Data Model Sources


Most of the examples considered in this module
are based on data retrieved from a data
warehouse database. However, in many BI
solutions, reporting is performed using data
retrieved from a multidimensional or tabular data
model in a SQL Server Analysis Services database.
Reporting Services supports the use of Analysis
Services as a data source and the creation of
datasets based on MDX queries.

When using Analysis Services data models as a source for reports, consider the following guidelines:

 Edit MDX queries in a suitable query tool. Report Builder and Report Designer include a graphical
query editor that you can use to create simple MDX queries, but in many cases you will want to
modify the MDX that is generated to include additional metadata or optimize the query syntax. You
should use the query designer in Report Builder and Report Designer to create the initial query and
configure parameters, and then copy the code to a more fully featured editing tool to refine it before
importing it back into the dataset.

 Remove the NON EMPTY clause. Often, a report should include empty rows; for example, to
indicate a lack of sales of a particular product or on a particular day. To ensure that empty rows are
included in the dataset, remove the NON EMPTY clause from the MDX query.

 Let the cube perform aggregation. Instead of using a specific function such as Sum or Count in
report field expressions, use Aggregate. This ensures that the specific aggregation defined in the
cube is applied. This is particularly important when the cube includes semi-additive measures that can
be aggregated across some dimensions but not others.
 Let the cube perform sorting. Dimension attributes in a cube can be configured with a sort order based on their name, key value, or a completely different attribute in the same dimension. Avoid specifying sort expressions for groups in a report that is based on a data model; instead, rely on the sort order defined for the dimension attribute in the data model.

Lesson 3
Planning Report Consistency
Organizations commonly generate many reports, often hundreds or even thousands of discrete reports.
Making sure that the reports generated by a company are consistently formatted and structured can
result in many benefits, including:

 Easier consumption of reports because of a familiar reporting style and structure.

 A consistent, professional image through the use of branding and approved formatting.

This lesson describes some strategies for enforcing report consistency across an organization or business area.

Lesson Objectives
After completing this lesson, you will be able to:
 Create and use report templates.

 Use shared data sources and datasets to enforce consistent report data.

 Use linked reports to create consistent reports containing subsets of relevant data.
 Create a self-service reporting environment that encourages report consistency.

Report Templates
To enforce consistency, you can create a report
that includes the elements that you want all
corporate reports to include, and save it as a
template for report developers. For example, you
can create a report template that includes
expressions to display the report title, execution
time, and page numbers in the report header. You
can also apply corporate branding, such as font
formatting and images.

Creating a report template


After you create a report that includes all the
formatting and content you want to include in the
template, save it to the C:\Program Files (x86)\Microsoft Visual Studio
10.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject folder on each computer where
Report Designer will be used to develop reports.

Using a report template


To use the template in Report Designer, report developers can add a new item to the Reports folder in
Solution Explorer, and select the template from the list of available items.

Data Sources and Datasets


Another way to ensure consistency is to use
shared data sources and datasets. When designing
datasets, consider the following guidelines:

 Reuse shared datasets across multiple reports. By using shared data sources and
datasets for common report queries, you can
ensure that the data is consistent across
reports. This approach simplifies report
development by enabling an expert query
writer to create a dataset that can be used by
other report developers. Additionally, a
shared dataset can be cached independently
of the reports that use it, thereby improving report server performance.

 Standardize parameters. Another use of shared datasets is to provide a library of standard fields for
default and available parameter values. This simplifies the development of parameterized reports, and
helps provide a consistent user experience when browsing reports.

 Create templates. You can save shared data sources and datasets in the same folder as report
templates to make them available to report developers using Report Designer. Report Designer
requires that the Visual Studio project used to create the reports includes local data sources and
datasets that can be used during report development. You can use the project properties to ensure
that when the project is deployed, existing data sources and datasets on the server are not
overwritten.

Linked Reports
Sometimes you need to provide a consistent
report to multiple audiences, but with different
parameterized data in the report. For example,
you might need to create a corporate sales report
for executives showing sales across all regions, and
provide the same report to each regional sales
manager showing only the sales for the relevant
region.

An easy way to accomplish this is to create a linked report, which you can publish in a different
folder to the source report. You can then override
the default parameter values and visibility to filter
the linked report appropriately. Linked reports use the layout and format definition of the source report
with the overridden parameters and other properties, such as caching behavior, to provide multiple user
audiences with an identically formatted report that contains only the data that is relevant to each
audience.

Self-Service Reporting Consistency


When the reporting solution must support self-service reporting, you can simplify report authoring for business users while enforcing consistency by applying the following guidelines:

 Publish report templates. Report Builder does not support creating new items from a
locally installed template, but you can achieve
the same result by opening an existing report
and saving it with a different name. You could
distribute template reports to each business
user’s computer, but a more scalable solution
is to publish the template as a report in an
intuitive folder location on the report server, and apply permissions that enable business users to
open the report but not overwrite it. Business users can then be instructed to open the report
template, add the datasets and content they want to include, and save the resulting report in an
appropriate folder for self-published reports.

 Make shared data sources and datasets easy to find. Report Builder does not require local design-
time data sources and datasets, so business users should be able to easily find and select published
data sources and datasets to use in their reports. This not only simplifies report development for
business users, but it also helps ensure consistency and manageability of the queries being used to
populate reports. If some business users are sufficiently skilled in writing their own queries, they can
be empowered to create new shared datasets where required, using existing shared data sources and
datasets for default and available parameter values.

 Publish parameterized report parts. Create charts and other reusable data elements, such as
scorecards, as parameterized report parts and publish them as a library that business users can
incorporate into their reports. The parameters enable users to apply the report parts to their own
subsets of reporting data while helping ensure a consistent style for data visualizations across all
corporate reports.

Lab: Designing a Reporting Services Solution


Scenario
You have implemented a data warehouse and data model for Adventure Works Cycles, and now you must
design a solution for formal and self-service reporting. Initially, you will design a prototype reporting
solution for user acceptance testing in a Reporting Services installation in native mode, which may be
deployed to a SharePoint Integrated mode report server at a later date.

Objectives
After completing this lab, you will be able to:

 Design a reporting environment.

 Create reusable report items.

 Create consistent reports.

Estimated Time: 60 Minutes


Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Designing a Reporting Environment


Scenario
You have gathered reporting requirements from a business stakeholder, and now you must design a report server folder structure that makes it easy for business users to find the reports they need while optimizing the manageability of reports and related items.

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Review Reporting Requirements.

3. Design a Report Folder Structure.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab07\Starter folder as Administrator.

 Task 2: Review Reporting Requirements


 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab07\Starter
folder.

 Task 3: Design a Report Folder Structure


 Collaborate with two or three other students.

 Discuss the reporting requirements in the interviews and agree on an appropriate folder structure to
support them.

 Create the folders you think are needed in Report Manager at http://mia-sqlbi/reports_sql2.

Results: At the end of this exercise, you should have created folders in the report server at http://mia-sqlbi/reports_sql2.

Exercise 2: Creating Reusable Report Items


Scenario
Now that you have created folders for the reports and related items, you plan to enforce report
consistency by creating reusable report items.

The main tasks for this exercise are as follows:

1. Create Shared Data Sources.

2. Create a Report Template.

3. Create Shared Datasets.

4. Deploy Template Items.


5. Create a Chart Report Part.

 Task 1: Create Shared Data Sources


 Still working as a small group, determine the required data sources and the credentials and settings
that they should use.
 Use SQL Server Data Tools to create a Report Server project named AWReports in the
D:\Labfiles\Lab07\Starter folder.

 In the project, create shared data sources for each data source the reporting solution must support.
o Use Windows authentication for the data source credentials―you will use this form of
authentication when working in the project.

 Configure the project properties so that the data sources, datasets, and report parts in the project will
be deployed to the appropriate folders you created earlier on the http://mia-sqlbi/reportserver_sql2
report server. Reports in this project should be deployed to an appropriate folder for report templates.

 Deploy the project.


 In Internet Explorer, use Report Manager to modify the credentials of the deployed data sources for
the production environment.

 Task 2: Create a Report Template


 In the AWReports project, add a blank report named AWReport.rdl.

 Modify the report so that it matches the design specification provided by the CEO in the interviews
document:

o Add page headers and footers as required.

o The Adventure Works Cycles corporate logo is provided in the D:\Labfiles\Lab07\Starter folder.

o Assign appropriate names to the objects you add to the report.

 Task 3: Create Shared Datasets

Note: Transact-SQL scripts for the datasets you need to create in this task are provided in the
D:\Labfiles\Lab07\Starter folder.

 In the AWReports project, add the following shared datasets, which should all use the data source
for the data warehouse:

o Internet Sales By Country. This dataset should return the following fields filtered by starting and
ending ship dates that are provided as Date/Time parameters:
 Fiscal Year.
 Month Number.
 Month Name.
 Country.
 State or Province.
 City.
 Total Sales Amount.
 Total Product Cost.
The results should be grouped and ordered by the following fields:
 Fiscal Year.
 Month Number.
 Month Name.
 Country.
 State or Province.
 City.
o Reseller Sales By Region. This dataset should return the following fields, filtered by starting and
ending ship dates that are provided as Date/Time parameters, and by region, which is provided
as a Text parameter that supports multiple values and has a default value that includes the Europe,
North America, and Pacific regions:
 Fiscal Year.
 Month Number.
 Month Name.
 Sales Region.
 Sales Country.
 Sales Territory.
 Total Sales Amount.
 Total Product Cost.
The results should be grouped and ordered by the following fields:
 Fiscal Year.
 Month Number.
 Month Name.
 Sales Region.
 Sales Country.
 Sales Territory.

o Last Month. This dataset should retrieve the date of the first and last days in the month prior to
the current month. It will be used to provide default values for parameters in monthly sales
reports.
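
If you want to validate your dataset, a query along the lines of the date example shown earlier in this module returns the two required values. The script provided in the Starter folder may differ, but note that later tasks refer to the returned fields as FirstDate and LastDate:

-- First and last days of the month prior to the current month.
SELECT DATEADD(d, 1, EOMONTH(DATEADD(m, -2, GETDATE()))) AS FirstDate,
       EOMONTH(DATEADD(m, -1, GETDATE())) AS LastDate;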

 Task 4: Deploy Template Items


 View the properties of the AWReports project to verify that all items will be deployed to the correct
location, and that datasets and data sources will not be overwritten.

 Deploy the project.

 Copy the datasets and report template that you have created to the C:\Program Files\Microsoft Visual
Studio 10.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject folder.

 Task 5: Create a Chart Report Part


 Use Report Manager in Internet Explorer to view the folder you created for report parts.

 Launch Report Builder, downloading it if necessary, and use it to create a new chart report based on
the Internet Sales By Country dataset you published to the report server in the previous task.
 Create a chart that shows sales revenue by country, and format it as you like.

 Assign a meaningful name to the chart.

 If you want to preview the chart, enter any two dates within the last year as parameters.
 When you are happy with the chart, use Report Builder to publish it as a report part to the folder you
created in Report Manager. Do not publish any other objects in the project (such as parameters or
datasets).
 After the chart is published as a report part, close Report Builder, discarding the report.

Results: At the end of this exercise, you will have published shared data sources, a report template, shared
datasets, and a report part.

Exercise 3: Creating Reports


Scenario
You have created a reporting environment and published reusable report items. Now you must test the
reporting solution by creating and publishing reports.

The main tasks for this exercise are as follows:


1. Create a Report From a Template.

2. Create Linked Reports.

3. Create a Report with Report Builder.

 Task 1: Create a Report From a Template


 Use SQL Server Data Tools to create a Report Server project named Executive Reports in the
D:\Labfiles\Lab07\Starter folder.

Tip: You have previously deployed shared data sources and datasets to the report server. However, you
need to create local substitutes for these with the same names in the project to use during development.
When the project is deployed, the local versions do not overwrite the existing server versions, and the
reports in the project transparently use the versions that already exist on the server.

 In the project, create a shared data source with the same name as the one you created earlier for the
data warehouse SQL Server database:

o Use Windows authentication for the data source credentials―you will use this form of
authentication when working in the project. When you deploy the project, the data source on the
server will be used instead of the one in the project.

 Add a new item to the Shared Datasets folder in the project, based on the Reseller Sales By Region
dataset template you created earlier, and named Reseller Sales By Region.rsd:

o When the dataset is added, it is opened so you can see its source XML definition. After you close
it, re-opening it will display its properties.

 Add a new item to the Shared Datasets folder in the project, based on the Last Month dataset
template you created earlier, and named Last Month.rsd.

Tip: Now that you have created local copies of the data source and datasets on the server, you can create
a report and add references to the local data source and datasets in the project. These references will be
switched to the existing server versions when the report is deployed.
 Add a new item to the Reports folder in the project, based on the AWReport template you created
earlier, and named Reseller Sales.rdl.

 View the report data for the report, and add a reference to the shared data source you created earlier.
 Add a reference to the Reseller Sales By Region shared dataset to the report, and name it
ResellerSalesByRegion.

 Add a reference to the Last Month shared dataset to the report, and name it LastMonth.
Tip: The ResellerSalesByRegion dataset includes parameters for the start and end dates by which the
data in the report will be filtered. You will use the fields returned by the LastMonth dataset to set the
default values for these parameters to the first and last dates of the previous month, respectively.
The ResellerSalesByRegion dataset also includes a multi-valued parameter for the regions that should be
included in the report, and a default value that includes all of the regions has already been defined for
this.
 Configure the StartDate and EndDate parameters to use the FirstDate and LastDate fields from the
LastMonth dataset as default values.

 Add a table to the report, and use fields from the ResellerSalesByRegion dataset to create a report
that shows sales revenue for each territory, grouped by country and region:

o To create groups in the report, drag fields to the Row Groups pane under the report.

 Preview the report and apply formatting until you are satisfied with it.

 Configure the project properties so that:

o The project items will be deployed to the http://mia-sqlbi/reportserver_sql2 report server.

o Data sources and datasets in the project will not overwrite existing objects with the same name
on the server.

o Reports in this project will be deployed to an appropriate folder for executive reports based on
the folder structure you defined at the start of this lab.
 Deploy the project.

 Use Internet Explorer to verify that the Reseller Sales report has been deployed, and that it shows
reseller sales by region with the expected default parameters.

 Task 2: Create Linked Reports


 In Internet Explorer, create a linked report based on the Reseller Sales report you created in the
previous task:

o Name the linked report Reseller Sales - Europe and save it in an appropriate folder for access by
the regional sales manager for Europe.

o After you save the linked report, edit its properties and override the default Regions parameter
value so that the report shows only sales for Europe, and hide the Regions parameter so that
users cannot change it.

 Create a second linked report for the regional sales manager of North America, and a third for the regional sales manager of the Pacific region:

o Name the linked reports appropriately and save them in appropriate folders.

o Override the Regions parameter so that each report shows only sales for the appropriate region
and cannot be changed.

 Task 3: Create a Report with Report Builder

In this task, you will use Report Builder to create a report based on the template you saved earlier. In a
self-service reporting scenario, you can use this technique to ensure report consistency. However, Report
Builder does not include functionality for template-based authoring, so self-service authors must open an
existing report that serves as the template, and then save the modified report under a different name.

 Start Report Builder and open the AWReport template from the folder in the report server where you
deployed it in the previous exercise.

 After you open the report, save it as Internet Sales in an appropriate folder for the financial reports
that will be created by self-service reporting authors. Be careful not to overwrite the AWReport
template.

Tip: In a production environment, you could use permissions to ensure that self-service reporting authors
have read-only access to the report template, removing the risk of accidental modification.

 Insert the InternetSalesChart report part you published previously into the report:

o Search the report part gallery for “InternetSales” to find the report part, and then double-click it
to insert it.

o You may need to resize the report and chart appropriately after you have added it.

 View the report data and note that adding the report part also added the dataset on which the chart
is based, and the parameters defined in that dataset.

Tip: Unlike when using Report Designer in SQL Server Data Tools, when using Report Builder, you can add
a reference to data sources and datasets on the server directly, without having to create a local copy in a
project.

 Add a dataset named LastMonth that is a reference to the Last Month shared dataset you deployed
to the report server in the previous exercise.
 Configure the StartDate and EndDate parameters to use the FirstDate and LastDate fields from the
LastMonth dataset as default values.

 Insert a table into the report, under the chart, and use it to show revenue, cost, and profit for each
city, grouped by state or province and country.

o To calculate profit, use the following expression:


=Fields!Revenue.Value - Fields!Cost.Value

 Run the report to preview it, and then make any formatting changes you want.

 When you are satisfied with the report, save it and close Report Builder.

 Use Internet Explorer to view your report in the folder where you deployed it.

Results: At the end of this exercise, you should have created a report from a template, created a linked
report, and used Report Builder to create a report that includes a previously published report part.

Question: What were the key organizational principles you applied when designing the
report server folder structure, and what revisions did you consider when you started to create
and publish report items?

Module Review and Takeaways


Module Review
This module described considerations for planning a Reporting Services solution in a BI project. Reporting
Services is a versatile platform that supports many different kinds of reports and report delivery solutions,
but by applying the basic principles in this module, you should be able to plan a successful reporting
solution.

Question: What are likely to be the key challenges in providing a self-service reporting
solution that includes an Analysis Services data model as a data source, and how might you
overcome them?

Module 8
Designing a Microsoft Excel-Based Reporting Solution
Contents:
Module Overview 8-1

Lesson 1: Using Excel for Data Analysis and Reporting 8-2

Lesson 2: PowerPivot for Excel 8-6

Lesson 3: Power View for Excel 8-9

Lab: Designing a Microsoft Excel-Based Reporting Solution 8-11

Module Review and Takeaways 8-18

Module Overview
Microsoft Excel is used in many organizations around the world, and is familiar to most business users. By
leveraging the data analysis capabilities of Excel, organizations can empower users to explore data models
and other data sources, identify patterns in business data, and improve business decision making. As a
business intelligence (BI) specialist, you must be familiar with the data analysis functionality in Excel, and
be able to determine suitable Excel features for specific business requirements.

Objectives
After completing this module, you will be able to:

 Describe Excel features that enable users to analyze data.


 Configure and use PowerPivot for Excel.

 Configure and use Power View in an Excel workbook.



Lesson 1
Using Excel for Data Analysis and Reporting
Excel includes comprehensive functionality for displaying and analyzing data. This lesson describes some
of the ways in which users can connect to data sources from Excel and perform data analysis.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the role of Excel in a BI project.

 Describe considerations for connecting to data sources in Excel.

 Describe ways in which users can analyze data in Excel.

Microsoft Excel in a BI Project


When planning a BI solution for an environment
where users will perform data analysis, some
consideration should be given to the tools that
they will use, including Microsoft Excel. Some key
considerations for Excel in a BI environment
include:
 The version of Excel to be used. Microsoft Excel 2013 includes substantial new and enhanced functionality for data analysis and visualization, but many common data analysis activities can also be performed in previous versions.
 The corporate data sources to which users will require access. Generally, Excel users who need to
undertake data analysis are granted access to data models in Analysis Services. However, you may
also grant more experienced users, who need to create their own data models, direct access to the
data warehouse database.

 Access to external data sources and cloud services. Excel includes support for importing data from Web pages and cloud services, such as the Windows Azure DataMarket and Bing Maps. For these resources, users will require access to the Internet and, in some cases, a registered account with the data service provider.

 How users will share Excel workbooks. In some scenarios, specialist data analysis will use Excel in
isolation. However, increasingly, organizations rely on users sharing information. In an enterprise
organization, users can share Excel workbooks by publishing them on a Microsoft SharePoint Server
site and use Excel Services to view and interact with the workbooks in a web browser. Additionally,
Microsoft Office 365 enables organizations to publish Excel workbooks in the cloud and interact with
them by using Office web applications.

Note: This module focuses on the data analysis functionality of Excel. Considerations for
sharing workbooks in a SharePoint Server site are discussed in Module 9: Planning a SharePoint
Server BI Solution.

Data Sources for Excel


Excel supports a wide range of data sources from
which users can import data for analysis. The
ability to obtain data from such a wide range of
data providers makes it possible for users to create
analytical data mashups that combine data from
multiple sources. Commonly used data sources in
an enterprise BI solution include:

 Analysis Services data models. Users can connect to data models in Analysis Services
databases and use them as a source for
PivotTables and PivotCharts. Excel uses
Multidimensional Expressions (MDX) queries
to access Analysis Services data, and makes no distinction between multidimensional and tabular data
models. If you plan to enable users to access data models, you must create the required roles in
Analysis Services and apply appropriate permissions on the cube, dimensions, and cells. You should
also consider creating perspectives to simplify access to relevant cube data for specific groups of
users.

 SQL Server databases. Users can also retrieve data from SQL Server relational databases. Most
commonly in a BI solution, this type of access is performed against the data warehouse or a
departmental data mart. You must give consideration to the credentials that users will use to connect
to SQL Server and the permissions they will need. Typically, users access the data warehouse by using
Windows Integrated authentication and require only read access to the data. A common approach is
to grant read access to views instead of to base tables, which enables you to remove metadata
columns that are not required for analysis and include query hints to optimize performance and
concurrency.
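For example, the following PowerShell sketch uses the Invoke-Sqlcmd cmdlet (from the SQL Server
PowerShell module) to create an analysis-friendly view over a data warehouse table and grant read access
to an Active Directory group. The view, column, and group names are illustrative assumptions, not part of
the course lab environment.

# A minimal sketch; requires the SQL Server PowerShell module (for example, Import-Module sqlps).
# Assumes the Windows group already exists as a database user in the data warehouse.
$server = "MIA-SQLDW"
$database = "AWDataWarehouse"

# Create a view that exposes only the columns required for analysis (metadata columns are omitted).
Invoke-Sqlcmd -ServerInstance $server -Database $database -Query @"
CREATE VIEW dbo.vInternetSales
AS
SELECT OrderDateKey, ProductKey, CustomerKey, PromotionKey, SalesAmount, TotalProductCost
FROM dbo.InternetSales;
"@

# Grant read access on the view rather than on the base table.
Invoke-Sqlcmd -ServerInstance $server -Database $database -Query @"
GRANT SELECT ON dbo.vInternetSales TO [ADVENTUREWORKS\SalesAnalysts];
"@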

Other data sources that can be used to access corporate data include:

 OLE DB and ODBC databases. If the organization stores data in databases for which OLE DB or
ODBC drivers are available, users can connect to these data sources from Excel.

 Microsoft Access databases. If the organization stores data in a Microsoft Access database, it can be
imported into an Excel worksheet for analysis.

 Text files. You can import data from text files, such as comma-separated values (CSV) files. This can be
useful when a data provider is not available for corporate applications that can export data as text.

 XML files. You can import data from XML files; XML is a common export format for many
applications.

Increasingly, businesses can extend the value of their data analysis by incorporating data from sources
outside of the organization. With Excel, you can import data from the following external sources:

 Windows Azure SQL Database. The growth in popularity of cloud services has led many
organizations to use cloud-based database services such as Windows Azure SQL Database. From
Excel, Windows Azure SQL Database is accessed like any other SQL Server data source, except that
only SQL Server authentication is supported.

 Windows Azure DataMarket. Users can subscribe to third-party datasets in Windows Azure
DataMarket and use them to obtain useful data that augments corporate data models. For example, a
DataMarket feed containing historical weather statistics for specific geographical areas could be used
to analyze the effect of weather on sales in those regions.

 OData Feeds. OData has become a popular format for data feeds from applications and services
across the web. In Excel, you can connect to an OData feed and use it as a source of data for a table
or PivotTable.

Viewing Data in Excel


After you import data into an Excel workbook, you
can format it as a table and prepare it for use as a
report. Excel provides a range of data
manipulation and formatting capabilities that you
can use in your reports.

Sorting and filtering


When data in Excel is formatted as a table, you
can sort and filter the table based on the values in
each column. This makes it easy to scope the data
in a report and display it in order.

Calculated values
One of the key strengths of Excel is its ability to calculate cell values based on formulae. You can use Excel
formulae to:

 Calculate column values based on data in other columns. For example, subtract cost from revenue
to calculate profit.
 Calculate aggregate totals and subtotals. For example, use the SUM function to calculate total
revenue.

 Look up related data values. For example, use the VLOOKUP function to find a value in another
table based on a key value in the current table.

Formatting
You can apply formatting to a worksheet to improve its visual appeal, format numbers and dates
appropriately, and emphasize important values. Examples of formatting you can use in an Excel-based
report include:

 Font formatting. You can apply font formatting to alter the size, color, and font of the text in an
Excel worksheet.

 Number formatting. You can apply number formatting to display values in specific date, currency,
percentage, and other number formats.
 Cell and border formatting. You can apply colors to lines and backgrounds to make it easier to
see distinct areas of the report; for example, column headers and subtotal rows.

 Conditional formatting. You can use conditional formatting to change the appearance of cells
based on the values they contain. A wide range of conditional formatting options are provided in
Excel, including the ability to color-code cells based on value ranges, display data bars that are
relative to cell values, and add indicator icons to cells based on the values they contain.

Charts and data visualizations


Excel supports many charts and visualizations, including:

 Bar charts.

 Column charts.

 Pie charts.

 Scatter charts.

 Sparklines.

You can easily create a chart by selecting the data you want to show graphically and inserting the desired
chart type. Additionally, Excel 2013 can suggest appropriate charts based on an analysis of the selected
data.

Interactive Data Analysis in Excel


In addition to a wide range of data filtering and
formatting options, Excel provides tools for
interactive data analysis. When combined with a
data model in Analysis Services or a PowerPivot
data model in the Excel workbook, these tools
provide a powerful data analysis platform.

PivotTables
A PivotTable shows measures aggregated by
dimension attributes and hierarchies. With a
PivotTable, you can interactively:

 Aggregate data values across rows and columns of dimension attributes.
 Drill up and down dimension hierarchies to see totals and subtotals at each level.

 Filter data based on dimension attribute values.

PivotCharts
PivotTables show dimension attribute member names and measure values in a tabular matrix format,
which can be an effective way to explore the data in a data model. However, many users prefer to
consume data visually, and often a graphical summary can make it easier to see key trends or insights at a
glance. You can use PivotCharts to show aggregated data graphically, and interact with the chart to
explore the data. You can use PivotCharts in isolation, or you can link them to PivotTables so any
interactive exploration of the data in the PivotTable is automatically reflected in all related PivotCharts.

Slicers
Although you can apply filters to PivotTables and PivotCharts, when an attribute supports a manageable
range of possible values, it can be more intuitive to create a slicer for this attribute and filter the data
interactively by selecting the values you want to include in the slicer. Slicers can be linked to one or more
PivotTables or PivotCharts, enabling you to easily view alternative data scenarios in all elements on the
worksheet by selecting or de-selecting a slicer value.

Timelines
Most data analysis involves a time dimension, and often you want to view data for a specific period of
time. Timelines make it easy to select a time range based on a date field in the fact table.

Note: To support timelines, the table containing the measures to be aggregated (typically,
a fact table in a data warehouse) must contain a datetime value.

Lesson 2
PowerPivot for Excel
PowerPivot for Excel enables Excel users to create a tabular data model in the Excel workbook. In
organizations where business users have sufficient experience and skills to build tabular data models
based on data from multiple sources, this capability can significantly enhance the ability of these users to
analyze business data.

Lesson Objectives
After completing this lesson, you will be able to:

 Deploy PowerPivot for Excel.

 Compare PowerPivot workbooks to tabular data models in Analysis Services.

 Describe considerations for including PowerPivot in a BI solution.

Deploying PowerPivot for Excel


PowerPivot functionality for Excel is implemented
as an Excel add-in. If your BI solution includes
PowerPivot, you must ensure that the add-in is
deployed to the Excel users who will need to use
it.

PowerPivot for Excel 2013


In Excel 2013, the PowerPivot for Excel add-in is
installed with Excel, and it can be enabled or
disabled in the COM Add-Ins dialog box. To
access the COM Add-Ins dialog box, use the
following procedure:
1. Open an Excel workbook.

2. On the ribbon, on the FILE tab, click Options.

3. In the Excel Options dialog box, click the Add-Ins tab.

4. In the Manage drop-down menu, select COM Add-ins, and then click Go.

PowerPivot for Excel 2010


If the business users in your organization are using Excel 2010, the PowerPivot add-in must be
downloaded and installed. You can download a 32-bit or 64-bit version of the add-in from the Microsoft
Download Center at http://www.microsoft.com/en-us/download/details.aspx?id=29074.

Note: Be careful to download the SQL Server 2012 PowerPivot Add-In for Excel 2010, not
the earlier SQL Server 2008 R2 version.

Upgrading PowerPivot workbooks


If you have used the Excel 2010 add-in to create a PowerPivot workbook, opening the workbook’s data
model in Excel 2013 will prompt you to upgrade the data model to the PowerPivot for Excel 2013 format.
After a data model is upgraded to the PowerPivot for Excel 2013 format, it cannot be opened in Excel
2010.

PowerPivot Workbooks vs. Tabular Analysis Services Databases


Creating a PowerPivot data model in an Excel
workbook is a similar authoring experience to
creating a tabular data model for Analysis Services
in SQL Server Data Tools. Both processes start with
the importing of tables from a data source,
followed by the refinement of the model using a
development environment that supports data view
and diagram view interfaces. However, there are
some fundamental differences between the two
technologies, which you should be aware of when
planning a BI solution. The following summarizes
these differences:

 Authoring environment. PowerPivot data models can only be created in Excel. Typically, tabular
data models for Analysis Services are created in SQL Server Data Tools. Additionally, PowerPivot
workbooks can be imported into a tabular data model project in SQL Server Data Tools.
 Implicit measures. PowerPivot data models automatically generate implicit measures for numerical
values that can be aggregated. In a tabular data model for Analysis Services, all measures must be
explicitly defined by using Data Analysis Expressions (DAX).
 Linked tables. PowerPivot data models can use worksheets in the Excel workbook as linked tables.
This makes it easy to supplement source tables in a data model with data in the Excel workbook itself.
Analysis Services tabular data models do not support linked tables, because they are not hosted
within an Excel workbook.

 Storage format. PowerPivot workbooks use the xVelocity in-memory storage format for data models
exclusively. A tabular data model in Analysis Services can use xVelocity storage or DirectQuery storage
(in which the data is retrieved from the source table).

 Model size. The maximum size for a PowerPivot data model in a 64-bit workbook is 4 GB, with a
much lower limit for 32-bit workbooks. The size of an Analysis Services tabular model is limited only
by the physical resources in the server.

 Partitions. The tables in a PowerPivot model cannot be partitioned. Analysis Services tabular data
models support partitioning to help optimize processing of extremely large tables.

 Security. To secure a PowerPivot data model, you must restrict access to the workbook in which it is
contained. Analysis Services supports role-based security with granular permissions.

 Shared access. A PowerPivot workbook in Excel is designed for personal BI by the Excel user.
Conversely, an Analysis Services data model is designed for multiple concurrent users.

Both models can be shared through Microsoft SharePoint Server. To share a PowerPivot data model, the
user must publish the workbook to a SharePoint Server site, where PowerPivot for SharePoint makes the
workbook available to other users through Excel Services. Users can use the workbook interactively in a
browser through Excel Services, create a Power View report from the model defined in the workbook, or
use it as an Analysis Services data source for an Excel PivotTable by specifying the URL for the workbook
in the Data Source Wizard in Excel. You can also create BI Semantic Model (BISM) connections in a
SharePoint document library that reference either a PowerPivot workbook on the SharePoint site or a
tabular database in Analysis Services. BISM connections for either model type can be used as a source for
Power View reports or Excel PivotTables.

Planning Considerations for PowerPivot


If you intend to include PowerPivot in a BI
solution, you must consider the following:

 Data source access.

 User training.

 Sharing PowerPivot workbooks.

Data source access


Ensure that any users who need to create
PowerPivot workbooks have been granted
appropriate access to the data sources they will
use for the tables in their data models. Consider
creating a layer of views over the base tables and restricting self-service data modeling access to those
views.

User training
PowerPivot is a relatively new technology, and even users with many years of experience in using Excel
may require some training to help them learn how to create and use data models. User training should
include:

 Product training on how to use PowerPivot in Excel.

 Organization-specific training on how to access data sources and policies for consuming and
distributing data.

Sharing PowerPivot workbooks


When users want to share their workbooks, you must implement a suitable infrastructure and define
appropriate processes for them to do so. In most cases, the best way for users to share PowerPivot
workbooks is through SharePoint Server.

Note: Sharing PowerPivot workbooks in SharePoint Server is discussed in Module 9: Planning a
SharePoint Server BI Solution.

Lesson 3
Power View for Excel
Power View provides an intuitive data visualization environment that enables users to graphically explore
data. Power View is available as a component of Reporting Services in SharePoint Integrated mode, or
within an Excel 2013 workbook.

Lesson Objectives
After completing this lesson, you will be able to:

 Deploy Power View for Excel.

 Use Power View in Excel.

 Describe considerations for using Power View for Excel in a BI solution.

Deploying Power View for Excel


Power View for Excel is implemented as an Excel
add-in. You can enable or disable the add-in in
the COM Add-Ins dialog box in Excel, in the same
way as the PowerPivot add-in.

Note: The Power View add-in is installed with Excel 2013. It is not available in previous versions of
Excel.

In addition to enabling the Power View add-in, to create Power View reports in an Excel workbook,
the Microsoft Silverlight plug-in for Internet Explorer must be installed on the computer where Excel will
be used. Power View uses Silverlight to create the data visualizations and the Power View reporting
interface.

Using Power View in Excel


The Power View for Excel interface is built into
Excel. Each Power View report created in a
workbook is added as a Power View worksheet. To
create a Power View report, select the data you
want to include in the report, and then click
Power View on the Insert tab.

Power View data sources


You can create a Power View report from any
range of cells in a worksheet. This enables you to
quickly start visualizing data in the workbook.
However, Power View is optimized for visual
exploration of tabular data models, so Power View
works best after you create a PowerPivot data model in the workbook. You can select any cell in a
PivotTable that is based on the PowerPivot data model and insert a Power View report that automatically
includes the tables defined in the data model.

The Power View interface


The Power View for Excel interface is similar to the Power View interface in SharePoint Server, and
includes a report area and a Power View Fields pane. Users can drag fields to the report or to
visualization-specific areas in the Power View Fields pane. Additionally, a Filters area can be displayed, in
which the user can apply filters to the data displayed in the report.

To specify how the data should be visualized, the user can select a chart type on the Design tab of the
ribbon. Options include bar charts, column charts, scatter charts, line charts, pie charts, and maps.
Additionally, users can select options on the Power View and Layout tabs of the ribbon to add images
and text boxes, set background images, and specify legend and title options for the charts in the Power
View report. Charts can be displayed in tiles, each containing the same chart for each value in a selected
field, or displayed on vertical and horizontal multiples to enable side-by-side comparison of the same
data for different categories.
To interact with a Power View report, users can click series areas in charts or legends to highlight selected
values. Additionally, charts with related data in the same report are linked, so highlighting a value in one
chart will also highlight it in all other charts in the report. Another type of interaction can be used to
observe changes in data values over time. Users can create a chart that includes a play axis based on a
date or time field, and watch the data change as the play axis progresses through the time values.

Considerations for Power View in Excel


If business users will use Power View in your BI
solution, you should consider the following:
 User training.

 Sharing Power View reports.

User training
Although Power View is an intuitive tool, some
users may require training on how to select
appropriate data in Excel and create a Power View
report from it.

Sharing Power View reports


Power View reports cannot be exported from Excel, so the only way to share them is to distribute the
workbook in which they were created. If a goal of your BI solution is to share data visualizations created in
Power View, consider using the Power View designer provided by Reporting Services in SharePoint
Integrated mode instead of Power View for Excel. When users create Power View reports in SharePoint
Server, they can be viewed by other users and exported to PowerPoint.

Note: Considerations for using Power View in SharePoint Server are discussed in Module 9:
Planning a SharePoint Server BI Solution.

Lab: Designing a Microsoft Excel-Based Reporting Solution
Scenario
You have created a SQL Server data warehouse and an Analysis Services data model for sales and
marketing analysts in Adventure Works Cycles to use when analyzing sales results. Employees at
Adventure Works use Microsoft Excel 2013, and you must now plan how they will use it to analyze the
sales data.

Objectives
After completing this lab, you will be able to:

 Determine the Excel features required to support data analysis requirements.

 Use Excel to analyze data in an Analysis Services data model.

 Use PowerPivot to create a tabular data model in an Excel workbook.


 Use Power View to create data visualizations in an Excel workbook.

Estimated Time: 60 Minutes


Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Planning Excel Features


Scenario
You have gathered analytical requirements from users at Adventure Works Cycles. Now you must examine
those requirements and determine the Excel features required to support them.

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Review Reporting Requirements.

3. Determine Required Excel Features.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab08\Starter folder as Administrator.

 Task 2: Review Reporting Requirements


 Review the Adventure Works employee interviews in Interviews.docx in the D:\Labfiles\Lab08\Starter
folder.

 Task 3: Determine Required Excel Features


 Collaborate with two or three other students.

 Discuss the reporting requirements in the interviews and agree on the Excel features required to
support them.

 Document your decisions in the Excel Features.docx document in the D:\Labfiles\Lab08\Starter
folder.

Results: At the end of this exercise, you should have a document that contains a list of the required Excel
features.

Exercise 2: Analyzing a Cube with Excel


Scenario
You have created an analytical data model in an Analysis Services database to support sales analysis. You
must now verify that it can be used from Excel.

The main tasks for this exercise are as follows:

1. Connect to Analysis Services Data from Excel.


2. Add Data to a PivotTable.

3. Add Slicers.

4. Add a PivotChart.

 Task 1: Connect to Analysis Services Data from Excel


 Create a new blank Excel workbook named Sales Analysis.xlsx in the D:\Labfiles\Lab08\Starter
folder.

 Create a data connection to Analysis Services on MIA-SQLBI, and then import the Sales cube from
the AWSalesMD database.

 Create a PivotTable on the existing worksheet, leaving blank rows above the table.

 Task 2: Add Data to a PivotTable


 Add the following fields to the PivotTable:
o Reseller Revenue.

o Sales Territory By Geography.

o ShipDate.Fiscal Date.

 Task 3: Add Slicers


 Insert slicers so that the data in the PivotTable can be filtered by product category and by reseller
business type.

 Modify the settings of the slicers to hide items with no data.


 Rearrange the worksheet so that the slicers are above the PivotTable.

 Test the slicers by selecting categories and business types and verifying that the data in the
PivotTable is filtered accordingly. Then clear all filters.

 Task 4: Add a PivotChart


 Insert a clustered column PivotChart based on the PivotTable, and place it above the PivotTable, next
to the slicers.

 Verify that expanding hierarchies in the PivotTable updates the data in the PivotChart.

 Verify that clicking slicer items filters the data in the PivotChart.

Results: At the end of this exercise, you will have an Excel workbook that contains a PivotTable and a
PivotChart based on an Analysis Services cube.

Exercise 3: Creating a PowerPivot Workbook


Scenario
The marketing analysts need to perform more complex analysis than the cube you have created supports.
You plan to enable these users to create their own PowerPivot data models from data in the data
warehouse, and use them for analysis.

The main tasks for this exercise are as follows:

1. Verify PowerPivot Add-In Installation.

2. Create a PowerPivot Data Model.


3. Create Relationships.

4. Configure Attributes.

5. Create Hierarchies.
6. Test the PowerPivot Data Model.

 Task 1: Verify PowerPivot Add-In Installation


 Create a new blank Excel workbook named Marketing Analysis.xlsx in the D:\Labfiles\Lab08\Starter
folder.
 View the COM Add-ins for Microsoft Excel and ensure that the Microsoft Office PowerPivot for
Excel 2013 add-in is enabled.

 Task 2: Create a PowerPivot Data Model


 Create a PowerPivot data model in the Marketing Analysis.xlsx workbook.
 The data model should include the following tables from the AWDataWarehouse database on the
MIA-SQLDW instance of SQL Server:

o Customer
o Date

o InternetSales

o Product

o Promotion

 Change the friendly name of InternetSales to Internet Sales.

 Task 3: Create Relationships


 Switch to diagram view, and drag foreign key fields from the fact tables to primary key fields in the
dimension tables to create the following relationships.

Fact Table      Foreign Key    Dimension Table   Primary Key
InternetSales   OrderDateKey   Date              DateKey
InternetSales   ProductKey     Product           ProductKey
InternetSales   CustomerKey    Customer          CustomerKey
InternetSales   PromotionKey   Promotion         PromotionKey

 Task 4: Configure Attributes


 Switch to data view and add a column named Birth Year to the Customer table using the following
DAX expression:

=YEAR([BirthDate])

 Mark the Date table as a date table, and configure the MonthName column to be sorted by
MonthNumber.

o The option to mark a table as a date table is on the Design tab of the ribbon.
o The option to specify a sorting column is on the Home tab of the ribbon.

 Switch back to diagram view, and then hide and rename columns in each of the tables so that only
the columns listed below are visible.
o To hide multiple columns in a table in diagram view, maximize the table, click the columns you
want to hide while holding the Ctrl key, and then right-click any selected column and click Hide
from Client Tools.
o To rename a column, right-click it and click Rename.

Table            Visible Columns

Customer          MaritalStatus (renamed to Marital Status)
                  Gender
                  NumberChildren (renamed to Children)
                  NumberCars (renamed to Cars)
                  CommuteDistance (renamed to Commute Distance)
                  City
                  StateOrProvince (renamed to State Or Province)
                  Country
                  PostalCode (renamed to Postal Code)
                  Birth Year

Internet Sales    TotalProductCost (renamed to Cost)
                  SalesAmount (renamed to Revenue)
                  OrderDate (renamed to Order Date)

Date              DayNumberOfMonth (renamed to Day)
                  MonthName (renamed to Month)
                  CalendarYear (renamed to Year)

Product           ProductName (renamed to Product)
                  ProductSubcategoryName (renamed to Subcategory)
                  ProductCategoryName (renamed to Category)

Promotion         PromotionName (renamed to Promotion)
                  PromotionType (renamed to Promotion Type)

 Task 5: Create Hierarchies


 In diagram view, create the following hierarchies in the tables in the model.

 Products: Products By Category


o Category

o Subcategory

o Product
 Date: Calendar Date

o Year

o Month
o Day

 Promotion: Sales Promotion

o Promotion Type
o Promotion

 Customer: Customers By Geography

o Country

o State Or Province

o City

o Postal Code

 Task 6: Test the PowerPivot Data Model


 Save the workbook.

Note: If an error message is displayed when saving the workbook in this pre-release version of Excel 2013,
click OK to close it.

 In the PowerPivot for Excel – Marketing Analysis.xlsx window, in the ribbon, on the Home tab, in
the PivotTable drop-down list, click PivotTable. Insert a PivotTable into the existing worksheet,
leaving blank rows above the table.

 Use the PivotTable to view the Revenue measure in the Internet Sales table by the Products By
Category hierarchy and the Sales Promotion hierarchy.

 Format the Revenue measure using an accounting format that shows the values as currency with two
decimal places.

 Add the Cars, Children, and Birth Year customer attributes to the filters of the PivotTable.

 Insert slicers for the Marital Status and Gender customer attributes, and then hide slicer items with
no data.

 Insert a timeline based on the Order Date field.

 Filter the data to show revenue in the past two years from single female customers born after 1970
with no cars or children.

 After you finish, save the workbook, ignoring any errors. Close the PowerPivot window, but keep the
workbook open.

Results: At the end of this exercise, you will have an Excel workbook that contains a PowerPivot data
model based on data from the data warehouse.

Exercise 4: Using Power View in Excel


Scenario
The marketing analysts plan to share their PowerPivot workbooks with the sales VP, so he can use them to
visually explore the sales data. You intend to use Power View in Excel to test this functionality.
The main tasks for this exercise are as follows:

1. Create a Power View Report.

 Task 1: Create a Power View Report


 In the Marketing Analysis.xlsx workbook you created in the previous exercise, insert a Power View
report.

Note: If a Power View report does not open on the POWER VIEW tab of the ribbon, view Excel options
and remove the Power View COM add-in, and then add it again.
 Set the report title to Sales promotion Analysis, and then hide the filters area to maximize your
working area.

 Add the following Power View fields to the report:

o Revenue

o Promotion Type

o Commute Distance

 Display the fields as a clustered bar chart that fills the left half of the report. Tile the chart by Year so
that you can click the year headers above the chart to view revenue by promotion type broken down
by commute distance for each year.
 In the blank area to the right of the bar chart, add the Revenue and Country fields, and then display
them as a pie chart that fills the top of the right half of the report.

 In the blank area under the pie chart, add the Revenue and Cars fields. Display them as a clustered
column chart that fills the bottom of the right half of the report.

 Click the Commute Distance legend values to shade all of the charts in the report based on the
selected commute distance.

 After you finish exploring the data, save the workbook and close Excel.

Results: At the end of this exercise, you will have an Excel workbook that contains a Power View report
based on a PowerPivot data model.

Question: How might you support the Sales VP’s requirement to visually analyze the
marketing data models in Excel 2010?

Module Review and Takeaways


Module Review
This module has described some features of Microsoft Excel that can be used to analyze data. Excel is a
commonly used application in business environments and provides a familiar environment for data
analysis.

Question: What challenges do you think organizations will face when empowering users to
analyze data in Excel?

Module 9
Planning a SharePoint Server BI Solution
Contents:
Module Overview 9-1

Lesson 1: Introduction to SharePoint Server as a BI Platform 9-2

Lesson 2: Planning Security for a SharePoint Server BI Solution 9-8

Lesson 3: Planning Reporting Services Configuration 9-13

Lesson 4: Planning PowerPivot Configuration 9-16

Lesson 5: Planning for PerformancePoint Services 9-20

Lab: Planning a SharePoint Server BI Solution 9-25


Module Review and Takeaways 9-31

Module Overview
SharePoint Server is an increasingly important part of the end-to-end solution for the centralized delivery
of business intelligence (BI) solutions. SharePoint Server provides a platform that makes it easier for
business users to share and collaborate on a wide range of information. Understanding how to enable
SharePoint capabilities within a BI project is increasingly seen by organizations as a valuable skill for a
SQL Server professional.

Objectives
After completing this module, you will be able to:

 Describe SharePoint Server as a BI platform.

 Plan security for a SharePoint Server BI solution.

 Plan Reporting Services configuration.

 Plan PowerPivot configuration.

 Plan for PerformancePoint Services.



Lesson 1
Introduction to SharePoint Server as a BI Platform
SharePoint Server can play an integral part in a SQL Server 2012 business intelligence solution. After you
establish that the business reporting requirements require a centralized platform, there are several
considerations that should be addressed before setting up SharePoint Server.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe how SharePoint Server fits into a BI project.

 Describe SharePoint Server.

 Describe the required SharePoint services to support BI.

 Describe the SharePoint farm topology options.


 Describe sites and subsites for BI.

SharePoint Server Design in a BI Project


Many business intelligence professionals express
concern that business reports stored on local
desktops cannot be protected by centralized
backup strategies. Additionally, as these business
reports become business-critical, business
operations can be adversely affected by lost
reports.

SharePoint Server can play a fundamental role in centralizing BI solutions in two ways:
 It can provide a platform for the centralized
storage and protection of BI reports.

 It is possible to host business intelligence applications within SharePoint Server.

As a result, SharePoint Server can address previous concerns expressed by BI professionals by providing a
single location for the storage of reports. The added benefit is that SharePoint Server provides versioning
capabilities that enable business users to browse older versions of the same documents. This capability
enables easier and greater collaboration between business users in a consistent environment while
providing a platform that can be centrally managed by IT professionals.

In the context of a BI project plan, SharePoint Server has to be considered in two areas:

 Reporting and analysis design.

 Technical architecture and infrastructure design.

Reporting and analysis design


Typically, business requirements are expressed in the form of the reports desired by the business.
Additionally, users may express their preferred application for creating the reports and performing data
analysis. The business requirements will determine the types of applications that will be used, which in
turn determines the SharePoint services that will be required to support them.

Note: For more information about reporting and data analysis requirements, see Module 6:
Planning a BI Delivery Solution.

Technical architecture and infrastructure design


By understanding the applications that the business requires to perform reporting and data analysis, you
are able to determine the SharePoint services that will be needed to support the applications.
Additionally, you must consider the importance that the business places on performance, scalability, and
availability of the SharePoint solution. This determines the type of SharePoint Server farm that is needed.

What Is SharePoint Server?


SharePoint Server makes it easier for users to
collaborate and work together. Previous releases
of SharePoint Server provided a web-based portal
that provides file sharing and document
management. However, with each release, these
capabilities have been expanded to include the
creation of social networks, search, and business
intelligence capabilities. SharePoint Server uses a
Microsoft Office user interface that makes it
intuitive for business users to use, and provides a
centralized platform that IT departments can
manage.

SharePoint Server is available in the following form factors:

 Appliance. The Microsoft Business Decision Appliance contains a pre-built SharePoint 2010 farm with
support for Reporting Services. This only requires you to plug in the appliance and start it. Note that
at the time of writing, PowerPivot is not available on the Business Decision appliance.
 Self-build. You can manually install and configure SharePoint Server on a dedicated Windows
system. This enables you to customize the solution for the business. This module will focus on
SharePoint from a self-build perspective. The web front end layer and the application layer can also
be virtualized using Windows Hyper-V.

 Cloud. Office 365 provides SharePoint capabilities and is offered as a service to which your
organization can subscribe. This solution is useful to organizations that do not have the expertise
to implement a full SharePoint environment.

When planning a SharePoint Server BI solution, there are three tiers of the architecture to consider:

 Web front-end tier. One or more servers that are used to accept requests for a SharePoint
application/service and direct the request to the appropriate application server.

 Application tier. One or more servers that host the service applications in the SharePoint Server
infrastructure.
 Data tier. A SQL Server instance, which can be clustered, that hosts SharePoint databases.

Each tier can contain multiple servers to meet the business requirements for performance, scalability,
and/or availability. The key point is that all servers must belong to the same SharePoint farm: a logical
grouping of servers that provides the infrastructure for a SharePoint Server solution.

SharePoint terminology
Before you start to plan a BI solution that uses SharePoint Server, it is important to understand the core
components and terminology of a SharePoint Server solution. The following list defines some important
SharePoint Server concepts:

 SharePoint Server farm. A farm is a collection of servers that work together to provide SharePoint
services. Each server in the farm hosts one or more SharePoint Server components, and the entire
farm constitutes a logical container for all of the SharePoint services provided by those servers and
the core unit of administration.

 SharePoint databases. SharePoint Server is primarily a platform for publishing and collaborating on
content. The content in a SharePoint site, together with farm configuration data and application
settings, is stored in one or more SQL Server databases.

 Service applications. SharePoint Server provides an extensible platform that can deliver a broad
range of services. Each service is encapsulated in an application, which can be hosted on one or more
application servers in the SharePoint Server farm.
 Web applications. SharePoint web applications are Internet Information Services (IIS) applications
where users can consume SharePoint Server services. The services available in a specific web
application are determined by associating application services in the farm with the web application.
 Site collection. A site collection, as its name suggests, is a collection of SharePoint sites hosted in a
web application. You can use a site collection as a central unit of management and configuration for
multiple sites. SharePoint Server supports site collection features, which can be enabled or disabled at
the site collection level.

 Site. A site is a container for related content, and provides a specific endpoint to which users can
browse. Sites inherit the features of their parent site collection, and each site has site features that can
be enabled or disabled on a site by site basis.

 Apps. The content on a site is delivered through visual elements, which are known as apps.
SharePoint Server includes apps, such as document libraries and lists, which you can use to create the
user interface for the site. Additionally, service applications and third-party software developers can
provide additional apps.

 Subsites. In many cases, you can deliver all of the content you need to in a site. However, you can
also group related content into subsites under a parent site. Subsites inherit the features of their
parent site.

SharePoint Farm Topology Options


As part of the technical architecture and
infrastructure design phase, a SharePoint farm
topology should be designed in line with the
business requirements. This section describes the
topologies that can be configured for a single
SharePoint farm.

Single Server
In this farm topology, all of the SharePoint Server
architecture layers are hosted on a single Windows
Server. The main benefit of this model is that the
licensing cost for the solution is minimized.
Typically, this type of configuration is found in
development environments or training environments, and provides the easiest setup of a SharePoint farm.
However, because all three layers run on the same Windows server and share the same hardware, there
can be increased contention for resources. If this affects the performance of business reports, consider
implementing a scale out solution.

Scale Out
In this farm topology, each of the SharePoint Server architecture layers is separated onto different
Windows servers. Scaling out a SharePoint farm distributes the workload across multiple servers, which
reduces contention on a single server and improves throughput. However, this comes with an additional
licensing cost, and more infrastructure preparation is required to manage security seamlessly across the
SharePoint farm. There is also no resilience if one of the servers shuts down. Scale out is a valid topology
when performance is important but resilience is not required.

High Availability
In this farm topology, the SharePoint architecture layers are separated across Windows servers, and then
each layer is duplicated onto another server. This is the most expensive topology to implement, but it
provides load balancing and high availability across the entire SharePoint farm. The infrastructure
preparation is similar to that of creating a scale out architecture. However, a Network Load Balancer is
also required to distribute incoming requests to the first available web front-end server.

Required SharePoint Services for BI


After SharePoint Server is installed on each server
in the SharePoint farm, use SharePoint Central
Administration to configure the services and
components that are mandatory to use the full
range of BI features within SharePoint Server.

Site collection and site


The first step after installing SharePoint server is to
create a site collection and at least one site. This
will provide the web portal through which users
will access the business intelligence applications
and documents. For business intelligence
purposes, ensure that you define a site name that
is understood by business users and use the Business Intelligence Center template to ease the
configuration of the BI components within the SharePoint site.
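A site collection based on the Business Intelligence Center template can be created in Central
Administration or with PowerShell. The following is a minimal sketch; the URL, owner, and site name are
assumptions for illustration.

# Run in the SharePoint 2013 Management Shell; the names and URL are examples only.
New-SPSite -Url "http://mia-sqlbi/sites/bi" `
    -Name "Adventure Works BI" `
    -OwnerAlias "ADVENTUREWORKS\Student" `
    -Template "BICenterSite#0"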

Excel Services
After the site is defined, configure Excel Services. This should be done for two reasons: it enables your
business users to share and collaborate on Excel files, and it is a prerequisite service for PowerPivot for
SharePoint. Excel Services is configured on the application layer servers within a SharePoint farm.

Claims to Windows Token Service


This service is required when a service such as Excel Services has to communicate with a remote data
source that is hosted outside of the SharePoint farm. Within a SharePoint farm, communication takes
place between the farm servers using claims-based authentication. Typically, a user connects to a
SharePoint farm using Windows authentication and must connect to the remote data source with the
same credential.

The Claims to Windows Token Service is responsible for converting a Windows authentication token to a
claims-based token for incoming connections into the SharePoint farm, and then converting the outgoing
traffic from the SharePoint farm from a claims-based token to a Windows authentication token. Because
this service deals with the sensitive task of handling authentication tickets, the account that runs this
service should be added to the Local Administrators group on the server on which it is hosted.
Additionally, Local Security Policy configuration is required in Windows to enable the following rights:

 Act as part of the operating system.

 Impersonate a client after authentication.

 Log on as a service.

For a BI implementation of a SharePoint farm, the Claims to Windows Token Service must be configured
on the same server on which Excel Services is installed.
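Both Excel Services and the Claims to Windows Token Service can be started from SharePoint Central
Administration or from the SharePoint 2013 Management Shell. The following is a minimal PowerShell
sketch; the application server name is an assumption for illustration.

# Run in the SharePoint 2013 Management Shell; the server name is an example.
$server = "MIA-SQLBI"
Get-SPServiceInstance -Server $server |
    Where-Object { $_.TypeName -eq "Excel Calculation Services" -or
                   $_.TypeName -eq "Claims to Windows Token Service" } |
    ForEach-Object { Start-SPServiceInstance -Identity $_ }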

PowerPivot for SharePoint


PowerPivot for SharePoint provides the functionality of PowerPivot for Excel on a centralized SharePoint
Server. This service requires running SQL Server Setup on each application server on the SharePoint farm,
which will create a PowerPivot instance of Analysis Services on which the PowerPivot workspaces can be
created when using the application. After it is installed, SharePoint Central Administration can be used to
complete the integration with SharePoint Server.

Reporting Services Integrated mode


In SQL Server 2012, the architecture of Reporting Services in SharePoint Integrated mode has changed to
become a shared service within SharePoint. This means that it can now be managed within SharePoint
Server alone. Like PowerPivot for SharePoint, Reporting Services in SharePoint Integrated mode requires
running SQL Server Setup to install the Reporting Services SharePoint mode components on the application servers; it
also requires installing the Reporting Services add-in on the web front-end servers. When complete,
SharePoint Central Administration can be used to complete the integration with SharePoint Server.
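The integration steps can also be scripted. The following PowerShell sketch uses the Reporting Services
SharePoint cmdlets that are installed with the SharePoint mode components; the application pool, account,
service application, and database names are assumptions for illustration, and the service application can
equally be created in Central Administration.

# Run in the SharePoint 2013 Management Shell on an application server after SQL Server Setup
# has installed Reporting Services in SharePoint Integrated mode.
Install-SPRSService
Install-SPRSServiceProxy

# Start the Reporting Services service instance on this server.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -like "SQL Server Reporting Services*" } |
    Start-SPServiceInstance

# Create the service application and proxy (names are examples); the proxy can then be
# associated with the web application in Central Administration.
$appPool = New-SPServiceApplicationPool -Name "SSRS App Pool" `
    -Account (Get-SPManagedAccount "ADVENTUREWORKS\SVC_SSRS")
$ssrs = New-SPRSServiceApplication -Name "Reporting Services" `
    -ApplicationPool $appPool -DatabaseName "ReportingService_DB"
New-SPRSServiceApplicationProxy -Name "Reporting Services Proxy" -ServiceApplication $ssrs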

PerformancePoint Services
PerformancePoint Services is installed as part of SharePoint Setup on the application servers on the
SharePoint farm. In the context of a BI solution, selecting the Business Intelligence Center template when
creating a site means that only minimal configuration is required within Central Administration to start
this service.

Additionally, the following services are recommended because they add value to the mandatory BI
services:

Search service
As more documents are added to the SharePoint farm, it can become cumbersome to manually search for
documents. Enabling the SharePoint Search service will catalog the content of a SharePoint site so that
the Search feature can be used to quickly retrieve documents.

Secure store service


The secure store service provides an alternative method of authentication. It is similar to storing
credentials in a ReportServer database in Reporting Services native mode. However, the secure store
service is available to a wider range of applications. The secure store service greatly simplifies the
configuration of authentication for many services, including PerformancePoint Services and PowerPivot.

Microsoft Office Web Apps


Office Web Apps provide web-based versions of the Microsoft Office products, including Word, Excel, and
PowerPoint. They provide the basic functionality of each product, allowing changes to be made to an
Office document through a web browser.

SharePoint logging
Logging is an extremely useful component of SharePoint to enable, particularly during the installation and
configuration of the SharePoint farm. Any errors within the configuration are reported to the error log
files, known as Unified Logging Service (ULS) log files. These are located in the C:\Program Files\Common
Files\Microsoft Shared\Web Server Extensions\15\LOGS folder. If there is a problem in the SharePoint
farm, open the latest file and perform a search for the word “error”.
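For example, the following PowerShell sketch finds the most recent ULS log file and lists the lines that
contain the word “error”. The path shown is the default log location for SharePoint Server 2013; adjust it
if your farm writes ULS logs elsewhere.

# A minimal sketch; run on the SharePoint server that reported the problem.
$logFolder = "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\LOGS"
Get-ChildItem -Path $logFolder -Filter *.log |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 1 |
    Select-String -Pattern "error" -SimpleMatch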

Sites and Subsites for BI


SharePoint Server provides a web portal through
which business users can access the data that is
stored on a SharePoint site. The flexible nature of
SharePoint means that a site collection can be
created with one site that can be used by the
entire organization. However, for large enterprises,
you also have the ability to create a subsite to
provide further separation of the business data
from an organizational and security perspective.
Ultimately, the business requirements will
determine which type of users can access specific
data. If security is a concern when gathering the
requirements, subsites will be required.

After a site or a subsite is created, depending on the template that is selected during the site creation, a
site structure will be created with specific apps and features enabled. When you use the Business
Intelligence Center template, default folders are created for PerformancePoint data sources and
dashboards, but there is no default folder structure for PowerPivot files or reports. However, you can add
document libraries and PowerPivot Gallery pages to support them.
When creating a site or subsite, use the following guidelines:

 Let business requirements dictate the number of sites to create.

 If security of data is a concern, create separate subsites for security separation.


 For BI sites, use a Business Intelligence Center site template to create a site that contains BI
functionality.

 Keep the site structure simple and easy-to-use for business users.
 Before you can create a subsite in a SharePoint Server site, you must activate the following features, as shown in the example after this list:

o The SharePoint Server Publishing Infrastructure site collection feature.

o The SharePoint Server Publishing site feature.
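Both publishing features can be activated from the site settings pages in the browser or with PowerShell.
The following is a minimal sketch; the site collection URL is an assumption for illustration, and the feature
names shown are the internal names of the two publishing features.

# Run in the SharePoint 2013 Management Shell; the URL is an example.
$siteUrl = "http://mia-sqlbi/sites/bi"

# Site collection feature: SharePoint Server Publishing Infrastructure.
Enable-SPFeature -Identity "PublishingSite" -Url $siteUrl

# Site feature: SharePoint Server Publishing (activate on the site that will host subsites).
Enable-SPFeature -Identity "PublishingWeb" -Url $siteUrl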



Lesson 2
Planning Security for a SharePoint Server BI Solution
An important aspect of planning a SharePoint farm is security. This requires choosing an appropriate
authentication and user identity model that can ensure that the site is secure, and if necessary that a user
can be audited. This can involve additional considerations when implementing a Scale Out or High
Availability topology to ensure that credentials are correctly identified when connecting to back end data
sources through a SharePoint farm.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe SharePoint authentication models.

 Describe SharePoint Server authentication configuration.


 Describe Kerberos.

 Describe how to Plan Kerberos configuration.

SharePoint Authentication Modes


The SharePoint Server authentication mode
determines how a user will authenticate with
resources on a SharePoint farm. SharePoint Server
uses one of two authentication modes: Claims
Based Authentication or Classic Mode
Authentication.

Claims Based Authentication


In this authentication mode, authentication
providers are supported to validate the identity of
a user. These providers include:

 Windows authentication.

 Forms-based authentication.

 Security Assertion Markup Language (SAML).

Claims Based Authentication provides support for Windows and third-party authentication protocols and
directory providers in multi-vendor environments. Claims Based Authentication in Windows is built on the
Windows Identity Foundation (WIF).

Classic Mode Authentication


Classic Mode Authentication supports only Windows authentication. Windows authentication uses the
following protocols:

 NTLM

 Kerberos

 Anonymous

 Basic

 Digest

NTLM, Kerberos, and Anonymous are configured through Active Directory; Basic and Digest are
configured using Internet Information Services (IIS). Kerberos is a common protocol that is used within
organizations because it has the ability to delegate client credentials to access back-end data sources. This
is an important requirement if it is required to audit individual user access to a back-end data source.

Configuring SharePoint Server Authentication


When installing SharePoint Server 2013, the
default authentication mode for a SharePoint site
is Claims Based Authentication. This is a change
from SharePoint Server 2010, where the
authentication mode could be selected when
defining the site for the SharePoint Server. You
must use PowerShell to switch the authentication
mode from Claims Based Authentication to Classic
Mode Authentication.

SharePoint Server provides a PowerShell console named SharePoint 2013 Management Shell that
enables you to run PowerShell commands that will
impact the SharePoint farm. To run a SharePoint site in Classic Mode Authentication, you must use the
following PowerShell syntax.

New-SPWebApplication -Name <Name> -ApplicationPool <ApplicationPool> `
    -AuthenticationMethod <WindowsAuthType> -ApplicationPoolAccount <ApplicationPoolAccount> `
    -Port <Port> -URL <URL>

 <Name> is the name of the new web application.

 <ApplicationPool> is the name of the application pool.

 <WindowsAuthType> is either "NTLM" or "Kerberos". Kerberos is recommended.


 <ApplicationPoolAccount> is the user account that this application pool will run as.

 <Port> is the port on which the web application will be created in IIS.

 <URL> is the public URL for the web application.


The following example configures a SharePoint 2013 site named Adventure Works BI Site running under
the application pool AWAppPool to use Classic Mode Authentication with Kerberos. The Adventure Works
BI Site can be accessed at http://www.AWBI on port 80, and the application pool AWAppPool executes
under the account ADVENTUREWORKS\SVC_AWBISite.

New-SPWebApplication -Name "Adventure Works BI Site" -ApplicationPool "AWAppPool" `
    -AuthenticationMethod "Kerberos" `
    -ApplicationPoolAccount (Get-SPManagedAccount "ADVENTUREWORKS\SVC_AWBISite") `
    -Port 80 -URL "http://www.AWBI"

Introduction to Kerberos
Kerberos is an authentication protocol that is
designed to provide a single sign-on environment.
A client session authenticates against a domain
controller in a domain, which issues a session
ticket if the correct user name and password are
supplied. The session presents this ticket to
network resources, such as SQL Server or a file
server, to get access.

Kerberos provides convenience in that you have to authenticate only once; the session ticket that you
are presented with contains information that will
be presented to the resource when you access it
and is valid for eight hours by default. Kerberos provides additional capabilities that are very important
when setting up a SharePoint environment to support access to SQL Server technologies.

Delegation
Kerberos delegation is the process of giving an Active Directory account permissions to perform a task. An
example could be the ability to pass a set of credentials to another user account.

Impersonation
Kerberos impersonation is the process of one account impersonating the credential of another account.
This permission must be delegated for impersonation to work.

If Kerberos is not configured and a user connects to an application such as PowerPivot, Report Builder, or
Power View that accesses data in a back-end database, by default a connection will be made using the
service account of the application. If there is a need to audit access against the back-end database, the
audit will record that the service account accessed the back-end database, not the user who made the
request for the report.

If your business requirements include the auditing of individual users’ access to data, Kerberos delegation
and impersonation are required to retain the identity of the user who originally made the request for the
report. The process of maintaining the user’s credentials over two or more connections is referred to as a
“double hop,” and creates a requirement to delegate the rights to authenticate as the original user’s
identity. Kerberos supports this scenario, enabling a user to authenticate using his or her password only
once when logging on to the domain. After that, it is the session ticket that is used to authenticate. As a
result, Kerberos has the ability to delegate control of a user’s session ticket, or even a workstation’s session
ticket.

To retain the identity of the user who originally made the request for the report, the following tasks need
to be performed:

1. You must first represent SQL Server 2012 Business Intelligence applications in SharePoint Server
as objects within Active Directory so that they can be secured.

2. You must use delegation to enable the service accounts used by the BI applications to impersonate a
user against a back-end server.
MCT USE ONLY. STUDENT USE PROHIBITED
Designing Business Intelligence Solutions with Microsoft SQL Server 2012 9-11

Planning Kerberos Configuration


The key to successful Kerberos configuration is in
the planning. The first step is to represent SQL
Server 2012 Business Intelligence applications held
within SharePoint as objects within Active
Directory. This is performed by registering a
Service Principal Name (SPN) within Active
Directory. An SPN is the name by which a client
uniquely identifies an instance of a service. Some
services are automatically registered when you
join a computer to a domain. However, SharePoint
applications do not automatically register SPNs, so
they have to be manually registered using the
SetSPN command from a command prompt.

Before running SetSPN, use the following guidelines to determine the required information.

Know your environment


List the computer name and fully qualified domain name with ports of all servers involved in the entire
Business Intelligence solution, including: web front-end layer servers, application layer servers, SharePoint
database layer servers, and any server that hosts source data, such as a SQL Server or Analysis Services instance. Listing the computer name and fully qualified domain name will enable you to
deal with both NetBIOS and hostname communication on the network.

Identify the service accounts for the applications


For each computer you have listed, note all of the services specific to the BI solution that are running on
each computer and the associated service account for the application. This will include Reporting Services,
Analysis Services, as well as the credentials for the application pool for the SharePoint site, Excel Services,
Claims to Windows Token Service, PowerPivot, and PerformancePoint services. The SQL Server Browser
Service should also be listed if named instances are being used.

Determine the SPNs that the application needs


To correctly register SPNs, some services have a specific name for the service that they provide. The
following table defines the mandatory service names to be used within an SPN.

Service SPN service name

SQL Server Database Engine MSSQLSvc

SQL Server Analysis Services MSOLAPSvc.3

SQL Server Browser Service MSOLAPDisco.3

Other services can have a custom name of your choice. The following table contains some
suggested names to use to make it easier to identify the service in Active Directory.

Service Suggested SPN service name

Excel Services SP/Excel

SharePoint Site SP/HTTP

Reporting Services SP/SSRS

PowerPivot SP/PPivot

PerformancePoint SP/PPS

Claims to Windows Token Service SP/C2WTS

Determine the types of SPNs required


After the previous issues have been considered, SetSPN can be used to register the names within Active
Directory. The following is the syntax for SetSPN.

Setspn -S <service class>/<host>:<port> <service account name>

 <service class> denotes the name of the service or application. If it is SQL Server, the <service class>
is MSSQLSvc. Analysis Services is MSOLAPSvc.3, and Reporting Services would be HTTP.

 <host> is the fully qualified domain name or NetBIOS name in which the application is running. The
recommended practice is that each application should have two entries: one for fully qualified
domain name and one for NetBIOS.

 <port> is optional and is used to define the port on which the service is running. This should be used
when multiple instances of an application are running.

 <service account name> is the service account that is defined for the application.

For example, if there is a default SQL Server instance running on a computer named AWSQL.AW.Local
under the service account AW\SQLService, two SPNs would be registered as follows:

 For the fully qualified domain name:

Setspn -S MSSQLSvc/AWSQL.AW.Local:1433 AW\SQLService

 For the NetBIOS name:

Setspn -S MSSQLSvc/AWSQL:1433 AW\SQLService

If an Excel Services service is running on the same computer using the service account AW\ExcelService, its
SPN could be set with the following code.

Setspn -S SP/Excel AW\ExcelService

These code samples register service principal names in Active Directory that can then be delegated. Now
that you’ve registered the SPNs, you should create a list of fully qualified domain names and associated
service accounts for any applications that will be subjected to the double hop issue and plan the
delegation configuration required to pass a user’s credential from one service to another.
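
After registering SPNs, it is a good idea to confirm what has been registered against each service account before planning delegation. The following commands are a small verification sketch that uses the same illustrative accounts and SPN as the examples above.

# List all SPNs registered to a service account.
Setspn -L AW\SQLService
Setspn -L AW\ExcelService

# Query the directory for an existing registration of a specific SPN, which helps to detect duplicates.
Setspn -Q MSSQLSvc/AWSQL.AW.Local:1433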

Identify required delegation


Consider creating a Visio diagram of the BI environment, and define a path of authentication from the
client machine to the back-end data source.

Consider Active Directory and local policy settings


Service account settings may be controlled by domain administrators through Group Policy or Active Directory account options such as “Account is sensitive and cannot be delegated”. Make sure that no such settings will prevent Kerberos delegation from being configured.

Lesson 3
Planning Reporting Services Configuration


After the SharePoint site is created and, if necessary, the Kerberos infrastructure is defined to support delegation for Reporting Services, you can enable Reporting Services in a SharePoint site. The location of Reporting Services in the farm is determined by the farm topology you have selected based on the business requirements that have been gathered.

You will also have to enable the Reporting Services content types for the SharePoint document libraries where you want to publish reports. The content types required will be based on the reporting requirements that have been gathered in the business requirements phase.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the requirements for Reporting Services in SharePoint Integrated mode.

 Configure Reporting Services within a SharePoint site.


 Describe the document libraries that support Reporting Services.

SharePoint Integrated Mode Requirements


To install Reporting Services in SharePoint
Integrated mode, you must perform the following
steps:
1. Install SharePoint Server.

2. Install Reporting Services.

3. Configure Reporting Services.

Installing SharePoint Server


Regardless of the topology selected for the
SharePoint farm, the following steps should be
performed on each SharePoint Server before
installing Reporting Services in Integrated mode:

1. Run the SharePoint Prerequisite installer on every server in the farm.

2. Run the SharePoint setup.exe on every server in the farm. Note that on the first server you will define
the location of the back-end database server and create the farm with a dedicated user account and
passphrase. This passphrase is then used to join remaining servers to the same farm.

3. Install any required service packs.

Installing Reporting Services


After the SharePoint Server farm is installed, use the SQL Server 2012 installation media to install
Reporting Services in SharePoint Integrated mode on the application tier servers in the SharePoint farm.

Install the Reporting Services add-in for SharePoint Server on web front-end servers
After installation is complete, on the web front-end servers, use the SQL Server 2012 installation media to
install the Reporting Services add-in for SharePoint Server. In a single server deployment, the Reporting
Services add-in for SharePoint Server is installed with Reporting Services. In a Scale Out or High
Availability farm topology, you must install the Reporting Services add-in for SharePoint Server separately.

Configuring Reporting Services


When installed in SharePoint Integrated mode,
Reporting Services is implemented as a service
application in the SharePoint Server farm, and can
be managed like any other SharePoint Server
service. To complete the configuration of
Reporting Services in SharePoint Integrated mode,
use Central Administration in SharePoint to
perform the following steps:

1. Register a managed Reporting Services account. Every application that runs on a SharePoint farm should run under a separate service account. The first step in the configuration of any application is to register a managed service account within SharePoint Server.

2. Start the Reporting Services service. Reporting Services must then be started in the SharePoint
farm to complete the configuration.
3. Create the Reporting Services application pool. Ideally, each application should run within its own
application pool with its own managed account to ease security and maintainability.

4. Configure service settings. Reporting Services has configuration settings that you can use to control
its behavior, such as the SMTP server to be used for sending subscriptions by email.

5. Enable Reporting Services and Power View in SharePoint sites. The final step is to enable Reporting Services as a feature within sites where you want to be able to publish and view reports. A PowerShell sketch of these configuration steps follows this list.
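
The following is a minimal sketch of these configuration steps in the SharePoint 2013 Management Shell. The account, application pool, service application, and database names are illustrative assumptions, and service settings and site-level feature activation are still configured as described above.

# Install the Reporting Services shared service and proxy, and start the service instance.
Install-SPRSService
Install-SPRSServiceProxy
Get-SPServiceInstance | Where-Object { $_.TypeName -like "SQL Server Reporting*" } | Start-SPServiceInstance

# Register a managed account for Reporting Services (prompts for the password).
New-SPManagedAccount -Credential (Get-Credential "ADVENTUREWORKS\SVC_SSRS")

# Create a dedicated application pool, the Reporting Services service application, and its proxy.
$pool = New-SPServiceApplicationPool -Name "SSRS App Pool" -Account "ADVENTUREWORKS\SVC_SSRS"
$app = New-SPRSServiceApplication -Name "Reporting Services" -ApplicationPool $pool -DatabaseName "ReportingService_DB"
New-SPRSServiceApplicationProxy -Name "Reporting Services Proxy" -ServiceApplication $app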

Planning Document Libraries for Reports


A document library is where you can create,
collect, and update files with other users.
SharePoint Server provides templates that contain
preset settings that make the process of setting
the libraries easier. Additional, a library can be
highly customized to contain content that the
business users require.

Use only the content types within a library that are required by the business. For example, after a service such as Reporting Services is enabled within a SharePoint site, additional steps can be performed to ensure that the content types associated with the application are made available to the users. Reporting Services provides the following content types
within SharePoint libraries that enable users to work with Reporting Services:

 Report Builder Report.

 Report Data Source.

 Report Builder Model.

Additionally, you can add the BISM Connection content type if you want to enable the creation of
connections to tabular data models for Power View and Excel.

Enable only the content types that the business requires. For example, many organizations will not enable
the Report Builder Model content type because users will connect directly to data sources using the
Report Data source content type. It is important that you are led by the reporting requirements that were
determined during the business analysis phase of the project.

Before content types can be added, you must enable content type management so that the library can be customized. This is done by a SharePoint Server farm administrator, after which content types can be added to libraries in the SharePoint Server site.
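
Enabling content type management and adding the Reporting Services content types can also be scripted. The following is a minimal sketch using the SharePoint 2013 Management Shell; the site URL and library name are illustrative assumptions, and the content types must already be available in the site (for example, after the Report Server Integration feature has been activated).

# Enable content type management on a document library and add the Reporting Services content types.
$web = Get-SPWeb "http://intranet/sites/bi"   # assumed site URL
$list = $web.Lists["Reports"]                 # assumed document library name
$list.ContentTypesEnabled = $true
$list.Update()
foreach ($ctName in "Report Builder Report", "Report Data Source") {
    $ct = $web.AvailableContentTypes[$ctName]
    if ($ct -ne $null) { $list.ContentTypes.Add($ct) | Out-Null }
}
$list.Update()
$web.Dispose()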

Lesson 4
Planning PowerPivot Configuration
The installation and configuration of PowerPivot is similar to that of Reporting Services, with the location
of the PowerPivot for SharePoint installation determined by the farm topology that is selected. However,
after it is installed, additional consideration is required for managing data refresh of PowerPivot files and
monitoring PowerPivot activity.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe the planning required for PowerPivot deployments.

 Describe how to plan and manage data refresh.

 Use Business Intelligence Semantic Model (BISM) connection files.


 Monitor PowerPivot.

Planning PowerPivot for SharePoint Deployment


To install PowerPivot for SharePoint, you must
perform the following tasks:

1. Install SharePoint Server.

2. Install PowerPivot for SharePoint.

3. Configure PowerPivot for SharePoint.

Installing SharePoint Server


Regardless of which topology for the SharePoint
Server farm is selected, the following steps should
be performed on each SharePoint Server before
installing PowerPivot for SharePoint:

1. Run the SharePoint Server Prerequisite installer on every server in the farm.

2. Run the SharePoint Server setup.exe on every server in the farm. Note that on the first server you will
define the location of the SharePoint Server farm database server and create the farm with a
dedicated user account and passphrase. This passphrase is then used to join remaining servers to the
same farm.

3. Install any required service packs.

Installing PowerPivot for SharePoint


After the SharePoint farm is installed, use the SQL Server 2012 installation media to install PowerPivot for
SharePoint on the application servers in the SharePoint Server farm. This creates a PowerPivot mode
instance of Analysis Services on the application server, which provides a workspace for working with
PowerPivot files. After the install is complete, run spPowerPivot.msi from the SQL Server 2012 SP1
feature pack on each server in the SharePoint Server farm to install data providers, the PowerPivot
configuration tool, PowerPivot Gallery, and components required for data refresh.

Configuring PowerPivot for SharePoint


To complete the configuration of PowerPivot for SharePoint, run the Configure or Repair PowerPivot for SharePoint tool on the application servers to perform the following steps:

1. Register a managed account for the PowerPivot for SharePoint application.

2. Create the PowerPivot for SharePoint application.

3. Start the PowerPivot for SharePoint application.

4. Enable PowerPivot integration within a SharePoint site (a PowerShell sketch of this step follows the list).
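
The final step can also be performed from the SharePoint 2013 Management Shell by activating the PowerPivot site collection feature. The following is a minimal sketch; the feature identity and site URL shown are assumptions, so list the installed features first to confirm the exact name in your farm.

# Find the PowerPivot integration feature installed by spPowerPivot.msi and the configuration tool.
Get-SPFeature | Where-Object { $_.DisplayName -like "PowerPivot*" } | Select-Object DisplayName, Scope, Id

# Activate PowerPivot integration for a site collection (feature name assumed; substitute the value returned above).
Enable-SPFeature -Identity "PowerPivotSite" -Url "http://intranet/sites/bi"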

Planning and Managing Data Refresh


PowerPivot workbooks contain data models that
are based on data from one or more data sources,
often including a data warehouse. When browsing
a PowerPivot workbook, users can manually
refresh the data in the model from the original
data sources to ensure that they are viewing the
latest data. However, the requirement to manually
refresh the data can be confusing for users, and
difficult to manage for database administrators.
PowerPivot for SharePoint supports automatic
data refresh based on a scheduled job, which gives
PowerPivot workbook owners and administrators
more control over data refresh activity. This enables you to schedule requests for new data to coincide
with periods of low activity in data sources, and synchronize data refreshes with data warehouse load
cycles. Before implementing data refresh, you should plan your strategy for data source authentication
and data refresh schedules.

Planning authentication for data refresh


When a user configures data refresh for a PowerPivot workbook, they must specify the credentials to be
used when connecting to the data source. Options for authentication include:

 Allow users to specify Windows credentials. This approach enables users to specify a Windows
user name and password when configuring data refresh options for a workbook. These credentials are
then used for scheduled data refreshes. This approach has the advantage of requiring minimal
administrative configuration, but it can result in a difficult-to-manage environment where credentials
for data access are defined in multiple places, and may need to be changed if a user leaves the
organization or their data access privileges change. You can disable this option in the settings for the
PowerPivot service application in SharePoint Central Administration.
 Use the unattended PowerPivot data refresh account. When configuring PowerPivot, you can
specify credentials to be used by an unattended PowerPivot data refresh account. This enables you to
create a single account that can be used for data refresh connections for all workbooks. The benefit
of this approach is that a single service account for data access authentication can be centrally
managed and granted the required permissions in all data sources. However, this approach cannot be
used for data sources that do not support Windows authentication.

 Use custom credentials that are saved in the secure storage service. The secure storage service in
SharePoint Server 2013 provides a mechanism for credentials to be stored securely and associated
with an application name that is used to look up the credentials when required. This enables users to
use the credentials without having to know the user name or password. The benefit of using this
approach is that it can be used to store both Windows and non-Windows credentials. Additionally, it
enables you to create multiple credentials for data refresh to facilitate finer-grained auditing than a
single service account can accommodate.

Planning for data refresh schedules


In most enterprise environments, data sources are heavily used during business hours. Therefore, you
should plan to limit data refresh activity to non-business hours. You can configure start and end times for
the business day in the settings for the PowerPivot service application in SharePoint Central
Administration, so you should identify the earliest and latest times at which you want to allow users to
schedule data refresh processes, and configure the service application accordingly.

BISM Connections
The Business Intelligence Semantic Model (BISM)
has its own content type within SharePoint Server.
This enables you to define connection information
to Analysis Services directly from SharePoint
Server. After the connection information is defined, you can create PowerPivot and Power View reports directly from a BISM connection that is defined within a SharePoint library.
First, the BISM connection should be enabled as a content type within the SharePoint library.
the content type is enabled, perform the following
steps to define a BISM connection:

1. Browse to a library where the BISM content type is enabled.

2. Click Documents on the SharePoint ribbon.


3. Click the down arrow on New Document, and then select BI Semantic Model Connection File to
open the New BI Semantic Model Connection page.

4. In the File text box, type in a name for the BISM connection file, and optionally, add a description
under Description.

5. In the Workbook URL or Server Name text box, type in the URL of the Excel file or the name or IP
address of a tabular instance of Analysis Services.

6. In the Database text box, type in a tabular database that is currently available on the server.

7. Click OK.

After the BISM connection file is defined, click the down arrow on the BISM file, and then select Create
Power View Report to create a Power View report or Launch Excel to create a PowerPivot report.

Monitoring PowerPivot
SharePoint Server provides the capability to
monitor the PowerPivot activity that is occurring
on a SharePoint farm. Using the PowerPivot
Management Dashboard, you can establish the
following metrics about a PowerPivot instance and
its workbooks:

 PowerPivot Instance health. Provides metrics
on the CPU and memory utilization on a
PowerPivot instance.

 Report Activity. Provides lists and charts on
the frequency with which PowerPivot
workbooks are accessed.
 Data refresh. Contains information pertaining to both successful and failed data refresh attempts.

 Quality of Service. Provides metrics for query response time when retrieving PowerPivot reports.

The PowerPivot Management Dashboard provides you with information that will enable you to:

 Investigate and resolve failed data refreshes.

 Manage popular PowerPivot reports so they are optimized for query response times.
 Take appropriate action on hardware should the CPU or memory be under pressure.

 Understand reporting patterns so that frequently used self-service reports can be converted to standard reports.

To open the PowerPivot Management Dashboard, in Central Administration, in the General Application Settings section, under PowerPivot, click Management Dashboard. You can also create customized reports and dashboards for more targeted reporting.

Lesson 5
Planning for PerformancePoint Services
PerformancePoint Services is a service that is available within SharePoint Server Enterprise edition, which
enables the creation of highly visual reports without the need to install SQL Server. However, adding this
capability side-by-side with SQL Server BI components will provide a complete business intelligence
solution within the organization.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe PerformancePoint Services.

 Describe how to enable PerformancePoint Services in SharePoint Server.

 Describe the Dashboard Designer.


 Describe PerformancePoint data sources.

 Describe KPIs, reports, scorecards, and dashboards.

What is PerformancePoint Services?


PerformancePoint Services is a feature within
SharePoint Server that enables the creation of
KPIs, reports, scorecards, and dashboards without
the need to install SQL Server. Although many of
these features can be created by using the
business intelligence features of SQL Server 2012,
the tighter integration of PerformancePoint
Services within SharePoint Server 2013 provides
convenience for creating business intelligence
items. The Dashboard Designer provides the
starting point for users to create highly visual
reports. This is fully integrated into SharePoint
Server 2013.

PerformancePoint Services is available in SharePoint Enterprise edition only, and is automatically enabled
when the Business Intelligence template is selected when creating a site. It provides a graphical
environment designed to make it easy for users to create dashboards and reduces the development time
required to create a report compared to tools such as Reporting Services. However, Reporting Services
provides more flexibility in defining the layout of a report or dashboard. PerformancePoint Services
dashboards have fixed layout, and although a choice of layouts is presented in the Dashboard Designer, it
is not as flexible as Reporting Services.

Combining PerformancePoint Services with technologies such as Reporting Services and PowerPivot gives
the business more versatility in creating business intelligence reports. This increases the capability that the
business has to create a wide range of reports for multiple audiences.

Planning for PerformancePoint Services


When the SharePoint Server farm is created, steps must be performed to ensure that PerformancePoint Services is available.

Ensure that the correct version of the ADOMD.Net client library is installed
If you are installing PerformancePoint Services on
a server that does not have Excel Services or
PowerPivot configured, you must install the
ADOMD.Net component on the server that will
host PerformancePoint services. This can be found
in the SQL Server 2012 SP1 feature pack.

Create a managed service account and a PerformancePoint application pool


You must first start the PerformancePoint service on the application server where you want to run PerformancePoint. Each service within SharePoint should then run under its own separate service account, so ensure that a domain user account has been created in Active Directory for the purpose of running the PerformancePoint service. After it is created, register it as a managed account within SharePoint, and when creating the SharePoint application pool, ensure that PerformancePoint has its own application pool that uses the registered managed account. Furthermore, the PerformancePoint Services service application proxy must be associated with the default web application that hosts the default SharePoint site, and you must ensure that the managed service account is granted access to the content databases.
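
The following is a minimal sketch of these steps in the SharePoint 2013 Management Shell. The account, pool, and application names are illustrative assumptions, and the service application proxy, web application association, and content database permissions must still be configured as described above.

# Start the PerformancePoint service instance on the application server.
Get-SPServiceInstance | Where-Object { $_.TypeName -like "PerformancePoint*" } | Start-SPServiceInstance

# Register the dedicated domain account as a managed account (prompts for the password).
New-SPManagedAccount -Credential (Get-Credential "ADVENTUREWORKS\SVC_PPS")

# Create a dedicated application pool and the PerformancePoint service application.
$pool = New-SPServiceApplicationPool -Name "PerformancePoint App Pool" -Account "ADVENTUREWORKS\SVC_PPS"
New-SPPerformancePointServiceApplication -Name "PerformancePoint Service Application" -ApplicationPool $pool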

Configure an unattended service account if required


For PerformancePoint to access data sources, an unattended service account can be defined to connect on behalf of users. PerformancePoint Services uses the Secure Store Service to store the unattended service account password, so make sure that the Secure Store Service has been configured before using the unattended service account. If your requirements state that the user’s identity must be preserved when connecting to a back-end data source, Kerberos delegation and impersonation can also be configured to work with PerformancePoint.

Dashboard Designer
The Dashboard Designer is the primary tool used
by end users to create a PerformancePoint report,
scorecard, KPI, or dashboard. For the first time in
SharePoint 2013, it is available within the
SharePoint site on the SharePoint ribbon, and is
found on the PerformancePoint tab.

The Dashboard Designer contains a workspace browser that enables the creation of data sources and PerformancePoint content. The data sources support SharePoint Lists; Excel Services; SQL Server tables and Excel workbooks; and multidimensional (Analysis Services) data sources. The PerformancePoint content provides support for reporting templates, including KPIs, reports, scorecards, and dashboards.

Consider the following when using the Dashboard Designer:

 Ensure that the account used to connect to the data source has permission to read the data.

 If Kerberos delegation and impersonation is used, ensure that a user’s credential is successfully
presented to the data source.

 Use the business requirements to determine the report type of PerformancePoint reports.

 Ensure that users are educated on how to use Dashboard Designer to reduce the number of support
calls.

PerformancePoint Data Sources


Configuring PerformancePoint data sources for
Analysis Services is more complex than in other
reporting environments, such as Reporting
Services. All technologies require the user to
define a server name, an authentication
mechanism, and a database name. However,
PerformancePoint data sources require an additional time mapping setting that defines how time intelligence is structured within the data source.
Therefore, it is important to understand how the
time-based data is stored in a data source.
When creating a data source for PerformancePoint
Services, three tabs are displayed:

 Editor. This tab is where the server name, the authentication mechanism, and the database name are defined. You can use one of three authentication mechanisms:

 Unattended service account. A predefined account that connects using a dedicated connection.

 Stored account. A target application defined in the Secure Store Service that has a dedicated
connection configured.

 Per-user identity. The credential of the user using a data source is used.

 Properties. This tab allows you to provide a name and a description for the data source. A responsible person can be identified by specifying the email address of an individual or a group. Finally, a display folder setting determines the folder name used for storing the data source.

 Time. This tab enables you to define time mappings. Time mapping involves selecting a table in a
data source that contains a time-based hierarchy, from which an attribute in a dimension is mapped
to a time property in the PerformancePoint data source. For example, the Year property could be
mapped to a column within the data source named CalendarYear, and Month could be mapped to
MonthName. By providing this information, PerformancePoint is informed how the time-based data
is defined, and can aggregate data over time periods.

KPIs, Reports, Scorecards, and Dashboards

KPIs
KPIs are visual objects that measure numeric metrics against a target. In PerformancePoint
Services, you can use the Dashboard Designer to
create KPIs that compare an actual value against a
target value. Both the actual and target values are
associated with data sources and formulae, and
can be formatted with an appropriate number
format. You can also define threshold percentages
that determine the icons to use when comparing
the actual value to the target value.

For example, a business requirement might be to track sales revenue performance with a year-on-year
growth target of 10 percent. To support this requirement, you might create a KPI with the following
characteristics:
 The actual value uses a data source that applies the YearToDate time-intelligence function to the
sales revenue measure. This results in a figure that shows the sales revenue for the current year so far.

 The target value uses the same data source to create a variable named LastYearSales that is based on
the formula YearToDate-1, which returns the sales revenue figure for the year-to-date period in the
previous year. The target value is then defined as LastYearSales * 1.1 (in other words, last year’s
revenue to date plus 10 percent).
 Both values are formatted as currency.

 Thresholds are configured to show:

o A red indicator if the actual value is less than 75 percent of the target value.
o A yellow indicator if the actual value is between 75 and 95 percent of the target value.

o A green indicator if the actual value is above 95 percent of the target value.
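
To illustrate these thresholds with purely hypothetical figures: if revenue to date for the previous year was $2,000,000, the target value is $2,000,000 * 1.1 = $2,200,000. An actual year-to-date value of $1,870,000 is 85 percent of that target and would show a yellow indicator, whereas an actual value of $2,300,000 (approximately 105 percent of the target) would show a green indicator.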

Reports
Reports provide an interactive, graphical representation of data that can be displayed on a dashboard
page. You can create many kinds of reports with PerformancePoint Services, including:

 Analytic charts and grids.

 Strategy Map.

 KPI Details reports.

 Decomposition Tree.
Additionally, you can show SQL Server Reporting Services reports and Excel Services reports in a
PerformancePoint dashboard.

Dashboard Designer provides a graphical report design environment in which you can create reports by
dragging measures and dimension hierarchies from an Analysis Services data source. One of the key
benefits of using PerformancePoint Services to create reports is that reports automatically provide drill-
down interactivity. For example, you might create a report containing a pie chart that shows sales revenue
by product category based on a dimension hierarchy that includes category, subcategory, and product
levels. When the report is displayed on a SharePoint Server site, users can click the pie segment for a
particular category, and the chart will be redrawn to show sales for subcategories in that category.
Clicking a subcategory segment redraws the chart to show sales for products in that subcategory.

Scorecards
A scorecard is a collection of KPIs that enables users to drill down into hierarchies to identify specific areas
of the business that are over-performing or under-performing against the target. For example, a
scorecard could show the sales revenue KPI discussed earlier in this topic aggregated by sales region. At
the top level, the scorecard shows sales revenue performance against the target for the company as a
whole, but users can expand the scorecard to view performance for individual sales regions.

A scorecard can contain multiple KPIs that measure different aspects of business performance to provide
an overall view of how the organization is meeting its targets. For example, a scorecard might include KPIs
for sales revenue, profitability, and productivity levels based on hours of continuous operation for plant
machinery. The KPIs in the scorecard can each be weighted to reflect their relative importance to the
overall goals of the business, and a total score can then be calculated. This approach is often referred to as
a balanced scorecard, because it balances multiple factors to provide a high-level view of how the business
is performing.

Dashboards
Dashboards are a PerformancePoint component that enables the user to bring together multiple
PerformancePoint objects in one place. A dashboard page provides preset layouts that allow you to
choose which PerformancePoint object should populate a particular area. There are different page
layouts, including the following:
 1 Zone.

 2 Columns.

 2 Rows.

 3 Columns.

 3 Rows.

 Column, Split Column.


 Header, 2 Columns.

A key benefit of creating a dashboard is that, in addition to a single view of high-level business
performance data, the various data elements on the dashboard can be linked. For example, clicking a
column for product category sales revenue in a column chart might filter a different chart to show
profitability and productivity data for the selected product category.

Saving and deploying PerformancePoint content


You can save a local copy of the PerformancePoint objects you create, enabling you to return to work on
the PerformancePoint content at a later date. After you finish creating the PerformancePoint dashboard
pages you want to include in the site, you can deploy them to a SharePoint site.

Lab: Planning a SharePoint Server BI Solution


Scenario
You have created a data warehouse and analytical data model for Adventure Works Cycles, and planned
formal reporting with Reporting Services and business analysis with Microsoft Excel. IT managers and
database administrators at Adventure Works are concerned about the proliferation of spreadsheets that
contain business data across file shares and personal document stores, and they have requested that you
design a SharePoint Server site on which the delivery and management of BI services can be centralized.

Objectives
After completing this lab, you will be able to:

 Create a SharePoint Server site for BI.

 Configure Reporting Services in SharePoint Server.

 Manage PowerPivot for SharePoint.


 Create a PerformancePoint Dashboard.

Lab Setup
Estimated Time: 90 Minutes
Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as
ADVENTUREWORKS\Student with the password Pa$$w0rd.

If you are unfamiliar with SharePoint Server, it is highly recommended that you perform the lab using the
lab answer key instead of the high-level steps.

Exercise 1: Creating a SharePoint Server Site for BI


Scenario
You have decided to create a SharePoint subsite for business intelligence reports.

The main tasks for this exercise are as follows:


1. Prepare the Lab Environment.

2. Enable SharePoint Publishing.

3. Create a Subsite.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.

 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.
 Run Setup.cmd in the D:\Labfiles\Lab09\Starter folder as Administrator.

 Task 2: Enable SharePoint Publishing


 In the Adventure Works Portal site at http://mia-sqlbi/sites/adventureworks, edit the site settings to:

o Activate the SharePoint Server Publishing Infrastructure site collection feature.


o Activate the SharePoint Server Publishing site feature.

Tip: To view the site settings, use the Settings menu, which can be accessed from the Settings icon at the
upper-right of the home page next to the name of the currently logged on user.

 Task 3: Create a Subsite


 Add a new subsite to the contents of the Adventure Works Portal site at http://mia-sqlbi.

 The subsite should be based on the Business Intelligence Center enterprise template and be
accessible at the following URL:

http://mia-sqlbi/sites/adventureworks/bi

 On the Adventure Works Portal home page, in the Quick Launch area on the left, add a link to the
new subsite.

Results: At the end of this exercise, you should have created a subsite based on the Business Intelligence
Center template at http://mia-sqlbi/sites/adventureworks/bi.

Exercise 2: Configuring Reporting Services in SharePoint Server


Scenario
You have created a SharePoint site for BI reports. Now you must verify that Reporting Services
functionality is supported on the SharePoint site, publish the organization’s formal reports, and ensure
that users can perform self-service reporting on the SharePoint site.

The main tasks for this exercise are as follows:


1. Review Reporting Services Configuration.

2. Create a Document Library for Reports.

3. Deploy Report Items to SharePoint Server.


4. Test Self-Service Reporting.

 Task 1: Review Reporting Services Configuration


 Verify that the Report Server Integration site collection feature is activated in the Adventure Works
Portal site.

 This feature is inherited by all subsites of the top-level site where it is activated.

 Task 2: Create a Document Library for Reports


 In the Adventure Works BI Portal subsite at http://mia-sqlbi/sites/adventureworks/bi, add a library
named AWReports based on the Document Library app.

 Modify the advanced settings for the AWReports document library to enable management of
content types, and then add the existing Report Builder Report and Report Data Source content
types from the SQL Server Reporting Services Content Types group.

 Change the new button order and content type so that the Document content type is no longer
visible, and the default content type is Report Builder Report.

 Modify the list name, description, and navigation settings of the document library to display a link to
it in the Quick Launch area.

 Task 3: Deploy Report Items to SharePoint Server


 Use SQL Server Data Tools to open the AWReports.sln project in the D:\Labfiles\Lab09\Starter folder.

 Modify the properties of the project as follows:

o TargetServerURL: http://mia-sqlbi/sites/adventureworks/bi

o TargetDatasetFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Datasets

o TargetDataSourceFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Data Sources

o TargetReportFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Templates

o TargetReportPartFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Report Parts

 Deploy the project.

 Verify that the Data Sources, Datasets, and Templates folders are created in the AWReports
document library in the Adventure Works BI Portal site you created earlier.

 Edit the data sources that were deployed and configure them to use the following stored Windows
credentials:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

 Task 4: Test Self-Service Reporting


 Create a new folder named Self-Service Reports in the AWReports document library.
 Start Report Builder and open the AWReport template from the Templates folder in the report
server where you deployed it in the previous task.

 After you open the report, save it as Internet Sales in the Self-Service Reports folder. Be careful not
to overwrite the AWReport template.

 Add a dataset named InternetSalesByCountry that is a reference to the Internet Sales By
Country.rsd shared dataset you deployed to the report server in the previous task.
 Add a second dataset named LastMonth that is a reference to the Last Month shared dataset you
deployed to the report server in the previous task.

 Configure the StartDate and EndDate report parameters to use the FirstDate and LastDate fields
from the LastMonth dataset as default values.

 Insert a table into the report, below the chart, and use it to show revenue, cost, and profit for each
city, grouped by state or province and country.
o To calculate profit, use the following expression:

=Fields!Revenue.Value - Fields!Cost.Value

 Run the report to preview it, and then make any formatting changes you want.

 After you are satisfied with the report, save it and close Report Builder.
 Use Internet Explorer to view your report in the Self-Service Reports folder.

Results: At the end of this exercise, you will have published Reporting Services reports to the BI subsite
and verified that self-service reporting is supported.

Exercise 3: Configuring PowerPivot for SharePoint


Scenario
Users have created PowerPivot workbooks and are distributing them to one another as email attachments.
You have decided to centralize the publishing of these workbooks in a SharePoint site.

The main tasks for this exercise are as follows:

1. Create a PowerPivot Gallery.

2. Publish a PowerPivot Workbook.

3. Configure Data Refresh.

4. View the PowerPivot Management Dashboard.

 Task 1: Create a PowerPivot Gallery


 In the Adventure Works BI Portal subsite at http://mia-sqlbi/sites/adventureworks/bi, add a library
named AWPowerPivot based on the PowerPivot Gallery app.

 Modify the list name, description, and navigation settings of the document library to display a link to
it in the Quick Launch area.

 Task 2: Publish a PowerPivot Workbook


 Upload the Marketing Analysis.xlsx PowerPivot workbook from the D:\Labfiles\lab09\Starter folder to
the AWPowerPivot PowerPivot gallery that you created in the previous task.

 Verify that you can view the PowerPivot workbook in Excel Services within the SharePoint site.

 Task 3: Configure Data Refresh


 Use the SharePoint Central Administration tool to manage the Secure Store Service service
application and note that:

o Default business hours are from 4:00 am to 8:00 pm.


o The unattended data refresh account has been configured to use the target application ID you
saw in the secure store service earlier.

o Users are allowed to enter custom Windows credentials.

 Edit the service application associations for the SharePoint-80 web application to add the Secure
Store Service service.

o This enables the default Web site and its subsites to use credentials in the secure store service.

o If you do not associate the secure store service with a site where PowerPivot workbooks are
hosted, users will not be able to configure data refresh for the workbooks.

 View the configuration settings for the PowerPivot service application, and note the default data
refresh settings.

 In the BI portal site at http://mia-sqlbi/sites/adventureworks/bi, in the AWPowerPivot folder, click
the Manage Data Refresh button for the Marketing Analysis workbook and enable data refresh.

o Review the default settings for data refresh and note that the unattended data refresh credentials
in the secure store service are used.

 Task 4: View the PowerPivot Management Dashboard


 Use the SharePoint Central Administration tool to view the PowerPivot Management Dashboard.

o The dashboard may not contain any data because the job to process the dashboard data may not
have been run.

 Review the timer job descriptions, and then run the job named PowerPivot Management
Dashboard Processing Timer Job.

 After the job runs successfully, view the PowerPivot Management Dashboard again and explore
the charts it contains.

Results: At the end of this exercise, you will have a PowerPivot Gallery that contains a published
PowerPivot workbook.

Exercise 4: Create a PerformancePoint Dashboard


Scenario
You have been asked to provide a proof of concept that shows off the capabilities of PerformancePoint
Services in SharePoint Server 2013. To that end, you will configure an unattended account under which
PerformancePoint reports will run, and then define a data source against the Sales cube in the AWSalesMD Analysis Services database. Using this cube as a source, you will then create a KPI, a report, a scorecard, and
a dashboard to show key business metrics using PerformancePoint Services.

The main tasks for this exercise are as follows:


1. Configure the PerformancePoint Unattended Account.

2. Create a PerformancePoint Data Source.

3. Create a KPI.

4. Create a Report.

5. Create a Scorecard.

6. Create a Dashboard.

 Task 1: Configure the PerformancePoint Unattended Account


 Use the SharePoint Central Administration tool to configure the settings of the PerformancePoint
Services service application so that it uses the following unattended service account:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

 Task 2: Create a PerformancePoint Data Source


 In the Data Connections Library on the Adventure Works BI Portal site at http://mia-
sqlbi/sites/Adventureworks/BI, launch the PerformancePoint Dashboard Designer.

 Create a data source named AWSalesMD that connects to the Sales cube in the AWSalesMD
Analysis Services database on MIA-SQLBI.

 Configure the AWSalesMD data source to use a time dimension based on the Fiscal Date hierarchy
in the Ship Date dimension.

 Specify July 1 of the most recent fiscal year in the cube as the start date for the time dimension,
and specify that this reference member is at the Day level of the hierarchy.
 Map the reference member above to the same date in the data source.

 Map the following dimension attributes to the time hierarchy levels in the data sources:

o Fiscal Year: Year.

o Fiscal Quarter: Quarter.

o Month Name: Month.

o Date: Day.

 Task 3: Create a KPI


 Create a KPI named Reseller Revenue that compares reseller revenue for the year-to-date with the
same period in the previous year, and is based on a target of 25 percent revenue growth.

 The KPI should consist of:

o An actual value named YTD that is based on the Reseller Revenue measure filtered by the
YearToDate time intelligence function.

o A target value named Target that is based on a calculated metric in which the Reseller Revenue
measure filtered by the YearToDate-1 function is multiplied by 1.25.
 Both the target and actual values should be formatted as currency, and the following threshold values
should be used to determine the KPI value:

o Best: 120%
o Threshold 2: 90%

o Threshold 1: 50%

o Worst: 0%

 Task 4: Create a Report


 Create an analytic chart named Reseller Profit that shows the Reseller Profit measure as a stacked
bar chart with the default hierarchy in the Sales Territory dimension as the series.

 Task 5: Create a Scorecard


 Create a scorecard named Reseller Revenue Scores that shows the Reseller Revenue KPI for each
member in the Sales Territory dimension.

 Task 6: Create a Dashboard


 Create a dashboard named Sales Dashboard that shows the Reseller Profit chart and the Reseller
Revenue Scores scorecard on a page named Reseller Performance.

 Deploy the dashboard to the Dashboards folder in the Adventure Works BI Portal site.

 Make the dashboard the home page for the Adventure Works BI Portal site.

 Explore the dashboard and verify that the chart and scorecard provide interactive functionality.

Results: At the end of this exercise, you will have created a KPI, a report, a scorecard, and a dashboard by using PerformancePoint Services on the SharePoint site.

Question: What is the benefit of creating a subsite in SharePoint Server for storing business
intelligence content? Is it a mandatory process?

Module Review and Takeaways


Module Review
This module explored how SharePoint Server can be used to add value to an overall BI project by
providing a centralized platform for storing BI applications and content. It explored the various farm
topology options that can be implemented to meet the organization’s performance, scalability, and
availability requirements, while outlining which BI component should be installed at each layer of the farm
topology when it is implemented. Additionally, an exploration of Kerberos provided information about
the required Kerberos setup if auditability is an important part of the business requirements.

The module then concluded with an exploration of the capabilities of Reporting Services, PowerPivot, and
PerformancePoint Services in a SharePoint farm and how this could be centralized in a single subsite to
provide a one-stop shop for an organization’s BI platform.

Question: Now that you are familiar with the capabilities that SharePoint Server brings to a
BI project, what considerations would there be for implementing SharePoint as part of a BI
project in your organization?

Module 10
Monitoring and Optimizing a BI Solution
Contents:
Module Overview 10-1

Lesson 1: Overview of BI Monitoring 10-2

Lesson 2: Monitoring and Optimizing the Data Warehouse 10-5

Lesson 3: Monitoring and Optimizing Analysis Services 10-10

Lesson 4: Monitoring and Optimizing Reporting Services 10-16

Lab: Monitoring and Optimizing a BI Solution 10-20

Module Review and Takeaways 10-29

Module Overview
After an organization implements a business intelligence (BI) solution, the key components of the solution
need to be monitored to ensure the ongoing health of the solution and to troubleshoot and resolve
performance problems. This module discusses monitoring tools and techniques for the main SQL Server
services in a BI solution, and provides guidance on how to troubleshoot problems and optimize
performance.

Objectives
After completing this module, you will be able to:
 Describe key considerations for monitoring a BI solution.

 Monitor and optimize a data warehouse.

 Monitor and optimize SQL Server Analysis Services.

 Monitor and optimize SQL Server Reporting Services.



Lesson 1
Overview of BI Monitoring
Performance monitoring and optimization is a critical consideration in a BI solution, and should be
considered from the start of the project. This lesson describes considerations for health and performance
monitoring, and emphasizes the importance of creating a performance baseline against which future
measurements can be compared.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for monitoring and performance optimization.

 Describe key elements of the BI solution to monitor.

 Create a performance baseline.

Introduction to Monitoring and Optimization


User and organizational expectations for
performance are likely to have been expressed in
the business requirements, and optimal
performance should have been a core
consideration in the design of the technical
architecture, data warehouse, and reporting and
analysis elements of the solution. However, a BI
project generally involves a discrete set of
considerations for monitoring the performance
and health of the solution as a whole, and for
troubleshooting performance issues as they arise.

Considerations for monitoring and optimization span the entire BI solution, and must take into account each subsystem, including:
 The underlying network and server infrastructure.

 Extract, transform, and load (ETL) processes and infrastructure.

 The data warehouse and departmental data marts.

 SQL Server Analysis Services servers and data models.

 SQL Server Reporting Services servers and reports.

 Microsoft SharePoint Server farms, servers, and services.

Note: The remainder of this module focuses on performance monitoring and optimization
of the data warehouse, Analysis Services, and Reporting Services. Information about monitoring
SQL Server Integration Services package execution is provided in Module 11: Planning BI
Operations. Some details of health monitoring for BI services in a SharePoint Server farm were
discussed in Module 9: Planning a SharePoint Server BI Solution. For courses that provide broader
coverage of performance monitoring and optimization for SharePoint Server in general, refer to
the Microsoft Learning catalog at www.microsoft.com/learning.

What to Monitor
Like any complex IT system, a BI solution requires
monitoring and maintenance to ensure it
continues to perform efficiently and effectively.
Broadly, there are two kinds of monitoring
required for a BI solution: health monitoring and
performance troubleshooting.

Health monitoring
Health monitoring is the ongoing review of
hardware utilization for the key components of
the BI solution. Usage of CPU, memory, disk, and
network resources during typical workloads can be
gathered and monitored for any changes that
might indicate a problem. This is similar to a physical check from a doctor, where blood pressure, heart
rate, and other vital signs are periodically measured to evaluate the health of an individual and detect any
potential problems early.

Performance troubleshooting
Performance troubleshooting is the diagnosis of a perceived problem with the performance of the BI
solution. Typically, performance troubleshooting occurs in response to a symptom that has been detected
by a user or through health monitoring. A performance problem usually relates to a degradation in
response time (amount of time the system takes to perform a specific task) or throughput (the number of
concurrent activities the system can support).

Creating a Performance Baseline


Before you can apply an effective health
monitoring process, you must have a baseline set of
resource utilization metrics with which to compare
future measurements. Baseline resource utilization
metrics should be collected for each server or
service in the BI solution over a sufficient period of
time to reflect normal workload cycles. These
workload cycles should include:

 Data warehouse load operations.

 Data model processing operations.

 Normal everyday query activity.

 Peak-time query activity.

The goal of baseline monitoring is to gain an understanding of the average utilization during each
workload period of the following hardware resources:

 CPU

 Memory

 Disk

 Network

A common technique for establishing baseline resource utilization metrics is to use Performance Monitor
in the Computer Management console to create a data collector set that includes counters for the
resource utilization you want to include in the baseline. Each SQL Server service provides counters that
provide detailed information about how the resources are used by the service. In addition to these
service-specific counters, most baseline data collector sets include general system metrics such as the
following counters:

 Processor: % Processor Time. This counter indicates the percentage of time that the CPU is utilized.

 Memory: Available Mbytes. This counter indicates the amount of memory in megabytes available in
the system.

 Paging File: % Usage. This counter indicates the percentage of time that memory values are
accessed from the paging file on disk.

 System: Processor Queue Length. This counter indicates the number of requests waiting for an
available execution thread on the processor.

After defining the data collector sets, you must run them for sufficient periods of time to gather
meaningful data. You can then view the counter values that you have recorded as a graphical or text-
based report in Performance Monitor, or you can export them to a comma-separated values (CSV) file for
further viewing and analysis in Microsoft Excel. These reports should then be retained as a baseline set of
metrics with which future measurements can be compared.
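In addition to Performance Monitor, you can capture SQL Server-specific counter values directly from the
database engine by querying the sys.dm_os_performance_counters dynamic management view. The
following query is a minimal illustrative sketch; the counters listed are examples only, and counters of the
"per second" type are cumulative, so they must be sampled at intervals and differenced to derive a rate.

-- Capture a point-in-time snapshot of selected SQL Server counters that can be
-- stored in a table and compared with the Performance Monitor baseline.
SELECT GETDATE() AS collection_time,
       [object_name],
       counter_name,
       instance_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Batch Requests/sec',
                       N'Page life expectancy',
                       N'Buffer cache hit ratio');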
Lesson 2
Monitoring and Optimizing the Data Warehouse
The data warehouse is at the heart of the BI solution, and provides the basis for analytical and reporting
activity. Understanding how to monitor, troubleshoot, and optimize performance in a data warehouse is
an important part of planning a BI solution, and this lesson discusses key considerations and techniques
that you should employ.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for monitoring a data warehouse.

 Describe commonly used monitoring tools for data warehouse activity.

 Use Resource Governor to prioritize data warehouse workloads.


 Troubleshoot data warehouse performance problems.

Considerations for Data Warehouse Performance


Considerations for data warehouse performance
include the various workloads that are performed
in the data warehouse server, and the
configuration settings for the SQL Server instance
on which the data warehouse is hosted.

Data warehouse workloads


Most data warehouses are required to support the
following workloads:

 ETL data loads. These are usually regularly


scheduled events that load new or updated
data into the tables in the data warehouse.

 Data model processing. This usually occurs after each ETL data load to refresh multidimensional and
tabular data models that are based on the data warehouse tables.

 Report queries. These are performed when data is retrieved from data warehouse tables to create a
report or a user-defined data model in Excel.

 Operational activities. These are scheduled operations such as index maintenance or database
backups.

SQL Server settings


SQL Server provides multiple configuration settings that affect the way the service uses hardware
resources. In particular, you can use the Maximum Server Memory configuration setting to specify the
maximum amount of system memory that SQL Server can use, and you can use the ALTER SERVER
CONFIGURATION statement to control how SQL Server uses CPU threads and simulate non-uniform
memory access (NUMA).
Note: Generally, SQL Server configuration settings are best left at their default values unless
you have specific, validated reasons to change them. For information about configuring SQL
Server settings, see “How to determine proper SQL Server configuration settings” at
http://support.microsoft.com/kb/319942.
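If capacity planning does identify a validated need to cap the memory available to the database engine
(for example, because Analysis Services or Integration Services runs on the same server), you could adjust
the Maximum Server Memory setting by using sp_configure. The following script is an illustrative sketch
only; the 16384 MB value is an assumption and must be based on the memory actually installed in your
server.

-- Illustrative only: cap the database engine at 16 GB so that other BI services
-- on the same server retain memory. Review the current value before changing it.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 16384;
RECONFIGURE;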

Monitoring Tools for the Data Warehouse


You can use the following tools to monitor and
troubleshoot SQL Server performance:

 Performance Monitor. Performance Monitor


provides a range of objects and counters that
you can use to monitor SQL Server
performance. Use Performance Monitor to
record metrics for baseline measurements and
to establish resource utilization during specific
periods of activity for troubleshooting
purposes.
 SQL Server Profiler. SQL Server Profiler
enables you to record events such as
Transact-SQL queries or stored procedure executions. You can create a SQL Server Profiler trace that
records a filtered set of events and data values, and use it to view details of activity that was traced, or
replay the events on test systems.

 Database Engine Tuning Advisor. The database engine tuning advisor can use a SQL Server Profiler
trace or query plan cache to evaluate the physical data structures in a database against a specific
query workload, and recommend schema modifications for table partitioning, indexes, and statistics.

 Data Collector. The data collector provides a performance management framework for SQL Server
instances. A central management data warehouse is created, and data collection is enabled on each
instance of SQL Server to be monitored. SQL Server Agent jobs on each server then record
performance-related data at regular intervals and upload it to the management data warehouse.
Database administrators can then use a centralized set of reports to view database server
performance and health data across the data center.

 Dynamic Management Views. SQL Server provides dynamic management views (DMVs) that you
can query to obtain system performance data. By using DMVs, you can create a custom monitoring
solution that gathers the statistics that are most relevant to your specific data warehouse workloads
and performance priorities.
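For example, the following query (a sketch based on commonly used DMVs, not a definitive monitoring
solution) returns the ten cached query statements that have incurred the most logical reads, which is often
a useful starting point when deciding which data warehouse queries to tune.

-- List the cached statements with the highest cumulative logical reads.
SELECT TOP (10)
       qs.total_logical_reads,
       qs.execution_count,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;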
Using Resource Governor to Balance Resource Utilization


A common performance-related issue in a data
warehouse is the requirement for predictable
performance of key, high-priority workloads that
are not affected by resource contention with other
activities in the database. For example, an ETL data
load process might need to load data into tables
and initiate data model processing within a
defined time period in order to meet business
reporting requirements. If the database is also
available for user queries during the load, the
contention for server resources might slow down
the data load and result in a scheduled report
being generated before all of the new data has been loaded―or worse, there may be insufficient server
resources to complete the data load at all.

Resource Governor provides a way to define resource pools with scoped constraints that define the
system resources available to them. You can then define workload groups in each resource pool with
specific priorities in terms of how they can use the resources in the pool. Finally, you can define a classifier
function that is used to determine the workload group in which a specific session should be executed.
For example, you could use the following Transact-SQL code to create resource groups for low priority
and high priority activities.

CREATE RESOURCE POOL [Low Priority]
WITH (min_cpu_percent = 0,
      max_cpu_percent = 50,
      min_memory_percent = 0,
      max_memory_percent = 50);

CREATE RESOURCE POOL [High Priority]
WITH (min_cpu_percent = 20,
      max_cpu_percent = 90,
      min_memory_percent = 50,
      max_memory_percent = 90);

Next, you can create workload groups for ETL operations and user queries, and assign them to the
appropriate resource pools, with specific restrictions on the pool resources that each group can use.

CREATE WORKLOAD GROUP [User Queries]
WITH (group_max_requests = 10,
      importance = Low,
      request_max_cpu_time_sec = 50,
      request_max_memory_grant_percent = 50,
      request_memory_grant_timeout_sec = 20,
      max_dop = 1)
USING [Low Priority];

CREATE WORKLOAD GROUP [ETL]
WITH (group_max_requests = 100,
      importance = High,
      request_max_cpu_time_sec = 80,
      request_max_memory_grant_percent = 80,
      request_memory_grant_timeout_sec = 30,
      max_dop = 4)
USING [High Priority];

Then you could create a classifier function that determines which workload group each session belongs in.
In this example, the ETL process is identified as being run from an application named “SQL Server” (which
is the name used by SQL Server Integration Services) using the user account
ADVENTUREWORKS\ServiceAcct between 1:00 A.M. and 2:00 A.M. on the first day of the month.
CREATE FUNCTION dbo.fn_classify_apps() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
  DECLARE @retval sysname;
  IF (APP_NAME() LIKE '%SQL Server%' AND USER_NAME() = 'ADVENTUREWORKS\ServiceAcct'
      AND DAY(GETDATE()) = 1 AND DATEPART(HOUR, GETDATE()) BETWEEN 1 AND 2)
    SET @retval = 'ETL';
  ELSE
    SET @retval = 'User Queries';
  RETURN @retval;
END
GO

Finally, you could assign the classifier function to Resource Governor.

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_classify_apps);
ALTER RESOURCE GOVERNOR RECONFIGURE;
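After Resource Governor is configured, you can verify that sessions are being classified as intended. The
following query is an illustrative check that joins each current user session to the workload group it was
assigned to.

-- Show the workload group assigned to each current user session.
SELECT s.session_id,
       s.login_name,
       s.program_name,
       g.name AS workload_group
FROM sys.dm_exec_sessions AS s
JOIN sys.dm_resource_governor_workload_groups AS g
    ON s.group_id = g.group_id
WHERE s.is_user_process = 1;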

Troubleshooting Data Warehouse Performance


When monitoring SQL Server database engine
activity, use the following Performance Monitor
counters to gain an overview of server resource
utilization:
 Memory: Available Mbytes.

 Paging File: % Usage.

 Physical Disk:Avg. Disk sec/Read.


 Physical Disk: Avg. Disk sec/Write.

 Physical Disk: Disk Reads/sec.

 Physical Disk: Disk Writes/sec.


 Processor: % Processor Time.

 SQLServer:Buffer Manager: Buffer cache hit ratio.

 SQLServer:Buffer Manager: Page life expectancy.

 SQLServer:General Statistics: User connections.

 SQLServer:Memory Manager: Memory grants pending.

 SQLServer:SQL Statistics: Batch requests/sec.

 SQLServer:SQL Statistics: SQL Compilations/sec.

 SQLServer:SQL Statistics: SQL Re-Compilations/sec.

 System: Processor queue length.


You can also use SQL Server Profiler to trace the specific queries that are being executed in the database,
and use dynamic management views to gather further information. After you gather the statistics, you can
review them in the following order to identify any performance bottlenecks:
1. Review the System: Processor Queue Length and Processor: % Processor Time counters to
determine whether the CPU is experiencing consistently high levels of utilization and threads are
waiting for CPU availability. If it is, review Task Manager to determine if the processor is being used
by SQL Server or another application that could be moved to a different server. If the CPU is being
heavily used by SQL Server, consider adding processors to the server.

2. Review the Paging File: % Usage and Memory: Available Mbytes counters to determine whether
the system is running short of physical memory and having to page memory values to disk. If a
reasonable amount of memory is available, but paging is still occurring, review counters in the
SQLServer:Buffer Manager and SQLServer:Memory Manager objects to check for configuration
issues with SQL Server’s cache settings.

3. If no obvious memory issues are detected, review the counters in the Physical Disk object. A high
amount of I/O might indicate caching problems or high page file activity, while a low number of reads
per second might indicate a requirement for faster storage or more effective placement of data files.
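As part of step 3, you can also measure I/O latency directly from SQL Server. The following query is a
minimal sketch that calculates the average read and write latency per database file since the last service
restart; the values are cumulative, so sample them periodically to observe trends.

-- Average I/O latency per database file, derived from cumulative I/O statistics.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_latency_ms,
       vfs.num_of_writes,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON vfs.database_id = mf.database_id
   AND vfs.file_id = mf.file_id
ORDER BY avg_read_latency_ms DESC;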

Additional Reading: For additional troubleshooting tips for data warehouse workloads,
see “Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting
Applications” at http://sqlcat.com/sqlcat/b/top10lists/archive/2007/11/21/top-10-sql-server-
2005-performance-issues-for-data-warehouse-and-reporting-applications.aspx.
Lesson 3
Monitoring and Optimizing Analysis Services
Analysis Services provides a platform for centrally hosted and managed data models. In a BI solution
where business users and applications make frequent use of these data models for analysis, performance
of Analysis Services can be a significant factor in the overall success of the BI solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for Analysis Services performance.

 Describe monitoring tools for Analysis Services.

 Monitor Analysis Services.

 Troubleshoot Analysis Services queries.

Considerations for Analysis Services Performance


To effectively monitor, optimize, and troubleshoot
Analysis Services performance, you must first
understand the core workloads that the service
supports, and the configuration options that you
can use to manage hardware resource utilization.

Analysis Services workloads


An Analysis Services server must support the
following two workloads:

 Processing. When the source data for a data


model has been modified, for example after a
data warehouse load cycle, the data in the
data model must be refreshed. To accomplish this, Analysis Services processes all or part of the data
model to create new aggregations and store new dimension members and hierarchies. Depending on
the size of the data model, the amount of data that has changed, and the type of processing (full or
partial) being performed, processing can incur significant demand for hardware resources.

 Queries. Queries are submitted by user applications to retrieve data from Analysis Services data
models. The specific details of query execution vary between multidimensional data models and
tabular data models, but in both cases, queries can be broken down into two subprocesses:

o Data retrieval. The data necessary to satisfy the query is extracted from the data model by the
storage engine.

o Calculations. The data is aggregated, sorted, and otherwise manipulated to satisfy the query
requirements by the formula engine.

Note: In addition to the core workloads in the preceding list, Analysis Services also supports
operational tasks, such as backup operations. However, the effect of these operations on
performance is generally not as significant as processing or query execution.
Analysis Services settings


Analysis Services provides configuration settings that you can use to tune server performance. These
settings are stored in the msmdsrv.ini file in the \OLAP\Config subfolder where the instance is installed,
but you can also access them in the Analysis Server Properties dialog box in SQL Server Management
Studio. Like the SQL Server database engine configuration settings, you should generally only change
Analysis Services configuration settings if you have a validated reason to do so or if you have been
advised to change specific values by Microsoft Product Support.

Memory management settings


Generally, Analysis Services is memory-intensive, and the following settings can significantly affect
performance:

 Memory\LowMemoryLimit. This is the minimum amount of memory that Analysis Services requires.
Analysis Services will not release memory below this limit.

 Memory\TotalMemoryLimit. If Analysis Services exceeds this value, it will automatically start


releasing memory by reducing the working set and removing data from cache where possible. Note
that Analysis Services can exceed this limit, but it provides a threshold where Analysis Services starts
to proactively deallocate memory resources.

 Memory\HardMemoryLimit. This setting is used to cause Analysis Services to release memory


resources more aggressively than the Memory\TotalMemoryLimit settings. If Analysis Services
exceeds the Memory\HardMemoryLimit setting, user sessions are selectively closed to reduce
memory consumption.
The preceding settings are applicable to servers in both multidimensional and tabular modes. Because of
the in-memory nature of tabular data models, you can also use the following settings to further control
memory usage in a tabular mode server:

 Memory\VertiPaqPagingPolicy. With the default value 1, data in the data model can be paged to
disk if the system runs low on physical memory resources. When this value is set to 0, all data in a
tabular model must remain in memory.
 Memory\VertiPaqMemoryLimit. This setting specifies the maximum amount of physical memory
that can be used to store an in-memory data model. When Memory\VertiPaqPagingPolicy is set to
0, this setting specifies the maximum size of the data model. When Memory\VertiPaqPagingPolicy
is set to 1, this setting determines the maximum amount of memory for the data model beyond
which it will be paged to disk.

Flight Recorder
By default, Analysis Services uses a feature named Flight Recorder to log server activity into a short-term
log for troubleshooting purposes. This logging can incur a performance overhead, and should be disabled
in production servers unless you are using it to troubleshoot a specific problem. You can disable Flight
Recorder by setting the Log\FlightRecorder\Enabled server property to False.

Additional Reading: For more information about optimizing Analysis Services, see
“Microsoft SQL Server Analysis Services Multidimensional Performance and Operations Guide” at
http://social.technet.microsoft.com/wiki/contents/articles/11608.e-book-gallery-for-microsoft-
technologies.aspx#MicrosoftSQLServerAnalysisServicesMultidimensionalPerformanceandOperatio
nsGuide.
Monitoring Tools for Analysis Services


You can use the following tools to monitor and
troubleshoot Analysis Services:

 Performance Monitor. Performance Monitor


provides a range of objects and counters that
you can use to monitor Analysis Services
performance. Use Performance Monitor to
record metrics for baseline measurements and
to establish resource utilization during specific
periods of activity for troubleshooting
purposes.

 SQL Server Profiler. SQL Server Profiler


enables you to record details of events that
occur during Analysis Services workloads, such as the execution of MDX or DAX queries.

 Dynamic Management Views. SQL Server provides dynamic management views (DMVs) that you
can query to obtain information about activity in the server. Analysis Services includes the following
DMVs:

o $system.discover_commands. This view lists all currently running commands on the server.

o $system.discover_connections. This view lists current open connections.


o $system.discover_sessions. This view lists all sessions on the server. You can use it to determine
which commands were executed in which connections.

o $system.discover_memoryusage. This view lists all memory allocations in the server.

o $system.discover_locks. This view lists currently held and requested locks.
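Analysis Services DMVs are queried by using a SQL-like syntax from an MDX query window in SQL Server
Management Studio. For example, the following illustrative query lists the current sessions, which you can
relate to the commands and connections returned by the other views.

-- Return all current sessions on the Analysis Services instance.
SELECT * FROM $SYSTEM.DISCOVER_SESSIONS;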

Monitoring Analysis Services


Performance Monitor is the primary tool for
creating a performance baseline and monitoring
Analysis Services.

Performance counters for Analysis Services
Performance counters for Analysis Services are
grouped into objects for each instance of Analysis
Services; for example, counters relating to memory
for the default instance of Analysis Services are in
the MSAS11:Memory object, and the same
counters for a named instance with the name
BISRV2 can be found in the
MSOLAP$BISRV2:Memory object. Use the following counters to gain an overall view of resource
utilization.
 MSAS11:Memory: Memory Usage KB. The amount of memory used by Analysis Services.

 MSAS11:Connection: Current Connections. The number of open connections to Analysis Services.

 MSAS11:Proc Aggregations: Temp file bytes written. The amount of temporary data written during
processing operations. Ideally, this should be near to zero.

 MSAS11:Proc Indexes: Current Partitions and Rows/sec. The speed and concurrency of process index
operations.

 MSAS11:Processing: Rows read/sec and Rows written/sec. The speed of processing activity.

 Memory: Available Kbytes and Page Faults/sec. The amount of memory available in the server and the
number of times memory is paged in from disk.

 System: File Read Bytes/sec and File Read Operations/sec. File system cache activity.

 Network Interface: Bytes Received/sec and Bytes Sent/sec. Data sent and received through network
connections.

 TCPv4 and TCPv6: Segments/sec and Segments Retransmitted/sec. Information about the stability of
network connections.

Additionally, you can include the Memory Limit Hard KB, Memory Limit High KB, and Memory Limit
Low KB counters from the MSAS11:Memory object to compare memory usage to the limits set in the
server configuration. For an Analysis Services instance in tabular mode, the Memory Limit VertiPaq KB
counter shows the maximum data model memory allocation. To troubleshoot performance problems, you
can also include counters from the MSAS11:Thread object; and the MSAS11:MDX and MSAS11:Storage
Engine Query objects include counters that can be useful when troubleshooting query performance. If
you are troubleshooting I/O problems, you can also include additional counters from the Memory object
to estimate the size of the file system cache as well as the Physical Disk object.

Note: An important tip to follow when viewing graphical data in Performance Monitor is to
ensure that the counters being monitored are viewed at the same scale. You can use the built-in
Scale Selected Counters functionality in the list of counters below the chart, or you can set the
scale individually in the properties of each counter.
Troubleshooting Analysis Server Queries


Analysis Services queries can be extremely
complex to troubleshoot. Performance problems
can be caused by a number of factors, so it is
important to understand the architecture of
Analysis Services subsystems involved in
processing a query and the events that occur
during query execution.

Analysis Services query architecture


Analysis Services uses the following subsystems to
process queries:

 Session Manager. This subsystem manages


user sessions, and integrates with Security Manager to authenticate users.
 Query Processor. This subsystem is responsible for generating query execution plans and applying
the necessary formulas and calculations to create the requested result set. Query execution plans are
cached for faster execution in subsequent queries.
 Storage Engine. This subsystem retrieves the data required to satisfy the query. Whenever possible,
data is cached for faster retrieval in subsequent queries.

Query execution events


During query execution, a sequence of events occurs as the process progresses through the subsystems.
The specific events that occur can vary between queries, but the following general pattern is typical of an
Analysis Services query execution:

1. A new session is initialized in the session manager.

2. A query is started.

3. The query starts the process of extracting data from a cube.


4. The query is broken down into subcube queries, which are used to retrieve data from the storage
engine.

5. If the data model includes aggregations that satisfy the subcube query, the aggregations are
retrieved. Otherwise, more granular data must be retrieved and aggregated.

6. If the required data is cached, the subcube queries get the data from the cache. Otherwise, the data is
retrieved from the stored dimension or measure group.

7. At the end of each subcube query, the results are passed to the query processor. The query processor
then begins to serialize the results, applying any additional aggregations, sort operations, or other
calculations as required.

8. When the results are serialized, the cube query ends.

9. When the query ends, the results are passed to the user session.

Interpreting query events


You can use SQL Server Profiler to trace the events that occur during query execution and determine
which steps in the overall query process take the most time to complete. You can then use this
information to identify steps in the process that will most benefit from further investigation and
optimization. Each data model and query has its own particular characteristics, but you can use the
following guidelines to start troubleshooting query performance issues:
 If the process spends more time in the query processor than in the storage engine, consider
optimizing the MDX or DAX query to reduce the number of calculations being performed.

 If the query process spends more time in the storage engine than the query processor, consider
creating partitions in the data model, and defining attribute relationships in multidimensional
hierarchies.

 If the query process spends more time in the storage engine than the query processor, and data is
seldom retrieved from aggregations, consider optimizing the aggregations in the cube based on
usage.

 If the query process spends more time in the storage engine than the query processor, but data is
rarely retrieved from cache, investigate the memory resources, utilization, and configuration.

Additional Reading: For more information about diagnosing query performance


problems, see “Analysis Services Performance Guide” at http://www.microsoft.com/en-
us/download/confirmation.aspx?id=17303.
Lesson 4
Monitoring and Optimizing Reporting Services
Reporting Services provides a platform for delivering reports based on data from data models and the
data warehouse. Business users can use Reporting Services interactively, or receive reports automatically
through subscriptions. This lesson discusses considerations for monitoring and optimizing Reporting
Services performance.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for Reporting Services performance.

 Describe monitoring tools for Reporting Services.

 Use caching and snapshots to optimize Reporting Services performance.

Considerations for Reporting Services Performance


To optimize Reporting Services, you must
understand the workloads that it supports and the
way that it uses server resources such as memory.

Reporting Services workloads


To generate reports, Reporting Services must
support the following workloads:

 Data retrieval and processing. When a


report is requested (interactively or by a
scheduled subscription, cache refresh, or
snapshot), Reporting Services loads the
definition of the report and uses its datasets
to retrieve the required data. The report is then generated in an intermediate format for rendering.

 Report rendering. When a user views or exports a report, or the report must be formatted for
delivery as a subscription, Reporting Services uses the appropriate rendering extension to render the
report into the required format.

Reporting Services settings


Report execution and rendering are memory-intensive tasks, and the best way to ensure high
performance is to host Reporting Services in a distributed topology on servers with a large amount of
memory. You can use the following memory management settings in the RSReportServer.config file to
specify thresholds for memory:

 WorkingSetMinimum. You can add this setting to the RSReportServer.config file to set the minimum
amount of system memory that Reporting Services must have allocated before it will start to release
memory resources. By default, this setting is 60 percent of the memory available on the server. If
Reporting Services is within this value, the level of memory pressure is considered low.

 MemorySafetyMargin. You can use this value to specify a percentage of WorkingSetMaximum. If
Reporting Services exceeds this amount of memory, the level of memory pressure is considered
medium, and Reporting Services begins to refuse some requests for memory and reduce memory
allocations.

 MemoryThreshold. You can use this value to specify a percentage of WorkingSetMaximum that is
higher than MemorySafetyMargin. If Reporting Services exceeds this amount of memory, the level
of memory pressure is considered high, and Reporting Services begins to manage requests for
memory aggressively.

 WorkingSetMaximum. You can add this setting to the RSReportServer.config file to set the
maximum amount of system memory that Reporting Services can use. By default, this setting is not
included in the RSReportServer.config file, and Reporting Services can access all memory available on
the server.

The following list describes how Reporting Services adapts the priority of reporting tasks as memory
utilization moves between the thresholds defined by the settings in RSReportServer.config:

 Low memory pressure (between WorkingSetMinimum and MemorySafetyMargin). Background
processing is given a lower priority than interactive requests.

 Medium memory pressure (between MemorySafetyMargin and MemoryThreshold). Current requests
continue to process, but new requests might be rejected. Background processing is given a lower
priority than interactive requests, and all components may have their memory allocations reduced.

 High memory pressure (between MemoryThreshold and WorkingSetMaximum). Memory allocation is
further reduced and requests for more memory are denied. Current requests can take longer to
complete, new requests are rejected, and the report server pages in-memory data to disk.

Monitoring Tools for Reporting Services


Although you can use data in Task Manager and
Windows Event Viewer to view Reporting Services
activity, Performance Monitor is the primary tool
for monitoring Reporting Services.

Monitoring the Reporting Services web service
The Reporting Services web service manages
interactive reporting activity. For Reporting
Services instances in native mode, you can use the
counters in the MSRS 2011 Web Service object
to monitor the number of requests, cache usage,
and processing activity for interactive reports
viewed in Report Manager. When Reporting Services is deployed in SharePoint Integrated mode, the
MSRS 2011 SharePoint Mode Web Service object provides the same counters.

Monitoring the Reporting Services Windows service


The Reporting Services Windows service manages scheduled reporting activity, such as cache refreshes,
snapshot creation, data alerts, and subscription processing. For Reporting Services instances in native
mode, you can use the counters in the MSRS 2011 Windows Service object to monitor this activity.
When Reporting Services is deployed in SharePoint Integrated mode, the MSRS 2011 SharePoint Mode
Windows Service object provides the same counters.

Monitoring HTTP and memory activity


To monitor the number of HTTP requests and the amount of data transferred, you can use the counters in
the ReportServer:Service object, or the ReportServerSharePoint:Service object if the server is
configured in SharePoint Integrated mode. This object also includes the Memory Pressure State counter,
which indicates the memory pressure for Reporting Services based on the thresholds discussed in the
previous topic. The Memory Pressure State counter can have one of the following values:
 1: None

 2: Low

 3: Medium
 4: High

 5: Maximum exceeded

Using Caching and Snapshots to Optimize Performance


You can optimize Reporting Services performance
by reducing the amount of data retrieval and
report processing that occurs during report
execution. To accomplish this, you can use caching
or snapshots.

Caching reports and shared datasets


You can cache reports and shared datasets to
reduce the time required for report execution.
Cached copies of reports and datasets are stored
in the report server temporary database (by
default, named ReportServerTempDB) and
retrieved on-demand. When a report or dataset
that is configured for caching is requested, Report Server first tries to retrieve it from the cache. If it is not
found, report processing occurs as for a non-cached object and the resulting dataset or report is stored in
cache for the next request.

When planning to use caching as a performance optimization technique, consider the following features
of cached objects in Reporting Services:

 Cached reports are stored in an intermediate format that includes data and layout information, but
must still be rendered to the requested format.

 Cached datasets and reports are based on specific parameter value combinations. A cached copy of
the report or dataset is created for each combination of parameter values requested.
 You can configure a cached object to expire after a specified interval (in minutes) or at a time
specified in a schedule. Schedules can be specific to an individual cached object, or shared across
multiple objects. When a cached object expires, it is removed from the cache and the next request
results in a new execution with live data.

 You can preload a cached object by creating a cache refresh schedule, or by scheduling a subscription
for a cached report with a NULL delivery extension.

 Cached objects are recreated in the event of a server restart.
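To confirm whether report requests are being served from the cache, you can query the report server
execution log. The following Transact-SQL query is an illustrative sketch; it assumes the default report
server catalog database name (ReportServer) and uses the built-in ExecutionLog3 view, in which the
Source column indicates whether each execution used live data, the cache, or a snapshot.

USE ReportServer;
GO
-- Summarize recent report executions by source (Live, Cache, Snapshot, and so on)
-- to verify that caching is reducing data retrieval and processing work.
SELECT ItemPath,
       Source,
       COUNT(*) AS Executions,
       AVG(TimeDataRetrieval) AS AvgDataRetrievalMs,
       AVG(TimeProcessing) AS AvgProcessingMs,
       AVG(TimeRendering) AS AvgRenderingMs
FROM dbo.ExecutionLog3
GROUP BY ItemPath, Source
ORDER BY ItemPath, Source;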

Creating snapshots
Cached reports can improve performance, but the cache is volatile and can change when reports, data
sources, or datasets are modified. If you require a more predictable way to generate reports that are
based on the data at a specific point in time, consider using snapshots.

A snapshot is a copy of a report in intermediate format that is created for a specific parameter
combination at a specific time. Snapshots are stored in the Report Server database (by default, named
ReportServer) and can be created on a regularly scheduled basis. By default, each snapshot is replaced
with the new version when the snapshot is created, but you can also create a report history that includes a
set of previous snapshots.
Lab: Monitoring and Optimizing a BI Solution


Scenario
You have implemented a BI solution for Adventure Works Cycles. Now you must monitor the performance
of the solution, and optimize it where necessary. The solution consists of a data warehouse that provides a
source of data for a multidimensional Analysis Services data model and several Reporting Services reports.

Objectives
After completing this lab, you will be able to:

 Monitor and optimize a data warehouse.

 Monitor and optimize Analysis Services.

 Monitor and optimize Reporting Services.

Estimated Time: 75 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as


ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Monitoring and Optimizing a Data Warehouse


Scenario
The data warehouse for the Adventure Works Cycles BI solution is a SQL Server database. The data
warehouse is updated by an ETL process each month, after which the multidimensional data model in
Analysis Services is processed to reflect the new data. Throughout the month, the data warehouse is
queried by Reporting Services reports and other business user–reporting tools.

You plan to monitor the data warehouse and:


 Determine baseline performance metrics for the data warehouse for typical workloads, including ETL
data loads and report query execution.

 Ensure that the physical data structures in the database, such as indexes and statistics, are optimal for
the queries being executed.

 Manage workload priorities and prevent user queries from adversely affecting data load resource
availability.

 Generate server health statistics to assist in ongoing server management.

The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Configure Data Collection.

3. Monitor Data Warehouse Load Activity.

4. Monitor Data Warehouse Query Activity.

5. Manage Workload Priorities.

6. View Server Activity.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.
 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab10\Starter folder as Administrator.

 Task 2: Configure Data Collection


 Use SQL Server Management Studio to configure data collection for the MIA-SQLDW instance of SQL
Server.

o Create a management data warehouse on MIA-SQLDW.

o Make ADVENTUREWORKS\ServiceAcct a member of the mdw_admin role in the management


data warehouse.

o Set up data collection in MIA-SQLDW to log performance data to the management data
warehouse you created.

 Task 3: Monitor Data Warehouse Load Activity


 Use Performance Monitor to create a data collector set that includes the following counters:
o Processor: % Processor Time.

o Memory: Available Mbytes.

o Paging File: % Usage.


o PhysicalDisk: Disk Reads/sec.

o PhysicalDisk: Disk Writes/sec.

o SQLServer:Memory Manager: Memory Grants Pending.

o System: Processor Queue Length.

 You can access Performance Monitor in the Computer Management tool. To open this tool in
Windows Server 2012, move the mouse pointer to the lower-left of the taskbar until the Start screen
image appears. Right-click the Start screen image, and then click Computer Management.

 For counters that provide multiple instances, select the _Total instance.

 Start the data collection set, and then run the LoadDW.dtsx package in the LoadPartition.sln SQL
Server Integration Services project to load the data warehouse.

 After the load completes, stop the data collection set and view its latest report.

 Task 4: Monitor Data Warehouse Query Activity


 Use SQL Server Profiler to create a trace that saves details of the following events in the MIA-SQLDW
instance of SQL Server to a file:

o SQL:BatchCompleted

o SQL:Stmnt:Completed

 Configure the trace to include the following columns:

o ApplicationName

o DatabaseName

o Duration

o EndTime

o LoginName
o Reads

o RowCounts

o SPID

o StartTime

o TextData

 Apply a column filter so that events are recorded only when the DatabaseName column value is like
%AWDataWarehouse%.

 Run the trace, and while it is running, run the RunDWQueries.cmd command file in the
D:\Labfiles\Lab10\Starter folder. This executes a script that runs queries in the data warehouse for
more than a minute.

 After the command file finishes, stop the SQL Server Profiler trace and view the events it recorded.

 Use the Database Engine Tuning Advisor tool to analyze the trace file you recorded in the
AWDataWarehouse database, and generate recommendations for indexes and views on the
assumption that aligned partitioning should be used and existing aligned partitioning structures
should be retained.

 Task 5: Manage Workload Priorities


 In SQL Server Management Studio, enable Resource Governor and create the following resource
pools.

 Low Priority: Minimum CPU % = 0; Maximum CPU % = 50; Minimum Memory % = 0; Maximum
Memory % = 50.

 High Priority: Minimum CPU % = 20; Maximum CPU % = 90; Minimum Memory % = 20; Maximum
Memory % = 90.

 Create the following workload group for the Low Priority resource pool.

 User Queries: Importance = Low; Maximum Requests = 10; CPU Time (sec) = 50; Memory Grant % = 50;
Memory Grant Time-out (sec) = 20; Degree of Parallelism = 1.

 Create the following workload group for the High Priority resource pool.

 ETL: Importance = High; Maximum Requests = 100; CPU Time (sec) = 80; Memory Grant % = 80;
Memory Grant Time-out (sec) = 30; Degree of Parallelism = 4.

 Use the following Transact-SQL code to create a classifier function named dbo.fn_classify_apps that
returns the string “User Queries” if the application name in the current session is “SQLCMD”; or “ETL”
if the current application is named “SQL Server”. Alternatively, you can execute the Classifier
Function.sql script file in the D:\Labfiles\Lab10\Starter folder.

USE master;
GO
CREATE FUNCTION dbo.fn_classify_apps() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
DECLARE @retval sysname
IF (APP_NAME() LIKE '%SQLCMD%')
SET @retval = 'User Queries';
IF (APP_NAME() LIKE '%SQL Server%')
SET @retval = 'ETL';
RETURN @retval;
END
GO

 Configure Resource Governor to use the dbo.fn_classify_apps classifier function.


 Use Performance Monitor to view the following performance counters.

 SQLServer:Resource Pool Stats: CPU control effect % (instances: High Priority and Low Priority).

 SQLServer:Workload Group Stats: CPU usage % (instances: ETL and User Queries).

 SQLServer:Workload Group Stats: Reduced memory grants/sec (instances: ETL and User Queries).

 Run the RunDWQueries.cmd command file in the D:\Labfiles\Lab10\Starter folder and observe the
counters in performance monitor.

 With the RunDWQueries.cmd command file still running, run the RunETL.cmd command file to
simulate ETL activity, and observe the counters in performance monitor.

 Note that the CPU control effect % for both workloads increases as Resource Governor prioritizes
CPU resources for the ETL workload.

 Task 6: View Server Activity


 In SQL Server Management Studio, run the following SQL Server Agent jobs to upload data collector
statistics to the management data warehouse you created at the beginning of this exercise.

o collection_set_1_noncached_collect_and_upload
o collection_set_2_upload

o collection_set_3_upload
 View the Server Activity History data collection report, which is generated from the management data
warehouse.

 Use the interactive zoom-in icon to filter the report for shorter time periods, and note the activity that
has been recorded.

 After you finish reviewing the server activity, disable data collection for the MIA-SQLDW server
instance.

Results: At the end of this exercise, you will have a Performance Monitor report showing activity during
an ETL data load and recommendations from the Database Tuning Advisor based on a SQL Server
Profiler trace. You will also have created resource pools and workload groups for Resource Governor,
and generated server health data with the Data Collector.

Exercise 2: Monitoring and Optimizing Analysis Services


Scenario
Users have complained that a report used to show resellers in descending order of revenue volume takes
a long time to run. The query uses an MDX statement to retrieve the required data from a
multidimensional cube in Analysis Services, and you want to determine whether the problem was caused
by sub-optimal cube aggregation and storage design, or by inefficient MDX code.
The main tasks for this exercise are as follows:

1. Restart Analysis Services.

2. Create a SQL Server Profiler Trace for Analysis Services.

3. Add Analysis Services Counters to Performance Monitor.

4. Monitor an MDX Query.

5. Monitor a Revised Query.

 Task 1: Restart Analysis Services


 Use SQL Server Management Studio to connect to the MIA-SQLBI instance of Analysis Services, and
then restart the service.

 Task 2: Create a SQL Server Profiler Trace for Analysis Services


 Use SQL Server Profiler to create a trace that records the following events in the MIA-SQLBI instance
of Analysis Services:

o Progress Report Begin.

o Progress Report End.

o Query Begin.

o Query End.

o Query Cube Begin.

o Query Cube End.

o Query Subcube.

o Serialize Results Begin.

o Serialize Results End.


 Configure the trace to include the following columns:

o EventSubclass

o TextData

o ApplicationName

o Duration

o DatabaseName

o ObjectName

o SPID

o CPUTime

 Apply a column filter so that events are recorded only when the DatabaseName column value is like
AWSalesMD.

 Task 3: Add Analysis Services Counters to Performance Monitor


 Use Performance Monitor to view the Total cells calculated counter in the MSAS11: MDX object.

 Change the view in Performance Monitor to show the counter values as a report. The initial value for
the Total cells calculated counter should be zero.

 Task 4: Monitor an MDX Query


 Ensure that the SQL Server Profiler trace is running and that Performance Monitor is showing live
values for the Total cells calculated counter.

 Use SQL Server Management Studio to open and execute the MDX Query.mdx script in the
D:\Labfiles\Lab10\Starter folder.

 Stop the SQL Server Profiler trace and freeze the display in Performance Monitor.

 View the events in the SQL Server Profiler trace, and compare the amount of time spent in the
storage engine (querying subcubes) with the amount of time spent serializing the results.

 View the Total cells calculated counter value in Performance Monitor.

Note: The results indicate that the query spent significantly more time manipulating the data than
retrieving it from the storage engine, and a very large number of cells were calculated during the
execution of the query. The most appropriate way to improve the query performance is to optimize the
MDX and reduce the number of calculations being performed.

 Task 5: Monitor a Revised Query


 Restart Analysis Services to clear the cache and reset all counters.
 Restart the SQL Server Profiler trace and unfreeze the display in Performance Monitor.

 Use SQL Server Management Studio to open and execute the Revised MDX Query.mdx script in the
D:\Labfiles\Lab10\Starter folder.

 Stop the SQL Server Profiler trace and freeze the display in Performance Monitor.

 View the events in the SQL Server Profiler trace, and compare the amount of time spent in the
storage engine (querying subcubes) with the amount of time spent in the formula engine (serializing
the results).
 View the Total cells calculated counter value in Performance monitor.

Note: The revised version of the query uses a WITH SET statement to sort the resellers by revenue
before applying the RANK function. This enables the query processor to use a linear hash scan to find
each reseller’s position in the ordered list, dramatically reducing the number of calculations required
to produce the results.

Results: At the end of this exercise, you will have created a SQL Server Profiler trace and used
Performance Monitor to view Analysis Services performance data while executing an MDX query.

Exercise 3: Monitoring and Optimizing Reporting Services


Scenario
You have created Reporting Services reports that retrieve data from the data warehouse. You want to
observe the query workloads that these queries place on the data warehouse, and use caching to
minimize the number of queries executed in the data warehouse.

The main tasks for this exercise are as follows:


1. Deploy Reports.

2. Add Analysis Services Counters to Performance Monitor.

3. Create a SQL Server Profiler Trace for Report Queries.


4. Monitor Report Execution.

5. Configure Caching for a Shared Dataset.

6. Configure Caching for a Report.

 Task 1: Deploy Reports


 Use SQL Server Data Tools to deploy the Reporting Services objects in the Executive Reports.sln
solution, which is in the D:\Labfiles\Lab10\Starter folder.

 Verify that the items have been deployed to the Reports document library in the SharePoint Server
site at http://mia-sqlbi/sites/adventureworks.

 Modify the deployed AWDataWarehouse data source so that it uses the following stored Windows
credentials:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

 View the Reseller Sales report and verify that it displays sales data for the previous month.

 Task 2: Add Analysis Services Counters to Performance Monitor


 Use Performance Monitor to view the following MSRS 2011 Web Service SharePoint Mode object
counters:

o Total Cache Hits.

o Total Cache Misses.

o Total Reports Executed.

 View the counter values as a report, and note their initial values.
 Task 3: Create a SQL Server Profiler Trace for Report Queries


 Use SQL Server Profiler to create a trace that records the following events in the MIA-SQLDW instance
of the SQL Server database engine:

o RPC:Completed

o SQL:BatchCompleted

 Configure the trace to include the following columns:

o TextData

o ApplicationName

o CPU

o Duration

o SPID

o StartTime
o BinaryData

o DatabaseName

 Apply a column filter so that events are only recorded when the DatabaseName column value is like
%AWDataWarehouse%.

 Task 4: Monitor Report Execution


 In the Reports folder in the SharePoint Server site at http://mia-sqlbi/sites/adventureworks, view the
Reseller Sales report.

 In Performance Monitor, verify that the total number of report executions has increased by one.

 In SQL Server Profiler, note the Transact-SQL queries that were executed in the data warehouse. These
should include:
o A query to retrieve the default StartDate and EndDate parameter values.

o A query to retrieve the data for the report.

 Change the StartDate and EndDate parameters and view the report again.
 View the Performance Monitor counter values and SQL Server Profiler events, and then stop the SQL
Server Profiler trace, but keep it open so you can restart it later.

 Task 5: Configure Caching for a Shared Dataset


 Modify the caching options of the Last Month dataset:
o Cache the dataset.

o Refresh the cache on a custom schedule as 12:00 AM on the first day of every month.

 Restart the SQL Server profiler trace.

 View the Reseller Sales report with the default parameter values and note the counter values in
Performance Monitor and the events in the SQL Server Profiler trace. These should be the same as
before.

 Change the StartDate and EndDate parameters and view the report again.

 View the Performance Monitor counter values and SQL Server Profiler events. This time, the number
of cache hits should have increased, and the query to retrieve the default parameter values was not
executed.
 Stop the SQL Server Profiler trace, but keep it open so you can restart it later.

 Task 6: Configure Caching for a Report


 Modify the processing options of the Reseller Sales report and configure it to use cached data.

o Refresh the cached report data after 30 minutes.

 Restart the SQL Server profiler trace.

 View the Reseller Sales report with the default parameter values and note the counter values in
Performance Monitor and the events in the SQL Server Profiler trace.

o There should be a cache miss (because the report is configured to be cached, but this is the first
time it has been executed since caching was configured) and the query to retrieve the report
data should have been executed in the data warehouse.

 Change the StartDate and EndDate parameters and view the report again.

 View the Performance Monitor counter values and SQL Server Profiler events.
o There is another cache miss and the query is executed in the data warehouse, because the report
that was cached during the previous execution used different parameter values.

 View the Reseller Sales report with the default parameter values and note the counter values in
Performance Monitor and the events in the SQL Server Profiler trace.

o This time, the report had previously been cached with the requested parameter values, so the
cached copy was rendered and no query was executed in the data warehouse.
 Change the StartDate and EndDate parameters to the same values you used previously and view the
report again.

 View the Performance Monitor counter values and SQL Server Profiler events.
o The cached copy of the report that was generated the last time you used these parameter values
is rendered, so no query was executed in the data warehouse.

 Stop the SQL Server Profiler trace and close all windows when you have finished.

Results: At the end of this exercise, you will have deployed Reporting Services items to a SharePoint
Server document library, and configured caching for a dataset and a report.

Question: How might the classifier function you would create to prioritize ETL workloads in
a real solution differ from the one used in the lab?
Module Review and Takeaways


Module Review
This module described considerations and guidelines for monitoring and optimizing the components of a
BI solution. By applying the information in this module, you can create a performance baseline for your BI
solution and use it to identify performance degradation. You can then troubleshoot performance
problems and determine possible causes and resolutions.

Question: In this module, you have considered the components of SQL Server that must be
monitored and optimized in a BI solution. What other elements of the solution should you
monitor and troubleshoot in the event of performance problems?
Module 11
Operating a BI Solution
Contents:
Module Overview 11-1

Lesson 1: Overview of BI Operations 11-2

Lesson 2: ETL Operations 11-6

Lesson 3: Data Warehouse Operations 11-11

Lesson 4: Analysis Services Operations 11-16

Lesson 5: Reporting Services Operations 11-20

Lab: Operating a BI Solution 11-23


Module Review and Takeaways 11-27

Course Evaluation 11-28

Module Overview
Much of the emphasis when designing a business intelligence (BI) solution is on meeting the functional
and performance-related business requirements. However, you must also consider the ongoing
operational requirements for the solution, and plan suitable strategies to maintain the various elements of
the BI infrastructure. This module describes some of the main considerations for operating a BI solution.

Objectives
After completing this module, you will be able to:

 Describe core considerations for BI operations.


 Describe considerations and strategies for managing ETL operations.

 Describe considerations and strategies for managing data warehouse operations.

 Describe considerations and strategies for managing Analysis Services operations.

 Describe considerations and strategies for managing Reporting Services operations.


Lesson 1
Overview of BI Operations
BI operations should be considered at the very beginning of a BI project so that you design a solution that
can be maintained within the constraints of the business and IT environment in which it will be deployed.
This lesson discusses some core considerations for planning BI operations.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe planning considerations for BI operations.

 Describe considerations for automating operational tasks.

 Describe key drivers for scheduling BI operations.

Planning for BI Operations


Like the rest of the BI solution, planning for
operations and maintenance is driven by the
business requirements. As well as gathering
functional requirements and expectations about
performance, you should gather operational
requirements from the IT personnel who will be
required to support the system in production and
use them to plan operational tasks and schedules.

Often operational tasks and procedures are


documented in a run book or operations manual,
and you should consider creating such a
document for the BI solution. The operations
manual should include, at a minimum, the following:

 Descriptions and procedures for maintenance tasks that need to be performed in each subsystem of
the BI solution.
 Scheduling and dependency information about the order and frequency with which tasks must be
performed.

 Details of the security groups and permissions required to perform maintenance tasks.

 Descriptions of operators and notifications used to track the success or failure of automated tasks.

 Locations of log files and other troubleshooting information.

 Detailed procedures for disaster recovery scenarios.

You should compile the operations manual as the BI solution is designed and implemented, and ensure
that it is kept up to date when the solution is deployed into production.
Automating Operational Tasks


Although all of the tasks required to maintain and
operate a BI solution can be performed
interactively, in most cases you should plan to
automate regular maintenance tasks. You should
use the operations manual to document the
automated tasks and schedules as well as any
interactive steps required for any unscheduled
interventions to handle exceptions to normal
processing.

Options for automating maintenance tasks


include:

 Microsoft System Center. Microsoft System


Center 2012 is a platform for provisioning and managing IT infrastructure and application services in
a public or private cloud datacenter. With System Center technologies, you can automate standard
server operations such as applying software updates, enforcing configuration policies, and running
maintenance tasks.

 SQL Server Agent jobs. The SQL Server Agent is a task and notification automation engine for
managing one or more SQL Server instances in a datacenter. With the SQL Server Agent, you can
schedule database maintenance jobs that consist of multiple steps, and alert operators of their
success or failure.

 SQL Server Integration Services (SSIS) packages. Although SSIS packages are generally regarded as
data flow solutions for extract, transform, and load (ETL) processes, they can also include a wide range
of control flow tasks to perform maintenance and configuration operations.

 SharePoint Server timer jobs. If your BI solution includes SharePoint Server, operational tasks
required to maintain the SharePoint Server environment and refresh PowerPivot data can be
implemented as timer jobs that are defined in SharePoint Central Administration.

 The Windows task scheduler. If your BI solution includes a custom task such as a PowerShell script
or command line call to the bulk copy program (BCP), you can use the Windows task scheduler to
automate this task if none of the previously listed solutions is suitable.

You should note that some tasks can be performed by using more than one of the listed options. For
example, you can use a SQL Server Agent job to process an Analysis Services cube by including an
Analysis Services Command step that runs an XMLA command. Alternatively, you could run the XMLA
command by scheduling a PowerShell script, or by creating an SSIS package that includes the Analysis
Services Processing task in a control flow. It is also common to combine the technologies, for example
by creating a SQL Server Agent job that runs an SSIS package on a scheduled basis.
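
For example, the following Transact-SQL sketch shows how a SQL Server Agent job, a job step, and a monthly schedule can be created by using the msdb stored procedures. The job name, database name, and maintenance command are illustrative and are not part of any specific solution.

USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'DW Monthly Maintenance';
EXEC dbo.sp_add_jobstep @job_name = N'DW Monthly Maintenance',
    @step_name = N'Update statistics',
    @subsystem = N'TSQL',
    @database_name = N'AWDataWarehouse',
    @command = N'EXEC sp_updatestats;';
EXEC dbo.sp_add_jobschedule @job_name = N'DW Monthly Maintenance',
    @name = N'First day of each month',
    @freq_type = 16,              -- monthly
    @freq_interval = 1,           -- on day 1 of the month
    @active_start_time = 000000;  -- at midnight
EXEC dbo.sp_add_jobserver @job_name = N'DW Monthly Maintenance';

In practice, additional steps (such as a step that runs an SSIS package or an Analysis Services command) would typically be added to the same job so that the whole sequence runs under a single schedule.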

Additional Reading: For more information about automating database maintenance tasks
with the SQL Server Agent, attend Course 10775A: Administering Microsoft SQL Server 2012
Databases.

Considerations for Scheduling Operational Tasks


As well as deciding on the specific tool you will
use to automate tasks, you must consider the
schedule for these tasks and determine how
frequently, in what order, and at what time the
tasks should run. In most BI solutions, the primary
driver for operational tasks is the flow of data
from source systems to the data warehouse. This,
more than any other factor, affects the need to
perform operational tasks. Key considerations for
planning operational tasks that are driven by this
data flow include:

 Data latency tolerance. One of the most


important business requirements to determine at the beginning of a BI project is the tolerance for
data latency. In other words, how up to date does the data in the reports delivered by the BI solution
need to be? This requirement determines the frequency for ETL operations to transfer source data to
the data warehouse, which in turn triggers other operations such as updating indexes and statistics,
processing data models, and backing up databases that contain new data.

 Data flow dependencies. As the previous bullet point indicates, a data flow-driven BI solution
creates operational dependencies between the steps in the data flow. This operational flow
determines the order in which tasks should be performed. For example, an ETL process might load a
large volume of data into a data warehouse, and in doing so invalidate the data distribution statistics
for tables and cause fragmentation of indexes; and so create a requirement for index and statistics
maintenance tasks. Additionally, after a load the data warehouse contains a significant volume of new
data that is not reflected in data models and cached reports, so data models need to be processed
and cached reports refreshed. Finally, the new data in the data warehouse should be backed up to
ensure recoverability in the event of a disaster. There would be little point in scheduling a data model
process to occur before the data warehouse load operation, or in refreshing a cached report that
uses data from a data model before the model is processed.

 Operational windows. Another constraint that you must consider when planning operational tasks is
the period of time in which the operations must be completed. For example, a business requirement
might be that all reporting data is updated each month, with the previous month’s data being
reflected in reports by 9.00 am on the first of the month. The ETL process to transfer the previous
month’s data might start at midnight on the first of each month and take several hours, and there
may not be sufficient time to update all indexes and statistics, perform a full process of all data
models, refresh all cached reports, and backup the data warehouse before the 9.00 am deadline. If
the volume of data in your ETL solution results in a requirement for a larger operational window than
the business can support, you must consider alternative designs for the operational schedule. For
example, you could consider the following options:

o Transfer new data into the data warehouse incrementally each day instead of in a single load on
the first of the month, but do not reprocess data models and refresh cached reports until the
start of the new month.

o Partition the data model and perform an incremental process of modified dimensions and
measure group partitions instead of processing the entire data model.

o Delay index and statistics maintenance until a later date in the month, and accept that query
performance might be degraded until it has been performed.

o Delay backing up the database until later in the month, and retain data that was staged by the ETL
process until the backup has been performed to support recovery in the event of a database
failure.

Lesson 2
ETL Operations
The ETL process drives the ongoing operations of the entire BI solution, and includes some specific tasks
that must be performed during the data flow from source systems to the data warehouse. The tasks that
perform the data flow are typically implemented as SSIS packages, which can be deployed individually to
the file system or to the msdb database on a SQL Server instance, or as a project to an SSIS catalog.

Common operational tasks related to the ETL process include:

 Deploying and configuring SSIS packages.

 Running SSIS packages to execute ETL processes.

 Reviewing SSIS package execution reports and other ETL logs.

 Examining redirected data that failed to load.

 Backing up the SSIS catalog database.


This lesson discusses considerations for planning these tasks.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for deploying SSIS packages.

 Plan the use of environments and variables in an SSIS project.


 Schedule package execution.

 Describe the information available in package execution reports.

 Back up the SSIS Catalog database.

Deploying and Configuring Packages


You can deploy SSIS packages to the file system or
to the msdb database in a SQL Server instance.
This method of deployment is known as the
package deployment model, and until the release
of SQL Server 2012 was the only deployment
option available. SQL Server 2012 introduced a
project deployment model in which a project
containing multiple related SSIS packages can be
deployed and managed as a unit in an SSIS
catalog. The package deployment model is
supported for backward compatibility, but new
SSIS solutions should generally be based on the
project deployment model.

The following table lists the key differences between the package deployment model and the project
deployment model.

Unit of deployment
 Package deployment model: Individual packages.
 Project deployment model: The project.

Storage
 Package deployment model: Packages and all associated files can be copied to the file system of a local or remote computer, or packages can be deployed to the msdb database of an instance of SQL Server.
 Project deployment model: A project is deployed to the SSIS catalog, or a folder within the catalog, of an instance of SQL Server.

Dynamic configuration
 Package deployment model: Property values are stored in individual package configurations and assigned at run time.
 Project deployment model: Environment variables in the SSIS catalog are mapped to project-level parameters and connection managers.

Compiled format
 Package deployment model: Packages and associated resources are each stored as single files in the file system, so the entire project might comprise many files.
 Project deployment model: The entire project is compiled as a single file (with an .ispac extension).

Troubleshooting
 Package deployment model: To log events, you must add a log provider to the package and configure logging for each package individually.
 Project deployment model: Events are automatically logged and saved to the catalog. These events can then be displayed with views such as catalog.executions and catalog.event_messages.

Environments and Variables


When you deploy a project to an SSIS catalog, you
can create one or more environments and
associate them with the project. Each environment
represents an execution context for the packages
in the project, and can include multiple variables
that can be mapped to project-level parameters or
connection manager properties. In this way, you
can set specific configuration values to be used in
different execution contexts.

For example, an SSIS project might be used to


transfer data from text files to a database. You can
deploy an SSIS project and create an environment
named Test and a second environment named Production. In each environment, you could define
variables named SourceFolder and DBServer. In the Test environment, you might assign the values
“\\TestSrv1\TestData” and “TestDBSrv” to the SourceFolder and DBServer variables respectively; and in
the Production environment, you might assign the values “\\AccountsSrv1\Data” and “DWSrv”. You can
then associate the Test and Production environment with the project; and map the SourceFolder
variable to a parameter that is used to determine the location of the data files to be loaded, and the
DBServer variable to the ServerName property of the connection manager used to connect to the
database.

When you execute the project packages and specify the Test environment, the data is loaded from
\\TestSrv1\TestData into a database on the TestDBSrv server. Executing the same packages in the
Production environment results in the data being loaded from \\AccountsSrv1\Data to a database in the
DWSrv server.
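
Although environments and variables are typically created in SQL Server Management Studio, the SSIS catalog also exposes stored procedures for the same operations, which is useful when you want to script the configuration. The following sketch recreates part of the Production environment described above; the catalog folder name, variable name, and values are illustrative.

USE SSISDB;
GO
EXEC catalog.create_environment
    @folder_name = N'DW ETL',
    @environment_name = N'Production',
    @environment_description = N'Production execution context';
EXEC catalog.create_environment_variable
    @folder_name = N'DW ETL',
    @environment_name = N'Production',
    @variable_name = N'DBServer',
    @data_type = N'String',
    @sensitive = 0,
    @value = N'DWSrv',
    @description = N'SQL Server instance for the data warehouse';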

Considerations for Package Execution


You can use the Windows task scheduler to run
the DTExec command line tool and execute an
SSIS package. More commonly, you can execute a
package in a step of a SQL Server Agent job.
When you schedule execution of a package, you
must consider the following:

 When the package should be executed. As


discussed in the previous lesson, the package
execution should be scheduled in accordance
with the data refresh frequency, operational
task order, and operational window
constraints that are driven by the business
requirements.

 The user account context in which the package should run. When a package will run on a
scheduled basis, it must use an appropriate user context that has the necessary permissions to
perform the tasks in the package and any child packages that it executes. When scheduling SSIS
package execution as a SQL Server Agent job step, you can use the SQL Server Agent service account
or a proxy account that has the required permissions. When planning package execution context, you
should apply the principle of “least privilege” and use an account that has the required permissions
but no more.

 The environment or configuration that should be applied. As described in the previous topic, you
can use environments to apply dynamic configuration values to the packages in a deployed project. If
a package is deployed in the package deployment model, you can use configurations to achieve a
similar abstraction of values for settings. Whichever approach is used, you must know which
environment or configuration to apply when executing the package automatically.

 The operators who should be notified of the outcome of the execution. When a package is run
on a scheduled basis, it is typically executed in an unattended environment with nobody to observe
its success or failure. You should therefore plan to notify operators by email of the package execution
outcome. In some cases, the package itself might include control flow logic that uses the Send Mail
task to notify operators of specific task outcomes, and these can be supplemented with operator
notifications from the SQL Server Agent.
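
For packages deployed in the project deployment model, one way to start an execution with a specific environment applied is to call the SSIS catalog stored procedures, for example from a Transact-SQL job step. The following sketch shows the general pattern; the folder, project, and package names and the environment reference ID are illustrative and would need to match your own deployment (you can query catalog.environment_references to find reference IDs).

DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
    @folder_name = N'DW ETL',
    @project_name = N'LoadPartition',
    @package_name = N'LoadDW.dtsx',
    @use32bitruntime = 0,
    @reference_id = 1,    -- ID of the environment reference (for example, Test or Production)
    @execution_id = @execution_id OUTPUT;
EXEC SSISDB.catalog.start_execution @execution_id;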

Package Execution Reports


Most SSIS packages include logging functionality
to record key events that occur during execution,
and your operations and maintenance plan should
include reviewing these logs to ensure that all
tasks have been performed successfully and
monitor package performance. Additionally, when
deployed in the project deployment model,
packages record execution events and
performance metrics in the SSIS catalog, and you
can use the reports provided by the SSIS catalog
to view a broad range of information about
package execution, including the events that
occurred, the duration of each event, the parameter values that were used, and a performance
comparison with previous executions of the same package.

The SSIS catalog provides the following reports:

 Integration Services Dashboard. This report provides a central summary that shows details of
package executions. For each package execution listed in this report, you can drill into three
subreports: Overview, All Messages, and Execution Performance.

 All Executions. This report provides details of all package executions on the server, and can be
filtered to show executions within a specified date range.

 All Connections. This report shows details of all connections that have been used in package
executions, including connection strings and whether the connection failed or succeeded.

 All Operations. This report shows details of all operations that have been performed on the server,
including package deployments, executions, and other administrative operations.

 All Validations. This report shows details of all validations that SSIS has performed for packages.

Note: In addition to the standard reports listed here, you can create custom reports that
retrieve information from dynamic management views, and publish the .rdl file for your report to
the SSISDB node under the Integration Services Catalog node in SQL Server Management
Studio. For information about creating and publishing custom SQL Server Management Studio
reports, see SQL Server Books Online.
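
For example, a custom report or an ad hoc query might join the catalog views to list error messages for failed executions. The following query is a simple sketch of this approach (in the catalog views, a status value of 4 indicates a failed execution and a message_type value of 120 indicates an error message):

USE SSISDB;
GO
SELECT e.execution_id, e.folder_name, e.project_name, e.package_name,
       e.start_time, m.message_time, m.message
FROM catalog.executions AS e
JOIN catalog.event_messages AS m
    ON m.operation_id = e.execution_id
WHERE e.status = 4
  AND m.message_type = 120
ORDER BY e.start_time DESC;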

Backing Up the SSIS Catalog


The SSIS catalog is a SQL Server database named
SSISDB that contains deployed packages,
configuration settings, and logged data for
reports. To ensure recoverability in the event of a
database failure, you should include the SSIS
catalog database in your backup strategy. The
database can be backed up like any other SQL
Server database, however there are some
additional considerations for ensuring
recoverability of the SSIS catalog.

The SSIS catalog contains sensitive data, such as


credentials for data sources. Consequently, the
database is encrypted using a master key. As a part of the backup strategy for the SSISDB database, you
should back up the master key and specify a password that must be provided when the key is restored.
You can accomplish this with the BACKUP MASTER KEY statement as shown in the following code sample.

USE SSISDB
GO
BACKUP MASTER KEY TO FILE = 'C:\Keys\SSISDBKey'
ENCRYPTION BY PASSWORD = 'Pa$$w0rd';

As well as backing up the master key, you should generate Transact-SQL scripts to recreate the following
objects that are used by the SSIS catalog in case you need to restore the catalog to a new server.

 The ##MS_SSISServerCleanupJobLogin## login.

 The dbo.sp_ssis_startup stored procedure.



 The SSIS Server Maintenance Job SQL Server Agent job.

Additional Reading: For step by step instructions to restore the SSIS catalog on a new
server, see “Backup, Restore, and Move the SSIS Catalog” in SQL Server Books Online.

Lesson 3
Data Warehouse Operations
The data warehouse in a BI solution is a database, and requires the same operational maintenance as any
database. Unlike most business application databases, the data warehouse is generally used for query
operations with no user-driven data modifications. However, the ETL process periodically inserts and
updates large numbers of rows.

Common data warehouse operations include:

 Reorganizing or rebuilding indexes to reduce fragmentation caused by ETL data loads.

 Updating data distribution statistics to ensure that the query optimizer can choose the best query
plans.

 Managing partitioned tables as new data is inserted or to consolidate and archive old data.

 Backing up the database as part of a disaster recovery plan.

Additionally, you must monitor database server health and disk usage, extending filegroups as
necessary to avoid running out of disk space.

Additional Reading: For more information about monitoring database server health,
attend Course 10775A: Administering Microsoft SQL Server 2012 Databases.

This lesson describes considerations for common data warehouse operations.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for managing indexes.


 Describe considerations for managing statistics.

 Describe considerations for managing partitions.

 Describe considerations for designing a data warehouse backup strategy.

Managing Indexes
As data is added to the data warehouse, the
indexes you have created on tables can become
fragmented. This fragmentation can be
detrimental to query performance, and must be
reduced through periodic reorganizing or
rebuilding of indexes. Depending on the
technique used to load the tables in the data
warehouse, some indexes may be dropped and
recreated by the ETL process, thereby reducing the
likelihood of fragmentation. For example, a
columnstore index on a partitioned table that is
loaded using the SWITCH statement is created for
each new partition. Other indexes, however, can suffer from fragmentation and should be maintained
periodically.

Checking for fragmentation


For a data warehouse with a small volume of data and a large operational window, you can simply
rebuild every index after each load. However, in most solutions this approach is both unnecessary and
impractical. Before deciding whether or not to rebuild an index, you can examine it to detect the level of
fragmentation by querying the sys.dm_db_index_physical_stats system function. The results returned by
this function include the following columns:

 avg_fragmentation_in_percent. The percent of pages in the index that are out of sequential order.

 fragment_count. The number of fragments containing physically consecutive leaf pages in the index.

 avg_fragment_size_in_pages. The average number of pages in a fragment.
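
For example, the following query returns these columns for the indexes on a single fact table, using the SAMPLED scan mode to limit overhead. The database and table names are illustrative.

USE AWDataWarehouse;
GO
SELECT i.name AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.fragment_count,
       ps.avg_fragment_size_in_pages
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.FactInternetSales'), NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id;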

Managing fragmentation
Depending on the level of fragmentation detected by the sys.dm_db_index_physical_stats system
function, you can reorganize or rebuild the index. Reorganizing an index physically reorders the leaf-level
pages of clustered and nonclustered indexes, and is usually faster and less resource-intensive than
performing a full rebuild of the index. An index is always kept online while being reorganized. If the index
is highly fragmented, you can rebuild it. Depending on the specific index, you may be able to keep it
online during the rebuild operation, but in some cases indexes must be taken offline to be rebuilt.

You must assess the resource and availability impact of reorganizing or rebuilding indexes against the
performance degradation caused by fragmentation and decide what action to take. The specific
thresholds for action will vary between solutions, but the following guidelines provide a good starting
point for evaluating your own requirements.
 If the avg_fragmentation_in_percent value is between 5 percent and 30 percent, use the ALTER
INDEX REORGANIZE statement to reorganize the index.

 If the avg_fragmentation_in_percent value is greater than 30 percent, use the ALTER INDEX
REBUILD statement. Where possible, use the WITH (ONLINE = ON) clause to perform the rebuild
without taking the index offline.
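
For example, the following statements show both actions for a hypothetical index; the index and table names are illustrative.

-- Low to moderate fragmentation: reorganize (always an online operation)
ALTER INDEX IX_FactInternetSales_OrderDateKey ON dbo.FactInternetSales REORGANIZE;

-- High fragmentation: rebuild, keeping the index online where the edition and index type allow it
ALTER INDEX IX_FactInternetSales_OrderDateKey ON dbo.FactInternetSales REBUILD WITH (ONLINE = ON);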

Maintaining Statistics
The SQL Server query optimizer uses statistics
about the distribution of data in a table, view, or
index to select appropriate query execution plans.
In most cases, you should rely on the
AUTO_CREATE_STATISTICS and
AUTO_UPDATE_STATISTICS settings to ensure that
SQL Server automatically creates and maintains
statistics. However, in some cases, particularly after
a large data load, you might improve query
performance by proactively updating statistics on
specific objects or across the entire data
warehouse.

Updating statistics with the UPDATE STATISTICS statement


To update statistics on an individual object, you can use the UPDATE STATISTICS statement. To create the
statistics, SQL Server can scan all of the rows in the object, or scan a sample of the table and extrapolate
the statistics from that. A full scan provides a more accurate result, but for large objects can require a
significant amount of time and resources. The sample size can be specified as an explicit number of rows
or a percentage of the total rows in the object. For example, the following Transact-SQL statement
updates the statistics on all columns and indexes in the dbo.DimProduct table by sampling 25 percent of
the rows in the table.

UPDATE STATISTICS dbo.DimProduct WITH SAMPLE 25 PERCENT;

If no SAMPLE option is specified, SQL Server calculates an appropriate sample size based on the size of
the object. You can specify a RESAMPLE option to update the statistics with the same sample settings that
were used during the previous update. Alternatively, to force a full scan of the object, you can specify the
FULLSCAN option.
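
For example, the following statement updates the same statistics by using a full scan instead of a sample.

UPDATE STATISTICS dbo.DimProduct WITH FULLSCAN;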

Updating statistics with the sp_updatestats stored procedure


As an alternative to the UPDATE STATISTICS statement, you can use the sp_updatestats system stored
procedure. This stored procedure updates statistics for all objects in the database where necessary.
Objects that have not experienced any data modifications since the last time the statistics were updated
will not be included to prevent unnecessary overhead. You can optionally specify a @resample parameter
value of ‘resample’, which has the same effect as the RESAMPLE option in the UPDATE STATISTICS
statement.

The following code sample updates the statistics for all objects as necessary in the AWDataWarehouse
database.

USE AWDataWarehouse;
GO
EXEC sp_updatestats;

Managing Partitions in the Data Warehouse


In large data warehouses, it is common to
partition fact tables based on a date key.
Partitioning tables can significantly improve the
performance of both queries and data load
operations. When the data warehouse includes
partitioned tables, you must plan for the following
partition management operations.
 Creating new partitions for data loads.
Depending on the frequency of data loads
and the partitioning interval chosen for the
table, a new partition may be required for
each ETL load cycle. For example, a data
warehouse to which data is loaded monthly might include a fact table with a partition for each month
of data. Even if the partitioning interval does not match the load frequency, there will be occasions
when a new partition must be created to support incoming data. For example, a fact table may be
loaded each week and include a partition for each month. When a data load includes data for a new
month, a new partition must be created. The creation of new partitions to support data load
operations is usually automated in an SSIS package or SQL Server Agent job that splits an empty
partition at the end of the table, and switches a staging table of new data into the new partition
created by the split.

 Consolidating partitions. As well as creating new partitions for data loads, a data warehouse might
use a different partitioning interval for older data than for current data. For example, you might
design a partitioned table to include a partition for each week of the current month, and a single
partition for each previous month. At the start of each new month, you must merge the weekly
partitions for the previous month into a single monthly partition.

 Using a sliding window to archive old data. Over time, a data warehouse can grow extremely large
with a lot of historical data. In cases where extremely old data is of little reporting or analytical value,
you may decide to archive or simply delete it. When the data is stored in a partitioned table, you can
use a sliding window archive process that uses switch, merge, and split operations to move partitions
containing old data to an archive table.
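
The exact statements depend on your partition function and scheme, but the following sketch illustrates the typical pattern for the operations described above. The function, scheme, filegroup, and table names, the partition number, and the boundary values are all illustrative.

-- Create a new empty partition at the end of the table for the incoming month
ALTER PARTITION SCHEME PS_FactSales NEXT USED FG_Current;
ALTER PARTITION FUNCTION PF_FactSales() SPLIT RANGE ('2013-01-01');

-- Switch a staging table containing the new month's data into the new partition
ALTER TABLE dbo.StageFactSales SWITCH TO dbo.FactSales PARTITION 25;

-- Consolidate older partitions by removing a boundary between them
ALTER PARTITION FUNCTION PF_FactSales() MERGE RANGE ('2012-12-01');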

Additional Reading: For information about how to automate a sliding window archival
process, see “How to Implement an Automatic Sliding Window in a Partitioned Table on SQL
Server 2005” at http://msdn.microsoft.com/en-us/library/aa964122(SQL.90).aspx.

Backing Up the Data Warehouse


Like any database, you should plan a suitable
disaster recovery strategy for a data warehouse
that includes regular backups and a tested restore
process. Most data warehouses are non-
transactional and have a recovery model of
simple or bulk_logged. This eliminates the
possibility of performing transaction log backups
and restricts your options to using a combination
of full and differential backups. Because most data
modifications to a data warehouse occur on a
regularly scheduled basis through ETL load
operations, you can schedule backups to occur
after each data load.

Planning a full database backup strategy


If your data warehouse contains only a small volume of data, and the operational window for data loads
and backup operations is long enough, you can implement a simple backup strategy that performs a full
database backup after each data load. This approach minimizes operational complexity and makes it
simple to schedule backup operations. To recover the database in the event of a failure, you only need to
restore the latest backup.

Planning a differential backup strategy


If the data warehouse is refreshed with new data frequently (for example, daily), or the operational
window is too short to support a full database backup after each load, you can perform a full database
backup at a fixed interval with a differential backup after each load. To recover the database in the event
of a failure, you must restore the last full backup followed by the most recent differential backup.

Planning a partial backup strategy


When a data warehouse is extremely large, you can use filegroups to store older, inactive data separately
from current data and configure the filegroups containing inactive data as read-only. This approach
enables you to implement a partial backup strategy in which each read-only filegroup is backed up once,
and only read/write filegroups are included in subsequent backups – significantly reducing the time taken
to perform a full or differential backup.

One of the advantages of including read-only filegroups in a partial backup strategy is that it enables you
to perform a piecemeal restore. In a piecemeal restore, you can recover read/write filegroups and make
them available to users for querying before the recovery of read-only filegroups is complete.
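
For example, the following statements sketch a partial backup sequence; the database, filegroup, and backup file names are illustrative.

-- One-time backup of a read-only filegroup that contains archived data
BACKUP DATABASE AWDataWarehouse
FILEGROUP = 'Archive2011'
TO DISK = 'D:\Backups\AWDataWarehouse_Archive2011.bak';

-- Backup of the read/write filegroups
BACKUP DATABASE AWDataWarehouse READ_WRITE_FILEGROUPS
TO DISK = 'D:\Backups\AWDataWarehouse_RW.bak';

-- Differential backup of the read/write filegroups after a subsequent data load
BACKUP DATABASE AWDataWarehouse READ_WRITE_FILEGROUPS
TO DISK = 'D:\Backups\AWDataWarehouse_RW_Diff.bak'
WITH DIFFERENTIAL;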

Demonstration: Using a Partial Backup Strategy


In this demonstration, you will see how to implement a partial backup strategy and a piecemeal restore.

Demonstration Steps
Implement a partial backup strategy

1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. In the D:\Demofiles\Mod11 folder, right-click Setup.cmd and click Run as administrator. When
prompted, to allow the program to make changes, click Yes.

3. Start SQL Server Management Studio and connect to the localhost instance of the database engine
by using Windows authentication.

4. Open Partial Backup.sql from the D:\Demofiles\Mod11 folder.

5. Select the code under the comment View filegroups and partitions, and then click Execute. This
retrieves information about the filegroups and partitions in the DemoDW database.
6. Select the code under the comment Make inactive filegroups read-only, and then click Execute.
This marks the filegroups containing partitions with inactive data as read-only.

7. Select the code under the comment Backup inactive filegroups, and then click Execute. This backs
up the read-only filegroups.

8. Select the code under the comment Backup read/write filegroups, and then click Execute. This
backs up the read/write filegroups.
9. Select the code under the comment Perform a data load (from line 45 to line 74) and then click
Execute. This performs ETL tasks to load a new row into a dimension table and a new partition in a
fact table.
10. Select the code under the comment Make loaded filegroup read-only and back it up and then
click Execute. This marks the newly loaded filegroup as read-only and backs it up.

11. Select the code under the comment Perform a differential backup of read/write filegroups and
then click Execute. This creates a differential backup of the read/write filegroups.

Perform a piecemeal restore

1. Select the code under the comment Simulate a disaster and then click Execute. This drops the
DemoDW database.

2. Select the code under the comment Restore the initial full backup with the partial option and
then click Execute. This restores the original full backup and specifies that the database will be
recovered from partial backups.

3. Select the code under the comment Restore the read/write filegroups and recover and then click
Execute. This restores the differential backup of the read/write filegroups.
4. Select the code under the comment Access read/write data and then click Execute. This queries the
dimension table, which is on the PRIMARY filegroup and has been recovered.

5. Select the code under the comment Restore the read-only filegroups (from line 118 to line 141)
and then click Execute. This restores the read-only filegroups, which were backed up individually.

6. Select the code under the comment Access read-only data and then click Execute. This queries the
fact table to verify that it has been restored.

Lesson 4
Analysis Services Operations
Analysis Services provides the data models on which business analysis and reporting can be based.

Analysis Services can be installed in multidimensional or tabular mode depending on the kind of data
model required. However, regardless of the installation mode, there are some Analysis Services
maintenance and management tasks that must be included in your operations plan. Common Analysis
Services operations include:

 Creating and managing partitions.

 Processing data models.

 Backing up Analysis Services databases.

This lesson describes the key operational tasks that you must perform to maintain an Analysis Services
server in a BI solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Describe considerations for managing partitions in a data model.


 Describe considerations for processing data models.

 Describe considerations for backing up an Analysis Services database.

Managing Partitions in a Data Model


When a data model includes a large volume of
fact data (in a measure group of the
multidimensional model or a table in a tabular
model), you can use partitions to reduce
processing overhead. For example, in a data
model that contains sales data, you could create a
partition for each month of the current year, and a
partition for each previous year. Sales data for
previous years is unlikely to change, so partitions
for previous years can be excluded from
processing after data loads. Common operational
tasks that you must plan for data model partition
management include:

 Creating new partitions as new data is loaded from the data warehouse. For example, if a data
model contains a partition for each month and the data warehouse on which it is based is loaded
monthly, a new partition must be created each month before processing the data model to add the
new data.

 Merging partitions as data becomes less active. For example, if a data model includes a partition
for each month of the current year and a single partition for each previous year, at the beginning of
each new year, the monthly partitions for the previous year must be merged.

Automating partition management


You can automate partition creation and management by creating an XMLA script to perform the
required action and executing it by using one of the following methods:

 Create an Analysis Services Command job step in a SQL Server Agent job.

 Include an Analysis Services Execute DDL Task component in an SSIS control flow.

 Use the Invoke-ASCmd cmdlet to execute the script from a PowerShell script (you must import the
sqlps and sqlascmdlets modules to use this cmdlet).

To generate the XMLA script, use SQL Server Management Studio to configure the partition and script the
action to a file. You can then modify the script as required before executing it, for example to change the
partitioning criteria when creating a new partition.

Processing Data Models


As new data is loaded into the data warehouse,
any data models based on the data warehouse
tables must be processed to include the new data.
Multidimensional models and tabular models load
and store data from source tables in different ways
to one another, and although the options for
managing processing are similar for both
models, there are some differences that you
should consider.

Processing options in a
multidimensional data model
To refresh the data in a multidimensional model,
you can process the entire database, a cube, a dimension, a measure group, or a partition. When
processing objects in a multidimensional model, you can specify the following processing modes.
 Process Default. This option detects the current processing state of the objects, and processes them
only if required.

 Process Full. This option drops the data in all objects and processes them by loading all data from
the source tables.

 Process Clear. This option removes all data from objects and leaves them empty.

 Process Data. You can use this option to process a dimension, cube, measure group, or partition by
refreshing the data in the object without creating aggregations or indexes.

 Process Add. You can use this option to process dimensions, measure groups, and partitions. For
dimensions, this option adds new members and updates dimension attribute captions and
descriptions. For measure groups and partitions, this option adds new fact data and processes only the
relevant partitions.

 Process Update. You can use this option to process a dimension by forcing a re-read of data and an
update of dimension attributes.

 Process Index. You can use this option to process cubes, dimensions, measure groups, and partitions
by rebuilding indexes and aggregations for all processed partitions. For previously unprocessed
objects, this option generates an error.

Note: If you have created structures for data mining, you can also process them. For more
information about data mining, attend Course 10778: Implementing Data Models and Reports
With SQL Server 2012.

Processing options in a tabular data model


If you need to refresh the data in a tabular model, you can process the entire database, a single table, or a
partition. When processing objects in a tabular instance of Analysis Services, you can specify one of the
following processing modes:

 Process Default. This option detects the current processing state of the objects, and processes them
only if required.

 Process Full. This option drops the data in all objects and processes them by loading all data from
the source tables.

 Process Clear. This option removes all data from objects and leaves them empty.

 Process Recalc. This option can only be used to process a database and updates and recalculates
hierarchies, relationships, and calculated columns.
 Process Data. This option can be used when processing a table or a partition, and loads data into the
object without rebuilding hierarchies or relationships or recalculating calculated columns and
measures.
 Process Defrag. This option can be used to process a table and defragment its indexes.

 Process Add. This option can be used to process a partition by incrementally loading new data into
it.

Choosing a processing mode


If your operational window supports it, the simplest approach is to perform a full process of the entire
multidimensional cube or tabular database after each data warehouse load operation. This ensures that all
data is loaded into the correct data model structures and aggregations and calculated members are
recalculated. However, for large data warehouses and data models, this approach can be extremely time-
consuming and resource-intensive. You can reduce the overhead and time required for data model
processing by:

 Processing only the dimensions and measure groups in a multidimensional model, or tables in a
tabular model for which new data has been loaded.

 Performing an incremental process that only adds new data to the model.

When evaluating either of these options, you must determine whether a partial or incremental processing
mode will maintain the full integrity of all data and aggregations in the data model.

Additional Reading: For more information about choosing a processing mode for
multidimensional data models, see “Analysis Services Processing Best Practices” at
http://msdn.microsoft.com/en-US/library/cc966525.

Automating processing
You can automate processing by creating an XMLA script to process the appropriate object and executing
it by using one of the following methods:

 Create an Analysis Services Command job step in a SQL Server Agent job.

 Include an Analysis Services Execute DDL Task component in an SSIS control flow.

 Use the Invoke-ASCmd cmdlet to execute the script from a PowerShell script (you must import the
sqlps and sqlascmdlets modules to use this cmdlet).

Alternatively, you can use the Analysis Services Processing Task component in an SSIS control flow.
However, this component is designed for multidimensional data models, and may not support all
processing options for tabular data models.

Backing Up an Analysis Services Database


To ensure recoverability in the event of an Analysis
Services server or database failure, you can back
up an Analysis Services database by using the
backup and restore functionality provided by the
product.

Considerations for Analysis Services


backup
When planning an Analysis Services backup
strategy, consider the following:

 You should back up the Analysis Services


database after any significant change, such as
processing a data model to incorporate new data in the source database.

 Backing up an Analysis Services database does not back up the source data on which it is based
(usually a data warehouse). You should implement a separate backup process for source data.

 As well as performing a backup of the Analysis Services database, you should maintain the source
projects for your data models in a source control system so that should a restore from a backup fail,
you can recover the data model by redeploying the project and processing it from the source data.

Performing a backup
The Analysis Services backup process can be invoked interactively in SQL Server Management Studio or
automated by executing an XMLA script that includes the Backup element. When you back up an
Analysis Services database, you can optionally choose to compress the backup file and encrypt it with a
password that must be supplied in order to restore the backup.

Lesson 5
Reporting Services Operations
Reporting Services provides a platform for publishing and delivering reports. This lesson describes
considerations for planning the operational tasks required to support Reporting Services in a BI solution.

Lesson Objectives
After completing this lesson, you will be able to:

 Plan management of schedules for caching and snapshots.

 Plan a backup strategy for Reporting Services databases.

Managing Scheduled Tasks


A common performance optimization technique
in a Reporting Services solution is to cache
frequently used datasets and reports, or to create
report snapshots on a periodic basis. Additionally,
you can process subscriptions to send reports to
users by email, or save them to a file share or
SharePoint document library. To perform these
tasks on a regular basis, you must assign a
schedule to the task. You can create individual
schedules for each task, or you can define shared
schedules that you can use to manage the
execution of multiple tasks.

Shared schedules make it easier to manage multiple tasks that should be performed at the same regular
interval. For example, in a reporting solution that is based on a data warehouse where new data is loaded
each month, you could use a shared schedule to perform the following tasks at the beginning of the
month:

 Refresh specific cached datasets and reports to include the new data.
 Force expiration of other cached datasets and reports so that future requests will retrieve new data.

 Create new report snapshots for month-end reporting.

 Process monthly subscriptions to send reports by email.

Considerations for schedules


When you plan to use scheduled tasks, consider the following:

 Schedules depend on the SQL Server Agent. If you stop the SQL Server Agent, no scheduled tasks will
be performed.

 Using shared schedules enables you to centrally pause, resume, and modify multiple scheduled tasks
in a single location. Using object-specific schedules entails managing each schedule individually.

 A report server uses the time zone of the computer on which it is installed, regardless of the time
zone configuration of client computers that access the report server. All schedules use the local time
of the server on which they are defined.

 If you change the time zone of a server on which Reporting Services is installed in native mode, you
must restart the Report Server service for the time zone change to take effect. When you change the
time zone of a report server, existing schedules retain the same times in the new time zone. For
example, a task that was scheduled to run at 2:00 in the old time zone will be scheduled to run at
2:00 in the new time zone.

 Time zone settings for a report server installed in SharePoint integrated mode are determined by the
SharePoint Server regional settings.

Managing Encryption keys


Reporting Services uses a symmetric encryption
key to securely store sensitive information, such as
credentials for data sources. You should back up
this key and store the backup in a secure location.
You will need to restore the key to perform any of
the following tasks:

 Change the Report Server service account or


reset its password.

 Rename the server on which the report server


is hosted.
 Migrate a report server installation or change
the report server database.

 Restore a report server after a hardware failure.

Backing up an encryption key in native mode


When a report server is deployed in native mode, you can back up the encryption key by using the
Reporting Services Configuration tool, or the rskeymgmt.exe command line utility. When you back up the
key, you must specify a password, which must be provided when the key is restored from the backup.

Backing up an encryption key in SharePoint integrated mode


When a report server is deployed in SharePoint integrated mode, you can back up the encryption key in
the Key Management section of the Manage Reporting Services Application page in SharePoint
Central Administration. Alternatively, you can create a PowerShell script to back up the key.

Backing Up Reporting Services Databases


A Reporting Services report server uses two
databases to store reports, service configuration
settings, and other data. By default, these databases
are named ReportServer and
ReportServerTempDB. The ReportServer
database is the primary application database for
the report server, and the ReportServerTempDB
database contains temporary data such as cached
reports and datasets. Your operations plan for the
BI solution should include a strategy for ensuring
the recoverability of these databases in the event
of a server or hardware failure.

Note: The database names for a specific instance may be different from the default names,
but the names will always match the same naming convention – the database for temporary
objects will always be the name of the primary database with “TempDB” appended.

Planning a recovery strategy for the ReportServer database


The ReportServer database is configured to use the full recovery model by default, and therefore
supports full, differential, and transaction log backups. You can implement a backup schedule for this
database in the same way as you would for any application database by performing a full backup at
periodic intervals interspersed with differential and transaction log backups if required to meet your
recoverability target.
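
For example, the following statements perform a full backup of the ReportServer database followed by a transaction log backup; the backup file locations are illustrative.

BACKUP DATABASE ReportServer
TO DISK = 'D:\Backups\ReportServer.bak';

BACKUP LOG ReportServer
TO DISK = 'D:\Backups\ReportServer_Log.trn';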

Planning a recovery strategy for the ReportServerTempDB database


The ReportServerTempDB database contains only temporary objects, so a database backup is not
required. In the event of failure, you can create a new database with the appropriate name and run the
CatalogTempDB.sql script in the \Program Files\Microsoft SQL Server\MSRS11.instancename\Reporting
Services\ReportServer folder.

Note: In the event of a hardware failure, after restoring the ReportServer database and
recreating the ReportServerTempDB database, you must restore the encryption key on all
report servers that use the database.

Lab: Operating a BI Solution


Scenario
You have created a BI solution for Adventure Works Cycles. Now you must deploy the solution and
automate operational tasks.

Objectives
After completing this lab, you will be able to:

 Configure SQL Server Integration Services.

 Automate data warehouse load tasks.

 Test automated tasks.

Estimated Time: 60 Minutes

Start 20467A-MIA-DC and 20467A-MIA-SQLBI, and then log on to 20467A-MIA-SQLBI as


ADVENTUREWORKS\Student with the password Pa$$w0rd.

Exercise 1: Configuring SQL Server Integration Services


Scenario
You have created an SSIS project to perform the ETL tasks in the BI solution. You plan to deploy this
project to an SSIS catalog and test it in a test environment before moving it into production.

The main tasks for this exercise are as follows:

1. Prepare the Lab Environment.

2. Create an SSIS Catalog.

3. Configure SSIS Environments.

 Task 1: Prepare the Lab Environment


 Read the lab and exercise scenarios.
 Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

 Run Setup.cmd in the D:\Labfiles\Lab11\Starter folder as Administrator.

 Task 2: Create an SSIS Catalog


 Use SQL Server Management Studio to create an SSIS catalog with a password of Pa$$w0rd in the
MIA-SQLDW instance of SQL Server.

 In the catalog, create a folder named DW ETL with the description Folder for the Adventure Works
ETL SSIS Project.

 Deploy the LoadPartition.ispac project in D:\Labfiles\Lab11\Starter to the DW ETL\Projects folder.

 Task 3: Configure SSIS Environments


 In SQL Server Management Studio, in the DW ETL folder you created earlier, create an environment
named Test.

 In the Test environment create variables with the following properties:

Name Type Description Value Sensitive

StagingServer String SQL Server for staging DB localhost Unselected

DWServer String SQL Server for data warehouse localhost Unselected

 Add a second environment named Production with the following variables:

Name Type Description Value Sensitive

StagingServer String SQL Server for staging DB MIA-SQLDW Unselected

DWServer String SQL Server for data warehouse MIA-SQLDW Unselected

 Configure the LoadPartition project to:

o Add references to the Test and Production environments.

o Set the ServerName parameter for the AWDataWarehouse connection manager to the
DWServer environment variable.
o Set the ServerName parameter for the Staging connection manager to the StagingServer
environment variable.

Results: At the end of this exercise, you will have an SSIS catalog that contains environments named
Test and Production, and you will have deployed the LoadPartition SSIS project to the SSIS catalog.

Exercise 2: Automate Data Warehouse Load Tasks


Scenario
After each data load, you want to automate data warehouse maintenance tasks to rebuild indexes and
update statistics, and process an Analysis Services cube that is based on the data warehouse. You plan to
automate the ETL process and the data warehouse maintenance tasks by using a SQL Server Agent job.

The main tasks for this exercise are as follows:

1. Create a Script to Maintain Indexes and Statistics.

2. Create an XMLA Script to Process a Cube.

3. Create a SQL Server Agent Job.

 Task 1: Create a Script to Maintain Indexes and Statistics


 Create a Transact-SQL script that rebuilds all of the indexes for the dbo.DimSalesTerritory table.

o The rebuild operations should be performed online.

o Tip: You can generate a script to perform offline rebuilds of a table’s indexes by right-clicking the
Indexes folder in Object Explorer, clicking Rebuild, and scripting the action to a new query
window. You can then edit this script to perform the rebuilds online.

 Add a statement to the end of the script that executes the sp_updatestats system stored procedure
to update all statistics.

 Save the script in the D:\Labfiles\Lab11\Starter folder.



 Task 2: Create an XMLA Script to Process a Cube


 Create an XMLA script to process the Sales cube in the AWSalesMD database on the MIA-SQLBI
instance of Analysis Services.

 You can generate an XMLA script to process a cube from the Process Cube dialog box, which you
open by connecting to Analysis Services in Object Explorer, right-clicking the cube, and clicking
Process.

 Save the script in the D:\Labfiles\Lab11\Starter folder.

 Task 3: Create a SQL Server Agent Job


 Create a new SQL Server Agent job with an appropriate name to perform the data load operations for
the data warehouse.

 Add the following steps to the job:

o A step that runs the LoadDW.dtsx Integration Services package, which is in the LoadPartition
SSIS project you deployed to the SSIS catalog on MIA-SQLDW previously. The package should be
executed in the Test environment.

o A step that executes the Transact-SQL script you created previously to rebuild indexes and
update statistics.
o A step that runs the Analysis Services command XMLA script to process the Sales cube you
created earlier in the MIA-SQLBI Analysis Services server.

 Schedule the job to run at 12:00 on the first day of every month.

Results: At the end of this exercise, you will have a SQL Server Agent job named Data Warehouse
Load.

Exercise 3: Test Automated Operations


Scenario
You have created a SQL Server Agent job to automate the ETL process and perform maintenance tasks
after a data load. Now you plan to test the job and review execution information.

The main tasks for this exercise are as follows:

1. Run the SQL Server Agent Job.

2. View Job History.

3. View SSIS Package Execution History.

 Task 1: Run the SQL Server Agent Job


 Run the SQL Server Agent job you created in the previous exercise.

 Task 2: View Job History


 View the job history of the SQL Server Agent job you ran in the previous task and verify that all three
steps were completed successfully.

 Task 3: View SSIS Package Execution History


 View the Integration Services Dashboard report for the SSISDB Integration Services catalog on
MIA-SQLDW.

o The Integration Services Dashboard report is listed among the standard reports for the SSISDB
catalog.

 Verify that the most recent execution succeeded, and view the overview report for the execution.

 In the overview report for the most recent execution, verify that the parameters used for the
AWDataWarehouse.ServerName and Staging.ServerName parameters were the values you
specified in the Test environment.

 View the messages that were logged during package execution.

 View the performance statistics for the package execution and note its duration.

Results: At the end of this exercise, you will have executed a job, reviewed job history, and reviewed SSIS
catalog reports.

Question: How might the operations solution you created in the lab have differed if the
measure groups in the cube were partitioned on the same basis as the fact tables in the
relational database?

Question: If the volume of data to be loaded and processed was significantly larger, or the
time period available for performing the ETL load was shorter, how might you change the
solution you created in the lab?

Module Review and Takeaways


Module Review
This module has described considerations for maintaining and operating a BI solution. The specific tasks
you must plan depend on the business requirements for your BI solution and the constraints of the IT
environment in which it will be deployed, but the guidance in this module should provide a starting point
for planning BI operations.

Question: As a BI specialist, your involvement in a BI solution may end when the solution is
deployed into production. How can you ensure that the IT personnel who will support the
solution are able to manage and troubleshoot the necessary operational tasks?

Course Evaluation
Your evaluation of this course will help Microsoft understand the quality of your learning experience.

Please work with your training provider to access the course evaluation form.

Microsoft will keep your answers to this survey private and confidential and will use your responses to
improve your future learning experience. Your open and honest feedback is valuable and appreciated.

Module 1: Planning a BI Solution


Lab: Planning a BI Solution
Exercise 1: Scoping a BI Solution
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab01\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Review Interview Transcripts


1. In the D:\Labfiles\Lab01\Starter folder, double-click Interveiws.docx to open it in Microsoft Word.
2. Read the interviews in the document.

X Task 3: Prioritize Business Requirements


1. Form a small group with two or three other students.

2. Discuss the interviews and identify as many business requirements as you can.
3. Open Requirements Matrix.docx in the D:\labfiles\Lab01\Starter folder.

4. Based on the available information, assess the business value and feasibility of the requirements.
Record these in Requirements Matrix.docx.

Results: At the end of this exercise, you should have created a matrix that shows the relative value and
feasibility of the business requirements for the BI solution.

Exercise 2: Determining Software Requirements


X Task 1: Suggest Software for the BI Solution
1. Still working as a small group, discuss the business requirements you have identified and suggest
Microsoft software components that could be used to meet them.

2. Open Software Requirements.docx in the D:\labfiles\Lab01\Starter folder.

3. Document your software suggestions and the rationale for your choices.

Results: At the end of this exercise, you should have a list of suggested software components for the BI
solution.

Module 2: Planning SQL Server Business Intelligence


Infrastructure
Lab: Planning SQL Server Business
Intelligence Infrastructure
Exercise 1: Planning Server Topology
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab02\Starter folder.


4. Right-click Setup.cmd, and then click Run as administrator.

5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Review the Interview Transcripts


1. In the D:\Labfiles\Lab02\Starter folder, double-click Interveiws.docx to open it in Microsoft Word.

2. Read the interviews in the document.

X Task 3: Design a Server Infrastructure for the BI Solution


1. Form a small group with two or three other students.
2. Discuss the interviews and agree on a specification for the servers that will be required and the
services that will be hosted on them. For the purposes of this exercise, you can ignore any
requirements for:

o SharePoint Server

o Data Quality Services

o Master Data Services

3. On the taskbar, click the Visio 2013 icon to open Microsoft Visio.

4. Use Visio to document your server infrastructure design. Save the file as BI Topology.vsdx in the
D:\Labfiles\Lab02\Starter folder.
5. Close Visio.

Results: At the end of this exercise, you should have a Visio diagram that documents your server
infrastructure design.

Exercise 2: Planning Data Warehouse Hardware


X Task 1: Measure Maximum Consumption Rate
1. On the taskbar, click the SQL Server Management Studio icon.

2. When prompted, connect to the database engine on the localhost instance by using Windows
authentication.

3. On the File menu, point to Open, and then click File. Browse to the D:\Labfiles\Lab02\Starter folder,
select Create Benchmark DB.sql, and then click Open.

4. Click Execute, and wait for query execution to complete.


5. On the File menu, point to Open, and then click File. In the D:\Labfiles\Lab02\Starter folder, select
Measure MCR.sql, and then click Open.

6. Click Execute, and wait for query execution to complete.

7. In the results pane, click the Messages tab.

8. Add the logical reads value for the two queries together, and then divide the result by two to find
the average.
9. Add the CPU time value for the two queries together, and then divide the result by two to find the
average. Divide the result by 1,000 to convert it from milliseconds to seconds.

10. Calculate MCR by using the following formula:


(average logical reads / average CPU time) * 8 / 1024

11. Calculate the number of cores required to support a workload with an average query size of 500 MB,
10 concurrent users, and a target response time of 20 seconds:
((500 / MCR) * 10) / 20

12. Close SQL Server Management Studio without saving any files.
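For example, using the formulas in steps 10 and 11 with purely illustrative figures (not the values you will see in the lab): if the two queries average 1,200,000 logical reads and 24,000 ms (24 seconds) of CPU time, then MCR = (1,200,000 / 24) * 8 / 1024 ≈ 390 MB per second per core, and the workload in step 11 would require ((500 / 390) * 10) / 20 ≈ 0.64 cores, which rounds up to one core.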

X Task 2: Estimate Server Hardware Requirements


1. In the D:\Labfiles\Lab02\Starter folder, double-click DW Hardware Spec.xlsx to open it in Microsoft
Excel.

2. In any blank cell, use the following formula to calculate the number of cores required for the given
workload figures:

=((B6/C3)*B7)/B8

3. Based on the results of the preceding formula, recommend the number and type of processors to
include in the data warehouse server.

4. Calculate the volume of fact data in gigabytes (estimated fact rows x bytes per row, divided by
1,000,000,000 to convert bytes to gigabytes), and add 50 GB for indexes and dimensions. Then divide
the result by 3 to allow for a 3:1 compression ratio. The resulting figure is the required data storage.
5. Add 50 GB each for log space, TempDB storage, and staging data to calculate the total data volume.

6. Assuming an annual data growth of 150 GB, calculate the required storage capacity in three years.

7. Based on the data volume and CPU requirements, suggest a suitable amount of memory for the
server.

8. In the D:\Labfiles\Lab02\Starter folder, double-click Storage Options.docx and review the available
options for storage hardware. Then, based on the storage requirements you have calculated, select a
suitable storage option for the data warehouse.

9. Record your recommendations in DW Hardware Spec.xlsx, and then close Excel and Word.
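For example, applying steps 4 to 6 with purely illustrative figures: 200,000,000 fact rows at 100 bytes per row is approximately 20 GB of fact data; adding 50 GB for indexes and dimensions gives 70 GB, which a 3:1 compression ratio reduces to roughly 24 GB of data storage. Adding 50 GB each for log space, TempDB, and staging data gives a total data volume of about 174 GB, and three years of growth at 150 GB per year raises the required capacity to around 624 GB.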

Results: After this exercise, you should have a completed worksheet that specifies the required hardware
for your data warehouse server.

Module 3: Designing a Data Warehouse


Lab A: Designing a Data Warehouse Logical
Schema
Exercise 1: Identifying Business Processes and Dimensions
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click File Explorer.

3. In File Explorer, browse to the D:\Labfiles\Lab03A\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Review the Interview Transcripts


1. In the D:\Labfiles\Lab03A\Starter folder, double-click Interveiws.docx to open it in Microsoft Word.
2. Read the interviews in the document.

X Task 3: Identify and Document Business Processes


1. Form a small group with two or three other students.

2. Discuss the interviews, and identify the business processes in Adventure Works that generate the data
required to meet the analytical and reporting requirements.

3. Prioritize the business processes by their importance to the business in terms of analytical and
reporting requirements.
4. On the taskbar, click the Excel 2013 icon to start Excel, and then open Matrix.xlsx in the
D:\Labfiles\Lab03A\Starter folder.

5. In Excel, under the heading Business Processes, enter the business processes you have identified in
descending order of priority.

X Task 4: Examine Source Data


1. On the taskbar, click the SQL Server Management Studio icon, and when prompted, connect to the
localhost instance of the database engine by using Windows authentication.

2. On Object Explorer, expand Databases, expand ResellerSales, and expand Tables. This folder
contains the tables that are defined in the ResellerSales database.

3. Right-click Database Diagrams, and then click New Database Diagram. If you are prompted to
create the required support objects, click Yes.

4. Click the first table in the list, hold Shift and click the last table in the list, and then click Add.

5. Click Close, and then view the diagram to familiarize yourself with the database schema.
6. If you want to view the data in a table, right-click the table in Object Explorer, and then click Select
Top 1000 Rows. You can then modify the Transact-SQL code (for example, by deleting the TOP 1000
clause) and re-executing the query to view specific data values.

7. Repeat steps 2 to 6 to explore the following databases:

o ProductsMDS.

o InternetSales.

o Marketing.

8. Close SQL Server Management Studio, without saving any changes.

9. Use File Explorer to view the contents of the D:\Accounts folder, and then double-click each file in this
folder to open them in Microsoft Excel.

10. After you finish examining the data, close Excel.

X Task 5: Identify and Document Dimensions


1. Open the Matrix.xlsx Excel workbook you used previously.

2. Under the heading Dimensions, next to the existing Time dimension, enter the dimensions you
believe can be supported by the data and meet the analytical and reporting requirements.

3. Indicate which dimensions relate to which business processes by entering “x” in the intersecting cell.

Results: At the end of this exercise, you will have created a matrix of business processes and dimensions.

Exercise 2: Designing Dimension Models and Data Warehouse Tables


X Task 1: Create a Sun Diagram
1. On the taskbar, click the Visio 2013 icon.

2. Create a new Visio document based on the Basic Diagram template.

3. Add a circle shape and place it at the center of the drawing. Double-click the shape and type an
appropriate fact table name for the measures generated by the highest priority business process you
documented in Matrix.xlsx. Then, hold the Ctrl key and press Enter to create a new line.

4. Type the names of each measure to be included in the fact table on a new line. After you finish, press
Enter.

5. Add a rectangle shape for each dimension that is related to the business process based on the
information you entered in Matrix.xlsx; and arrange the rectangles around the circle.

6. Double-click each rectangle and enter the name of the dimension.

7. Use the Connector tool to draw a line between each dimension rectangle and the fact table circle.

8. Use the Text tool to list the attributes and hierarchies supported by each dimension.

9. For each remaining business process in matrix.xlsx, click the Insert Page icon at the bottom of the
drawing area, and repeat steps 2 to 8 to create a dimensional model of the business process.

10. Save the Visio document as Initial Sun Diagram.vsdx in the D:\Labfiles\Lab03A\Starter folder.

X Task 2: Design Data Warehouse tables


1. Create a new Visio document based on the DataBase Crow’s Foot Notation template in the
Software and Database category.

2. Add an Entity shape for each table you want to use to implement the highest priority dimension
model.

3. In each table, add an Attribute for each column you want to define in the table.

4. Add a Relationship connector for each relationship you want to define between the tables.
5. Repeat steps 2 to 4 to add the tables you want to define for the remaining business processes.

6. Save the Visio document as DW Schema.vsdx in the D:\Labfiles\Lab03A\Starter folder.

7. Close Visio.

Results: At the end of this exercise, you will have a sun diagram showing the facts, measures,
dimensions, attributes, and hierarchies you have identified, and a database schema diagram showing
your design for dimension and fact tables.

Lab B: Designing a Data Warehouse Physical


Implementation
Exercise 1: Designing File Storage
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab03B\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.

5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Plan Data Warehouse Storage


1. Form a small group with two or three other students.

2. Use Windows Explorer to view the available logical drives (drives E, F, G, H, I, J, K, L, M, and N on MIA-
SQLBI).

3. Review your database schema design from the previous lab. If you did not complete the previous lab,
use Visio to review DW Schema.vsdx in the D:\Labfiles\Lab03B\Starter folder.
4. In the D:\Labfiles\Lab03B\Starter folder, double-click AWDataWarehouse.docx to open it in Microsoft
Word.

5. In the table under the heading Storage, document your planned usage for each logical drive. Your
plan should include:

o Data warehouse filegroups for system tables, dimension tables, and fact tables.

o Staging tables.
o Log files.

o TempDB.

o Backup files.
For more information, see the “Considerations for Database Files” topic in the “Designing a Data
Warehouse Physical Implementation” lesson.
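If it helps your discussion, the following Transact-SQL is a minimal sketch of how a filegroup layout of this kind could be implemented (the database name TestDW, the filegroup and file names, and the drive letters are illustrative assumptions, not values required by the lab):

CREATE DATABASE TestDW
ON PRIMARY                 -- system tables
    (NAME = 'testdw_sys',   FILENAME = 'E:\Data\testdw_sys.mdf'),
FILEGROUP DimensionsFG     -- dimension tables
    (NAME = 'testdw_dims',  FILENAME = 'F:\Data\testdw_dims.ndf'),
FILEGROUP FactsFG          -- fact tables and their partitions
    (NAME = 'testdw_facts', FILENAME = 'G:\Data\testdw_facts.ndf')
LOG ON
    (NAME = 'testdw_log',   FILENAME = 'L:\Logs\testdw_log.ldf');

Staging tables, TempDB files, and backup files would be placed on separate volumes in the same way.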

Results: At the end of this exercise, you should have a document that contains a table describing your
planned usage for each logical volume of the data warehouse server.

Exercise 2: Designing Warehouse Data Structures


X Task 1: Plan Partitioning
1. Still working as a small group, discuss how you might use partitioning in the data warehouse schema
you reviewed in the previous exercise.

2. If you want, you can create a test database in the localhost instance of SQL Server and use it to
experiment with partitioned table designs.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about partitioned tables and indexes.

3. In the AWDataWarehouse.docx document you edited in the previous exercise, under the heading
Partitioning, type a description of your proposed use of partitioning in the data warehouse. You
should include:

o The tables (if any) that will be partitioned.

o The partitioning columns to be used.

o The range of data to be placed on each partition.
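As a starting point for the experimentation suggested in step 2, the following Transact-SQL is a minimal sketch of a fact table partitioned on an order date key (the boundary values, table definition, and use of the PRIMARY filegroup are illustrative assumptions):

CREATE PARTITION FUNCTION pf_OrderDate (int)
    AS RANGE RIGHT FOR VALUES (20110101, 20120101, 20130101);

CREATE PARTITION SCHEME ps_OrderDate
    AS PARTITION pf_OrderDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.TestFactSales
(
    OrderDateKey int   NOT NULL,   -- partitioning column
    ProductKey   int   NOT NULL,
    SalesAmount  money NOT NULL
)
ON ps_OrderDate (OrderDateKey);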

X Task 2: Plan Indexes


1. Still working as a small group, discuss how you might use indexing in the data warehouse schema you
reviewed in the previous exercise.

2. If you want, you can create a test database in the localhost instance of SQL Server and use it to
experiment with indexes.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about indexes.
3. In the AWDataWarehouse.docx document you edited in the previous exercise, under the heading
Indexes, type a description of your proposed use of indexing in the data warehouse. You should
include:
o The tables (if any) that will be indexed.

o The columns to be indexed.

o The types of index to be created.
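As a starting point for the experimentation suggested in step 2, the following Transact-SQL is a minimal sketch of two common data warehouse indexes in SQL Server 2012: a clustered index on the fact table's date key and a nonclustered columnstore index over the frequently queried columns (the table and column names are the hypothetical ones from the partitioning sketch in the previous task). Note that while a nonclustered columnstore index exists, the table is read-only:

CREATE CLUSTERED INDEX CIX_TestFactSales_OrderDateKey
    ON dbo.TestFactSales (OrderDateKey);

CREATE NONCLUSTERED COLUMNSTORE INDEX CSX_TestFactSales
    ON dbo.TestFactSales (OrderDateKey, ProductKey, SalesAmount);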

X Task 3: Plan Compression


1. Still working as a small group, discuss how you might use compression in the data warehouse schema
you reviewed in the previous exercise.

2. If you want, you can create a test database in the localhost instance of SQL Server and use it to
experiment with compression.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about compression.

3. In the AWDataWarehouse.docx document you edited in the previous exercise, under the heading
Compression, type a description of your proposed use of compression in the data warehouse. You
should include:

o The tables and indexes (if any) that will be compressed.

o The type of compression to be used.
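As a starting point for the experimentation suggested in step 2, the following Transact-SQL is a minimal sketch that first estimates the savings from page compression and then applies it to all partitions of the hypothetical test table (if you created the nonclustered columnstore index from the previous task, disable or drop it before rebuilding the table):

EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'TestFactSales',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

ALTER TABLE dbo.TestFactSales
    REBUILD PARTITION = ALL
    WITH (DATA_COMPRESSION = PAGE);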



X Task 4: Plan Views


1. Still working as a small group, discuss how you might use views in the data warehouse schema you
reviewed in the previous exercise.

2. If you want, you can create a test database in the localhost instance of SQL Server and use it to
experiment with views.

Note: SQL Server Books Online is installed locally on the virtual machine. You can use this to review
documentation about views.

3. In the AWDataWarehouse.docx document you edited in the previous exercise, under the heading
Views, type a description of your proposed use of views in the data warehouse. You should
include:

o The views (if any) that will be created.

o The settings used for the view definitions.
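As a starting point for the experimentation suggested in step 2, the following Transact-SQL is a minimal sketch of a view created with SCHEMABINDING so that the underlying table cannot be changed in a way that silently breaks the view (the dw_views schema and the table are hypothetical; create the schema first if it does not already exist):

CREATE SCHEMA dw_views;
GO
CREATE VIEW dw_views.TestSales
WITH SCHEMABINDING
AS
SELECT OrderDateKey, ProductKey, SalesAmount
FROM dbo.TestFactSales;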

Results: At the end of this exercise, you will have a document that contains information about your
plans for partitions, indexes, compression, and views in the data warehouse.

Module 4: Designing an ETL Solution


Lab: Designing an ETL Solution
Exercise 1: Preparing for ETL Design
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab04\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Examine the Data Warehouse Schema


1. Form a small group with two or three other students.
2. In the D:\Labfiles\Lab04\Starter folder, double-click DW Schema.vsdx to open it in Microsoft Visio.

3. Click the Reseller Sales page and view the tables in the diagram. Note that the diagram indicates the
columns in the dimension and fact tables, and the slowly changing dimension (SCD) type for historical
dimension attributes.

4. Click the Internet Sales page and view the tables it contains.

5. Close Microsoft Visio without saving any changes.

6. Start SQL Server Management Studio, and when prompted, connect to the MIA-SQLDW instance of
the database engine by using Windows authentication.

7. In Object Explorer, expand Databases, expand AWDataWarehouse, and then expand Tables.
8. Expand dbo.DimCustomer, and then expand Columns. Note the columns in this table, and their
data types.

9. Repeat the previous step for the following tables:

o dbo.DimDate

o dbo.DimProduct

o dbo.DimPromotion

o dbo.DimReseller

o dbo.DimSalesperson

o dbo.DimSalesTerritory
o dbo.FactInternetSales

o dbo.FactResellerSales

10. Keep SQL Server Management Studio open for the next task.

X Task 3: Explore Data Sources


1. In SQL Server Management Studio, in Object Explorer, expand the InternetSales database, expand
Tables, and then expand the following tables and their Columns folders:

o dbo.SalesOrderHeader

o dbo.SalesOrderDetail

o dbo.Customer
o dbo.StateOrProvince

o dbo.Country

These tables provide the source data for the following tables in the data warehouse:

o dbo.FactInternetSales

o dbo.DimCustomer

Note: Total product cost for a sales order is calculated by multiplying the unit cost for each order line
item by the ordered quantity. Similarly, a sales amount is calculated by multiplying the unit price by the
quantity.
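As an illustration only, the calculation described in this note corresponds to Transact-SQL along these lines (the column names SalesOrderID, UnitCost, UnitPrice, and OrderQuantity are assumptions; substitute the actual column names you find in the SalesOrderDetail table):

SELECT
    sod.SalesOrderID,
    SUM(sod.UnitCost  * sod.OrderQuantity) AS TotalProductCost,
    SUM(sod.UnitPrice * sod.OrderQuantity) AS SalesAmount
FROM dbo.SalesOrderDetail AS sod
GROUP BY sod.SalesOrderID;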

2. In Object Explorer, expand the ResellerSales database, expand Tables, and then expand the
following tables and their Columns folders:

o dbo.SalesOrderHeader

o dbo.SalesOrderDetail

o dbo.Reseller

o dbo.BusinessType

o dbo.SalesEmployee
o dbo.SalesTerritory

o dbo.SalesRegion

o dbo.StateOrProvince
o dbo.Country

These tables provide the source data for the following tables in the data warehouse:

o dbo.FactResellerSales
o dbo.DimReseller

o dbo.DimSalesperson

o dbo.DimSalesTerritory

Note: Total cost and sales amount for reseller orders are calculated the same way as for Internet orders.
The sales territory for a sales order is determined by the sales territory where the reseller placing the order
is located, not by the sales territory assigned to the salesperson. Sales territories are often reassigned
between salespeople, but resellers stay within a single sales territory.

3. In Object Explorer, expand the Marketing database, expand Tables, and then expand the
dbo.Promotions table and its Columns folder. This table provides the source data for the
DimPromotion table in the data warehouse.

Note: The MarketingPromotion column in the SalesOrderHeader table in the InternetSales database
contains the PromotionID value from this table when an order is placed in response to a promotion.
When no promotion is associated with the order, the MarketingPromotion column contains a NULL
value.

4. In Object Explorer, expand the ProductsMDS database, expand Views, and then expand the
following views and their Columns folders.

o mdm.Product

o mdm.ProductSubcategory

o mdm.ProductCategory

These views provide the source data for the DimProduct table in the data warehouse.
Note: This database represents a master data hub for the product data. This data is replicated to the
InternetSales and ResellerSales databases, but the ProductsMDS database contains the master version
of the data.

5. Minimize SQL Server Management Studio.

Results: At the end of this exercise, you will have examined the data sources for the ETL process.

Exercise 2: Creating Source to Target Documentation


X Task 1: Review High-Level Data Flow Diagrams
1. Continue working with two or three other students.

2. In the D:\Labfiles\Lab04\Starter folder, double-click Source to Target Diagrams.vsdx to open it in
Microsoft Visio.

3. Click the DimCustomer page, and examine the data flow for the DimCustomer table, noting the
following details:

o The data flow is shown from the Customer table in the InternetSales database to the
DimCustomer table (which is in the AWDataWarehouse database).

o The steps that need to be performed during the data flow are documented next to the data flow.

o Data from the StateOrProvince and Country tables is added to the data flow during lookup
steps.

o The details of the SCD columns are shown next to the relevant steps.

4. Click the DimProduct, DimPromotion, and FactInternetSales pages, and then review the diagrams
they contain.

5. Keep Visio open.

X Task 2: Create High-Level Data Flow Diagrams


1. In the Source to Target Diagrams.vsdx Visio document, on the ribbon, on the INSERT tab, click New
Page.

2. Right-click the page tab at the bottom of the drawing area, and then click Rename. Rename the page
to DimReseller.

3. On the ribbon, on the HOME tab, use the Rectangle, Text, and Connector tools to create a high-
level data flow diagram for the DimReseller table. The diagram should include:

o Source tables.
o Steps that need to be performed during the data flow.

o SCD attributes.

4. Repeat the previous steps to create a new page named FactResellerSales that contains a diagram for
the FactResellerSales data flow.

5. Close Visio.

Note: Completed diagrams are provided in Source to Target Diagrams.vsdx in the
D:\Labfiles\Lab04\Solution folder.

X Task 3: Review Detailed Source to Target Maps


1. In the D:\Labfiles\Lab04\Starter folder, double-click Source To Target Mappings.xlsx to open it in
Microsoft Excel.

2. On the DimCustomer worksheet, scroll to the right to view the Data Warehouse section of the map,
and note that it contains the columns in the DimCustomer table. Each row documents a data flow
from a source column to a column in the DimCustomer table.

3. Scroll back to the left, and note that the Source section of the worksheet contains details of the
source fields that are extracted from tables in the InternetSales database.

4. Examine the Landing Zone section of the worksheet, and note that it contains details of the tables
that the source data is initially extracted to, together with any validation rules or transformations that
are applied during the extraction.

5. Examine the Staging section of the worksheet, and note that it contains details of the staging tables
that are created from the extracted data in the landing zone, together with any validation rules or
transformations that must be applied to the data.

6. Click the FactInternetSales worksheet, and note that it documents the data flow for each column in
the FactInternetSales table.
7. Keep Excel open.

X Task 4: Create a Detailed Source to Target Map


1. In the Source To Target Mappings.xlsx workbook, click the FactResellerSales worksheet and note
that the columns in the Data Warehouse section have already been entered.

2. Complete the source to target map for the FactResellerSales table. You should design a data flow in
which the source data is initially extracted into appropriately named landing zone tables and then
transformed and loaded into staging tables before being loaded into the data warehouse.
Note: A completed map is provided in Source to Target Mappings.xlsx in the D:\Labfiles\Lab04\Solution
folder.

Results: At the end of this exercise, you will have a Visio document that contains high-level data flow
diagrams and an Excel workbook that contains detailed source-to-target documentation.

Exercise 3: Using SSIS To Load a Partitioned Fact Table


X Task 1: View FactInternetSales Partitions
1. Maximize SQL Server Management Studio and open the View FactInternetSales Partitions.sql script
in the D:\Labfiles\Lab04\Starter folder.

2. Click Execute.

3. Review the results, which include the following information:

o The partition scheme and partition function used to partition the FactInternetSales table.

o The partitions in the table and the filegroups on which they are stored.
o The start and end key values for each partition.

o The number of rows in each partition.

4. Make a note of the details for the last partition in the table (which should currently contain no rows).
5. Keep SQL Server Management Studio open.

X Task 2: View Staged Internet Sales Data


1. In SQL Server Management Studio, in Object Explorer, expand the Staging database, and then
expand Tables.

2. Note that the Staging database includes the following tables (and some others):

o dbo.ExtractLog. A metadata table used to log data extractions.

o lz.InternetSalesOrderDetails. A landing zone table that contains data extracted from the
SalesOrderDetails table in the InternetSales database.

o lz.InternetSalesOrderHeader. A landing zone table that contains data extracted from the
SalesOrderHeader table in the InternetSales database.
o stg.FactInternetSales. A staging table that contains transformed data from the landing zone
tables that is ready to be loaded into the FactInternetSales table.

3. Right-click stg.FactInternetSales, and then click Select Top 1000 Rows. In the script that is
generated, delete the TOP 1000 clause, click Execute, and then note the number of rows returned by
the query (displayed at the bottom right).

4. Minimize SQL Server Management Studio.

X Task 3: Review an SSIS Package to Load FactInternetSales


1. In the D:\Labfiles\Lab04\Starter folder, double-click LoadPartition.sln to open it in Microsoft Visual
Studio.

2. In Solution Explorer, double-click the LoadFactInternetSales.dtsx package, and then click the
Control Flow tab if it is not already selected.

3. If the Variables pane is not visible, right-click the control flow design surface, and then click
Variables. Note that the package contains the following variables:

o Filegroup

o LastBoundary

o NextBoundary

o PartitionNumber

o SQL_AddConstraintAndIndex

o SQL_AddPartition

o SQL_CreateLoadTable

o SQL_SwitchPartition

4. Double-click Get Partition Info to view its editor, review the settings for the task, and then click
Cancel.
5. Repeat the previous step for the following tasks:

o Get Next Boundary.

o Transact-SQL to Add Filegroup.

o Add next filegroup.

o Transact-SQL to Create Load Table.

o Create Load Table.


6. Double-click Load Staged Data and view its data flow. Click an empty area of the data flow design
surface, press F4, and in the properties pane note that the DelayValidation property for the data
flow is set to True.

7. Click the Control Flow tab.

8. Double-click Transact-SQL to Add Constraint and Index to view its editor, review the settings for
the task, and then click Cancel.
9. Repeat the previous step for the following tasks:

o Add Constraint and Index.

o Transact-SQL to Switch Partition.


o Switch Partition.

o Drop Load Table.

10. On the Debug menu, click Start Debugging. Then, after execution completes, on the Debug menu,
click Stop Debugging and minimize Visual Studio.

11. Maximize SQL Server Management Studio and click the query editor for the View FactInternetSales
Partitions.sql script you ran earlier. Click Execute and review the results, noting that the staged rows
have been loaded into what was the last partition, and that a new empty partition has been added to
the end of the table.

12. Keep SQL Server Management Studio open.
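The package you reviewed in this task (and the one you will build in the next task) follows the partition switching load pattern. The following Transact-SQL is a minimal sketch of that general pattern with hypothetical object names; it is not the exact content of the scripts in the Code Snippets folder:

-- 1. Create an empty load table with the same structure as the fact table,
--    on the filegroup that holds the target (empty) partition.
CREATE TABLE dbo.LoadFactSales
(
    OrderDateKey int   NOT NULL,
    ProductKey   int   NOT NULL,
    SalesAmount  money NOT NULL
)
ON [FG_Latest];

-- 2. Bulk load the staged rows into the load table (the SSIS data flow performs this step).

-- 3. Add the check constraint and index that make the load table compatible with the target partition.
ALTER TABLE dbo.LoadFactSales
    ADD CONSTRAINT CK_LoadFactSales_OrderDateKey
    CHECK (OrderDateKey >= 20130101 AND OrderDateKey < 20140101);

CREATE CLUSTERED INDEX CIX_LoadFactSales
    ON dbo.LoadFactSales (OrderDateKey) ON [FG_Latest];

-- 4. Switch the load table into the empty partition, then prepare the next empty partition.
ALTER TABLE dbo.LoadFactSales SWITCH TO dbo.FactSales PARTITION 4;
ALTER PARTITION SCHEME ps_OrderDate NEXT USED [FG_Next];
ALTER PARTITION FUNCTION pf_OrderDate() SPLIT RANGE (20140101);

-- 5. Drop the now-empty load table.
DROP TABLE dbo.LoadFactSales;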

X Task 4: Create an SSIS Package to Load FactResellerSales

Note: In this task, you will create an SSIS package to load some staged data. The process is
complex and contains many steps, and the package you create does not include error handling.
If you test your package and it fails, you can re-run the Setup.cmd batch file in
D:\Labfiles\Lab04\Starter to reset the databases to the starting point before trying to resolve
the problem.

1. In SQL Server Management Studio, open View FactResellerSales Partitions.sql script from the
D:\Labfiles\Lab04\Starter folder. Click Execute, and then review the information returned about the
partitions in the FactResellerSales table.

2. In Object Explorer, right-click the stg.FactResellerSales table in the Staging database, and then click
Select Top 1000 Rows. In the script that is generated, delete the TOP 1000 clause, click Execute,
and then note the number of rows returned by the query (displayed at the bottom right).

3. In Object Explorer, expand the AWDataWarehouse database and its Tables folder. Right-click the
dbo.FactResellerSales table, point to Script Table as, point to CREATE To, and then click New
Query Editor Window.

4. In the resulting Transact-SQL code, change the table name in the first line of the CREATE TABLE
statement to [dbo].[LoadResellerSales], and then click Execute. After execution completes, right-
click the Tables folder for the AWDataWarehouse database, and then click Refresh to verify that
the dbo.LoadResellerSales table has been created.

5. Minimize SQL Server Management Studio, and maximize Visual Studio. If the SSIS Toolbox is not
visible, on the SSIS menu, click SSIS Toolbox.

6. In Solution Explorer, right-click SSIS Packages, and then click New SSIS Package. Right-click
Package1.dtsx, click Rename, and then type Load FactResellerSales.dtsx.
7. Right-click the control flow design surface, click Variables, and then click the Add Variable button in
the Variables pane to add the following variables.

o Filegroup (Data Type: String): RSFG_NNNNNNNN

o LastBoundary (Data Type: String): NNNNNNNN

o NextBoundary (Data Type: String): NNNNNNNN

o PartitionNumber (Data Type: String): 0

o SQL_AddConstraintAndIndex (Data Type: String): Copy and paste the text in
SQL_AddConstraintAndIndex.txt in the D:\Labfiles\Lab04\Starter\Code Snippets folder.

o SQL_AddPartition (Data Type: String): Copy and paste the text in SQL_AddPartition.txt in
the D:\Labfiles\Lab04\Starter\Code Snippets folder.

o SQL_CreateLoadTable (Data Type: String): Copy and paste the text in SQL_CreateLoadTable.txt
in the D:\Labfiles\Lab04\Starter\Code Snippets folder.

o SQL_SwitchPartition (Data Type: String): Copy and paste the text in SQL_SwitchPartition.txt
in the D:\Labfiles\Lab04\Starter\Code Snippets folder.

8. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface. Double-click the new task,
and in the Execute SQL Task Editor dialog box, set the following properties and click OK.

o General tab, Name: Get Partition Info

o General tab, ResultSet: Single row

o General tab, Connection: MIA-SQLDW.AWDataWarehouse

o General tab, SQLSourceType: Direct input

o General tab, SQLStatement: Click Browse and browse to the D:\Labfiles\Lab04\Starter\Code
Snippets folder. Change the file type to Text Files (*.txt), select GetPartitionInfo.txt, and then
click Open.

o Result Set tab: Click Add to add three results with the following settings:
  • Result Name: PartitionNumber, Variable Name: User::PartitionNumber
  • Result Name: Filegroup, Variable Name: User::Filegroup
  • Result Name: LastBoundary, Variable Name: User::LastBoundary

9. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Get Partition Info, and
then connect the green precedence constraint from Get Partition Info to the new task. Double-click
the new task, and in the Execute SQL Task Editor dialog box, set the following properties, and then
click OK.

o General tab, Name: Get Next Boundary

o General tab, ResultSet: Single row

o General tab, Connection: MIA-SQLDW.Staging

o General tab, SQLSourceType: Direct input

o General tab, SQLStatement: Click Browse and browse to the D:\Labfiles\Lab04\Starter\Code
Snippets folder. Change the file type to Text Files (*.txt), select Get Next Boundary.txt, and then
click Open.

o Result Set tab: Click Add to add a result with the following settings:
  • Result Name: NextBoundary, Variable Name: User::NextBoundary

10. In the SSIS Toolbox, drag Expression Task to the control flow surface, click Get Next Boundary, and
then connect the green precedence constraint from Get Next Boundary to the new task.

11. Right-click the new task, click Rename, and then rename it to Transact-SQL to Add Filegroup.
Double-click the new task, and in the Expression Builder dialog box, enter the following expression,
and then click OK.

@[User::SQL_AddPartition]= REPLACE(@[User::SQL_AddPartition], "nextboundary",


@[User::NextBoundary])

12. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Transact-SQL to Add
Filegroup, and then connect the green precedence constraint from Transact-SQL to Add Filegroup
to the new task. Double-click the new task, and in the Execute SQL Task Editor dialog box, set the
following properties, and then click OK.

Tab Property Value

General Name Add next filegroup

General ResultSet None

General Connection MIA-SQLDW.AWDataWarehouse

General SQLSourceType Variable

General SourceVariable User::SQL_AddPartition

13. In the SSIS Toolbox, drag Expression Task to the control flow surface, click Add next filegroup, and
then connect the green precedence constraint from Add next filegroup to the new task.

14. Right-click the new task, click Rename, and then rename it to Transact-SQL to Create Load Table.
Double-click the new task, and in the Expression Builder dialog box, enter the following expression,
and then click OK.

@[User::SQL_CreateLoadTable]= REPLACE(@[User::SQL_CreateLoadTable], "file_group",


@[User::Filegroup])

15. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Transact-SQL to Create
Load Table, and then connect the green precedence constraint from Transact-SQL to Create Load
Table to the new task. Double-click the new task, and in the Execute SQL Task Editor dialog box, set
the following properties, and then click OK.

o General tab, Name: Create Load Table

o General tab, ResultSet: None

o General tab, Connection: MIA-SQLDW.AWDataWarehouse

o General tab, SQLSourceType: Variable

o General tab, SourceVariable: User::SQL_CreateLoadTable

16. In the SSIS Toolbox, drag Data Flow Task to the control flow surface, click Create Load Table, and
then connect the green precedence constraint from Create Load Table to the new task.

17. Right-click the new task, click Rename, and then rename it to Load Staged Data. Double-click the
new task to view its data flow.

18. In the SSIS Toolbox, drag Source Assistant to the data flow surface, and in the Source Assistant -
Add New Source dialog box, select the SQL Server source type and the MIA-SQLDW.Staging
connection manager, and then click OK.

19. Right-click the new task, click Rename, and then rename it to Staged Reseller Sales. Double-click
the new task, and in the OLE DB Source Editor dialog box:

a. Ensure that the MIA-SQLDW.Staging connection manager is selected.


b. Change the data access mode to SQL command.

c. Click Browse, browse to the D:\Labfiles\Lab04\Starter\Code Snippets folder, change the file type
to All Files (*.*), select Staged Reseller Sales.txt, and then click Open.
d. Click OK.

20. In the SSIS Toolbox, drag Lookup to the data flow surface, click Staged Reseller Sales, and then
connect the blue data flow output from Staged Reseller Sales to the new lookup transformation.
21. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
OrderDateKey. Double-click the lookup transformation, and in the Lookup Transformation Editor
dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT DateKey, DateAltkey FROM DimDate

o Columns tab, configure the lookup columns as follows:
  1. Drag the OrderDateAltKey input column to the DateAltKey lookup column.
  2. Select the check box for the DateKey lookup column.
  3. Change the Output Alias for the DateKey column to OrderDateKey.

22. In the SSIS Toolbox, drag Lookup to the data flow surface, click Lookup OrderDateKey, and then
connect the blue data flow output from Lookup OrderDateKey to the new lookup transformation. In
the Input Output Selection dialog box, in the Output drop-down list, select Lookup Match
Output, and then click OK.

23. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
ProductKey. Double-click the lookup transformation, and in the Lookup Transformation Editor
dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT ProductKey, ProductAltKey FROM DimProduct

o Columns tab, configure the lookup columns as follows:
  1. Drag the ProductAltKey input column to the ProductAltKey lookup column.
  2. Select the check box for the ProductKey lookup column.

24. In the SSIS Toolbox, drag Lookup to the data flow surface, click Lookup ProductKey, and then
connect the blue data flow output from Lookup ProductKey to the new lookup transformation. In
the Input Output Selection dialog box, in the Output drop-down list, select Lookup Match
Output, and then click OK.

25. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
ShipDateKey. Double-click the lookup transformation, and in the Lookup Transformation Editor
dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT DateKey, DateAltkey FROM DimDate

o Columns tab, configure the lookup columns as follows:
  1. Drag the ShipDateAltKey input column to the DateAltKey lookup column.
  2. Select the check box for the DateKey lookup column.
  3. Change the Output Alias for the DateKey column to ShipDateKey.

26. In the SSIS Toolbox, drag Lookup to the data flow surface, click Lookup ShipDateKey, and then
connect the blue data flow output from Lookup ShipDateKey to the new lookup transformation. In
the Input Output Selection dialog box, in the Output drop-down list, select Lookup Match
Output, and then click OK.
27. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
ResellerKey. Double-click the lookup transformation, and in the Lookup Transformation Editor
dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT ResellerKey, ResellerAltKey FROM DimReseller

o Columns tab, configure the lookup columns as follows:
  1. Drag the ResellerAltKey input column to the ResellerAltKey lookup column.
  2. Select the check box for the ResellerKey lookup column.

28. In the SSIS Toolbox, drag Lookup to the data flow surface, click Lookup ResellerKey, and then
connect the blue data flow output from Lookup ResellerKey to the new lookup transformation. In
the Input Output Selection dialog box, in the Output drop-down list, select Lookup Match
Output, and then click OK.

29. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
SalespersonKey. Double-click the lookup transformation, and in the Lookup Transformation Editor
dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT EmployeeKey, EmployeeAltKey FROM DimSalesperson

o Columns tab, configure the lookup columns as follows:
  1. Drag the SalesPersonAltKey input column to the EmployeeAltKey lookup column.
  2. Select the check box for the EmployeeKey lookup column.
  3. Change the Output Alias for the EmployeeKey column to SalespersonKey.

30. In the SSIS Toolbox, drag Lookup to the data flow surface, click Lookup SalespersonKey, and then
connect the blue data flow output from Lookup SalespersonKey to the new lookup transformation.
In the Input Output Selection dialog box, in the Output drop-down list, select Lookup Match
Output, and then click OK.

31. Right-click the new lookup transformation, click Rename, and then rename it to Lookup
SalesTerritoryKey. Double-click the lookup transformation, and in the Lookup Transformation
Editor dialog box, set the following properties, and then click OK.

o General tab, Specify how to handle rows with no matching entries: Ignore failure

o Connection tab, OLE DB connection manager: MIA-SQLDW.AWDataWarehouse

o Connection tab, Use results of an SQL query: SELECT SalesTerritoryKey, SalesTerritoryAltKey FROM DimSalesTerritory

o Columns tab, configure the lookup columns as follows:
  1. Drag the SalesTerritoryAltKey input column to the SalesTerritoryAltKey lookup column.
  2. Select the check box for the SalesTerritoryKey lookup column.

32. In the SSIS Toolbox, drag Destination Assistant to the data flow surface, and in the Destination
Assistant - Add New Destination dialog box, select the SQL Server source type and the MIA-
SQLDW.AWDataWarehouse connection manager, and then click OK.

33. Click Lookup SalesTerritoryKey, and connect the blue data flow output from Lookup
SalesTerritoryKey to the new OLE DB destination. In the Input Output Selection dialog box, in the
Output drop-down list, select Lookup Match Output, and then click OK.

34. Right-click the new OLE DB destination, click Rename, and then rename it to Load Table. Double-
click Load Table, and in the OLE DB Destination Editor dialog box:
a. Ensure that the MIA-SQLDW.AWDataWarehouse connection manager is selected.

b. Ensure that the data access mode is Table or view – fast load.

c. In the Name of the table or the view drop-down list, select [dbo].[LoadResellerSales].
d. Select Keep nulls.

e. Click the Mappings tab, and ensure that all destination columns are mapped to identically-
named input columns.
f. Click OK.

35. Click the Control Flow tab, and then click the Load Staged Data task to select it. Press F4, and then
set the DelayValidation property of the Load Staged Data task to True.

36. Maximize SQL Server Management Studio, and in Object Explorer, right-click the
dbo.LoadResellerSales table you created earlier, and then click Delete. In the Delete Object dialog
box, ensure that only the LoadResellerSales table is listed, and then click OK. Return to Visual Studio.

37. In the SSIS Toolbox, drag Expression Task to the control flow surface, click Load Staged Data, and
then connect the green precedence constraint from Load Staged Data to the new task.

38. Right-click the new task, click Rename, and then rename it to Transact-SQL to Add Constraint and
Index. Double-click the new task, and in the Expression Builder dialog box, enter the following
expression, and then click OK.

@[User::SQL_AddConstraintAndIndex]=
REPLACE(REPLACE(@[User::SQL_AddConstraintAndIndex], "LastBoundary",
@[User::LastBoundary]), "NextBoundary", @[User::NextBoundary])

39. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Transact-SQL to Add
Constraint and Index, and then connect the green precedence constraint from Transact-SQL to
Add Constraint and Index to the new task. Double-click the new task, and in the Execute SQL Task
Editor dialog box, set the following properties, and then click OK.

Tab Property Value

General Name Add Constraint and Index

General ResultSet None

General Connection MIA-SQLDW.AWDataWarehouse

General SQLSourceType Variable

General SourceVariable User::SQL_AddConstraintAndIndex



40. In the SSIS Toolbox, drag Expression Task to the control flow surface, click Add Constraint and
Index, and then connect the green precedence constraint from Add Constraint and Index to the
new task.

41. Right-click the new task, click Rename, and then rename it to Transact-SQL to Switch Partition.
Double-click the new task, and in the Expression Builder dialog box, enter the following expression,
and then click OK.

@[User::SQL_SwitchPartition]= REPLACE(@[User::SQL_SwitchPartition],
"partitionnumber", @[User::PartitionNumber])

42. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Transact-SQL to
Switch Partition, and then connect the green precedence constraint from Transact-SQL to Switch
Partition to the new task. Double-click the new task, and in the Execute SQL Task Editor dialog box,
set the following properties, and then click OK.

Tab Property Value

General Name Switch Partition

General ResultSet None

General Connection MIA-SQLDW.AWDataWarehouse

General SQLSourceType Variable

General SourceVariable User::SQL_SwitchPartition

43. In the SSIS Toolbox, drag Execute SQL Task to the control flow surface, click Switch Partition, and
then connect the green precedence constraint from Switch Partition to the new task. Double-click
the new task, and in the Execute SQL Task Editor dialog box, set the following properties, and then
click OK.

Tab Property Value

General Name Drop Load Table

General ResultSet None

General Connection MIA-SQLDW.AWDataWarehouse

General SQLSourceType Direct input

General SQLStatement Click Browse and browse to the D:\Labfiles\Lab04\Starter\Code


Snippets folder. Change the file type to Text Files (*.txt), select
Drop Load Table.txt, and then click Open.

44. On the Debug menu, click Start Debugging. Then, after execution completes, on the Debug menu,
click Stop Debugging and close Visual Studio. Save your work if prompted.

45. Maximize SQL Server Management Studio, and then click the query editor for the View
FactResellerSales Partitions.sql script you ran earlier. Click Execute, and then review the results,
noting that the staged rows have been loaded into what was the last partition, and that a new empty
partition has been added to the end of the table.

46. Close SQL Server Management Studio without saving any changes.

Results: At the end of this exercise, you will have an SSIS package that loads data into the
FactResellerSales table by using the partition switching technique.

Module 5: Designing Analytical Data Models


Lab A: Designing Analytical Data Models
Exercise 1: Creating a Multidimensional Data Model
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer icon.

3. View the contents of the D:\Labfiles\Lab05A\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Create a Multidimensional Analysis Services Project


1. On the taskbar, click SQL Server Data Tools.
2. On the File menu, point to New, and click Project.

3. In the New Project dialog box, click Analysis Services Multidimensional and Data Mining Project,
in the Name text box, type AWSalesMD, in the Location box browse to D:\Labfiles\Lab05A\Starter,
and then click OK.

X Task 3: Create a Data Source


1. In Solution Explorer, right-click the Data Sources folder, and then click New Data Source.

2. On the Welcome to the Data Source Wizard page, click Next.


3. On the Select how to define the connection page, click New.

4. In the Connection Manager dialog box, in the Server name drop-down list, type MIA-SQLDW; in
the Log on to the server area, ensure that Use Windows Authentication is selected; in the Select
or enter a database name list, click AWDataWarehouse, and then click OK.

5. On the Select how to define the connection page, ensure that the MIA-
SQLDW.AWDataWarehouse data connection is selected, and then click Next.
6. On the Impersonation information page, select Use a specific Windows user name and
password, in the User name text box, type ADVENTUREWORKS\ServiceAcct, in the password text
box type Pa$$w0rd, and then click Next.
7. On the Completing the Wizard page, set the data source name to AW Data Warehouse, and then
click Finish.

X Task 4: Create a Data Source View


1. In Solution Explorer, right-click the Data Source Views folder, and then click New Data Source
View.

2. On the Welcome to the Data Source View Wizard page, click Next.

3. On the Select a Data Source page, verify that the AW Data Warehouse data source is selected, and
then click Next.

4. On the Name Matching page, click Next.



5. On the Select Tables and Views page, in the Available objects list, click Customer (dw_views), and
then hold down the Ctrl key and click the following objects:

o Date (dw_views).

o InternetSales (dw_views).

o Product (dw_views).
o Reseller (dw_views).

o ResellerSales (dw_views).

o Salesperson (dw_views).

o SalesTerritory (dw_views).

6. On the Select Tables and Views page, click the > button to add the selected tables to the Included
objects list, and then click Next.
7. On the Completing the Wizard page, set the name of the data source view to AW Data
Warehouse, and then click Finish.

X Task 5: Create Relationships


1. In the AW Data Warehouse.dsv Data Source View Designer, arrange the tables so that you can see
both the ResellerSales and Date tables. Then drag the OrderDateKey field from the ResellerSales
fact table to the DateKey field in the Date dimension table to create a relationship in which
OrderDateKey is the foreign key and DateKey is the primary key.
2. Repeat the previous step to create the following relationships.

Fact Table Foreign Key Dimension Table Primary Key

ResellerSales ProductKey Product ProductKey

ResellerSales ShipDateKey Date DateKey

ResellerSales ResellerKey Reseller ResellerKey

ResellerSales SalesPersonKey Salesperson EmployeeKey

ResellerSales SalesTerritoryKey SalesTerritory SalesTerritoryKey

InternetSales OrderDateKey Date DateKey

InternetSales ProductKey Product ProductKey

InternetSales ShipDateKey Date DateKey

InternetSales CustomerKey Customer CustomerKey

3. On the File menu, click Save Selected Items.

X Task 6: Create a Cube


1. In Solution Explorer, right-click the Cubes folder, and then click New Cube.

2. On the Welcome to the Cube Wizard page, click Next.

3. On the Select Creation Method page, verify that Use existing tables is selected, and then click
Next.

4. On the Select Measure Group Tables page, verify that the AW Data Warehouse data source view is
selected, click the check boxes next to InternetSales and ResellerSales, and then click Next.

5. On the Select Measures page, clear all of the check boxes other than the ones for the following
measures, and then click Next:

o Internet Sales:
• Order Quantity.
• Unit Price.
• Product Unit Cost.
• Total Product Cost.
• Sales Amount.
o Reseller Sales:
• Order Quantity - Reseller Sales.
• Unit Price - Reseller Sales.
• Product Unit Cost - Reseller Sales.
• Total Product Cost - Reseller Sales.
• Sales Amount - Reseller Sales.
6. On the Select New Dimensions page, ensure that all check boxes are selected, and then click Next.

7. On the Completing the Wizard page, in the Cube name box, type Sales, and then click Finish.

X Task 7: Configure Measures


1. In the Sales.cube Cube Designer, in the Measures pane, expand Internet Sales.

2. Right-click Order Quantity, click Rename, and then rename the measure to Internet Quantity.

3. Repeat the previous step to rename the following measures in the Internet Sales measure group:

o Unit Price: Internet Unit Price.

o Product Unit Cost: Internet Unit Cost.

o Total Product Cost: Internet Cost.

o Sales Amount: Internet Revenue.

4. Expand the Reseller Sales measure group and rename the following measures:

o Order Quantity - Reseller Sales: Reseller Quantity.

o Unit Price - Reseller Sales: Reseller Unit Price.

o Product Unit Cost - Reseller Sales: Reseller Unit Cost.

o Total Product Cost - Reseller Sales: Reseller Cost.

o Sales Amount - Reseller Sales: Reseller Revenue.

5. On the File menu, click Save Selected Items.



X Task 8: Configure Attributes


1. In Solution Explorer, under Dimensions, double-click Date.dim.

2. In the Data Source View pane, in the Date table, right-click the DateAltKey column and click New
Attribute from Column.

3. Repeat the previous step for the following columns (note that spaces are automatically added to the
attribute names to make them more readable):

o MonthName

o CalendarYear

o FiscalQuarter

o FiscalYear

4. In the Attributes pane, right-click Date Alt Key and click Rename. Then rename the attribute to
Date.

5. In the Attributes pane, click DateKey and press F4. Then in the Properties pane, set the
AttributeHierarchyVisible property to False.

6. On the File menu, click Save Selected Items.

7. Repeat steps 1 to 5 to create attributes in the following dimensions and set the
AttributeHierarchyVisible property of the key attribute in each dimension to False:

Dimension           Type     Column: Attribute

Product.dim         Regular  • ProductName: Product
                             • ProductSubcategoryName: Subcategory
                             • ProductCategoryName: Category

Customer.dim        Regular  • Name: Customer
                             • City: City
                             • StateOrProvince: State Or Province
                             • Country: Country

Reseller.dim        Regular  • BusinessType: Business Type
                             • ResellerName: Reseller
                             • City: City
                             • StateOrProvince: State Or Province
                             • Country: Country

SalesTerritory.dim  Regular  • SalesTerritoryName: Sales Territory
                             • Country: Country
                             • Region: Region

8. On the File menu, click Save All.

X Task 9: Deploy and Browse the Model


1. In Solution Explorer, right-click AWSalesMD, and then click Properties.

2. Click Deployment, and verify that the Target properties have the following values, and then click
OK:

o Server: localhost

o Database AWSalesMD

3. On the Build menu, click Deploy AWSalesMD.


If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.
If you are prompted to replace an existing database with the same name, click Yes.
4. In Solution Explorer, double-click Sales.cube, and then in the Cube Designer, click the Browser tab.

5. On the Cube menu, click Analyze in Excel. If a Microsoft Excel Security Notice dialog box is
displayed, click Enable.
6. In Excel, in the PivotTable Fields pane, under Internet Sales, select Internet Revenue, and under
Reseller Sales, select Reseller Revenue.

7. In the PivotTable Fields pane, under Product, select Category.

8. Verify that the PivotTable in Excel shows the Internet and reseller sales revenue for four product
categories, and then close Excel without saving the workbook.

9. Close Visual Studio, saving your work if prompted.

Results: At the end of this exercise, you will have a multidimensional data model named AWSalesMD.

Exercise 2: Creating a Tabular Data Model


X Task 1: Create a Tabular Analysis Services Project
1. On the taskbar, click SQL Server Data Tools.

2. On the File menu, point to New, and then click Project.

3. In the New Project dialog box, click Analysis Services Tabular Project, in the Name text box, type
AWSalesTab, in the Location box browse to D:\Labfiles\Lab05A\Starter, and then click OK.

4. If the Tabular model designer dialog box is displayed, in the Workspace server list, select
localhost\SQL2, and in the Compatibility level box, select SQL Server 2012 SP1 (1103), and then
click OK.

X Task 2: Import Tables from the Data Warehouse


1. In Solution Explorer, double-click Model.bim to open the model.

2. On the Model menu, click Import from Data Source.


3. In the Table Import Wizard, on the Connect to a Data Source page, select Microsoft SQL Server, and
then click Next.

4. On the Connect to a Microsoft SQL Server Database page, in the Server name box, type MIA-SQLDW,
ensure that Use Windows Authentication is selected, and in Database name list, select
AWDataWarehouse, and then click Next.

5. On the Impersonation Information page, in the User Name box, type


ADVENTUREWORKS\ServiceAcct, in the Password box, type Pa$$w0rd, and click Next.

6. On the Choose How to Import the Data page, ensure that Select from a list of tables and views to
choose the data to import is selected, and then click Next.
7. On the Select Tables and Views page, select the following source tables, specifying the friendly
name indicated in parentheses. Make sure you select the views in the dw_views schema, and not the
similarly named dimension and fact tables in the dbo schema.

o Customer (Customer).

o Date (Date).

o InternetSales (Internet Sales).

o Product (Product).

o Reseller (Reseller).

o ResellerSales (Reseller Sales).

o Salesperson (Salesperson).

o SalesTerritory (Sales Territory).

After you select all of the views and correct the friendly names where necessary, click Finish.
8. After the data is imported, click Close.

X Task 3: Create Relationships


1. On the Model menu, point to Model View, and click Diagram View, and arrange the tables so that
you can see both the ResellerSales and Date tables. Then drag the OrderDateKey field from the
ResellerSales fact table to the DateKey field in the Date dimension table to create a relationship in
which OrderDateKey is the foreign key, and DateKey is the primary key.

2. Repeat the previous step to create the following relationships.



Fact Table      Foreign Key        Dimension Table   Primary Key

ResellerSales   ProductKey         Product           ProductKey
ResellerSales   ShipDateKey        Date              DateKey
ResellerSales   ResellerKey        Reseller          ResellerKey
ResellerSales   SalesPersonKey     Salesperson       EmployeeKey
ResellerSales   SalesTerritoryKey  SalesTerritory    SalesTerritoryKey
InternetSales   OrderDateKey       Date              DateKey
InternetSales   ProductKey         Product           ProductKey
InternetSales   ShipDateKey        Date              DateKey
InternetSales   CustomerKey        Customer          CustomerKey

3. On the File menu, click Save Model.bim.

X Task 4: Create Measures


1. On the Model menu, point to Model View, click Data View, and then click the Internet Sales tab to
view the worksheet for the Internet Sales table.

2. If the measure grid is not visible under the table data, on the Table menu, click Show Measure Grid.

3. In the measure grid, click the cell under the OrderQuantity column. On the Column menu, point to
AutoSum, and then click Sum.
4. In the formula bar, edit the DAX formula to change the measure name to Internet Quantity, as
shown in the following code.

Internet Quantity:=SUM([OrderQuantity])

5. Repeat the previous two steps to create Sum aggregations for the following measures.

Column             Measure Name

UnitPrice          Internet Unit Price
ProductUnitCost    Internet Unit Cost
TotalProductCost   Internet Cost
SalesAmount        Internet Revenue

6. Click the Reseller Sales tab, and create the following measures using the Sum aggregation.

Column             Measure Name

OrderQuantity      Reseller Quantity
UnitPrice          Reseller Unit Price
ProductUnitCost    Reseller Unit Cost
TotalProductCost   Reseller Cost
SalesAmount        Reseller Revenue

7. On the File menu, click Save Model.bim.
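
For reference, after completing this task the measure grids should contain definitions similar to the following. The names come from the tables above, so treat this as a checking aid rather than text to paste:

// Internet Sales table:
Internet Quantity:=SUM([OrderQuantity])
Internet Unit Price:=SUM([UnitPrice])
Internet Unit Cost:=SUM([ProductUnitCost])
Internet Cost:=SUM([TotalProductCost])
Internet Revenue:=SUM([SalesAmount])

// Reseller Sales table:
Reseller Quantity:=SUM([OrderQuantity])
Reseller Unit Price:=SUM([UnitPrice])
Reseller Unit Cost:=SUM([ProductUnitCost])
Reseller Cost:=SUM([TotalProductCost])
Reseller Revenue:=SUM([SalesAmount])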

X Task 5: Configure Attributes


1. On the Model menu, point to Model View, and click Diagram View.

2. While holding the Ctrl key, click all columns in the Customer table except for the following ones:
o Name.

o City.

o StateOrProvince.
o Country.

3. Right-click any of the selected columns, and then click Hide from Client Tools. The preceding
columns, which you did not select, will remain visible as dimension attributes.

4. Right-click the Name column, click Rename, and then rename the column to Customer.

5. Right-click the StateOrProvince column, click Rename, and then rename the column to State Or
Province.
6. Repeat steps 2 to 5 as necessary to configure the columns in the following table.

Table            Visible Columns

Internet Sales   • Internet Quantity
                 • Internet Unit Price
                 • Internet Unit Cost
                 • Internet Cost
                 • Internet Revenue

Date             • DateAltKey (renamed to Date)
                 • MonthName (renamed to Month Name)
                 • CalendarYear (renamed to Calendar Year)
                 • FiscalQuarter (renamed to Fiscal Quarter)
                 • FiscalYear (renamed to Fiscal Year)

Product          • ProductName (renamed to Product)
                 • ProductSubcategoryName (renamed to Subcategory)
                 • ProductCategoryName (renamed to Category)

Reseller Sales   • Reseller Quantity
                 • Reseller Unit Price
                 • Reseller Unit Cost
                 • Reseller Cost
                 • Reseller Revenue

Reseller         • BusinessType (renamed to Business Type)
                 • ResellerName (renamed to Reseller)
                 • City
                 • StateOrProvince (renamed to State Or Province)
                 • Country

Sales Territory  • SalesTerritoryName (renamed to Sales Territory)
                 • Country
                 • Region

7. On the File menu, click Save Model.bim.

X Task 6: Deploy and Browse the Model


1. On the Project menu, click AWSalesTab Properties, and ensure that the Server property is set to
localhost\SQL2, and the Database property is set to AWSalesTab. Set the Cube name property to
Sales, and then click OK.

2. On the Build menu, click Deploy AWSalesTab. After deployment completes, click Close.

3. On the Model menu, click Analyze in Excel. When prompted, ensure that Current Windows User
and the (Default) perspective are selected, and then click OK.

4. In Excel, in the PivotTable Fields pane, under Internet Sales, select Internet Revenue, and under
Reseller Sales, select Reseller Revenue.

5. In the PivotTable Fields pane, under Product, select Category.

6. Verify that the PivotTable in Excel shows the Internet and reseller sales revenue for four product
categories, and then close Excel without saving the workbook.

7. Close Visual Studio, saving your work if prompted.

Results: At the end of this exercise, you will have a tabular data model named AWSalesTab.

Lab B: Designing Dimensions and


Hierarchies
Exercise 1: Creating Dimensions in a Multidimensional Data Model
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab05B\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.

5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Explore Dimension Hierarchies


1. In the D:\Labfiles\Lab05B\Starter folder, double-click AWSalesMD.sln to open the solution in SQL
Server Data Tools.

2. In Solution Explorer, double-click Customer.dim, and note that a hierarchy named Customers By
Geography has been created in this dimension.

3. Click the Attribute Relationships tab, and note that relationships have been defined between the
attributes in the hierarchy.

4. Return to the Dimension Structure tab, and then click the City attribute. Press F4 to view the
following properties of the selected attribute:

o AttributeHierarchyVisible. This has been set to False, so the City attribute can be browsed only
through the Customers By Geography hierarchy.

o KeyColumns. The City member of the hierarchy is uniquely defined by a combination of the
City, StateOrProvince, and Country columns.

o NameColumn. Because the attribute has multiple key columns, the NameColumn property must
be set to specify which column gives the attribute its name.

o ValueColumn. Similar to the NameColumn property, the ValueColumn property specifies which column contains the attribute value.

5. View the properties of the other attributes in the Customer dimension. Note that each attribute that
is included in the Customers By Geography hierarchy is uniquely identified by a combination of
multiple key columns, and that none of the attributes are visible―the only way to browse the
dimension is through the Customers By Geography hierarchy.

6. On the Dimension menu, click Process.

If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd. If you are prompted to replace an existing database with the same name, click
Yes.

7. In the Process Dimension – Customer dialog box, click Run, and in the Process Progress dialog
box, after processing completes, click Close. Then, in the Process Dimension – Customer dialog box,
click Close.

8. In the Dimension Designer, click the Browser tab and if necessary, click the Reconnect button. In the
Hierarchy list, ensure that Customers By Geography is selected, and expand the All member to
browse the hierarchy.

9. Examine the Reseller.dim dimension (which has two hierarchies) and the Sales Territory.dim
dimension, and note that the attributes in these dimensions have been similarly configured.

X Task 3: Create a Balanced Hierarchy


1. In Solution Explorer, double-click Product.dim.

2. Drag the Category attribute to the Hierarchies pane.

3. Drag the Subcategory attribute and drop it under the Category attribute in the hierarchy that has
been created.

4. Drag the Product attribute, and drop it under the Subcategory attribute in the hierarchy.

5. Right-click Hierarchy, click Rename, and then rename the hierarchy to Products By Category.
6. Click the Attribute Relationships tab, and in the pane at the top, right-click Product, and then click
New Attribute Relationship.

7. In the Create Attribute Relationship dialog box, under Source Attribute, ensure that Product is
selected, under Related Attribute, select Subcategory, and in the Relationship type list, select Flexible
(may change over time), and then click OK.

8. Right-click Subcategory, and then click New Attribute Relationship.


9. In the Create Attribute Relationship dialog box, under Source Attribute, ensure that Subcategory is
selected, under Related Attribute, select Category, and in the Relationship type list, select Flexible
(may change over time), and then click OK.
10. Double-click the relationship arrow between Product Key and Product, and in the Edit Attribute
Relationship dialog box, change the Relationship type to Rigid (will not change over time), and
then click OK.

11. On the Dimension Structure tab, click the Product attribute, and then press F4.

12. In the Properties pane, set the AttributeHierarchyVisible property to False.

13. In the Properties pane, click the ellipsis button for the KeyColumns property, and in the list of
Available Columns, select ProductKey, click the > button, use the up-arrow button to reorder the
columns so that ProductKey is listed at the top, and then click OK.

14. In the Properties pane, click the ellipsis button for the NameColumn property, select ProductName,
and then click OK.

15. In the Properties pane, click the ellipsis button for the ValueColumn property, select ProductName,
and then click OK.
16. Repeat the previous five steps to set the properties of the following attributes.

Attribute     Property Values

Subcategory   • AttributeHierarchyVisible: False
              • KeyColumns: ProductCategoryName, and ProductSubcategoryName (in that order)
              • NameColumn: ProductSubcategoryName
              • ValueColumn: ProductSubcategoryName

Category      • AttributeHierarchyVisible: False

17. On the File menu, click Save All.

18. On the Dimension menu, click Process.

If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.
If you are prompted to replace an existing database with the same name, click Yes.

19. In the Process Dimension – Product dialog box, click Run, and in the Process Progress dialog box,
after processing completes, click Close. Then, in the Process Dimension – Product dialog box, click
Close.

20. In the Dimension Designer, click the Browser tab and if necessary, click the Reconnect button. Then
in the Hierarchy list, ensure that Products By Category is selected, and then expand the All member
to browse the hierarchy.

X Task 4: Implement a Role-Playing Time Dimension


1. In Solution Explorer, double-click Sales.cube.

2. On the Dimension Usage tab, note that the Date dimension is listed twice – once to represent the
Order Date, and once to represent the Ship Date. Both versions of the dimension are related to the
Internet Sales and Reseller Sales measure groups by the Date Key column.

3. In Solution Explorer, double-click Date.dim.

4. Drag the Calendar Year attribute to the Hierarchies pane.


5. Drag the Month Name attribute and drop it under the Calendar Year attribute in the hierarchy that
was created.

6. Drag the Date attribute, and drop it under the Month Name attribute in the hierarchy.

7. Right-click Hierarchy, click Rename, and then rename the hierarchy to Calendar Date.

8. Drag the Fiscal Year attribute to the Hierarchies pane.

9. Drag the Fiscal Quarter attribute and drop it under the Fiscal Year attribute in the hierarchy that
was created.

10. Drag the Month Name attribute and drop it under the Fiscal Quarter attribute in the hierarchy that
was created.

11. Drag the Date attribute, and drop it under the Month Name attribute in the hierarchy.

12. Right-click Hierarchy, click Rename, and then rename the hierarchy to Fiscal Date.

13. On the Attribute Relationships tab, use the same technique you used in the previous exercise to
create the following attribute relationships.

Source Attribute   Related Attribute   Relationship Type

Date               Month Name          Rigid
Month Name         Calendar Year       Rigid
Month Name         Fiscal Quarter      Rigid
Fiscal Quarter     Fiscal Year         Rigid

14. Change the relationship type between DateKey and Date to Rigid.

15. On the Dimension Structure tab, set the following attribute properties.

Attribute        Property Values

Calendar Year    • AttributeHierarchyVisible: False

Date             • AttributeHierarchyVisible: False
                 • KeyColumns: DateKey, and DateAltKey (in that order)
                 • NameColumn: DateAltKey
                 • ValueColumn: DateAltKey

Fiscal Quarter   • AttributeHierarchyVisible: False
                 • KeyColumns: FiscalYear, and FiscalQuarter (in that order)
                 • NameColumn: FiscalQuarter
                 • ValueColumn: FiscalQuarter

Fiscal Year      • AttributeHierarchyVisible: False

Month Name       • AttributeHierarchyVisible: False
                 • KeyColumns: CalendarYear, and MonthNumber (in that order)
                 • NameColumn: MonthName
                 • ValueColumn: MonthName

16. Set the OrderBy property of the Month Name attribute to Key.

17. In the Attributes pane, click the Date dimension (at the root of the attributes). In the Properties
pane, set the Type property to Time.

18. Set the Type property of the following attributes:

o Calendar Year: Date > Calendar > Years

o Date: Date > Calendar > Date

o Fiscal Quarter: Date > Fiscal > FiscalQuarters

o Fiscal Year: Date > Fiscal > FiscalYears

19. On the File menu, click Save All.

20. On the Dimension menu, click Process.

Note If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account
with the password Pa$$w0rd. If you are prompted to replace an existing database with the same
name, click Yes.

21. In the Process Dimension – Date dialog box, click Run, and in the Process Progress dialog box,
after processing completes, click Close. Then, in the Process Dimension – Date dialog box, click
Close.

22. In the Dimension Designer, click the Browser tab and if necessary, click the Reconnect button. In the
Hierarchy list, ensure that Calendar Date is selected, and expand the All member to browse the
hierarchy.

23. In the Hierarchy list, ensure that Fiscal Date is selected, and then expand the All member to browse
the hierarchy.

X Task 5: Implement a Parent-Child Hierarchy


1. In Solution Explorer, double-click Salesperson.dim.

2. In the Data Source View pane, right-click the ParentEmployeeKey column, and then click New
Attribute from Column.

3. In the Attributes pane, click the Parent Employee Key attribute. In the Properties pane, set the
following properties:

o Name: Salesperson.
o Usage: Parent.

4. In the Attributes pane, click the Employee Key attribute. In the Properties pane, set the
AttributeHierarchyVisible property to False.

5. On the Dimension menu, click Process.

If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.
If you are prompted to replace an existing database with the same name, click Yes.

6. In the Process Dimension – Salesperson dialog box, click Run, and in the Process Progress dialog
box, after processing completes, click Close. In the Process Dimension – Salesperson dialog box,
click Close.

7. In the Dimension Designer, click the Browser tab and if necessary, click the Reconnect button. In the
Hierarchy list, ensure that Salesperson is selected, and then expand the All member to browse the
hierarchy.

X Task 6: Deploy and Browse the Model


1. On the Build menu, click Deploy AWSalesMD.

If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd. If you are prompted to replace an existing database with the same name, click
Yes.

2. In Solution Explorer, double-click Sales.cube and then in the Cube Designer, click the Browser tab.

3. On the Cube menu, click Analyze in Excel. If a Microsoft Excel Security Notice dialog box is
displayed, click Enable.

4. In Excel, in the PivotTable Fields pane, under Reseller Sales, select Reseller Revenue.

5. In the PivotTable Fields pane, under Product, select Products By Category.

6. In the PivotTable Fields pane, note that both Order Date and Ship Date dimensions are listed, and
then under Order Date, select Order Date.Calendar Date.

7. In the PivotTable Fields pane, under Salesperson, select Salesperson.



8. Browse the PivotTable in Excel and verify that the hierarchies behave as expected. Close Excel without
saving the workbook.

9. Close Visual Studio, saving your work if prompted.

Results: At the end of this exercise, you will have a multidimensional model that includes balanced
hierarchies, a role-playing dimension, and a parent-child dimension.

Exercise 2: Creating Dimensions in a Tabular Data Model


X Task 1: Explore Dimension Hierarchies
1. In the D:\Labfiles\Lab05B\Starter folder, double-click AWSalesTab.sln to open the solution in SQL
Server Data Tools. If the Tabular model designer dialog box is displayed, in the Workspace server
list, select localhost\SQL2, in the Compatibility level box, select SQL Server 2012 SP1 (1103), and
then click OK.

2. In Solution Explorer, double-click Model.bim.

3. On the Model menu, point to Process and click Process All. If the Impersonate Credentials dialog
box is displayed, enter the user name ADVENTUREWORKS\ServiceAcct and the password
Pa$$w0rd, and click OK. Then, in the Data Processing dialog box, when processing is complete, click
Close.
4. On the Model menu, point to Model View, and then click Diagram View.

5. In the Customer table, note that a hierarchy named Customers By Geography has been created.

6. Note that the Customers By Geography hierarchy contains the Country, State Or Province, City,
and Customer attributes.

7. Note that the Country, State Or Province, City, and Customer attributes in the table (not in the
hierarchy) have been hidden from client tools.
8. Note that hierarchies have also been created in the Reseller and Sales Territory tables.

X Task 2: Create a Balanced Hierarchy


1. In Diagram View, click the Product table, and then click the Maximize button in its title bar.

2. Click the Create Hierarchy button in the title bar of the maximized Product table.
3. After the new hierarchy is created, change its name to Products By Category.

4. Drag the Category, Subcategory, and Product attributes (in that order) to the Products By
Category hierarchy.
5. Click the Product attribute that is not in the hierarchy, and hold the Shift key and click the Category
attribute. Right-click the selected attributes, and then click Hide from Client Tools.

6. Click the Restore button in the title bar of the maximized Product table.
7. On the File menu, click Save All.

X Task 3: Implement a Role-Playing Time Dimension


1. In Diagram View, click the dotted line between the Reseller Sales and Date tables, and then press
Delete. When you are prompted to confirm, click Delete from Model.

2. Repeat the previous step to delete the dotted line between the Internet Sales and Date tables.

3. Right-click the title bar of the Date table, click Rename, and then rename the table to Order Date.

4. On the Model menu, click Existing Connections.

5. In the Existing Connections dialog box, ensure that SqlServer MIA-SQLDW AWDataWarehouse is
selected, and then click Open. If you are prompted for impersonation credentials, enter the password
Pa$$w0rd, and then click OK.

6. In the Table Import Wizard dialog box, on the Choose how to Import the Data page, ensure that
Select from a list of tables and views to choose the data to import is selected, and then click Next.

7. On the Select Tables and Views page, select the Date source table, which is in the dw_views
schema, and change the friendly name to Ship Date. Click Finish. After the data from the table is
imported, click Close.

8. In the new Ship Date table that has been added to the model, right-click each of the following
attributes, click Rename, and rename them as specified in the following list:

o DateAltKey: Date.

o MonthName: Month Name.

o CalendarYear: Calendar Year.


o FiscalQuarter: Fiscal Quarter.

o FiscalYear: Fiscal Year.

9. Drag the ShipDateKey attribute from the Reseller Sales table and drop it on the DateKey attribute
in the Ship Date table. Drag the ShipDateKey attribute from the Internet Sales table and drop it on
the DateKey attribute in the Ship Date table.

10. Click the Order Date table, and then click the Maximize button in its title bar.
11. Click the Create Hierarchy button in the title bar of the maximized Order Date table. When the new
hierarchy is created, change its name to Calendar Date.

12. Drag the Calendar Year, Month Name, and Date attributes (in that order) to the Calendar Date
hierarchy.

13. Repeat the previous two steps to create a second hierarchy named Fiscal Date that contains the
Fiscal Year, Fiscal Quarter, Month Name, and Date attributes.

14. Right-click each column in the table (not the attributes in the hierarchy) that is not currently hidden
from client tools, and then click Hide from Client Tools.

15. Click the Restore button in the title bar of the maximized Order Date table.
16. Click the Maximize button in the title bar of the Ship Date table, and then repeat the previous 5
steps to create Calendar Date and Fiscal Date hierarchies in the Ship Date table.

17. On the Model menu, point to Model View, click Data View, and then click the Order Date tab.
18. On the Table menu, point to Date, and click Mark as Date Table. In the Mark as Date Table dialog
box, ensure that the Date column is selected in the list, and then click OK.

19. Click the Month Name column header, and on the Column menu, point to Sort, and then click Sort
by Column.

20. In the Sort by Column dialog box, in the Sort section, in the Column list, ensure that Month Name
is selected. Then, in the By section, in the Column list, select MonthNumber, and then click OK.
21. Repeat the previous 3 steps on the Ship Date tab to mark Ship Date as a date table and sort the
Month Name column by MonthNumber.

22. On the File menu, click Save All.

X Task 4: Implement a Parent-Child Hierarchy


1. In Data View, click the Salesperson tab.

2. Scroll to the right, double-click Add Column, type Path, and press Enter.

3. With the new Path column selected, in the formula bar, enter the following DAX expression:

=PATH([EmployeeKey], [ParentEmployeeKey])

4. Scroll to the right, double-click Add Column, type Level1, and then press Enter.

5. With the new Level1 column selected, in the formula bar, enter the following DAX expression:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 1, 1))

6. Scroll to the right, double-click Add Column, type Level2, and then press Enter.

7. With the new Level2 column selected, in the formula bar, enter the following DAX expression:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 2, 1))

8. Scroll to the right, double-click Add Column, type Level3, and then press Enter.

9. With the new Level3 column selected, in the formula bar, enter the following DAX expression:

=LOOKUPVALUE ([Name], [EmployeeKey], PATHITEM ([Path], 3, 1))

10. On the Model menu, point to Model View, and then click Diagram View. Click the Salesperson
table, and then click the Maximize button in its title bar.

11. Click the Create Hierarchy button in the title bar of the maximized Salesperson table. After the new
hierarchy is created, change its name to Salesperson.

12. Drag the Level1, Level2, and Level3 attributes (in that order) to the Salesperson hierarchy.

13. Click the EmployeeKey attribute, hold the Shift key and click the Level3 attribute immediately above
the Salesperson hierarchy. Then right-click the selected attributes and click Hide from Client Tools.

14. Click the Restore button in the title bar of the maximized Salesperson table.
15. On the File menu, click Save All.
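
As a checking aid, the following comments walk through what the calculated columns from steps 2 to 9 produce for a single hypothetical row; the employee keys shown are invented for illustration only:

// Suppose a salesperson row has EmployeeKey = 290 and ParentEmployeeKey = 277,
// and employee 277 in turn reports to employee 272 at the top of the tree.
//
// Path   --> "272|277|290"   (PATH walks up the ParentEmployeeKey chain to the root)
// Level1 --> the Name of employee 272   (PATHITEM([Path], 1, 1) returns the first key)
// Level2 --> the Name of employee 277
// Level3 --> the Name of employee 290
//
// The third argument of PATHITEM (1) returns the key as an integer so that
// LOOKUPVALUE can match it against the integer EmployeeKey column. For rows
// nearer the top of the tree, the deeper Level columns return BLANK.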

X Task 5: Deploy and Browse the Model


1. On the Build menu, click Deploy AWSalesTab. After deployment completes, click Close.

2. On the Model menu, click Analyze in Excel. When prompted, ensure that Current Windows User
and the (Default) perspective are selected, and click OK.

3. In Excel, in the PivotTable Fields pane, under Reseller Sales, select Reseller Revenue.

4. In the PivotTable Fields pane, under Product, select Products By Category.

5. In the PivotTable Fields pane, note that both Order Date and Ship Date dimensions are listed, and
then under Order Date, select Calendar Date.

6. In the PivotTable Fields pane, under Salesperson, select Salesperson.


7. Browse the PivotTable in Excel and verify that the hierarchies behave as expected. Close Excel without
saving the workbook.

8. Close Visual Studio, saving your work if prompted.

Results: At the end of this exercise, you will have a tabular model that includes balanced hierarchies, a
role-playing dimension, and a parent-child dimension.

Lab C: Enhancing Data Models


Exercise 1: Enhancing a Multidimensional Data Model
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab05C\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.

5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Create Named Calculations


1. In the D:\Labfiles\Lab05C\Starter folder, double-click AWSalesMD.sln to open the solution in SQL
Server Data Tools.

2. In Solution Explorer, in the Data Source Views folder, double-click AW Data Warehouse.dsv.

3. Right-click the ResellerSales table, and then click New Named Calculation.

4. In the Create Named Calculation dialog box, in the Column name box, type Reseller Profit. In the
Expression box, type the following expression, and then click OK:

[SalesAmount] - [TotalProductCost]

5. Repeat the previous two steps to add a named calculation to the InternetSales table. Name the
calculation Internet Profit and use the same expression as for the Reseller Profit calculation.

6. In Solution Explorer, double-click Sales.cube.

7. In the Measures pane, right-click the Internet Sales measure group, and then click New Measure.

8. In the New Measure dialog box, ensure that the Usage is set to Sum, the Source table is set to
InternetSales, and select the Internet Profit source column. Click OK.

9. Repeat the previous two steps to add a measure for the Reseller Profit column in the ResellerSales
table to the Reseller Sales measure group.

10. On the File menu, click Save All.

X Task 3: Create Calculated Members


1. In the Cube Designer, on the Calculations tab, on the Cube menu, click New Calculated Member.

2. In the Name box, type [Total Revenue].


3. In the Expression box, type the following code:

[Measures].[Internet Revenue] + [Measures].[Reseller Revenue]

4. In the Format string drop-down list, click "$#,##0.00;($#,##0.00)".

5. In the Display folder box, type Total Sales.

6. On the Cube menu, click New Calculated Member.

7. In the Name box, type [Total Cost].



8. In the Expression box, type the following code:

[Measures].[Internet Cost] + [Measures].[Reseller Cost]

9. In the Format string drop-down list, click "$#,##0.00;($#,##0.00)".

10. In the Display folder box, type Total Sales.

11. On the Cube menu, click New Calculated Member.

12. In the Name box, type [Total Profit].

13. In the Expression box, type the following code:

[Measures].[Internet Profit] + [Measures].[Reseller Profit]

14. In the Format string drop-down list, click "$#,##0.00;($#,##0.00)".

15. In the Display folder box, type Total Sales.


16. On the Cube menu, click New Calculated Member.

17. In the Name box, type [Gross Margin].

18. In the Expression box, type the following code:

([Measures].[Internet Profit] + [Measures].[Reseller Profit])
/
([Measures].[Internet Revenue] + [Measures].[Reseller Revenue])

19. In the Format string drop-down list, click "Percent".


20. In the Display folder box, type Total Sales.

21. On the File menu, click Save All.

22. On the Build menu, click Process:


• If you are prompted to redeploy the database, click Yes.

• If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.

• If you are prompted to replace an existing database with the same name, click Yes.

23. In the Process Cube – Sales dialog box, click Run, and in the Process Progress dialog box, after
processing completes, click Close. Then, in the Process Cube – Sales dialog box, click Close.

X Task 4: Create a KPI


1. In the Cube Designer, click the KPIs tab, and then click the Reconnect button.

2. On the Cube menu, click New KPI.

3. In the Name box, type Gross Margin.


4. On the Metadata tab of the Calculation Tools pane, expand Measures, expand Total Sales, and
then drag Gross Margin to the Value Expression box.

5. In the Goal Expression box, type 0.4.


6. In the Status indicator list, select Shapes.

7. In the Status expression box, type the following expression:

CASE
WHEN ([Measures].[Gross Margin]) < 0.3 THEN -1
WHEN ([Measures].[Gross Margin]) > 0.4 THEN 1
ELSE 0
END

8. On the File menu, click Save All.

9. On the Build menu, click Process:

• If you are prompted to redeploy the database, click Yes.


• If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.

• If you are prompted to replace an existing database with the same name, click Yes.
10. In the Process Cube – Sales dialog box, click Run, and in the Process Progress dialog box, when
processing has succeeded, click Close. Then, in the Process Cube – Sales dialog box click Close.

X Task 5: Create Perspectives


1. In the Cube Designer, click the Perspectives tab.
2. On the Cube menu, click New Perspective.

3. In the Perspective Name column, replace Perspective with Internet Sales.

4. If the Internet Sales measure group is not already expanded, expand it and clear the following check
boxes:

o Internet Unit Price.

o Internet Unit Cost.

5. Clear the check box for the Reseller Sales measure group (this clears the check boxes for all of the
measures in this measure group).

6. Clear the check boxes for the following dimensions:

o Reseller.

o Sales Territory.

o Salesperson.
7. Clear the check box for the Gross Margin KPI.

8. Clear the check boxes for the following calculated members:

o Total Revenue.
o Total Cost.

o Total Profit.

o Gross Margin.

9. Create another new perspective named Reseller Sales, and clear the check boxes for all objects other
than those in the following list:

o The Reseller Quantity, Reseller Cost, Reseller Revenue, and Reseller Profit measures.

o The Order Date, Product, Ship Date, Reseller, Sales Territory, and Salesperson dimensions.

10. On the File menu, click Save All.

X Task 6: Deploy and Browse the Model


1. On the Build menu, click Deploy AWSalesMD.

• If you are prompted for credentials, use the ADVENTUREWORKS\ServiceAcct user account with the
password Pa$$w0rd.

• If you are prompted to replace an existing database with the same name, click Yes.

2. In the Cube Designer, click the Browser tab, and then click Analyze in Excel on the Cube menu.

3. In the Analyze in Excel dialog box, ensure that Sales is selected, and then click OK. If a Microsoft
Excel Security Notice dialog box is displayed, click Enable.

4. In Excel, in the PivotTable Fields pane, under Values, expand Total Sales, and then select Total
Revenue, Total Cost, Total Profit, and Gross Margin.
5. In the PivotTable Fields pane, under Product, select Products By Category.

6. In the PivotTable Fields pane, under KPIs, expand Gross Margin, and then select Status.

7. In the PivotTable, expand Clothing and view the Gross Margin Status indicator for each product
subcategory. Close Excel without saving the workbook.

8. In the Cube Designer, on the Browser tab, on the Cube menu, click Analyze in Excel.

9. In the Analyze in Excel dialog box, select Internet Sales, and then click OK. If a Microsoft Excel
Security Notice dialog box is displayed, click Enable.

10. In Excel, in the PivotTable Fields pane, verify that only four measures from the Internet Sales
measure group, and the Customer, Order Date, Product, and Ship Date dimensions are available.

11. Close Excel without saving the workbook.

12. Close Visual Studio, saving your work if prompted.

Results: At the end of this lab, you will have a multidimensional model that contains custom
calculations, a KPI, and perspectives.

Exercise 2: Enhancing a Tabular Data Model


X Task 1: Create Calculated Measures
1. In the D:\Labfiles\Lab05C\Starter folder, double-click AWSalesTab.sln to open the solution in SQL
Server Data Tools. If the Tabular model designer dialog box is displayed, in the Workspace server
list, select localhost\SQL2, in the Compatibility level box, select SQL Server 2012 SP1 (1103), and
then click OK.

2. In Solution Explorer, double-click Model.bim.

3. On the Model menu, point to Process and click Process All. If the Impersonate Credentials dialog
box is displayed, enter the user name ADVENTUREWORKS\ServiceAcct and the password
Pa$$w0rd, and click OK. Then, in the Data Processing dialog box, when processing is complete, click
Close.
4. In the model designer, click the Internet Sales tab.

5. Scroll to the right, double-click Add Column, type Profit, and then press Enter.

6. With the new Profit column selected, in the formula bar, enter the following DAX expression:

=[SalesAmount] - [TotalProductCost]

7. Right-click the Profit column header, and then click Hide from Client Tools.

8. In the measure grid, click the cell directly under the new Profit column. On the Column menu, point
to AutoSum, and then click Sum.
9. In the formula bar, edit the DAX formula to change the measure name to Internet Profit, as shown
in the following code:

Internet Profit:=SUM([Profit])

10. Click the Reseller Sales tab, and repeat steps 4 to 9 to create a hidden calculated Profit column and
an aggregated measure named Reseller Profit.

11. On the File menu, click Save All.
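
For reference, repeating the steps on the Reseller Sales table (step 10) should leave it with a hidden calculated column and a measure similar to the following:

// Calculated column named Profit (hidden from client tools):
=[SalesAmount] - [TotalProductCost]

// Measure in the Reseller Sales measure grid:
Reseller Profit:=SUM([Profit])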

X Task 2: Add a Table for Global Calculated Measures


1. Start Microsoft Excel and create a new blank workbook.

2. In cell A1, enter Total Sales.

3. Right-click cell A1, and then click Copy.

4. In the Visual Studio instance for the AWSalesTab project, click any table tab, and then click Paste on
the Edit menu.

5. In the Paste Preview dialog box, type Total Sales in the Table Name box, and then click OK.

6. Close Excel without saving the workbook.

7. In the Visual Studio instance for the AWSalesTab project, in the Total Sales table, in the first empty
cell in the measure grid, enter the following DAX expression:

Total Revenue:=[Internet Revenue] + [Reseller Revenue]

8. With the Total Revenue measure selected, press F4. In the Properties pane, set the Format property
to Currency.

9. In the empty cell under the one in which you just created the Total Revenue measure, enter the
following DAX expression:

Total Cost:=[Internet Cost] + [Reseller Cost]

10. With the Total Cost measure selected, press F4. In the Properties pane, set the Format property to
Currency.

11. In the empty cell under the one in which you just created the Total Cost measure, enter the following
DAX expression:

Total Profit:=[Internet Profit] + [Reseller Profit]

12. With the Total Profit measure selected, press F4. In the Properties pane, set the Format property to
Currency.

13. In the empty cell under the one in which you just created the Total Profit measure, enter the
following DAX expression:

Gross Margin:=[Total Profit] / [Total Revenue]

14. With the Gross Margin measure selected, press F4. In the Properties pane, set the Format property
to Percentage.

15. On the File menu, click Save All.
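
For reference, the measure grid of the Total Sales table should now contain the following four measures (note that Gross Margin divides profit by revenue):

Total Revenue:=[Internet Revenue] + [Reseller Revenue]
Total Cost:=[Internet Cost] + [Reseller Cost]
Total Profit:=[Internet Profit] + [Reseller Profit]
Gross Margin:=[Total Profit] / [Total Revenue]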

X Task 3: Create a KPI


1. In the Total Sales table, in the measure grid, right-click the cell that contains the Gross Margin
measure, and then click Create KPI.

2. In the Key Performance Indicator (KPI) dialog box, select Absolute value and type 0.4. Then drag the first slider to 0.3 and the second slider to 0.4.

3. Click OK.

4. On the File menu, click Save All.
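
The slider positions chosen in step 2 give the KPI roughly the following status bands. This is only a summary of the intent; the designer stores the thresholds for you, so there is no formula to type:

// Gross Margin below 0.3            --> lowest status
// Gross Margin between 0.3 and 0.4  --> middle status
// Gross Margin above 0.4            --> highest status (target value = 0.4)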

X Task 4: Create Perspectives


1. On the Model menu, point to Perspectives, and then click Create and Manage.

2. In the Perspectives dialog box, click New Perspective.

3. In the Perspective Name column, replace New Perspective with Internet Sales.

4. If the Internet Sales measure group is not already expanded, expand it and select the following
check boxes:
• Internet Cost.
• Internet Profit.
• Internet Quantity.
• Internet Revenue.
5. Select the check boxes for the following dimensions:
• Customer.
• Order Date.
• Product.
• Ship Date.

6. Create another new perspective named Reseller Sales, and clear the check boxes for all objects other
than those in the following list:
• The Reseller Quantity, Reseller Cost, Reseller Revenue, and Reseller Profit measures.
• The Order Date, Product, Ship Date, Reseller, Sales Territory, and Salesperson dimensions.
7. In the Perspectives dialog box, click OK.

8. On the File menu, click Save All.

X Task 5: Deploy and Browse the Model


1. On the Build menu, click Deploy AWSalesTab. After deployment completes, click Close.

2. On the Model menu, click Analyze in Excel. When prompted, ensure that Current Windows User
and the (Default) perspective are selected, and then click OK.

3. In Excel, in the PivotTable Fields pane, under Total Sales, select Total Revenue, Total Cost,
Total Profit, and Gross Margin.

4. In the PivotTable Fields pane, under Product, select Products By Category.

5. In the PivotTable Fields pane, under KPIs, expand Gross Margin, and then select Status.
6. In the PivotTable, expand Clothing and view the Gross Margin Status indicator for each product
subcategory. Close Excel without saving the workbook.

7. On the Model menu, click Analyze in Excel. When prompted, ensure that Current Windows User
and the Internet Sales perspective are selected, and then click OK.

8. In Excel, in the PivotTable Fields pane, verify that only four measures from the Internet Sales
measure group, and the Customer, Order Date, Product, and Ship Date dimensions, are available.
9. Close Excel without saving the workbook.

10. Close Visual Studio, saving your work if prompted.

Results: At the end of this exercise, you will have a tabular model that contains calculated measures, a
KPI, and perspectives.

Module 6: Planning a BI Delivery Solution


Lab: Planning a BI Delivery Solution
Exercise 1: Assessing Reporting Requirements
X Task 1: Review Reporting Requirements
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab06\Starter folder.

4. In the D:\Labfiles\Lab06\Starter folder, double-click Interviews.docx to open it in Microsoft Word.


5. Read the interviews in the document.

X Task 2: Plan Reporting Tools


1. Form a small group with two or three other students.

2. Discuss the reporting requirements in the interviews and agree on appropriate tools to support them.
3. In the D:\Labfiles\Lab06\Starter folder, double-click Reporting Requirements.docx to open it in
Microsoft Word.

4. Complete the table in Reporting Requirements.docx based on your discussion.

Results: At the end of this exercise, you will have a reporting requirements document that lists the
reporting scenarios that the BI solution must support, and the tools that you plan to use.

Module 7: Designing a Reporting Services Solution


Lab: Designing a Reporting Services
Solution
Exercise 1: Designing a Reporting Environment
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab07\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Review Reporting Requirements


1. View the contents of the D:\Labfiles\Lab07\Starter folder.
2. In the D:\Labfiles\Lab07\Starter folder, double-click Interviews.docx to open it in Microsoft Word.

3. Read the interviews in the document.

X Task 3: Design a Report Folder Structure


1. Start Internet Explorer and browse to http://mia-sqlbi/reports_sql2.
2. Click New Folder. On the New Folder page, in the Name box, click the x icon to clear the existing
text and type Data Sources, and then click OK.

3. Repeat the previous step to create the following folders:


o Datasets

o Report Parts

o Reports

o Templates

4. On the Home page, click the Reports folder.

5. In the Reports folder, create the following folders:

o Executives

o Sales

o Finance

6. On the Reports page, click the Sales folder.

7. In the Sales folder, create the following folders:

o North America
o Europe

o Pacific

8. Minimize Internet Explorer.

Results: At the end of this exercise, you should have created folders in the report server at http://mia-sqlbi/reports_sql2.

Exercise 2: Creating Reusable Report Items


X Task 1: Create Shared Data Sources
1. On the taskbar, click SQL Server Data Tools.

2. On the File menu, point to New, and then click Project.

3. In the New Project dialog box, click Report Server Project, in the Name text box, type AWReports,
in the Location box, browse to D:\Labfiles\Lab07\Starter, and then click OK.

4. In Solution Explorer, right-click Shared Data Sources, and then click Add New Data Source. In the
Shared Data Source Properties dialog box:
a. In the Name box, type AWDataWarehouse.

b. In the Type list, ensure that Microsoft SQL Server is selected.

c. In the Connection string box, type the following connection string:

SERVER=MIA-SQLDW;DATABASE=AWDataWarehouse;

d. Click the Credentials tab.

e. Select Use Windows Authentication (integrated security).

f. Click OK.

5. Repeat the previous step to create a second data source with the following properties:

o Name: AWSalesMD

o Type: Microsoft SQL Server Analysis Services

o Connection String:

DATA SOURCE=MIA-SQLBI;INITIAL CATALOG=AWSalesMD;

o Credentials: Windows authentication

6. In Solution Explorer, right-click AWReports, and then click Properties. In the AWReports Property
Pages dialog box, set the following properties, and then click OK:

o TargetDataSetFolder: Datasets

o TargetDataSourceFolder: Data Sources

o TargetReportFolder: Templates

o TargetReportPartFolder: Report Parts

o TargetServerURL: http://mia-sqlbi/reportserver_sql2

7. In Solution Explorer, right-click AWReports, and then click Deploy.

8. After the deployment succeeds, maximize Internet Explorer, browse to the Home page in Report
Manager at http://mia-sqlbi/reports_sql2, and then click Data Sources.
9. Hold the mouse over AWDataWarehouse, click the drop-down arrow that appears, and then click Manage.

10. In the AWDataWarehouse page:

a. Select Credentials stored securely in the report server, and then enter the following credentials:

   User name: ADVENTUREWORKS\ServiceAcct

   Password: Pa$$w0rd

b. Select Use as Windows credentials when connecting to the data source.

c. Click Test Connection, and after the connection is created successfully, click Apply. If the connection fails, correct any errors in the settings, and then try again.

11. Click the Data Sources link at the top of the page, and then repeat steps 9 and 10 for the AWSalesMD data source.

12. Close Internet Explorer.

X Task 2: Create a Report Template


1. In the AWReports project, in Solution Explorer, right-click Reports, point to Add, and click New
Item. Then, in the Add New Item – AWReports dialog box, select Report, change the Name
property to AWReport.rdl, and then click Add.

2. Click the design surface for the new report, and then on the Report menu, click Add Page Header.

3. If the Toolbox pane is not visible, on the View menu, click Toolbox. Drag a Text Box to the upper-
right corner of the report in the page header area.

4. Right-click the text box, and then click Text Box Properties. In the Text Box Properties dialog box,
change the Name to PageNumber, and then click the expression (fx) button for the Value property.

5. In the Expression dialog box, in the Category list, click Built-in Fields, and in the Item list double-
click OverallPageNumber. Ensure that the expression is set to =Globals!OverallPageNumber, and
then click OK.

6. In the Text Box Properties dialog box, click the Alignment tab, and in the Horizontal list, click Right. Then click OK.

7. Resize the page header so that it is just big enough to contain the text box.

8. Drag a Text Box to the upper-left of the page, under the page header area.

9. Right-click the text box, and then click Text Box Properties. In the Text Box Properties dialog box,
change the Name to ReportName, and then click the expression (fx) button for the Value property.

10. In the Expression dialog box, in the Category list, click Built-in Fields, and in the Item list double-
click ReportName. Ensure that the expression is set to =Globals!ReportName, and then click OK.

11. In the Text Box Properties dialog box, click the Alignment tab, and in the Horizontal list, click
Right.

12. Click the Font tab, select Bold, and in the Size list, click 16pt. Click OK, and resize the text box so that
it is the same width as the page and you can read the text ([&ReportName]).

13. Drag a Text Box to the page, under the ReportName text box.

14. Right-click the text box, and then click Text Box Properties. In the Text Box Properties dialog box,
change the Name to ExecutionTime, and then click the expression (fx) button for the Value
property.

15. In the Expression dialog box, in the Category list, click Built-in Fields, and in the Item list double-
click ExecutionTime. Ensure that the expression is set to =Globals!ExecutionTime, and then click
OK.

16. In the Text Box Properties dialog box, click the Number tab. In the Category list, click Date, and in
the Type list select *Monday, January 31, 2000 1:30 PM.

17. Click the Alignment tab and in the Horizontal list, click Right. Click OK, and resize the text box so
that it is the same width as the page and you can read the text ([&ExecutionTime]).

18. On the Report menu, click Add Page Footer.

19. Drag an Image to the footer and place it at the left edge of the page.

20. In the Image Properties dialog box, change the Name property to AWLogo, and then click Import.
Browse to the D:\Labfiles\Lab07\Starter folder, select Adventure Works Logo.jpg, and then click
Open. Click OK to close the Image Properties dialog box.

21. Resize the image so that it is approximately the same width as the text “[&ReportName]” in the
ReportName text box.

22. Drag a Text Box to the page, next to the image in the page footer.
23. Right-click the text box, and then click Text Box Properties. In the Text Box Properties dialog box,
change the Name to Warning and the Value to Property of Adventure Works Cycles – Do not
distribute without permission, and then click OK.
24. Widen the text box so that its text can be read.

25. Click the PageNumber text box, and then hold the Ctrl key and click the ReportName,
ExecutionTime, and Warning text boxes and press F4.
26. In the properties pane, set the CanGrow property to False.

27. On the File menu, click Save All.

X Task 3: Create Shared Datasets


1. In Solution Explorer, right-click Shared Datasets, and then click Add New Dataset.
2. In the Shared Dataset Properties dialog box, change the Name property to Internet Sales By
Country, in the data source list, select AWDataWarehouse, and then under the Query box, click Import.
Browse to the D:\Labfiles\Lab07\Starter folder, select Internet Sales By Country.sql, and then click
Open.

3. In the Shared Dataset Properties dialog box, click the Parameters tab, and in the Data Type list for
both the @StartDate and @EndDate parameters that have been created, select Date/Time. Then
click OK.

4. In Solution Explorer, right-click Shared Datasets, and then click Add New Dataset.

5. In the Shared Dataset Properties dialog box, change the Name property to Reseller Sales By
Region, in the data source list, select AWDataWarehouse, and then under the Query box, click Import.
Browse to the D:\Labfiles\Lab07\Starter folder, select Reseller Sales By Region.sql, and then click
Open.
6. In the Shared Dataset Properties dialog box, click the Parameters tab.

7. In the Data Type list for both the @StartDate and @EndDate parameters that have been created,
select Date/Time.

8. In the Data Type list for the @Regions parameter, select Text, check the Allow multiple values
check box, check the check box to the right of the @Regions text box, and in the empty box, type
Europe,North America,Pacific. Click OK.
9. In Solution Explorer, right-click Shared Datasets, and then click Add New Dataset.

10. In the Shared Dataset Properties dialog box, change the Name property to Last Month, in the data
source list, select AWDataWarehouse, and under the Query box, click Import.

11. Browse to the D:\Labfiles\Lab07\Starter folder, select Last Month.sql, and then click Open.

12. In the Shared Dataset Properties dialog box, click OK.



X Task 4: Deploy Template Items


1. In Solution Explorer, right-click AWReports, and then click Properties. In the AWReports Property
Pages dialog box, verify that the following properties are set as listed here, and then click OK:

o OverwriteDatasets: False

o OverwriteDataSources: False

o TargetDataSetFolder: Datasets
o TargetDataSourceFolder: Data Sources

o TargetReportFolder: Templates

o TargetReportPartFolder: Report Parts

o TargetServerURL: http://mia-sqlbi/reportserver_sql2

2. In Solution Explorer, right-click AWReports, and then click Deploy.

3. On the File menu, click Save All. Close Visual Studio.


4. Use File Explorer to copy the following files from the D:\Labfiles\Lab07\Starter\AWReports folder to
the C:\Program Files (x86)\Microsoft Visual Studio
10.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject folder:
o AWReport.rdl

o Reseller Sales By Region.rsd

o Internet Sales By Country.rsd


o Last Month.rsd

X Task 5: Create a Chart Report Part


1. Start Internet Explorer and browse to http://mia-sqlbi/reports_sql2.

2. Click the Report Parts folder, and then click Report Builder. If you are prompted to run the
program, click Run.

3. When Report Builder starts, in the Getting Started dialog box, make sure that New Report is
selected, and then click Chart Wizard.

Note: If the Getting Started dialog box is not displayed, click the round button at the upper left of the
Report Builder window, and then click New.

4. On the Choose a dataset page, select Choose an existing dataset in this report or a shared dataset,
and click Browse. In the Select Dataset dialog box, browse to the Datasets folder, select Internet Sales
By Country, and then click Open.

5. On the Choose a dataset page, make sure that Internet Sales By Country is selected, and then click
Next.

6. On the Choose a chart type page, select Bar, and then click Next.

7. On the Arrange chart fields page, drag Country to the Series area, drag Revenue to the Values
area, and then click Next.

8. On the Choose a style page, select Corporate, and then click Finish.

9. In Report Builder, click the chart, and then on the View tab of the ribbon, click Properties.

10. Change the Name property to InternetSalesChart.



11. Resize the chart, and change its title to Internet Sales Revenue. Apply any other formatting changes
that you want.

12. Click the round button at the upper left, and then click Publish Report Parts. In the Publish Report
Parts dialog box, click Review and modify report parts before publishing.

13. In the Publish Report Parts dialog box, expand InternetSalesChart and verify that the /Report
Parts folder is selected. Clear all check boxes other than InternetSalesChart, enter an appropriate
description, and then click Publish. After the report part is published successfully, click Close.

14. Close Report Builder without saving the report.

15. In Internet Explorer, refresh the view of the Report Parts folder and verify that the report part has
been published. Close Internet Explorer.

Results: At the end of this exercise, you will have published shared data sources, a report template,
shared datasets, and a report part.

Exercise 3: Creating Reports


X Task 1: Create a Report From a Template
1. On the taskbar, click SQL Server Data Tools.

2. On the File menu, point to New, and then click Project.

3. In the New Project dialog box, click Report Server Project, in the Name text box, type Executive
Reports, in the Location box, browse to D:\Labfiles\Lab07\Starter, and then click OK.

4. In Solution Explorer, right-click Shared Data Sources, and then click Add New Data Source. In the
Shared Data Source Properties dialog box:
a. In the Name box, type AWDataWarehouse

b. In the Type list, ensure that Microsoft SQL Server is selected.

c. In the Connection string box, type the following connection string:

SERVER=MIA-SQLDW;DATABASE=AWDataWarehouse;

d. Click the Credentials tab.

e. Select Use Windows Authentication (integrated security).

f. Click OK.

5. In Solution Explorer, right-click Shared Datasets, point to Add, and then click New Item.

6. In the Add New Item – Executive Reports dialog box, select Reseller Sales By Region, change the
Name property to Reseller Sales By Region.rsd, and then click Add. The dataset is added and then
opened so you can see its source XML definition. Close the Reseller Sales by Region.rsd window.

7. Repeat the previous step to add a dataset named Last Month.rsd based on the Last Month item
template.

8. In Solution Explorer, right-click Reports, point to Add, and then click New Item.

9. In the Add New Item – Executive Reports dialog box, select AWReport, change the Name property to
Reseller Sales.rdl, and then click Add.

10. Click the report design surface, and then on the View menu, click Report Data.

11. In the Report Data pane, right-click Data Sources, and then click Add Data Source. In the Data
Source Properties dialog box, change the Name property to AWDataWarehouse, select Use
shared data source reference, in the drop-down list select the AWDataWarehouse shared data
source, and then click OK.

12. In the Report Data pane, right-click Datasets, and then click Add Dataset. In the Dataset Properties
dialog box, change the Name property to ResellerSalesByRegion, in the list of shared datasets,
select Reseller Sales By Region, and then click OK.

13. Repeat the previous step to add a dataset named LastMonth, based on the Last Month shared
dataset, to the report.

14. In the Report Data pane, expand Parameters and double-click StartDate.

15. In the Report Parameter Properties dialog box, on the Default Values tab, select Get values from
a query. In the Dataset list, select LastMonth, in the Value field list, select FirstDay, and then click
OK.

16. Repeat the previous two steps to set the default value of the EndDate parameter to the LastDay field
in the LastMonth dataset.

17. View the Toolbox pane, and drag a Table to the report.

18. In the Report Data pane, expand the ResellerSalesByRegion dataset, and drag the Revenue field to
the first column in the table.

19. Drag the Territory field to the Row Groups pane below the report, and drop it above the Details
group.

20. Drag the Country field to the Row Groups pane under the report, and drop it above the Territory
group.

21. Drag the Region field to the Row Groups pane under the report, and drop it above the Country
group.

22. Right-click the column headers for each empty column in the table, and then click Delete Columns.

23. Click the Preview tab to view the report, and note that the default values for all parameters are used.
Click the Design tab and apply any formatting you want to improve the report.

24. When you are satisfied with the report, in Solution Explorer, right-click Executive Reports and then
click Properties. In the Executive Reports Property Pages dialog box, set the following properties,
and then click OK.

a. OverwriteDatasets: False

b. OverwriteDataSources: False

c. TargetDataSetFolder: Datasets

d. TargetDataSourceFolder: Data Sources

e. TargetReportFolder: Reports/Executives
f. TargetReportPartFolder: Report Parts

g. TargetServerURL: http://mia-sqlbi/reportserver_sql2

25. In Solution Explorer, right-click Executive Reports, and then click Deploy. After deployment
succeeds, close Visual Studio.

26. Start Internet Explorer and browse to http://mia-sqlbi/reports_sql2. Click the Reports folder and
the Executives folder.

27. Click the Reseller Sales report and verify that it displays the reseller sales data by sales region. Keep
Internet Explorer open for the next task.

X Task 2: Create Linked Reports


1. In Internet Explorer, while viewing the Reseller Sales report, click the Executives link at the top of
the page.

2. Hold the mouse over Reseller Sales, click the drop-down arrow that appears, and then click Create
Linked Report.

3. In the Name box, type Reseller Sales – Europe, and then click Change Location.

4. In the folder tree, expand Sales, click Europe, and then click OK. On the Reseller Sales page, click
OK. A linked version of the report is created in the Europe folder and displayed.

5. At the top of the page, click Reseller Sales – Europe to view the properties of the linked report.

6. Click the Parameters tab, and in the row for the Regions parameter, click Override Default. Click
the empty Default Value box, and in the top line of the drop-down list, type Europe.

7. Select the Hide check box for the Regions parameter, and then click Apply.

8. At the top of the page, click the large Reseller Sales – Europe page title to view the report, and then
verify that it is filtered to show only sales in Europe.

9. At the top of the page, click Reports, and then click Executives. Repeat steps 2 to 8 to create a linked
report named Reseller Sales – North America in the Sales/North America folder with a hidden
Regions parameter that has a default value of North America.

10. Repeat the previous step to create a linked report named Reseller Sales – Pacific in the Sales/Pacific
folder with a hidden Regions parameter that has a default value of Pacific.

11. Keep Internet Explorer open for the next task.

X Task 3: Create a Report with Report Builder


1. In Internet Explorer, at the top of the page, click Home. Click Reports, and then click Finance.

2. On the Finance page, click Report Builder. If you are prompted to run the program, click Run.

3. When Report Builder starts, in the Getting Started dialog box, click Open.
Note: If the Getting Started dialog box is not displayed, click the round button at the upper left of the
Report Builder window, and then click Open.

4. In the Open Report dialog box, browse to the Templates folder, select AWReport, and then click
Open.

5. Click the round button at the top left of the Report Builder window, and then click Save As. In the
Save As Report dialog box, click the Up One Level button, double-click Reports, double-click
Finance, change the Name property to Internet Sales, and then click Save.

6. On the ribbon, on the Insert tab, click Report Parts. Then, in the Report Part Gallery pane, in the
search box, type InternetSales and then click the search button. The chart you created earlier should
be displayed in the Report Part Gallery pane.

7. Double-click the InternetSalesChart icon to add the chart to the report, and then resize the chart
and report so that the chart is at the top of the main page area, under the ExecutionTime text box.
Leave a space under the chart so you can add a table to the report later.

Tip: You can close the Properties and Report Part Gallery panes to create more working space.

8. In the Report Data pane, expand Datasets and note that the InternetSalesByCountry dataset used
by the chart has been added to the report. Expand Parameters and note that the StartDate and
EndDate parameters used by the dataset have also been added to the report.

9. Right-click Datasets, and then click Add Dataset. In the Dataset Properties dialog box, set the
Name property to LastMonth, and then click Browse.

10. In the Select Dataset dialog box, browse to the Datasets folder, select Last Month, and then click
Open. In the Dataset Properties dialog box, click OK.

11. In the Report Data pane, double-click the StartDate parameter.

12. In the Report Parameter Properties dialog box, on the Default Values tab, select Get values from
a query. In the Dataset list, select LastMonth, in the Value field list, select FirstDay, and then click
OK.

13. Repeat the previous two steps to set the default value of the EndDate parameter to the LastDay field
in the LastMonth dataset.
14. On the ribbon, on the Insert tab, click Table, and then click Insert Table. Click and drag under the
chart to create the table.

15. In the Report Data pane, expand the InternetSalesByCountry dataset, and drag the Revenue field
to the first column in the table.

16. Drag the Cost field to the second column in the table.

17. Right-click the empty cell in the details row of the third column, and then click Expression. In the
Expression dialog box, enter the following expression, and then click OK.

=Fields!Revenue.Value - Fields!Cost.Value

18. Click the empty header row cell above the expression you just entered, and type Profit.

19. Drag the City field to the Row Groups pane under the report, and then drop it above the Details
group.

20. Drag the StateOrProvince field to the Row Groups pane under the report, and then drop it above
the City group.

21. Drag the Country field to the Row Groups pane under the report, and then drop it above the
StateOrProvince group.

22. Drag the Region field to the Row Groups pane under the report, and then drop it above the
Country group.

23. Click the Run button to view the report, and note that the default values for all parameters are used.
Click the Design tab, and apply any formatting you want to improve the report.

24. When you are satisfied with the report, click the Save button and close Report Builder.
25. In Internet Explorer, refresh the Finance page and verify that the report has been saved there. Click
Internet Sales to view the report.

26. Close Internet Explorer.

Results: At the end of this exercise, you should have created a report from a template, created a linked
report, and used Report Builder to create a report that includes a previously published report part.

Module 8: Designing a Microsoft Excel-Based Reporting Solution

Lab: Designing a Microsoft Excel-Based Reporting Solution
Exercise 1: Planning Excel Features
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab08\Starter folder.


4. Right-click Setup.cmd, and then click Run as administrator.

5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Review Reporting Requirements


1. In the D:\Labfiles\Lab08\Starter folder, double-click Interviews.docx to open it in Microsoft Word.

2. Read the interviews in the document.

X Task 3: Determine Required Excel Features


1. In the D:\Labfiles\Lab08\Starter folder, double-click Excel Features.docx.
2. Complete the table in the document based on the information provided in the interviews you
reviewed in the previous task.

3. Save the document and close Microsoft Word.

Results: At the end of this exercise, you should have a document that contains a list of the required
Excel features.

Exercise 2: Analyzing a Cube with Excel


X Task 1: Connect to Analysis Services Data from Excel
1. On the taskbar, click the Excel 2013 icon.

2. On the Excel start page, click Blank workbook.

3. On the File tab, click Save As, and on the Save As page, click Browse. Browse to the
D:\Labfiles\Lab08\Starter folder and save the workbook as Sales Analysis.xlsx.

4. On the Data tab, click From Other Sources, and in the drop-down list, click From Analysis Services.

5. In the Data Connection Wizard dialog box, in the 1. Server name box, type MIA-SQLBI. Ensure that
Use Windows Authentication is selected, and then click Next.

6. On the Select Database and Table page, select the AWSalesMD database, and in the list of cubes,
select Sales, and then click Next.

7. On the Save Data Connection File and Finish page, click Finish.

8. In the Import Data dialog box, ensure that PivotTable Report is selected; ensure that Existing
worksheet is selected; and on the worksheet behind the dialog box, click cell B16. Click OK.

X Task 2: Add Data to a PivotTable


1. In the PivotTable Fields pane, under Reseller Sales, select the Reseller Revenue measure.

2. Under Sales Territory, select the Sales Territory By Geography hierarchy.

3. Under Ship Date, select the ShipDate.Fiscal Date hierarchy.

X Task 3: Add Slicers


1. On the ribbon, in the PIVOTTABLE TOOLS section, on the ANALYZE tab, click Insert Slicer.

2. In the Insert Slicers dialog box, under Product, select Category, and under Reseller, select Business
Type. Click OK.
3. Click the caption of the Category slicer and on the ribbon, in the SLICER TOOLS section, on the
OPTIONS tab, click Slicer Settings.

4. In the Slicer Settings dialog box, select the Hide items with no data check box, and then click OK.
5. Repeat the previous two steps to hide items with no data in the Business Type slicer.

6. Resize and move the slicers so that they are above the PivotTable.

7. In the Category slicer, click Bikes. In the Business Type slicer, click Specialty Bike Shop. The data in
the PivotTable is filtered to include only sales of bikes to specialty bike shops.

8. Click the Clear Filter button at the upper right of each slicer to remove the filters.

X Task 4: Add a PivotChart


1. Click any cell in the PivotTable, and then on the ribbon, on the INSERT tab, on the PivotChart drop-
down list, click PivotChart.

2. In the Insert Chart dialog box, ensure that Clustered Column is selected, and then click OK.

3. Move the chart to the right of the slicers above the PivotTable.

4. In the chart, expand North America, and then verify that the chart is updated to reflect the
expanded hierarchy.

5. In the Business Type slicer, select Warehouse, and then verify that the chart is filtered by the slicer.

6. Close Excel, saving your changes.

Results: At the end of this exercise, you will have an Excel workbook that contains a PivotTable and a
PivotChart based on an Analysis Services cube.

Exercise 3: Creating a PowerPivot Workbook


X Task 1: Verify PowerPivot Add-In Installation
1. On the taskbar, click the Excel 2013 icon.

2. On the Excel start page, click Blank workbook.

3. On the ribbon, on the FILE tab, click Save As, and on the Save As page, click Browse. Browse to the
D:\Labfiles\Lab08\Starter folder and save the workbook as Marketing Analysis.xlsx.

4. On the ribbon, on the FILE tab, click Options.

5. In the Options dialog box, click Add-Ins. In the Manage drop-down list, select COM Add-ins, and
then click Go.

6. In the COM Add-Ins dialog box, ensure that the Microsoft Office PowerPivot for Excel 2013 add-
in is selected, and then click Cancel.

X Task 2: Create a PowerPivot Data Model


1. In Excel, on the ribbon, on the POWERPIVOT tab, click Manage.

2. In the PowerPivot for Excel – Marketing Analysis.xlsx window, on the ribbon, on the Home tab, in
the Get External Data group, in the From Database drop-down list, click From SQL Server.
3. In the Table Import Wizard dialog box, in the Server name box, type MIA-SQLDW. Ensure that
Use Windows Authentication is selected, and in the Database name drop-down list, select
AWDataWarehouse. Click Next.
4. On the Choose How to Import the Data page, ensure that Select from a list of tables and views to
choose the data to import is selected, and then click Next.

5. On the Select Tables and Views page, select the following views (an optional preview query appears
after the last step in this task), change the friendly name of InternetSales to Internet Sales, and then
click Finish:

o Customer

o Date

o InternetSales

o Product

o Promotion
6. On the Importing page, after the data is successfully imported, click Close.
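
Note: If you want to check the source data before importing it, you can optionally run a query like the
following against MIA-SQLDW in SQL Server Management Studio. This is a sketch only, and it assumes that
the views listed in step 5 belong to the default dbo schema.

USE AWDataWarehouse;
SELECT TOP (10) * FROM dbo.InternetSales;   -- fact view (imported with the friendly name Internet Sales)
SELECT TOP (10) * FROM dbo.Customer;        -- dimension views
SELECT TOP (10) * FROM dbo.[Date];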

X Task 3: Create Relationships


1. In the PowerPivot for Excel – Marketing Analysis.xlsx window, on the ribbon, on the Home tab, click
Diagram View.

2. Arrange the tables so that you can see both the InternetSales and Date tables. Drag the
OrderDateKey field from the InternetSales fact table to the DateKey field in the Date dimension
table to create a relationship in which OrderDateKey is the foreign key, and DateKey is the primary
key.

3. Repeat the previous step to create the following relationships. (The illustrative query after the table
shows how these key columns relate the tables.)



Fact Table      Foreign Key     Dimension Table     Primary Key

InternetSales   ProductKey      Product             ProductKey

InternetSales   CustomerKey     Customer            CustomerKey

InternetSales   PromotionKey    Promotion           PromotionKey
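
Note: Each relationship plays the same role as a foreign key join in the source database. For illustration
only, the OrderDateKey and ProductKey relationships correspond to joins like the following; the view and
column names come from earlier steps in this lab, but the dbo schema is an assumption.

SELECT s.SalesAmount, d.CalendarYear, p.ProductName
FROM dbo.InternetSales AS s
JOIN dbo.[Date] AS d ON s.OrderDateKey = d.DateKey
JOIN dbo.Product AS p ON s.ProductKey = p.ProductKey;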

X Task 4: Configure Attributes


1. In the PowerPivot for Excel – Marketing Analysis.xlsx window, on the ribbon, on the Home tab, click
Data View.

2. On the Customer table, scroll to the right and double-click the Add Column column header.

3. Type Birth Year, and press Enter. Then, with the new Birth Year column selected, in the formula bar,
enter the following DAX expression:

=YEAR([BirthDate])

4. Click the tab for the Date table, and then on the ribbon, on the Design tab, in the Mark as Date Table
drop-down list, click Mark as Date Table. In the Mark as Date Table dialog box, ensure that the
DateAltKey column is selected in the list, and then click OK.

5. Click the MonthName column header, and on the ribbon, on the Home tab, click Sort by Column.

6. In the Sort by Column dialog box, in the Sort section, in the Column list, ensure that MonthName
is selected. In the By section, in the Column list, select MonthNumber, and then click OK.

7. On the ribbon, on the Home tab, click Diagram View.


8. In diagram view, click the caption of the Customer table, and then click the Maximize button on its
title bar.

9. While holding the Ctrl key, click the following columns in the Customer table:

o CustomerKey

o CustomerAltKey

o Title

o Name

o BirthDate

o StreetAddress

o EmailAddress

o Phone

o CurrentFlag
o StartDate

o EndDate

10. Right-click any of the selected columns, and then click Hide from Client Tools. The columns that
you did not select remain visible as dimension attributes.

11. Right-click the StateOrProvince column, click Rename, and then rename the column to State Or
Province. Then rename the following columns:

o MaritalStatus: Marital Status.

o NumberChildren: Children.

o NumberCars: Cars.
o CommuteDistance: Commute Distance.

o PostalCode: Postal Code.

12. Click the Restore button to return the Customer table to its original size.

13. Repeat steps 8 to 12 as necessary to configure the columns in the following tables.

Table            Visible Columns

Internet Sales   • TotalProductCost (renamed to Cost)
                 • SalesAmount (renamed to Revenue)
                 • OrderDate (renamed to Order Date)

Date             • DayNumberOfMonth (renamed to Day)
                 • MonthName (renamed to Month)
                 • CalendarYear (renamed to Year)

Product          • ProductName (renamed to Product)
                 • ProductSubcategoryName (renamed to Subcategory)
                 • ProductCategoryName (renamed to Category)

Promotion        • PromotionName (renamed to Promotion)
                 • PromotionType (renamed to Promotion Type)

X Task 5: Create Hierarchies


1. In diagram view, click the Product table, and then click the Maximize button in its title bar.

2. Click the Create Hierarchy button in the title bar of the maximized Product table.

3. When the new hierarchy is created, change its name to Products By Category.

4. Drag the Category, Subcategory, and Product attributes (in that order) to the Products By
Category hierarchy.
5. Click the Product attribute that is not in the hierarchy, hold the Shift key and click the Category
attribute, right-click the selected attributes, and then click Hide from Client Tools.

6. Click the Restore button in the title bar of the maximized Product table.
7. Repeat steps 1 to 6 to create a hierarchy named Calendar Date in the Date table with the following
members:

o Year

o Month

o Day

8. Repeat steps 1 to 6 to create a hierarchy named Sales Promotion in the Promotion table with the
following members:

o Promotion Type

o Promotion

9. Repeat steps 1 to 6 to create a hierarchy named Customers By Geography in the Customer table
with the following members:

o Country

o State Or Province

o City

o Postal Code

X Task 6: Test the PowerPivot Data Model


1. In the PowerPivot for Excel – Marketing Analysis.xlsx window, on the ribbon, on the File tab, click
Save.

Note: If an error message is displayed when saving the workbook in this pre-release version of Excel 2013,
click OK to close it.

2. In the ribbon, on the Home tab, in the PivotTable drop-down list, click PivotTable. In the Insert
Pivot dialog box, select Existing Worksheet, select cell A15, and then click OK.

3. In the PivotTable Fields pane, expand Internet Sales, and then select Revenue.
4. In the PivotTable Fields pane, expand Product, and then drag Products By Category to the ROWS
area.

5. In the PivotTable Fields pane, expand Promotion, and then drag Sales Promotion to the
COLUMNS area.

6. In the VALUES area at the bottom of the PivotTable Fields pane, click the drop-down arrow for the
Sum of Revenue field, and then click Value Field Settings.
7. In the Value Field Settings dialog box, click Number Format. In the Format Cells dialog box, click
the Accounting category, and then click OK. In the Value Field Settings dialog box, click OK to
close it.

8. In the PivotTable Fields pane, under the Customer table, expand More Fields, and then drag Cars,
Children, and Birth Year to the FILTERS area.

9. On the ribbon, in the PIVOTTABLE TOOLS section, on the ANALYZE tab, click Insert Slicer. In the
Insert Slicers dialog box, under the Customer table, expand More Fields, select the following fields,
and then click OK:

o Marital Status

o Gender

10. Move the slicers above the PivotTable, and then click the caption of the Marital Status slicer and on
the ribbon, in the SLICER TOOLS section, on the OPTIONS tab, click Slicer Settings.

11. In the Slicer Settings dialog box, select the Hide items with no data check box, and then click OK.

12. Repeat the previous two steps to hide items with no data in the Gender slicer.

13. On the ribbon, in the PIVOTTABLE TOOLS section, on the ANALYZE tab, click Insert Timeline. In
the Insert Timelines dialog box, select Order Date, and then click OK.

14. Move the Order Date timeline above the PivotTable, and then change the time period from
MONTHS to YEARS.

15. Apply the following filters to the data:

o In the Marital Status slicer, click S.

o In the Gender slicer, click F.


o In the Order Date timeline, drag the ends of the blue bar to select only the last two years.

o In the Cars filter, select 0.

o In the Children filter, select 0.

o In the Birth Year filter, select Select Multiple Items, and then select all years later than 1970.

16. On the ribbon, on the FILE tab, click Save.

Note: If an error message is displayed when saving the workbook in this pre-release version of Excel 2013,
click OK to close it.

17. Keep the Excel workbook open for the next exercise, but close the PowerPivot window.

Results: At the end of this exercise, you will have an Excel workbook that contains a PowerPivot data
model based on data from the data warehouse.

Exercise 4: Using Power View in Excel


X Task 1: Create a Power View Report
1. In the Marketing Analysis.xlsx workbook you created in the previous exercise, in the ribbon, on the
INSERT tab, click Power View.

Note: If a Power View report does not open in the POWER VIEW tab of the ribbon, view Excel options and
remove the Power View COM add-in, and then add it again.

2. In the Power View report, click Click here to add a title, and then type Sales Promotion Analysis.

3. On the ribbon, on the POWER VIEW tab, click Filters Area to hide the filters area.
4. In the Power View Fields pane, expand Internet Sales, and then select Revenue. Expand Promotion,
expand Sales_Promotion, select Promotion_Type. Expand Customer, and then click Commute
Distance.
5. On the Bar Chart drop-down list, click Clustered Bar. Resize the chart so it fills the left half of the
report.

6. In the Power View Fields pane, expand Date, expand Calendar_Date, and then drag the Year field
to the TILE BY area. Click each of the year headers in the report to view revenue by promotion type
broken down by commute distance for each year.

7. Click the blank area of the report on the right, and in the Power View Fields pane, under Internet
Sales, click Revenue. Then under Customer, expand Customers_By_Geography, select Country.

8. On the Other Chart drop-down list, click Pie. Resize the chart so it fills the top of the right half of the
report.
9. Click the blank area of the report on the right under the pie chart, and in the Power View Fields
pane, under Internet Sales, click Revenue. Under Customer, select Cars.

10. On the Column Chart drop-down list, click Stacked Column. Resize the chart so it fills the bottom of
the right half of the report.

11. In the Commute Distance legend, click the colored square for the 0-1 Miles category, and note that
all of the charts are shaded to reflect the selected value.

12. Click each of the other Commute Distance legend values and note the shading in all charts.

13. Click the currently selected Commute Distance legend value again to remove the shading.

14. On the ribbon, on the FILE tab, click Save.

Note: If an error message is displayed when saving the workbook in this pre-release version of Excel 2013,
click OK to close it.

15. Close Excel.

Results: At the end of this exercise, you will have an Excel workbook that contains a Power View report
based on a PowerPivot data model.

Module 9: Planning a SharePoint Server BI Solution


Lab: Planning a SharePoint Server BI Solution
Exercise 1: Creating a SharePoint Server Site for BI
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab09\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Enable SharePoint Publishing


1. Start Internet Explorer and if the Adventure Works Portal site does not open, browse to
http://mia-sqlbi/sites/adventureworks.

2. In the title bar for the home page, next to Student, click the Settings icon and in the menu, click Site
Settings.

3. On the Site Settings page, under Site Collection Administration, click Site Collection Features.

4. On the Site Collection Features page, in the SharePoint Server Publishing Infrastructure row, click
Activate, and then wait for the Active indicator to appear.
Note: The feature can take a few minutes to activate.

5. At the top of the Site Collection Features page, click Site Settings to return to the Site Settings
page.
6. Under Site Actions, click Manage Site Features.

7. On the Site Features page, in the SharePoint Server Publishing row, click Activate, and then wait
for the Active indicator to appear.
8. At the top of the Site Features page, click Adventure Works Portal to return to the home page.

X Task 3: Create a Subsite


1. On the Adventure Works Portal home page, in the Quick Launch pane on the left, click Site Contents.

2. At the bottom of the Site Contents page, click new subsite.


3. On the New SharePoint Site page, under Title, in the text box, type Adventure Works BI Portal.

4. Under Description, in the text box, type A subsite for Adventure Works BI reports.

5. In the URL name text box, type BI.

6. In the Template Selection area, under Select a template, on the Enterprise tab, click Business
Intelligence Center.

7. At the bottom of the page, click the Create button. After a short time, the Adventure Works BI Portal site
is displayed.

8. Select the URL in the Internet Explorer navigation bar, right-click it, and then click Copy.

9. In the Quick Launch area, click Home.

10. On the home page, under the Quick Launch area, click Edit Links, and then click LINK.

11. In the Add a link dialog box, in the Text to display box, type BI Portal, right-click the Address box,
click Paste, and then click OK.
12. Under LINK, click Save.

13. In the Quick Launch area, click the new BI Portal link and verify that the Adventure Works BI Portal
site is displayed.
14. In the Adventure Works BI Portal, in the title bar for the home page, next to Student, click the
Settings icon, and then click Site Settings.

15. On the Adventure Works BI Portal Site Settings page, under Look and Feel, click Navigation.

16. On the Navigation Settings page, in the Current Navigation section, select Structural Navigation:
Display only the navigation items below the current site. At the top of the page, click OK. Note
that the Quick Launch area now only shows links for the items in the BI Portal subsite, and not for
items in the parent site.

17. Click the image above the Quick Launch area; this provides a navigation link to the home page of the
subsite.
18. Keep Internet Explorer open at the Adventure Works BI Portal page for the next exercise.

Results: At the end of this exercise, you should have created a subsite based on the Business Intelligence
Center template at http://mia-sqlbi/sites/adventureworks/bi.

Exercise 2: Configuring Reporting Services in SharePoint Server


X Task 1: Review Reporting Services Configuration
1. In the Adventure Works BI Portal, in the title bar for the home page, next to Student, click the
Settings icon and in the menu, click Site Settings.

2. Under Site Collection Administration, click Go to top level site settings.

3. On the Adventure Works Portal Site Settings page, under Site Collection Administration, click Site
collection features.

4. On the Site Collection Features page, verify that Report Server Integration Feature is activated.
5. Click the BI portal link in the Quick Launch area to return to the BI portal site.

X Task 2: Create a Document Library for Reports


1. In the Quick Launch pane on the Adventure Works BI Portal home page, click Libraries to view the
libraries in the subsite.

2. On the Site Contents page, click add an app.

3. On the Your Apps page, under Noteworthy, click Document Library. Then, in the Adding
Document Library dialog box, in the Pick a name box, type AWReports, and then click Create.
4. In the Quick Launch area, under Recent, click AWReports.

5. On the AWReports page, on the ribbon, on the Library tab, click Library Settings.

6. On the Settings page, click Advanced settings. On the Advanced Settings page, under Allow
management of content types, select Yes, and at the bottom of the page, click OK.

7. On the Settings page, in the Content Types list, note that only the Document content type is
enabled for the document library. Click Add from existing site content types.
8. On the Add Content Types page, in the Select site content types from drop-down list, select SQL
Server Reporting Services Content Types.

9. Click Report Builder Report, hold the Ctrl key while clicking Report Data Source, click Add, and
then click OK.

10. Under the Content Types list, click Change new button order and default content types.

11. On the Change Button Order page, clear the Visible check box for the Document content type, and
then click OK.

12. On the Settings page, under General Settings, click List name, description, and navigation. On the
General Settings page, in the Navigation section, under Display this document library on the Quick
Launch, select Yes, and then click Save.

13. Click the image above the Quick Launch area to return to the home page, and note that AWReports
is now listed in the Quick Launch area.

14. In the Quick Launch area, click AWReports.

15. On the AWReports page, on the ribbon, on the Files tab, click the New Document drop-down
button, and note that the Report Builder Report and Report Data Source content types are listed.
Click the New Document drop-down button again to hide the list.

16. Keep Internet Explorer open for the next exercise.

X Task 3: Deploy Report Items to SharePoint Server


1. On the taskbar, click the SQL Server Data Tools icon.

2. On the File menu, point to Open, and then click Project/Solution.

3. Browse to the D:\Labfiles\Lab09\Starter folder, and then double-click AWReports.sln.

4. On the Project menu, click Properties.

5. In the AWReports Property Pages dialog box, in the Deployment section, set the following
properties, and then click OK:
o TargetServerURL: http://mia-sqlbi/sites/adventureworks/bi

o TargetDatasetFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Datasets

o TargetDataSourceFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Data Sources

o TargetReportFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Templates

o TargetReportPartFolder: http://mia-sqlbi/sites/adventureworks/bi/awreports/Report Parts

6. On the Build menu, click Deploy AWReports.


7. When deployment is complete, close Visual Studio.

8. In Internet Explorer, in the Quick Launch area, click AWReports and verify that the following folders
have been created:
o Data Sources

o Datasets

o Templates

9. Click Data Sources and verify that the AWDataWarehouse and AWSalesMD data sources have
been deployed.

10. Click the ellipsis to the right of AWDataWarehouse, and in the pop-up window that appears, click
EDIT.

11. On the AWDataWarehouse page:

a. Select Stored credentials, and enter the following credentials:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

b. Select Use as Windows credentials.

c. Click Test Connection, and when the connection is created successfully, click OK. If the connection
fails, correct any errors in the settings and try again.

12. Click the Data Sources link at the top of the page, and then repeat steps 10 and 11 for the
AWSalesMD data source.

13. Keep Internet Explorer open for the next exercise.

X Task 4: Test Self-Service Reporting


1. In Internet Explorer, in the Quick Launch area, click AWReports.

2. On the ribbon, on the Files tab, click New Folder. Then in the Create a new folder dialog box, in the
Name box, type Self-Service Reports and click Save.

3. Click the Self-Service Reports folder, and then, on the New Document drop-down list, click Report
Builder Report. If you are prompted to run the application, click Run.

4. When Report Builder starts, in the Getting Started dialog box, click Open.

Note: If the Getting Started dialog box is not displayed, click the round button at the upper left of the
Report Builder window, and then click Open.

5. In the Open Report dialog box, double-click the AWReports folder and the Templates folder, select
AWReport.rdl, and then click Open.

6. Click the round button at the upper-left of the Report Builder window, and then click Save As. In the
Save As Report dialog box, click the Up One Level button, double-click Self-Service Reports,
change the Name property to Internet Sales, and then click Save.

7. In the Report Data pane, right-click Datasets, and then click Add Dataset. In the Dataset Properties
dialog box, set the Name property to InternetSalesByCountry, and then click Browse.

8. In the Select Dataset dialog box, double-click the AWReports folder and the Datasets folder, select
Internet Sales By Country.rsd, and then click Open. On the Dataset Properties dialog box, click
OK.

9. In the Report Data pane, expand Parameters, and then note that the StartDate and EndDate
parameters used by the dataset have also been added to the report.
10. Right-click Datasets, and then click Add Dataset. In the Dataset Properties dialog box, set the
Name property to LastMonth, and then click Browse.

11. In the Select Dataset dialog box, double-click the AWReports folder and the Datasets folder, select
Last Month.rsd, and then click Open. In the Dataset Properties dialog box, click OK.

12. In the Report Data pane, in the Parameters folder, double-click the StartDate parameter.

13. In the Report Parameter Properties dialog box, on the Default Values tab, select Get values from
a query. In the Dataset list, select LastMonth, in the Value field list, select FirstDay, and then click
OK.

14. Repeat the previous two steps to set the default value of the EndDate parameter to the LastDay field
in the LastMonth dataset.

15. On the ribbon, on the Insert tab, click Table, and then click Insert Table. Click and drag into the
blank area of the report to create the table.

16. In the Report Data pane, expand the InternetSalesByCountry dataset, and then drag the Revenue
field to the first column in the table.

17. Drag the Cost field to the second column in the table.

18. Right-click the empty cell in the details row of the third column, and then click Expression. In the
Expression dialog box, enter the following expression, and then click OK.

=Fields!Revenue.Value - Fields!Cost.Value

19. Click the empty header row cell above the expression you just entered, and then type Profit.

20. Drag the City field to the Row Groups pane below the report, and then drop it above the Details
group.

21. Drag the StateOrProvince field to the Row Groups pane below the report, and then drop it above
the City group.

22. Drag the Country field to the Row Groups pane below the report, and then drop it above the
StateOrProvince group.

23. Click the Run button to view the report, and note that the default values for all parameters are used.
Click the Design tab and apply any formatting you want to improve the report.

24. After you are satisfied with the report, click the Save button, and then close Report Builder.

25. In Internet Explorer, in the Quick Launch area, click AWReports, click Self-Service Reports, and
verify that the report has been saved there. Click Internet Sales to view the report.

26. At the top of the page, click Adventure Works BI Portal to return to the home page for the BI portal
subsite, and keep Internet Explorer open for the next exercise.

Results: At the end of this exercise, you will have published Reporting Services reports to the BI subsite
and verified that self-service reporting is supported.

Exercise 3: Configuring PowerPivot for SharePoint


X Task 1: Create a PowerPivot Gallery
1. In the Quick Launch pane on the Adventure Works BI Portal home page, click Libraries to view the
libraries in the subsite.

2. On the Site Contents page, click add an app.

3. On the Your Apps page, under Apps you can add, click PowerPivot Gallery. In the Adding
PowerPivot Gallery dialog box, in the name box, type AWPowerPivot, and then click Create.

4. In the Quick Launch area, under Recent, click AWPowerPivot.


5. In the AWPowerPivot page, on the ribbon, on the Library tab, click Library Settings.

6. On the Settings page, click List name, description, and navigation. On the General Settings page, in
the Navigation section, under Display this document library on the Quick Launch, select Yes, and then
click Save.

7. Click the image above the Quick Launch area to return to the home page, and note that
AWPowerPivot is now listed in the Quick Launch area.

X Task 2: Publish a PowerPivot Workbook


1. In the Quick Launch area, click AWPowerPivot.

2. On the ribbon, on the Files tab, click Upload Document.

3. In the Add a document dialog box, click Browse. Browse to the D:\labfiles\Lab09\Starter folder, and
then double-click Marketing Analysis.xlsx. In the Add a document dialog box, click OK.

4. Click the large image for the Marketing Analysis PowerPivot workbook to view it in Excel Services.

5. Click the Clear Filter icon at the upper-right of the Marital Status and Gender slicers, and verify that
the data in the PivotTable updates.

6. Close Internet Explorer.

X Task 3: Configure Data Refresh


1. Move the mouse to the lower-left of the taskbar, and when the Start screen icon appears, click it.
2. On the Start screen, click SharePoint 2013 Central Administration. When prompted to allow the
program to make changes, click Yes.

3. In SharePoint Central Administration, under Application Management, click Manage service
applications.

4. In the list of service applications, click Secure Store Service.

Tip: Make sure you click the link for the application, and not the link for its proxy.

5. In the Secure Store Service configuration page, note that a target application has already been
created for the PowerPivot unattended account for data refresh.

6. In the Quick Launch area, click Application Management, and then on the Application Management
page, under Service Applications, click Configure service application associations.

7. On the Service Application Associations page, click the SharePoint – 80 web application. In the list
of service applications, select Secure Store Service, scroll to the bottom, and then click OK. This
enables the Adventure Works Portal site and its subsites to use credentials in the secure store service.

8. In the Quick Launch area, click General Application Settings, and then on the General Application
Settings page, under PowerPivot, click Configure service application settings.

9. On the PowerPivot Settings page, in the Data Refresh section, review the settings and note that:

o Default business hours are from 4:00 A.M. to 8:00 P.M.

o The unattended data refresh account has been configured to use the target application ID you
saw in the secure store service earlier.

o Users are allowed to enter custom Windows credentials.


10. At the bottom of the PowerPivot Settings page, click Cancel. Close Internet Explorer.

11. Start a new instance of Internet Explorer and if the Adventure Works Portal site does not open,
browse to http://mia-sqlbi/sites/adventureworks.

12. In the Quick Launch area, click BI Portal, and then click AWPowerPivot.

13. Click the Manage Data Refresh button to the right of the Marketing Analysis workbook.

14. In the Manage Data Refresh page, select Enable, and then review the default settings. When you
have finished reviewing the settings, click OK.

15. Close Internet Explorer.

X Task 4: View the PowerPivot Management Dashboard


1. Move the mouse to the bottom left of the taskbar, and when the Start screen icon appears, click it.
2. On the Start screen, click SharePoint 2013 Central Administration. When prompted to allow the
program to make changes, click Yes.

3. In SharePoint Central Administration, under General Application Settings, click PowerPivot
Management Dashboard.

4. In the Actions section, click Review timer job definitions.

5. On the Job Definitions page, click PowerPivot Management Dashboard Processing Timer Job. On the
Edit Timer Job page, click Run Now.

6. Under Timer Links on the left, click Running Jobs, and note that the job is running. Wait a few
minutes, and then click the Running Jobs link again until the job is no longer listed.
7. Under Timer Links on the left, click Job History, and verify that the PowerPivot Management
Dashboard Processing Timer Job completed successfully.

8. In the Quick Launch area, click Central Administration, and then under General Application Settings,
click PowerPivot Management Dashboard.

9. In the View drop-down list, click Activity, and review the server activity recorded for PowerPivot
workbooks.

10. Under Workbook Activity – Chart, click the Play button to view a timeline of workbook activity.

11. In the Quick Launch area, click Central Administration.

12. Keep Internet Explorer open for the next exercise.

Results: At the end of this exercise, you will have a PowerPivot Gallery that contains a published
PowerPivot workbook.

Exercise 4: Creating a PerformancePoint Dashboard


X Task 1: Configure the PerformancePoint Unattended Account
1. In SharePoint Central Administration, under Application Management, click Manage service
applications.

2. In the list of service applications, click PerformancePoint Services Application.

Tip: Make sure you click the link for the application, and not the link for its proxy.

3. On the Manage PerformancePoint Services page, click PerformancePoint Service Application Settings.

4. On the PerformancePoint Service Application Settings page, ensure that Unattended Service
Account is selected, enter the following credentials, and then click OK:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

5. Close the SharePoint Central Administration application.

X Task 2: Create a PerformancePoint Data Source


1. Start Internet Explorer and if the Adventure Works Portal site does not open, browse to
http://mia-sqlbi/sites/adventureworks. In the Quick Launch area, click BI Portal.

2. In the Quick Launch area, click Data Connections.

3. On the Ribbon, on the PERFORMANCEPOINT tab, click Dashboard Designer.

4. In the Internet Explorer prompt to open designer.application from mia-sqlbi, click Open. If the
Application Run – Security Warning dialog box is displayed, click Run.

Note: The Dashboard Designer can take a few minutes to open.

5. In the Dashboard Designer, in the Workspace Browser pane, right-click Data Connections, and then
click New Data Source.

6. In the Select a Data Source Template dialog box, under Template, click Analysis Services, and then
click OK.
7. When the new data source is created, rename it to AWSalesMD.

8. Under Connection Settings, in the Server text box, type MIA-SQLBI. In the Database drop-down
list, select AWSalesMD, and in the Cube drop-down list, select Sales.

9. On the New Data Source page, click the Time tab.

10. In the Time Dimension drop-down list, click Ship Date.Ship Date.Fiscal Date.

11. In the Choose a date to begin the year box for the selected time dimension, click Browse.

12. In the Select Member dialog box, select the 1st of July for the most recent fiscal year, and then click
OK.

13. In the Hierarchy level list, click Day.

14. In the Enter a date that is equal to the period specified by the reference member above list, select the
same date that you chose in step 12.

15. In the Time Member Association pane, create the following mappings:
• Fiscal Year: Year.

• Fiscal Quarter: Quarter.



• Month Name: Month.

• Date: Day.

16. In the Workspace Browser pane, right click AWSalesMD, and then click Save.

X Task 3: Create a KPI


1. In Dashboard Designer, in the Workspace Browser pane, click PerformancePoint Content. On the
ribbon, on the Create tab, click KPI.

2. In the Select a KPI Template dialog box, select Blank KPI, and then click OK.

3. Rename the new KPI as Reseller Revenue.

4. In the Editor, in the Actual and Targets section, in the Name column, click Actual, and then rename
it to YTD.

5. In the Data Mappings column, in the YTD row, click 1 (fixed values). In the Fixed Values Data Source
Mapping dialog box, click Change Source.
6. In the Select a Data Source dialog box, on the Workspace tab, click AWSalesMD, and then click
OK.

7. In the Dimensional Data Source Mapping dialog box, in the Select a measure drop-down list, select
Reseller Revenue.

8. In the Select a dimension section, click New Time Intelligence Filter. In the Time Formula Editor dialog
box, type YearToDate, and then click OK.
9. On the Dimensional Data Source Mapping dialog box, click OK.

10. In the Number column, in the YTD row, click (Default). In the Format Numbers dialog box, in the
Format drop-down list, select Currency, and then click OK.

11. In the Data Mappings column, in the Target row, click 1 (fixed values). In the Fixed Values Data Source
Mapping dialog box, click Change Source.

12. In the Select a Data Source dialog box, on the Calculated Metric tab, under Templates, click Blank
Calculation, and then click OK.

13. In the Calculated Metrics Data Source Mapping dialog box, click the second row labeled Value2,
and then click Delete.
14. Click Value1 and rename it to PreviousYear.

15. Click 1 (fixed values), and then in the Fixed Values Data Source Mapping dialog box, click Change
Source.

16. In the Select a Data Source dialog box, on the Workspace tab, click AWSalesMD, and then click
OK.

17. In the Dimensional Data Source Mapping dialog box, in the Select a measure drop-down list, select
Reseller Revenue.

18. In the Select a dimension section, click New Time Intelligence Filter. In the Time Formula Editor dialog
box, type YearToDate-1, and then click OK.

19. In the Dimensional Data Source Mapping dialog box, click OK.

20. In the Calculated Metric Data Source Mapping dialog box, in the Formula box, type
PreviousYear*1.25, and then click OK.

21. In the Number column, in the Target row, click (Default), and then, in the Format Numbers dialog
box, in the Format drop-down list, select Currency, and then click OK.

22. In the Threshold section, set the following threshold values (a worked example of the target and
threshold arithmetic appears after the last step in this task):

o Best: 120%

o Threshold 2: 90%

o Threshold 1: 50%

o Worst: 0%

23. In the Workspace Browser pane, right-click Reseller Revenue, and then click Save.
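
Note: The target for this KPI is the previous year's year-to-date reseller revenue multiplied by 1.25, and
with the default banding method the thresholds compare the actual value to that target as a percentage.
The following T-SQL is a worked example of the arithmetic only; the revenue figure is hypothetical and is
not part of the lab data.

DECLARE @PreviousYear money = 1000000;          -- hypothetical previous-year YTD revenue
DECLARE @Target money = @PreviousYear * 1.25;   -- KPI target = 1,250,000
SELECT @Target AS Target,
       @Target * 1.20 AS BestBoundary,          -- Best: 120% of target and above
       @Target * 0.90 AS Threshold2Boundary,    -- Threshold 2: 90% of target
       @Target * 0.50 AS Threshold1Boundary;    -- Threshold 1: 50% of target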

X Task 4: Create a Report


1. In Dashboard Designer, on the ribbon, on the Create tab, click Analytic Chart.

2. In the Create an Analytic Chart Report dialog box, on the Workspace tab, select AWSalesMD, and
then click Finish.

3. When the chart is created, rename it to Reseller Profit.

4. In the Details pane on the right of the screen, expand Measures.

5. Drag the Reseller Profit measure to the Bottom Axis area.

6. In the Details pane, expand Dimensions.

7. Drag the Sales Territory dimension to the Series area.

8. On the ribbon, on the Edit tab, in the Report Type drop-down list, click Stacked Bar Chart.

9. In the Workspace Browser pane, right-click Reseller Profit, and then click Save.

X Task 5: Create a Scorecard


1. In Dashboard Designer, on the ribbon, on the Create tab, click Scorecard.
2. In the Select a Scorecard Template dialog box, ensure that the Analysis Services template is
selected. Clear the Use wizards to create scorecards check box, and then click OK.

3. When the new scorecard is created, rename it to Reseller Revenue Scores.


4. In the Details pane to the right of the scorecard editor pane, expand KPIs, and then expand
PerformancePoint Content.

5. Drag Reseller Revenue to the left side of the editor in the area labeled Drop Items Here.

6. On the ribbon, on the Edit tab, click Update. The scorecard is updated with the information from
the cube.

7. In the Details pane, expand Dimensions, and then drag Sales Territory to the right edge of the
Reseller Revenue cell.

8. In the Select Members dialog box, click All, and then click OK.

9. On the ribbon, on the Edit tab, click Update. Expand the All dimension member to see the results for
each sales territory.

X Task 6: Create a Dashboard


1. In Dashboard Designer, on the ribbon, on the Create tab, click Dashboard.

2. In the Select a Dashboard Page Template dialog box, select 2 Rows, and then click OK.

3. When the new dashboard is created, rename it to Sales Dashboard.

4. In the Pages pane, select Page 1 and rename it to Reseller Performance.



5. In the Details pane, expand Reports, and then expand PerformancePoint Content.

6. Drag Reseller Profit to the top row in the Dashboard Content pane.

7. In the Details pane, expand Scorecards, and then expand PerformancePoint Content.

8. Drag Reseller Revenue Scores to the bottom row in the Dashboard Content pane.

9. In the Workspace Browser pane, right-click Untitled Workspace, and then click Save.

10. Save the workspace as Reseller Workspace in the D:\Labfiles\Lab09\Starter folder.

11. In the Workspace Browser pane, right-click Sales Dashboard, and then click Deploy to SharePoint.

12. In the Deploy To dialog box, expand Adventure Works BI Portal, select Dashboards, and then click
OK. The dashboard is uploaded to SharePoint Server and opened in a new tab in Internet Explorer.

13. In Internet Explorer, on the ribbon, on the Page tab, click Make Homepage. When prompted to
confirm the action, click OK.

14. Close the Internet Explorer tab that contains the dashboard, and in the remaining tab (which should
be displaying the Data Connections library), click the BROWSE tab, and then click the image above
the Quick Launch area to go to the site’s home page (which is now the dashboard you created).
15. In the dashboard page, click the Reseller Profit chart, and then hold the mouse over each colored
band in the chart to see the profit for each sales territory.

16. Click the band for the North America sales territory, and then view the profit for the countries in that
territory.

17. Move the mouse to the upper-right of the chart, click the drop-down arrow that appears, and then
click Reset View to return to the default chart view for all sales territories.
18. In the Reseller Revenue Scores area, expand the sales territory hierarchy to view the sales
performance in each territory.

19. Close Internet Explorer.

Results: At the end of this exercise, you will have created four PerformancePoint items (a KPI, an analytic
chart report, a scorecard, and a dashboard) on the SharePoint site.

Module 10: Monitoring and Optimizing a BI Solution


Lab: Monitoring and Optimizing a BI
Solution
Exercise 1: Monitoring and Optimizing a Data Warehouse
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the taskbar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab10\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Configure Data Collection


1. On the taskbar, click the SQL Server Management Studio icon, and when prompted, connect to the
MIA-SQLDW instance of the database engine by using Windows authentication.

2. In Object Explorer, expand Management, right-click Data Collection, and then click Configure
Management Data Warehouse.

3. In the Configure Management Data Warehouse Wizard dialog box, click Next.

4. On the Select configuration task page, ensure that Create or upgrade a management data warehouse
is selected, and then click Next.
5. On the Configure Management Data Warehouse Storage page, next to the Database name list, click
New. In the New database dialog box, in the Database name box, type ManagementDW, and then
click OK.
6. On the Configure Management Data Warehouse Storage page, in the Database name list, ensure that
ManagementDW is selected, and then click Next.

7. On the Map Logins and Users page, in the Users mapped to this login list, select
ADVENTUREWORKS\ServiceAcct; in the Database role membership for ManagementDW list, select
mdw_admin, and then click Next.

8. On the Complete the Wizard page, click Finish. When configuration is complete, click Close.
9. Right-click Data Collection, and then click Configure Management Data Warehouse.

10. In the Configure Management Data Warehouse Wizard dialog box, click Next.

11. On the Select configuration task page, select Set up data collection, and then click Next.

12. On the Configure Management Data Warehouse Storage page, click the ellipses next to the
Server name box, and then connect to the MIA-SQLDW instance of the database engine by using
Windows authentication. After the connection is made, in the Database name list, select
ManagementDW, and then click Next.

13. On the Complete the Wizard page, click Finish. After configuration completes, click Close.

14. Minimize SQL Server Management Studio.
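
Note: (Optional.) You can confirm the data collection configuration without reopening the wizard by querying
the collector metadata in msdb. The following Transact-SQL is a minimal sketch and is not part of the lab
steps; it assumes only the built-in system collection sets and the catalog views that ship with SQL Server 2012.

-- List the system collection sets, their collection mode, and whether they are running.
USE msdb;
GO
SELECT name,
       CASE collection_mode WHEN 0 THEN 'Cached' ELSE 'Non-cached' END AS collection_mode,
       is_running
FROM dbo.syscollector_collection_sets;
GO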



X Task 3: Monitor Data Warehouse Load Activity


1. Move the mouse pointer to the lower-left of the taskbar until the Start screen image appears. Right-
click the Start screen image, and then click Computer Management.

2. In Computer Management, in the pane on the left, expand Performance, expand Monitoring Tools,
and then click Performance Monitor.

3. On the toolbar, click Add (the green + icon).

4. In the Add Counters dialog box, in the list of objects, expand Processor and select % Processor
Time. In the Instances of selected object list, select _Total, and then click Add.

Note: If the list of instances is empty, click the counter again to refresh the view.
5. Repeat the previous step to add the following counters. Where multiple instances are available, add
the _Total instance:

o Memory: Available Mbytes.


o Paging File: % Usage.

o PhysicalDisk: Disk Reads/sec.

o PhysicalDisk: Disk Writes/sec.


o SQLServer:Memory Manager: Memory Grants Pending.

o System: Processor Queue Length.

6. In the Add Counters dialog box, click OK. Note that Performance Monitor displays the counter
values.

7. In the pane on the left, right-click Performance Monitor, point to New, and then click Data
Collector Set.

8. In the Create new Data Collector Set dialog box, change the name to Data Warehouse
Performance Counters, and then click Next. Note the default value for the root directory, and then
click Finish.
9. In the pane on the left, expand Data Collector Sets, expand User Defined, right-click Data
Warehouse Performance Counters, and then click Start. Minimize Computer Management.

10. In the D:\Labfiles\Lab10\Starter folder, double-click LoadPartition.sln to open it in Visual Studio.

11. In Solution Explorer, double-click LoadDW.dtsx. Note that this package executes two other packages
that load the fact tables in the data warehouse, and then it updates the extraction log and processes
the Analysis Services cube.

12. On the Debug menu, click Start Debugging.

13. After execution of the LoadDW.dtsx package completes, click Stop Debugging on the Debug
menu. Close Visual Studio without saving any changes.

14. Maximize Computer Management, and in the pane on the left, right-click Data Warehouse
Performance Counters, and then click Stop.

15. Right-click Data Warehouse Performance Counters, and then click Latest Report. View the report,
which shows the performance counter values during the data warehouse load.

16. Minimize Computer Management.
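
Note: (Optional.) The SQL Server-specific counters added above, such as SQLServer:Memory Manager: Memory
Grants Pending, can also be read from inside the database engine through the sys.dm_os_performance_counters
dynamic management view. The following query is a hedged sketch of that approach and is not required by the lab.

-- Read memory-manager counters directly from the DMV; cntr_value matches the value shown in Performance Monitor.
SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Memory Manager%'
  AND counter_name IN (N'Memory Grants Pending', N'Memory Grants Outstanding');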

X Task 4: Monitor Data Warehouse Query Activity


1. Move the mouse pointer to the lower-left of the taskbar until the Start screen image appears. Click
the Start screen image to view the Start screen.

2. On the Start screen, click SQL Server Profiler.

3. In SQL Server Profiler, on the File menu, click New Trace. When prompted, use Windows
authentication to connect to the MIA-SQLDW instance of the database engine.

4. In the Trace Properties dialog box, in the Trace name box, type Data Warehouse Query Workload.

5. In the Use the template list, select Blank.


6. Select Save to file, and in the Save As dialog box, browse to the D:\Labfiles\Lab10\Starter folder, and
then click Save.

7. On the Events Selection tab, in the list of events, expand TSQL, select SQL:BatchCompleted and
SQL:StmtCompleted, and then clear the Show all events check box.

8. Clear the check boxes in all columns other than the following:

o ApplicationName

o DatabaseName

o Duration

o EndTime
o LoginName

o Reads

o RowCounts
o SPID

o StartTime

o TextData

9. Clear the Show all columns check box.

10. Click Column Filters, and in the Edit Filter dialog box, select DatabaseName, expand Like, type
%AWDataWarehouse%, and then click OK.
11. Click Run, and then minimize SQL Server Profiler.

12. In the D:\Labfiles\Lab10\Starter folder, double-click RunDWQueries.cmd. This executes a script that
runs queries in the data warehouse for over a minute.
13. After the script completes, maximize SQL Server Profiler, and on the File menu, click Stop Trace.

14. Review the trace, noting the TextData, Duration, and Reads values for the SQL:StmtCompleted
events where the value in the RowCounts column is over 100.

15. On the Tools menu, click Database Engine Tuning Advisor, and when prompted, use Windows
authentication to connect to the MIA-SQLDW instance of the database engine.

16. In the Database Engine Tuning Advisor, change the session name to Tune DW and ensure that under
Workload, File is selected. Click the Browse for a workload file button (this is a binoculars icon)
and open the Data Warehouse Query Workload.trc file in the D:\Labfiles\Lab10\Starter folder.

17. In the Database for workload analysis drop-down list, select AWDataWarehouse, and in the Select
databases and tables to tune list, select AWDataWarehouse.

18. On the Tuning Options tab, clear the Limit tuning time check box.

19. In the Physical Design Structures (PDS) to use in database section, select Indexes and indexed views;
in the Partitioning strategy to employ section, select Aligned partitioning; and in the Physical Design
Structures (PDS) to keep in database section, select Keep aligned partitioning.

20. Click Start Analysis and wait for the analysis to complete.

21. Review the recommendations, which list suggested indexes and statistics to drop or create.

22. Close Database Engine Tuning Advisor and minimize SQL Server Profiler.
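
Note: (Optional.) Instead of scrolling through the trace in SQL Server Profiler, you can load the saved .trc
file into a result set with sys.fn_trace_gettable and filter it with Transact-SQL. This sketch assumes the
trace was saved to the path used in step 6.

-- Load the saved trace file and list the longest-running statements that returned more than 100 rows.
SELECT TOP (20) TextData, Duration, Reads, RowCounts
FROM sys.fn_trace_gettable('D:\Labfiles\Lab10\Starter\Data Warehouse Query Workload.trc', DEFAULT)
WHERE RowCounts > 100
ORDER BY Duration DESC;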

X Task 5: Manage Workload Priorities


1. Maximize SQL Server Management Studio, and in Object Explorer, expand Management, right-click
Resource Governor, and then click Enable.

2. Expand Resource Governor, right-click Resource Pools, and then click New Resource Pool.

3. In the Resource Governor Properties dialog box, in the Resource pools section, add the following
resource pools.

Name            Minimum CPU %   Maximum CPU %   Minimum Memory %   Maximum Memory %

Low Priority    0               50              0                  50

High Priority   20              90              20                 90

4. In the Resource Governor Properties dialog box, in the Resource pools section, select the Low
Priority resource pool you created in the previous step, and in the Workload groups for resource
pool: Low Priority section, add the following workload group.

Name           Importance   Maximum Requests   CPU Time (sec)   Memory Grant %   Memory Grant Time-out (sec)   Degree of Parallelism

User Queries   Low          10                 50               50               20                            1

5. In the Resource Governor Properties dialog box, in the Resource pools section, select the High
Priority resource pool you created in the previous step, and in the Workload groups for resource
pool: High Priority section, add the following workload group.

Name   Importance   Maximum Requests   CPU Time (sec)   Memory Grant %   Memory Grant Time-out (sec)   Degree of Parallelism

ETL    High         100                80               80               30                            4

6. In the Resource Governor Properties dialog box, click OK.

7. On the File menu, point to Open, and then click File. Browse to the D:\Labfiles\Lab10\Starter folder,
and then open Classifier Function.sql.

8. Click Execute to run the script, which creates a function named fn_classify_apps that returns the
string “User Queries” if the application name in the current session is “SQLCMD”; it returns “ETL” if the
current application is named “SQL Server”.

9. In Object Explorer, right-click Resource Governor, and then click Properties. In the Resource
Governor Properties dialog box, in the Classifier function name list, select
[dbo].[fn_classify_apps], and then click OK.

10. Minimize SQL Server Management Studio, and maximize Computer Management.

11. In Computer Management, in the pane on the left, if necessary, expand Performance and
Monitoring Tools, and then click Performance Monitor.

12. In the pane at the bottom, select each counter in turn, and on the toolbar, click Delete (the red X
icon) until there are no counters displayed.

13. On the toolbar, click Add (the green + icon).

14. In the Add Counters dialog box, in the list of objects, expand the SQLServer:Resource Pool Stats
object, and then click CPU control effect %. If the Instances of selected object list is empty, click
CPU control effect % again. Then click High Priority, hold the Ctrl key and click Low Priority,
and then click Add.

15. Repeat the previous step to add the following counters from the SQLServer:Workload Group Stats
object for the ETL and User Queries instances:

o CPU usage %

o Reduced memory grants/sec


16. In the Add Counters dialog box, click OK. Note that Performance Monitor displays the counter
values. Press Ctrl+H, and note that you can highlight the currently selected counter in the graph.

17. In the D:\Labfiles\Lab10\Starter folder, double-click RunDWQueries.cmd to start a user query
workload. Observe the values of the counters in Performance Monitor.

18. With the RunDWQueries.cmd command still running, in the D:\Labfiles\Lab10\Starter folder, double-
click RunETL.cmd to start an ETL workload. Observe the values of the counters in Performance
Monitor. Note that the CPU control effect % for both workloads increases as Resource Governor
prioritizes CPU resources for the ETL workload.

19. After both workloads finish running, minimize Computer Management.
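
Note: (Optional.) The Resource Governor configuration built in steps 1 to 9 of this task can also be expressed
in Transact-SQL. The script below is a rough equivalent sketch; the classifier function shown is an
illustrative version based on the description of Classifier Function.sql in step 8, and the lab's actual
script may differ in detail.

-- Resource pools (values match the table in step 3).
CREATE RESOURCE POOL [Low Priority]
WITH (MIN_CPU_PERCENT = 0, MAX_CPU_PERCENT = 50,
      MIN_MEMORY_PERCENT = 0, MAX_MEMORY_PERCENT = 50);
CREATE RESOURCE POOL [High Priority]
WITH (MIN_CPU_PERCENT = 20, MAX_CPU_PERCENT = 90,
      MIN_MEMORY_PERCENT = 20, MAX_MEMORY_PERCENT = 90);

-- Workload groups (values match the tables in steps 4 and 5).
CREATE WORKLOAD GROUP [User Queries]
WITH (IMPORTANCE = LOW, GROUP_MAX_REQUESTS = 10,
      REQUEST_MAX_CPU_TIME_SEC = 50, REQUEST_MAX_MEMORY_GRANT_PERCENT = 50,
      REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 20, MAX_DOP = 1)
USING [Low Priority];
CREATE WORKLOAD GROUP [ETL]
WITH (IMPORTANCE = HIGH, GROUP_MAX_REQUESTS = 100,
      REQUEST_MAX_CPU_TIME_SEC = 80, REQUEST_MAX_MEMORY_GRANT_PERCENT = 80,
      REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 30, MAX_DOP = 4)
USING [High Priority];
GO

-- Illustrative classifier function (classifier functions must live in master and be schema bound).
USE master;
GO
CREATE FUNCTION dbo.fn_classify_apps()
RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @grp sysname = N'default';
    IF APP_NAME() = N'SQLCMD' SET @grp = N'User Queries';
    IF APP_NAME() = N'SQL Server' SET @grp = N'ETL';
    RETURN @grp;
END;
GO

-- Register the classifier and apply the configuration.
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_classify_apps);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO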

X Task 6: View Server Activity


1. Maximize SQL Server Management Studio, and in Object Explorer, expand SQL Server Agent and its
Jobs folder.

2. Right-click collection_set_1_noncached_collect_and_upload, and then click Start Job at Step. In
the Start Job on MIA-SQLDW dialog box, click Start to run the job. After the job completes, click
Close.

3. Repeat the previous step to run the following jobs:

• collection_set_2_upload.

• collection_set_3_upload.

4. In Object Explorer, under Management, right-click Data Collection, point to Reports, point to
Management Data Warehouse, and then click Server Activity History.

5. Under the timeline, click the zoom in icon (a magnifying glass that contains a “+” symbol). Keep
zooming in to see activity for a shorter period of time.
6. In Object Explorer, under Management, right-click Data Collection, and then click Disable Data
Collection. After the action completes, click Close.

7. Keep SQL Server Management Studio open for the next exercise.
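
Note: (Optional.) Rather than starting each collection job from Object Explorer, you can start them from
Transact-SQL with msdb.dbo.sp_start_job. This sketch assumes the job names shown in steps 2 and 3.

-- Start the data collector upload jobs; each call returns as soon as the job has been requested to start.
EXEC msdb.dbo.sp_start_job @job_name = N'collection_set_1_noncached_collect_and_upload';
EXEC msdb.dbo.sp_start_job @job_name = N'collection_set_2_upload';
EXEC msdb.dbo.sp_start_job @job_name = N'collection_set_3_upload';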

Results: At the end of this exercise, you will have a Performance Monitor report showing activity during
an ETL data load and recommendations from the Database Engine Tuning Advisor based on a SQL Server
Profiler trace. You will also have created resource pools and workload groups for Resource Governor,
and generated server health data with the Data Collector.

Exercise 2: Monitoring and Optimizing Analysis Services


X Task 1: Restart Analysis Services
1. In SQL Server Management Studio, in Object Explorer, click Connect, and then click Analysis Services.
Use Windows authentication to connect to the MIA-SQLBI instance of Analysis Services.

2. In Object Explorer, right-click the MIA-SQLBI Analysis Services instance, and then click Restart. When
prompted to allow the program to make changes, click Yes, and when prompted to confirm the
restart action, click Yes. Wait for Analysis Services to restart.

3. Minimize SQL Server Management Studio.

X Task 2: Create a SQL Server Profiler Trace for Analysis Services


1. Maximize SQL Server Profiler, and on the File menu, click New Trace.

2. When prompted, use Windows authentication to connect to the MIA-SQLBI instance of Analysis
Services.

3. In the Trace Properties dialog box, in the Trace name box, type Analysis Services Query Trace.

4. On the Events Selection tab, select Show all events, and then clear the Events check box in all rows
other than the following:
o Progress Report Begin.

o Progress Report End.

o Query Begin.

o Query End.

o Query Cube Begin.

o Query Cube End.


o Query Subcube.

o Serialize Results Begin.

o Serialize Results End.

5. Clear the Show all events check box, and then select Show all columns. Clear the selected check
boxes in all columns other than the following:

o EventSubclass

o TextData

o ApplicationName

o Duration

o DatabaseName

o ObjectName

o SPID

o CPUTime

6. Clear the Show all columns check box.

7. Click Column Filters, and in the Edit Filter dialog box, select DatabaseName, expand Like, type
AWSalesMD, and then click OK.

8. Click Run, and then minimize SQL Server Profiler.



X Task 3: Add Analysis Services Counters to Performance Monitor


1. Maximize Computer Management, and in the pane on the left, if necessary, expand Performance
and Monitoring Tools, and then click Performance Monitor.

2. In the pane at the bottom, select each counter in turn, and on the toolbar, click Delete (the red X
icon) until there are no counters displayed.

3. On the toolbar, click Add (the green + icon).

4. In the Add Counters dialog box, in the list of objects, expand the MSAS11: MDX object, click Total
cells calculated, click Add, and then click OK.

5. On the toolbar, in the Change graph type drop-down list, click Report. Total cells calculated
should currently have the value 0.000.

6. Minimize Computer Management.

X Task 4: Monitor an MDX Query


1. Maximize SQL Server Management Studio, and on the File menu, point to Open, and then click File.
Browse to the D:\Labfiles\Lab10\Starter folder and open MDX Query.mdx. If you are prompted, use
Windows authentication to connect to the MIA-SQLBI instance of Analysis Services.

2. In the Available Databases drop-down list, select AWSalesMD.

3. Click Execute, and wait for the query to return results.

4. Maximize SQL Server Profiler, and on the File menu, click Stop Trace. Maximize Computer
Management, and on the toolbar, click Freeze Display.
5. In SQL Server Profiler, view the trace, and note the Duration value for the last Query Subcube event
(which represents the time spent retrieving the cube from the storage engine), and the Duration
value for the last Serialize Results End event (which represents the time spent manipulating the data
after it was retrieved from storage).

6. In Computer Management, note the Total cells calculated value.

Note: These results indicate that the query spent significantly more time manipulating the data
than retrieving it from the storage engine, and a very large number of cells were calculated
during the execution of the query. The most appropriate way to improve the query performance
is to optimize the MDX and reduce the number of calculations being performed.

X Task 5: Monitor a Revised Query


1. In SQL Server Management Studio, in Object Explorer, right-click the MIA-SQLBI Analysis Services
instance, and then click Restart. When prompted to allow the program to make changes, click Yes,
and when prompted to confirm the restart action, click Yes. Wait for Analysis Services to restart.

2. Minimize SQL Server Management Studio.

3. In SQL Server Profiler, on the File menu, click Run Trace. Minimize SQL Server Profiler.
4. In Computer Management, on the toolbar, click Unfreeze Display. If the Total cells calculated value
does not revert to 0.000, right-click the report, and then click Clear. Minimize Computer
Management.
5. Maximize SQL Server Management Studio, and on the File menu, point to Open, and then click File.
Browse to the D:\Labfiles\Lab10\Starter folder and open Revised MDX Query.mdx. If you are
prompted, use Windows authentication to connect to the MIA-SQLBI instance of Analysis Services.

6. In the Available Databases drop-down list, select AWSalesMD.



7. Click Execute, and wait for the query to return results.

8. Maximize SQL Server Profiler, and on the File menu, click Stop Trace. Maximize Computer
Management, and on the toolbar, click Freeze Display.

9. In SQL Server Profiler, view the trace and note the Duration value for the last Query Subcube event
(which represents the time spent retrieving the cube from the storage engine) and the Duration
value for the last Serialize Results End event (which represents the time spent in the formula
engine).

10. In Computer Management, note the Total cells calculated value.

11. Close SQL Server Management Studio, and minimize SQL Server Profiler and Computer Management.

Note: The revised version of the query uses a WITH SET statement to sort the resellers by
revenue before applying the RANK function. This enables the query processor to use a linear hash
scan to find each reseller’s position in the ordered list, dramatically reducing the number of
calculations required to produce the results.

Results: At the end of this exercise, you will have created a SQL Server Profiler trace and used
Performance Monitor to view Analysis Services performance data while executing an MDX query.

Exercise 3: Monitoring and Optimizing Reporting Services


X Task 1: Deploy Reports
1. In the D:\Labfiles\Lab10\Starter folder, double-click Executive Reports.sln to open the solution in
Visual Studio.

2. On the Build menu, click Deploy Executive Reports.

3. After deployment completes, close Visual Studio.

4. Start Internet Explorer, and browse to the Adventure Works Portal at http://mia-sqlbi/sites/adventureworks.
5. In the Quick Launch area, click Reports and verify that the report items have been deployed.

6. Click Data Sources, and then click the ellipsis to the right of AWDataWarehouse, and in the pop-up
window that appears, click Edit.

7. On the AWDataWarehouse page, perform the following steps:

8. Select Stored credentials, and enter the following credentials:

o User Name: ADVENTUREWORKS\ServiceAcct

o Password: Pa$$w0rd

9. Select Use as Windows credentials.

10. Click Test Connection, and after the connection is created successfully, click OK. If the connection
fails, correct any errors in the settings and try again.

11. In the Quick Launch area, click Reports, and then click Reseller Sales to view the report.

12. At the top of the page, click Reports to return to the Reports folder.

X Task 2: Add Reporting Services Counters to Performance Monitor


1. Maximize Computer Management, and in the pane on the left, if necessary, expand Performance
and Monitoring Tools, and then click Performance Monitor.

2. In the pane at the bottom, select each counter in turn, and on the toolbar, click Delete (the red x
icon) until there are no counters displayed.

3. On the toolbar, click Add (the green + icon).

4. In the Add Counters dialog box, in the list of objects, expand the MSRS 2011 Web Service
SharePoint Mode object.

5. Click Total Cache Hits, hold the Ctrl key and click Total Cache Misses and Total Reports Executed,
click Add, and then click OK.

6. On the toolbar, in the Change graph type drop-down list, click Report.

Note the current values of the three counters you added, and then minimize Computer Management.

X Task 3: Create a SQL Server Profiler Trace for Report Queries


1. Maximize SQL Server Profiler, and on the File menu, click New Trace.

2. When prompted, use Windows authentication to connect to the MIA-SQLDW instance of the
database engine.

3. In the Trace Properties dialog box, in the Trace name box, type Reporting Services Query Trace.

4. On the Events Selection tab, select Show all events, and then clear the Events check box in all rows
other than the following:

o RPC:Completed

o SQL:BatchCompleted

5. Clear the Show all events check box, and select Show all columns. Clear the selected check boxes in
all columns other than the following:

o TextData
o ApplicationName

o CPU

o Duration

o SPID

o StartTime

o BinaryData
o DatabaseName

6. Clear the Show all columns check box.

7. Click Column Filters, and in the Edit Filter dialog box, select DatabaseName, expand Like, type
%AWDataWarehouse%, and then click OK.

8. Click Run, and then minimize SQL Server Profiler.

X Task 4: Monitor Report Execution


1. In Internet Explorer, in the Reports folder, click Reseller Sales, and then view the report with the
default parameter values.

2. Maximize Computer Management and note that the number of report executions has increased by
one, but that the total cache hits and total cache misses are unchanged.

3. Maximize SQL Server Profiler and note that the trace includes:

o Two SQL:BatchCompleted events, which record the execution of a Transact-SQL query used to
retrieve the default values for the StartDate and EndDate parameters. The query was run once
when the report was first executed, and again to populate the available values lists after the
report rendered.

o An RPC:Completed event that records the use of the sp_executesql stored procedure to retrieve
the data for the report.

4. In Internet Explorer, in the Parameters pane, change the Start Date and End Date parameters so
that the report will show data for the past six months, and then click Apply.

5. Maximize Computer Management and note that the number of report executions has increased by
one, but that the total cache hits and total cache misses are unchanged.

6. Maximize SQL Server Profiler and note that the trace includes a new RPC:Completed event to
retrieve the data for the report with the modified parameter values. Then, on the File menu, click
Stop Trace.
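
Note: (Optional.) Report execution statistics, including whether a request was served from the cache, are
also recorded in the report server catalog database. The query below is a sketch only; the database is
referred to here as ReportServer for brevity, but in SharePoint mode the actual database name typically
includes a GUID suffix, so substitute the name used by your installation.

-- Review recent report executions; Source = 'Cache' indicates a cache hit.
USE ReportServer;  -- assumption: replace with the actual report server database name
GO
SELECT TOP (20) ItemPath, UserName, TimeStart,
       TimeDataRetrieval, TimeProcessing, TimeRendering,
       Source, Status, [RowCount]
FROM dbo.ExecutionLog3
ORDER BY TimeStart DESC;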

X Task 5: Configure Caching for a Shared Dataset


1. In Internet Explorer, click the Reports link at the top of the page, and then click the Datasets folder.

2. Click the ellipsis to the right of Last Month, and in the pop-up window that appears, click the ellipsis
and click Manage Caching Options.

3. In the Manage Caching Options page, select Cache shared dataset. Ensure that On a custom schedule
is selected, and then click Configure.

4. In the Frequency section, select Month, and in the Schedule section, change the On calendar
day(s) value to 1, and the Start time value to 12:00, and then click OK.

5. In the Manage Caching Options page, click OK.

6. Maximize SQL Server Profiler, and on the File menu, click Run Trace.

7. In Internet Explorer, in the Quick Launch area, click Reports, and then click Reseller Sales to view the
report with the default parameter values.
8. Maximize Computer Management and note the number of report executions has increased by one,
but that the total cache hits and total cache misses are unchanged.

9. Maximize SQL Server Profiler and note that the trace includes:
o One SQL:BatchCompleted event, which records the execution of a Transact-SQL query used to
retrieve the default values for the StartDate and EndDate parameters when the report was first
viewed. When the report was rendered, the cached dataset was used to populate the available
values lists.

o An RPC:Completed event that records the use of the sp_executesql stored procedure to retrieve
the data for the report.
10. In Internet Explorer, in the Parameters pane, change the Start Date and End Date parameters so
that the report will show data for the past six months, and then click Apply.

11. Maximize Computer Management and note that the number of report executions has increased by
one, but that the total cache hits and total cache misses are unchanged.

12. Maximize SQL Server Profiler and note that the trace includes a new RPC:Completed event to
retrieve the data for the report with the modified parameter values. Then, on the File menu, click
Stop Trace.

X Task 6: Configure Caching for a Report


1. In Internet Explorer, click the Reports link at the top of the page.

2. Click the ellipsis to the right of Reseller Sales, and in the pop-up window that appears, click the
ellipsis and click Manage Processing Options.

3. In the Manage Processing Options page, in the Data Refresh Options section, select Use cached
data. Ensure that Elapsed time in minutes is selected with the value 30, and then click OK.

4. Maximize SQL Server Profiler, and on the File menu, click Run Trace.

5. In Internet Explorer, click Reseller Sales to view the report with the default parameter values.

6. Maximize Computer Management and note that the number of report executions has increased by
one. The number of cache misses has also increased because there was no cached copy of the report
with the default parameter values.

7. Maximize SQL Server Profiler and note that the trace includes an RPC:Completed event that records
the use of the sp_executesql stored procedure to retrieve the data for the report.

8. In Internet Explorer, in the Parameters pane, change the Start Date and End Date parameters so
that the report will show data for the past six months, and then click Apply.

9. Maximize Computer Management and note that the number of report executions has increased by
one. The number of cache misses has also increased because there was no cached copy of the report
with the custom parameter values.

10. Maximize SQL Server Profiler and note that the trace includes a new RPC:Completed event to
retrieve the data for the report with the modified parameter values.

11. In Internet Explorer, click the Reports link at the top of the page. Then click Reseller Sales to view
the report with the default parameter values again.

12. Maximize Computer Management and note that the number of report executions has increased by
one. The number of cache hits has also increased because the report was executed with the default
parameter values within the last 30 minutes, and could be retrieved from the cache.

13. Maximize SQL Server Profiler and note that the trace does not show any queries in the data
warehouse. All datasets were cached.

14. In Internet Explorer, in the Parameters pane, change the Start Date and End Date parameters so
that the report will show data for the past six months, and then click Apply.

15. Maximize Computer Management and note that the number of report executions has increased by
one. The number of cache hits has also increased because the report was executed with the same
custom parameter values within the last 30 minutes, and could be retrieved from the cache.
16. Maximize SQL Server Profiler and note that the trace does not show any queries in the data
warehouse. All datasets were cached. Then, on the File menu, click Stop Trace.

17. Close SQL Server Profiler, Computer Management and Internet Explorer.

Results: At the end of this exercise, you will have deployed Reporting Services items to a SharePoint
Server document library, and configured caching for a dataset and a report.

Module 11: Operating a BI Solution


Lab: Operating a BI Solution
Exercise 1: Configuring SQL Server Integration Services
X Task 1: Prepare the Lab Environment
1. Ensure that the 20467A-MIA-DC and 20467A-MIA-SQLBI virtual machines are both running, and then
log on to 20467A-MIA-SQLBI as ADVENTUREWORKS\Student with the password Pa$$w0rd.

2. On the task bar, click the File Explorer shortcut.

3. View the contents of the D:\Labfiles\Lab11\Starter folder.

4. Right-click Setup.cmd, and then click Run as administrator.


5. Click Yes when prompted to confirm that you want to run the command file, and then wait for the
script to finish.

X Task 2: Create an SSIS Catalog


1. Start SQL Server Management Studio and connect to the MIA-SQLDW instance of the database
engine using Windows authentication.

2. In Object Explorer right-click Integration Services Catalogs and click Create Catalog.

3. In the Password box, type Pa$$w0rd, in the Retype Password box, type Pa$$w0rd, and then click
OK.

4. In Object Explorer expand Integration Services Catalogs.

5. Right-click SSISDB and click Create Folder.


6. In the Folder name box, type DW ETL.

7. In the Folder description box, type Folder for the Adventure Works ETL SSIS Project, and click OK.

8. Expand the DW ETL folder, and then right-click the Projects folder and click Deploy Project.

9. In the Integration Services Deployment Wizard, on the Introduction page, click Next.

10. In the Integration Services Deployment Wizard, on the Select Source page, ensure that Project
deployment file is selected, and click Browse. Browse to the D:\Labfiles\Lab11\Starter folder and
double-click LoadPartition.ispac. Then click Next.

11. In the Integration Services Deployment Wizard, on the Select Destination page, ensure that the
Server name box contains the value MIA-SQLDW, and the Path box contains the value
/SSISDB/DW ETL/LoadPartition. Then click Next.

12. In the Integration Services Deployment Wizard, on the Review page, click Deploy.

13. When deployment has completed, click Close.
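
Note: (Optional.) Folder creation and project deployment can also be scripted against the SSISDB catalog
stored procedures instead of using Object Explorer and the deployment wizard. The following Transact-SQL is
a hedged sketch of that approach, using the same folder name and .ispac file as the steps above.

-- Create the catalog folder.
DECLARE @folder_id bigint;
EXEC SSISDB.catalog.create_folder
     @folder_name = N'DW ETL',
     @folder_id = @folder_id OUTPUT;

-- Read the deployment file as a binary stream and deploy the project.
DECLARE @project_stream varbinary(max) =
    (SELECT BulkColumn
     FROM OPENROWSET(BULK 'D:\Labfiles\Lab11\Starter\LoadPartition.ispac', SINGLE_BLOB) AS ispac);
DECLARE @operation_id bigint;
EXEC SSISDB.catalog.deploy_project
     @folder_name = N'DW ETL',
     @project_name = N'LoadPartition',
     @project_stream = @project_stream,
     @operation_id = @operation_id OUTPUT;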

X Task 3: Configure SSIS Environments


1. In SQL Server Management Studio, in Object Explorer, under DW ETL, right-click Environments and
click Create Environment.

2. In the Environment name box, type Test and click OK.

3. Expand Environments, right-click Test and click Properties.

4. In the Select a page pane, click Variables.



5. In the Variables page, create the following variables and then click OK.

Name            Type     Description                     Value       Sensitive

StagingServer   String   SQL Server for staging DB       localhost   Unselected

DWServer        String   SQL Server for data warehouse   localhost   Unselected

6. Repeat steps 1 to 5 to create a second environment named Production with the following variables:

Name            Type     Description                     Value       Sensitive

StagingServer   String   SQL Server for staging DB       MIA-SQLDW   Unselected

DWServer        String   SQL Server for data warehouse   MIA-SQLDW   Unselected

7. In Object Explorer under DW ETL, expand Projects.

8. Right-click LoadPartition and click Configure.


9. In the Configure – LoadPartition dialog box, in the Select a page pane, click References and click
Add.

10. In the Browse Environments dialog box, under Local Folder (DW ETL), click Test, and click OK.

11. Repeat the previous two steps to add a reference to the Production environment.

12. In the Configure – LoadPartition dialog box, in the Select a page pane, click Parameters.

13. On the Connection Managers tab, select the AWDataWarehouse connection manager, and click
the ellipsis (…) button for the Value of the ServerName property.

14. In the Set Parameter Value dialog box, select Use environment variable and in the drop-down list,
select DWServer. Then click OK.

15. Repeat the previous two steps to use the StagingServer environment variable for the ServerName
property of the Staging connection manager. Then, click OK.

16. Keep SQL Server Management Studio open for the next exercise.
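
Note: (Optional.) Environments and their variables can also be created with the SSISDB catalog stored
procedures. This sketch covers the Test environment only; the Production environment follows the same
pattern, and binding the project parameters to the environment variables (catalog.create_environment_reference
and catalog.set_object_parameter_value) is omitted here for brevity.

-- Create the Test environment and its two variables in the DW ETL folder.
EXEC SSISDB.catalog.create_environment
     @folder_name = N'DW ETL',
     @environment_name = N'Test';
EXEC SSISDB.catalog.create_environment_variable
     @folder_name = N'DW ETL', @environment_name = N'Test',
     @variable_name = N'StagingServer', @data_type = N'String',
     @sensitive = 0, @value = N'localhost',
     @description = N'SQL Server for staging DB';
EXEC SSISDB.catalog.create_environment_variable
     @folder_name = N'DW ETL', @environment_name = N'Test',
     @variable_name = N'DWServer', @data_type = N'String',
     @sensitive = 0, @value = N'localhost',
     @description = N'SQL Server for data warehouse';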

Results: At the end of this exercise, you will have an SSIS catalog that contains environments named
Test and Production, and you will have deployed the LoadPartition SSIS project to the SSIS catalog.

Exercise 2: Automate Data Warehouse Load Tasks


X Task 1: Create a Script to Maintain Indexes and Statistics
1. In Object Explorer, expand Databases, expand AWDataWarehouse, and expand Tables.

2. Expand the dbo.DimSalesTerritory table, right-click Indexes, and click Rebuild All.

3. In the Rebuild Indexes dialog box, in the Script drop-down list, click Script Action to New Query
Window. Then, click Cancel to close the dialog box without rebuilding the indexes.

4. In the SQLQuery1.sql window, modify all three of the ALTER INDEX statements to change the
ONLINE = OFF clause to ONLINE = ON.
5. At the end of the script, after the last GO statement, add the following Transact-SQL code:

EXEC sp_updatestats
GO

6. On the File menu, click Save SQLQuery1.sql. Save the script as Rebuild Sales Territory Indexes.sql
in the D:\Labfiles\Lab11\Starter folder.
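
Note: The exact index names and options in the generated script depend on the lab database, but after the
edits in steps 4 and 5 the saved file should be roughly equivalent to the following simplified sketch (which
uses ALTER INDEX ALL as shorthand for the three generated ALTER INDEX statements).

-- Rebuild the dimension table's indexes online, then refresh statistics.
ALTER INDEX ALL ON dbo.DimSalesTerritory
REBUILD WITH (ONLINE = ON);
GO
EXEC sp_updatestats
GO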

X Task 2: Create an XMLA Script to Process a Cube


1. In SQL Server Management Studio, in Object Explorer, click Connect. Then click Analysis Services.

2. When prompted, use Windows authentication to connect to the MIA-SQLBI instance of Analysis
Services.

3. In Object Explorer, under the MIA-SQLBI Analysis Services server, expand Databases, expand
AWSalesMD, and expand Cubes.
4. Right-click Sales, and click Process.

5. In the Process Cube – Sales dialog box, in the Script drop-down list, click Script Action to File. Save
the script as Process Sales Cube.xmla in the D:\Labfiles\Lab11\Starter folder. Then, in the Process
Cube – Sales dialog box, click Cancel to close it without processing the cube.

X Task 3: Create a SQL Server Agent Job


1. In Object Explorer, under the MIA-SQLDW SQL Server, expand SQL Server Agent.

2. Right-click Jobs, and click New Job.

3. In the New Job dialog box, in the Name box, type Data Warehouse Load.

4. In the New Job dialog box, on the Steps page, click New.

5. In the New Job Step dialog box, in the Step Name box, type Run ETL Package, and in the Type drop-
down list, click SQL Server Integration Services Package.

6. On the Package tab, in the Server drop-down list, type MIA-SQLDW. Then click the ellipsis (…)
button next to the Package box.
7. In the Select an SSIS Package dialog box, expand SSISDB, expand DW ETL, expand LoadPartition,
select LoadDW.dtsx, and in the Select an SSIS Package dialog box, click OK.

8. In the New Job Step dialog box, on the Configuration tab, select the Environment check box, and
in the drop-down list, select .\Test. Then click OK.

9. In the New Job dialog box, on the Steps page, click New.

10. In the New Job Step dialog box, in the Step Name box, type Update Indexes. Ensure that Type is
set to Transact-SQL script (T-SQL), and in the Database drop-down list, select AWDataWarehouse.

11. Under Command, click Open, then browse to the D:\Labfiles\Lab11\Starter folder, select Rebuild
Sales Territory Indexes.sql, and click Open.

12. In the New Job Step dialog box, verify that the Transact-SQL script you edited earlier is displayed,
and then click OK.

13. In the New Job dialog box, on the Steps page, click New.

14. In the New Job Step dialog box, in the Step Name box, type Process Cube, and in the Type drop-
down list, click SQL Server Analysis Services Command.

15. In the Server box, type MIA-SQLBI.


16. Under Command, click Open, then browse to the D:\Labfiles\Lab11\Starter folder, select Process
Sales Cube.xmla, and click Open.

17. In the New Job Step dialog box, verify that an XMLA script to process the Sales cube is displayed,
and then click OK.

18. In the New Job dialog box, on the Schedules page, click New.

19. In the New Job Schedule dialog box, in the Name box, type Monthly Data Load. In the Frequency
section, in the Occurs drop-down list, select Monthly, and review the default settings to verify that
the schedule will run at 12:00 on the first day of every month. Then click OK.

20. In the New Job dialog box, click OK.

21. Keep SQL Server Management Studio open for the next exercise.
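
Note: (Optional.) For reference, the job and its schedule can also be created with the SQL Server Agent
stored procedures in msdb. The sketch below creates a simplified job that contains only the Transact-SQL
step and the monthly schedule; the SSIS and Analysis Services steps use other subsystems and are easier to
configure through the dialog boxes as described above.

USE msdb;
GO
-- Create a simplified, scripted version of the job.
EXEC dbo.sp_add_job @job_name = N'Data Warehouse Load (scripted)';
EXEC dbo.sp_add_jobstep
     @job_name = N'Data Warehouse Load (scripted)',
     @step_name = N'Update Indexes',
     @subsystem = N'TSQL',
     @database_name = N'AWDataWarehouse',
     @command = N'ALTER INDEX ALL ON dbo.DimSalesTerritory REBUILD WITH (ONLINE = ON); EXEC sp_updatestats;';
-- Monthly schedule: 12:00 on the first day of every month.
EXEC dbo.sp_add_schedule
     @schedule_name = N'Monthly Data Load',
     @freq_type = 16,               -- monthly
     @freq_interval = 1,            -- day 1 of the month
     @active_start_time = 120000;   -- 12:00:00
EXEC dbo.sp_attach_schedule
     @job_name = N'Data Warehouse Load (scripted)',
     @schedule_name = N'Monthly Data Load';
EXEC dbo.sp_add_jobserver @job_name = N'Data Warehouse Load (scripted)';
GO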

Results: At the end of this exercise, you will have a SQL Server Agent job named Data Warehouse
Load.

Exercise 3: Test Automated Operations


X Task 1: Run the SQL Server Agent Job
1. In Object Explorer, right-click the Data Warehouse Load job and click Start Job at Step.

2. In the Start Job on MIA-SQLDW dialog box, click Start.

3. When the job completes successfully, click Close.

X Task 2: View Job History


1. In Object Explorer, right-click the Data Warehouse Load job and click View History.

2. In the Log File Viewer – MIA-SQLDW dialog box, expand the first entry and verify that all three
steps were completed successfully. Then click Close.

X Task 3: View SSIS Package Execution History


1. In Object Explorer under Integration Services Catalogs, right-click SSISDB, point to Reports, point
to Standard Reports, and click Integration Services Dashboard.

2. Notice that the most recent package execution succeeded, and then click Overview.

3. In the Overview report, in the Parameters Used table, verify that the values used for the
AWDataWarehouse.ServerName and Staging.ServerName parameters were the values you
specified in the Test environment.

4. At the top of the report, click View Messages.

5. In the Messages report, view the messages that were logged during package execution. Then, click
View Overview to return to the overview report.

6. At the top of the report, click View Performance.

7. In the Execution Performance report, note the duration of the package execution.

8. Close SQL Server Management Studio without saving any files.
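
Note: (Optional.) The standard reports used in this task are built on the SSISDB catalog views, which you
can also query directly. The following sketch lists recent executions and the messages logged for the most
recent one.

-- Recent package executions (status 7 = succeeded, 4 = failed).
SELECT TOP (10) execution_id, folder_name, project_name, package_name,
       status, start_time, end_time
FROM SSISDB.catalog.executions
ORDER BY start_time DESC;

-- Messages logged for the most recent execution.
DECLARE @execution_id bigint =
    (SELECT MAX(execution_id) FROM SSISDB.catalog.executions);
SELECT message_time, message
FROM SSISDB.catalog.event_messages
WHERE operation_id = @execution_id
ORDER BY message_time;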

Results: At the end of this exercise, you will have executed a job, reviewed job history, and reviewed SSIS
catalog reports.
