
Data Development with Microsoft ® technologies


Volume 4 / Issue 3

Microsoft Talks Data!


Link up with LINQ
ADO.NET Data Services
Microsoft Sync Framework
SQL Server 2008
ODBC Rocks!

www.code-magazine.com
US $ 5.95 Can $ 8.95
TABLE OF CONTENTS

Features
5 Welcome Letter
6 An Entity Data Model for Relational Data Part I: Defining the Entity Data Model
  Michael Pizzo
10 An Entity Data Model for Relational Data Part II: Mapping an Entity Data Model to a Relational Store
  Michael Pizzo
16 Programming Against the ADO.NET Entity Framework
  Shyam Pather
22 Rich Query for DataSet – An Introduction to LINQ to DataSet
  Andrew Conrad
26 LINQ to Relational Data: Who’s Who?
  Elisa Flasko
32 Browsing Windows Live Expo with LINQ to XML
  David Schach
38 ADO.NET Data Services
  Mike Flasko and Elisa Flasko
42 Caching with SQL Server Compact and the Microsoft Sync Framework for ADO.NET
  Steve Lasker
52 Introducing the Microsoft Sync Framework: Next Generation Synchronization Framework
  Moe Khosravy
56 What’s New in SQL Server 2008?
  Anthony Carrabino
58 Programming SQL Server 2008
  Vaughn Washington
63 Use SQL CLR 2.0 – Advancing CLR Integration in SQL Server 2008
  José Blakeley and Christian Kleinerman
68 Visual Studio 2008: RAD Gets RADer
  Jonathan Wells
71 The Data Dude Meets Team Build
  Gert Drapers
76 XML Tools in Visual Studio 2008
  Stan Kitsis
80 ODBC Rocks!
  Chris Lee
84 Tesla: Democratizing the Cloud
  Brian Beckman and Erik Meijer

Departments
50 CoDe Compilers
57 Advertisers Index

US subscriptions are US $29.99 for one year. Subscriptions outside the US pay US $44.99. Payments should be made in US dollars drawn on a US
bank. American Express, MasterCard, Visa, and Discover credit cards are accepted. Bill me option is available only for US subscriptions. Back issues
are available. For subscription information, email subscriptions@code-magazine.com or contact customer service at 832-717-4445 ext 10.
Subscribe online at www.code-magazine.com
CoDe Component Developer Magazine (ISSN # 1547-5166) is published bimonthly by EPS Software Corporation, 6605 Cypresswood Drive., Suite
300, Spring, TX 77379. POSTMASTER: Send address changes to CoDe Component Developer Magazine, 6605 Cypresswood Drive., Suite 300,
Spring, TX 77379.



ONLINE QUICK ID 0712012
The Data Programmability Team

Dear Reader,

It’s a great time to be working with data. As we take stock of how our customers use information today, we see data being pulled from the Web, our desktops, departmental, workgroup and enterprise servers, and then surfacing in all kinds of different form factors. The sheer quantity of data available in systems is already mind-boggling and growing at a tremendous rate. At Microsoft, especially in the Data Programmability team, we are ridiculously excited to be working at the center of this storm. The authors of these articles will show you new innovations and best practices across the spectrum. These experts will show you new techniques and components for handling the myriad data shapes, locations, and sources that developers deal with on a regular basis. This edition of CoDe Focus offers us the opportunity to tell the story of our vision for the data platform as it stands today and in the near future. Of course it doesn’t stop there. Your thoughts, critical feedback, and participation in Community Technology Previews will help us all realize that vision.

“Data everywhere, usable by anyone at any time.” That’s our grand mission: provide the best programming patterns, infrastructure, and tools for developing data-driven applications for Windows, .NET, and Microsoft SQL Server.

I invite you to take a walk with us across the techniques, tools, and components in this issue. We’ll be looking at our native data access technologies where we continue to innovate in raw speed and power to ensure the fattest and fastest pipe to databases via ODBC. We’ll show you some of the innovation we have been doing in XML Tools to support building applications that manipulate and store XML and work with schemas via XSD. We’ll dive into the .NET Framework where we have a bucket full of great stuff. Language Integrated Query (LINQ) offers a huge step forward in writing understandable and approachable queries of all sorts in programming languages. We’ll tour LINQ solutions for XML, SQL, Entities, and DataSet. Data access APIs are not a complete panacea, of course, and we’ve got a big down payment for model-driven development with data. We are super excited to announce the release of the ADO.NET Entity Framework and the foundational Entity Data Model, which drives a set of core data programming experiences now and into the future. We are pleased to show the Entity Designer that provides an exciting developer experience for creating conceptual models in the EDM that can be used directly with the ADO.NET Entity Framework.

As Web application technologies evolve, the work we are doing evolves as well; this includes areas we are currently working in such as ADO.NET Data Services and tier splitting.

Two trends stand out as harbingers of radical change: the proliferation of mobile computing devices and near-ubiquitous wireless network connectivity. Together they open up a world of possibilities for new applications. The one-two combo we’ll show you with the new release of SQL Server Compact and Microsoft Sync Framework can unshackle your applications from the desktop office and get your customers dancing in the aisles.

To get involved, try stuff out, and to learn more, come visit us on the Web at http://msdn.com/data and then join the conversations starting from http://blogs.msdn.com/data.

So let’s take that stroll, if you’re ready!

Sincerely,

Samuel Druker
General Manager, Data Programmability
Microsoft Corporation
http://msdn.com/data

ONLINE QUICK ID 0712022

An Entity Data Model for Relational Data Part I: Defining the Entity Data Model

Michael Pizzo
Michael.Pizzo@microsoft.com

Michael Pizzo has worked for over 17 years in the design and delivery of data access solutions and APIs at Microsoft. Michael first got involved in data access as a Program Manager for Microsoft Excel in 1987, integrating Microsoft’s flagship spreadsheet product with relational data. This led to his involvement in the design and delivery of ODBC, along with the ODBC-based Microsoft Query Tool shipped with Microsoft Office. During the design of ODBC, Michael was active in the standards organizations, sitting as Chair for the SQL Access Group, working with X/Open on the CAE specification for “Data Management: SQL-Call Level Interface (CLI)”, serving as Microsoft’s representative to the ANSI X3H2 Database Committee, and as an elected ANSI representative to the ISO committee meetings that defined and adopted Part 3 of the ANSI/ISO SQL specification for a Call-Level Interface (SQL/CLI). Following ODBC, Michael was a key designer and driver of Microsoft’s OLE DB API for componentized data access within a COM environment, and later owned the design and delivery of ADO.NET version 1.0. He is currently a Principal Architect in the Data Programmability Team at Microsoft, contributing to the architecture and design of the next version of ADO.NET and a core building block of Microsoft’s exciting new data platform: the ADO.NET Entity Framework.

Microsoft’s Entity Data Model allows you to define an application-oriented view of your data consistent with how you reason about that data.

Part I of this article describes the Entity Data Model and how it enables you to represent real-world concepts in a way that makes relationships between related pieces of data more explicit and easier to query, navigate, and consume than through the traditional relational database model. Part II of the article discusses how Microsoft’s ADO.NET Entity Framework provides a flexible mapping of an application-oriented conceptual schema in terms of the Entity Data Model to existing relational database schemas. Shyam Pather’s article, “Programming Against the ADO.NET Entity Framework”, completes the picture by describing the actual programming model and API exposed by the framework.

Fast Facts
ADO.NET Entity Framework lets developers build applications that access data by programming against a conceptual application model instead of programming directly against a relational storage schema.

The world around us is infinitely complex. Describing that world in a way we can reason about requires us to break it into simpler components. We can see a pattern emerge in how we describe the physical world when we think about how we communicate concepts to our children; there are “things” (“Tessa”, “Kara”, a blanket) and relationships between things (“The blanket belongs to Kara”).

In his groundbreaking 1976 paper introducing the entity-relationship model, Dr. Peter Chen defines a way of modeling real-world concepts by breaking complex information into things (entities) and associations between things (relationships).

Contrast this with the relational model employed by relational databases today. The relational model, first described by Edgar Codd in 1969, emphasizes relations, or tables of data, not relationships. The relational model is built around data normalization, which simplifies data storage and maintenance by minimizing the duplication of information in order to enforce data consistency. For example, to describe the fact that the blanket belongs to Kara, you might define three tables: Person, Object, and Ownership.

Dr. Chen compares the relational and entity-relationship models, saying that the entity-relationship model adopts a “…more natural view that the real world consists of entities and relationships” where the relational model “…can achieve a high degree of data independence, but it may lose some important semantic information about the real world.” For example, the three tables ("Person", "Object", and "Ownership") are not enough to understand the model. You must know that columns within the "Ownership" table contain the primary key fields of the object being owned and of the owner; the semantic meaning of those columns is not captured by the model itself.

Microsoft’s new ADO.NET Entity Framework is an implementation of Dr. Chen’s entity-relationship model that maps relational database schemas to an Entity Data Model. This article describes key aspects of the Entity Data Model along with features of the ADO.NET Entity Framework, which allow mapping of that Entity Data Model to a relational store.

Components of the Entity Data Model

Following the entity-relationship model, the Entity Data Model is composed of entities and relationships.
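The blanket example can be made concrete with a small sketch. It is written in Python purely for illustration and is not Entity Framework code; the column names (PersonID, OwnerID, ObjectID) and the Person, Thing, and assign_owner names are assumptions introduced here. The first half stores ownership as bare key values in three tables, so the caller has to know what the keys mean; the second half makes the relationship a navigable part of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Relational style: three tables of rows. Column names are hypothetical.
PERSON = [{"PersonID": 1, "Name": "Kara"}, {"PersonID": 2, "Name": "Tessa"}]
OBJECT = [{"ObjectID": 10, "Description": "blanket"}]
OWNERSHIP = [{"OwnerID": 1, "ObjectID": 10}]  # nothing here says what the keys mean


def owner_name_relational(description: str) -> Optional[str]:
    """Find an owner by joining the tables on keys the caller must already understand."""
    for obj in OBJECT:
        if obj["Description"] != description:
            continue
        for row in OWNERSHIP:
            if row["ObjectID"] != obj["ObjectID"]:
                continue
            for person in PERSON:
                if person["PersonID"] == row["OwnerID"]:
                    return person["Name"]
    return None


# Entity-relationship style: the relationship is a named, navigable part of the model.
@dataclass(eq=False)
class Person:
    name: str
    belongings: List["Thing"] = field(default_factory=list)


@dataclass(eq=False)
class Thing:
    description: str
    owner: Optional[Person] = None


def assign_owner(thing: Thing, owner: Person) -> None:
    """Record the ownership relationship on both ends."""
    thing.owner = owner
    owner.belongings.append(thing)


kara = Person("Kara")
blanket = Thing("blanket")
assign_owner(blanket, kara)

print(owner_name_relational("blanket"))  # found by joining three tables
print(blanket.owner.name)                # found by navigating the relationship
```

Both lookups return the same owner; the difference is where the knowledge of "who owns what" lives. In the first it is in the caller's head, in the second it is in the model.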
What Is an Entity?

An entity is an instance of an EntityType (for example, a person or a blanket). The EntityType describes the Properties that define the structure of the entity (for example, name, birthdate, hair color and eye color). In order to be an entity, there must be a set of Key properties that uniquely identify the instance from other instances of the same EntityType within an EntitySet (for example, social security number).

Inheritance

EntityTypes may extend other EntityTypes through inheritance. For example, a Salesperson EntityType may extend an Employee EntityType by adding properties for Region, Quota, and Commission, while an Engineer may extend the same Employee EntityType by adding properties indicating the ProductTeam she is a member of. The Entity Data Model does not support multiple inheritance (a Salesperson cannot also be an Engineer, within the same model).

Inheritance typically implies substitutability (either a Salesperson or an Engineer can be supplied anywhere an Employee is requested) as well as polymorphism (a request for Employees can return both Salespersons and Engineers).

ComplexTypes

Related properties may be grouped together as a single composite property. For example, StreetAddress, City, Region, and ZipCode may be grouped together into a single "Address" property. The structure of that composite property is defined through a ComplexType that can be used by multiple EntityTypes, as well as other ComplexTypes, within a schema (for example, Employees may have a "HomeAddress" property, while Orders may have a "ShipTo" address). ComplexTypes differ from EntityTypes in that they do not have independent identifiers; an instance of a complex type is addressed by an instance of an entity plus the name of the property on that instance for which the complex type is defined.

Relationships

Relationships define interesting associations between entities (for example, "Ownership"). Relationships are described by an AssociationType, which defines the types of entities that make up the association (for example, “ManagerEmployee” is made up of two “Employee” EntityTypes), their Roles (“Manager” and “Employee”) and Cardinality (each Employee has at most one Manager, while a Manager can have one or more Employees). Relationships may be one to one (for example, a marriage), one to many (for example, Manager to Employees), or many to many (for example, students to classes).

Compositional Relationships

Compositional relationships are a special type of relationship in which one entity within the relationship contains the related entity (or entities). For example, OrderLines may be contained within an Order. In a compositional relationship, the contained entity (“OrderLine”) must be related to exactly one containing entity (“Order”). Thus:

• A contained entity cannot be associated with more than one containing entity through the compositional relationship. It is possible for it to be associated with other entities through non-compositional relationships (an OrderLine cannot be associated with more than one Order, but may be associated with products, suppliers, etc.).
• Any instance of a contained entity must be related to an instance of a containing entity (an OrderLine must have an Order).
• Deleting the containing entity (Order) deletes all contained entities (OrderLines).
• Additionally, an entity can be the contained entity in at most one compositional relationship.

While the initial version of the Microsoft Entity Data Model does not directly support compositional relationships, most of the characteristics of a compositional relationship (other than the restriction that an entity can be the contained entity in at most one compositional relationship) can be modeled through identifying relationships. In an identifying relationship, the key field(s) of the containing entity make up part of the key for the contained entity, and referential integrity is used to ensure the contained entity has a non-null containing entity, and is deleted if the containing entity is deleted.

Relationships with Payloads

Relationships with payloads (often called association entities) are used to add additional information to a relationship. For example, an "Employment" relationship between a Company and a Person may include HireDate, Salary, and Level. Although the initial version of the Entity Data Model does not directly support relationships with payloads, the same information can be represented by defining an intermediate EntityType with the additional information that has one to one relationships with the other two EntityTypes (for example, an “Employment” EntityType with HireDate, Salary, and Level, and relationships to both Company and Person).

N-ary Relationships

Similarly, although the first version of the Entity Data Model only supports binary relationships (relationships with exactly two ends), n-ary relationships (relationships that may have more than two ends) can be represented by defining an intermediate EntityType with more than two binary relationships. (For example, an EntityType “Game” with relationships to home team, visiting team, referee, and the location. In this case, you may want to add other properties, such as start time and duration of the game, and final score.)

While defining an intermediate EntityType captures the content of relationships with payloads and n-ary relationships, doing so loses some of the semantic meaning of the model (“Employment” isn’t really an EntityType; it exists only to describe the relationship between two entities). Microsoft is looking to add both association entities and n-ary relationships, as well as compositional relationships, to future versions of the Entity Data Model to more fully represent the semantic meaning.

Containers

So far I've described how entity and relationship types are defined. Applications interact with entities and relationships through an instance of an EDM schema, defined by named sets of entity and relationship instances.

EntitySets

Instances of entities live within a named EntitySet. A single instance of an entity can belong to only one EntitySet. An EntitySet is the equivalent of a relational table.

RelationshipSets

Just as entities live within a named EntitySet, relationship instances live within a RelationshipSet. RelationshipSets hold the relationship instances of a particular type between entity instances within two specific EntitySets. RelationshipSets are loosely analogous to join tables in relational schemas.

EntityContainers

EntitySets and RelationshipSets are defined within an EntityContainer. An EntityContainer can have multiple EntitySets of the same EntityType.

Conclusion

Microsoft’s Entity Data Model defines an entity-relationship model for dealing with data. By modeling data in terms of instances and relationships, services such as querying, reporting, synchronizing, and programmability against an object model can be defined in terms of that entity model. Part II of this article describes how the Entity Data Model is used by the Microsoft ADO.NET Entity Framework to define an application-oriented schema that can be flexibly mapped to a variety of relational schema representations.

Michael Pizzo
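As a recap of the vocabulary used in Part I (EntityType, Key, inheritance, EntitySet, EntityContainer), here is a minimal Python sketch. It is an illustration under assumptions, not the Entity Framework API: the class names mirror the article's terms, but the add, of_type, and create_entity_set methods and the "CompanyEntities" container are hypothetical. It shows a key enforced per EntitySet, a polymorphic request for Employees, and two EntitySets of the same EntityType inside one container.

```python
class Employee:
    def __init__(self, employee_id, name):
        self.employee_id = employee_id  # Key property
        self.name = name


class Salesperson(Employee):
    def __init__(self, employee_id, name, region, quota):
        super().__init__(employee_id, name)
        self.region = region
        self.quota = quota


class Engineer(Employee):
    def __init__(self, employee_id, name, product_team):
        super().__init__(employee_id, name)
        self.product_team = product_team


class EntitySet:
    """Named set of entities of one EntityType (or its subtypes), unique by key."""

    def __init__(self, name, entity_type, key):
        self.name = name
        self.entity_type = entity_type
        self.key = key
        self._by_key = {}

    def add(self, entity):
        # Substitutability: any subtype of the declared EntityType is accepted.
        if not isinstance(entity, self.entity_type):
            raise TypeError(f"{self.name} only holds {self.entity_type.__name__} entities")
        key_value = getattr(entity, self.key)
        if key_value in self._by_key:
            raise ValueError(f"duplicate key {key_value!r} in {self.name}")
        self._by_key[key_value] = entity

    def of_type(self, entity_type):
        """Polymorphic query: return entities that are instances of entity_type."""
        return [e for e in self._by_key.values() if isinstance(e, entity_type)]


class EntityContainer:
    """Holds named EntitySets; several sets may share one EntityType."""

    def __init__(self, name):
        self.name = name
        self.entity_sets = {}

    def create_entity_set(self, set_name, entity_type, key):
        entity_set = EntitySet(set_name, entity_type, key)
        self.entity_sets[set_name] = entity_set
        return entity_set


container = EntityContainer("CompanyEntities")
current = container.create_entity_set("CurrentEmployees", Employee, "employee_id")
former = container.create_entity_set("FormerEmployees", Employee, "employee_id")

current.add(Salesperson(1, "Ann", region="West", quota=100))
current.add(Engineer(2, "Bo", product_team="Sync"))
former.add(Employee(3, "Cy"))

print([e.name for e in current.of_type(Employee)])     # both subtypes come back
print([e.name for e in current.of_type(Salesperson)])  # only the salesperson
```

The sketch leaves out RelationshipSets and the rule that one entity instance belongs to a single EntitySet; both could be layered on the container in the same style.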
ONLINE QUICK ID 0712032

An Entity Data Model for Relational Data Part II: Mapping an Entity Data Model to a Relational Store

Michael Pizzo
Michael.Pizzo@microsoft.com

The ADO.NET Entity Framework allows you to define an application-oriented view of your data consistent with how you reason about that data, and map that conceptual view to existing relational schemas.

Part I of this article described the Entity Data Model and how it enables you to model real-world concepts in a more natural way. Part II of the article describes how that Entity Data Model is used within the ADO.NET Entity Framework to define an application-oriented conceptual view of your data, and how that view can be flexibly mapped to existing relational schemas. Shyam Pather’s article, “Programming Against the ADO.NET Entity Framework”, completes the picture by describing the actual programming model and API used by developers to work with data using the ADO.NET Entity Framework.

Fast Facts
Microsoft’s new ADO.NET Entity Framework is an implementation of Dr. Chen’s entity-relationship model that maps relational database schemas to an Entity Data Model.

The ADO.NET Entity Framework supports flexible mapping of an application-oriented conceptual view of the data, defined in terms of the Entity Data Model, to existing relational tables. The Entity Framework includes an EntityClient that maps between an Entity Data Model and a relational schema, an Object Services layer that allows applications to query and return results in terms of strongly-typed Common Language Runtime (CLR) data objects, and a Metadata Runtime that exposes metadata and mapping information to both the EntityClient and Object Services layer.

EntityClient

The EntityClient is an ADO.NET Data Provider that allows applications to work in terms of a conceptual Entity Data Model. The EntityClient accepts queries defined in terms of a canonical Entity SQL (ESQL) grammar, a version of SQL extended to provide first-class operators around types, collections, and relationships defined in the Entity Data Model.

The EntityClient generates client-side query and update views to represent the mapping between the conceptual Entity Data Model and the relational storage schema. These views are used to expand queries and updates against the conceptual model such that the query passed to the underlying ADO.NET Data Provider is in terms of common relational operators against the actual storage schema. All query evaluation is done within the store, and returned results are assembled to expose possibly polymorphic, hierarchical data readers.

The ability to generate these views on the client, based on metadata supplied at runtime, provides a loose coupling between the application and storage schema, allowing each to evolve independently over time without having to define application-specific views on the server or recompile the application to account for changes in the database schema. Alternatively, generated client views may be compiled into the application as a performance optimization.

Object Services

While applications can consume their conceptual Entity Data Model through existing ADO.NET Data Provider APIs using the EntityClient, it is often convenient to query and return results in terms of strongly-typed CLR data objects. The ADO.NET Entity Framework includes an Object Services layer that allows applications to query and return results in terms of CLR data objects using either Entity SQL or Language Integrated Query (LINQ). Results may optionally be identity managed (so the same entity returned through multiple queries within the same context returns the same object instance) and change-tracked (so changes made to the in-memory object graph can be saved back to the database).
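Identity management and change tracking are general techniques, and a small Python sketch may help make them concrete. This is not the Object Services API: TrackingContext, get_customer, changed_entities, and save_changes are hypothetical names, a dictionary stands in for the database, and key values are assumed never to change.

```python
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    """Plain data object standing in for a strongly typed entity class."""
    customer_id: str
    company_name: str


class TrackingContext:
    """Hypothetical context that identity-manages and change-tracks entities."""

    def __init__(self, rows):
        self._store = {row["customer_id"]: dict(row) for row in rows}  # stand-in database
        self._identity_map = {}  # key -> the single in-memory instance
        self._originals = {}     # key -> values as first loaded or last saved

    def get_customer(self, customer_id):
        # Identity management: a key that was already loaded yields the same object.
        if customer_id not in self._identity_map:
            row = self._store[customer_id]
            self._identity_map[customer_id] = Customer(**row)
            self._originals[customer_id] = dict(row)
        return self._identity_map[customer_id]

    def changed_entities(self):
        # Change tracking: compare current values with the stored snapshot.
        return [entity for key, entity in self._identity_map.items()
                if asdict(entity) != self._originals[key]]

    def save_changes(self):
        # Write modified entities back and refresh their snapshots.
        changed = self.changed_entities()
        for entity in changed:
            snapshot = asdict(entity)
            self._store[entity.customer_id] = snapshot
            self._originals[entity.customer_id] = dict(snapshot)
        return len(changed)


ctx = TrackingContext([{"customer_id": "ALFKI", "company_name": "Alfreds Futterkiste"}])
first = ctx.get_customer("ALFKI")
second = ctx.get_customer("ALFKI")
print(first is second)        # one instance per key within the context
first.company_name = "Alfreds"
print(ctx.save_changes())     # number of entities written back
```

Snapshot comparison is only one way to detect changes; a context could equally record modifications as they happen. The sketch picks snapshots because they keep the entity class a plain data object.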

Metadata

In order to generate mapping views, the Entity Framework relies on three pieces of metadata:

• A description of the actual storage schema, defined in terms of a Store Schema Definition Language (SSDL).
• A description of the application’s conceptual view, defined in terms of a Conceptual Schema Definition Language (CSDL).
• A description of the mapping between the Conceptual Model and the Storage model, defined in terms of a Mapping Schema Language (MSL).

SSDL, CSDL, and MSL are XML grammars, generally provided as XML files or resources, but they can also be loaded at runtime from other sources, such as a metadata repository.

Figure 1: Simple relational schema from the Northwind Sample Database that ships with Microsoft SQL Server.
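To see how the three pieces of metadata cooperate, consider the following Python sketch. It is a deliberately tiny stand-in, not the Entity Framework's view generation: the dictionaries STORAGE, CONCEPTUAL, and MAPPING play the roles of SSDL, CSDL, and MSL, and expand_query is a hypothetical helper. The Title and ContactTitle names follow the article's Customers listings, where the conceptual property Title is mapped to the ContactTitle column.

```python
# Hypothetical, heavily simplified stand-ins for SSDL, CSDL, and MSL content.
STORAGE = {"Customers": ["CustomerID", "CompanyName", "ContactTitle"]}  # table -> columns
CONCEPTUAL = {"Customers": ["CustomerID", "CompanyName", "Title"]}      # entity set -> properties
MAPPING = {
    "Customers": {
        "table": "Customers",
        "columns": {
            "CustomerID": "CustomerID",
            "CompanyName": "CompanyName",
            "Title": "ContactTitle",  # conceptual name differs from the column name
        },
    }
}


def expand_query(entity_set, properties):
    """Rewrite a conceptual projection as SQL text against the storage schema."""
    mapping = MAPPING[entity_set]
    table = mapping["table"]
    select_items = []
    for prop in properties:
        if prop not in CONCEPTUAL[entity_set]:
            raise KeyError(f"{prop} is not a property of {entity_set}")
        column = mapping["columns"][prop]
        if column not in STORAGE[table]:
            raise KeyError(f"{column} is not a column of {table}")
        select_items.append(f"{column} AS {prop}")
    return f"SELECT {', '.join(select_items)} FROM {table}"


print(expand_query("Customers", ["CustomerID", "Title"]))
```

Because the application only names conceptual properties, renaming a column in the database would require a change to the mapping dictionary and nothing else, which is the loose coupling the EntityClient section describes.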
Mapping Entities

The Entity Framework supports mapping a single EntityType to one or more relational tables.

For example, Figure 1 shows a simple relational schema (a simplified version of the “Northwind” sample database shipped with Microsoft SQL Server). Listing 1 shows the SSDL describing the Customers table within the Northwind database.

Within Listing 1, the EntityType element defines the Customer EntityType, the Property elements define its members, the Key element defines the fields that uniquely identify a customer instance, and the EntitySet element defines an EntitySet named “Customers” containing Customer EntityTypes within the “dbo” EntityContainer.

Listing 2 describes an equivalent Conceptual view of Customers that mirrors the SSDL view one for one, apart from exposing the ContactTitle column as a property named Title.

Listing 3 shows the trivial mapping between the Customers table described in Listing 1 and the Customers EntitySet described in Listing 2.

Listing 1: SSDL for Customers table

<Schema Namespace="NorthwindModel.Store" Alias="Self"
    xmlns="http://schemas.microsoft.com/ado/2006/04/edm/ssdl">

  <EntityContainer Name="dbo">
    <EntitySet Name="Customers"
        EntityType="NorthwindModel.Store.Customer" />
  </EntityContainer>

  <EntityType Name="Customer">
    <Key>
      <PropertyRef Name="CustomerID" />
    </Key>
    <Property Name="CustomerID" Type="nchar"
        Nullable="false" MaxLength="5" />
    <Property Name="CompanyName" Type="nvarchar"
        Nullable="false" MaxLength="40" />
    <Property Name="ContactName" Type="nvarchar" MaxLength="30" />
    <Property Name="ContactTitle" Type="nvarchar" MaxLength="30" />
    <Property Name="Address" Type="nvarchar" MaxLength="60" />
    <Property Name="City" Type="nvarchar" MaxLength="15" />
    <Property Name="Region" Type="nvarchar" MaxLength="15" />
    <Property Name="PostalCode" Type="nvarchar" MaxLength="10" />
    <Property Name="Country" Type="nvarchar" MaxLength="15" />
    <Property Name="Phone" Type="nvarchar" MaxLength="24" />
    <Property Name="Fax" Type="nvarchar" MaxLength="24" />
  </EntityType>

</Schema>

Listing 2: CSDL for Customers EntitySet

<Schema Namespace="NorthwindModel" Alias="Self"
    xmlns="http://schemas.microsoft.com/ado/2006/04/edm">

  <EntityContainer Name="NorthwindEntities">
    <EntitySet Name="Customers"
        EntityType="NorthwindModel.Customer" />
  </EntityContainer>

  <EntityType Name="Customer">
    <Key>
      <PropertyRef Name="CustomerID" />
    </Key>
    <Property Name="CustomerID" Type="String"
        Nullable="false" MaxLength="5" FixedLength="true" />
    <Property Name="CompanyName" Type="String"
        Nullable="false" MaxLength="40" />
    <Property Name="ContactName" Type="String" MaxLength="30" />
    <Property Name="Title" Type="String" MaxLength="30" />
    <Property Name="Address" Type="String" MaxLength="60" />
    <Property Name="City" Type="String" MaxLength="15" />
    <Property Name="Region" Type="String" MaxLength="15" />
    <Property Name="PostalCode" Type="String" MaxLength="10" />
    <Property Name="Country" Type="String" MaxLength="15" />
    <Property Name="Phone" Type="String" MaxLength="24" />
    <Property Name="Fax" Type="String" MaxLength="24" />
  </EntityType>

</Schema>

Listing 3: Trivial MSL for Customers table to Customers EntitySet

<Mapping Space="C-S"
    xmlns="urn:schemas-microsoft-com:windows:storage:mapping:CS">

  <EntityContainerMapping StorageEntityContainer="dbo"
      CdmEntityContainer="NorthwindEntities">

    <EntitySetMapping Name="Customers" StoreEntitySet="Customers"
        TypeName="NorthwindModel.Customer">
      <ScalarProperty Name="CustomerID" ColumnName="CustomerID" />
      <ScalarProperty Name="CompanyName" ColumnName="CompanyName" />
      <ScalarProperty Name="ContactName" ColumnName="ContactName" />
      <ScalarProperty Name="Title" ColumnName="ContactTitle" />
      <ScalarProperty Name="Address" ColumnName="Address" />
      <ScalarProperty Name="City" ColumnName="City" />
      <ScalarProperty Name="Region" ColumnName="Region" />
      <ScalarProperty Name="PostalCode" ColumnName="PostalCode" />
      <ScalarProperty Name="Country" ColumnName="Country" />
      <ScalarProperty Name="Phone" ColumnName="Phone" />
      <ScalarProperty Name="Fax" ColumnName="Fax" />
    </EntitySetMapping>

  </EntityContainerMapping>

</Mapping>

Mapping a Single Entity to Multiple Tables

Suppose, however, that the information defined in an EntityType actually exists in multiple tables in the database. For example, suppose the information from the Employees table shown in Figure 1 was instead split into two tables: one containing common address book information (FirstName, LastName, Extension, ReportsTo, etc.) and one containing personal information such as Birthdate, Home Address, etc., as shown in Figure 2. In this case your conceptual model (CSDL) would remain the same, but your MSL would map the Employee EntityType to two separate tables, as shown in Listing 4.

Figure 2: Employee entity split between two tables.

Listing 4: MSL mapping Employee entity to two relational tables

<EntitySetMapping Name="Employees"
    TypeName="NorthwindModel.Employee">

  <MappingFragment StoreEntitySet="Employees_AddressBook">
    <ScalarProperty Name="EmployeeID" ColumnName="EmployeeID" />
    <ScalarProperty Name="LastName" ColumnName="LastName" />
    <ScalarProperty Name="FirstName" ColumnName="FirstName" />
    <ScalarProperty Name="Title" ColumnName="Title" />
    <ScalarProperty Name="TitleOfCourtesy"
        ColumnName="TitleOfCourtesy" />
    <ScalarProperty Name="Extension" ColumnName="Extension" />
    <ScalarProperty Name="Photo" ColumnName="Photo" />
    <ScalarProperty Name="PhotoPath" ColumnName="PhotoPath" />
  </MappingFragment>

  <MappingFragment StoreEntitySet="Employees_Personal" >
    <ScalarProperty Name="EmployeeID" ColumnName="EmployeeID" />
    <ScalarProperty Name="BirthDate" ColumnName="BirthDate" />
    <ScalarProperty Name="HireDate" ColumnName="HireDate" />
    <ScalarProperty Name="Address" ColumnName="Address" />
    <ScalarProperty Name="City" ColumnName="City" />
    <ScalarProperty Name="Region" ColumnName="Region" />
    <ScalarProperty Name="PostalCode" ColumnName="PostalCode" />
    <ScalarProperty Name="Country" ColumnName="Country" />
    <ScalarProperty Name="HomePhone" ColumnName="HomePhone" />
    <ScalarProperty Name="Notes" ColumnName="Notes" />
  </MappingFragment>

</EntitySetMapping>

Mapping Inheritance

The Relational Data Model does not directly support the concept of inheritance. Inheritance is commonly represented within a database in one of three different ways: mapping of an entire inheritance hierarchy to a single table, mapping of each type in an inheritance hierarchy to a different table, or a hybrid approach wherein the information common across types is in a single table with additional tables containing the added columns for each derived type. The ADO.NET Entity Framework supports mapping to all three of these inheritance models.

Table per Hierarchy (TPH)

The Table per Hierarchy (TPH) mapping strategy uses a single (sparse) table containing the data for all types within the hierarchy. A discriminator column is used to specify the subtype of a particular
row in the table, and columns not relevant for that type are unused. This strategy optimizes for queries across different subtypes, particularly in the case of polymorphic results.

Table per Concrete Type (TPC)

The Table per Concrete Type (TPC) strategy uses a different table for each type in the hierarchy. Queries against the base type union the common fields from the different tables. This strategy optimizes queries scoped to a particular subtype at the expense of queries against the base type.

Table per Type (TPT)

The Table per Type (TPT) strategy, also known as table per subclass, uses a single table for the base properties across the type hierarchy, with separate tables for the additional properties defined in each subtype. This strategy is optimal for queries across the type hierarchy that project out common fields from the base type, because such queries do not require joins to the subtype tables.

Mapping Complex Types

Oftentimes there are groups of related properties that we’d like to see represented by a single composite object. For example, looking at the Northwind schema in Figure 1 we see that Customers, Employees, and Orders all have address, city, region, country, and postal code properties that represent addresses. It would be convenient if we could define a single Address type that contained this information, and expose it as a single property anywhere we needed to represent an address.

The ADO.NET Entity Framework supports ComplexTypes for doing just that. The snippet below shows the definition of a complex type in CSDL.

<ComplexType Name="Address">
  <Property Name="StreetAddress"
      Type="String" MaxLength="60" />
  <Property Name="City"
      Type="String" MaxLength="15" />
  <Property Name="Region"
      Type="String" MaxLength="15" />
  <Property Name="PostalCode"
      Type="String" MaxLength="10" />
  <Property Name="Country"
      Type="String" MaxLength="15" />
</ComplexType>

The next CSDL fragment shows the use of the ComplexType within an EntityType definition. When binding to objects, the resulting CLR type would have a single property called Address, of type Address, containing the StreetAddress, City, Region, PostalCode, and Country fields.

<Property Name="Address"
    Type="NorthwindModel.Address"
    Nullable="false"
/>

The following MSL fragment shows how the Address complex property is mapped within an EntitySetMapping.

<ComplexProperty Name="Address">
  <ScalarProperty Name="StreetAddress"
      ColumnName="Address" />
  <ScalarProperty Name="City"
      ColumnName="City" />
  <ScalarProperty Name="Region"
      ColumnName="Region" />
  <ScalarProperty Name="PostalCode"
      ColumnName="PostalCode" />
  <ScalarProperty Name="Country"
      ColumnName="Country" />
</ComplexProperty>

Mapping Relationships

Relational databases use foreign keys to enforce referential integrity between related tables. While these integrity constraints are not a consumable part of the data model (i.e., you can’t navigate them), they do imply relationships between tables. The ADO.NET Entity Framework allows you to map foreign key properties, whether or not they are defined with a foreign key constraint, to navigable relationships through Associations.

The following fragment shows how an association between Customers and Orders is defined within the store schema definition (SSDL).

<Association Name="FK_Orders_Customers">
  <End Role="Customers"
      Type="NorthwindModel.Store.Customers"
      Multiplicity="0..1" />
  <End Role="Orders"
      Type="NorthwindModel.Store.Orders"
      Multiplicity="*" />
</Association>

If the association in the store is based on a referential constraint, that constraint can be called out within the Association element. For example, the following constraint specifies that the CustomerID of each order refers to the CustomerID of a customer.

<ReferentialConstraint>
  <Principal Role="Customers">
    <PropertyRef Name="CustomerID" />
  </Principal>
  <Dependent Role="Orders">
    <PropertyRef Name="CustomerID" />
  </Dependent>
</ReferentialConstraint>

Associations can also be defined in the conceptual schema (CSDL).

<Association Name="FK_Orders_Customers">
  <End Role="Customers"
      Type="NorthwindModel.Customers"
      Multiplicity="0..1" />
  <End Role="Orders"
      Type="NorthwindModel.Orders"
      Multiplicity="*" />
</Association>

In addition, since an EntityContainer can contain multiple EntitySets of the same type, an AssociationSet is defined to specify the sets of entities being related. AssociationSets can be queried directly, just like any EntitySet.

<EntityContainer Name="NorthwindEntities">
  <AssociationSet Name="FK_Orders_Customers"
      Association=
        "NorthwindModel.FK_Orders_Customers">
    <End Role="Customers" EntitySet="Customers" />
    <End Role="Orders" EntitySet="Orders" />
  </AssociationSet>
</EntityContainer>

Once associations have been defined in the CSDL, NavigationProperties can be defined on the related Entities to navigate from one end of the relationship to the other. For data classes generated from the conceptual model, navigation properties appear as any other properties and can be used to navigate the object graph.

For example, rather than “CustomerID” appearing as a string property on an Order, an Order may contain a navigation property (named “Customers” in the fragment that follows) that allows you to navigate to the actual related Customer instance.

<NavigationProperty Name="Customers"
    Relationship="NorthwindModel.FK_Orders_Customers"
    FromRole="Orders"
    ToRole="Customers"
/>

Navigation properties can be defined on both ends of a relationship; where the target end has a cardinality greater than one, the navigation property is a collection of related entities (for example, a Customer may have a collection of Orders).

<NavigationProperty Name="Orders"
    Relationship="NorthwindModel.FK_Orders_Customers"
    FromRole="Customers"
    ToRole="Orders"
/>

Not only do relationships surface as navigation properties in generated objects, but they can be used within queries to navigate the conceptual model. The following query navigates relationships and complex properties to select the OrderID and City for orders whose customer is located in the Washington (WA) region.

SELECT o.OrderID, o.Customers.Address.City
FROM Orders AS o
WHERE o.Customers.Address.Region = 'WA'

The syntax of the query is Entity SQL (eSQL), which is described in greater detail in Shyam Pather’s article on programming against the Entity Framework. For the purposes of this example, notice how natural it is to navigate relationships defined within the model without having to build complex joins in your query.

Mapping Many:Many Relationships

Many-to-many relationships (for example, a Student may be enrolled in multiple Courses, and each Course may have multiple Students) are generally mapped in relational databases through an intermediate "join" table. Where that table contains no information other than the primary keys of the related entities, the Entity Framework allows you to navigate directly from one end to the other without going through the intermediate table. Where the intermediate table has additional information that describes the relationship (for example, number of absences and grade), you can model the relationship as an entity (StudentEnrollment) with relationships to each of the other entities (Student and Course).

Mapping Functions and Stored Procedures

Databases typically include built-in and user-defined functions, as well as stored procedures, for executing logic within the store. The ADO.NET Entity Framework allows you to call stored procedures and functions that are declared in your storage schema (SSDL). For example, the following definition describes a function that returns all sales within a given category for a given year.

<Function Name="SalesByCategory" Aggregate="false"
    BuiltIn="false" NiladicFunction="false"
    IsComposable="false"
    ParameterTypeSemantics=
      "AllowImplicitConversion"
    Schema="dbo">
  <Parameter Name="CategoryName"
      Type="nvarchar" Mode="in" />
  <Parameter Name="OrdYear"
      Type="nvarchar" Mode="in" />
</Function>

Using Stored Procedures for Change Processing

Changes to data are often controlled through stored procedures to insert, update, or delete items. The ADO.NET Entity Framework supports the option to specify a stored procedure to be used in place of a generated DML statement in order to insert, delete, or update changes in the database.

Stored procedures to be used for updating must first be specified in the storage schema definition (SSDL) just as any other function or stored procedure. For example, a stored procedure to add an order detail might look like the following:
<Function Name="AddOrderDetail"
    IsComposable="false">
  <Parameter Name="OrderID" Type="int"/>
  <Parameter Name="ProductID" Type="int"/>
  <Parameter Name="UnitPrice" Type="money" />
  <Parameter Name="Quantity" Type="smallint" />
  <Parameter Name="Discount" Type="real" />
</Function>

Once the stored procedure is declared, it can be specified in a ModificationFunctionMapping element within the EntitySet definition in the mapping (MSL).

<InsertFunction
    FunctionName=
      "NorthwindModel.Store.AddOrderDetail">
  <ScalarProperty ParameterName="OrderID"
      Name="OrderID"/>
  <ScalarProperty ParameterName="ProductID"
      Name="ProductID"/>
  <ScalarProperty ParameterName="UnitPrice"
      Name="UnitPrice"/>
  <ScalarProperty ParameterName="Quantity"
      Name="Quantity"/>
  <ScalarProperty ParameterName="Discount"
      Name="Discount"/>
</InsertFunction>

Note that, within an object graph, you must process changes either entirely through stored procedures or entirely through generated update views. You cannot use a stored procedure to insert an entity and generated update views to update or delete within the same EntitySet, or any EntitySet with a relationship to the EntitySet updated through stored procedures.

Advanced Mapping

So far I’ve shown you examples of declarative mappings from which the ADO.NET Entity Framework generates client views for querying and updating a relational schema in terms of a conceptual application-oriented data model. There are times, however, when you may require more control over the views used to map your conceptual model to the store.

Explicit Mapping

The ADO.NET Entity Framework lets you override the automatic generation of query views by explicitly specifying a QueryView in your mapping. Views are expressed in terms of Entity SQL. As a simple example, the following mapping definition (MSL) fragment maps the Customers EntitySet to an Entity SQL query that returns only those customers from Italy.

<EntitySetMapping Name="Customers">
  <QueryView>
    SELECT VALUE NorthwindModel.Customer(
      c.CustomerID, c.CompanyName, c.ContactName,
      c.ContactTitle, c.Address, c.City,
      c.Region, c.PostalCode, c.Country,
      c.Phone, c.Fax)
    FROM dbo.Customers as c
    WHERE c.Country = "Italy"
  </QueryView>
</EntitySetMapping>

When using explicit mapping you must also specify stored procedures, as described previously, if you wish to update the data retrieved through these views.

Mapping to Store Queries

In addition to allowing you to specify custom Entity SQL query views in your mapping, the ADO.NET Entity Framework lets you define a storage table in terms of a native store query using a DefiningQuery. This allows you to leverage the full expressivity of the store’s native query language, and apply mapping to the target as a virtual storage table. As with explicit query views, if you use defining queries you must also specify stored procedures if you wish to update the data retrieved through these virtual storage tables.

The following storage schema definition (SSDL) fragment shows the use of a DefiningQuery to specify a virtual table of Customers whose titles contain the word “Sales”. Note that the query is written in terms of the store’s native query language (in this case, TSQL).

<EntitySet Name="Customers"
    EntityType=
      "NorthwindModel.Store.Customers">
  <DefiningQuery>
    SELECT c.CustomerID, c.CompanyName,
      c.ContactName, c.ContactTitle,
      c.Address, c.City, c.Region,
      c.PostalCode, c.Country, c.Phone, c.Fax
    FROM Customers as c
    WHERE c.ContactTitle LIKE '%Sales%'
  </DefiningQuery>
</EntitySet>

Conclusion

Using the ADO.NET Entity Framework, developers are able to program against an application-centric Entity Data Model. Within the Entity Framework, the EntityClient supports flexible declarative mapping of that conceptual model to existing relational storage schemas exposed through extended ADO.NET Data Providers. Developers may choose to use the traditional ADO.NET Data Provider API by writing to EntityClient directly, or query and return results in terms of strongly-typed data objects using Object Services.

Michael Pizzo
ONLINE QUICK ID 0712042

Programming Against the ADO.NET Entity Framework

The ADO.NET Entity Framework raises the level of abstraction at which developers work with data. Rather than coding against rows and columns, the ADO.NET Entity Framework allows you to define a higher-level Entity Data Model over your relational data, and then program in terms of this model. You get to deal with your data in the shapes that make sense for your application and those shapes are expressed in a richer vocabulary that includes concepts like inheritance, complex types, and explicit relationships.

Shyam Pather
spather@microsoft.com

Shyam Pather is a Senior Development Lead on the Data Programmability Team at Microsoft, currently focused on building the first release of the ADO.NET Entity Framework. Shyam began his career at Microsoft in the Windows Networking team, working first on network driver infrastructure and then on the first two releases of Universal Plug and Play in Windows. Shyam joined the SQL Server team to work on an incubation project that eventually became SQL Server Notification Services. After shipping two releases of that product, Shyam started in his current role on ADO.NET. His team delivers key parts of the object-relational mapping technology on which the ADO.NET Entity Framework is based.

Fast Facts

The ADO.NET Entity Framework allows you to program in terms of a high-level conceptual model of your data. This eliminates much of the need for custom data access layers.

In his article, An Entity Data Model for Relational Data (parts I & II), Mike Pizzo explains the details of the Entity Data Model and how to map a model to a database. In this article, I discuss the capabilities the ADO.NET Entity Framework provides for programming against a model after it is defined. The material in this article is intended as an overview that covers the breadth of features available to developers. Each of these features is a topic unto itself and is covered in more detail in the MSDN documentation that accompanies the ADO.NET Entity Framework.

Overview of the ADO.NET Entity Framework

The ADO.NET Entity Framework is built in a layered architecture on top of existing ADO.NET 2.0 data providers. Figure 1 illustrates this layering.

The ADO.NET 2.0 Data Providers offer a common programming model for accessing disparate data sources. Whether you are programming against a Microsoft SQL Server database or an Oracle database, you use the same fundamental programming abstractions: a connection object, a command object, and a data reader object.

The ADO.NET Entity Framework introduces a new data provider called EntityClient; this is the first layer above the ADO.NET 2.0 data providers. EntityClient offers the same programming abstractions as the other data providers (connections, commands, and data readers) but adds mapping capabilities that translate queries expressed in terms of the model into the equivalent queries in terms of the tables in the database. To complement these capabilities, the ADO.NET Entity Framework introduces an extended query language called Entity SQL (E-SQL), which augments traditional SQL with constructs necessary for querying in terms of the higher-level modeling concepts in Entity Data Models (inheritance, complex types, and explicit relationships).

Above the EntityClient layer, the ADO.NET Entity Framework includes an Object Services layer, which provides a programming model in terms of strongly-typed objects. At the Object Services layer, entities are represented as objects: instances of data classes that are mapped to the entity types in the model. The Object Services layer supports querying with both Entity SQL and Language Integrated Query (LINQ). Query results are manifested as objects whose properties can be read and set by application code. Object Services keeps track of changes made to the objects and can propagate these back to the database. The ADO.NET Entity Framework’s mapping engine performs the translation between changes to the entity objects and operations on the underlying database tables.

Developers can choose to program against either the EntityClient layer or the Object Services layer. This choice provides flexibility for different application scenarios. In cases where the application does not require objects, the EntityClient layer provides a lower-level entry point that offers the benefits of entities without the overhead of object materialization. When you require the convenience and change-tracking capabilities of objects, you can choose to work at the Object Services layer. In either case, an underlying ADO.NET data provider performs database operations and compensates for the differences between database systems. Thus, the ADO.NET Entity Framework exposes exactly the same programming model and query syntax, regardless of the particular DBMS in which the data is stored.

In this article, I’ll approach the ADO.NET Entity Framework top-down: First I’ll show some code

16 Programming Against the ADO.NET Entity Framework www.code-magazine.com


snippets that illustrate programming at the Object
Services layer. Then, I’ll delve into some examples
that show EntityClient in action.

A Sample Entity Data Model


For the purpose of illustration in this article, I’ll use
the sample data model shown in Figure 2. This mod-
el represents various types of athletic events, like
bike races, running events, and triathlons. There are
two base types in this model: Event, which encap-
sulates the attributes common to all athletic events,
and Participant, which represents a participant in
one or more events. The model declares a many-to-
many relationship between Events and Participants
called Event_Participant.

Figure 1: The ADO.NET Entity Framework is built in a layered architecture above the existing ADO.NET 2.0 data providers.

Both Event and Participant have sub-types. Event is the parent for sub-types BikeEvent, RunEvent, SwimEvent, and TriathlonEvent. Each of these sub-types augments the base type with new properties. Participant has a sub-type called Volunteer: This is used to represent participants who volunteer, as opposed to compete, in the events. Note that Volunteer does not add any new fields to its base type, Participant.
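To make the shape of this model concrete, the type hierarchy described above can be sketched as plain C# classes. This is only an illustrative sketch under stated assumptions, not the classes that the ADO.NET Entity Framework tools generate. The names EventName, StartDate, EndDate, StartLocation, Distance, ElevationGain, ParticipantName, and the Participants navigation property come from the snippets later in this article. Placing StartLocation, Distance, and ElevationGain on RunEvent, the Events property on Participant, the Link helper, and the participant name "Pat" are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Base type: attributes common to all athletic events.
public class Event
{
    public Event() { Participants = new List<Participant>(); }
    public string EventName { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    // One end of the many-to-many Event_Participant relationship.
    public List<Participant> Participants { get; private set; }
}

// Sub-types augment the base type with new properties.
// Assumption: these three properties are declared on RunEvent.
public class RunEvent : Event
{
    public string StartLocation { get; set; }
    public decimal Distance { get; set; }
    public int ElevationGain { get; set; }
}

// The article does not list these sub-types' properties, so they are left empty.
public class BikeEvent : Event { }
public class SwimEvent : Event { }
public class TriathlonEvent : Event { }

public class Participant
{
    public Participant() { Events = new List<Event>(); }
    public string ParticipantName { get; set; }
    // Assumption: the other end of the relationship is named Events.
    public List<Event> Events { get; private set; }
}

// Volunteer adds no fields of its own; it differs from Participant only by type.
public class Volunteer : Participant { }

public static class Program
{
    // Hypothetical helper that keeps both ends of the relationship in step.
    static void Link(Event e, Participant p)
    {
        e.Participants.Add(p);
        p.Events.Add(e);
    }

    public static void Main()
    {
        RunEvent run = new RunEvent();
        run.EventName = "Seattle Marathon";
        run.StartDate = new DateTime(2007, 11, 25);
        run.EndDate = new DateTime(2007, 11, 25);

        Participant competitor = new Participant();
        competitor.ParticipantName = "Colin";
        Volunteer helper = new Volunteer();
        helper.ParticipantName = "Pat"; // hypothetical sample name

        Link(run, competitor);
        Link(run, helper);

        // A RunEvent can be used wherever an Event is expected.
        Event asBase = run;
        foreach (Participant p in asBase.Participants)
        {
            Console.WriteLine("{0}: {1} ({2})",
                asBase.EventName, p.ParticipantName, p.GetType().Name);
        }
    }
}
```

Because Volunteer adds no members, only its type distinguishes a volunteer from a competitor, which is the distinction a type-based query filter relies on.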

The model also declares an entity container called TriEventsEntities, which contains entity sets for Events and Participants and an association set for the Event_Participant association. Figure 2 does not show the entity container, but it is a fundamental part of the model.

The ADO.NET Entity Framework supports several ways to map a model like this one to relational tables. I happened to use a Table-per-Type (TPT) mapping (a table for each base type and a table for each sub-type that stores just the additional properties declared by the sub-type). For the purposes of this article, the mapping strategy is irrelevant: the

Figure 2: This sample data model describes athletic events and their participants.



ADO.NET Entity Framework supports the same programming patterns regardless of how you map the model.

Programming with Object Services

To program against the model with Object Services, you need data classes that correspond to the types declared in the model. The ADO.NET Entity Framework includes tools that can generate these classes from the model definition. Alternatively, you can code these classes by hand (doing so requires you to implement a specific interface mandated by the ADO.NET Entity Framework). I’ve based the code snippets shown in this article on classes generated by the ADO.NET Entity Framework’s tools.

Here is a code snippet that enumerates entities in the model:

using (TriEventsEntities entities =
    new TriEventsEntities())
{
    // Enumerate entities
    foreach (Event e in entities.Events)
    {
        Console.WriteLine("Event Name: {0}",
            e.EventName);
    }
}

The preceding snippet is all that’s required to connect to the model and enumerate the Event entities within it. Dissecting this code highlights several important aspects of working with the ADO.NET Entity Framework.

The first line of code creates an instance of the TriEventsEntities class. The ADO.NET Entity Framework tools generated this class and it corresponds to the entity container declared in the model. It inherits from the Object Services base class, ObjectContext, and you can loosely refer to it as the “object context” for the model. You can think of this class as a starting point for interaction with the model: it encapsulates both a connection to the database and the in-memory state required to do change tracking and identity resolution for the objects that result from queries. It is instantiated within a using block, which ensures that it is disposed of properly at the end of the code.

Within the using block, the foreach loop iterates over objects in the Events property in the object context. This property represents the contents of the Events entity set (the generated object context class has a property for each entity set in the container). Iteration over the Events property sends a query for all Events to the database. The ADO.NET Entity Framework materializes the results of this query into objects of the generated Event class. Within the foreach loop, the code simply prints the value of one of the entity’s properties. You could easily imagine more sophisticated processing logic here.

At this point, you may be wondering how the ADO.NET Entity Framework establishes the connection to the database because the code snippet does not include any connection information. The default constructor generated on the object context class, TriEventsEntities, looks for a named connection string in the App.config file with the same name. The ADO.NET Entity Framework tools conveniently create the TriEventsEntities connection string in the App.config file when generating the class. This connection string will look something like the following:

metadata=.\TriEvents.csdl|.\TriEvents.ssdl|.\TriEvents.msl;
provider=System.Data.SqlClient;
provider connection string="Data Source=.\SqlExpress;
    Initial Catalog=TriEvents;
    Integrated Security=True;
    MultipleActiveResultSets=true"

You can think of this connection string as consisting of three parts. The first part specifies a list of metadata files; these files define the model, the database schema, and the mapping between the two in a format defined by the ADO.NET Entity Framework. The second part of the connection string specifies which ADO.NET data provider to use to connect to the database. This example specifies the SqlClient data provider. The final part of the connection string is really another connection string: the one the data provider uses to connect to the database. The ADO.NET Entity Framework uses the three parts of the connection string to establish a connection to a model: it uses the metadata files to load the model and mapping information, it uses the data provider name to instantiate the data provider, and it uses the provider’s connection string to connect to the database.

Sidebar: Composing Queries

The ObjectQuery<T> class supports query composition: You can compose any ObjectQuery<T> with additional query operators to form a new query. You do this via query-builder methods that correspond to the various query operators.

For example, the ObjectQuery<T> class contains a Where() method that takes a predicate. It composes this predicate with the original query to form a new, more restrictive query. ObjectQuery<T>’s query builder methods support both LINQ query specification (via lambda expressions) and Entity SQL fragments.

Navigating Relationships

The Entity Data Model contains an explicit relationship between Events and Participants. The Event and Participant types have navigation properties that traverse this relationship and allow you to find entities on one end of the relationship when you are holding a reference to the other end. For example, given an Event object, I can use the Participants navigation property to find the related Participant entities. The following code snippet shows how (in this and subsequent code snippets, the code shown goes within the using block declared in the previous code snippet; the using block is omitted here for brevity):

// Navigate relationships explicitly
foreach (Event e in entities.Events)
{
    Console.WriteLine("Event Name: {0}",
        e.EventName);

    e.Participants.Load();

    foreach (Participant p in e.Participants)
    {
        Console.WriteLine("\tParticipant:{0}",
            p.ParticipantName);
    }
}


This code navigates through the events and prints the related participants for each one. As in the previous example, the foreach loop that iterates over the Events results in a database query for all events. By default, when issuing such a query, the ADO.NET Entity Framework does not pull in related entities (doing so would add overhead that is unnecessary, unless you are going to access those related entities). Therefore, to obtain the related Participant objects, the code makes a call to the Load() method on the Participants collection on each Event entity. This issues a separate query to the database to retrieve the participants associated with each event. Thereafter, you can enumerate the participants with a foreach loop.

If you know you will need to pull in related entities, you can take advantage of an Object Services feature called Relationship Span and avoid having to issue separate queries. The following code snippet illustrates how to retrieve Event entities and the related Participants at the same time:

// Relationship span
foreach (Event e in
    entities.Events.Include("Participants"))
{
    Console.WriteLine("Event Name: {0}",
        e.EventName);

    foreach (Participant p in e.Participants)
    {
        Console.WriteLine("\tParticipant:{0}",
            p.ParticipantName);
    }
}

In the outer foreach loop, the code calls the Include() method to specify which related entities the query should retrieve. The Include() method takes a text representation of a path across one or more navigation properties (here you reference the “Participants” navigation property). In models that have several levels of relationships (for example, Categories related to Products related to Suppliers), you can use the Include method to traverse more than one level. Because the code called the Include() method in the outer enumeration, it is not necessary to call the Load() method before enumerating the Participants on each Event.

Sidebar: Data Programmability Team

Anil Nori

“I am excited to work on the next phase of the data platform strategy. With our store (database) presence on the servers, desktops, devices, mid-tier and cloud, and with data development technologies on these tiers, we really have an opportunity to build the end-to-end data platform. I am quite excited about the future.”

With over 20 years of experience in building complex database and eBusiness systems, Anil, a Distinguished Engineer on the Data Programmability team, is as excited as ever about architecting and building Microsoft’s data platform.

Previously, he was at Oracle as a database server architect for the Oracle8 and Oracle8i database releases. At Oracle, he drove the efforts in Oracle object-relational and extensible technology, Internet and multi-media DBMS development, and XML technology. Nori also worked as a Database Architect for DEC database products, where he was involved in the development of centralized and distributed DBMS products. In the space of applications, Nori co-founded Asera Inc., which built an eBusiness platform, tools, and solutions in Order Management and Supply Chain. Anil is actively involved in the database research community. He has published papers at ICDE, SIGMOD, and VLDB conferences.

Issuing Queries

The previous snippets showed examples of enumerating the top-level properties in the object context class to query for all entities in a particular entity set. In your application, you will likely want to issue other types of queries as well. The ADO.NET Entity Framework’s Object Services layer supports this via the ObjectQuery<T> class.

The following snippet shows ObjectQuery<T> used to query for all event entities that are of the sub-type, RunEvent. Here, the query is expressed using Entity SQL.

// Object Query
ObjectQuery<Event> runEventsQuery =
    entities.CreateQuery<Event>(@"
        SELECT VALUE e
        FROM TriEventsEntities.Events AS e
        WHERE e IS OF (ONLY TriEventsModel.RunEvent)");

foreach (Event e in runEventsQuery)
{
    Console.WriteLine("Event Name: {0}",
        e.EventName);
}

The code calls the CreateQuery<T>() method on the object context class to obtain an ObjectQuery<T> object. CreateQuery<T>() takes a string representation of the query. This example illustrates the use of the IS OF operator in Entity SQL, which filters entities based on type.

When you execute the query, the ADO.NET Entity Framework translates the Entity SQL query representation into a query in terms of the database schema. To do this, the ADO.NET Entity Framework consults the mapping information in the mapping files referenced in the connection string. The Object Services layer materializes results of the query as objects of the Event class.

Entity SQL provides powerful query capabilities. Because it is a text-based language, it lends itself well to scenarios in which you construct queries dynamically. For more information on Entity SQL, see the MSDN reference documentation.

As an alternative to Entity SQL, some developers may choose to express queries with LINQ. LINQ offers the advantages of strong compile-time checking and IntelliSense support in Visual Studio. The ADO.NET Entity Framework supports LINQ over its Object Services layer, as illustrated in the following snippet:

// LINQ Query
var longEventsQuery =
    from e in entities.Events
    where e.StartDate != e.EndDate
    select e;

foreach (Event e in longEventsQuery)
{
    Console.WriteLine("Event Name: {0}",
        e.EventName);
}

Here I use LINQ to formulate a query for events whose end dates are different than their start dates. Notice that I expressed the query in native C# language constructs and it is not enclosed in quotes. Had this query contained a syntactic error, the compiler would have caught it at compile time. For more information on LINQ, see Elisa Flasko’s article LINQ to Relational Data, also in this issue of CoDe Focus.

As with the Entity SQL example, the ADO.NET Entity Framework translates this query into the appropriate form for sending to the database. The database evaluates the query and the Object Services layer materializes the results as objects.


Performing Inserts, Updates, and Deletes

So far, all the examples have been read operations. The ADO.NET Entity Framework’s Object Services layer also supports inserting, updating, and deleting entities. The following code snippet shows how you can create a new RunEvent entity, along with two related Participants:

// Insert new entities and relationships
RunEvent runEvent = new RunEvent();
runEvent.EventName = "Seattle Marathon";
runEvent.StartLocation = "Memorial Stadium";
runEvent.StartDate = new DateTime(2007, 11, 25);
runEvent.EndDate = new DateTime(2007, 11, 25);
runEvent.Distance = 26.2M;
runEvent.ElevationGain = 4000;

entities.AddToEvents(runEvent);

Participant p1 = new Participant();
p1.ParticipantName = "Colin";
entities.AddToParticipants(p1);

Volunteer v1 = new Volunteer();
v1.ParticipantName = "Carl";
entities.AddToParticipants(v1);

runEvent.Participants.Add(p1);
runEvent.Participants.Add(v1);

entities.SaveChanges();

Simply create a new instance of the RunEvent class and set the properties. In this model, the database server generates the EventID property, so it is not necessary to set it here.

After creating the RunEvent object, add it to the list of objects that the object context is tracking by calling the AddToEvents() method. The generated object context class has methods for adding objects to each entity set, named AddTo[EntitySetName](). By calling AddToEvents(), you make the object context aware of the new Event entity for the purposes of change tracking and communicating with the database.

In addition to creating the RunEvent object, the code also creates two Participant objects in much the same way. One of the participants is a volunteer, so it creates an instance of the Volunteer derived class. For both objects, the code calls the context’s AddToParticipants() method to include them in the set of objects the context is tracking.

Finally, the code adds the participant objects to the Participants collection on the RunEvent object. This establishes the relationship between the event and the participants.

At this point the new objects and their relationships only exist in memory. To persist them, you call the SaveChanges() method on the context. This method propagates all changes it has tracked to the database. The ADO.NET Entity Framework’s mapping engine determines the appropriate database tables to modify as a result of the operations you performed with the objects. In this example, the operations add rows to the tables that store information about events and participants, as well as the link table used to relate the two.

Though this example just showed inserting new objects, you can perform updates and deletes in much the same way. For updates, the code simply modifies the properties of previously read objects. The object context will keep track of the changes and persist them when you call SaveChanges(). For deleting entities, the object context provides a DeleteObject() method, which you can use to mark an object as deleted. Again, when you call SaveChanges(), the ADO.NET Entity Framework propagates the delete operation to the database.

Programming with EntityClient

By design, programming with EntityClient follows the same patterns familiar to ADO.NET 2.0 developers. You use a connection object to establish a connection to a model, a command object to specify a query, and a data reader object to retrieve query results. EntityClient represents the results as data records, shaped according to the Entity Data Model.

Unlike Object Services, the EntityClient layer supports only Entity SQL as a query language. There is no LINQ support at this layer because there are no strongly-typed objects over which queries could be expressed. Also, EntityClient does not support write operations. To perform inserts, updates, or deletes, you must use the Object Services layer.

Despite these restrictions, EntityClient is a remarkably useful and efficient way to program against entities. This section looks at some code examples that illustrate EntityClient features.

Basic Querying with EntityClient

The following code snippet shows EntityClient used to retrieve all Event entities of type RunEvent (notice that this is the same Entity SQL query used in an earlier Object Services example; EntityClient supports exactly the same Entity SQL syntax):

using (EntityConnection conn = new
    EntityConnection("name=TriEventsEntities"))
{
    conn.Open();
    EntityCommand cmd = conn.CreateCommand();

    cmd.CommandText = @"
        SELECT VALUE e
        FROM TriEventsEntities.Events AS e
        WHERE e IS OF (ONLY TriEventsModel.RunEvent)";

    // Use data reader to get values
    using (DbDataReader reader =
        cmd.ExecuteReader(
            CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            Console.WriteLine(
                "Event Name: {0}",
                (string)reader["EventName"]);
        }
    }
}

The first line creates a new EntityConnection within a using block (this ensures the connection is disposed properly). The EntityConnection constructor takes a connection string. Here I’ve used the same named connection string used under the hood in the preceding Object Services examples. Alternatively, you could have specified any other connection string, as long as it consisted of the three parts described earlier.

After creating the connection object, the code calls the Open() method to open the connection. Next it creates an EntityCommand object associated with this connection by calling the connection’s CreateCommand() method. Then it sets the command text to an Entity SQL string.

Finally, it calls ExecuteReader() to execute the query and obtain a data reader to iterate over the results. This part of the code should look identical to results processing code you may have written against ADO.NET 2.0 providers. It uses a while loop to read each record and prints the value of one of its fields.

EntityClient uses the same mapping engine as the rest of the ADO.NET Entity Framework for performing query translation. The engine translates the query written here into database terms and reassembles the results into the shapes specified by the Entity Data Model. Because EntityClient does not materialize objects for the results, the overhead incurred is lower than in the Object Services layer.

Accessing Metadata from EntityClient

While the results that come back from EntityClient queries may appear similar to those returned by other data providers, there is one important difference: EntityClient records come with rich metadata that describes the entity types that define the record structure. You can use this to your advantage when writing code that processes the result data.

Notice in the previous example that you had to hardcode the name of the field (“EventName”) that you wanted to access in the result record. If you wanted to build a more dynamic system that printed all result fields without knowing them in advance, you could use metadata to do so, as the following example shows:

// Use metadata to drive result retrieval
using (DbDataReader reader =
    cmd.ExecuteReader(
        CommandBehavior.SequentialAccess))
{
    while (reader.Read())
    {
        IExtendedDataRecord extRecord =
            reader as IExtendedDataRecord;

        EntityType recordType =
            extRecord.DataRecordInfo.RecordType.EdmType
            as EntityType;

        foreach (EdmMember member in
            recordType.Members)
        {
            if (member is EdmProperty)
            {
                Console.WriteLine("{0}: {1}",
                    member.Name,
                    reader[member.Name]);
            }
        }
    }
}

The code in this snippet could go in place of the results processing code in the previous snippet. Within the while loop used to iterate over the result records, you cast the reader to the IExtendedDataRecord interface. This is a new interface, introduced in the ADO.NET Entity Framework, and provides access to the metadata associated with the results. From the IExtendedDataRecord you can navigate to the EdmType property to determine the entity type for each record. The ADO.NET Entity Framework represents the entity type with an instance of EntityType, a class in the ADO.NET Entity Framework’s Metadata system (documented fully in MSDN).

Given the EntityType object, the code iterates over the type’s members and prints the name and value of each one. You can obtain the values by indexing into the data record using the member name. The if-statement that checks whether the member is of type EdmProperty is necessary to filter out navigation properties, which do not appear in the result records.

In a real application you could use the metadata returned with EntityClient records to build a dynamic UI or to provide more context for the meaning of the results. Though I did not show it in this article, the same metadata is also available at the Object Services layer.

Wrapping Up

This article provides just a brief overview of the features in the ADO.NET Entity Framework’s programming surface. The ADO.NET Entity Framework’s documentation and samples describe the features covered here (and several more I did not cover) in greater detail. The ADO.NET Entity Framework team encourages you to try the Beta 2 release and share your feedback.

Shyam Pather


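The metadata-driven loop in "Accessing Metadata from EntityClient" depends on a generated model, so it cannot run on its own. As a rough, provider-neutral analogue (not EntityClient itself), the sketch below prints every field of every record using only the column metadata that any DbDataReader exposes through FieldCount and GetName. The ReaderDump class, the sample table, and its columns are invented for illustration, and the reader comes from an in-memory DataTable rather than from an Entity SQL query.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;

public static class ReaderDump
{
    // Hypothetical stand-in for a query result; not part of the article's model.
    public static DataTable BuildSampleTable()
    {
        DataTable table = new DataTable("Events");
        table.Columns.Add("EventName", typeof(string));
        table.Columns.Add("StartLocation", typeof(string));
        table.Rows.Add("Seattle Marathon", "Memorial Stadium");
        return table;
    }

    // Formats each field as "name: value" without hardcoding any field name.
    public static List<string> DescribeRecords(DbDataReader reader)
    {
        List<string> lines = new List<string>();
        while (reader.Read())
        {
            for (int i = 0; i < reader.FieldCount; i++)
            {
                lines.Add(string.Format("{0}: {1}",
                    reader.GetName(i), reader.GetValue(i)));
            }
        }
        return lines;
    }

    public static void Main()
    {
        // DataTable.CreateDataReader() returns a DataTableReader, which is a DbDataReader.
        using (DbDataReader reader = BuildSampleTable().CreateDataReader())
        {
            foreach (string line in DescribeRecords(reader))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

With EntityClient, the EdmType metadata shown in the article carries more than column names (for example, it distinguishes properties from navigation properties), which plain FieldCount and GetName cannot do.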

ONLINE QUICK ID 0712052

Rich Query for DataSet – An Introduction to LINQ to DataSet

For years developers have been asking for query over data contained in a DataSet in a way that supports the expressiveness needed by today’s data-centric .NET applications. As part of the .NET framework 3.5, Microsoft® will introduce support for a technology called Language Integrated Query (LINQ), and with this introduction, an implementation of LINQ to DataSet.

Andrew Conrad
aconrad@microsoft.com
http://blogs.msdn.com/aconrad/

Andrew Conrad is a Lead Software Design Engineer with the SQL Server group at Microsoft Corporation. He was one of the core designers of the LINQ over DataSet and LINQ to Entities technologies and has been working with the LINQ project since close to its inception.

His current projects include codename Project “Astoria” (REST over Entities) and the codename Project “Jasper” incubation project (ADO.NET for Dynamic languages). For more information on these projects see: http://msdn2.microsoft.com/en-us/data/bb419139.aspx.

Fast Facts
LINQ to DataSet queries can be arbitrarily complex and can contain anything that can be expressed in the host language (C# or Visual Basic).

The ADO.NET programming model gave us the ability to explicitly cache data in an in-memory data structure called the DataSet. The DataSet uses the relational paradigm of tables and columns to store and represent the cached data. In addition, the DataSet also exposes a number of services (i.e., metadata, constraints, change tracking, etc.) that make working with data from a relational database an easy and straightforward task.

The one glaring weakness of the DataSet is not exposing rich query capabilities, the kind of support developers have grown accustomed to when accessing relational data. Because of this limitation, developers have had to live with DataSet’s limited query mechanism (via simple string expressions for sorting or filtering) or have had to build their own custom rich query implementations on top of DataSet. This has led to a lot of extra work and custom application code that must be maintained by developers.

As part of .NET Framework 3.5, Microsoft introduces support for a technology called Language Integrated Query (LINQ). LINQ exposes a common query abstraction integrated into the .NET programming languages (C# and Visual Basic) for querying all kinds of data including objects, XML, and relational data. Hence, supporting LINQ to DataSets seemed like a natural fit.

All code snippets in this article will run against a sample instance of DataSet created with the code in Listing 1. This code creates a DataSet with three tables: National Parks, Countries, and Continents. The DataSet also contains a DataRelation from Continents to Countries. After creating the DataTables and DataSet, data is added to the sample DataSet instance.

In Listing 2 you’ll see a small utility method for dumping the query results to the console if required. All of the samples in this article will run against the sample DataSet mentioned earlier, making it possible to use the DumpResults method to execute queries and examine the results.

LINQ to DataSet: What Every Developer Needs to Know

LINQ queries target IEnumerable<T> sources. Unfortunately, the DataSet’s DataTable and the DataTable’s DataRowCollection do not implement IEnumerable<DataRow>, so these types cannot be used as sources for LINQ query expressions. To work around this issue, an extension method called AsEnumerable was added to the DataTable type. This extension method takes a source DataTable and wraps it in an object of the type EnumerableRowCollection, which implements IEnumerable<DataRow>. This allows DataTables to be a source for LINQ query expressions.

With the exception of some DataSet-specific LINQ query operators and features that I will discuss later in this article, DataSet’s LINQ implementation does not differ from LINQ to Objects. In fact, the DataTable.AsEnumerable method turns a given DataTable into a sequence of objects where the element type is DataRow. To illustrate this point, the following code shows two equivalent LINQ query expressions, both using the standard LINQ to Objects implementation of the Where query operator to apply the predicate to the IEnumerable<DataRow> source.

var query1 = from park in parksDataTable.AsEnumerable()
             where (string) park["Country"] == "Japan"
             select park;

List<DataRow> list =
    new List<DataRow>(
        parksDataTable.Rows.Cast<DataRow>());

var queryLinqtoObjects = from park in list
                         where (string)park["Country"] == "Japan"
                         select park;

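The equivalence described above can be checked with a small stand-alone program. The following is a minimal sketch, assuming a reduced three-row version of the parks table (the AsEnumerableDemo class and its helper names are invented here, not part of Listing 1). It runs the same predicate once through AsEnumerable() and once through Rows.Cast<DataRow>(), then compares the results.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class AsEnumerableDemo
{
    // Reduced, hypothetical version of the article's parks table.
    public static DataTable BuildParks()
    {
        DataTable parks = new DataTable("National Parks");
        parks.Columns.Add("Name", typeof(string));
        parks.Columns.Add("Country", typeof(string));
        parks.Rows.Add("Jasper", "Canada");
        parks.Rows.Add("Saikai", "Japan");
        parks.Rows.Add("Bandai-Asahi", "Japan");
        return parks;
    }

    // Source is the EnumerableRowCollection wrapper returned by AsEnumerable().
    public static List<string> NamesViaAsEnumerable(DataTable parks)
    {
        return (from park in parks.AsEnumerable()
                where (string)park["Country"] == "Japan"
                select (string)park["Name"]).ToList();
    }

    // DataRowCollection is only a non-generic IEnumerable, so Cast<DataRow>()
    // is needed before the standard query operators apply.
    public static List<string> NamesViaCast(DataTable parks)
    {
        return (from park in parks.Rows.Cast<DataRow>()
                where (string)park["Country"] == "Japan"
                select (string)park["Name"]).ToList();
    }

    public static void Main()
    {
        DataTable parks = BuildParks();
        List<string> first = NamesViaAsEnumerable(parks);
        List<string> second = NamesViaCast(parks);
        Console.WriteLine(string.Join(", ", first.ToArray()));
        Console.WriteLine(first.SequenceEqual(second));
    }
}
```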
22 Rich Query for DataSet – An Introduction to LINQ to DataSet www.code-magazine.com


Listing 1: Code to create sample DataSet

DataSet ds = new DataSet("Parks");

DataTable parksDataTable = ds.Tables.Add("National Parks");
parksDataTable.Columns.Add("ID", typeof(int));
parksDataTable.Columns.Add("Name", typeof(string));
parksDataTable.Columns.Add("YearEstablished",typeof(int));
parksDataTable.Columns.Add("Country", typeof(string));
parksDataTable.Columns.Add("Rating", typeof(int));

DataTable countriesDataTable = ds.Tables.Add("Countries");
countriesDataTable.Columns.Add("ID", typeof(int));
countriesDataTable.Columns.Add("Name", typeof(string));
countriesDataTable.Columns.Add("Population", typeof(long));
countriesDataTable.Columns.Add("ContinentID", typeof(int));

DataTable continentsDataTable = ds.Tables.Add("Continents");
continentsDataTable.Columns.Add("ID", typeof(int));
continentsDataTable.Columns.Add("Name", typeof(string));

DataRelation ContinentCountryDataRelation = ds.Relations.Add(
    continentsDataTable.Columns["ID"],
    countriesDataTable.Columns["ContinentID"]);

continentsDataTable.Rows.Add(
    new object[] { 1, "North America" });
continentsDataTable.Rows.Add(
    new object[] { 2, "South America" });
continentsDataTable.Rows.Add(
    new object[] { 3, "Asia" });

countriesDataTable.Rows.Add(
    new object[] { 1, "Canada", 31612000, 1});
countriesDataTable.Rows.Add(
    new object[] { 2, "USA", 302249000, 1 });
countriesDataTable.Rows.Add(
    new object[] { 3, "Argentina", 39921833, 2 });
countriesDataTable.Rows.Add(
    new object[] { 4, "Japan", 127433000, 3 });
countriesDataTable.Rows.Add(
    new object[] { 5, "South Korea", 49024737, 3 });

parksDataTable.Rows.Add(
    new object[] { 1, "Jasper", 1907, "Canada", 8});
parksDataTable.Rows.Add(
    new object[] { 2, "Yoho", 1886, "Canada" , 7});
parksDataTable.Rows.Add(
    new object[] { 3, "North Cascade", 1968, "USA", 9 });
parksDataTable.Rows.Add(
    new object[] { 4, "Lago Puelo", 1971, "Argentina", 8 });
parksDataTable.Rows.Add(
    new object[] { 5, "Bandai-Asahi", 1950, "Japan", null });
parksDataTable.Rows.Add(
    new object[] { 6, "Saikai", 1955, "Japan", 6 });
parksDataTable.Rows.Add(
    new object[] { 7, "Jirisan", 1967, "South Korea", 8 });

Listing 2: Sample method to display LINQ query results

void DumpResults<T>(IEnumerable<T> source)
{
    StringBuilder sb = new StringBuilder();
    if (typeof(T) == typeof(DataRow))
    {
        foreach(DataRow dr in source.Cast<DataRow>())
        {
            foreach(object o in dr.ItemArray)
                sb.Append((o == DBNull.Value ? "DBNull" : o) + " ");
            sb.Append('\n');
        }
    }
    else
    {
        foreach(T t in source)
        {
            foreach(PropertyInfo pi in typeof(T).GetProperties())
            {
                object o = pi.GetValue(t, null);
                sb.Append(o + " ");
            }
            sb.Append('\n');
        }
    }
    Console.WriteLine(sb.ToString());
}

When writing LINQ queries against DataSet, one needs to be aware that the element type of the source is specifically DataRow. The following code snippet demonstrates why this is true.

var query2 = from park in parksDataTable.AsEnumerable()
             where !park.IsNull("Rating") &&
                   (int) park["Rating"] > 8
             select park;

There are actually two interesting issues here. First, when accessing the column values of a given DataRow, one must call the DataRow indexer. Since the column values can be any type, the indexer return type is object instead of the specific type of column. In other words, the DataRow column value accessor is weakly typed. This allows you to use a single method to retrieve column values independent of type. On the downside, it makes writing LINQ to DataSet queries awkward and error prone, given that the user must remember to cast to the desired type. Second, the DataSet represents null values as the DBNull constant which cannot be cast to any CLR value type; this makes writing LINQ queries tricky since the developer must remember to guard all indexer calls with a call to the DataRow.IsNull method to verify that a column value is not equal to DBNull, before accessing the value and unboxing it to the desired type. If this guard is not put in place, a runtime exception will occur when the value is DBNull and the code tries to unbox the value.

To make writing LINQ to DataSet queries easier and less error prone, a generic extension method called Field<T> was added to the DataRow class. Essentially the Field<T> method works the same as the DataRow indexer but allows the user to specify the static type for the return value of the method as the generic parameter. More importantly, if the user specifies a nullable type as the generic parameter, the field method will convert any DBNull values to null. Using the field method, here is the equivalent LINQ query from the prior example.



var query3 = from park in parksDataTable.AsEnumerable()
             where park.Field<int?>("Rating") > 8
             select park;

As a general rule, all access to DataRow column values should be done through the Field<T> method when developing LINQ to DataSet query expressions. This makes the query expression much less error prone and generally provides much cleaner code.

Since DataSet is an entirely in-memory cache, all LINQ queries against it are translated directly into .NET IL (intermediate language). This is different from LINQ technologies such as LINQ to SQL or LINQ to Entities which require the LINQ query to be translated to the target source’s query language, and are therefore limited by what can be translated. As a result, LINQ to DataSet queries can be arbitrarily complex and can contain anything that can be expressed in the host language. For example, here is a LINQ to DataSet query that could not be easily translated to SQL.

Func<long, bool> checkPopulation = delegate(long i)
    { return i < 100000000;};

var query4 = from country in
                 countriesDataTable.AsEnumerable()
             where country.Field<string>("Name").Length > 3 &&
                   !country.HasErrors &&
                   checkPopulation(
                       country.Field<long>("Population"))
             select country;

When working with LINQ to DataSet, you should also keep in mind how the LINQ set query operators (Union, Distinct, Intersect, Except) work. These operators all have two overloads, one which takes an IEqualityComparer<T> and one that does not. When calling one of the set query operators where the source(s) is IEnumerable<DataRow>, the set operator implementation will call DataRow.Equals() to compare elements when an IEqualityComparer<DataRow> comparer is not passed in. For developers familiar with the well known relational semantics of set operators, the results will be quite unexpected since the DataRow.Equals method will compare DataRow references and not the underlying values.

To support more traditional relational semantics LINQ to DataSet includes a type called DataRowComparer which includes a default DataRow comparer for comparing two DataRows.

var source1 = from park in parksDataTable.AsEnumerable()
              where park.Field<int?>("Rating") < 8
              select park;

var source2 = from park in parksDataTable.AsEnumerable()
              where park.Field<int>("YearEstablished") > 1960
              select park;

var unionResults = source1.Union(source2,
    DataRowComparer.Default);

This comparer compares the specific row values and column types to decide if two rows are equal. It does not require that other metadata (i.e., column names) match since in most cases these may differ.

Shaping the Query Results: Beyond DataRows

LINQ query developers often want to shape their query results to meet specific application needs. The simplest example of this is using the LINQ Select query operator to project the results of a query expression.

var query5 = from park in parksDataTable.AsEnumerable()
             orderby park.Field<int>("YearEstablished")
             select new
             {
                 ParkName = park.Field<string>("Name"),
                 Established =
                     park.Field<int>("YearEstablished")
             };

When projecting the results of a LINQ to DataSet query, the fact that the values must be projected into a type other than DataRow can be limiting. This must be done because DataRows cannot be created that do not tie to an existing DataTable. This means that the DataRow column values are copied to the new instance based on the specified selector and hence any changes to the new instance are not reflected in the original DataRow.

Another detail to note: when defining the anonymous type in the projection, a name must be specified for each member because the value is being selected via a method and not via a Property or Field member. This is not really a huge issue, but a minor usability issue when projecting from DataRows.

As a workaround you can create DataRows via the DataTable’s DataRow factory method (DataRowCollection.Add) in the projection selector expression, or you can project the entire DataRow into a member of a new type instance.

var query6 = from park in parksDataTable.AsEnumerable()
             orderby park.Field<int>("YearEstablished")
             select new
             {
                 ParkName = park.Field<string>("Name"),
                 Established =
                     park.Field<int>("YearEstablished"),
                 DataRow = park
             };

You might also find it interesting to group results based on a specific key.

var query7 = from park in parksDataTable.AsEnumerable()
             group park by park.Field<string>("Country") into g
             select new {Country = g.Key, Count = g.Count()};

Note that the key selector for the Group By query operator uses the Field<T> method instead of the DataRow indexer. Although using the DataRow indexer could work, there is an even more sinister problem if one forgets to unbox the return value of the indexer. Namely, the compiler won’t complain if one doesn’t include



the proper cast expression. In some cases, this will mean that object references, instead of the underlying values, will be used to do comparisons between key values, most likely not the desired result. As stated before, always use the DataRow.Field<T> method to avoid these issues when developing LINQ to DataSet queries.

As most developers experienced in SQL programming know, the Join operator is an invaluable tool for querying relational databases where the data has been normalized into multiple tables. Interestingly, DataSet already has the ability to do joins via DataRelation objects created by the developer to relate multiple DataTables. This makes it entirely possible that when using LINQ to DataSet one will not require the use of the Join query operator as much as one would with the other LINQ technologies. The following sample demonstrates a Join between two DataTables without a DataRelation and the use of an existing DataRelation.

var query8 = from park in parksDataTable.AsEnumerable()
             join country in countriesDataTable.AsEnumerable()
             on park.Field<string>("Country") equals
                country.Field<string>("Name")
             select new
             {
                 ParkName = park.Field<string>("Name"),
                 Country = country.Field<string>("Name"),
                 Continent = country.
                     GetParentRow(
                         ContinentCountryDataRelation)
                     .Field<string>("Name")
             };

Developers can choose to use either the LINQ Join query operator or DataRelations. When using a DataRelation, the membership of the join is pre-calculated and kept up to date based on changes to the related DataTables. When using the LINQ Join query operator the developer does not have to maintain DataSet DataRelations, although one may require DataRelations for other functionality like constraints.

DataSet Specific LINQ Operators

In addition to the standard LINQ query operators, LINQ to DataSet includes two special LINQ query operators called CopyToDataTable and AsDataView designed specifically to work with DataTables and IEnumerable<DataRow> sources. When using LINQ to query a DataTable, the results of the query are often returned as an IEnumerable<DataRow> instance. This raises two potentially interesting issues for developers. First, even though the individual rows in the query result are live (the individual elements in the sequence are the same DataRows as those from the original source DataTable(s)) the results of the query are not live. For example, after modifying one of the DataRows within the set of query results so that it no longer qualifies as part of the query results set, it will remain in the set of results until the query is run again. Second, much of the .NET infrastructure (i.e., binding, serialization, etc.) has been built to work with DataTables and DataViews instead of IEnumerable<DataRow> sources. Hence, it becomes desirable in some cases to take the results of a LINQ to DataSet query and turn it into a DataView or load it into a DataTable.

The AsDataView query operator takes a LINQ to DataSet query and turns it into a DataView instance. Although similar to any standard DataSet DataView, this instance requires you to specify the predicate and order by selectors in the LINQ to DataSet query as lambda expressions instead of the string-based query language currently supported when creating DataViews.

DataView dataView = (from park in
                         parksDataTable.AsEnumerable()
                     where park.Field<int?>("YearEstablished") < 1960
                     orderby park.Field<string>("Country")
                     select park).AsDataView();

DataRowView[] drv = dataView.FindRows("Canada");

Note that when calling the AsDataView method, the implementation will create a DataView index based on the criteria in the query. This index, maintained on changes to the underlying source DataRows or changes to the DataRowView instances, is based on the key selector(s) specified in the OrderBy clause so that any lookups via the DataView happen via the underlying index and will generally perform much better than calling the Where query operator.

If you do not require live data and do not want the overhead associated with maintaining it, you are able to copy the results of a LINQ to DataSet query expression into a new or existing DataTable via the CopyToDataTable query operator.

DataTable newParksTable = (from park in
                               parksDataTable.AsEnumerable()
                           where park.Field<int?>("YearEstablished") < 1960
                           orderby park.Field<int?>("YearEstablished")
                           select park).CopyToDataTable();

When using this operator, the DataRow column values are copied into new DataRows, meaning that the data is no longer live with respect to the original source. It is up to the developer to merge back any changes to the original source DataTable.

Conclusion

You’ll encounter many pleasures when using the LINQ technology, such as discovering where you can use it to solve common coding problems, often in significantly fewer lines of code and in a much more maintainable manner. Since LINQ to DataSet supports all standard LINQ operators and expressions, it supports a significant amount of functionality, well beyond most of the capabilities of typical query mechanisms. Hence, one should be open to experimenting with LINQ to DataSet to solve programming problems beyond those solved by simple filtering, ordering, and projecting operations demonstrated by the samples in this article.

Andrew Conrad

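The DBNull behavior discussed in this article can be reproduced in isolation. The sketch below is a minimal example, assuming a hypothetical three-row table in which one park has no rating (the FieldDemo class and its method names are invented for illustration). It contrasts Field<int?>, which maps DBNull to null, with an unguarded cast on the DataRow indexer, which throws when it reaches the DBNull row.

```csharp
using System;
using System.Data;
using System.Linq;

public static class FieldDemo
{
    // Hypothetical table; the third park has no rating, stored as DBNull.
    public static DataTable BuildParks()
    {
        DataTable parks = new DataTable("National Parks");
        parks.Columns.Add("Name", typeof(string));
        parks.Columns.Add("Rating", typeof(int));
        parks.Rows.Add("Jasper", 8);
        parks.Rows.Add("North Cascade", 9);
        parks.Rows.Add("Bandai-Asahi", DBNull.Value);
        return parks;
    }

    // Field<int?> turns DBNull into null, and "null > threshold" is false,
    // so no IsNull guard is required.
    public static int CountRatedAbove(DataTable parks, int threshold)
    {
        return parks.AsEnumerable()
                    .Count(park => park.Field<int?>("Rating") > threshold);
    }

    // The indexer returns DBNull.Value as object; unboxing it to int fails.
    public static bool UnguardedIndexerThrows(DataTable parks)
    {
        try
        {
            parks.AsEnumerable().Count(park => (int)park["Rating"] > 8);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        DataTable parks = BuildParks();
        Console.WriteLine(CountRatedAbove(parks, 8));
        Console.WriteLine(UnguardedIndexerThrows(parks));
    }
}
```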

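The difference between reference comparison and value comparison in the set operators can also be seen with Distinct. This is a minimal sketch, assuming a hypothetical table that deliberately contains two rows with identical values (the ComparerDemo class is invented for illustration). Without a comparer, each DataRow instance counts as distinct; with DataRowComparer.Default, rows with equal values collapse into one.

```csharp
using System;
using System.Data;
using System.Linq;

public static class ComparerDemo
{
    // Hypothetical table with a duplicated row (same values, different DataRow objects).
    public static DataTable BuildParks()
    {
        DataTable parks = new DataTable("National Parks");
        parks.Columns.Add("Name", typeof(string));
        parks.Columns.Add("Country", typeof(string));
        parks.Rows.Add("Jasper", "Canada");
        parks.Rows.Add("Yoho", "Canada");
        parks.Rows.Add("Jasper", "Canada");
        return parks;
    }

    // No comparer: DataRow does not override Equals, so rows are compared by reference.
    public static int DistinctByReference(DataTable parks)
    {
        return parks.AsEnumerable().Distinct().Count();
    }

    // DataRowComparer.Default compares the column values of the two rows.
    public static int DistinctByValue(DataTable parks)
    {
        return parks.AsEnumerable().Distinct(DataRowComparer.Default).Count();
    }

    public static void Main()
    {
        DataTable parks = BuildParks();
        Console.WriteLine(DistinctByReference(parks));
        Console.WriteLine(DistinctByValue(parks));
    }
}
```

The same comparer argument is accepted by Union, Intersect, and Except, as in the unionResults sample earlier in the article.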

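To make the "live" versus "copied" distinction concrete, the sketch below builds both a CopyToDataTable snapshot and an AsDataView view from the same query, then edits a source row so that it no longer satisfies the predicate. The SnapshotDemo class, the three-row table, and the Run helper are assumptions made for this example. The snapshot keeps its copied values; the view is expected to follow the change, in line with the index maintenance described in the article.

```csharp
using System;
using System.Data;
using System.Linq;

public static class SnapshotDemo
{
    // Hypothetical reduced parks table.
    public static DataTable BuildParks()
    {
        DataTable parks = new DataTable("National Parks");
        parks.Columns.Add("Name", typeof(string));
        parks.Columns.Add("YearEstablished", typeof(int));
        parks.Rows.Add("Jasper", 1907);
        parks.Rows.Add("Yoho", 1886);
        parks.Rows.Add("North Cascade", 1968);
        return parks;
    }

    // Returns { snapshot row count, snapshot year for Jasper, view row count },
    // all measured after the source row for Jasper has been edited.
    public static int[] Run()
    {
        DataTable parks = BuildParks();

        var oldParks = from park in parks.AsEnumerable()
                       where park.Field<int>("YearEstablished") < 1960
                       orderby park.Field<string>("Name")
                       select park;

        DataTable snapshot = oldParks.CopyToDataTable(); // values copied now
        DataView liveView = oldParks.AsDataView();       // view over the source rows

        // Jasper no longer qualifies for the "before 1960" predicate.
        parks.Rows[0]["YearEstablished"] = 2000;

        return new int[]
        {
            snapshot.Rows.Count,
            snapshot.Rows[0].Field<int>("YearEstablished"),
            liveView.Count
        };
    }

    public static void Main()
    {
        int[] result = Run();
        Console.WriteLine("Snapshot rows: {0}", result[0]);
        Console.WriteLine("Snapshot year for Jasper: {0}", result[1]);
        Console.WriteLine("View rows: {0}", result[2]);
    }
}
```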
ONLINE QUICK ID 0712062

LINQ to Relational Data: Who’s Who?

With the combined launch of Visual Studio 2008, SQL Server 2008, and Windows Server 2008, Microsoft is introducing five implementations of .NET Language Integrated Query (LINQ). Of these five implementations, two specifically target access to relational databases: LINQ to SQL and LINQ to Entities.

Elisa Flasko
Elisa.Flasko@microsoft.com

Elisa is a Program Manager with the Data Programmability team at Microsoft focused on a number of Data Platform Development technologies including ADO.NET, XML and SQL Native Client products. Prior to joining Data Programmability, Elisa worked as a Technical Presenter Business Development Manager for Microsoft, travelling across North America. Elisa has previously worked in both large and small companies, in positions ranging from sales to quality assurance and from software development to program management.

Fast Facts
With the introduction of Language Integrated Query in both C# and Visual Basic and the release of Visual Studio 2008, Microsoft is introducing FIVE different implementations of LINQ: LINQ to Objects, LINQ to SQL, LINQ to Entities, LINQ to XML, and LINQ to DataSet.

Microsoft Language Integrated Query (LINQ) offers developers a new way to query data using strongly-typed queries and strongly-typed results, common across a number of disparate data types including relational databases, .NET objects, and XML. By using strongly-typed queries and results, LINQ improves developer productivity with the benefits of IntelliSense and compile-time error checking.

LINQ to SQL, released with the Visual Studio 2008 RTM, is designed to provide strongly-typed LINQ access for rapidly developed applications across the Microsoft SQL Server family of databases.

LINQ to Entities, released in an update to Visual Studio 2008 in the first half of 2008, is designed to provide strongly-typed LINQ access for enterprise-grade applications across Microsoft SQL Server and third-party databases.

What Is LINQ to SQL?

LINQ to SQL is an object-relational mapping (ORM) implementation that allows the direct 1-1 mapping of a Microsoft SQL Server database to .NET classes, and query of the resulting objects using LINQ. More specifically, LINQ to SQL has been developed to target the rapid development scenario against Microsoft SQL Server.

Figure 1 & Figure 2 combined with the code snippet below demonstrate a simple LINQ to SQL scenario. Figure 1 shows the LINQ to SQL mapping, and Figure 2 shows the associated database diagram, using the Northwind database. This code snippet shows a simple LINQ query against the Northwind database.

DataContext db = new DataContext();

var customers = from c in db.Customers

With this knowledge, you can see that many aspects of LINQ to SQL have been architected with simplicity and developer productivity in mind. APIs have been designed to “just work” for common application scenarios. Examples of this design include the ability to replace unfriendly database naming conventions with friendly names, map SQL schema objects directly to classes in the application [a table or view maps to a single class; a column maps to a property on the associated class], implicitly load data that has been requested but has not previously been loaded into memory, and use common naming conventions and partial methods to provide custom business or update logic.

Partial methods, a new feature of C# and Visual Basic in Visual Studio 2008, allow one part of a partial class to define and call methods that are invoked if implemented in another part of the class; otherwise, the entire method call is optimized away during compilation. By using common naming conventions in conjunction with these new partial methods and partial classes, introduced in Visual Studio 2005, LINQ to SQL allows application developers to provide custom business logic when using generated code. Using partial classes allows developers the flexibility to add methods, non-persistent members, etc., to the generated LINQ to SQL object classes. These partial methods can add logic for insert, update, and delete by simply implementing the associated partial method. Similarly, developers can use the same concepts to implement partial methods that hook up eventing in the most common scenarios, for example OnValidate, OnStatusChanging or OnStatusChanged.

Microsoft developed LINQ to SQL with a minimally intrusive object model. Developers can choose not to make use of generated code and instead create their own classes, which do not need to be derived from any specific base class, meaning
where c.City == "London" that you can create classes that inherit from your
select c; own base class.
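The partial method mechanism described above can be tried in isolation, without a database. The following is a minimal sketch rather than actual LINQ to SQL designer output: the Customer class, its City property, and the OnCityChanging and OnCityChanged hooks are hypothetical names chosen to resemble the naming convention the article mentions.

```csharp
using System;

// Stand-in for a generated entity class (hypothetical, not real designer output).
public partial class Customer
{
    private string _city;

    // Partial method declarations: hooks that another part of the class may implement.
    partial void OnCityChanging(string value);
    partial void OnCityChanged();

    public string City
    {
        get { return _city; }
        set
        {
            OnCityChanging(value); // implemented below, so this call is kept
            _city = value;
            OnCityChanged();       // never implemented, so the compiler removes this call
        }
    }
}

// Hand-written part of the same class; in practice it lives in a separate file.
public partial class Customer
{
    partial void OnCityChanging(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("City must not be empty.");
    }
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer();
        customer.City = "London";
        Console.WriteLine(customer.City);

        try
        {
            customer.City = "";
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```

Because the validation lives in the hand-written part, regenerating the other part of the class would not overwrite it, which is the reason for splitting the class this way.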

26 LINQ to Relational Data: Who’s Who? www.code-magazine.com


Figure 1: Database diagram for the Northwind database.

Conceptual Data Programming with ADO.NET

Relational databases are often designed and normalized, not to make programming against them easier, but to ensure performance, data consistency, and concurrency. Seldom do developers work directly with data in the form that is returned from a database. With the upcoming introduction of the ADO.NET Entity Framework, Microsoft is beginning down a path that will abstract away the complications of dealing with data at the logical database layer.

The ADO.NET Entity Framework brings to life the entity-relationship model that many have used for thirty years to capture and diagram conceptual models of their data, prior to mapping this information into a final relational model. The ADO.NET Entity Framework implements the entity-relationship model in the form of the Entity Data Model (EDM), maintaining the principal concepts of entities and relationships as first class data types, and allowing developers to program against the same conceptual model or business concepts that were used in design.

Inheritance, an important feature of object-oriented programming, does not translate directly into the relational database. Given this, the ability to map inheritance is very important. LINQ to SQL supports one of the most common database inheritance mappings, where multiple classes in a hierarchy are mapped to a single table, view, stored procedure, or table valued function using a discriminator column to determine the specific type of each row/instance.

As with any application framework, developers must also have the ability to optimize the solution to best fit their scenario. LINQ to SQL offers a number of opportunities to optimize, including using load options to control database trips and compiled queries to amortize the overhead inherent in SQL generation.

By default, LINQ to SQL enables deferred loading. This means that if, for example, I query for my Customer data using the Northwind model in Figure 2, I do not automatically pull the associated Order information into memory. However, if I try to access the associated Order information via a navigation property from a Customer instance, the associated Order information is automatically pulled into memory for me in a second database round trip. In the following code snippet the Order information is not loaded into memory until I access it from the second foreach statement.

    var customers = from c in db.Customers
                    where c.Orders.Count > 5
                    select c;

    foreach (var row in customers)
    {
        Console.WriteLine("Customer ID = " + row.CustomerID);

        foreach (var order in row.Orders)
            Console.WriteLine("Order ID = " + order.OrderID);
    }

In the above example, a separate query is executed to retrieve the Orders for each Customer. If you know in advance that you need to retrieve the orders for all customers, you can use LoadOptions to request that the associated Orders be retrieved along with the Customers, in a single request.

Also by default, LINQ to SQL enables ObjectTracking, which controls the automatic change tracking and identity management of objects retrieved from the database. In some scenarios, specifically where you are accessing the data in a read-only manner, you may wish to disable ObjectTracking as a performance optimization.

Compiled queries offer another opportunity to further optimize query performance. In many applications you might have code that repeatedly executes the same query, possibly with different argument values. By default, LINQ to SQL parses the language expression each time to build the corresponding SQL statement, regardless of whether that expression has been seen previously. Compiled queries allow LINQ to SQL to avoid reparsing the expression and regenerating the SQL statement for each repeated query.

    // NorthwindDataContext stands for the designer-generated DataContext (name assumed here).
    NorthwindDataContext db = new NorthwindDataContext();

    var customers =
        CompiledQuery.Compile(
            (NorthwindDataContext context, string filterCountry) =>
                from c in context.Customers
                where c.Country == filterCountry && c.Orders.Count > 5
                select c);

    foreach (var row in customers(db, "USA"))
    {
        Console.WriteLine(row);
    }
    foreach (var row in customers(db, "Spain"))
    {
        Console.WriteLine(row);
    }

The above code snippet shows an example of a simple compiled query, executed twice with varying parameters.

Figure 2: LINQ to SQL mapping diagram for a simple scenario using the Northwind database and the associated database diagram. Notice the use of an intermediary table to map the many-to-many relationship between Employees and Territories.

When Do I Use LINQ to SQL?

The primary scenario for using LINQ to SQL is in applications with a rapid development cycle and a simple one-to-one object to relational mapping. In other words, you want the object model to be structured similarly to the existing structure of your database; you can use LINQ to SQL to map a subset of tables directly to classes, with the required columns from each table represented as properties on the corresponding class. Usually in these scenarios, the database has not been heavily normalized.

As an example, consider a simple retail application that uses the Northwind database. As you look at the Northwind database you can see a simple architecture that maps easily to a simple object model.

Using the LINQ to SQL Designer you can select the subset of tables that best fit your application, rename tables or properties to make them friendlier, and create an object relational mapping to develop against. Figure 1 and Figure 2 show the database diagram and the associated LINQ to SQL mapping for a subset of tables from the Northwind database. If you look more closely at the object mapping you can see that foreign keys from the database are represented in the object model as relationships between classes and allow you to navigate from one object to another.

In looking at the many-to-many relationship between Employees and Territories in the diagram, and by digging further into the associated relationship properties, you can see that LINQ to SQL does not directly support many-to-many relationships. Rather, LINQ to SQL uses an intermediary class named EmployeeTerritory with a one-to-many relationship to Employees and to Territories.

The LINQ to SQL Designer also lets you expose stored procedures and/or table valued functions as strongly typed methods on the generated DataContext, and map inserts, updates, and deletes to stored procedures if you choose not to use dynamic SQL.

The above example does not show a mapping for any type of inheritance, although using the Northwind database and LINQ to SQL you could have chosen to use inheritance to create a Products class and a DiscontinuedProducts class that inherits from Products. The DiscontinuedProducts class may include additional information, for example stating that the product has been discontinued, etc. LINQ to SQL supports Table per Hierarchy (TPH) inheritance and would therefore map this two-class hierarchy directly to the single Products table in the existing Northwind database, using the discriminator column “Discontinued”.

What Is LINQ to Entities?

LINQ to Entities provides LINQ access to data exposed through the ADO.NET Entity Framework from Microsoft SQL Server or other third-party databases.

The ADO.NET Entity Framework is a platform, implementing the Entity Data Model (EDM),
which provides a higher level of abstraction when developing against databases. For further discussion of the Entity Framework and EDM, please see the sidebar called Conceptual Data Programming with ADO.NET in this article, or An Entity Data Model for Relational Data by Michael Pizzo or Programming Against the ADO.NET Entity Framework by Shyam Pather, also found in this issue of CoDe Focus.

More than a simple ORM, the ADO.NET Entity Framework and LINQ to Entities allow developers to work against a conceptual or object model with a very flexible mapping and the ability to accommodate a high degree of divergence from the underlying store.

Figure 3 and Figure 4 show a simple LINQ to Entities scenario, using a slightly more flexible mapping than seen in LINQ to SQL. Figure 3 shows the database diagram including the changes that you could make to the Northwind database, splitting Employee information between two tables, to demonstrate two common flexible mapping concepts, and Figure 4 shows the corresponding conceptual model. In these figures you can see that the object model is not mapped directly, one-to-one, to the database. The code snippet below shows a simple LINQ query against this database.

    var customers = from c in db.Customers
                    where c.Orders.Count > 5
                    select c;

Figure 3: Database diagram for a modified Northwind database. The Employees table has been vertically partitioned: Employees_Personal and Employees_AddressBook.

Figure 4: LINQ to Entities mapping diagram corresponding to the modified Northwind database. Notice that the many-to-many relationship between Employees and Territories is mapped directly, without an intermediary table, and that the Employees_Personal and Employees_AddressBook tables are mapped into a single entity.

Microsoft designed the ADO.NET Entity Framework, and in turn LINQ to Entities, to enable flexible and more complex mappings, ideal for enterprise scenarios, allowing the database and applications to evolve separately. When a change is made in the database schema, the application is insulated from the change by the Entity Framework. You don’t have to rewrite portions of the application; you simply update the mapping files to accommodate the database change.

Similar to LINQ to SQL, LINQ to Entities uses partial classes and partial methods to allow custom update and business logic to be easily added to generated code. LINQ to Entities also provides the ability to declaratively call stored procedures and use generated Update views when persisting objects.

Three common mapping scenarios differentiate LINQ to Entities from LINQ to SQL. LINQ to Entities provides the ability (through more flexible mappings) to map multiple tables or views to a single entity or class, to directly map many-to-many relationships, and to map additional types of inheritance.

Although you can map many-to-many relationships in both LINQ to SQL and LINQ to Entities, LINQ to Entities allows you to directly map many-to-many relationships with no intermediary class, while LINQ to SQL requires that you create an intermediary class that maps one-to-many to each of the classes that are party to the many-to-many relationship.

As discussed earlier in this article, LINQ to SQL lets you map one of the most common inheritance scenarios, Table per Hierarchy. LINQ to Entities and the ADO.NET Entity Framework allow you to map Table per Hierarchy, similarly to LINQ to SQL, as well as Table per Concrete Type, a separate table for each class or type in the hierarchy, or Table per Subclass, a hybrid approach using a shared table for information about the base type and separate tables for information about the derived types.

Two features of LINQ to Entities and the ADO.NET Entity Framework that set these technologies apart are Entity SQL views and Defining Queries. Entity SQL views allow you to define the mapping between your entity model and the store schema in terms of arbitrary Entity SQL queries. The Defining Query feature allows you to expose a tabular view of any native store query as a table in your storage schema.

Due to the explicit nature of LINQ to Entities, developers also have the ability to optimize the solution to best fit their scenario. For a moment let me revisit my previous example of LINQ to SQL implicit loading. When I queried for Customer data, Order information was not automatically pulled into memory, but rather was pulled into memory only when the Order information was accessed. In LINQ to Entities, you have full control over the number of database round trips by explicitly specifying when to load such information from the database. Navigating to associated information that has not yet been retrieved from the database will not cause an additional database trip.

You can further optimize LINQ to Entities by disabling change tracking when working in a read-only scenario.

When Do I Use LINQ to Entities?

The primary scenario targeted by LINQ to Entities is a flexible and more complex mapping scenario, often seen in the enterprise, where the application is accessing data stored in Microsoft SQL Server or other third-party databases.


In other words, the database in these scenarios contains a physical data structure that could be significantly different from what you expect your object model to look like. Often in these scenarios, the database is not owned or controlled by the application developer(s), but rather owned by a DBA or other third party, possibly preventing application developers from making any changes to the database and requiring them to adapt quickly to database changes that they may not have been aware of.

As an example, consider a simple HR application that uses an existing company database. The database currently exists as our modified Northwind example in Figure 3. In looking at the database, you can see that the architecture does not lend itself directly to how you may think of the business objects; specifically you do not want to deal with two separate Employee classes where each class contains half of the information that the application will be accessing. You also do not want to think about an EmployeeTerritory object that does not add value beyond what is already provided in the Employee and Territory objects.

For this example I will need to create a conceptual model that exists with a single Employee object type, and a direct relationship between Employees and Territories to allow ease of navigation.

When I created the Entity Data Model using the new EDM Designer, I began by walking through a simple wizard to create my base mapping. In this case I chose to automatically generate my initial mapping from an existing database, and then I edited the model using the drag and drop interface. I selected the subset of tables that best fit my application, renamed tables and properties to make them friendlier, and made a few other changes such as combining the two Employees tables into a single Employee entity. Figure 4 shows a diagram of the Entity Data Model as previewed in the EDM Designer.

Looking closely at the EDM you can see that the relationship between Employees and Territories no longer uses an intermediary class to model the many-to-many relationship, but rather models it directly, shown by the “* to *” association line connecting the two entities. You can also see that I have modeled a single Employees entity including all of the properties contained in the Employees_Personal and Employees_AddressBook tables in the database.

In this example, you do not see a mapping for any type of inheritance. As mentioned previously, LINQ to Entities gives you the ability to map two types of inheritance, in addition to the Table per Hierarchy used in LINQ to SQL. Figure 5 and Figure 6 show an example of Table per Subclass inheritance and how it might be mapped to an Entity Data Model. This EDM shows that the RunEvent class, SwimEvent class, BikeEvent class, and the TriathlonEvent class all inherit from the Event class.

Figure 5: Database diagram demonstrating Table per Subclass hierarchy. The shared Event table includes information that is common to the BikeEvent, RunEvent, SwimEvent and TriathlonEvent tables. The BikeEvent, RunEvent, SwimEvent and TriathlonEvent tables include information specific to each entity.

Figure 6: Entity Data Model diagram demonstrating a common mapping of inheritance from the database in Figure 5. The BikeEvent class, RunEvent class, SwimEvent class, and the TriathlonEvent class inherit from the Event class.

Elisa Flasko


ONLINE QUICK ID 0712072

Browsing Windows Live Expo with LINQ to XML

LINQ to XML, which makes query a first class construct in C# and Visual Basic, is the new XML API in the .NET Framework 3.5. With the introduction of Language Integrated Query (LINQ), Microsoft is introducing LINQ implementations that work over objects, data, and XML. LINQ to XML improves on System.Xml in the .NET Framework 2.0 by being both simpler to use and more efficient. Microsoft developed this new API because the W3C-based DOM API does not integrate well into the LINQ programming model.

David Schach
davidsch@microsoft.com
David Schach is a principal developer at Microsoft and was the developer lead responsible for LINQ to XML and XML language integration in Visual Basic 9.0. Prior to that, he worked on XML and object to relational mapping and XML features in SQL Server. He has been involved with XML, XSLT, and XPath and other XML standards since 1998. In addition to implementing XPath 1.0 and XSLT 1.0, he was also developer lead for msxml.

Fast Facts
LINQ to XML is a new lighter-weight and higher-performance XML API in the Microsoft .NET Framework 3.5. It supports all of the standard XML axes including Elements, Attributes, Descendants and Ancestors. These methods return IEnumerable collections so that they integrate well with LINQ queries. Visual Basic adds XML language extensions to LINQ to XML. With XML axis properties you can access XML elements and attributes by name using a special syntax. With XML literals you can create XML documents, elements, and fragments within your code using XML syntax.

Windows Live Expo, an online service for buying and selling merchandise, provides a set of Web services that enable programmatic access to the service’s listings, and returns results as XML. In this article, I’ll show you how to build a Web page that searches and displays items from the Windows Live Expo listings. To make the page a little more interesting, I’ll show you how to integrate the Virtual Earth map control and use it to display the locations of the items found. Figure 1 shows the sample application in final form.

Because Windows Live services produce and consume XML data, they are ideal candidates to showcase the power of LINQ to XML. Let me show you how easily you can build an application using LINQ to XML.

Loading XML from the Web

You need an application ID in order to use the Windows Live Expo API, which you can get by simply signing up on the Live Expo site. Once you have an application ID, use the GetCategories function to obtain a list of all category names and their associated IDs. Your application needs these IDs to search the online listings.

To make the Web page easier to use, I’ll DataBind the categories and IDs to an ASP.NET DropDownList control. The code snippet below shows how to obtain the categories in XML format.

C#

    var d = XDocument.Load(
        @"http://expo.live.com/API/" +
        "Classifieds_GetCategories.ashx?appKey=" +
        "-1690295249159343350");

Visual Basic

    Dim d = XDocument.Load("http://expo.live.com/" _
        + "API/Classifieds_GetCategories.ashx?appKey=" _
        + "-1690295249159343350")

We load XML using the static Load method, available on both XDocument and XElement. The example above uses XDocument.Load; however, it could also have used XElement.Load because the XML is valid as both an XDocument and an XElement.

The XML that was returned has a root element named categories and a list of category elements with id and name attributes. The application will use the id attribute to query Windows Live Expo for items in each category.

    <e:categories
        xmlns:e="http://e.live.com/ns/2006/1.0">
      <e:category e:id="2" e:name="Autos">
        <e:category e:id="4" e:name="Boats">
        <e:category e:id="3" e:name="Cars and Trucks">
        <e:category e:id="5" e:name="Motorcycles">
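The difference between the two Load methods can be checked without calling the service. The sketch below is an illustration under stated assumptions: the inline XML is a small, well-formed stand-in modeled on the excerpt above (not an actual service response), and it is read through a StringReader instead of a URL.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class Program
{
    public static void Main()
    {
        // Small well-formed stand-in for the service response (assumed shape).
        const string xml =
            "<e:categories xmlns:e=\"http://e.live.com/ns/2006/1.0\">" +
            "<e:category e:id=\"2\" e:name=\"Autos\">" +
            "<e:category e:id=\"4\" e:name=\"Boats\" />" +
            "</e:category>" +
            "</e:categories>";

        // XDocument.Load returns the whole document; Root is its top-level element.
        XDocument document = XDocument.Load(new StringReader(xml));

        // XElement.Load reads the same input and returns the root element directly.
        XElement root = XElement.Load(new StringReader(xml));

        Console.WriteLine(document.Root.Name.LocalName); // categories
        Console.WriteLine(root.Name.LocalName);          // categories

        // Both calls produce structurally equal trees.
        Console.WriteLine(XNode.DeepEquals(document.Root, root)); // True
    }
}
```

XDocument is the better fit when document-level nodes such as the XML declaration or processing instructions matter; for plain element queries either entry point works.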

32 Browsing Windows Live Expo with LINQ to XML www.code-magazine.com


Figure 1: Sample application browsing Windows Live Expo data and mapping results using Microsoft Virtual Earth. LINQ to XML ties everything together.

You can write a LINQ query using the Descendants method to get the category names and IDs. LINQ to XML supports all of the standard XML axes such as Elements, Attributes, Descendants, and Ancestors. Here I used the Descendants method because category elements can be nested within other category elements.

Below is the complete query.

C#

    XNamespace expo =
        "http://expo.live.com/ns/2006/1.0";

    var categories = from c in d.Descendants(expo + "category")
                     select new
                     {
                         Name = (string)c.Attribute(expo + "name"),
                         Value = (string)c.Attribute(expo + "id")
                     };

    CategoryList.DataSource = categories;
    CategoryList.DataBind();

The expression:

    d.Descendants(expo + "category")

returns an IEnumerable of XElements that match the name. For each element returned, the select clause creates an object with two members, Name and Value. Because the type is not named and declared, it is an anonymous type created by the compiler.

Assigning the query result to the CategoryList’s DataSource property and calling DataBind displays the results in the page’s DropDownList control.

XML namespaces have traditionally been a source of confusion. One of the goals of LINQ to XML was to simplify XML namespace handling. LINQ to XML solves the problem with two new classes called XNamespace and XName. You can create an XNamespace using the static method XNamespace.Get. However, for convenience the class also has an implicit conversion from string to XNamespace. You can initialize an XNamespace variable by assigning the namespace URI string to the variable.

    XNamespace expo =
        "http://expo.live.com/ns/2006/1.0";

In LINQ to XML, all names are represented by the XName class, a fully qualified name. That is, it always has both a namespace and a local name. Getting an XName object requires an XNamespace object because local names are always allocated with respect to some namespace. Here’s an example of getting a name from the expo namespace.

    expo.GetName("category")


However, you won’t see this code in the sample application. For convenience, LINQ to XML overloads the + operator. You can get a fully qualified name using the simpler code shown.

    expo + "category"

Now creating and using XML names and namespaces is really simple.

You may have noticed the cast to string when getting the attributes. Calling the Attribute method returns an XAttribute object, but I want the attribute’s value. The XAttribute class has a Value property; however, it is rarely used. Instead, the XAttribute and XElement classes define a number of explicit conversions for converting objects to any of the common CLR types. In this case I want a string but I could have also cast to double, decimal or DateTime if those value types were stored in the attribute.

Visual Basic XML Properties

As part of the .NET Framework 3.5, all of the features of the LINQ to XML API are available in both C# and Visual Basic. However, Visual Basic 9.0 includes additional language features to make XML processing even simpler. One of those features is XML properties. Now, you can directly query XML in Visual Basic without using an API.

An XML property lets you treat a sub element or attribute as if it were an ordinary CLR property. Syntactically, XML element properties differ from normal properties by enclosing the name in angle brackets. XML attribute properties are prefixed with an ‘@’ character. Below you can see the same LINQ query in Visual Basic using element and attribute properties to get the category names and ids.

Visual Basic

    Dim categories = _
        From c In d...<expo:category> _
        Select Name = c.@expo:name, _
               Value = c.@expo:id

    CategoryList.DataSource = categories
    CategoryList.DataBind()

What happened to the calls to Descendants and Elements? In addition to special syntax for element and attribute names there is also special syntax for common XML query axes. Table 1 shows how LINQ to XML methods map to Visual Basic property syntax.

LINQ to XML has both an Element method and an Elements method. The former returns a single element while the latter returns a collection. However, the XML element property in Visual Basic maps to the Elements method, not the Element method. That means XML element properties are always a collection.

In Visual Basic you do not have to use the XML property syntax but it is usually more convenient and compact than using the API directly. Sometimes you have to use the API because there are many methods in LINQ to XML, such as Ancestors, that do not have a special syntax in Visual Basic.

In the example, the element and attribute names are qualified. In C# you saw how to create fully qualified names using an XNamespace object along with implicit string conversions. Visual Basic goes one step further and extends the Imports statement to support XML namespaces in addition to CLR namespaces. You do not have to define any XNamespace objects in the code. Just add an Imports statement for each XML namespace you need in the code. After that you can use the prefix anywhere in the source file. Below you can see how to define the expo and geo prefixes in Visual Basic.

    Imports <xmlns:expo="http://expo.live.com...">
    Imports <xmlns:geo="http://www.w3.org/2003...">

Reference Links

For LINQ to XML
http://msdn2.microsoft.com/en-us/library/system.xml.linq(VS.90).aspx
http://msdn2.microsoft.com/en-us/library/bb387098(VS.90).aspx

For Visual Basic XML literals
http://msdn2.microsoft.com/en-us/library/bb384833(VS.90).aspx
http://msdn2.microsoft.com/en-us/library/bb384563(VS.90).aspx
http://msdn2.microsoft.com/en-us/library/bb384769(VS.90).aspx

Tool for XML to Schema Inference
http://msdn2.microsoft.com/en-us/vbasic/bb840042.aspx

For information about LINQ
http://msdn2.microsoft.com/en-us/netframework/aa904594.aspx

Querying for the Items

Now that you have the categories, I’ll show you how to write the code to search for the items on Windows Live Expo. I’ll use the ListingsByCategoryKeywordLocation method to do that. This method takes a category, a set of keywords, a location (either latitude and longitude or a postal code), a maximum distance to include in the search, and a maximum number of results. You’ll get this information from the user through the controls on the page.

With the parameterized URL constructed you can load the XML data from Live Expo. The data comes back as RSS with some Windows Live Expo extension elements. In the code I’ll DataBind these items to an ASP.NET ListView control. To get the items, the query uses the Elements method and explicitly drills into the RSS data from the root element. For each item, the query creates an anonymous type with the title, published date, URL to the item’s Web page, location, and price. Again, each element’s value is retrieved by explicitly casting the element to the desired CLR type. The result is then bound to an ASP.NET ListView to display on the Web page.

C#

    var d = XDocument.Load(BuildListingUrl());

    var items = from item in
                    d.Elements("rss").Elements("channel").Elements("item")
                select new
                {
                    Title = (string)item.Element("title"),
                    Published = (DateTime)item.Element("pubDate"),
                    Url = (string)item.Element("link"),
                    Location = (string)item.Element(expo +
                        "location").Element(expo + "city"),
                    Price = (decimal)item.Element(expo + "price")
                };

    ListView1.DataSource = items;
    ListView1.DataBind();
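The namespace handling and explicit conversions used in this query can be tried against inline data. This is a sketch under stated assumptions: the RSS fragment and its values are made up for illustration and only loosely follow the shape described above. It also shows one detail the text does not cover: casting to a nullable type returns null for a missing element, where a cast to decimal would throw.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class Program
{
    public static void Main()
    {
        // Hypothetical RSS fragment with one extension element in the expo namespace.
        const string rss =
            "<rss version=\"2.0\" xmlns:expo=\"http://expo.live.com/ns/2006/1.0\">" +
            "<channel>" +
            "<item><title>Road bike</title><expo:price>250.00</expo:price></item>" +
            "<item><title>Free sofa</title></item>" +
            "</channel></rss>";

        XDocument d = XDocument.Parse(rss);
        XNamespace expo = "http://expo.live.com/ns/2006/1.0";

        // Three ways of producing the same fully qualified name.
        XName viaOperator = expo + "price";
        XName viaGetName = expo.GetName("price");
        XName viaGet = XNamespace.Get("http://expo.live.com/ns/2006/1.0") + "price";
        Console.WriteLine(viaOperator == viaGetName && viaGetName == viaGet); // True

        var items = from item in d.Elements("rss").Elements("channel").Elements("item")
                    select new
                    {
                        Title = (string)item.Element("title"),
                        // Nullable cast: null when the element is absent.
                        Price = (decimal?)item.Element(expo + "price")
                    };

        foreach (var item in items)
        {
            Console.WriteLine(item.Title + ": " +
                (item.Price.HasValue ? "has a price" : "no price"));
        }
    }
}
```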



The Visual Basic code is nearly the same. Just as in C#, the element values are cast to the desired type. However, in Visual Basic there is an explicit call to the Value method because the XML element property is mapped to the Elements method. This method returns a collection of XElements. While XElement defines explicit conversions to CLR types, the collection does not. To solve this problem, Visual Basic defined a Value extension method on the collection. This method gets the first item from the collection and returns its value. If the collection is empty, it returns Nothing.

Visual Basic

    Dim d = XDocument.Load(BuildListingUrl())

    Dim items = _
        From item In d.<rss>.<channel>.<item> _
        Select _
            Title = item.<title>.Value, _
            Published = CDate(item.<pubDate>.Value), _
            Url = item.<link>.Value, _
            Location = _
                item.<expo:location>.<expo:city>.Value, _
            Price = CDec(item.<expo:price>.Value)

    ListView1.DataSource = items
    ListView1.DataBind()

feed that conforms to the GeoRSS specification. I’ll use LINQ to XML to create the GeoRSS XML.

One of the reasons why Microsoft created a new XML API was that the W3C-based XmlDocument and XmlElement in System.Xml cannot be functionally created. The System.Xml API uses a factory-based pattern instead which, unfortunately, requires a factory context for creating the elements and needs multiple statements for constructing the XML. With LINQ you need to do everything within the context of a single LINQ expression.

The constructor for XElement, shown below, takes a variable number of arguments.

    XElement(XName name, params object[] content);

Because each constructor can take as its argument the constructor for its content, you can create the entire XML document or fragment within a single expression.

In the LINQ query, each item from the Windows Live Expo data is converted to a GeoRSS item. The transformation is simple and summarized in Table 2. Listing 1 shows the C# code for the transformation.
With the items converted, the Virtual Earth map

XML IntelliSense
control requests the full GeoRSS XML by embed-
ding the items inside of the RSS and channel ele-
One of the
ments. The server returns data to the map control reasons why
Visual Basic makes it even easier to use XML proper-
ties because it supports XSD-driven IntelliSense. If
via a separate aspx page. Microsoft created
you add an XSD file to the project, IntelliSense will C# a new XML API
pop up after typing a ‘<’ or ‘@’ in an expression of
var geoRss =
was because
type XElement or XDocument as shown in Figure 3.
If you don’t have a schema for your XML data, that’s new XElement("rss", the W3C-based
new XAttribute("version", "2.0"),
not a problem. Just load some sample XML into the
new XElement("channel", XmlDocument and
XML Editor in Visual Studio and infer the schema.
Save the inferred schema to your project directory
new XElement("title",
"Expo Live Result Locations"), XMLElement in
and then add the file to your project. new XElement("link", cmd),
geoItems
System.Xml
) cannot be
Creating GeoRSS for the Virtual Earth );
functionally
Map Control created.
This code correctly creates the GeoRSS XML but the
XML may not serialize exactly the way you want it
Until now the sample app has consumed XML
data from the Windows Live Expo service. Now I
want to generate some XML. The sample applica- C# Visual Basic
tion uses a Virtual Earth map control to display the d.Descendants d...<
locations of the items returned from Windows Live d.Elements d.<
Expo. After the map control is created on the Web d.Attribute d.@
page, it requests the pushpin locations from the
server. The server returns the locations in an RSS Table 1: LINQ to XML methods and XML properties.

Windows Live Expo Data GeoRSS


<item> <item>
<title> <title>
<description> <description>
<expo:location><expo:latitude> <geo:lat>
<expo:location><expo:longitude> <geo:long>
Table 2: Transforming Windows Live Expo data to GeoRSS data.

www.code-magazine.com Browsing Windows Live Expo with LINQ to XML 35


Listing 1: Windows Live Expo data transformation to GeoRSS items and the Virtual Earth map control displayed pushpins
showing each item’s location. When I counted the
var d = XDocument.Load(BuildListingUrl()); pushpins, however, there weren’t enough.
var items = from item in
The problem is that the position information for each
d.Elements("rss").Elements("channel").Elements("item")
item is the latitude and longitude of the item’s postal
select new
{ code. Two items with the same postal have the same
Title = (string)item.Element("title"), position so the map control only displays one pushpin.
Published = (DateTime)item.Element("pubDate"), Rather than generate multiple overlapping pushpins,
Url = (string)item.Element("link"), I need to generate one pushpin and concatenate the
Location = (string)item.Element(expo + titles and descriptions from each of the items at the
location").Element(expo + "city"), same location. I can use LINQ grouping to solve this
Price = (decimal)item.Element(expo + "price") problem.
};
Grouping has two parts: an expression for the objects
ListView1.DataSource = items; to put in the group and an expression for the key to
ListView1.DataBind partition the objects. In the query, I want to put the
RSS items in the group and I want to partition on
to look. While LINQ to XML names are always fully unique latitude and longitude values. Getting the items
qualified, they do not have a prefix. If you care about for the group is easy. That is just the item variable from
the prefix assigned to a name then you need to create a the “from” clause. The key has to be a single expres-
namespace attribute that defines the prefix. Without a sion but I have two values. To solve this I’ll use a new
namespace declaration, the serializer may create a name anonymous type with two members, latitude and lon-
with a default namespace or in some cases, with a syn- gitude. I’ll then assign a name for this group so that I
thesized prefix. In this case, I want the prefix for the geo can access it in the select clause.
namespace to be “geo”. To ensure the prefix is generated
correctly. I’ll add the following code to the rss element. C#

new XAttribute(XNamespace.Xmlns + "geo", var positions =


geo.NamespaceName), from item in
d.Elements("rss").
Elements("channel").
Using Grouping Elements("item")
group item by
At this point I thought I was done with the application. new
The ASP.NET ListView control displayed the items {

Figure 2: XML IntelliSense is available in Visual Basic when you add an XSD schema to the project.

36 Browsing Windows Live Expo with LINQ to XML www.code-magazine.com


Figure 3: Inspecting the result of grouping by position. Each result has a key and a collection of items. Each item in the group has the
same key value

latitude = (string)item. <geo:long><%= _


Element(expo + "location"). item.<expo:location>. <expo:longitude>.Value _
Element(expo + "latitude"), %></geo:long>
</item>
longitude = (string)item.
Element(expo + "location"). The XML literal compiles to calls to LINQ to XML
Element(expo + "longitude") API so the type of the geoItems variable is XElement.
} into g In addition to XML 1.0 syntax, XML literals support
select new { position = g.Key, data = g }; embedded Visual Basic expressions. You use embedded
expressions to programmatically create content within
The select clause then returns the group key with an XML literal. They use an ASP.NET-like syntax and
the latitude and longitude and the items at that loca- can contain any valid Visual Basic expression.
tion. See Figure 3 to inspect the group query result
in Visual Studio. The rest of the code to merge the Because the XML literal is processed and understood
titles and descriptions is in the sample application. by the Visual Basic compiler, you get additional sup-
port within the IDE when using XML literals. For ex-
ample, the IDE automatically inserts the end element
Visual Basic XML Literals when starting a new element and will auto correct the
end element name when the start element name is
Complementary to XML properties, are XML liter- changed. In addition, features such as outlining, re-
als. With XML literals you can create XML docu- naming, and find all references all work as expected
ments, elements, and fragments within a Visual Ba- with XML literals.
sic program using XML syntax. The code snippet
below shows code similar to the C# example above
to create the GeoRSS items in Visual Basic. Conclusion
Visual Basic XML is everywhere on the Web and applications need
to be able to easily produce and consume XML. While
Dim geoItems = _ this article barely scratches the surface of the LINQ to
From item In d.<rss>.<channel>.<item> _ XML APIs, you can see how easy it is to integrate
Select _ data from Web services using the .NET Framework
<item> 3.5. LINQ to XML vastly simplifies querying and con-
<title><%= item.<title>.Value %></title>
structing XML. Visual Basic takes XML programming
<description><%= _
item.<description>.Value %>
one level higher with XML properties, XML literals,
</description> and XSD-driven IntelliSense.
David Schach
<geo:lat><%= item.<expo:location>. _
<expo:latitude>.Value %></geo:lat>

www.code-magazine.com Browsing Windows Live Expo with LINQ to XML 37
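The article leaves the merge step to the sample application, so the following is only a minimal sketch of how the grouping query, the functional XElement construction, and the geo prefix declaration might fit together. The namespace URIs, the sample data, and the names GeoRssSketch, ExpoItem, SampleFeed, BuildGeoRss, and Merge are assumptions made for illustration; they are not taken from the sample application.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class GeoRssSketch
{
    // Placeholder namespace URIs for this sketch only; real code would use
    // the actual Windows Live Expo and GeoRSS namespace URIs.
    static readonly XNamespace expo = "urn:example:expo";
    static readonly XNamespace geo = "urn:example:geo";

    // Builds one input item, nesting latitude/longitude under expo:location.
    static XElement ExpoItem(string title, string description, string lat, string lon)
    {
        return new XElement("item",
            new XElement("title", title),
            new XElement("description", description),
            new XElement(expo + "location",
                new XElement(expo + "latitude", lat),
                new XElement(expo + "longitude", lon)));
    }

    // Hypothetical sample data: two items share a position, one does not.
    public static XDocument SampleFeed()
    {
        return new XDocument(
            new XElement("rss",
                new XElement("channel",
                    ExpoItem("Bike", "Red bike", "47.6", "-122.3"),
                    ExpoItem("Desk", "Oak desk", "47.6", "-122.3"),
                    ExpoItem("Lamp", "Floor lamp", "45.5", "-122.7"))));
    }

    // Concatenates one child element's value across every item in a group.
    static string Merge(IEnumerable<XElement> items, string name)
    {
        return string.Join("; ", items.Select(i => (string)i.Element(name)).ToArray());
    }

    // Groups items by position and emits one GeoRSS item per distinct position.
    public static XElement BuildGeoRss(XDocument d)
    {
        var geoItems =
            from item in d.Elements("rss").Elements("channel").Elements("item")
            group item by new
            {
                latitude = (string)item.Element(expo + "location").Element(expo + "latitude"),
                longitude = (string)item.Element(expo + "location").Element(expo + "longitude")
            } into g
            select new XElement("item",
                new XElement("title", Merge(g, "title")),
                new XElement("description", Merge(g, "description")),
                new XElement(geo + "lat", g.Key.latitude),
                new XElement(geo + "long", g.Key.longitude));

        // Declaring xmlns:geo on the root makes the serializer use the "geo" prefix.
        return new XElement("rss",
            new XAttribute("version", "2.0"),
            new XAttribute(XNamespace.Xmlns + "geo", geo.NamespaceName),
            new XElement("channel", geoItems));
    }

    static void Main()
    {
        Console.WriteLine(BuildGeoRss(SampleFeed()));
    }
}
```

The anonymous type works as a grouping key because the compiler gives it value-based equality, so two items with the same latitude and longitude strings land in the same group.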


ONLINE QUICK ID 0712082

ADO.NET Data Services


Separation of presentation and data has long been considered
a best practice in the development of Web applications. Driven
by the need for low friction deployment and a richer user experience, the types and
architectures of Web applications are evolving dramatically. With the introduction
and growth of AJAX-based applications and Rich Interactive Applications (RIA)
using technologies such as Microsoft® Silverlight™, separation of presentation and
data is no longer just a best practice, it is required.
These architectural changes have been accompanied by a number of new classes of applications. More specifically, two classes of applications have become prevalent in today’s Web landscape: AJAX-based Web applications and RIA applications using technologies such as Microsoft’s Silverlight™. AJAX-based Web applications serve pages containing presentation and behavior, using JavaScript to represent page behavior and then turn back and fetch data separately, using XMLHTTP. Technologies such as these typically remove the option of a server-side rendering process that mixes data and code. Instead, code written to drive the presentation aspects is pre-compiled and deployed to the client via the Web server. After reaching the client Web browser, the code calls back to a Web server to retrieve the data to display within the user interface.

Adding to the classes of applications mentioned, “mashup”-style applications have been introduced, which leverage the presentation/data divide by aggregating data from multiple sources to provide an enhanced experience, using the combined data. This landscape makes it interesting to talk about ADO.NET Data Services (formerly known as Microsoft Project Codename “Astoria”), the services that applications will use to find and manipulate data on the Web, regardless of the presentation technology used or whether or not the front-end is hosted in the same location or on the same server as the data.

What Are ADO.NET Data Services?

ADO.NET Data Services provides the ability to expose data as a service, available to clients from across the corporate intranet or across the Internet. These services are then accessible over HTTP, using URIs to identify the pieces of information or “resources” made available through a service. Primary interaction with an ADO.NET Data Service occurs in terms of HTTP verbs such as GET and POST, and the data exchanged in these interactions is represented using simple formats such as XML and JSON.

In both the AJAX application and RIA application models noted above, data is retrieved via an HTTP request to a middle tier server. Currently, a number of Web service technologies (Windows Communication Foundation, ASMX, etc.) and associated standards (WS-*, SOAP, etc.) exist and may be used to serve data to clients in these scenarios. ADO.NET Data Services and its REST-based access patterns are designed to be complementary to these existing technologies.

For applications that are highly driven by business process and/or typically access data through a rich façade layer, Microsoft has a strong offering in WCF. For applications that are driven more by “pure” data (e.g., NetFlix movie queue, stock ticker, etc.) or for data/business logic hybrid applications, ADO.NET Data Services provides a rich offering by enabling the creation of reusable UI elements with a uniform URI addressing scheme. In addition, it also enables the introduction of business logic in a way that is naturally integrated with the rest of the HTTP interface such as ADO.NET Data Services query operators (paging, sorting, etc.) on the server.

Originally introduced as an incubation project from the ADO.NET team under the Microsoft Codename “Astoria” at Mix ’07, ADO.NET Data Services has since moved out from under the “incubation” title and is now a production project. As part of our move from incubation and into development as a production product, we started from scratch using feedback and lessons learned from our initial CTP incubation release. If you have been following ADO.NET Data Services since the beginning, you may notice a few changes to the product in an upcoming technology preview release. Since our initial release we have worked a lot on the architecture of the product, designing it such that we can broaden the capabilities of the product to address the feedback we have received to date without adding undue complexity to the system as a whole.

Using ADO.NET Data Services

ADO.NET Data Services consumes a conceptual model representing the entities that your application wishes to expose, along with the associations between entities, and exposes them as HTTP resources, each identified by a URI. The conceptual model exposed by ADO.NET Data Services is defined using the Entity Data Model (EDM), which is supported by the ADO.NET Entity Framework. By virtue of the ADO.NET Entity Framework, ADO.NET Data Services can expose high-level constructs which are mapped by the framework to the underlying store. For more information on the Entity Framework and EDM, please see Michael Pizzo & Shyam Pather’s articles, also in this issue of CoDe Focus.

Setting up a new ADO.NET Data Service is straightforward and, at the time of this writing, involves creating an EDM model and a data service using wizards in Visual Studio 2008 and then writing a couple lines of code to hook up the data service to the model. The site http://msdn.microsoft.com/data includes links to step-by-step instructions showing how to create a data service using the latest release of ADO.NET Data Services.

The following examples use the read-only Northwind sample database and the experimental ADO.NET Data Services Online Service to detail by example how to interact with an ADO.NET Data Service. As we progress towards a final version of the product, we will likely update the Online Service to reflect our current thinking; however, since the service is experimental the URLs noted below may change or be removed altogether over time, so I recommend visiting the ADO.NET Data Services section on http://msdn.microsoft.com/data after reading this article to see if any updates to the URI syntax have been made. For example:

    http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[ALFKI]

returns a single Customer resource. When a URI points to a specific resource, you can perform CRUD operations on the resource: an HTTP GET request to the URI enables retrieval of the resource, the POST verb is used to add a new instance, the PUT verb is used to update and, finally, the DELETE verb is used to remove a resource instance.

In addition, it is possible to traverse associations between resources using a URI such as:

    http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[city eq 'London']/Orders?$orderby=OrderDate

This URI represents all the Orders for the Customers living in London, where the results are returned sorted by the OrderDate.

The documents linked to from the developer center noted above detail all the URI construction rules and available query operators, as well as the data formats supported in HTTP requests and responses to/from an ADO.NET Data Service. At the time of this writing, the latest CTP release supports simple data formats such as XML and JSON. For an in-depth overview of the formats used by ADO.NET Data Services, please see http://msdn.microsoft.com/data.

Rich Client Access

Since all ADO.NET Data Service resources are exposed by simple HTTP entry points, any application with the ability to make an HTTP request can access a data service. In addition to direct HTTP access using an HTTP API, client libraries are available for the .NET Framework and Silverlight 1.1 applications. As well as abstracting the application from the base semantics of HTTP, the libraries surface results as .NET objects and provide rich services to the application such as graph management, change tracking, and update processing.

For AJAX applications, most AJAX libraries include some type of easy-to-use wrapper object over the XMLHTTP object. To facilitate this usage model, ADO.NET Data Services supports requests and responses in JSON format. This enables a payload returned from a data service to be used as the parameter to the JavaScript eval() function, which enables the AJAX application to directly materialize JavaScript constructs from the response from a data service.

Going Beyond “Pure Data”

So far I have described how ADO.NET Data Services maps a URI to each entity within an EDM-based conceptual model, which a developer defines and provides to a data service to expose as HTTP resources. This approach works well for classes of data (e.g., blog articles) that can be served from the storage medium all the way to the application where it is likely to be directly presented to the end user. While such data exists, it is clear that not all data fits this mold. Some types of data will, for a variety of reasons (validation, etc.), need to be accompanied by business logic that governs how the data is served.

ADO.NET Data Services enables a developer to go beyond the pure data model where required by defining service operations and/or interceptors. Service operations allow the developer of a data service to define a method on the server which, just like all other ADO.NET Data Service resources, is identified by a URI. For example, the URI /MyFavoriteBooks?category=sports&$OrderBy=Title represents a call to the service operation named MyFavoriteBooks that takes a single category parameter. One of the interesting features of service operations is that if the operation on the service returns a query representing the resource to return as opposed to the resource itself, then as shown in the example above, the output of the service operation can be acted on using standard ADO.NET Data Service operations. This is shown in the query string of the example above where the OrderBy operator is applied to the results of the operation.

Another mechanism known as interceptors is provided to data services, which enables a developer to plug custom validation logic into the request/response processing pipeline of a data service. Interceptors allow a developer to register a method to be called when a particular CRUD action occurs on a given type of resource being exposed by ADO.NET Data Services. Such a method then may alter the data or even terminate the operation.

Looking Forward

Applications are moving to a model which divides behavior and presentation from data access. This trend has been driven by the ability of such architectures to deliver a rich, interactive client experience. ADO.NET Data Services provides a simple, uniform way for applications to access their data. Uniformity in this regard simplifies data access and enables opportunities for generic UI controls and libraries to be constructed based on a well-known pattern.

ADO.NET Data Services is currently in active development. Unfortunately, this article is unable to cover all aspects of this new technology. We’ll strive to be very transparent with our blog at http://blogs.microsoft.com/astoriateam. There you’ll learn more about how ADO.NET Data Services is progressing and find out more on our current thinking in various aspects of the product design, plus you can provide us with your feedback.

Mike Flasko
Elisa Flasko

Fast Facts

Originally started as an exploratory incubation project by a few people within the ADO.NET team under the Microsoft Codename “Astoria”, ADO.NET Data Services has since moved out from under the “incubation” title into a first class member of the ADO.NET suite of data access technologies for next-generation Web applications.

Mike Flasko
Mike.Flasko@microsoft.com

Mike is a Program Manager (PM) in the SQL Data Programmability group at Microsoft. He is currently focused on the ADO.NET Data Services project. Prior to working with ADO.NET Data Services, Mike was a PM on the Windows Network Developer Platform team at Microsoft, where he was responsible for the System.Net namespace (.NET Framework), the Winsock API, and the Winsock Kernel API. Prior to his roles as a Program Manager, Mike worked as a .NET Developer Technology Evangelist.

Elisa Flasko
Elisa.Flasko@microsoft.com

Elisa is a Program Manager with the SQL Data Programmability team at Microsoft focused on a number of Data Platform Development technologies including ADO.NET, XML, and SQL Native Client products. Prior to joining Data Programmability, Elisa worked as a Technical Presenter Business Development Manager for Microsoft, travelling across North America. Elisa has previously worked in both large and small companies, in positions ranging from sales to quality assurance and from software development to program management.

Data Programmability Team

Pablo Castro

“Simplicity is what keeps the Internet working. One of the beautiful things about the HTTP pattern and the concept of REST is that they both focus on just that: simplicity.”

Pablo works in the SQL Server product group at Microsoft, designing the future versions of the programming models for working with data. He has been involved in several releases of large scale projects at Microsoft such as SQL Server 2005 and the .NET Framework. Before joining Microsoft, Pablo spent some time running a start-up focused on advanced application frameworks development, and before that he split his time between building collaborative applications and creating infrastructure for inference system creation and distributed execution used in risk analysis applications.

What Is REST?

In his dissertation, Roy Thomas Fielding describes Representational State Transfer (REST) as an architectural style making use of existing technologies such as HTTP and XML over the Web. The REST architectural style outlines how to address and define resources. The list below highlights some key design principles that make up the REST interface:

• Application state and functionality are divided into resources
• Every resource is uniquely addressable using a universal syntax for use in hypermedia links
• All resources share a uniform interface for the transfer of state between client and resource
• A protocol that is:
  § Client/Server
  § Stateless
  § Cacheable
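To make the uniform addressing scheme concrete, the sketch below rebuilds the three example URIs from the article and maps CRUD operations onto the HTTP verbs the article names. It only composes strings and issues no requests. The class and method names (DataServiceUriSketch, ResourceByKey, Traverse, ServiceOperation, VerbFor) are hypothetical helpers, not part of any ADO.NET Data Services library, and the bracketed key syntax is the CTP-era form shown above, which the authors warn may change. A real client would also percent-encode the spaces in the predicate.

```csharp
using System;

static class DataServiceUriSketch
{
    // Service root copied from the article's examples; the experimental
    // Online Service may have moved or been removed.
    const string ServiceRoot = "http://astoria.sandbox.live.com/northwind/northwind.rse";

    // Addresses a single resource by key, e.g. Customers[ALFKI].
    public static string ResourceByKey(string entitySet, string key)
    {
        return ServiceRoot + "/" + entitySet + "[" + key + "]";
    }

    // Filters an entity set, follows an association, and sorts the result.
    public static string Traverse(string entitySet, string predicate,
                                  string association, string orderBy)
    {
        return ServiceRoot + "/" + entitySet + "[" + predicate + "]/"
            + association + "?$orderby=" + orderBy;
    }

    // Relative URI for a service operation with one parameter, with a
    // query operator applied to the operation's output.
    public static string ServiceOperation(string name, string parameter,
                                          string value, string orderBy)
    {
        return "/" + name + "?" + parameter + "=" + value + "&$OrderBy=" + orderBy;
    }

    // Maps a CRUD operation onto the HTTP verb described in the article.
    public static string VerbFor(string operation)
    {
        switch (operation)
        {
            case "create": return "POST";
            case "read": return "GET";
            case "update": return "PUT";
            case "delete": return "DELETE";
            default: throw new ArgumentException("Unknown operation: " + operation);
        }
    }

    static void Main()
    {
        Console.WriteLine(VerbFor("read") + " " + ResourceByKey("Customers", "ALFKI"));
        Console.WriteLine(VerbFor("read") + " "
            + Traverse("Customers", "city eq 'London'", "Orders", "OrderDate"));
        Console.WriteLine(VerbFor("read") + " "
            + ServiceOperation("MyFavoriteBooks", "category", "sports", "Title"));
    }
}
```

Because every resource, association, and service operation is reached through the same URI shape and the same four verbs, a generic UI control or client library can be written once against the pattern rather than once per service.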


ONLINE QUICK ID 0712092

Caching with SQL Server Compact and the Microsoft Sync Framework for ADO.NET

With Sync Services for ADO.NET, developers can easily optimize their online experience by caching data locally within the easy-to-deploy SQL Server Compact embedded database engine. In this article, I’ll cover how Sync Services for ADO.NET was designed to fit the growing developer needs for caching data locally in online-optimized, offline-enabled applications.

Steve Lasker
SteveLas@Microsoft.com

Steve is a Program Manager at Microsoft working to empower developers to enable their occasionally connected users.

For Visual Studio 2005, Steve was on the Visual Basic team working on many of the data design-time features for building client applications including the Data Source Window, DataSet Designer, Object Binding, and enabling SQL Server Compact for usage across all Windows operating systems. For Visual Studio and SQL Server 2008, Steve focused on SQL Server Compact 3.5 and, working with an extremely talented group of people across the company, led the design for Sync Services 1.0.

Prior to joining Microsoft, Steve was a technical architect with an early Internet eCommerce company before moving to a consulting firm building Web, client, and device applications for corporate customers. With an engineering background in the remote broadcasting industry, Steve lived the life of the mobile workforce where he gained his passion for the occasionally connected user.

Steve speaks regularly at various events and maintains a blog with screencasts, Q&A, and architectures for occasionally connected systems. You can reach Steve via e-mail or his blog: blogs.msdn.com/SteveLasker

Fast Facts

Sync Services and SQL Server Compact aim to bring new levels of productivity, enabling developers to easily add caching and offline features to their applications. With a focus on friction free, non-admin deployment, developers no longer need to trade powerful, productive features with an end-user focused deployment model.

There’s nothing new about wanting to cache data, or enabling users to work offline. The problem has been that enabling this functionality has been just too hard as the technologies required have been complex to deploy and manage, and haven’t really been focused on developers or productivity. Most applications are designed to solve some business problem, requiring the developer to spend most of their time focusing on the business components of the application. All the infrastructure bells and whistles are cool, but it’s typically hard to justify the time spent on something the end user doesn’t see.

Microsoft designed Sync Services for ADO.NET to simplify the integration of local caching and synchronization into your application. It focuses on a goal of “Simple things should be simple while complex things should be enabled.” Sync Services for ADO.NET ships in two major upcoming products. Visual Studio 2008 marks the first release of the technology and ushers in unparalleled ease of use with an integrated Sync Designer leveraging SQL Server Compact 3.5 as the local cache. The second release of Sync Services for ADO.NET ships as a part of SQL Server 2008, enabling further server configuration simplicity with greater performance for sync-enabled applications. Sync Services 2.0 incorporates the Microsoft Sync Framework and enables additional support for non-relational data stores in collaborative, peer-to-peer topologies. This article discusses the project goals and functionality enabled with Sync Services for ADO.NET. For more info on the Microsoft Sync Framework, please read Moe Khosravy's article, Introducing the Microsoft Sync Framework: Next Generation Synchronization Framework.

Saving Leftover Pizza, or Shopping for Fresh Pasta and Sauce

When developers start thinking about caching, it’s helpful to consider that there are multiple models of caching. I like to break them down into two distinct categories I call Passive Caching and Active Caching. With Passive Caching, the client portion of the application simply caches results to queries as they’re returned from the server. There’s not a lot of complexity here and it can have a fairly minimal impact on your existing application architecture, but as we’ll see, the benefits are minimal as well. With Active Caching, the application proactively pre-fetches data that the client may require. Instead of pre-fetching answers to the individual queries, the application will retrieve the raw data required to answer the questions.

An alternative way to think about passive caching is the same way that you may shop for food. You could order out for pizza, Chinese food, or in some cities, you can have just about anything delivered directly to your home. Now imagine that instead of just saving the leftovers, you could actually “copy” the delivery and save it in your refrigerator. The refrigerator represents your local cache. The next time you’re hungry for Chinese food or pizza, you’re all set; you simply open the cache. But what if you want spaghetti and meatballs? You could place another order for delivery or you could dissect the pizza w/meatballs and the Chinese noodles to create the spaghetti. This can get quite complex, not to mention strange.
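The difference between the two models can be sketched in a few lines. This is an illustration of the idea only and does not use the Sync Services API: the Product, PassiveCache, ActiveCache, and CachingSketch types, the sample catalog, and the query strings are all hypothetical. The passive cache can only replay answers the server has already given, while the active cache holds raw rows and can answer questions the server was never asked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Product
{
    public string Name;
    public string Category;
    public decimal Price;
}

// Passive caching: remember each query's result exactly as the server returned it.
class PassiveCache
{
    private readonly Dictionary<string, List<Product>> results =
        new Dictionary<string, List<Product>>();
    private readonly Func<string, List<Product>> server;
    public int ServerCalls;

    public PassiveCache(Func<string, List<Product>> server) { this.server = server; }

    public List<Product> Query(string queryText)
    {
        List<Product> cached;
        if (!results.TryGetValue(queryText, out cached))
        {
            ServerCalls++;                    // cache miss: go back to the server
            cached = server(queryText);
            results[queryText] = cached;
        }
        return cached;
    }
}

// Active caching: pre-fetch the raw rows once, then answer any question locally.
class ActiveCache
{
    private readonly List<Product> rows;

    public ActiveCache(IEnumerable<Product> prefetched) { rows = prefetched.ToList(); }

    public List<Product> Query(Func<IEnumerable<Product>, IEnumerable<Product>> question)
    {
        return question(rows).ToList();
    }
}

static class CachingSketch
{
    // Hypothetical catalog standing in for the central database.
    public static List<Product> Catalog()
    {
        return new List<Product>
        {
            new Product { Name = "axle", Category = "Parts", Price = 40m },
            new Product { Name = "anvil", Category = "Tools", Price = 90m },
            new Product { Name = "bolt", Category = "Parts", Price = 1m },
            new Product { Name = "crane", Category = "Tools", Price = 500m }
        };
    }

    // Stand-in for the server: it only understands the queries built for it.
    public static List<Product> Server(string queryText)
    {
        switch (queryText)
        {
            case "name LIKE 'a%'":
                return Catalog().Where(p => p.Name.StartsWith("a")).ToList();
            case "category = 'Parts'":
                return Catalog().Where(p => p.Category == "Parts").ToList();
            default:
                throw new NotSupportedException("No server query for: " + queryText);
        }
    }

    static void Main()
    {
        var passive = new PassiveCache(Server);
        passive.Query("name LIKE 'a%'");
        passive.Query("name LIKE 'a%'");      // served from the cache
        Console.WriteLine("Server calls: " + passive.ServerCalls);

        var active = new ActiveCache(Catalog());
        var top2 = active.Query(rows => rows.OrderByDescending(p => p.Price).Take(2));
        Console.WriteLine(string.Join(", ", top2.Select(p => p.Name).ToArray()));
    }
}
```

Asking the passive cache for the most expensive products would either fail or, worse, return the most expensive of whatever happened to be cached; the active cache answers it from the local rows without touching the server.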
Figure 1: Typical central management of all queries, increasing the workload of the central servers and making the application rigid and difficult to maintain. (Roles shown: End User, UI Designer, Client Dev, Service Dev, DBA.)

Turning back to the data example, what if you cached queries for Products.ProductName LIKE ‘a%’ or Products.Category = ‘Parts’? If you ask those same questions again, you can leverage the cache. But what if you ask for the top five products? If you just look at the cache, you only get the top five products that start with ‘A’ in the ‘Parts’ category. This starts to represent the limits of Passive Caching.

With Active Caching, your application will have brought down the raw materials to answer the questions. Just as you may shop for raw materials such as meat, pasta, and spices, you can cache the appropriate amount of raw data. Sometimes raw data may be too detailed. Mom may not approve, but you might buy pre-prepped foods such as canned tomato sauce, which you then spice up to satisfy your unique taste. De-normalizing the data as it’s brought down to the client may be your tomato sauce. However, buying one of everything at your food store isn’t practical either. Determining the right subset can help you cook up the results you need locally without having to replicate the entire enterprise database. Microsoft Outlook uses this same model with Exchange Server. The appropriate filter of data (your inbox, calendar, tasks, etc.) is brought down to your cache and you then search, filter, and group the data locally, offloading the details from Exchange Server.

Simplifying Data Access by Caching Data Locally

In addition to the benefits of Active Caching, another reason to cache data is that it can actually simplify your overall development. While many applications may start fairly simple, allowing the client to directly query the database, many applications quickly move into the enterprise or need to communicate over the public Internet. To limit the server exposure, clients communicate via services which are responsible for opening the connections to the database behind the firewall. This architecture quickly adds a lot of complexity to your development. As seen in Figure 1, for each question your users ask the server, the development team would need some subset of the following:

• Stored procedures and/or views for each query on the database
• Configuration of permissions on each of the objects
• Load testing and tuning of the queries as the server has a multiplier of users asking intersecting questions
• Indexes supporting the WHERE clauses of the sprocs & views
• Service contract wrapping each sproc/view with potentially a custom object definition for the result shape
• Client proxies for each service API
• And of course the supporting UI to ask and answer each question

Making even the simplest change to the queries or results can be painful to implement across the entire stack, requiring many roles in your development team to get involved. Back to my pizza example, what if you wanted extra cheese, or you wanted everything but onions on the “everything” pizza? Since you’re getting pre-cooked meals, you must go back to the food store or possibly to the supplier. Of course this isn’t very practical. To solve the problem, many development teams automate the creation of these various tiers; however, inevitably there are changes that require database tuning for each interleaving query. Once the application goes into production, it’s difficult to make changes as the changes typically trickle across all tiers, requiring a new version of the entire stack.

Figure 2: Offloading detailed questions to the client, simplifying the server-side configuration and reducing the central workload. (Roles shown: End User, UI Designer, Client Dev, Service Dev, DBA.)

By leveraging this pre-fetch, Active Caching model, you can now simplify your application development. The DBA and service developer now maintain a set of services that serve out raw data, or essentially your version of your local food store. Just as the food store has frozen pre-cooked dinners, or even hot, ready-to-eat meals, the services will likely offer a blend of sync services and live services that may not be practical to “cook up” locally. As you can see in Figure 2, with the raw data local, the UI designer, client developer, and end user can be quite productive, asking all sorts of interesting questions without having to worry about the impact to the central resources. This can not only be productive for development, but extremely powerful to end users as they can now ask complicated questions that could have caused havoc to the multi-user server. The local

very practical as Erik has other articles he needs to edit. If Erik has to micro-manage my writing, I’d create too much work for him, which begs the question, why enlist me to write the article? We’ve all been in this situation before thinking it’s too much trouble to explain and you should just do it yourself. But if you did that for everything, could you get everything done you needed to do? This delegation of responsibilities is how most things scale.

Just Because You Can, Doesn’t Mean You Should

Consider an order management system. For each order, and each line item, how much of the system needs to be involved? Does the server really need to know the details for how each line item is added? Or, similar to the article editing process, should the user get the reference data they need to create the order including the product catalog, categories, pricing, and customer info? As they create orders, users could save them locally until they are ready to be

Sync Services and SQL Server Compact Work on .NET Framework 2.0

To minimize the complexity for developers wanting to enable local caching, Sync Services
for ADO.NET is built on .NET query may not always be the most efficient query, submitted. The server then receives completed or-
FX 2.0. This means clients that but only the individual user is affected as there’s ders and revalidates them for accuracy. It’s just soft-
have 2.0, 3.0 or 3.5 versions of no multiplier of users on a local cache. This re- ware, so why not have all the logic on the server?
the framework can utilize Sync duces the workload to the production server as it
Services for ADO.NET. only needs to return changes to the raw data. As corporate data centers evolve, the amount of
central processing is causing them to grow at an
exponential pace. While data centers grow, client
Micro Management or Delegation computers are leaving lots of processing power
“on the table”. This is starting to look a lot like
Browsers offload In addition to the change management issues the mainframe to PC revolution all over again.
above, how much duplication of effort should you The difference this time may be balancing both
entry and minimal implement within your application? Though I am the use of the client and the central servers. In the
validation to the writing this article for a special CoDe Focus issue mainframe days, the client sent each keystroke to
of CoDe Magazine, typically Rod Paddock (Edi- the server. Browsers offload entry and minimal
client. Is bringing tor-in-Chief of CoDe) would create a theme for validation to the client. Is bringing reference data to
reference each magazine and then find appropriate authors the client just a continuum of the PC evolution?
for each article. They may enlist a writer like me
data to the client for a topic; they’d provide me a target page count, By offloading some of the validation to the client,
just a continuum of a Word template, and CoDe Magazine publish- applications can better leverage each tier in the
ing guidelines. I then go offline and write the ar- architecture and empower users to make impor-
the PC evolution? ticle. When done, I submit it for editing. A few tant and quick decisions, while focusing the serv-
days later I’ll get the article back with the edits. er on validating completed orders. Just because
(The number of edits typically relates to how well companies can centrally run all the application
I followed the guidelines.) I review the changes, logic doesn’t mean it’s the best overall solution
making sure the underlying meaning isn’t lost and for those data centers or the end users who get
submit it again for “validation”. If all is accepted, frustrated at the latency and the requirement that
the article goes into layout and publishing. If not, they be tethered to the network. Using this lay-
we repeat the same validation and editing loop. ered, empowered approach, you’ll likely find that
This model scales as Rod employs the expertise you can make many enhancements in one portion
of great editors such as Erik Ruthruff for overall of the application without affecting other tiers. If
content quality and delegates the technology ex- you already have the reference data on the client,
pertise to the individual authors. you can likely cook up other enhancements by
simply versioning the client app and leaving the
Now imagine this process in the “online” elec- central services alone.
tronic model. As I write, I’d submit each para-
graph, heading, and figure to CoDe for validation.
Since I’m not as structured as some writers, I How to Keep Your Cache Up to Date
tend to write out my thoughts, chunk them up,
and continually reorganize them. This can make Now that you may be convinced having a local
my style challenging to edit, so imagine if I sub- cache can help your clients make better decisions,
mitted my content as I write it. This wouldn’t be how do you keep the cache up to date? In my

food shopping comparison, you use up resources and need to replenish them when inventories get low. With reference data, you don’t “use up” data, but rather need to get updates to the data. With Active Caching, you bring down the raw materials. But similar to food shopping, you don’t repurchase everything; you just get “the updates.” There are a few models to tracking changes. The server could keep track of all the clients or the clients can track their own reference points.

As each client asks, “What changed for me?” the server can send its change list. The problem with this model is scalability. While this may work for a small number of clients, as you get to the thousands, this model tends to fall over as the server must track changes for each client. Rather than have the server track each client, what if the server simply tracked when changes were made? As each client synchronizes with the server, the client simply stores the reference point of the server, so later on, it can ask for what’s changed since the last reference point.

So imagine this scenario: On Monday Adam, your user, installs the Order Entry application. Adam downloads his product line from the overall catalog and the customers for his region. After a two-day road trip, he reconnects to the corporate servers. Rather than asking the servers for what data it has for Adam, the application asks what’s changed since Monday for his product lines and his customers. In Sync Services for ADO.NET, the storing of the server’s reference point on the client is known as an anchor-based tracking system. This works really well for centrally managed data systems that need to support a large number of clients.

Painting the Big Picture

Before jumping into the details for Sync Services for ADO.NET, it may help to understand a number of key design goals and priorities that our team had in mind.

• Simple things should be simple while complex things should be enabled.
• Work within friction free, non-admin, Web deployment models.
• Be developer-oriented and leverage their domain-specific skills for each type of data source.
• Deliver an end-to-end experience built on a layered component model.
• Enable independent server and client configuration empowering the client application to determine the data it needs, while freeing the server to version independently offering additional services.
• Enable rich eventing on all tiers for conflict detection and business rule re-validation for domain-specific data sources.
• Enable easy movement/refactoring between two-tier, n-tier, and service-oriented architectures.
• Be protocol and transport agnostic leveraging the various transports, protocols, and security models developed by other smart folks at Microsoft (Web services, WCF, …).
• Enable endless scalability of the number of clients synchronizing with the server.
• Provide a common sync model for all types of data, not just relational data, but files and custom data sources.
• Ensure data can be exchanged in a variety of connectivity patterns including centralized and collaborative, peer-to-peer topologies.

I’ll expand on these points a bit below, providing a bit more context. As we started to scope the effort of accomplishing these goals, we considered the common data sources that developers required. The most obvious were files and relational data. XML data sort of fell into a combination of files and unstructured relational storage. In many cases developers were choosing XML storage as it was too hard to deploy anything else. For Sync Services for ADO.NET version 1.0 (hereafter, V1) we decided to focus on relational data in a centralized topology, but carefully designed the system to support other data sources in the future. Be sure to read Moe Khosravy’s article, Introducing the Microsoft Sync Framework: Next Generation Synchronization Framework, covering the broader goals for the sync platform including peer-to-peer topologies.

As we scoped V1 to relational data, we elaborated on the domain-specific data sources:

• Leverage a developer’s ADO.NET experience while being DBA-friendly for server configuration.
• Deliver a good, better, best programming model enabling any ADO.NET server database, provide a designer for SQL Server, and use SQL Server 2008 to deliver the best overall performance and simplest to configure.

Sidebar: One Size Doesn’t Fit All

Sync Services is just one component of a larger architecture. On my blog I describe how “Logical Queuing” and “Notifications to Pull” are other important architectures for using the right tool for the right job.

The Power of SQL Server in a Compact Footprint

While many IT-supported applications were successful deploying SQL Server Express, the majority of applications didn’t have direct IT support. Many of these applications were persisting their data as XML using serialized DataSets as they couldn’t use SQL Server Express due to its deployment requirements. With our V1 scoping to relational data and our design goals prioritizing non-admin, Web deployment models, we knew we had to enable a local store that could deliver the power of SQL Server in a compact footprint.

Since 2001, SQL Server has had an embedded database engine available to the Windows Mobile platform known as SQL Server CE. Targeted at

devices, Microsoft designed it for constrained environments, but it still had the power to handle relatively large amounts of data in a compact footprint. With a footprint under 2 MB, it became a popular choice for applications that need the power of SQL Server in a compact footprint. Several applications in Windows Vista, Media Center PC, and MSN Client all embedded this engine within their applications. You likely didn’t even know you had SQL Server Compact running, and that’s one of its key advantages. It’s simply embedded cleanly within each application.

Figure 3: High-level architecture for Sync Services for ADO.NET.

In 2005, SQL Server CE became SQL Server Mobile to recognize its expanded usage on the Mobile platform of the Tablet PC. With Visual Studio 2005 focusing on client development, we knew we couldn’t wait for Visual Studio 2008 to enable developers looking to cache data for optimized online or those building offline applications. We knew we’d take a while to complete Sync Services and SQL Server Mobile simply needed a licensing change. With Visual Studio 2005 Service Pack 1 we released SQL Server Compact 3.1 as the successor to SQL Server Mobile and SQL Server CE to address all our current Windows desktop operating systems.

Some of the key benefits of SQL Server Compact include the following:

• ~1MB for the storage engine and query processor.
• Consistent engine and data format for all Windows operating systems from the phone to the desktop.
• Runs in-process (i.e., embedded).
• Deployed as DLLs eliminating the need for a Windows service.
• Supports central installation with traditional MSI (requires administrative rights).
• Supports private deployment of the DLLs within the application directory (enabling non-admin deployment).
• Full side-by-side support for multiple versions enabling different applications to embed SQL Server Compact without fear that other applications using newer versions may destabilize their application.
• Rich subset of query programming of SQL Server.
• Rich subset of data types of SQL Server.
• Robust data programmability through ADO.NET with native programmability through OLE DB.
• ISAM-like APIs for bypassing the query processor, working directly against the data.
• Updatable, scrollable cursors eliminating the need for duplicate, in-memory copies.
• Support for up to 4GB of storage, with increasing storage in future versions.
• Rich sync technologies including Merge Replication, RDA, and Sync Services for ADO.NET.
• Full support for SQL reporting controls.
• Single-file code-free file format enabling a safe document-centric approach to data storage.
• Support for custom extensions enabling double-click association with your application.
• Support for network storage, such as network redirected documents and settings.
• Concurrent application support from multiple processes or threads enabling sharing of data on a single user’s machine.
• Best of all, it’s free to deploy, license, distribute, and embed within your applications.

With SQL Server Compact 3.1 shipping with Visual Studio 2005 SP1, developers can fill the need between XML storage and SQL Server Express. With Visual Studio 2008, SQL Server Compact 3.5 ships as the default local database with integrated designer support including the support of Sync Services for ADO.NET and LINQ to SQL. With the release of the ADO.NET Entity Framework, SQL Server Compact will be used to persist ADO.NET entities locally within your application.

Future versions of SQL Server Compact will continue to take advantage of the unique embedded nature to provide an integrated experience within your application programming model. SQL Server Express will certainly continue for developers requiring the same programming model with our data service SKUs, including Workgroup, Standard and Enterprise. SQL Server Express

and SQL Server Compact are not meant to compete, but rather provide overlapping technologies enabling the unique needs of custom applications. They are best when used together with SQL Server Express as the free entry point to Microsoft’s data service platform and SQL Server Compact as the local cache utilizing Sync Services to synchronize the shared data and local cache.

Simple to Complex

As developers are continually challenged with new emerging technologies, it becomes difficult to invest a lot of time learning a technology just to get the simple things done. To enable the simple scenarios with the complexity as needed, Sync Services took an incremental approach to its design. This includes simple snapshot caching, enabling developers to add incremental changes, or ultimately the ability to send changes from the local store back up to the server.

Additionally, many applications may start off as two-tier applications and over time they expand requiring n-tier architectures. Rather than assume you must re-architect the application to make this move, Sync Services was designed to easily move from two-tier to n-tier architectures, including service-oriented architectures (SOA). Figure 3 shows how the components are broken into groupings. You can see server components, client components, and a SyncAgent which orchestrates the overall synchronization process.

Building On Your ADO.NET Skills

One of the significant improvements between classic ADO and ADO.NET was the ability to drill into and configure the commands which were executed against the server. With ADO.NET, components such as the DataAdapter retrieve data from your database populating a disconnected DataSet sending the data across the wire. As inserts, updates and deletes are made, the DataAdapter can shred the changes based on the RowVersion in the DataSet, executing insert, update and delete commands that may have been configured for you, or customized to suit the needs of your specific application using custom T-SQL, sprocs, views or functions. However, the DataAdapter is designed for a one-time retrieval of data. It doesn’t have the infrastructure to retrieve incremental changes. As seen in Figure 4, Sync Services uses a SyncAdapter, which is based on the DataAdapter programming model. The SelectCommand is replaced with three new commands for retrieving incremental changes.

Figure 4: SyncAdapters follow a familiar pattern to ADO.NET DataAdapters.

Configuring each command and its associated parameters can be quite tedious, especially when you just need the basics. Similar to the SqlCommandBuilder, Sync Services adds a SqlSyncAdapterBuilder which allows you to specify a few configuration options to create a fully-configured SyncAdapter. While Sync Services can work with any ADO.NET provider, the SqlSyncAdapterBuilder is specific for Microsoft SQL Server enabling the “Better on SQL Server” design goal.

Server Configuration

Figure 3 shows the green portions representing the server-side configuration. As most database servers are shared servers for several applications, each application may only require a subset of the data, or the data may need to be reshaped or denormalized before it travels down to the client. For this reason, Sync Services exposes granular configuration with ADO.NET DbCommands enabling any ADO.NET data provider. For each logical table you wish to synchronize, a SyncAdapter is configured.

One of the challenges with DataAdapters is the ability to update parent-child relationships. With common referential integrity patterns, child deletes must first be executed before the parents can be deleted. Likewise, parents must first be created before children can be inserted. The DbServerSyncProvider contains a collection of SyncAdapters that enables the hierarchical execution of commands based on the order of SyncAdapters in the collection.
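
The ordering rule for parent-child tables (parents inserted before children, children deleted before parents) can be sketched in a few lines. This is a minimal Python illustration using the standard-library sqlite3 module, not the Sync Services API. The table names OrderHeader and OrderItems are borrowed from the designer discussion in this article, while TABLE_ORDER, make_db, apply_changes, and count are hypothetical names invented for the sketch.

```python
import sqlite3

# Hypothetical table order, parents first. It plays the role that the order
# of SyncAdapters plays in the provider's collection.
TABLE_ORDER = ["OrderHeader", "OrderItems"]


def make_db():
    """Create an in-memory database with a parent/child foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.execute("CREATE TABLE OrderHeader (Id INTEGER PRIMARY KEY, Customer TEXT)")
    conn.execute(
        "CREATE TABLE OrderItems ("
        "Id INTEGER PRIMARY KEY, "
        "OrderId INTEGER NOT NULL REFERENCES OrderHeader(Id), "
        "Product TEXT)"
    )
    return conn


def apply_changes(conn, inserts, deletes):
    """Apply inserts parent-first, then deletes child-first.

    inserts maps table name -> list of row tuples;
    deletes maps table name -> list of primary keys.
    """
    for table in TABLE_ORDER:  # parents before children
        for row in inserts.get(table, []):
            marks = ", ".join("?" * len(row))
            conn.execute(f"INSERT INTO {table} VALUES ({marks})", row)
    for table in reversed(TABLE_ORDER):  # children before parents
        for key in deletes.get(table, []):
            conn.execute(f"DELETE FROM {table} WHERE Id = ?", (key,))
    conn.commit()


def count(conn, table):
    """Return the number of rows currently in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    # The change set lists the child first; the fixed table order still wins.
    apply_changes(conn, {"OrderItems": [(10, 1, "Pizza")], "OrderHeader": [(1, "Adam")]}, {})
    apply_changes(conn, {}, {"OrderHeader": [1], "OrderItems": [10]})
    print(count(conn, "OrderHeader"), count(conn, "OrderItems"))
```

The caller can hand over changes in any order because the fixed table list decides the execution order. Walking the list forward for inserts and backward for deletes keeps the foreign key satisfied in both directions.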

Client Configuration

The client has two main components. The client provider, in this case the relational store of SQL Server Compact, and the SyncAgent which orchestrates the synchronization process. You could sort of think of the SyncAgent as the thing that does the food shopping for you once you’ve given it the shopping list, and location of the food store. While server databases tend to be shared resources, client databases are typically specific to the application. With this design simplicity there’s no need for configuration of the individual commands for each local table. However as client applications get installed on potentially thousands of clients, there are a number of other deployment options. Other configuration options such as what subset of server data each client actually requires and the ability to group several tables within a transactional sync operation can be achieved with a SyncGroup configured through the Configuration object of the SyncAgent. Once the SyncAgent is configured with a RemoteProvider, LocalProvider and the SyncTables, you can simply call the Synchronize method.

In two-tier architectures the RemoteProvider is the DbServerSyncProvider. Later in this article, in the section called “Going N-tier with WCF,” I’ll discuss how you move from a two-tier architecture to something with multiple tiers. To enable different data sources ranging from relational databases to files, as well as custom objects, Sync Services follows a standard provider model. Sync Services 1.0 includes a relational provider for SQL Server Compact, or the SqlCeClientSyncProvider.

Configuring the SqlCeClientSyncProvider only requires a connection string for the local database with the default option to automatically create the local database and schema. Similar to the auto creation of the database, the SyncTable configuration enables options for creating the tables and whether local changes to the table should be synched back to the server. Several events are available within the client and server providers enabling conflict detection and business rule validation.

Sidebar: Resources for SQL Server Compact and Sync Services for ADO.NET

• www.SyncGuru.com. Rafik Robeal, our developer guru for Sync Services, provides many examples and insight into Sync Services for ADO.NET.
• blogs.MSDN.com/SteveLasker. Includes many screencasts, demos, answers to common questions and blogicles for occasionally connected architectures.
• www.Microsoft.com/SQL/Compact. Product info and downloads for SQL Server Compact.
• blogs.MSDN.com/SQLServerCompact. The SQL Server Compact team blog.
• www.Microsoft.com/SQL/Editions/Compact/SSCEComparison.mspx. Choosing between SQL Server Express and SQL Server Compact.
• MSDN.Microsoft.com/Sync. Info for the overall sync platform.

Designer Productivity

To meet our developer productivity design goal, Visual Studio 2008 adds a Sync Designer to simplify configuration of Sync Services for ADO.NET. Similar to the design goals for Sync Services, the Sync Designer starts simple and lets you incrementally add complex requirements as needed. The Sync Designer focuses on the following:

• Configure SQL Server databases for change tracking, including tracking columns, tracking tables (known as Tombstone tables) and triggers to maintain the tracking columns.
• SQL scripts generation for later reuse and editing when moving from development to production.
• Generation of typed classes for instancing the sync components, similar to the typed DataSet and TableAdapter experiences.
• Auto creation of the local database and schema, including primary keys.
• Separation of client and server components to enable n-tier scenarios.
• Creation of WCF service contracts for n-tier enabled applications.

The Visual Studio 2008 Sync Designer is focused around read-only, reference data scenarios, so you won’t see any configuration for uploading changes. However, the designer does generate the upload commands; they’re just not enabled. Similar to typed DataSets, you can extend the designer-generated classes through partial types enabling the upload commands with a single line of code per table. By selecting “View Code” from the Solution Explorer context menu on the .sync file you can reconfigure the Customers table to enable bidirectional synchronization as seen in Listing 1.

Listing 1: Extending the Sync Designer, enabling bidirectional synchronization and N-tier sync with WCF

Partial Public Class NorthwindCacheSyncAgent
    Private Sub OnInitialized()
        ' Enable Customers to sync with the server
        Me.Customers.SyncDirection = _
            SyncDirection.Bidirectional
        ' Gluing the SyncAgent.RemoteProvider to
        ' a WCF service proxy
        Me.RemoteProvider = New ServerSyncProviderProxy( _
            New SyncServRef.NWCacheSyncContractClient())
    End Sub
End Class

Other features of Sync Services that aren’t enabled through the designer but can be extended through partial classes are:

• Filtering of columns or rows
• Foreign keys, constraints, and defaults
• Batching of changes

To get a feel for the designer simply add a Local Database Cache item to any non-Web project type. The Local Database Cache item aggregates the configuration of Sync Services for ADO.NET and SQL Server Compact 3.5 for the local cache. The designer will enable connections to SQL Server databases and will automatically create a SQL Server Compact 3.5 database. With the server database selected, you can add tables to be cached based on the default schema of the user id in the SQL Server connection string. By checking one of the tables the designer will configure the server tracking changes using “smart defaults”.

If you can’t make server-side changes to your database, and the data is small enough to just retrieve the entire result each time you sync, you can change the “Data to download” combo box to “Entire table each time,” also known as snapshot. When performing snapshot sync, the tracking combo boxes become disabled as the SQL Server configuration options aren’t required. Assuming you want

incremental changes, you can then choose which columns will be used to track the changes. The designer will default to adding an additional DateTime column for LastEditDate and CreationDate. If your table already has TimeStamp columns for last updates, you can select the existing column. Sync Services stores a common anchor value for last edit and creation columns, so you’ll need to either use DateTime or TimeStamp on both the last edit and creation tracking columns. Since you can only have one TimeStamp column per table, you’ll notice that if you choose TimeStamp for the LastEdit comparison, the creation column will default to a BigInt. Timestamps are really just big integers serialized in binary form.

Figure 5: Sync Services in a layered service-oriented architecture.

Tracking Deletes

Tracking deletes is an interesting problem. If you synchronize with the server on Monday, and come back on Wednesday asking what’s changed, how does the server know what’s deleted unless it keeps a historical record? There are a couple of different approaches to tracking deletes. Prior to the Sarbanes-Oxley (SOX) compliance days, it was typical to just delete aged data. However, between SOX and increased disk storage, applications are keeping the deleted data around. While it’s important to keep the deleted data on the server, it’s typically not necessary to clog up your clients. When each client synchronizes, Sync Services simply requests the list of primary keys that should be deleted locally.

Tombstones

A standard model for tracking deletes is to create a separate table that contains the primary keys for deleted rows. Just as cemeteries use tombstones to leave an indicator of what once was, tombstone tables have become the standard in many sync-enabled systems. Another interesting analogy is how tombstones, if not managed, could eventually cover the earth leaving no room for the “active participants.” Luckily data is a lot less emotionally sensitive so a standard model is to purge tombstone records after a period of mourning. How long you keep your tombstones is related to a balance of disk space, database sizes, and how long you expect your users to work offline. The pre-SQL Server 2008 model could utilize a scheduled task to run daily to purge tombstones that are older than a configured number of days. Furthering our goal to make Sync Services run “Best on SQL Server,” SQL Server 2008 adds a feature known as SQL Server Change Tracking that dramatically simplifies this server configuration and reduces the overhead by tracking changes deep within the SQL Server engine. Change tracking incorporates a change retention policy that automatically purges historical changes based on the per-table configured value.

Since Visual Studio 2008 ships prior to SQL Server 2008, and developers may need to target pre-SQL Server 2008 servers, the Sync Designer will default to creating a Tombstone table and the associated triggers to maintain the tracking information. As with the rest of the SyncAdapter configuration, if the designer-generated commands don’t fit your needs, you can easily customize these commands using sprocs, views, and functions. Sync Services simply needs a DbCommand with results for the specific command, in this case the list of primary keys to delete. As you move to SQL Server 2008, your configuration becomes easier, and your sync-enabled database will simply perform better.

Setting the Hierarchical Order

Once you’ve added the cached tables, the Sync Designer will configure your server and save the creation and undo scripts for inspection and re-execution when moving from development to production. Back in the Configure Data Synchronization dialog box, you’ll see your list of tables. It may not be that important for reference data, but if you intend to send your changes back up to the server, and you want to make sure parent records, (such as the OrderHeader table) are inserted before OrderItems and OrderItems are deleted before OrderHeader rows are deleted, you can use the arrows to shuffle the list of tables sorting the parents above their children. The designer sets the order of SyncAdapters within the DbServerSyncProvider. Clicking OK to finish the Sync Designer configuration will configure the Sync Services runtime, generate the classes for use within your application, and execute the Synchronization method to automatically generate the SQL Server Compact database with

CODE COMPILERS

DATA DEVELOPMENT
October 2007 Volume 4 Issue 3

Group Publisher
Markus Egger

Associate Publisher
Rick Strahl

Managing Editor
Ellen Whitney

Content Editors
H. Kevin Fansler, Erik Ruthruff

Writers In This Issue
Brian Beckman, José Blakeley, Anthony Carrabino, Andrew Conrad, Gert Drapers, Elisa Flasko, Mike Flasko, Moe Khosravy, Stan Kitsis, Christian Kleinerman, Steve Lasker, Chris Lee, Erik Meijer, Shyam Pather, Michael Pizzo, David Schach, Vaughn Washington, Jonathan Wells

Technical Reviewers
Elisa Flasko
Ellen Whitney

Art & Layout
King Laurin GmbH
info@raffeiner.bz.it

Production
Franz Wimmer
King Laurin GmbH
39057 St. Michael/ Eppan, Italy

Printing
Fry Communications, Inc.
800 West Church Rd.
Mechanicsburg, PA 17055

Advertising Sales
Vice President, Sales and Marketing
Tom Buckley
832-717-4445 ext 34
tbuckley@code-magazine.com

Sales Managers
Erna Egger
+43 (664) 151 0861
erna@code-magazine.com
Tammy Ferguson
832-717-4445 ext 26
tammy@code-magazine.com

Circulation & Distribution
General Circulation: EPS Software Corp.
Newsstand: Ingram Periodicals, Inc.
Media Solutions
Worldwide Media

Subscriptions
Subscriptions Manager
Cleo Gaither
832-717-4445 ext 10
subscriptions@code-magazine.com

US subscriptions are US $29.99 for one year. Subscriptions outside the US are US $44.99. Payments should be made in US dollars drawn on a US bank. American Express, MasterCard, Visa, and Discover credit cards accepted. Bill me option is available only for US subscriptions. Back issues are available. For subscription information, email subscriptions@code-magazine.com or contact customer service at 832-717-4445 ext 10.

Subscribe online at
www.code-magazine.com

CoDe Component Developer Magazine
EPS Software Corporation / Publishing Division
6605 Cypresswood Drive, Ste 300,
Spring, Texas 77379
Phone: 832-717-4445 Fax: 832-717-4460

the schema and data you’ve selected. As the Sync Designer completes it adds the newly created SQL Server Compact database to your project which triggers Visual Studio to automatically prompt to create a new typed DataSet for your newly added SQL Server Compact data source.

Sync Services, SQL Server Compact and LINQ

Similar to building houses, it’s difficult to build the second floor when the basement is still being completed. Because of the parallel development of LINQ, Sync Services and the Microsoft Sync Framework, Sync Services for ADO.NET 1.0 uses DataSets and DbCommands for its programming APIs. Future releases of Sync Services will be extended to work with the new LINQ programming models.

The Sync Services, Active Caching approach synchronizes data between two or more data sources. The orange box in Figure 3 shows how interacting with the data in SQL Server Compact is completely up to the application developer. SQL Server Compact continues to support the typed DataSets and TableAdapter programming model from Visual Studio 2005 as well as the updateable, scrollable cursor using the SqlCeResultSet. With the new LINQ programming model, developers may use LINQ to SQL or LINQ to Entities. While Visual Studio may auto-prompt to create typed DataSets, this is simply based on providing a consistent experience with what developers may already be used to with Visual Studio 2005. Developers wishing to use LINQ to SQL or LINQ to Entities can simply cancel this dialog and use the appropriate tools to generate classes over tables within their SQL Server Compact database. For more information on LINQ to SQL and LINQ to Entities, see the article, LINQ to Relational Data: Who’s Who? by Elisa Flasko in this issue of CoDe.

Going N-tier with WCF

One of the challenges developers faced in Visual Studio 2005 was how to easily refactor Typed DataSets from two-tier to n-tier architectures. ADO.NET factored DataSets and DataAdapters as separate components allowing the DataAdapters and their associated commands and connections to be hosted on the server. DataSets could be shared on the client and server reusing common business rule validation and schema. However, in Visual Studio 2005, the Typed DataSet Designer didn’t easily enable this scenario. Late in the Visual Studio 2005 cycle we prototyped a number of different solutions which eventually led to the following design.

In Visual Studio 2008, both typed DataSets and the Sync Designer now support full separation of the client- and server-generated code. Within the
Sync Designer, clicking the advanced chevron will expose the n-tier configuration options. Here you can target the client- and server-generated classes to different existing projects within your solution. If you close the dialog and add a new “WCF Service Library” project to your solution you can use the designer to generate your server classes into a WCF service project.

The Sync Designer will not only generate the server provider classes into the WCF project, it will also generate a WCF service contract wrapping the configured server sync provider and its associated interface. At the top of the generated file, a few commented snippets are included for configuring your WCF service. With the classes separated and the WCF service fully configured, you simply need to glue them back together by generating a service and its associated proxy.

Multiple Inheritance and Organic Onions

One of our design goals for Sync Services was to minimize dependencies on external components and leverage other teams building transports, services, and security models. Developers may need to work over Web services, use the more powerful features of WCF, work over REST, use their own custom transports, or serialize data in a customized format. As you can see in Figure 5, we designed Sync Services to work over stateless service models. All that’s needed is a matching service and proxy. With Sync Services 1.0 using DataSets, developers can leverage the out of the box serialization of DataSets, or they can use LINQ to transform DataSets to their own custom payload. Keeping with the food metaphor, they may convert the DataSets to serialized jellybeans. On the client, the developer simply converts the jellybeans back to a DataSet, handing it to the SyncAgent, and they are good to go.

One of the design challenges we faced with enabling transport-agnostic programming was the base class requirements of many components. Web services and WCF proxies require their own base classes. The RemoteProvider on the SyncAgent requires its own base class of a SyncProvider. To get around the multiple inheritance issues we added a wrapper class to Sync Services enabling delegation to the appropriate base class. Since the wrapper class was generic, but wrapped other proxy classes, we initially called it the OrganicOnion class. As we moved into production this was changed to the less creative, but more professionally named ServerSyncProviderProxy class. To glue the SyncAgent back together with the configured ServerSyncProvider in the WCF project, developers can add the code similar to Listing 1 in the partial class of the Sync Designer.

With this configuration, your application can easily work over two-tier architectures, n-tier or service-oriented architectures with minimal impact to your code.

A Quick Recap

In this article I’ve covered why caching data isn’t just about enabling offline functionality and is a common practice in other scenarios such as the food supply chain. Caching can ease your development cycle, enabling decoupled tiers so you can independently update components of your applications, and of course, you’re that much closer to enabling full offline scenarios. By comparing the caching scenarios to other real world issues, and walking through the configuration of Sync Services for ADO.NET using Visual Studio 2008, I’ve hopefully demonstrated the need and productivity gains for caching reference data, tracking deletes, sending updates and simply moving from two tiers to n-tier over WCF. As your applications scale to more complex requirements, Sync Services can be reconfigured to your unique needs. As you need to enable different data sources or support collaborative topologies, your investments in Sync Services can be expanded with the Microsoft Sync Framework. As your companies roll out SQL Server 2008, features like SQL Server Change Tracking and Sync Services for ADO.NET will deliver the best developer experience with the greatest performance for sync-enabled applications.

It’s Not a Matter of If, but When. Are You Ready?

You can always tell the difference between experienced and novice motorcycle riders by the protective gear they wear. As the experienced will tell you, it’s not a matter of if you fall; it’s a matter of when. By building systems that can leverage a local cache, your users, and your business will be protected from the inevitable. With SQL Server Compact, Sync Services for ADO.NET, and the Microsoft Sync Framework you have more productive and powerful tools enabling applications to easily integrate redundancy into your applications. Done right you can reduce the workload to your central services empowering your users to work anytime, anywhere. When they’re connected, things may work better, but when they’re not, they can continue to be productive producing revenue for your company. Although networks and wireless continue to be available in more locations with faster bandwidth, they’re not going to be available everywhere, and if you assume your users won’t have a problem, well, I guess you haven’t fully experienced motorcycle riding.

Steve Lasker
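The wrapper idea from the "Multiple Inheritance and Organic Onions" section can be sketched in a few lines. This is only an illustration of delegation, written in Python rather than the article's .NET, and it is not the actual ServerSyncProviderProxy implementation or Listing 1. Every class and method name below (SyncProvider, TransportClientBase, GeneratedServiceProxy, ProviderProxyWrapper, get_changes, apply_changes) is a hypothetical stand-in.

```python
class SyncProvider:
    """Hypothetical stand-in for the base class the sync agent insists on."""

    def get_changes(self, anchor):
        raise NotImplementedError

    def apply_changes(self, changes):
        raise NotImplementedError


class TransportClientBase:
    """Hypothetical stand-in for a base class imposed by a transport stack."""

    def __init__(self, endpoint):
        self.endpoint = endpoint


class GeneratedServiceProxy(TransportClientBase):
    """A generated proxy already has a mandatory base class, so in a
    single-inheritance language it cannot also derive from SyncProvider."""

    def __init__(self, endpoint, server_rows):
        super().__init__(endpoint)
        self._server_rows = server_rows  # fake server state: (version, row) pairs

    def get_changes(self, anchor):
        return [(v, row) for v, row in self._server_rows if v > anchor]

    def apply_changes(self, changes):
        self._server_rows.extend(changes)
        return len(changes)


class ProviderProxyWrapper(SyncProvider):
    """Derives from SyncProvider and forwards each call to the wrapped proxy."""

    def __init__(self, inner):
        self._inner = inner

    def get_changes(self, anchor):
        return self._inner.get_changes(anchor)

    def apply_changes(self, changes):
        return self._inner.apply_changes(changes)


class SyncAgent:
    """Toy agent that only accepts SyncProvider instances as its remote side."""

    def __init__(self, remote_provider):
        if not isinstance(remote_provider, SyncProvider):
            raise TypeError("remote provider must be a SyncProvider")
        self.remote_provider = remote_provider

    def download(self, anchor):
        return self.remote_provider.get_changes(anchor)


proxy = GeneratedServiceProxy("http://example.invalid/sync",
                              [(1, "apples"), (2, "pears")])
agent = SyncAgent(ProviderProxyWrapper(proxy))
downloaded = agent.download(anchor=1)  # only rows newer than version 1
```

The agent never sees the transport base class, and the proxy never needs to know about SyncProvider, which is the point of the wrapper: each side keeps the one base class it requires.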

ONLINE QUICK ID 0712102

Introducing the Microsoft Sync Framework: Next Generation Synchronization Framework

The Microsoft® Sync Framework is the new framework and runtime for adding synchronization, roaming, and offline capabilities to applications. It supports peer-to-peer scenarios, works with devices and services, and is agnostic of data types, stores, and protocols. In this article, I’ll cover the high-level vision for the platform as well as the enabled scenarios made possible by the framework for developers, ISVs, and OEMs.

Moe Khosravy
Moe is the Lead Program Manager on the Microsoft Sync Framework and has been with the project since incubation. You can find him roaming the halls of campus preaching content flow and applying quotes from The Simpsons to everyday life. He was previously an architect in the Advanced Technologies group for Vital Images Inc, working on advanced medical visualization and analysis software solutions.

Fast Facts
For years developers have been looking for an easy way to ensure that their applications may be taken offline and synchronized with other devices such as cellphones, PDAs, and other PCs.

Several years ago, Microsoft assembled architects and researchers from across the company to understand the limitations preventing the seamless flow of data across applications, devices, and services to allow consumers access to their data wherever they need it. Throughout the investigations, several observations quickly emerged:

• Lack of generic synchronization technology led to the creation of numerous single-purpose solutions.
• Solutions tended to reinvent the wheel, often falling into the same pitfalls explored by others.
• Solutions could not be bridged as they made contradictory assumptions about the synchronization methodology (i.e., relied too heavily on specific topologies, data types, stores or business logic).

It became clear that the only way to make progress towards enabling seamless “content flow” was to build a common framework with the following attributes:

• Powerful. Able to solve the numerous hard problems related to interoperability.
• Flexible. Able to be used in all endpoints in the sync-enabled ecosystem.
• Simple. Easy to use in any architecture to compose new systems.

Only then could the framework support roaming and sharing scenarios for content such as PIM, audio, video, settings, and files/folders across any number of PCs, services, and devices, all directly via peer-to-peer sync or through any number of intermediaries such as PCs or services.

At the same time, more and more applications were looking to enable an Outlook-like cached mode of operation in which an application operates offline and periodically synchronizes with the server. Clearly, the common framework would need to excel at scenarios such as:

• Caching and line of business (LOB) applications that can take data from relational databases and disparate back-end services offline.
• Taking Web services and Web applications offline, providing richer experiences or simply network resiliency.

Project Goals

The Microsoft Sync Framework was created with the following goals in mind:

1. Provide a common runtime for developers building synchronization solutions, allowing them to reuse a reliable and optimized code base capable of addressing the numerous subtle issues that developers run into when building caching, offline, sharing or roaming scenarios.
2. Facilitate content flow across solutions, even if they utilize different protocols and stores
by standardizing on the synchronization metadata.
3. Simplify the actual development of sync solutions by providing domain-specific and end-to-end components for common scenarios such as syncing relational databases, file systems, lists, devices, PIM, music, video, etc…

Data Programmability Team
Sam Druker
“I think we’re moving to the point that the cloud is becoming the next great platform for people to build applications on.”
Sam currently leads the Data Programmability team at Microsoft, in the SQL Server division; Sam is responsible for building out the data platform development strategy and engineering, shipping XML and Data platform technology in the .NET Framework and throughout the SQL Server product line. Sam is also responsible for core XML and Data access components in the Windows platform, Internet Explorer, and Office.
Sam has been in the software industry for more than 18 years. Prior to joining Microsoft, Sam was Vice President Engineering for LinkExchange, an early provider of Internet marketing services acquired by Microsoft in 1998 and Vice President Engineering for Cygnus Solutions, a provider of compiler and development tools to the open source community, since acquired by Red Hat Software. In the early nineties, he was a Principal Software Engineer at Cognex Corporation, a leading manufacturer of machine vision systems for factory automation and robotics. Sam started his software engineering career as Vice President Engineering for Zortech, which shipped the first native code compiler for C++. Sam is a graduate of MIT's VI-3 EECS program, a former offensive lineman for the Engineers, and now lives in Seattle, WA.

The Microsoft Sync Framework Approach

A key tenet of the Microsoft Sync Framework is to bring existing applications and devices into the sync ecosystem by leveraging as much as possible from their existing implementations. This meant supporting synchronization between data stores ranging from large distributed Web stores backed by relational databases down to the file systems found on removable USB drives.

The framework would also have to support numerous data types and schemas (such as relational data or files), as well as various sync topologies (such as peer-to-peer meshes or hub-and-spoke). These requirements emerged prominently whenever multiple sync solutions were bridged in a heterogeneous mesh.

For these reasons the heart of the Microsoft Sync Framework consists of the common synchronization metadata model. By agreeing on only the metadata without forcing agreement on protocols or storage, the Microsoft Sync Framework lays the foundation for content-flow as well as a generic solution for synchronization, roaming, and sharing.

The metadata at the core of Microsoft Sync Framework is highly efficient and compact, yet it provides full support for correct multi-master synchronization (that is, allowing concurrent updates on multiple endpoints). The Microsoft Sync Framework runtime implements the algorithms required to work with the metadata, allowing endpoints to easily participate in the ecosystem.

To make it easier and more productive for developers building on Microsoft technologies to leverage these capabilities, Microsoft has started integrating the enabling technology into our next-generation flagships and platforms. For more information on Sync Services for ADO.NET and for an example of a Microsoft Sync Framework-enabled solution for rich synchronization and caching of relational databases, please see Steve Lasker’s article, Caching with SQL Server Compact and the Microsoft Sync Framework for ADO.NET in this issue of CoDe Focus.

Components of the Microsoft Sync Framework

The synchronization framework team designed the Microsoft Sync Framework as a componentized and layered architecture to allow developers to pick and choose only what they require to enable their scenarios and ensure that assets built on the framework could participate in the content flow ecosystem we’re working to grow.

The Microsoft Sync Framework’s designers divided these components into the following logical layers, which are available to both managed and native code developers. (Note: Device builds for ARM, SH, x86, MIPS as well as Apple Mac support are in development and will be provided in a future CTP on MSDN).

• Core Sync Runtime. Infrastructure that includes the algorithms to utilize the metadata as well as components to drive roaming, sharing, offline, and sync on behalf of applications.
• Sync Provider Framework. Components designed to make it easier to expose data to the platform. This is effectively the plug-in model to the Microsoft Sync Framework by which developers can either configure existing Sync Providers or write their own.
• Domain-specific components. Infrastructure to facilitate the rapid development of end-to-end solutions involving specific stores such as SQL Server, SQL Server Compact, FAT, NTFS, etc.

Core Sync Runtime

The core runtime contains the metadata services used by all of the Microsoft Sync Framework clients. Some of the features include:

• Multi-master metadata representation and management, including conflict detection,

custom and preset conflict resolution handling, change enumeration assistance, etc.
• Conflict preservation management and representation of conflict resolution.
• Support for change units (tracking changes at a property level) and consistency units (dealing with logical groupings of objects).
• Handling of filtering changes and filter sync (both for items and for parts of items).
• Recovery from a multitude of failure scenarios, such as tombstone cleanup, interruptions, network failures, etc.
• Synchronization session management, such as cancellation, progress reporting, etc.

The Microsoft Sync Framework provides an out-of-the-box Synchronization Agent that can synchronize endpoints when requested. The endpoints are abstracted using the notion of a Sync Provider, which is responsible for exposing the capabilities of a given endpoint, store or protocol. Applications use Sync Providers for the stores they need to synchronize and developers use the Microsoft Sync Framework to create custom Sync Providers for virtually any type of data.

Core to the framework is support for Simple Sharing Extensions (SSE). The Microsoft Sync Framework natively supports endpoints that wish to interoperate using SSE extensions for RSS and ATOM feeds. Furthermore, the framework provides services for feed generation and consumption, including the requisite conflict detection and preservation. While the Microsoft Sync Framework offers RSS/ATOM support out-of-the-box, developers can extend this support to other formats as well. High level producer/consumer APIs help easily turn any Microsoft Sync Framework provider into an SSE-enabled endpoint!

Sync Provider Framework

The Microsoft Sync Framework supports several ways of writing Sync Providers. Each is intended to make the experience as easy and efficient as possible. In addition to full-featured (that is, “full participant”) providers that are capable of true peer-to-peer sync, the framework provides support for allowing legacy and existing devices and applications to participate in the content flow ecosystem. Specifically, the Microsoft Sync Framework Sync Providers support the following scenarios:

• Partial Participants. Providers that simply store but do not understand most of the sync metadata (e.g. a USB keychain, legacy phone, media device). Despite being very easy to develop even on endpoints that do not host the Microsoft Sync Framework engine, these providers can participate in all multi-master content flow scenarios.
• Simple Participants. Endpoints that lack the ability to detect changes and that lack the ability to store metadata. The Microsoft Sync Framework has services to allow data from these participants to flow within the ecosystem on behalf of a fully featured participant/provider.
• Anchor-Based Providers. Support for providers that rely on a simple tick-count based enumeration mechanism for their change detection mechanism (e.g. timestamps, tick counts, etc.).

Storage Specific Components

In addition to the Microsoft Sync Framework-enabled Sync Providers in development across Microsoft, the Microsoft Sync Framework includes several components to simplify the development of offline, sync, sharing, and roaming scenarios using specific stores and protocols. These components include:

• Relational Data Providers. Synchronization Services for ADO.NET 2.0, the next version of Sync Services discussed in Steve Lasker’s article (Caching with SQL Server Compact and the Microsoft Sync Framework for ADO.NET in this issue of CoDe Focus), will ship with SQL Server 2008. Sync Services 2.0 is built on the Microsoft Sync Framework aligning these two evolving sync platforms. It continues to dramatically simplify the caching of a remote or local database for scenarios such as line of business and branch office/point-of-sale scenarios, while providing peer-to-peer and other advanced sync capabilities. These providers take full advantage of the new change-tracking features in SQL Server 2008 and SQL Server Compact.
• SQL Server Compact Metadata Store. A ready-to-use component for storing sync metadata such as versions, anchors, and change detection information. This component greatly simplifies the development of custom providers that do not have a natural place to store metadata.
• File and Folder Sync Provider. A ready-to-configure provider capable of representing any Win32-compatible file system (e.g. FAT, NTFS, removable device). This provider handles challenges such as change detection on FAT volumes (including move and rename heuristics), name-name collision resolution, update-delete conflicts (including hierarchical update-delete), and the ability to preview a synchronization action.
• ADO.NET Data Services Offline Provider. ADO.NET Data Services providers are currently being explored through prototypes that allow synchronizing data using REST-style interfaces for taking data services offline.
• Plus many more providers currently in development across Microsoft!
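To make the idea behind anchor-based change detection more concrete, here is a minimal sketch in Python. It is not the Microsoft Sync Framework API. The AnchorStore class, its write and enumerate_changes methods, and the integer tick counter are all assumptions made for illustration. The sketch only covers inserts and updates; tracking deletes would need tombstones, which are left out.

```python
from dataclasses import dataclass, field


@dataclass
class AnchorStore:
    """Toy data store that stamps every write with an increasing tick count."""

    tick: int = 0
    rows: dict = field(default_factory=dict)  # key -> (value, tick of last write)

    def write(self, key, value):
        self.tick += 1
        self.rows[key] = (value, self.tick)

    def enumerate_changes(self, last_anchor):
        """Return rows written after last_anchor and the new anchor to store."""
        changes = {k: v for k, (v, t) in self.rows.items() if t > last_anchor}
        return changes, self.tick


store = AnchorStore()
store.write("a", 1)
store.write("b", 2)

# First sync: the client has no anchor yet, so it starts from tick 0.
first_batch, anchor = store.enumerate_changes(0)

store.write("b", 3)
store.write("c", 4)

# Second sync: only rows touched after the remembered anchor are returned.
second_batch, anchor2 = store.enumerate_changes(anchor)
```

The client only has to remember one number between sessions, which is why this style of provider is simple to build on stores that already keep timestamps or tick counts.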

Sync-Enabled Content
Flow Scenarios

By removing the requirement that endpoints
need intimate knowledge of one another in
order to share or roam data, the Microsoft Sync
Framework creates a sync-enabled occasionally
connected ecosystem capable of providing:

• Offline support for rich Internet applications.
Ability to take a Web service or an
Internet application offline with all changes
synced back to any other endpoint without
conflicts.
• Calendar, Contact, Task List sync. Support
for field-level sync such as “first name” and “last
name” in an item-enabled rich PIM sync scenario.
• Rich media sync. Ability to efficiently sync
a full media library, or smaller subset, to the
cloud or to devices either directly from an-
other endpoint or through intermediaries.
Ability to sync properties such as ratings and
play count without transferring the entire
media file.
• Easy caching. The ability to easily cache a
subset of the information from a remote or
local store to support offline scenarios.
• Sharing. Ability to have endpoints, such as
devices, share data between any number of
devices, clients, and services with edits pos-
sible at any node. This makes tethered or
over-the-air sharing or roaming of media,
PIM, files, and settings a breeze.
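Field-level sync, such as syncing a rating or play count without the whole media file, can be illustrated with a small three-way merge. This is a sketch under stated assumptions, not how the Microsoft Sync Framework represents change units. It compares two edited copies against the last synchronized copy, and the function name merge_change_units and the sample fields are hypothetical. It also assumes all three copies carry the same set of fields.

```python
def merge_change_units(base, local, remote):
    """Three-way merge at the field level.

    Returns (merged_item, conflicts), where conflicts maps a field name to the
    (local, remote) values that were both changed since the base copy.
    """
    merged, conflicts = {}, {}
    for name in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(name), local.get(name), remote.get(name)
        if l == r:
            merged[name] = l            # unchanged, or changed identically
        elif l == b:
            merged[name] = r            # only the remote side changed it
        elif r == b:
            merged[name] = l            # only the local side changed it
        else:
            conflicts[name] = (l, r)    # both sides changed it differently
            merged[name] = b            # keep the base value until resolved
    return merged, conflicts


base = {"title": "Song", "rating": 3, "play_count": 10}
on_pc = {"title": "Song", "rating": 5, "play_count": 10}      # rating edited
on_device = {"title": "Song", "rating": 3, "play_count": 11}  # song played once

merged, conflicts = merge_change_units(base, on_pc, on_device)

# A true conflict: both endpoints changed the same field to different values.
_, rating_conflict = merge_change_units(
    base, on_pc, {"title": "Song", "rating": 4, "play_count": 10}
)
```

Because each field is treated as its own unit of change, the two edits above merge cleanly, whereas an item-level comparison would have reported a conflict.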

Conclusion
While this is an introductory, high-level overview
of the Microsoft Sync Framework CTP1, you’ll find
plenty of material including “How-to” documents,
full-featured samples, articles, and white papers as
we unveil the framework on MSDN and at confer-
ences in North America and Europe.

Please visit http://msdn.microsoft.com/sync to get a
copy of the Microsoft Sync Framework CTP1 SDK!
I’m personally looking forward to reading your com-
ments and feedback as we extend the framework
and deliver on the content flow vision and strategy.
I encourage developers to evaluate the SDK and re-
quest the features and improvements that will help
you become more productive and the platform easier
to use. We can be reached at syncsdk@microsoft.com
and look forward to your feedback!

Moe Khosravy

ONLINE QUICK ID 0712112

What’s New in SQL Server 2008?

SQL Server 2008 is scheduled for release in 2008 and promises to deliver an array of new and exciting benefits to both developers and IT Pros alike.

Anthony Carrabino
anthony.carrabino@microsoft.com
Anthony is a Sr. Product Manager for SQL Server at Microsoft. He has been developing and marketing commercial software products for developers since 1994. Before joining Microsoft, Anthony was President/CEO of Vista Software for eight years. During that time he and his team created VistaDB™ as the world’s first fully managed SQL database engine for Microsoft .NET. Before VistaDB, he established APOLLO™ as one of the world's most popular data engines for managing legacy CA-Clipper and FoxPro files. Other career highlights include selling a suite of data access components called Firefly for Flash™ to Macromedia in 2002 and working with the team that licensed a graphics framework to Computer Associates in 1995 that became the GUI for CA-Clipper 5.3. Chess, soccer, reading and working out fill his spare time.

Fast Facts
Did you know that SQL Server is the fastest growing DBMS and BI vendor in the world? And did you know that SQL Server ships more units than ORACLE and IBM combined?

The list below highlights some of the new and improved capabilities planned for SQL Server 2008. Enjoy!

Enterprise Data Platform

• Transparent data encryption. Enables encryption of an entire database, data files, and log files, without the need for application changes.
• Hot Add CPU. Scale your databases by dynamically adding CPU resources to supported hardware platforms without requiring application downtime.
• Policy-based management. New policy-based management architecture.
• Extensible key management. New encryption and key management with support for third-party key management and hardware security module (HSM) products.
• Performance data collection. New centralized data repository stores performance data and new reporting and monitoring tools provide performance insights to administrators.
• Data compression. Improved data compression stores data more effectively and provides significant performance improvements for heavy I/O workloads.
• Streamlined installation. New installation, setup, and configuration architecture separates the installation of physical bits on the hardware from the configuration of the SQL Server enabling custom installation configurations to be used.
• Resource Governor. Define resource limits and priorities for different workloads enabling concurrent workloads to provide consistent performance.
• Predictable query performance. Greater query performance, stability, and predictability with new functionality to lock down query plans.
• Backup compression. Reduces storage required to keep backups online and enables backups to run significantly faster.
• Occasionally connected systems (OCS). New unified synchronization support across applications, data stores, and data types with SQL Server 2008 and SQL Server Compact 3.5.
• Partitioned table parallelism. Improved performance on large partitioned tables.
• Star join query optimizations. Improved query performance for common data warehouse scenarios by recognizing data warehouse join patterns.

Beyond Relational

• New spatial data types. New vector-based spatial data types that conform to industry spatial standards allowing location-aware applications to be developed.
• New date and time data types. DATE (date-only type), TIME (time-only type), DATETIMEOFFSET (time zone aware DATETIME type), and DATETIME2 (new DATETIME type with larger fractional seconds and year range than DATETIME).
• New HIERARCHYID system type. HIERARCHYID is a CLR UDT that provides methods for creating and operating on values that represent hierarchy nodes.
• New FILESTREAM data type. Allows binary data to be stored in an NTFS file system while maintaining transactional consistency with the database.
• Integrated Full-Text Search. Perform high-speed text searches on large text columns.
• Large user-defined types (UDTs). 8 KB limit for UDTs has been removed.

Dynamic Development

• Sparse columns. Provides very efficient management of empty data in a database by enabling NULL data to consume no physical space.



• CLR integration and ADO.NET object services. Developers can program against a database using CLR objects that are managed by ADO.NET.
• Change data capture. Captures and maintains changes to data and schema across tables.
• MERGE SQL statement. Enables developers to handle common tasks such as checking if a row exists before executing an insert or update.
• GROUPING SETS. Enables multiple groupings to be defined in the same query, producing a single result set that is equivalent to a UNION ALL of differently grouped rows.

Pervasive Insight

• New design of SQL Server Integration Services (SSIS) Pipeline. Improved scalability of runtime into multiple processors allows Data Integration packages to scale more effectively.
• Integration Services Persistent Lookups. Improved performance of large table lookups.
• Block computations. Provides a significant improvement in processing performance, enabling users to increase the hierarchy depth and computation complexity.
• New MOLAP-enabled writeback capabilities. Removes the need to query ROLAP partitions and provides users with enhanced writeback scenarios from within analytical applications.
• Scalable reporting. Improved reporting engine and tools for creating, processing, formatting, and viewing reports with extensible architecture that enables easy integration of reporting services.
• Internet report deployment. Effortlessly deploy reports over the Internet.
• Manage reporting infrastructure. Control server behavior with memory management, infrastructure consolidation, and easier configuration through a centralized store and API.
• Improved scale-out configuration. New support for managing multiple report servers.
• Built-in forms authentication. Built-in forms authentication enables users to easily switch between Windows and Forms.
• Report Server application embedding. Report Server application embedding enables the URLs in reports and subscriptions to point back to front-end applications.
• Microsoft Office integration. New Microsoft Office rendering enables users to consume reports directly from within Microsoft Word, and the existing Microsoft Excel® renderer has been greatly enhanced to support nested data regions and sub-reports as well as merged cell improvements.

Conclusion

Developers and organizations can count on SQL Server 2008 to deliver a powerful set of capabilities to solve the growing needs of managing data in the enterprise, on desktops, and on mobile devices. As an integrated part of the Microsoft Data Platform vision, SQL Server 2008 is the most comprehensive release of Microsoft SQL Server to date. For more information, please visit http://www.microsoft.com/sql/prodinfo/futureversion/default.mspx

Anthony Carrabino

Disclaimer
This article is based on a pre-release build of SQL Server 2008 and makes no guarantees that features and/or capabilities listed here will be available in the final release of SQL Server 2008.

ADVERTISING INDEX

Advertisers Index

Code Magazine 41, 63
www.code-magazine.com

EPS Software Corp. 69
www.eps-cs.com/TabletPC

Microsoft 92
http://www.microsoft.com/sql/bigdata

SXSW Interactive Conference 87
www.sxsw.com

Tech Conferences Inc. 2-3
www.devconnections.com

VFP Conversion Tools 79
www.vfpconversion.com/tools

West Wind Technologies 9
www.west-wind.com

Xiine 91
www.xiine.com

This listing is provided as a courtesy to our readers and advertisers. The publisher assumes no responsibility for errors or omissions.



Programming SQL Server 2008

ONLINE QUICK ID 0712122

SQL Server Katmai, now officially announced as SQL Server 2008, introduces a significant amount of new and improved functionality, including enhanced data types and greater programming flexibility for database application developers.

Vaughn Washington
vaughn.washington@microsoft.com

Vaughn Washington currently leads the development team for native data access in the SQL Server division at Microsoft; Vaughn is responsible for building and maintaining Microsoft’s world class data providers for SQL Server: ODBC and OLE DB. His responsibilities include delivering scalable, performant, and secure client components that are consumed by Microsoft internal and external customers through APIs that adhere to industry standards and expectations while showcasing SQL Server’s unique value proposition.

Fast Facts
Did you know SQL Server 2008 has already released more than 16 improvements in the July CTP? You can register to participate in this and future technology previews at: http://connect.microsoft.com/sql

In this article I’ll look at the range of data access technologies available for leveraging the power of SQL Server 2008. After I show you how to select the right technology for your needs, I’ll dive into the improvements, first examining a new construct for programming data. Then I’ll turn to expanding the set of building blocks (data types) available to developers. Taking a brief pause, I’ll discuss how SQL Server 2008 removes limitations for some scenarios. Then building on what the article has covered, I’ll examine new out-of-the-box solutions for common application scenarios. The article will hopefully leave you with a new understanding of the possibilities for taking advantage of Katmai: SQL Server 2008.

Choosing a Data Access Technology

SQL Server offers a wide range of data access technologies to developers. The best place to start the discussion on taking advantage of new programmability features is how to choose the right one. For new application development, using the .NET Framework and specifically the SQL Server provider in ADO.NET, System.Data.SqlClient, will be the best choice in most cases. If you’re working within an existing application with business logic and associated data access code in native C/C++, you can choose from the SQL Server Native Client ODBC driver or the OLE DB provider. Both options allow you to take advantage of the full set of features of SQL Server 2008 and the choice will usually be based on your application requirements.

Additional data access options include Windows Data Access Components (WDAC), new in Windows Vista and previously named Microsoft Data Access Components (MDAC), and the Microsoft SQL Server 2005 JDBC Driver for Java environments. For purposes of this discussion, I’ll focus on SqlClient and SQL Server Native Client; for more information on these and other data access technologies, visit http://msdn.com/data.

To get the most out of the new functionality, you’ll need to use .NET Framework 3.5 or SQL Server Native Client 10.0, which works side-by-side with previous versions. One important takeaway is that you don’t have to rewrite your application from the ground up to take advantage of SQL Server 2008. Instead, you can gain significant value from incremental changes to your existing data access layer. With that said, let’s dive in.

New Programming Constructs: Table-Valued Parameters

One of the most requested new features by developers is the ability to cleanly encapsulate tabular data in a client application, ship it to the server in a single command, and then continue to operate on the data as a table in T-SQL. The simplest such use case is the long-desired array data type. Traditionally, applications solve this need by doing one of the following:

• Defining stored procedures with large numbers of parameters and pivoting the scalar parameter data into rows.
• Using an out-of-band bulk insert mechanism like SqlBulkCopy in ADO.NET to create a temporary table.
• Using parameter arrays in the application and repeatedly executing logic that operates on a scalar “row” of data.

None of these solutions is ideal. The pivoted parameter solution, in addition to being ungraceful, creates code that is difficult to maintain and tough to move forward when the time comes to add a new “column” to the conceptual “table.” The bulk insert solution is clumsy when the application needs to do any additional filtering or apply more complex logic. The parameter array solution, while it may perform well for small data volumes, becomes untenable for larger batches both on the client, where memory consumption may become a problem, and on the server, where per-row invocations of the procedure provide non-optimal performance.

Table-valued parameters (TVPs), believe it or not, address all of these problems. TVPs provide an improved programming model and significant performance benefits in certain scenarios.

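For contrast with the table-valued approach, the first workaround in the list above (many scalar parameters pivoted into rows) might look roughly like the following sketch. The procedure name, the three fixed parameter pairs, and the column names of dbo.AcceptedOrders are assumptions made for illustration; they are not taken from the article.

```sql
-- Hypothetical pre-TVP workaround: one scalar parameter pair per "row".
-- Adding a fourth item or a new "column" means changing the signature.
CREATE PROCEDURE dbo.usp_AcceptOrder_Pivoted (
    @CustomerId int,  -- accepted only to mirror the order scenario; unused here
    @ItemId1 int = NULL, @Quantity1 int = NULL,
    @ItemId2 int = NULL, @Quantity2 int = NULL,
    @ItemId3 int = NULL, @Quantity3 int = NULL)
AS
BEGIN
    -- Pivot the scalar parameters into rows, skipping unused slots.
    INSERT INTO dbo.AcceptedOrders (ItemId, Quantity)
    SELECT P.ItemId, P.Quantity
    FROM (
        SELECT @ItemId1 AS ItemId, @Quantity1 AS Quantity
        UNION ALL SELECT @ItemId2, @Quantity2
        UNION ALL SELECT @ItemId3, @Quantity3
    ) AS P
    WHERE P.ItemId IS NOT NULL;
END
```

The fixed number of slots and the repeated parameter pairs are what make this pattern hard to maintain as the conceptual table grows.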

Imagine a simple order processing scenario. Using a TVP starts with defining a table type in T-SQL on the server:

    -- TSQL to CREATE a TABLE TYPE tt_OrderItems
    CREATE TYPE tt_OrderItems AS TABLE (
        [ItemId] int NOT NULL,
        [Quantity] int NOT NULL)

Next, create a stored procedure that uses the table type you just created and additionally takes the customer who placed the order as a parameter:

    -- TSQL to CREATE a PROCEDURE sp_AcceptOrder
    -- that performs set-based operation on TVP
    CREATE PROCEDURE sp_AcceptOrder (
        @CustomerId int,
        @OrderItems tt_OrderItems READONLY)
    AS
        INSERT dbo.AcceptedOrders
        SELECT O.ItemId, O.Quantity
        FROM @OrderItems AS O
        INNER JOIN dbo.Items AS I
            ON O.ItemId = I.ItemId
        WHERE I.BackOrdered = 0

This example is fairly simple, but it illustrates a big win for developers: if you can update an application to implement its business logic with set-based operations on a batch of data, it should see significant performance gains.

Here are the application changes needed to use table-valued parameters. When using table-valued parameters, client applications generally have two possible programming models:

• Bind in-memory table data as a parameter. This is usually the simplest and fastest to code at the expense of not being as scalable in the application for large batches of data due to increased memory consumption.
• Stream row data to the provider from a disk or network-backed data source. This model takes a bit more code to set up with the advantage of having a fixed memory usage profile in the application.

For small batches of data, the performance difference on the client will usually be negligible, so choosing the simpler programming model may be your best choice. On the server side there will be no performance difference between the models.

Diving into the details for a moment, ADO.NET accomplishes the first model by extending the SqlParameter class to take a DataTable as a value. DataTable’s ubiquity in data application programming makes it an ideal choice for this simple model. SQL Server Native Client OLE DB accomplishes the same model by leveraging the COM IRowset interface and by introducing an implementation that allows buffering. SQL Server Native Client ODBC uses a similar method modeled after parameter arrays where applications allocate and bind arrays of buffers. For the streaming model, ADO.NET supports specifying DbDataReader and IEnumerable<SqlDataRecord> as a parameter value, which provides a solution for both external and user-defined data sources. In much the same vein, SNAC OLE DB accepts a custom IRowset implementation and ODBC builds on the data-at-execution (DAE) paradigm by accepting DAE table-valued parameters. All providers also expose a rich set of services for discovering and describing table-valued parameter metadata.

When you can update an application to use table-valued parameters, it should gain the benefits of having cleaner, more maintainable code in both the client and server application tiers; faster performance, particularly when you use set-based operations on the server side by leveraging the power of the SQL query processor; and better scalability for large data volumes. In addition to enhancing the programming model, SQL Server 2008 introduces new intrinsic data types to better align with the precise needs of applications.

Connecting the SQL Server Developer Community with the Product Team
You may have heard about MSDN forums, http://forums.microsoft.com, as a tool for seeking and providing answers for technical problems, but did you know that in addition to MVPs, Microsoft development product teams participate in the forums? For SQL Server programming questions, check out these forums: .NET Development > .NET Framework Data Access and Storage, and SQL Server > SQL Server Data Access. SQL Server 2008 has versions of these forums under SQL Server Katmai for any questions you have using new functionality.

New Building Blocks: New and Enhanced Date & Time Types

Existing Type    Accuracy             Date/Time Range
Smalldatetime    1 minute             1900-1-1 through 2079-6-6
Datetime         3.33 milliseconds    1753-1-1 through 9999-12-31
Table 1: Existing SQL Server 2005 date and time data types along with their range and maximum precision.

In previous releases of SQL Server, the database exposed date and time data primarily through two data


types: DATETIME and SMALLDATETIME (Table 1). In SQL Server 2005, these types began to show their age and developers began to bump into their limitations.

You can break down problems with the existing date/time types into five categories:

• Applications that work in terms of only date or only time data must implement a layer of abstraction on the server data, often writing their own validation routines. While feasible to accomplish, this increases the burden on developers.
• Table column storage requirements for either date or time only are considerably less than combined date and time. This means that as the size of a database increases, storage costs increase at a rate faster than necessary.
• Existing ranges are often not large enough to represent the data that applications need to handle (like process control and manufacturing).
• You cannot commonly represent time-zone data. Some applications choose to work around this by defining an additional column for the offset and storing date/time in UTC. Along with that solution comes the baggage of performing all the necessary calculations to treat this as a single type and the inability to straightforwardly take advantage of built-in date/time functions in SQL Server.
• Many other database vendors (and the ANSI SQL standard) support unique date, unique time, and time-zone aware date/time types, so migrating an application using this functionality to SQL Server was sometimes a cumbersome process that might even require changing application requirements.

In order to address these problems, SQL Server 2008 introduces support for four new types: DATE, TIME, DATETIME2, and DATETIMEOFFSET; along with a rich set of built-in T-SQL function support for the new types:

• DATE. Provides a broader range of day value than DATETIME by starting at 1/1/1 rather than 1/1/1753, and provides better storage scalability.
• TIME. Also provides storage scalability over existing types and introduces user-defined, fractional-second precision from 0 to 7. In other words, based on your needs you can define a table column of type TIME to be accurate to the second or 100 nanoseconds and you’ll pay only for the accuracy you need in storage costs.
• DATETIME2. A composition of DATE and TIME in that it supports both a wider range and a user-defined, fractional-second precision to 100 ns.
• DATETIMEOFFSET. Includes all of DATETIME2 and additionally introduces time-zone offset support, which should significantly reduce the amount of custom code you need for time-zone-aware applications.

As you can see, all new types support a broader range, and where appropriate, user-defined, fractional-second precision (Table 2). This allows developers to tune the storage size of columns to exactly fit application needs, which, for large databases, translates to significant savings in storage costs. Now let me discuss how data access stacks expose these types.

The new SQL Server types DATE and DATETIME2 correlate to existing types in all data access technologies. For ADO.NET this is DateTime for both, for ODBC it’s SQL_DATE and SQL_TIMESTAMP respectively, and for OLE DB it’s DBTYPE_DBDATE and DBTYPE_DBTIMESTAMP. The new SQL types TIME and DATETIMEOFFSET were more difficult to express in some cases because the conceptual type didn’t already exist. While you could map TIME to TimeSpan in ADO.NET, Microsoft needed to invent new provider types in ODBC and OLE DB (SQL_SS_TIME2 and DBTYPE_DBTIME2) because existing types don’t support fractional-second precision. As you’d guess, these match their pre-existing types in every way with the addition of a fraction component matching the component that already exists in SQL_TIMESTAMP/DBTYPE_DBTIMESTAMP. The server type DATETIMEOFFSET is unique in that, in addition to containing all date/time parts of other types, it also includes a time-zone offset. To accommodate this type, Microsoft introduced new types across the board. The .NET Framework 3.5 release includes a new system type conveniently named DateTimeOffset while ODBC and OLE DB introduce SQL_TIMESTAMPOFFSET and DBTYPE_DBTIMESTAMPOFFSET; all of which should look familiar to people used to working with their non-time-zone-aware equivalents. These new types are first-class citizens in every way, with the goal of being better data-type alternatives for new application development and replacement options for application enhancements. In addition to introducing new data types and programming constructs, SQL Server 2008 also removes limitations on existing types to open the door to new application scenarios.

Breaking Barriers: Removing Size Limitations

Here is a close look at new support for large common language runtime (CLR) user-defined types (UDT) and


New Type          Maximum Accuracy    Date/Time Range
date              1 day               1-1-1 through 9999-12-31
time              100 nanoseconds     00:00:00 through 23:59:59.9999999
datetime2         100 nanoseconds     1-1-1 through 9999-12-31 23:59:59.9999999
datetimeoffset    100 nanoseconds     1-1-1 through 9999-12-31 23:59:59.9999999
Table 2: New SQL Server 2008 date and time data types along with their range and maximum precision.

the introduction of storing large object (LOB) column data transactionally in the file system.

Support for .NET Framework CLR UDTs first appeared in SQL Server 2005 with the goal of providing database application developers a more flexible programming model. UDTs gave developers a way to create complex types and also to express and encapsulate computationally expensive business logic alongside that data. Since the release of SQL Server 2005, customers have been adopting the technology and using it in interesting ways; however, the current 8,000-byte maximum size has limited the set of scenarios UDTs can address.

SQL Server 2008 removes this restriction allowing a CLR UDT to be “unlimited” length. In practice the storage size is actually limited at the SQL large object (LOB) limit of 2^31 - 1 bytes, or about 2 GB, in much the same fashion as the varbinary(max), varchar(max), and nvarchar(max) types that were introduced in SQL Server 2005. Theoretically, you can expose any .NET Framework CLR class or structure as a SQL CLR UDT as long as it meets several requirements involving constructors, annotations, and implementing a set of interfaces. Large UDTs in particular must implement a user-defined, serialization-to-bytes method that SQL Server relies on both when consuming parameters of that type from clients and when sending column result sets. A value of Format.UserDefined in the SqlUserDefinedTypeAttribute annotation indicates that the UDT has user-defined serialization and also introduces the requirement for the developer to implement the IBinarySerialize interface.

The analogy for other “max” types remains useful when describing how to take advantage of large versions of UDT in client applications. Integrating into applications currently using UDTs should be seamless, with only minor changes in metadata returned for columns and parameters of these types to allow discriminating applications to distinguish large versions from their less-than-8,000-byte counterparts. To see this difference, ADO.NET applications for result sets will invoke the SqlDataReader GetSchemaTable() method that returns a DataTable and examine the ColumnSize column where a value of “-1” indicates a large UDT. For parameters, they’ll examine or set the SqlParameter Size property where “-1” has the same meaning. In ODBC, you specify and distinguish large UDTs by the use of the SQL_SS_LENGTH_UNLIMITED macro originally introduced for “max” types. In OLE DB, you represent the large UDT column or parameter length as “~0”.

In addition to being able to fully utilize the functionality of the CLR UDT on both server and the application tier when working in ADO.NET, using user-defined serialization can allow native applications to access a UDT as a stream of bytes with a well-defined format. This, in turn, enables scenarios where applications interpret these bytes by parsing or by overlaying structure and provide similar business logic in middle tiers or client applications to what exists as CLR methods in the assembly. While these techniques can solve a range of scenarios involving structured data, a different limitation removal assists the growing number of document management-style applications.

On the unstructured data side of the fence, SQL Server 2008 also enables applications to store unstructured data directly on the file system, outside of the database file, leveraging the rich streaming APIs and performance of the Windows file system. Though accessible through the file system, the data also remains compatible with the T-SQL programming model. Using this dual programming model (T-SQL and Win32), applications can maintain transactional consistency between unstructured data stored in the file system and structured data stored in the relational database.

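As a minimal sketch of file-system storage that stays visible to T-SQL, the table below marks a varbinary(max) column with the FILESTREAM attribute and then reads and writes it with ordinary statements. The table and column names are hypothetical. The sketch also assumes the released SQL Server 2008 behavior, where FILESTREAM is enabled on the instance, the database has a FILESTREAM filegroup, and the table carries a unique ROWGUIDCOL column; the CTP described in this article may differ.

```sql
-- Hypothetical document table; Content is stored in the file system.
CREATE TABLE dbo.Documents (
    DocumentId uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    Title      nvarchar(200)  NOT NULL,
    Content    varbinary(max) FILESTREAM NULL);

-- Ordinary T-SQL DML works as it does for any varbinary(max) column.
INSERT INTO dbo.Documents (Title, Content)
VALUES (N'Readme', CAST('hello' AS varbinary(max)));

SELECT Title, DATALENGTH(Content) AS ContentBytes
FROM dbo.Documents
WHERE Title = N'Readme';
```

Because the column is still a varbinary(max) from the query processor's point of view, existing T-SQL code paths keep working while large values live outside the database file.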

You use this functionality, SQL Server Filestream, by adding a new storage attribute, FILESTREAM, on varbinary(max) columns, a type introduced in SQL Server 2005. The beauty of the feature is that other than removing the 2 GB LOB storage limitation, filestream columns work seamlessly in other operations, including data manipulation language (DML) operations like SELECT, INSERT, DELETE, and MERGE. In case you haven’t already heard, MERGE is yet another new SQL Server 2008 programmability feature that allows expressing multiple other DML operations in one statement, but that’s a detail for another day.

Getting back to Filestream, take a look at how this is exposed in different data access stacks. Because the only real change from the database application perspective is the difference in maximum size, existing programming patterns continue to work unchanged. Having said that, once you’ve made the transition to storing large data sizes, streaming data in your application both into and out of the server becomes more important for scalability. Even though coding patterns for streaming haven’t changed in SQL Server 2008, it’s worth doing a quick refresher. For ODBC, the application binds parameters using SQLBindParameter with ColumnSize set to SQL_SS_LENGTH_UNLIMITED and sets the contents of StrLen_or_IndPtr to SQL_DATA_AT_EXEC before it calls SQLExecDirect or SQLExecute. You must leave result columns unbound and retrieve them via SQLGetData. For OLE DB, the application uses DBTYPE_IUNKNOWN in parameter and result bindings. For parameters, the consumer must implement an ISequentialStream; for results, the provider returns an ISequentialStream interface. For optimal streaming performance, applications can use a Win32 file access API to read or write data using a UNC path returned by the new server built-in function PathName(), available from filestream columns. The advantage of streaming using the Win32 API over T-SQL grows as the data grows in size, and the benefit can be seen with data as small as 1 MB. SQL Server Native Client also includes a new method, OpenSqlFilestream, that combines and simplifies the operations of opening the file and associating it with an existing SQL Server transaction context using another new server built-in function, Get_filestream_transaction_context().

Building on New Foundations: System Types

Beyond the plumbing and infrastructure improvements discussed so far, SQL Server 2008 introduces two new CLR system types as out-of-the-box solutions for common application scenarios:

• HierarchyId. Allows database applications to model tree structures like organizations or file systems in a more efficient way than currently possible. The type includes a rich set of functions for answering questions about relationships between nodes in the tree and reorganizing the tree.
• Geometry. Implements the Open Geospatial Consortium (OGC) geometry type and encapsulates the entire OGC type hierarchy.

These CLR system types are available in every database and the CLR assemblies are available as a standalone redistributable package for applications to install with classes named SqlHierarchyId and SqlGeometry respectively. This allows .NET Framework applications in the middle tier or client to interact with the type like any other class. Non-.NET Framework applications will generally rely on server-side conversion or serialization to a documented well-known format. In the case of the Geometry type, applications will be able to retrieve the serialized bytes in a well-known format by using a new built-in function when issuing queries, choosing between STAsText() or STAsBinary(). These provide, respectively, the Open Geospatial Consortium (OGC) well-known text and well-known binary format. Alternatively, for all system types, developers can create user-defined functions on the server that operate on the data, providing access to the full functionality of the type.

Putting the Pieces Together

In summary, SQL Server 2008 introduces a new programming construct, table-valued parameters, to more rationally model scenarios where applications are operating on tables of data. Table-valued parameters provide better performance and scalability than existing solutions, particularly when you can implement server-side logic with set-based operations. Along with this new programming construct, new intrinsic data types more efficiently handle date and time data and will help developers reduce custom code in applications by choosing a type that closely matches their needs. This comes with the benefit of aligning database storage costs to those same needs. For existing types, SQL Server 2008 removes two limitations: lifting the 8,000-byte size limit on CLR user-defined types and allowing you to store and access binary large object columns through the file system. Building on enhanced CLR type support, SQL Server 2008 also introduces two system types to provide out-of-the-box solutions for planar spatial and hierarchical data. This is far from an exhaustive list of new programming features, but it should start to get you thinking about how you can get the most out of SQL Server 2008.

Vaughn Washington

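To make the first point of the summary concrete, here is a short sketch of calling the article's sp_AcceptOrder procedure from T-SQL with a table-valued parameter. It assumes the tt_OrderItems type and the procedure from the article already exist; the item IDs, quantities, and customer ID are made-up sample values.

```sql
-- Build the batch in a table variable of the table type defined in the article.
DECLARE @Items tt_OrderItems;

INSERT INTO @Items (ItemId, Quantity)
VALUES (101, 2),
       (205, 1),
       (330, 5);

-- Pass the whole batch in one call; the procedure sees it as a read-only table.
EXEC sp_AcceptOrder @CustomerId = 42, @OrderItems = @Items;
```

The whole batch crosses to the procedure in a single call, which is what lets the server-side logic stay set-based.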

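The new date and time types mentioned in the summary can be tried with a few declarations. This is a sketch only: the variable names, literal values, and the dbo.ShiftLog table are hypothetical, and the precision arguments are chosen just to show that storage accuracy is selectable per column.

```sql
-- Each variable uses one of the four new types; precision is user-defined.
DECLARE @d   date              = '2008-02-29';
DECLARE @t   time(3)           = '13:45:30.1234567';             -- kept to milliseconds
DECLARE @dt2 datetime2(7)      = '0001-01-01 00:00:00.0000001';  -- before 1753
DECLARE @dto datetimeoffset(0) = '2008-02-29 13:45:30 -08:00';

SELECT @d AS DateOnly, @t AS TimeOnly, @dt2 AS WideRange, @dto AS WithOffset,
       SWITCHOFFSET(@dto, '+00:00') AS SameInstantAtUtc;

-- Columns can be sized to the accuracy the application actually needs.
CREATE TABLE dbo.ShiftLog (
    ShiftDate  date              NOT NULL,
    StartTime  time(0)           NOT NULL,
    RecordedAt datetimeoffset(7) NOT NULL);
```

Choosing time(0) instead of time(7) for StartTime is the kind of tuning the article has in mind when it ties precision to storage cost.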

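The two system types from the summary can likewise be exercised directly in T-SQL. The method names below are those of the released SQL Server 2008 and may not match the CTP this article was written against; the variable names and the sample polygon are assumptions for illustration.

```sql
-- hierarchyid: create a root node and its first child, then compare them.
DECLARE @root  hierarchyid = hierarchyid::GetRoot();
DECLARE @child hierarchyid = @root.GetDescendant(NULL, NULL);

SELECT @root.ToString()             AS RootPath,
       @child.ToString()            AS ChildPath,
       @child.IsDescendantOf(@root) AS IsUnderRoot;

-- geometry: build a rectangle from OGC well-known text and read it back
-- in the two well-known formats the article mentions.
DECLARE @g geometry =
    geometry::STGeomFromText('POLYGON((0 0, 4 0, 4 3, 0 3, 0 0))', 0);

SELECT @g.STArea()     AS Area,
       @g.STAsText()   AS WellKnownText,
       @g.STAsBinary() AS WellKnownBinary;
```

STAsText() and STAsBinary() are the calls a non-.NET client would rely on to receive the value in a documented format.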
ONLINE QUICK ID 0712132

Use SQL CLR 2.0: Advancing CLR Integration in SQL Server 2008

The integration of the .NET Common Language Runtime (CLR) inside SQL Server 2005 (SQL CLR 1.0) enabled database programmers to write business logic in the form of functions, stored procedures, triggers, data types, and aggregates using modern .NET programming languages. This article presents the advances to the CLR integration introduced in SQL Server 2008, which significantly enhance the kinds of applications supported by SQL Server.

José Blakeley
joseb@microsoft.com

José Blakeley is a software architect in the SQL Server division at Microsoft Corporation working with the SQL Server Engine team. Previously he was lead architect in the Data Programmability team and helped to build the ADO.NET Entity. José was a lead designer of the SQLCLR integration in SQL Server 2005 and has contributed to numerous programmability and extensibility features in various SQL Server releases. Before joining Microsoft in 1994, José was a Member of the Technical Staff at the Computer Science Laboratory at Texas Instruments where he was a principal investigator in the development of DARPA Open-OODB, an object-oriented database system. He has over 20 granted or pending patents. José received a computer systems engineering degree from ITESM, Monterrey, Mexico, and M.Math and Ph.D. degrees in computer science from the University of Waterloo, Canada.

Fast Facts
Geographical information is rapidly becoming mainstream to many business applications. SQL Server 2008 provides new spatial data types for developers to build location-aware applications.

In particular, this article describes the support for large (greater than 8000 bytes) user-defined types and aggregates, multiple-input user-defined aggregates, and order-aware table-valued functions. The CLR integration in SQL Server 2008 will also leverage the database scalar type extensibility introduced in SQL CLR 1.0 to provide a hierarchical identifier data type to enable encoding of keys describing hierarchies (folders, inheritance, etc.) as well as a built-in framework for spatial applications. This framework includes a class library for geometry types based on the Open Geospatial Consortium for both flat earth and round earth solutions, as well as a spatial index.

Large UDTs

SQL Server 2005 introduced user-defined types (UDTs) as a mechanism to extend the scalar type system of SQL. UDTs are not a general object-relational (O-R) mapping mechanism. That is, you should not use UDTs to model complex business objects such as employee, contact, customer, or product. For more general O-R mapping mechanisms, consider the LINQ to SQL or LINQ to Entities frameworks available in Visual Studio 2008. Microsoft designed SQL Server UDTs, implemented as CLR classes or structs, to model atomic scalar types such as monetary currency, geometry and spatial types (see section on spatial types), specialized date or time, etc.

To persistently store UDT instances, it must be possible to transform the state of the instance to and from its internal binary format. You can achieve this through a pair of routines for serialization and deserialization. The serialization routine transforms the run-time state of a CLR class or struct instance to a binary stream. The deserialization routine performs the inverse operation. SQL Server supports two forms of serialization: native (Format.Native) and user-defined (Format.UserDefined). All forms of UDTs in SQL Server 2005 have a size limit of 8000 bytes. SQL Server 2008 removes the size restriction for Format.UserDefined UDTs, allowing UDTs up to 2 GB.

Listing 1 shows a code fragment describing the definition of a Polygon type as a large UDT. To define a UDT as large, one only needs to specify a value of -1 for the MaxByteSize property of the SqlUserDefinedType custom attribute.

Handling User-defined Aggregates

A very welcome extensibility feature in SQL Server 2005 is the ability to implement user-defined aggregate functions. As with any such feature, the usage scenarios expand very quickly beyond those initially envisioned and require the platform to support additional behaviors and capabilities. SQL Server 2008 expands user-defined aggregates to support multiple input arguments as well as the ability to have results for the aggregate be larger than 8000 bytes.

Customers have frequently asked the SQL Server team to add support for a string concatenation aggregate function. To explain the request, a customer would typically describe how they’d like to stitch together a set of e-mail addresses with a semicolon as a delimiter. SQL Server 2008’s user-defined aggregate function addresses this request. The accumulate method of the aggregate class may look something like:
public void Accumulate (SqlString Value)
{
    if (!Value.IsNull)
        result += Value.Value + ";";
}

If you tried this approach in SQL Server 2005 you quickly realized that the resulting concatenated string is constrained to an 8000-byte size limit. SQL Server 2008 doesn’t have this restriction and allows you to have an aggregate as large as 2 GB. In addition, you do not need to change the managed code (C# in the snippet above); you only need to change the T-SQL registration. The example below specifies nvarchar(max) as the return type of the aggregate function:

CREATE AGGREGATE dbo.concat
(@Value nvarchar(4000))
RETURNS nvarchar(max)
EXTERNAL NAME [concatProject].[concat];

You can use this aggregate in a query as you would use any other aggregate function:

SELECT dbo.concat(email) FROM dbo.attendees
WHERE …

Imagine now a scenario where the format of each concatenated value depends on some other e-mail-specific flags included in a separate column of the attendees table. SQL Server 2008 will let you create aggregates that take multiple columns as input as in the following query:

SELECT dbo.concat (email, format_flags)
FROM dbo.attendees WHERE …

This capability makes it simple to write more powerful aggregates such as weighted averages or other statistical functions such as median.

Christian Kleinerman
Chirstian.Kleinerman@microsoft.com
Christian Kleinerman has over 12 years of experience working with SQL Server, the last seven as part of the SQL Server development team in Redmond, WA. He is currently the Group Program Manager of the Relational Engine team where he has worked on a variety of features and technologies. Before Microsoft he worked on eCommerce Web site development and previously was co-founder of a development company specializing in scheduling software.

Order-Aware TVFs

One of the most powerful extensibility features enabled by the CLR integration in SQL Server 2005 is the ability to write table-valued functions in any .NET language. These are functions that have an initializer method to set up any necessary context and provide an enumerator and a method that is invoked once per row. You can see a canonical example in Listing 2, which returns entries in the Event Log as a rowset in SQL Server.

Imagine now a query that wants to retrieve the first few (e.g., 10) records from the log in chronological order:

SELECT TOP 10 * FROM dbo.InitMethod('System')
ORDER BY timeWritten

If you examine the execution plan for this query, you will notice that the optimizer introduces a sort operator before returning the results so it can guarantee the top events given the particular requested order (Figure 1).

You can see from the execution plan that a large portion of the estimated execution cost goes into the sort operation. However, what happens if the table-valued function is already returning the data in the right order? The sort operation introduced by the optimizer is unnecessary. SQL Server 2008 intro-

Listing 1: Definition of a Polygon type as a large UDT

using System;
using System.Data.SqlTypes;
using System.IO;
using Microsoft.SqlServer.Server;

[Serializable]
[SqlUserDefinedType(Format.UserDefined, IsByteOrdered = false,
    MaxByteSize = -1, IsFixedLength = false)]
public class Polygon : INullable, IBinarySerialize
{
    public Polygon(SqlString value) {…}

    public bool IsNull {…}

    public static Polygon Null {…}

    // Custom binary serialization required by Format.UserDefined.
    public void Read(BinaryReader r) {…}

    public void Write(BinaryWriter w) {…}

    [SqlMethod(InvokeIfReceiverIsNull = true)]
    [return: SqlFacet(MaxSize = -1)]
    public string ToString() {…}

    public static Polygon Parse(SqlString s) {…}
}

Listing 2: A table-valued function that returns Event Log entries as a rowset

using System;
using System.Collections;
using System.Data.SqlTypes;
using System.Diagnostics;
using Microsoft.SqlServer.Server;

public class TabularEventLog
{
    // Initializer: returns the enumerable whose items become rows.
    [SqlFunction(FillRowMethodName = "FillRow",
        TableDefinition = "timeWritten datetime, " +
                          "message nvarchar(2000), " +
                          "category nvarchar(500), " +
                          "instanceId bigint")]
    public static IEnumerable InitMethod(String logname)
    {
        return new EventLog(logname, Environment.MachineName).Entries;
    }

    // Invoked once per row to split an entry into column values.
    public static void FillRow(Object obj, out SqlDateTime timeWritten,
        out SqlChars message, out SqlChars category, out long instanceId)
    {
        EventLogEntry eventLogEntry = (EventLogEntry)obj;
        timeWritten = new SqlDateTime(eventLogEntry.TimeWritten);
        message = new SqlChars(eventLogEntry.Message);
        category = new SqlChars(eventLogEntry.Category);
        instanceId = eventLogEntry.InstanceId;
    }
}
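
The cost of a sort that turns out to be unnecessary can be illustrated outside SQL Server. The Python below is only a sketch of the idea, not a description of how the optimizer works: `event_rows` is a hypothetical stand-in for the row source in Listing 2, and `CountingSource` is a helper invented here to count how many rows a consumer pulls.

```python
from itertools import islice


def event_rows(count):
    """Hypothetical row source yielding (timeWritten, message) in time order."""
    for t in range(count):
        yield (t, f"event {t}")


class CountingSource:
    """Wraps a row source and counts how many rows the consumer pulled."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.pulled = 0

    def __iter__(self):
        return self

    def __next__(self):
        row = next(self._rows)
        self.pulled += 1
        return row


def top_n_with_sort(rows, n):
    # Order unknown: every row is read and sorted before the first n are known.
    return sorted(rows, key=lambda row: row[0])[:n]


def top_n_presorted(rows, n):
    # Order promised by the producer: read n rows and stop.
    return list(islice(rows, n))
```

Both functions return the same rows for a source that is already in time order, but the second one stops reading after `n` rows, which is the kind of saving a declared ordering makes possible.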

duces an additional ORDER clause in the CREATE FUNCTION statement that tells the optimizer the expected ordering of the results from the table-valued function:

CREATE FUNCTION [dbo].[InitMethod](@logname [nvarchar](4000))
RETURNS TABLE (
    [timeWritten] [datetime] NULL,
    [message] [nvarchar](max) NULL,
    [category] [nvarchar](4000) NULL,
    [instanceId] [bigint] NULL
) WITH EXECUTE AS CALLER
ORDER ([timeWritten])
AS
EXTERNAL NAME [SqlServerProject1].[TabularEventLog].[InitMethod]

Note how with this small enhancement, the execution plan for the same query shows no sort operation and the bulk of the cost happens in the CLR function itself (Figure 2).

Figure 1: The optimizer introduces a sort operator before returning the results so it can guarantee the top events given the particular requested order.

Figure 2: With ordered TVFs the optimizer does not need to introduce a sort operator to the execution plan.

References
Balaji Rathakrishnan, et al. Using CLR Integration in SQL Server 2005.
Alazel Acheson, et al. Hosting the .NET Runtime in Microsoft SQL Server, ACM SIGMOD Conf. 2004.

Spatial Framework

Even though location is a concept prevalent in everything that people do, in recent years location awareness has become an integral part of everyday software applications. With the proliferation of mapping frameworks, GPS, and other location-aware devices, it is increasingly common to encounter spatial data, which often needs to be stored, queried, and reasoned upon. SQL Server 2008 leverages the CLR integration to provide native spatial data support.

Many business applications have some form of location data: sales regions, delivery routes, factory or point of sale locations, or employee addresses. Not only do business managers often require that applications store this information in a database, but business managers also commonly desire the ability to run queries that make use of the spatial semantics: What is the average distance that customers have to the nearest point of sale location? How do sales for a region compare to those from adjacent regions?

SQL Server 2008 supports storing and querying of geospatial data, that is, location data referenced to the earth. Without going into too much detail, two common models of this data are the planar and geodetic coordinate systems. The main distinction between these two systems is that the latter takes into account the curvature of the earth. SQL Server 2008 introduces two new data types: geometry and geography, which correspond to the planar and geodetic models.

SQL Server implements these data types as CLR types leveraging the User Defined Type infrastructure available in SQL Server 2005. Microsoft registers and includes these types in every installation of SQL Server, making it easy to create columns of either of these types on a table:

CREATE TABLE points_of_sale
(id int, name nvarchar(50), location geometry);

The Open Geospatial Consortium (OGC) defines a canonical textual representation for a geometry known as the Well-Known Text (WKT). You can use the WKT to insert data into the points_of_sale table by using a static method that converts from WKT to an actual instance:

INSERT INTO points_of_sale (id, name, location)
VALUES (1000, N'Main Store',
    geometry::STPointFromText('POINT (50 50)', 0));

You can later write queries against the table created above to answer business questions. For example, assume that you have a geometry variable that holds a polygon representing a given urban area. You can then write a query to retrieve from the database all points of sale that fall within the given urban area:

SELECT id, name
FROM points_of_sale
WHERE location.STIntersects(@urban_area) = 1;

At the same time, SQL Server 2008 provides new spatial indexing capabilities to speed up processing of queries involving spatial operations. As an example, the following statement would create an index on the table above:

CREATE SPATIAL INDEX SIndx_points_of_sale
ON points_of_sale(location)
WITH ( BOUNDING_BOX = ( 0, 0, 500, 500 ) );

The geometry data type works well for operations performed on instances spanning small areas. When operating on larger distances, you must take into account the curvature of the earth and this is when the geography data type comes into play. This is implemented as a separate CLR class. Even though the usage is very similar to that of the planar data type, methods invoked on the geodetic type operate on an ellipsoidal model of the earth.
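
Why the two models give different answers can be seen with a small calculation. The Python below is an illustrative sketch, not SQL Server's implementation: it treats the earth as a sphere (a simplification of the ellipsoidal model mentioned above), and the function names and the mean-radius constant are assumptions introduced for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean radius; a sphere, not an ellipsoid


def planar_distance(p, q):
    """Euclidean distance between (x, y) points in a flat coordinate system."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def great_circle_km(p, q):
    """Haversine distance between (longitude, latitude) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude at the equator and at 60 degrees north.
at_equator = ((0.0, 0.0), (1.0, 0.0))
at_60_north = ((0.0, 60.0), (1.0, 60.0))
```

Treated as flat coordinates, both pairs are exactly one unit apart. On the sphere, the pair at 60 degrees north is only about half as far apart as the pair at the equator, which is the kind of discrepancy that grows with the area covered.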

Leveraging the integration of the CLR into the database engine, SQL Server 2008 introduces spatial data support, which will make development of location-aware applications become mainstream.

Hierarchical Identifiers

SQL Server 2008 introduces the HierarchyId type, designed to make it easier to store and query hierarchical data. Hierarchical data is defined as a set of data items related to one another by hierarchical relationships, that is, relationships where one item of data is the parent of another item. Common examples include: an organizational structure, a hierarchical file system, a set of tasks in a project, a taxonomy of language terms, a single-inheritance type hierarchy, part-subpart relationships, and a graph of links among Web pages.

Consider the example organizational hierarchy in Figure 3.

Figure 3: An organizational hierarchy.

The number sequence indicated at the upper-right corner of every box represents a hierarchical identifier for the corresponding department. One could model this organization using the following table and index:

CREATE TABLE organization (
    node hierarchyid primary key clustered,
    level as node.GetLevel() persisted,
    empid int unique,
    name nvarchar(100)
    …
)
CREATE UNIQUE INDEX org_idx ON organization (level, node)

Inserting the CEO record:

INSERT organization (node, empid, name) VALUES
    (hierarchyid::GetRoot(), 123, 'Frank Smith')

Adding an employee reporting to a manager:

CREATE PROC AddEmp (@mgrid int, @empid int, @name nvarchar(100) ) AS BEGIN
    DECLARE @mnode hierarchyid, @lc hierarchyid
    SELECT @mnode = node FROM organization WHERE empid = @mgrid
    BEGIN TRANSACTION
        SELECT @lc = max(node) FROM organization
            WHERE @mnode = node.GetAncestor(1)
        INSERT organization (node, empid, name)
            VALUES (@mnode.GetDescendant(@lc, NULL), @empid, @name)
    COMMIT
END

Microsoft implemented the HierarchyId built-in SQL scalar type as a CLR UDT. Table 1 lists some of the key methods exposed by this type.

You can expect that this new data type will enable the efficient implementation of applications that access hierarchically organized data.

Summary

SQL Server 2008 builds on the CLR integration and scalar data type extensibility foundation laid out in SQL Server 2005. Microsoft has extended the UDT capabilities by allowing UDTs of size up to 2 GB. The SQL Server team have also illustrated two new capabilities, the spatial data type framework and the hierarchical id, built on the scalar UDT extensibility mechanism introduced in SQL Server 2005. These new capabilities will significantly enhance the kinds of applications supported by SQL Server. Other capabilities not described in this paper include often-requested enhancements such as the support for INullable<T> for describing parameters to functions and procedures and table-valued parameters.

José Blakeley
Christian Kleinerman

Data Platform Team

Peter Spiro
“I didn’t get my first real corporate job till age 29…and I think that’s a good thing to do, because when you do that, you don’t get burned out at 40.”
From a forestry degree to making charcoal in the Peace Corps, Peter is now one of 14 Technical Fellows at Microsoft. Recruited to Microsoft from DEC in 1994, Peter was brought in to help re-engineer the database management system and move it to the next level.
“So just building this world-class database was the first thing we did. Now we are moving from a database to a suite of products, building a better database for the cloud, building better management capabilities, building synchronization technologies, building in-memory caching and mapping technologies. It’s not a core database anymore; it’s a collection of services on top of the database that create a database platform.”

Method | Description
static SqlHierarchyId GetRoot() | Returns the root of the hierarchy type.
SqlHierarchyId GetDescendant(SqlHierarchyId child1, SqlHierarchyId child2) | Returns a child Id x such that child1 < x and child2 > x.
SqlBoolean IsDescendant(SqlHierarchyId child) | Returns true if child is >= this.
SqlInt16 GetLevel() | Returns an integer representing the depth of the node this in the tree. Returns 0 if this is the root.
SqlHierarchyId GetAncestor(int n) | Returns a hierarchical id representing the nth ancestor of this.

Table 1: A list of some of the key methods exposed by CLR UDT.
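
To make the methods in Table 1 more concrete, here is a rough Python model of a hierarchical identifier. It is a sketch under stated assumptions, not the HierarchyId implementation: nodes are plain tuples of integers rather than the type's binary encoding, all function names are invented for this example, and `get_descendant` only appends after the last existing child instead of generating a value between two arbitrary siblings.

```python
def get_root():
    """The root node is modeled as the empty path."""
    return ()


def get_level(node):
    """Depth of the node; 0 for the root."""
    return len(node)


def get_ancestor(node, n):
    """The nth ancestor of node, or None if node is not that deep."""
    if n > len(node):
        return None
    return node[:len(node) - n]


def is_descendant_of(node, ancestor):
    """True if ancestor's path is a prefix of node's path."""
    return node[:len(ancestor)] == ancestor


def get_descendant(parent, last_child=None):
    """A new child of parent that sorts after last_child (simplified)."""
    if last_child is None:
        return parent + (1,)
    if get_ancestor(last_child, 1) != parent:
        raise ValueError("last_child is not a direct child of parent")
    return parent + (last_child[-1] + 1,)


def add_employee(org, manager_node, name):
    """Mirror of the AddEmp idea: place a new node after the last child."""
    children = [n for n in org if get_ancestor(n, 1) == manager_node]
    last_child = max(children) if children else None
    node = get_descendant(manager_node, last_child)
    org[node] = name
    return node
```

Starting from `org = {get_root(): "Frank Smith"}`, repeated calls to `add_employee` hand out sibling paths in increasing order, and sorting the keys of `org` lists each manager immediately before that manager's reports.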

ONLINE QUICK ID 0712142

Visual Studio 2008: RAD Gets RADer

Visual Studio 2008 is all about making it easier for developers and development teams to create software for the latest and greatest platforms with technologies such as .NET Language Integrated Query (LINQ), ASP.NET AJAX, and the Windows Presentation Foundation (WPF) designer, to name just a few. In this brief article I will highlight just a few of the code editing and designer improvements that are new to Visual Studio 2008.

Jonathan Wells
blog.onoj.net
Jonathan is currently a product manager in Microsoft’s Developer Division and focuses on Visual Studio 2008. During his seven years at Microsoft, Jonathan has served as a Software Design Engineer / Test (C# team), .NET Compact Framework product manager, and as an Architect Evangelist.

Fast Facts
Many of the new designer and code editing capabilities introduced in Visual Studio 2008 are also in the free and lightweight Visual Studio Express Editions.

LINQ and IntelliSense

Integrating query syntax into Visual Basic and Visual C# enables functionality such as statement completion and IntelliSense when working with data queries. The approach for querying data with LINQ is the same whether you are accessing an in-memory list, XML data, a SQL database or any combination of these together. In addition to the design-time assistance provided by statement completion, smart compile auto correction (that squiggle under syntax errors) and IntelliSense, LINQ enables compile-time validation of queries so that you catch potential bugs before your code is executed.

LINQ’s expressive style enables a cleaner code base which is ultimately easier to understand and maintain.

Working with JavaScript

Visual Studio 2008 includes IntelliSense support for JavaScript, which will improve the ASP.NET AJAX development experience. IntelliSense greatly improves the discoverability of variables, objects, and their methods without having to context switch by ALT-TAB’ing to some form of documentation. Speaking of context switch, a simple but neat new feature that pertains to any IntelliSense dropdown is IntelliSense Transparency Mode. In the past, developers had to escape out of IntelliSense to see the code under the dropdown. Visual Studio 2008 lets you make the dropdown semi-transparent by holding the Ctrl key.

JavaScript debugging is sure to be a popular feature with Web developers and is available in Visual Web Developer 2008 Express Edition, which is free. The JavaScript debugger is fully loaded with the debugging features that developers have come to expect of Visual Studio. You need not upgrade your existing applications to Visual Studio 2008 and the .NET Framework 3.5 to take advantage of the debugger, or indeed any of the new Web designer features. Multi-targeting allows you to open existing projects that target version 2.0 or 3.0 of the .NET Framework in Visual Studio 2008 so that you can take advantage of the new design and code editing capabilities without having to target a new framework version.

Web Designer

Visual Studio 2008 uses the same Web designer that ships with the Microsoft Expression products. The result is a snappier Web development experience and a swag of new designer features. The same multi-targeting capability applies to the new Web designer capabilities.

Visual Web Developer 2008 Express Edition now supports split-view editing allowing simultaneous display of HTML source and the design view. Changes made in one view are immediately displayed in the other.

Working with Cascading Style Sheets (CSS) is now a whole lot easier with the addition of CSS Style Manager, CSS properties windows, and CSS Source View IntelliSense. These features reduce the effort required to understand and manage your project’s CSS formatting rules.

Other Improvements Worth Checking Out

Client developers should check out the new WPF designer and project support. Both Windows Forms and WPF application types can now take advantage of ASP.NET Application Services (Membership, Roles, Profile) for roaming smart client user data. Developing solutions for Microsoft Office is much richer and now includes the ability to integrate with the Ribbon. Practitioners of agile software development methodologies and distributed teams will be pleased to discover that Continuous Integration is now built into Visual Studio Team System 2008 Team Foundation Server.

You can find more information about the many improvements and features in Visual Studio 2008 by visiting the MSDN Visual Studio Developer Center: http://msdn.microsoft.com/vstudio.

Jonathan Wells

Visual Studio 2008 Beta 2
This beta includes a Go-Live license so that you can start taking advantage of Visual Studio 2008’s new features today. With multi-targeting you can use Visual Studio 2008’s new designer and code editing features with projects that target .NET Framework version 2.0 and 3.0; today!

68 Visual Studio 2008: RAD Gets RADer www.code-magazine.com


ONLINE QUICK ID 0712152

The Data Dude Meets Team Build

“Integrate the data tier developer into the core development life cycle and process.” That is one of the main objectives of Visual Studio Team Edition for Database Professionals, also known under its project name “Data Dude.” Bringing the data tier developer into Visual Studio is the first step in enabling closer integration between the application and data tier developer. Having both environments leverage the same Team Foundation Build (Team Build) system enables daily and automatic integration of changes into the build process, enforcing closer integration and shorter feedback cycles between the two originally disjoint disciplines.

Gert Drapers
gert.drapers@microsoft.com
Gert Drapers is the Software Architect and Development Manager of the “Data Dude” project. Gert has been with Microsoft since 1991 where he has worked in various capacities on SQL Server from version 7.0 to SQL Server 2005. He also created System.Transactions.

Fast Facts
Visual Studio Team Edition for Database Professionals adds new capabilities to help with the configuration management, tracking, collaboration, and testing of your SQL Server databases.

Visual Studio Team Edition for Database Professionals (VSDBPro) enables you to manage your SQL Server database schema (definition) the same way as you handle your application projects and source code. It starts by representing your database schema inside the newly added Database Project (.dbproj) (see Figure 1). It does so through a collection of T-SQL DDL fragments. Each fragment is stored in a single .SQL file and represents a single schema artifact, for example a table, queue, constraint or procedure definition. Because fragments only represent a single schema object, tracking changes and versioning these schema objects is a lot simpler and more precise.

After you have established the Database Project you have two views: The Solution Explorer view presents the physical layout of all the directories and files used to store the objects inside the project (Figure 2). The Schema View provides a logical representation of the complete schema organized by schema and/or object type (Figure 3).

Build & Deploy

Now that you have a collection of .SQL fragments, how do you go about deploying your schema?

This is where “build” comes in. When you build a Database Project, the build engine takes all the fragments inside the project and compares them with the schema inside the target database.

Build is based on a difference-based build engine (Figure 4). The build engine takes two inputs: the project state, which is “what you want,” and the target database, which is “what you have.” The comparison results in a set of schema objects that are different, which are sorted to reflect the correct dependency order. The build engine determined this dependency information during the load of the project by parsing and interpreting all DDL fragments for the lifetime of the project. Finally, the build engine generates a deployment script in the form of a .SQL script which contains a set of T-SQL DDL statements to incrementally update the schema on the target server/database. This alleviates the need to manually create and/or maintain incremental update scripts to keep your database schema up-to-date.
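
The compare-then-order idea behind a difference-based build can be sketched in a few lines. The Python below is a simplified illustration, not the VSDBPro build engine: object definitions are plain strings, the dependency map is supplied by hand rather than derived by parsing DDL, objects that exist only in the target are ignored, and the emitted "CREATE"/"ALTER" lines are labels rather than real T-SQL. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def plan_deployment(project, target, depends_on):
    """Return ordered steps that move `target` toward `project`.

    project, target: dict mapping object name -> definition text
                     ("what you want" and "what you have").
    depends_on:      dict mapping object name -> set of names it depends on.
    """
    # Step 1: keep only objects that are new or whose definition differs.
    changed = {name for name, definition in project.items()
               if target.get(name) != definition}

    # Step 2: order the changed objects so dependencies come first.
    graph = {name: depends_on.get(name, set()) & changed
             for name in sorted(changed)}
    ordered = TopologicalSorter(graph).static_order()

    # Step 3: emit one step per object, in dependency order.
    steps = []
    for name in ordered:
        action = "ALTER" if name in target else "CREATE"
        steps.append(f"{action} {name}")
    return steps
```

Objects whose definitions already match the target produce no step at all, which is what keeps the resulting script incremental.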

Using the incremental deployment script you can update the target database, using the Deploy command inside Visual Studio.

Figure 1: Adding a New Project to SCC.

The Truth Is in the Project

The big change is that the Database Project is now the authoritative source of the schema definition,

70 The Data Dude Meets Team Build www.code-magazine.com


not the database instance(s) the schema is deployed to. Since the Database Project has become the center of truth for our database schema, placing the project under source code control enables the next step in our crusade of integrating the data tier into the development life cycle: versioning of the schema objects (Figure 5).

This enables placing the project under source code control (SCC), the versioning of the individual schema objects, and the tracking of the changes of these objects. It also enables alignment in versioning between the application and the data tier. Both can use the same labels to indicate versions at a given point in time and use branching when projects are derived or merged.

So when a developer syncs his or her sources to a certain label inside source code control, this contains the definition of the state of the database schema as it was used at that point in time when the label was created.

Now that you understand how the individual projects work, let me switch gears and focus on how you can integrate this into the team environment.

Figure 2: Solution Explorer (AdventureWorksDB).

Figure 3: Schema View (AdventureWorksDB).

Figure 4: Difference Based Build.

MSBuild Is Your Friend

Like most Visual Studio-based project systems, the core tasks inside the projects like compilation and linkage are implemented as MSBuild tasks. The same is true for VSDBPro. The two core activities for Database Projects (.dbproj), “Build” and “Deploy” are implemented by two MSBuild tasks named “SqlBuildTask” and “SqlDeployTask.”

All VSDBPro-related MSBuild tasks are defined in the .targets file located in:

%ProgramFiles%\MSBuild\Microsoft\VisualStudio\v8.0\TeamData\Microsoft.VisualStudio.TeamSystem.Data.Tasks.targets

The properties exposed by the MSBuild tasks are documented via an accompanying XSD file located in:

%ProgramFiles%\Microsoft Visual Studio 8\Xml\Schemas\1033\MSBuild\Microsoft.VisualStudio.TeamSystem.Data.Tasks.xsd

Any project file that wants to include the functionality offered by the build task includes the .targets file; the .XSD file is used to provide IntelliSense support when you edit the project file inside the XML Editor in Visual Studio.

Building from the Command Line

The biggest benefit of all this? You can build and deploy your Database projects from the command line. When needed you can use the command line to override settings and use MSBuild to automate almost all possible tasks in a normal build process, which now can include building and deploying your database schemas.

To build from the command line you simply start a Visual Studio Command Prompt (this makes sure MSBuild.exe is in the PATH), navigate to the directory where your project file is stored, and execute the following statement:

CD C:\src\AdventureWorksDB <enter>

MSBuild AdventureWorksDB.dbproj /t:build <enter>

This will use all the property settings as defined inside the (.dbproj) project file and use them to build the project. If you need to override some properties you simply pass them to MSBuild.exe via the command line as name-value pairs using the property parameter.

CD C:\src\AdventureWorksDB <enter>

MSBuild AdventureWorksDB.dbproj /t:build
    /p:TargetDatabase="MyAdventureWorks"
    /p:TargetConnectionString="Data Source=(local)\sql90;User=sa;Password=MySecret99;Pooling=False;" <enter>

This statement will change the target database name to “MyAdventureWorks” and use your local SQL Server instance named SQL90 and connect using standard SQL Server authentication.

The result of the build is a .SQL file, which by default is named the same as your project and lives in the sql directory under the location where your project file is located.

To deploy the resulting build script you change the target of the project to /t:Deploy:

CD C:\src\AdventureWorksDB <enter>

MSBuild AdventureWorksDB.dbproj /t:deploy <enter>
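
When these command lines are generated by a script rather than typed, the pattern is a target switch (`/t:`) followed by zero or more property switches (`/p:`). The Python helper below is a hypothetical sketch that only composes the text of such a command in the form printed in this article; the function name is invented, it does not run MSBuild, and it makes no attempt to handle shell-specific escaping.

```python
def msbuild_command(project_file, target, properties=None):
    """Compose an MSBuild command line in the form used in this article."""
    parts = ["MSBuild", project_file, f"/t:{target}"]
    for name, value in (properties or {}).items():
        # Values are wrapped in double quotes, as the article does for
        # values that contain spaces or semicolons.
        parts.append(f'/p:{name}="{value}"')
    return " ".join(parts)


command = msbuild_command(
    "AdventureWorksDB.dbproj",
    "build",
    {"TargetDatabase": "MyAdventureWorks"},
)
```

Keeping the overrides in a dictionary makes it easy to see, in one place, which project settings a given automated build deviates from.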



Figure 5: Initial Check-in.

You can combine the build and deploy into a single invocation using:

CD C:\src\AdventureWorksDB <enter>

MSBuild AdventureWorksDB.dbproj /t:build;deploy <enter>

MSBuild extracts all information needed to execute the tasks from the project files and allows you to override property values via the command line. The Database project file is split in two pieces: project settings, which are stored in the .dbproj file, and user settings, which are stored in the .dbproj.user file. Both these files are aggregated into a single file image that represents the project and is passed on to the tasks at execution time. You can see the partial content of the AdventureWorksDB.dbproj.user file in Listing 1. It contains properties with their respective values that are specific to the user. In general it contains settings that you want the user to be able to change for himself or herself, but not affect other users of the project.

Listing 1: Partial AdventureWorksDB.dbproj.user

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition=" '$(Configuration)' == 'Default' ">
    <TargetDatabase>AdventureWorksDB</TargetDatabase>
    <GenerateDropsIfNotInProject>False</GenerateDropsIfNotInProject>
    <DefaultDataPath>D:\DATA\</DefaultDataPath>
    <TargetConnectionString>Data Source=TESTHOST\SQL90QA;Integrated Security=True;Pooling=False</TargetConnectionString>
    <AlwaysCreateNewDatabase>False</AlwaysCreateNewDatabase>
    <BlockIncrementalDeploymentIfDataLoss>True</BlockIncrementalDeploymentIfDataLoss>
    <PerformDatabaseBackup>False</PerformDatabaseBackup>
  </PropertyGroup>
</Project>

An example of a user setting is the “AlwaysCreateNewDatabase” property which indicates that you always want to drop and recreate your database when it is being deployed.

NOTE: The .dbproj.user file should not be checked in!

In the next section, Team Build, you’ll learn that you need to be aware of which properties are defined inside the .dbproj.user file when you start integrating Database Projects into a Team Build environment.
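
The layering of project settings, user settings, and command-line overrides can be pictured as successive dictionary updates. The Python below is a sketch under assumptions, not MSBuild's evaluation logic: it ignores `Condition` attributes, it assumes user settings take precedence over project settings, and the function names are invented for this example. Command-line properties are applied last because MSBuild gives `/p:` values priority over values in the project file.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def read_properties(project_xml):
    """Collect the children of every PropertyGroup as a name -> value dict."""
    root = ET.fromstring(project_xml)
    properties = {}
    for group in root.iter(MSBUILD_NS + "PropertyGroup"):
        for element in group:
            name = element.tag.replace(MSBUILD_NS, "")
            properties[name] = (element.text or "").strip()
    return properties


def effective_properties(project_xml, user_xml, command_line=None):
    """Merge the three sources, later sources overriding earlier ones."""
    merged = read_properties(project_xml)
    merged.update(read_properties(user_xml))  # assumption: user settings win
    merged.update(command_line or {})         # /p: overrides win over both
    return merged
```

A build server that has no .dbproj.user file would see only the first and last layers, which is why properties that live solely in the user file matter in the next section.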



Team Build

Now you’ll learn to integrate a Database Project into Team Build. The first thing we need to do is add a new Team Build Type, which we will do using the “New Team Build Type Creation Wizard” (Figure 6). This wizard will guide you through the process of creating a Team Build type in six steps. The wizard will ask you to provide the name of the type; which solution you want to build; the configuration section to use; some location settings, and some other options.

Figure 7 shows where you will select the solution. Figure 8 shows the most important page in the wizard: the Configurations page. Here you would normally choose between Debug and Release, but in this case the Database Project does not implement either of these two configurations; it uses the Default configuration, which is what you need to use when defining the build type.

In the fourth page (Figure 9) you specify the relative locations that Team Build will use. For all other options in this article I will use the default values provided by the wizard.

Figure 6: New Team Build Type Creation Wizard page 1.

Figure 7: New Team Build Type Creation Wizard page 2.

Figure 8: New Team Build Type Creation Wizard page 3.

Figure 9: New Team Build Type Creation Wizard page 4.

With a completed Team Build Type you can right-click on the build type inside Team Explorer to build the project. Figure 10 shows the Build dialog box.

When you click “Build” you will see the Team Build page which displays the progress of the build process. The first time you build a database project following the steps laid out in this article, the build will fail! (Figure 11)

The best way to troubleshoot this is to look at the build output placed in the drop location, which you specified for the Team Build Type when you created it. Since build output is shared between users, it helps to make the drop location available through a share name, which in this example is \\mi-svr\drops.

dir \\mi-svr\drops\AdventureWorksDB_20070810.1 /s /b

\\mi-svr\drops\AdventureWorksDB_20070810.1\BuildLog.txt
\\mi-svr\drops\AdventureWorksDB_20070810.1\Default.txt
\\mi-svr\drops\AdventureWorksDB_20070810.1\ErrorsWarningsLog.txt



The BuildLog.txt file contains a complete recording of all the steps performed during the build. The Default.txt file contains the warning and error information for the Default build configuration and the ErrorsWarningsLog.txt contains the warning and error information for all configurations.

In the build that failed, look at the files named Default.txt and ErrorsWarningsLog.txt, which will reveal the same information.

Solution: AdventureWorksDB.sln, Project: AdventureWorksDB.dbproj, Configuration: Default, Any CPU
(0,0): error TSD257: The value for $(DefaultDataPath) is not set, please set it through the build property page.
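
If you want to pick such errors out of a drop folder automatically, the error line can be taken apart with a regular expression. The Python below is a sketch whose pattern is inferred only from the single TSD257 line shown in this article; other log lines may use a different layout, and the function name is invented for this example.

```python
import re

# "(line,col): error CODE: message", as in the TSD257 line above.
ERROR_LINE = re.compile(
    r"\((?P<line>\d+),(?P<col>\d+)\):\s+error\s+(?P<code>[A-Z]+\d+):\s+(?P<message>.*)"
)
# MSBuild-style property references such as $(DefaultDataPath).
PROPERTY_REF = re.compile(r"\$\((\w+)\)")


def parse_error(text):
    """Return the error code, message, and referenced property names."""
    match = ERROR_LINE.search(text)
    if match is None:
        return None
    message = match.group("message")
    return {
        "code": match.group("code"),
        "message": message,
        "properties": PROPERTY_REF.findall(message),
    }


sample = ("(0,0): error TSD257: The value for $(DefaultDataPath) is "
          "not set, please set it through the build property page.")
```

For the sample line, the parsed result names both the error code and the property that lacks a value, which is the piece of information needed to fix the project file.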

The above results clearly point out that the DefaultDataPath is missing. This happens to be a user setting, and since .dbproj.user files are not checked in and Team Build does not automatically create one when it is missing, as Visual Studio does, some properties do not have values defined inside the .dbproj file.

Figure 10: Build Team project.

Figure 11: Build failure.

In order to resolve this, you need to manually edit the project and provide the values for the missing properties. You will find that the .dbproj file contains the tags for the properties, but either no value is defined or they contain the value “Undefined.”

<PropertyGroup>
  <Configuration Condition=" '$(Configuration)' == '' ">Default</Configuration>
  <DefaultDataPath>Undefined</DefaultDataPath>
  <TargetConnectionString></TargetConnectionString>
  <TargetDatabase></TargetDatabase>
</PropertyGroup>

You need to change the .dbproj file to reflect the correct property values that the Team Build service should use when building the project. Keep in mind that the security privileges of the Team Build service can differ from those of an interactive user building the project inside the Visual Studio IDE. Using the MSBuild-based command line build in combination with the Windows RunAs command provides a good test environment for interactive troubleshooting.

<PropertyGroup>
  <Configuration Condition=" '$(Configuration)' == '' ">Default</Configuration>
  <DefaultDataPath>D:\DATA</DefaultDataPath>
  <TargetConnectionString>Data Source=TESTHOST\SQL90QA;Integrated Security=True;Pooling=False</TargetConnectionString>
  <TargetDatabase>AdventureWorksDB</TargetDatabase>
</PropertyGroup>

Make sure to check in the changes to the .dbproj file so that when the Team Build service performs a build and syncs the sources, it will automatically pick up the new project file with the changed values. The build should now complete successfully and without errors as shown in the resulting build report in Figure 12.

Figure 12: Build success.

As the last step, check the output of the build by inspecting the drop location using the command:

dir \\mi-svr\drops\AdventureWorksDB_20070810.4 /s /b

The drop location now contains the build script (.sql) and the metadata file (.dbmeta) which you can use to deploy the database schema to the TESTHOST\SQL90QA database instance.

Wrap Up

Now you know how to integrate Database Projects into a Team Build environment. You could achieve more things by extending the approach described above. For example, you could automatically deploy the database schema as part of the build process by adding the SqlDeployTask to the post-build events, or you could integrate database unit testing and data generation into the test execution part of Team Build.

Gert Drapers

Overview of Team Foundation Build

You can find out more about the various pieces of Team Foundation Build at: http://msdn2.microsoft.com/en-us/library/ms181710(VS.90).aspx
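The build failure discussed in this article comes down to properties that are empty or set to “Undefined” in the checked-in .dbproj file. As a rough pre-check before queuing a build, a short script can flag them. The sketch below is written in Python and is not something Team Build provides; the function name unset_properties and the list of property names are assumptions taken from the example in this article, and the script deliberately ignores MSBuild Condition attributes.

```python
import xml.etree.ElementTree as ET

# Property names taken from the example in this article; extend as needed.
REQUIRED = ("DefaultDataPath", "TargetConnectionString", "TargetDatabase")


def unset_properties(dbproj_xml, required=REQUIRED):
    """Return the required properties that are missing, empty, or 'Undefined'.

    Simplification: Condition attributes are ignored, and when a property
    appears more than once the last occurrence wins.
    """
    values = {}
    for element in ET.fromstring(dbproj_xml).iter():
        name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if name in required:
            values[name] = (element.text or "").strip()
    return [name for name in required if values.get(name, "") in ("", "Undefined")]


# Hypothetical project fragment modeled on the unedited .dbproj shown earlier.
BROKEN = """<Project>
  <PropertyGroup>
    <DefaultDataPath>Undefined</DefaultDataPath>
    <TargetConnectionString></TargetConnectionString>
    <TargetDatabase></TargetDatabase>
  </PropertyGroup>
</Project>"""

print(unset_properties(BROKEN))
```

For the unedited fragment all three property names are reported; once real values are filled in, the list comes back empty. Reading the file text yourself and passing it in keeps the function easy to run against a checked-out copy of the project.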



ONLINE QUICK ID 0712162

XML Tools in Visual Studio 2008
XML is everywhere from XML Web Services to databases to config
files to Office documents. This article will show you tooling support offered
in Visual Studio 2008 that will make working with XML easier. It will cover editing
XML files, working with XML schemas, debugging XSLT style sheets and extending
Visual Studio by writing your own custom XML Designers.
Stan Kitsis

stan.kitsis@microsoft.com

Stan Kitsis is a Program Manager in the Data Programmability team at Microsoft. Stan has over 10 years of software development experience. His primary focus is on the XML tools, which include XML Editor, XSLT Debugger, and XML Schema Designer. Prior to working on the XML tools, Stan was responsible for parts of the System.Xml and MSXML APIs.

Fast Facts

“Visual Studio seems to be the best editor for XML (compared to Eclipse plugins and NetBeans) and I use it even when my Web project is Java based.”
- Customer statement from the MSDN Forums

When you open a file with an XML extension in Visual Studio 2008 (for example, .xml, .xsd, .xslt, .svg, or .config), you will invoke its XML Editor. XML Editor comes with a full range of features you would expect from a Visual Studio editor, which includes IntelliSense, color-coding, brace matching, outlining, and formatting. It provides full XML 1.0 syntax checking, end-tag completion, as well as DTD and XML schema support with real-time validation.

Editing XML Files

Face it, manual editing of XML files can be very tedious and time consuming. To help with this, the Visual Studio 2008 XML Editor comes with a number of productivity enhancement features. One such feature is an extensible library of XML code snippets: XML files that contain a configurable code segment, which acts as a template to use while editing documents. Visual Studio installs a number of XML code snippets that help developers to write XML schemas and XSLT style sheets. To invoke a snippet while editing an XML file, select “Insert Snippet” from the “Edit > IntelliSense” menu. Once you have inserted a snippet, you can TAB between highlighted modifiable fields to enter data. Figure 1 shows an example of inserting a snippet.

You can also write your own snippets. If you want to create a simple snippet, follow these easy steps:

1. Create a new XML file and type in:

   <snippet

2. Press ESCAPE to close the IntelliSense window.
3. Press TAB.
4. Fill in the blanks.
5. Press ENTER to finish.

For more detailed information about creating snippets, I recommend VSEditor’s blog post “Code Snippet – Schema Description” (http://blogs.msdn.com/vseditor/archive/2004/07/14/183189.aspx).

So what do you do when there are no snippets and you need to create an XML file based on an existing schema? XML Editor offers a wide range of features when you associate your XML documents with XML schemas. Schema-based IntelliSense, real-time validation, and error reporting are just a few of them. In addition, XML Editor can dynamically generate snippets based on an existing schema. Once you provide a name of the element you want to add, the XML Editor can parse the schema for required information, generate a snippet, and insert it for you. To invoke dynamic snippet functionality, all you need to do is type the name of the element as in the following example:

<element_name

and press TAB. The XML Editor will create a snippet, which looks very much like the one in Figure 1 except this time you didn’t have to do anything up front. This is a very powerful feature, especially when you need to create documents with large content models.

By default the XML Editor generates only the required content, but this can be customized by annotating your XML schemas. More information is available on MSDN under “How to: Generate an XML Snippet From an XML Schema”.

Schema Cache and Schema Catalogs

For advanced users, XML Editor offers features such as schema cache and schema catalog files. Schema cache is a folder that contains a number of well-known W3C schemas, as well as a number of Microsoft-specific schemas. It serves as a repository of widely used schemas that are unlikely to change. You’ll find the default location for the schema cache at %vsinstalldir%\xml\schemas, where “%vsinstalldir%” is a variable representing the location in which Visual Studio itself was installed. When you declare one of the namespaces defined by these schemas in your XML files, XML Editor will automatically associate appropriate schemas from the cache location and instantly provide you with IntelliSense and validation.
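To make the idea of namespace-based association concrete, here is a minimal Python sketch of how a folder of schemas could be indexed by targetNamespace and then matched against a document. It only illustrates the lookup idea and is not how XML Editor is implemented; the function names, the example namespace, and the orders.xsd file are all hypothetical, and the sketch looks only at the namespace of the document's root element rather than at every declared namespace.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

XSD_ROOT_TAG = "{http://www.w3.org/2001/XMLSchema}schema"


def index_schema_folder(folder):
    """Map each targetNamespace to the names of the .xsd files that declare it."""
    index = {}
    for xsd_path in sorted(Path(folder).glob("*.xsd")):
        root = ET.parse(xsd_path).getroot()
        if root.tag != XSD_ROOT_TAG:
            continue  # skip files that are not XML Schema documents
        namespace = root.get("targetNamespace", "")  # "" means no target namespace
        index.setdefault(namespace, []).append(xsd_path.name)
    return index


def schemas_for(document_xml, index):
    """Look up schema files by the namespace of the document's root element."""
    root_tag = ET.fromstring(document_xml).tag
    namespace = root_tag[1:].split("}", 1)[0] if root_tag.startswith("{") else ""
    return index.get(namespace, [])


# Demo with a throwaway folder standing in for the schema cache.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "orders.xsd").write_text(
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        'targetNamespace="http://example.com/orders"/>',
        encoding="utf-8",
    )
    demo_index = index_schema_folder(folder)
    print(schemas_for('<order xmlns="http://example.com/orders"/>', demo_index))
```

Pointing index_schema_folder at a real schema cache folder would give a quick inventory of which namespaces it covers, which can help when IntelliSense does not appear for a document you expected to be recognized.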

76 XML Tools in Visual Studio 2008 www.code-magazine.com


Figure 1: Invoking an XML snippet in Visual Studio 2008.

Schema catalogs are XML files located in the Schema Cache directory (catalog.xml file is the default). They give advanced users more granular control over various namespaces they might want to use. For example, you can associate specific namespaces with external locations:

<Schema
  href="mylocation/myschema.xsd"
  targetNamespace="http://myschema"/>

You can also use catalog files to create associations between schema files and file extensions, which you can find particularly useful when your schema has no targetNamespace:

<Association
  extension="config"
  schema="xml/schemas/dotNetConfig.xsd"/>

New in Visual Studio 2008, you can also add conditions:

<Association
  extension="config"
  schema="xml/schemas/dotNetConfig30.xsd"
  condition="%TargetFrameworkVersion% = 3.0" />

This condition means that the dotNetConfig30.xsd schema should only be associated when the current project is targeting .NET Framework version 3.0.

Finally, you can create a chain by pointing one catalog file at another:

<Catalog href="http://mycompany/catalog.xml"/>

Working with Large Files

Another important editing feature I would like to highlight is the XML Editor’s support for editing large files. While you could work with large files in previous versions, Visual Studio 2008 supports incremental parsing of the XML documents. Now if you work with a 10 MB file, you don’t have to wait for Visual Studio to parse the entire file every time you make an edit. The XML Editor will isolate the edits and reparse only what’s needed, offering better performance and responsiveness.

I have covered a few interesting features of XML Editor, but obviously couldn’t go over all of them. Table 1 shows some of the other features available to Visual Studio users.

Feature                               Visual Studio 2008 Support
Auto-completion                       X
Syntax coloring                       X
Outlining                             X
IntelliSense                          X
Extendable snippet library            X
Dynamic snippets                      X
Go to definition (from XML to XSD)    X
Back/Forward navigation               X
Project and user profile support      X
Real-time validation                  X
Full XSD Support                      X
Namespace support                     X
DTD support                           X
XML-to-XSD generation                 X
DTD-to-XSD conversion                 X
Unicode support                       X
Large file support (10 MB+)           X

Table 1: Overview of XML Editor features available in Visual Studio 2008.

Debugging XSLT Style Sheets

XSLT is a W3C standard transformation language, which is very popular among a large group of developers. Visual Studio Professional edition provides support for both editing and debugging XSLT style sheets. Editing XSLT files is very similar to editing other XML files. When it comes to XSLT debugging, Visual Studio supports two main scenarios:

1. Debugging of standalone transformation, which is useful when your primary interests are input document, transformation itself, and output document.
2. CLR-integrated debugging, which is useful for debugging transformations in the context of your CLR application.

Debugging Standalone Transformations

Figure 2 showcases the debugging environment when you work with XSLT transformations. In the default configuration, you can see the XSLT file and a tabbed view of the input and output documents. You can also configure Visual Studio to simultaneously show all three documents. When you step through the transformation code, you can see the input data, the transformation that handles the input, and the generation of the output file.

Figure 2: Debugging standalone XSLT transformations.

If you are used to either the C# or Visual Basic debugging environment, you will notice that debugging XSLT looks very similar to debugging other CLR language programs. All the controls and keystroke combinations are the same. In Figure 2 you can also see the Locals window, which can show implicit XSLT variables (self, position, and last), as well as all local and global variables declared in your template. You can also see the call stack window, which you can use to navigate to various XSLT templates up and down the call stack. You can also use the XSLT debugger to set breakpoints in both your XSLT code and in your input XML documents.

Integrated CLR Debugging

While debugging standalone transformations is useful, sometimes you need to debug XSLT as a part of your C# or Visual Basic application. XSLT Debugger, which is tightly integrated with other CLR debuggers, lets you seamlessly step from C# to XSLT to Visual Basic while debugging your application. The following example shows a C# code snippet that uses the XslCompiledTransform class to initiate an XSLT transformation:

using System.Xml;
using System.Xml.Xsl;

// Passing true enables debugging, so you can step into the style sheet.
XslCompiledTransform xsltcmd = new XslCompiledTransform(true);
xsltcmd.Load(XSLTFile);
XmlUrlResolver resolver = new XmlUrlResolver();
XmlWriter writer = XmlWriter.Create(OutputFile);

// Set a breakpoint on this call and step into it to reach the XSLT code.
xsltcmd.Transform(XMLFile, writer);

If you set a breakpoint on the call to the Transform method and step into it, the debugger will take you to the XSLT style sheet (Figure 3). All of the features described in the standalone debugging section are also available here. The only exception is the ability to set breakpoints in the data files. This feature is not available when stepping into XSLT from another CLR program. Everything else behaves exactly the same.

Figure 3: Debugging an XSLT transformation from a C# application.

Table 2 shows a summary of various features available when debugging XSLT transformations.

Feature                                                 Visual Studio 2008 Support
Browser view                                            X
Locals, watch, call stack                               X
Viewing input XML, output, and XSLT during debugging    X
Breakpoints in XML                                      X
Breakpoints in XSLT                                     X
CLR language debugger integration                       X

Table 2: Overview of XSLT Debugger features available in Visual Studio 2008.

Extending XML Tools

XML Editor provides a good foundation for developers to build custom designers on top of it. Figure 4 shows how it was done in the previous version of Visual Studio (Visual Studio 2005). Developers could build on top of XML Editor by sharing the buffer. They would create their custom designer by sharing the IVsTextLines buffer and parsing XML using System.Xml.XmlReader or XmlDocument. Don Demsak used this approach when he created XPathmania, an XPath add-on to Visual Studio (for more information about XPathmania, see http://www.donxml.com). However, because the integration happened on the buffer level, this method created a lot of large buffer edits. In addition, it was somewhat limited because the System.Xml parser consumes only valid XML while designers, by definition, should work with invalid files.

Figure 4: Old way of building custom XML designers on top of XML Editor.

XML Editor, on the other hand, has its own parser with error recovery. It also builds its own parse tree representing the contents of the buffer. In Visual Studio 2008, Microsoft exposed this LINQ to XML tree to third-party developers. You can see the new architecture in Figure 5. The new API lets developers create custom views over the XML Editor parse tree. It gives them XML Editor’s error recovery logic and a LINQ to XML parse tree with transacted updates that go all the way back to the buffer and are integrated with the Visual Studio UndoManager. These changes will make it easier for developers to build great XML tools on top of Visual Studio 2008.

Figure 5: Visual Studio 2008 way of extending XML Editor.

Stan Kitsis



ONLINE QUICK ID 0712172

ODBC Rocks!
Fifteen years after its launch, ODBC is a firmly entrenched
cornerstone of the software industry. This article explains why and will
explore the relationship between Microsoft SQL Server and ODBC and discuss where
ODBC may go in the future.

Chris Lee
Chris.T.Lee@microsoft.com

Chris Lee is a program manager in the Data Programmability Runtime Connectivity team with responsibility for Microsoft SQL Server Native Client. Prior to joining Microsoft he worked for Micro Focus, Merant, Blyth Software, Inmos and Perkin-Elmer Data Systems. His career experience includes work on operating systems, data access methods, compilers, development tools, and database access.

Fast Facts

ODBC drivers are available for every major database including Microsoft SQL Server, Microsoft Access, Oracle, DB2, MySQL, Informix, Ingres, Sybase, PostgreSQL, Interbase, FoxPro, Paradox, Progress, ADABAS, Supra, dBase, SQLite, Teradata, Rdb, etc.

ODBC drivers are available for many other data sources including Excel, Btrieve, C-ISAM, RMS, text files, VSAM, IMS, IDMS, LINC, 4D, AutoCAD, Pick, FileMaker, QuickBooks, COBOL EXTFH, SAS, etc.

Conceived as a broadly based, multi-platform, multi-database data access technology, ODBC has been an outstanding success. Probably the best known implementation of ISO/IEC 9075-3:2003 SQL Call Level Interface (part 3 of the complete SQL standard), ODBC is included in Windows, MacOS, and all major Linux distributions, and is readily available for many Unix versions including AIX, HP-UX, Solaris, and FreeBSD. Even PDAs and smartphones have ODBC!

Though often thought of as an API for C and C++ applications, ODBC is frequently used with other languages. For example, many COBOL applications use ODBC for database access as do dynamic languages such as PHP, Perl, Python and Ruby along with Microsoft Access and other RADs.

Despite being conceptually limited to relational databases, no significant data source lacks an ODBC driver: text files, Excel spreadsheets, ISAMs such as dBase, Paradox, C-ISAM, Btrieve and VSAM; you name it and an ODBC driver is most likely available. For any data source that lacks an ODBC driver there are numerous driver development kits and driver development shops waiting to help you plug that gap! It goes without saying that no respectable relational database lacks an ODBC driver.

ODBC is very popular for custom enterprise application development and is widely supported by market leading ISVs. ERP, CRM, and SCM packages all use ODBC as do query, analysis, reporting and ETL packages as well as productivity applications such as Microsoft Office.

So what makes ODBC popular with different industry factions and why will ODBC remain popular for the foreseeable future?

For the data source owner ODBC is a must. The industry standard APIs required to enable access by developers, third-party tools and application packages are ADO.NET, JDBC, ODBC and OLE DB. Developing drivers for all of these requires significant time, effort, and expertise. What is the best strategy if resources are limited or time to delivery is important? The answer is simple. Implement ODBC first and evaluate requirements for the others afterwards. Why? First, there are bridges from all the other APIs to ODBC, so as soon as ODBC is enabled so are all the others. Second, even ignoring the API bridges, ODBC will provide users of the data source with the widest choice of third-party software since ODBC has been around longest and accumulated the widest selection of tools and applications.

ODBC can be, and often is, implemented on top of a proprietary API. In the early days of ODBC, many perceived this to be a weakness in first generation ODBC drivers. Research soon showed that layering ODBC on top of proprietary native APIs usually has minimal performance impact. In some cases ODBC even outperformed proprietary APIs when the driver adopted strategies to overcome weak default behavior in the underlying native API. For some databases, including Microsoft SQL Server, ODBC actually fulfills the role of proprietary native API.

Corporate IT and enterprise developers live in a very heterogeneous environment and have to support an accumulated legacy spanning multiple languages, operating systems, and data sources. There may also be pressures to consolidate and standardize application platforms alongside pressure to modernize existing applications, re-engineer business processes, increase information integration, and introduce business intelligence. Phew! How to juggle these pressures and stay sane?

In data integration and consolidation scenarios from ETL to BI, ODBC’s ubiquity is invaluable in bringing data from both packaged and custom applications together.

80 ODBC Rocks! www.code-magazine.com


The .NET platform is compelling where new development is required, but in many cases timescale, business risk, and overall cost factors favor code re-use and incremental modernization. ODBC has a lot to offer here. Often application developers can recompile existing code for use in .NET applications with the Microsoft Visual Studio 2005 C++ compiler using the /clr compiler switch. This greatly simplifies mixing existing C++ and ODBC business logic with a modernized user interface, which can be developed most efficiently using the .NET Framework. ODBC provides a common API for a diverse spectrum of data sources and operating systems and can access multiple data sources simultaneously. These are essential capabilities for information consolidation, aggregation, and integration whether used in native applications or via the .NET Framework Data Provider for ODBC.

In addition to directly reusing existing business logic, ODBC is also a prime candidate to accelerate conversion from other proprietary APIs. Consider license consolidation via database migration, where it may be more cost effective to convert applications that use proprietary APIs such as DB-Library, CT-Library and OCI to ODBC rather than to rewrite completely for .NET since ODBC is conceptually quite similar to other native code APIs.

ISVs often choose to support multiple databases to increase addressable market and customer appeal. ODBC offers several approaches to this requirement. Above all, ODBC offers a common API that can address multiple data sources and, as seen above, a wide range of available drivers. However, there are differences among data sources and this is reflected in a range of behaviors across different ODBC drivers.

The simplest approach to solving this problem uses a combination of the most restrictive behavior patterns and SQL dialect subset. ODBC offers escape sequences that iron out minor differences among SQL dialects to assist in this approach. Next, if this is insufficient, applications can query ODBC to determine the characteristics of the particular driver and database in use and adapt dynamically to them. An application can also determine the actual driver and database in use and trade off generality to exploit unique characteristics in meeting the most stringent functional and performance demands. ODBC places no restrictions on the statements sent from applications to data sources, so it does not suppress the richness of the SQL dialect on a connection.

Some ISVs use an internal data abstraction layer to enable use of different APIs for different databases. ODBC has much to offer, even in this scenario. For some databases, notably Microsoft SQL Server, ODBC is the best performing API available.

System integrators and value-added resellers (VARs) benefit from ODBC in two ways. They can guarantee that whatever the operating platform and other infrastructure, ODBC will be available for all the data sources they will encounter, which simplifies development of a specialist proprietary toolset. Secondly, staff familiar with ODBC can be deployed to satisfy a wider range of customer situations than staff whose skill sets span a narrower spectrum.

ODBC and Microsoft SQL Server

ODBC has broad ongoing appeal across the software industry and remains a key, though often unsung, element of the Microsoft Data Platform. Let me now discuss the relationship between ODBC and Microsoft SQL Server in more detail.

When first launched, DB-Library was SQL Server’s only API for client applications. Microsoft later supplemented the SQL Server API with ODBC, then with OLE DB, and most recently with ADO.NET. DB-Library has been deprecated due to technical limitations and is today only supported to provide backwards compatibility to legacy applications that have not yet converted to one of the other APIs.

At the time of its release, the SQL Server team at Microsoft believed OLE DB would supersede ODBC. This is no longer the case and ODBC’s future is completely secure. ODBC is more widely used than OLE DB and it is better suited to some key scenarios that I will discuss later in this article.

Both OLE DB and ODBC are true native APIs for SQL Server in that they map API calls directly into SQL Server’s network protocol, Tabular Data Stream (TDS). When Microsoft’s recommended best practices are followed, ODBC is a very thin wrapper over TDS with no intermediate buffering between network packet buffers and the application. It therefore has excellent performance and scalability characteristics.

ODBC and OLE DB support for Microsoft SQL Server is available in WDAC (Windows Data Access Components), originally known as MDAC (Microsoft Data Access Components), and in Microsoft SQL Server Native Client, a component of Microsoft SQL Server 2005 and later versions. Support in WDAC targets legacy and generic applications that do not exploit the unique features of Microsoft SQL Server 2005 and later versions. Applications that do wish to fully exploit the unique features of Microsoft SQL Server 2005 or later should instead use Microsoft SQL Server Native Client, which enables use of features such as snapshot isolation, database mirroring, query notification and data types such as xml and varchar(max). Microsoft will continue to add support for new SQL Server features to all of the APIs it supports: ADO.NET (via SqlClient), ODBC and OLE DB (via SQL Server Native Client) and JDBC.

Ongoing Roles for ODBC

I’ll now examine some key scenarios where ODBC will play an important role in SQL Server’s future, starting with data integration and business intelligence. Some of the SQL Server components involved here rely on OLE DB internally and reflect this in their external interfaces. In recognition of the popularity of ODBC, Microsoft has decided to continue support for the Microsoft OLE DB Provider for ODBC in both 32- and 64-bit versions for Windows Server 2003, Windows Vista, and Windows Server 2008 to ensure that ODBC can be used when it is the best choice available.

Dynamic languages such as PHP, Perl, Python and Ruby account for a significant amount of developer activity. Some of these are supported on the .NET platform but are also popular on other platforms. All of these languages can access Microsoft SQL Server via ODBC, though the libraries available to achieve this vary in quality and performance. Microsoft will continuously review the need for libraries optimized for Microsoft SQL Server and is developing a PHP extension library based on SQL Server Native Client’s ODBC driver.

Business interest in application migration and server consolidation as a means of reducing operational and license costs is increasing. SQL Server’s excellent total cost of ownership profile makes it very attractive for this role. ODBC has significant roles to play here in both data consolidation and application migration, as described earlier. Microsoft SQL Server Migration Assistant helps customers migrate schema and the database itself to Microsoft SQL Server, and Microsoft is working on documentation and tools to help migrate application source code to use ODBC and Transact-SQL in the future.

Non-Windows clients and mid-tier servers add an extra dimension to the scenarios described above. Currently third-party ODBC drivers meet customer requirements, but Microsoft continuously evaluates providing its own drivers to ensure customers are getting the best experience.

Last but not least, Microsoft wants existing customers to be even more successful in the future. This involves maximizing their return on existing code assets by application modernization and code reuse. If existing business logic continues to meet business needs then it makes more sense to reuse it than to rewrite it. For example, approvals and forecasting in financial applications use the same algorithms regardless of application architecture. The ability to mine routines from batch and call center applications and re-deploy in SOAs, where they then become available for B2B and self-service consumer scenarios, makes sound business sense. Naturally, .NET is a strong contender for developing the new communication and UI elements of the modernized application, but core C++ and ODBC-based business logic and data access routines can often be reused with very little modification and at very low cost. Visual Studio and C++/CLI provide excellent re-use and interoperability capabilities and there is an opportunity to extend this further in the future by providing increasingly sophisticated re-factoring tools.

ODBC Overview

ODBC (which stands for Open DataBase Connectivity) was designed to be a common Call Level Interface (CLI) for relational databases. ODBC 1.0 was based on work by the SQL Access Group and released in September 1992, following work by Microsoft and Simba Technologies. ODBC 2.0 was an improved and extended version of ODBC 1.0. Microsoft aligned ODBC 3.0 with the X/Open and ISO CLI specification that was adopted as part three of the overall SQL standard. Subsequently Microsoft added Unicode and 64-bit support to ODBC and the current version is 3.52.

At one time Microsoft considered extending ODBC to allow it to be used with other data sources but the consensus among users was that ODBC should remain focussed on relational databases and so Microsoft designed OLE DB to fulfil the universal data access role.

Easysoft and OpenLink Software developed ODBC driver managers for non-Windows platforms based on Microsoft’s specifications for ODBC on Windows, and these became the open source unixODBC and iODBC projects. A wide range of ODBC drivers are available for use with these driver managers.

DataDirect Technologies also supplies a wide range of ODBC drivers for both Windows and non-Windows platforms.

Although conceptually limited to relational databases, drivers have been developed for a wide range of non-relational data sources. Today ODBC remains one of the most widely used data access APIs with broadly based support from database vendors, ISVs, and third-party connectivity specialists.

What Do the Next 15 Years Hold in Store for ODBC?

There are two key and inter-related questions to answer here. First, how will ODBC continue to adapt to the changing capabilities of SQL Server in its role of principal native API? Second, how will ODBC evolve in its role as an industry standard cross-platform, cross-database API?

ODBC defines areas where drivers may add their own driver-specific extensions above and beyond the core ODBC specification, and Microsoft SQL Server Native Client uses this capability to add support for new features added to Microsoft SQL Server. Other drivers have also added their own extensions for this purpose. The following two examples will demonstrate how Microsoft SQL Server Native Client adapts to SQL Server enhancements.

SQL Server 2008 adds additional date/time data types to supplement the existing datetime and smalldatetime types: date and datetime2 correspond to ODBC’s existing SQL_TYPE_DATE and SQL_TYPE_TIMESTAMP types; time (time with 0 to 7 digits fractional seconds scale) and datetimeoffset (effectively datetime2 plus a timezone offset) do not match any existing ODBC data types and so new types must be added to support them. These are SQL_SS_TIME2 and SQL_SS_TIMESTAMPOFFSET.

In SQL Server 2008, a table-valued parameter (TVP) is a parameter to a T-SQL statement or stored procedure that can consist of multiple rows and columns. For example, consider the following statement:

Insert into OrderItems (OrdID, ProdCode, Qty)
Select ?, ProdCode, Qty from ?

The first parameter is the primary key of a newly inserted Order and the second is a TVP that has a row for each order item with product code and quantity values. The TVP allows multiple OrderItem rows to be inserted in a single statement from a single parameter. TVPs add performance by reducing round trips between client and server and by enabling optimizations in TDS and the relational engine. They also enhance encapsulation by enabling a single stored procedure to perform a complete business transaction, with exceptional code clarity, by relaxing normal “rectangular” restrictions on parameter sets to stored procedures. Note this example of the latter:

create type OrdItemType as table(
    ProdCode integer, Qty integer)

create procedure OrderEntry
(
    @CustCode varchar(5),
    @Items OrdItemType READONLY,
    @OrdNo integer output,
    @OrdDate datetime output)
as
    set @OrdDate = GETDATE();
    insert into Orders (OrdDate, CustCode)
        values (@OrdDate, @CustCode);
    select @OrdNo = SCOPE_IDENTITY();
    insert into TVPItem (OrdNo, ProdCode, Qty)
        select @OrdNo, ProdCode, Qty from @Items



Without a TVP you would have to split this into two stored procedures, one for each table.

Since you can mix table-valued and traditional single-valued parameters in a statement, SQL Server Native Client must provide a way to handle what amounts to a nested table when binding ODBC parameters. This is achieved by first binding a TVP as type SQL_SS_TABLE, a new type representing a nested table. Then a statement attribute, SQL_SOPT_SS_PARAM_FOCUS, is set to the parameter ordinal of the TVP. This directs subsequent calls to SQLBindParameter to columns of the TVP (rather than the “top-level” parameter set) to define the nested row structure. What seems at first thought to be a complex and difficult extension to ODBC turns out to be quite simple in practice, and the resulting application code is very compact and readable.

The complete TVP implementation in SQL Server Native Client builds naturally on existing ODBC concepts of array binding and “data at execution” parameter values to allow rows of a TVP to be supplied at runtime either in an in-memory array or as batches of one or more rows that are streamed to the server when the statement is executed.

So far, so good. SQL Server Native Client shows that an ODBC driver can add significant functionality without requiring any changes to the ODBC specification. But what happens if and when a new SQL Server feature goes beyond what ODBC extensibility allows? Are there new SQL Server application scenarios where ODBC’s architecture is less than ideal? These are tricky questions, to which possible solutions might be: (1) update the core ODBC specification, or (2) turn SQL Server Native Client into a native API that can work without an ODBC Driver Manager and so is no longer constrained by the ODBC specification. Each of these paths has implications.

Changing the ODBC specification is not entirely risk free. One of ODBC’s greatest virtues is its stability. Change the specification and existing drivers and applications may start behaving in new and unexpected ways. Branching away from the current specification may satisfy some scenarios but abandon others. Is there a utopian “third option” for SQL Server Native Client and ODBC? IBM’s call level interface for DB2 can operate in dual modes as ODBC driver and extended call level interface. This could be an evolutionary path for SQL Server Native Client and/or ODBC. Another alternative may be to enable current and future versions of ODBC to co-exist side-by-side in some way. The current ODBC Driver Manager already achieves this by supporting the slightly different behaviors of ODBC 2 and ODBC 3 side-by-side.

Are other organizations interested in extending ODBC beyond the current core specification? This is an area where Microsoft would like to encourage discussion among interested parties and assess the potential for a broadly based consensus. Some form of community-based evolution might be able to react more quickly to a wider range of input than a full-blown ISO or ANSI process. You can rest assured of one certainty, whatever happens: nothing will be allowed to undermine ODBC’s rich legacy and enduring value throughout the industry.

Microsoft regards the Entity Data Model (EDM) as a major generational step change across the whole software industry, applicable well beyond the current ADO.NET Entity Framework. Thinking about how ODBC and the EDM might come together provides an additional factor in ODBC’s future. ODBC could be extended to handle queries in Entity SQL returning entity-shaped results with polymorphic rows and nested entity values. The experience with non-rectangular data for table-valued parameters suggests that handling this via driver-specific extensions in SQL Server Native Client would be feasible. However, Microsoft is enabling the EDM for a broad range of data sources, not just SQL Server. EDM support will be available for other relational database products via the ADO.NET Entity Framework, so why not do the same by extending ODBC’s core specification to support the EDM? In part the answer depends on the level of adoption EDM would see in ODBC applications, and the nature of the demand: should EDM and relational access use separate drivers, a common driver but separate connections, or co-exist on the same connection? Would existing code adopt EDM access incrementally or is the EDM attractive only in new code? This is another area where Microsoft would be interested in hearing the views of interested parties.

ODBC has come a long way, and hopefully this article has given you new insights into the breadth and depth of what it has offered and will continue to offer for many years to come, how it might evolve in the future, and the ongoing long-term commitment Microsoft has made to it.

To comment on the points raised in this article, please send e-mail to odbcfut@microsoft.com.

You can find further information about ODBC in the following references:

1. Mike Pizzo’s very entertaining history of Microsoft’s data access APIs at http://blogs.msdn.com/data/archive/2006/12/05/data-access-api-of-the-day-part-i.aspx
2. The Wikipedia ODBC page at http://en.wikipedia.org/wiki/Open_Database_Connectivity
3. Ken North’s article on SQL and ODBC in integration frameworks for Business Integration Journal at http://www.sqlsummit.com/PDF/BIJ_North_Nov2004.pdf
4. Ken North’s ODBC portal at http://www.sqlsummit.com/ODBCPORT.HTM
5. Online ODBC documentation at http://msdn2.microsoft.com/en-us/library/ms710252.aspx
6. Online information about Microsoft SQL Server Native Client at http://msdn2.microsoft.com/en-us/data/aa937705.aspx
7. The SQL Native Client Blog at http://blogs.msdn.com/sqlnativeclient/

Chris Lee



ONLINE QUICK ID 0712182

TESLA: Democratizing the Cloud

In our service-oriented world, users need the same experience on any device, whether mobile phone, office PC, or Internet café. Moreover, they want the same experience any time they access applications, offline or online. For developers, this means tackling multi-tier, distributed, and concurrent programming. LINQ 1.0 radically simplified multi-tier programming with unified query and deep XML support. TESLA is a broad engineering program by the authors to extend the success of LINQ with external relationships, reshaping combinators, assisted tier-splitting, and join patterns.

Brian Beckman
bbeckman@microsoft.com
Brian Beckman has worked on many projects at Microsoft over the last fifteen years from Research to Crypto to BizTalk to games to LINQ. Before Microsoft, Brian worked at the Jet Propulsion Laboratory on research operating systems and radio astronomy. He holds a PhD in Astrophysics from Princeton University.

Fast Facts
TESLA will bring much greater expressive power and robustness to distributed data applications by integrating monadic comprehensions, composable mapping, assisted tier-splitting, and join patterns into mainstream .NET programming languages.

LINQ 1.0: Standard Query Operators and Deep XML Support

It seems that developers spend the majority of development hours and computer cycles just plumbing: reshaping, reformatting, wrapping, connecting, serializing, and marshaling from one data model to another just to interface with supplied technologies. Most developers would rather spend their time and energy on presentation, business logic, data analysis, and data mining: the parts of applications that generate value.

To mitigate this situation, some academics and industry experts recommend picking one of the data models (XML, JSON, objects, entities, or relations) as the universal one and mapping all the other data models into it. Other experts prefer to invent grandiose new data models that purport to encompass all the other data models.

One Query Language, Many Data Models

We (the authors of this paper) do not believe in either of these “universalist” approaches. The allure of a unified data model is naïve in practice. The proposals, especially with their mapping solutions, are at least as complicated as the jumbles they purport to replace, and their learning curves just add more developer cost.

It is preferable to unify at the level of operations, that is, to define a small, universal set of Standard Query Operators (SQOs) that work the same way on all data models. These operators are limitlessly composable: while the number of basic operators is small, the number of combinations is unlimited. That way, one small set of primitives covers all scenarios. Furthermore, SQOs can just as easily transform data from one data model to another, say from XML to objects, or entities to JSON, or relations to XML and back, etc., in a regularized way.

This is classic engineering by separation of concerns. Separate the data models from the query languages, look for common semantics amongst the query languages, and capture them in a higher-level definition that each data model can implement in its own natural way.

In current practice, every data model comes with its own idiosyncratic, tightly coupled query language. XML comes with XQuery; objects require hand-written query logic; and the data tier has SQL. Consider the following three ways to write the same example query. In XQuery, on the client, one writes:

for $c in customers/customer
where $c/city = "Seattle"
return <result>{$c/Name}{$c/Age}</result>

On the middle tier, one would write in C#:

class Result { public string Name; public int Age; }
...
var r = new List<Result>();
foreach (var c in Customers)
    if (c.City == "Seattle")
        r.Add(new Result { Name = c.Name, Age = c.Age });

And, on the data tier, one would write SQL:

Select Name, Age
From Customers
Where City = 'Seattle'



In each case, the meaning of the query is the same:

“Give me the collection of name, age pairs from customer records such that the customer’s city is Seattle.”

It is intolerable to impose the burden of managing such superficial and gratuitous differences on the developer. It doesn’t matter whether, under the covers, Customers are objects, rows, XML elements, or any other kind of collection, and developers should insist on writing the same query the same way. This is exactly what LINQ 1.0 query comprehension syntax does. All three queries look the same in C# 3.0:

from c in Customers
where c.City == "Seattle"
select new { c.Name, c.Age }

Visual Basic 9 has a very similar syntax. The compiler simply translates comprehension syntax into nested calls of the SQOs, and the SQOs, in turn, call methods in the interface IEnumerable. Only the implementations of the various IEnumerables for the various types of Customers know whether to scan elements, iterate over objects, or join over tables.

Mathematicians recognized this “design pattern” decades ago. Each data model defines a monad and the query is a monad comprehension. Don’t let the terminology intimidate; monads are very simple structures and monad comprehensions are just like the “set” comprehensions you learned in school, for example, “the set of all integers x such that x is divisible by 5.” Just as sets work the same whether they contain integers, apples, or even other sets, so monads work the same whether they contain objects, elements, or rows. IEnumerable captures monadic “collection-ness” and hides the differences. Any class that implements IEnumerable automatically supports the SQOs.

Operators and Syntax

The following six SQOs are the core of LINQ 1.0. These are precisely the monadic primitives for filtering, transforming, joining, grouping, and aggregating over arbitrary collections of arbitrary types (abbreviating IEnumerable as IEn):

static class Sequence
{
    static IEn<T> Where<T>
        (this IEn<T>, Func<T, bool>) { }
    static IEn<S> Select<T, S>
        (this IEn<T>, Func<T, S>) { }
    static IEn<S> SelectMany<T, S>
        (this IEn<T>, Func<T, IEn<S>>) { }
    static IEn<V> Join<T, U, K, V>
        (this IEn<T>, IEn<U>,
         Func<T, K>, Func<U, K>, Func<T, U, V>) { }
    static IOrderedSequence<T> OrderBy<T, K>
        (this IEn<T>, Func<T, K>) { }
    static IEn<IGrouping<K, T>> GroupBy<T, K>
        (this IEn<T>, Func<T, K>) { }
}

Consider the first one, Where<T>, with T representing a type such as Int or String or Customer. Where takes an IEnumerable<T> and a Func<T, bool>. This Func is a predicate function that takes a T and returns a Boolean. Where returns a new IEnumerable<T> that contains only the elements of its input IEnumerable<T> that satisfy the predicate function.

Read the rest of the operator definitions in a similar fashion and look up details and examples in any number of places on the Web such as http://www.codeplex.com/LINQSQO and http://www.hookedonlinq.com. There are a few more convenience operators in the SQO specification, but these six are the kernel set.

All the SQOs are static methods that take any IEnumerable<T> as their first argument and return an IEnumerable of another, potentially different, type S. Because they are static methods with no implicit this parameter, you can write SQOs for a class without changing or even recompiling the class. This critical design choice ensures modularity (see sidebar, “Modularity Saves Money”). Furthermore, since every SQO just consumes and produces IEnumerables, you may compose and nest SQOs in an unlimited variety of ways. This is another critical design choice (see sidebar, “Composability Saves Money”). TESLA carries forward these design principles through external relationships and reshaping combinators. We’ll discuss them in more detail below.

Deep XML in Visual Basic 9

Sometimes you are working with a single data model and would like fine-grained control and leverage from the programming language. Therefore, in LINQ 1.0, Microsoft also built deep XML integration into Visual Basic 9. If you are working directly with XML documents, you can write the query as follows:

From C in Customers.<customer>
Where C.@City = "Seattle"
Select <result>
           <%= C.<Name> %>
           <%= C.<Phone> %>
       </result>

The compiler just translates such queries into expressions over the SQOs, which, in turn, run over the LINQ-to-XML provider.

Erik Meijer
erik.meijer@microsoft.com
Before joining Microsoft, Erik was associate professor at Utrecht University where he worked on languages such as Haskell, XMLambda, and Mondrian and directed the Microsoft lab (now defunct). He is currently working on .NET language and type-system support for bridging object-oriented (CLR), relational (SQL), and hierarchical (XML) data with first-class functions. He is a regular contributor to Lambda-The-Ultimate.org.

History of Monad Comprehensions
The functional language Haskell was the first to introduce monad comprehensions for universal query (http://www.haskell.org). Haskell requires all mutable data to be in monads. Instead of, say, tacking characters onto the end of a string by side effect, as you do in ordinary languages, string-building in Haskell is a monad that yields new strings with the needed characters every time you run it. The monadic approach gives Haskell uniquely high clarity and hygiene, making it the “programming tool of choice for discriminating hackers” (http://icfpc.plt-scheme.org/).
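To make the Where operator and the translation of comprehension syntax into operator calls more concrete, here is a minimal sketch. The Customer class, the sample data, and the name WhereSketch are assumptions made for illustration only (the name is chosen so that it does not clash with the real Where in System.Linq); this is not the shipping implementation. The two query methods are intended to return the same names, one through comprehension syntax and one through explicit calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type standing in for the article's Customers collection.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string City { get; set; }
}

public static class SqoSketch
{
    // Hand-written stand-in for Where: an extension method over any
    // IEnumerable<T> that lazily yields the elements satisfying the predicate.
    public static IEnumerable<T> WhereSketch<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
            if (predicate(item))
                yield return item;
    }

    // The query written in comprehension syntax.
    public static List<string> SeattleNamesQuery(IEnumerable<Customer> customers)
    {
        var query = from c in customers
                    where c.City == "Seattle"
                    select c.Name;
        return query.ToList();
    }

    // The same query written as explicit, chained operator calls.
    public static List<string> SeattleNamesCalls(IEnumerable<Customer> customers)
    {
        return customers.WhereSketch(c => c.City == "Seattle")
                        .Select(c => c.Name)
                        .ToList();
    }

    // Illustrative sample data (not from the article).
    public static List<Customer> SampleCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "Ann", Age = 31, City = "Seattle" },
            new Customer { Name = "Bob", Age = 45, City = "Portland" },
            new Customer { Name = "Cy",  Age = 28, City = "Seattle" },
        };
    }
}

public static class Program
{
    public static void Main()
    {
        var customers = SqoSketch.SampleCustomers();
        Console.WriteLine(string.Join(", ", SqoSketch.SeattleNamesQuery(customers)));
        Console.WriteLine(string.Join(", ", SqoSketch.SeattleNamesCalls(customers)));
    }
}
```

Because WhereSketch only needs IEnumerable<T>, it works unchanged on lists, arrays, or any other sequence, which is the modularity point made above.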



LINQ 1.0 Summary

LINQ 1.0 solved the impedance mismatch problem by reframing it from the intractable one of data models to the tractable one of query languages. LINQ 1.0 also dramatically simplified the work of XML users.

TESLA: Reaching Across the Cloud in Space and Time

Much work remains to achieve the larger goal of democratizing the Cloud: making distributed, data-intensive programming easy enough for everyone. The next big issues are object-relational mapping, distributed programming, and concurrent programming. This article’s solutions for these issues are external relationships, reshaping combinators, assisted tier-splitting, and join patterns.

Co-Dependency and Non-Composability

The “last mile” of data programming, mapping one data model to another, remains painful. In particular, the mapping between relational data and objects is very thorny. Typically, one finds code generators that read relational schemata and write class definitions (Figure 1) or one finds external mapping-specification files, perhaps in XML. At first glance, these would seem to satisfy goodness criteria like eliminating hand-written code or separating the concerns of mapping and data representation. However, the cost with both approaches is complexity. If you can configure the code generators at all, they entail a configuration language more complex than either the relational model or the object model. Similarly, the external mapping languages can be much more complex than either side. Some will argue that their complexity is unavoidable, because mapping is dealing with the difficult problem of impedance mismatch. To be rich enough for real-world scenarios, they must support the Cartesian product of features from both sides. This article argues, however, that since the complexity admittedly arises from beginning with the union of the data models (everything from both sides), you can eliminate it by beginning instead with the intersection (the parts the two sides have in common). To build up richness, you can resort to the proven technique of composability: define a small number of things that fit together in an unlimited variety of ways.

External Relationships Cure Co-Dependency

Because you want ordinary object-programming dot notation to traverse primary-key and foreign-key relationships between classes in both directions, typical mapping solutions interpolate explicit properties into class definitions. See Listing 1 for a typical example. These properties introduce tight, bi-directional, internal coupling between the Customer and Order classes. Changing one may require changing the other, particularly if the primary or foreign key specifications change. Even worse, introducing a new relationship, say between Customer and PostalAddress, requires changing the Customer class, with changes possibly cascading to other classes.

Listing 1: LINQ to SQL and monolithic mapping

[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public string CustomerID;
    ...
    private EntitySet<Order> _Orders;
    [Association(Storage = "_Orders", OtherKey = "CustomerID")]
    public EntitySet<Order> Orders
    {
        get { return this._Orders; }
        set { this._Orders.Assign(value); }
    }
}

[Table(Name = "Orders")]
public class Order
{
    [Column(IsPrimaryKey = true)]
    public int OrderID;

    [Column(IsForeignKey = true)]
    public string CustomerID;
    ...
    private EntityRef<Customer> _Customer;
    [Association(Storage = "_Customer", ThisKey = "CustomerID")]
    public Customer Customer
    {
        get { return this._Customer.Entity; }
        set { this._Customer.Entity = value; }
    }
}

In the database, coupling is usually only one way. Relational databases encode one-to-many relationships upstream from the target table (Orders) to the source table (Customers) by foreign key. However, once you map to objects, you get bi-directional coupling. Even if you only want one navigation direction, the most natural one is the other way, downstream from the source table (Customers) to the target table (Orders), ignoring foreign keys. In other words, the natural object solution defines a property in Customer referencing the collection of Orders, but this property has no overt reflection in the store.

Recall that the SQOs, as static methods, avoid gratuitous tight coupling. Perhaps you can do something similar here and, just as in LINQ 1.0, introduce convenient syntactic sugar. This is exactly what the authors have done, working with our colleague, Kasper Osterbye of IT University at Copenhagen.

Developers should have notation for declaring relationships as properties externally to the class definitions, and hence have ordinary dot notation for accessing properties bidirectionally, but without modifying or even recompiling the original class definitions. Customers and Orders do not need to know exactly how they are related or even if they are related. All that matters is that application code can get from a customer to its orders and from an order to its customer.

Dot notation naturally chains, so the properties remain composable. Internally, the relationship machinery must honor the primary-key and foreign-key constraints from the relational model, but will yield much more stable and decoupled class definitions at the object level. While it is too early to propose the precise syntax, it will resemble Listing 2.

Reshaping Combinators Cure Non-Composability

Since existing mapping solutions go all the way between tables and classes, and never tables-to-tables or classes-to-classes, it is impossible to build up mappings incrementally by composing smaller, more understandable mappings in sequence. You must make all mapping decisions at one time, dramatically increasing the cost (see sidebars). Contrast that with the composability of traditional database views, which you can build up block-by-block because combining one view with another yields a new view.

To cure mapping non-composability, we define a small set of reshaping combinators. Like the SQOs, these are static methods that take a reshaping and return a reshaping, and which can combine in nearly arbitrary ways for an unlimited variety of solutions. The authors worked with our colleague, Mark Shields, to determine a small set that suffices for all practical scenarios. While it is much too early to propose specifics, a rough explanation is that you
work in a simplified monad of tuples and compose an effective collection of operations like the following:

• Adding, dropping, and renaming fields
• Nesting and un-nesting tuples
• Generalizing and specializing field types
• Scalar transforms
• Arithmetic on field data

Listing 2: External relationships sketch

public class Customer
{
    ... // Notice no explicit reference to Orders
}

public class Order
{
    ... // No need to link to Customer data here, either
}

public extension Customer
{
    private EntitySet<Order> _Orders;

    [Association(Storage="_Orders", OtherKey="CustomerID")]
    public EntitySet<Order> Orders
    {
        get { return this._Orders; }
        set { this._Orders.Assign(value); }
    }
}

public extension Order
{
    private EntityRef<Customer> _Customer;

    [Association(Storage="_Customer", ThisKey="CustomerID")]
    public Customer Customer
    {
        get { return this._Customer.Entity; }
        set { this._Customer.Entity = value; }
    }
}

... // This shows how to use the induced properties
foreach (Order o in c.Orders)
{
    var a = o.Customer.Address;
    // can be chained ... they’re COMPOSABLE
}

Listing 3: Sample of join patterns

Class OnePlaceBuffer(Of T)

    Public Synchronous Put(t As T)
    Public Synchronous Take() As T
    Asynchronous __Empty()
    Asynchronous __Contains(t As T)

    Sub __Put_And_Empty(ByVal t As T)
        When Put, __Empty
        Me.__Contains(t)
    End Sub

    Function __Take_And_Contains(ByVal t As T) _
        As T
        When Take, __Contains
        Me.__Empty()
        Return t
    End Function

    Public Sub New()
        Me.__Empty()
    End Sub

End Class

Mapping Solved

With external relationships and reshaping combinators, you only need a very thin layer of non-programmable default mapping at the edges between the relational world and the object world. Developers will express everything else in their favorite LINQ-enabled programming languages.
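As a rough illustration of how reshaping operations such as renaming, adding, and dropping fields might compose, the sketch below models a tuple as a dictionary from field name to value and a reshaping as a function from tuple to tuple. Everything here is an assumption made for illustration: the Row alias, the combinator names Rename, Add, Drop, and Then, and the sample field names are hypothetical, and the authors' eventual combinator set may look quite different.

```csharp
using System;
using System.Collections.Generic;
// Assumption: a "tuple" is modeled as a field-name-to-value dictionary.
using Row = System.Collections.Generic.Dictionary<string, object>;

public static class Reshape
{
    // Rename one field, leaving all other fields untouched.
    public static Func<Row, Row> Rename(string oldName, string newName)
    {
        return row =>
        {
            var result = new Row(row);          // copy, so the input is not mutated
            result[newName] = result[oldName];
            result.Remove(oldName);
            return result;
        };
    }

    // Add a field whose value is computed from the incoming tuple.
    public static Func<Row, Row> Add(string field, Func<Row, object> compute)
    {
        return row =>
        {
            var result = new Row(row);
            result[field] = compute(row);
            return result;
        };
    }

    // Drop a field.
    public static Func<Row, Row> Drop(string field)
    {
        return row =>
        {
            var result = new Row(row);
            result.Remove(field);
            return result;
        };
    }

    // Sequential composition: takes a reshaping and returns a reshaping.
    public static Func<Row, Row> Then(this Func<Row, Row> first, Func<Row, Row> second)
    {
        return row => second(first(row));
    }

    // Build a larger reshaping out of three small ones and apply it once.
    public static Row Demo()
    {
        Func<Row, Row> reshape =
            Rename("cust_name", "Name")
                .Then(Add("Total", r => (int)r["qty"] * (decimal)r["unit_price"]))
                .Then(Drop("unit_price"));

        var input = new Row { { "cust_name", "Ann" }, { "qty", 3 }, { "unit_price", 2.5m } };
        return reshape(input);
    }
}

public static class Program
{
    public static void Main()
    {
        Row shaped = Reshape.Demo();
        Console.WriteLine("{0}: {1}", shaped["Name"], shaped["Total"]);
    }
}
```

Each combinator returns an ordinary function, so a mapping can be built up step by step and any intermediate reshaping can be reused or tested on its own, much as one database view can be defined over another.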

Just as LINQ 1.0 solved impedance mismatch by turning it into a solved problem (monad comprehensions), TESLA will solve the mapping problem by turning it into something solvable: external relationships and reshaping combinators.

The Cloud Reaches Across Space and Time

The other open problems we will attack in TESLA are distribution and concurrency, the new application design patterns for the Cloud.



In an increasingly distributed and service-oriented var o = new ListBox();
world, users demand the same experience every- //... set up UI ...
where they may be, using any device, and at any i.KeyUp += delegate {
time, connected or not. For developers, this means o.Items = Format(d.Suggest(i.Text, 10));
tackling distributed programming (multiple places)
and concurrent programming (multiple times). How You want to test and tweak this application purely
do you get there from the familiar world of one-ma- on the client tier until satisfied with its functional
chine, step-by-step applications? behavior. To deploy the Dictionary class as a ser-
vice on the middle-tier, just add a custom attribute
[RunAt(Server="...")] to the class declaration:
Assisted Tier-Splitting
[RunAt(Server="http://foo.baz.qux/...")]
For the space dimension, how can you get, with class Dictionary {...}
minimal agony, from a program that runs on one
machine to a program that runs on two machines, Guided by this metadata, the compiler rewrites
and from there, to any number of machines? You the MSIL of the client-side Dictionary class with a
want to avoid maintaining different application proxy that calls into a server-side stub to the origi-
builds for each client platform. You would even nal implementation. Client and server communi-
like to delay assigning functionality to client and cate via a stateful RPC-style protocol using JSON,
server until the very last responsible moment, XML, or S-expressions as the serialization format.
when you know the capabilities of the client de- The run-time infrastructure picks the format based
vice. on additional metadata in the custom attribute.

We want to start with an ordinary, single-tier ap-


plication, and then successively refactor the code
Modularity Saves
into a distributed application. To be clear, auto- By the converse of the Money
matic repartitioning is bad. However, there is plen- “Modularity” dictates that
ty of opportunity just for good tooling for program-
expansion theorem, parallel you reduce co-dependencies
mer-specified repartitioning. execution of the new programs amongst parts of a system. This
reduces the number of changes
Tier-splitting refactoring honors the converse
maintains invariants you must make at one time when
statement of the expansion theorem from process of the original sequential program. you update a system. That saves
algebra. In its forward direction, the theorem de- money because cost goes up
scribes the parallel composition of two programs very quickly with the number
as an arbitrary interleaving of a single sequential of changes made at one time.
program. We go the other way, describing a se- The reason is just entropy: the
quential program as a parallel composition of two As with all distributed applications, you must worry number of ways of making a
programs, ensuring that the parallel execution about network latency and availability. All remote mistake grows exponentially with
of the new programs maintains invariants of the calls should be asynchronous. Let the compiler the number of changes, while the
original sequential program. The transformation write a new overload for the Suggest method with number of ways of getting it right
inserts synchronization, marshaling, and security an [Async] custom attribute: does not grow. Modularity is a
logic using join patterns, described below. design dogma.
[Async]
To see how this works in practice, imagine a tiny extern void Suggest(string p, int n,
“dictionary-suggest” application that proposes Action<IEnumerable<Entry>> cont);
completions when a user types part of a word. The
Suggest method in class Dictionary returns the This new overload takes an additional parameter, a
topmost n descriptions that match a given prefix continuation delegate of type Action<...>. This con-
p. As a single-tier, sequential program, this is a tinuation takes as input the IEnumerable<Entry>
straightforward LINQ query over a collection of that was the output of the old, synchronous Suggest.
dictionary entries: This is a completely automatable rewrite into con-
tinuation-passing style:
IEnumerable<Entry> Suggest(string p, int n)
{ var o = new ListBox();
return (from e in this.entries ...
where p.Length > 0 && p.Matches(e) i.KeyUp += delegate {
select e).Take(n); d.Suggest(i.Text, 10,
} entries => {o.Items = Format(entries); });

The client program, at each keystroke, takes the


value of the textbox i, looks up the completions of Don’t Stop at Just Two Tiers
the word in the dictionary, and populates listbox o with the first 10 results, properly formatted:

var d = new Dictionary();
var i = new TextBox();

With our colleague Dragos Manolescu, we are pursuing much more broadly distributed real-world applications. This means continuing tier-splitting recursively beyond client and server until all computation happens close to its data and resources. Let’s move the dictionary query into the data tier and, perhaps, split the client tier to run UI in Silverlight, but leave the Dictionary proxy class in the HTML DOM.

On all tiers, we do not assume there is a native .NET runtime. Instead, we target any available runtime, adopting the point of view of a pluggable compiler back end. In the near term, the minimal client will be a browser with JavaScript. Therefore, we translate IL into JavaScript with very high fidelity, by deep embedding. Similarly, in the near term, the typical server will be a cluster of commodity-hardware blades. Therefore, we have a version of LINQ where GroupBy aggressively exploits the parallelism inherent in horizontally partitioned data.

Join Patterns

Finally, it has to be easier for developers to specify synchronization and replication of program states. This is essential for occasionally connected and collaborative scenarios. In the past, Microsoft offered little more than thread pools, message queues, monitors, semaphores, and so on; that is, low-level communication primitives. Overt process notations like the Pi-calculus also seem to be off the mark for mainstream developers working with object-oriented programming.

We deem C-omega style join patterns to be the most attractive alternative for TESLA and we are pursuing them for .NET languages with our colleague Claudio Russo.

Join patterns are a generalization of declarative events that developers already know well. With declarative events, the developer declares a disjunction (OR-chain) of events to which a set of event-handler methods must respond. In the following example, the Visual Basic Handles clause specifies that the method __Click_Or_Focus must run whenever either the Button.Click or the TextBox.Focus event is received:

Sub __Click_Or_Focus _
  (ByVal S As Object, ByVal E As EventArgs) _
  Handles Button.Click, TextBox.Focus
  ...
End Sub

Declarative join patterns substitute a conjunction (AND-chain) of messages for the disjunction of events. The new When clause is the parallel construction to the existing Handles clause. The When clause marks a method as one that must fire when all the specified messages have arrived. In caller code, messages just look like method calls, with asynchronous ones returning immediately and synchronous ones blocking.

For just a taste of this style of programming, consider Listing 3. __Put_And_Empty will fire when at least one Put message and at least one __Empty message arrive, and __Take_And_Contains will fire when at least one Take message and at least one __Contains message arrive.

Conclusion: Repeating the Success Formula

LINQ 1.0 broke ground for radically simplifying distributed data-intensive applications in .NET. TESLA will stretch the programming model to cover the Cloud.

Brian Beckman
Erik Meijer

Composability Saves Money

“Composability” dictates that you build up complex operations from smaller, generic operations, each as combinable or composable with as many of the others as possible. Both the SQOs and the reshaping combinators follow this principle. Composability implies modularity, and therefore saves money for exactly the same reason: by reducing co-dependencies and therefore the entropy costs of changing a system. Composability is a design dogma.
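
As a rough illustration of the composability principle from the "Composability Saves Money" sidebar, the sketch below builds a dictionary-style lookup ("first N matches, formatted") out of two small, generic operations over the standard LINQ query operators. The names ComposabilitySketch, StartingWith, FirstFormatted, and Lookup are hypothetical and chosen only for this example; they are not part of TESLA or of any Microsoft library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: every type and method name here is an assumption,
// not an API described in the article.
public static class ComposabilitySketch
{
    // Small operation 1: keep only the words that begin with a prefix
    // (case-insensitive).
    public static IEnumerable<string> StartingWith(
        this IEnumerable<string> words, string prefix)
    {
        return words.Where(
            w => w.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }

    // Small operation 2: take the first `count` items and number them.
    public static IEnumerable<string> FirstFormatted(
        this IEnumerable<string> items, int count)
    {
        return items.Take(count)
                    .Select((item, index) => (index + 1) + ". " + item);
    }

    // The composite operation is nothing more than the two pieces chained.
    public static List<string> Lookup(
        IEnumerable<string> words, string prefix, int count)
    {
        return words.StartingWith(prefix).FirstFormatted(count).ToList();
    }
}
```

Because each piece takes and returns IEnumerable<string>, either one can be reused or swapped out without touching the other, which is the reduction in co-dependencies the sidebar describes. For example, ComposabilitySketch.Lookup(new[] { "apple", "apricot", "banana" }, "ap", 10) yields the two entries "1. apple" and "2. apricot".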
