LINQ To SQL: Surprising Features
In a previous post, I've outlined my subjective opinion regarding the differentiating features of Entity Framework compared with LINQ to SQL. This time, I'd like to focus on a couple of possibly surprising LINQ to SQL features that do not fall within the straight and narrow MSDN samples and conference presentations.
Assume that we're developing a three-tier application that sounds like the classic candidate for ADO.NET Data Services or the Entity Framework. We have a database, which stores a relatively simple object model. We have a set of WCF services that query and expose that model. We have a variety of clients communicating with these WCF services that retrieve some parts of the data. I've apparently just described a good 90% of software projects.
One primary requirement from this kind of project is following DRY - the Don't Repeat Yourself principle (and its good friend, the Once and Only Once principle). The piece of code that we will be very tempted to repeat has to do with the objects representing our data. If we were to use LINQ to SQL as the data accessor, we would have a set of classes representing our database tables. And then if we were to communicate this data across WCF boundaries, we would have another set of classes representing our data contracts. On the client side, we might need yet another set of classes for binding the data to the UI. The single primary risk here is that any change to the database schema cascades to changes in multiple locations in our code.
Why am I stating the obvious, though? Of course we don't want to repeat ourselves. That's what ADO.NET Data Services are meant for. And that's what ASP.NET Dynamic Data offers. And assuming a non-web context, the UI binding I asked for? WinForms feature all that, not to mention WPF with its even more powerful data binding concepts.
But can we do that with LINQ to SQL?
The canonical LINQ to SQL demo (in a smart-client context) normally goes like this:
- Presenter mentions in brief that they will be using the Northwind database (everybody loves Northwind)
- Presenter clicks a couple of buttons and makes a couple of menu selections
- A design surface appears
- Presenter drags database tables to the design surface (isn't that a whole deal of fun, and the very purpose of LINQ to SQL?)
- Presenter proceeds to write a three-line query against the automagically generated classes
- Presenter clicks a few more buttons and makes some more menu selections
- A cool UI appears, with a grid displaying Northwind data!
Adding a middle tier is out of the question. What kind of audience would endure writing WCF data contracts by hand?
My kind of audience. A gigantic LINQ to SQL myth is that designer-generated attributes are the only mapping mechanism supported, and that your must work schema-first: Without a database in place, there's no way of getting the magic. Is there some kind of Entity Framework conspiracy in the wild?
LINQ to SQL has full support for mapping objects to the database schema (i.e. working classes-first, schema-later). LINQ to SQL has full support for specifying the mapping information in separate XML files, which can be modified without any changes to the code. LINQ to SQL has full support for mapping Plain Old CLR Objects to your database schema, and for generating your database schema from those objects. It won't let you map a single class to multiple database tables. Nor would it make you a good cup of coffee. But I digress.
We'll be working with the following database, aptly named Not Northwind At All:
Policies govern a quota for the number of operations of a particular type that can be performed within a given time period. Operations record the fact that the operation was performed, and introduce the concept of compensation - an operation which reverts the quota effect of a previously performed operation.
Without any surface-making and table-dragging, let's open a Visual Studio command prompt and execute the following command:
sqlmetal /conn:"Data Source = localhost\SQLEXPRESS; Initial Catalog = NotNorthwindAtAll; Integrated Security=True; Pooling=False" /code:DeleteMe.cs /map:NotNorthwindAtAll.xml
The result is two files: A good 300+ lines of code, which we will throw away immediately (because that's the designer-generated classes we don't want to use), and an XML mapping file which maps the schema to CLR types. In its raw form, the XML is a bit on the verbose side, so let's trim it down a little bit to produce this:
<?xml version="1.0" encoding="utf-8"?>
<Database Name="NotNorthwindAtAll"
xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
<Table Name="dbo.Operations" Member="Operations">
<Type Name="Contracts.Operation">
<Column Name="Id" Member="Id"
DbType="UniqueIdentifier NOT NULL"
IsPrimaryKey="true" />
<Column Name="PolicyId" Member="PolicyId"
DbType="UniqueIdentifier NOT NULL" />
<Column Name="TimeStamp" Member="TimeStamp"
DbType="DateTime NOT NULL" />
<Column Name="IsCompensation" Member="IsCompensation"
DbType="Bit NOT NULL" />
<Association Name="FK_Operations_Policies"
Member="Policies"
ThisKey="PolicyId" OtherKey="Id"
IsForeignKey="true" />
</Type>
</Table>
<Table Name="dbo.Policies" Member="Policies">
<Type Name="Contracts.Policy">
<Column Name="Id" Member="Id"
DbType="UniqueIdentifier NOT NULL"
IsPrimaryKey="true" />
<Column Name="Name" Member="Name"
DbType="NVarChar(MAX) NOT NULL"
CanBeNull="false" UpdateCheck="Never" />
<Column Name="Quota" Member="Quota"
DbType="Int NOT NULL" />
<Column Name="PeriodHours" Member="PeriodHours"
DbType="Int NOT NULL" />
<Association Name="FK_Operations_Policies"
Member="Operations"
ThisKey="Id" OtherKey="PolicyId"
DeleteRule="NO ACTION" />
</Type>
</Table>
</Database>
What we have here is a simple mapping from the dbo.Operations and dbo.Policies tables to the Contracts.Operation and Contracts.Policy CLR types. Since we have just thrown away the C# code implementing these types, we have to write them ourselves (or we could take the designer-generated code and call it our own).
Before we do that, though, let me address a reasonable concern. I said above that LINQ to SQL lets you work objects-first. I said above that I won't be table-dragging, but instead I used sqlmetal (that counts). Am I experiencing an acute case of schizophrenia?
I think not. We're using sqlmetal because it's convenient; we can take the designer-generated code and edit it because it's convenient. But we're not locked into using designer-generated code. We could have rolled the data contracts first (because that's what business requirements dictated) and add the schema and mapping later. We could have generated the whole database from that schema, in fact (and at the end of the post, I demonstrate how to do just that).
So let's try it the other way around. Assume the following WCF-ready data contracts:
[DataContract]
public partial class Policy
{
[DataMember]
public string Name { get; set; }
}
[DataContract]
public partial class Operation
{
[DataMember]
public Policy Policy { get; set; }
[DataMember]
public DateTime TimeStamp { get; set; }
[DataMember]
public bool IsCompensation { get; set; }
}
That's the data we want the client to send over the wire when an operation is performed. That's the data we want our reporting clients to retrieve when they need to aggregate operations and quotas. That's the data contract. (Oh, and WCF in .NET 3.5 SP1 will have POCO support, so we will be able to omit the data contract attributes.)
How do we map that with the above XML? Namely, where are the Id fields? Where are all the policy details? And where in the XML is the association between the Operation.Policy property and the actual Policy object?
Well, to be strict, they aren't there. We need to add them. But there's one thing we are not going to do. We are not going to write a new set of classes called Policy and Operation and whatnot, and then convert from these classes to our data contracts. Instead, let's use the fact we've made the data contracts partial:
namespace Contracts
{
partial class Policy
{
public Guid Id { get; set; }
public int Quota { get; set; }
public int PeriodHours { get; set; }
}
}
We have extended the data contract so that it fits our expanded view of the database schema (in fact, we have quadrupled the number of properties on the Policy object!). From here, we can move on to actually implementing our first operation. Our service contract will be the following:
[ServiceContract]
public interface IPolicyService
{
[OperationContract]
bool PerformOperation(Operation op);
[OperationContract]
Policy[] GetAvailablePolicies();
}
We'll focus on GetAvailablePolicies first. We need to create a DataContext and run some LINQ magic against the Policies table. But first, we need to specify that our data context will use an XML-based mapping and not the standard attribute mapping. This can be encapsulated in the following class, assuming that the XML mapping is built as an embedded resource:
internal static class DataContextUtil
{
const string ConnectionString =
@"Data Source=localhost\SQLEXPRESS;Initial Catalog=NotNorthwindAtAll;Integrated Security=True;Pooling=False";
public static DataContext GetContext()
{
using (Stream mappingStream = Assembly.GetExecutingAssembly()
.GetManifestResourceStream("Service.NotNorthwindAtAll.xml"))
{
XmlMappingSource mapping =
XmlMappingSource.FromStream(mappingStream);
DataContext context = new DataContext(
ConnectionString, mapping);
context.Log = Console.Out;
return context;
}
}
}
Note the assignment to the DataContext.Log property, making all SQL statements executed by the data context visible through the console. This makes the GetAvailablePolicies implementation rather straightforward, just as it should be:
public Policy[] GetAvailablePolicies()
{
using (DataContext context = DataContextUtil.GetContext())
{
return (from policy in context.GetTable<Policy>()
select policy).ToArray();
}
}
Here's the SQL output that's occurring behind the scenes as the client calls the operation:
SELECT [t0].[Id], [t0].[Quota], [t0].[PeriodHours], [t0].[Name]
FROM [dbo].[Policies] AS [t0]
-- Context: SqlProvider(Sql2005) Model: MappedMetaModel Build: 3.5.21022.8
Our next challenge is implementing the PerformOperation logic. First, we need to extend the data contract so that it fits our database view of things:
namespace Contracts
{
partial class Policy
{
public Guid Id { get; set; }
public int Quota { get; set; }
public int PeriodHours { get; set; }
}
partial class Operation
{
public Guid Id { get; set; }
public Guid PolicyId { get; set; }
}
}
Note that we haven't done anything to ensure that LINQ to SQL correctly interprets the Operation.Policy property. It's actually fairly simple - we can use EntityRef<Policy> to capture this requirement. This is left as an exercise for the reader.
From here, we can move to the (fairly ugly but sufficient) implementation of PerformOperation. It's really a good fit for a stored procedure, and if we used a store procedure we would map it to a CLR method.
public bool PerformOperation(Operation operation)
{
//This really belongs in a stored procedure.
using (DataContext context = DataContextUtil.GetContext())
{
Policy policy =
(from pol in context.GetTable<Policy>()
where pol.Name == operation.Policy.Name
select pol).Single(); //Assures it exists
int invocations =
(from op in context.GetTable<Operation>()
where op.PolicyId == policy.Id
where op.TimeStamp > DateTime.Now.AddHours(-policy.PeriodHours)
select op).Count();
if (invocations >= policy.Quota)
return false; //Quota exceeded
operation.Id = Guid.NewGuid();
operation.PolicyId = policy.Id;
context.GetTable<Operation>()
.InsertOnSubmit(operation);
context.SubmitChanges();
return true;
}
}
If the quota has been exceeded, this generates two trips to the database - one to retrieve the policy information and another to count the operations already performed in the relevant timespan. If the quota has not been exceeded, there's the additional trip required to insert the new invocation. That's why a stored procedure would be so beneficial - it would cut down the number of trips to 1. Another thing to mention is that I haven't implemented counting compensations - which could potentially cost us another database trip (implementing this is left as an exercise for the reader).
Another thing to note here is that the Operation object we receive as a parameter is not initialized with an Id or a PolicyId. We need to manually specify this information before we go on and insert the operation to the Operations table.
The full solution with everything we did above can be downloaded from my SkyDrive. To facilitate testing, I've included some startup code that creates the database when the service starts and inserts the policy information into the Policies table. How hard was it to create the database?
using (DataContext context = DataContextUtil.GetContext())
{
if (context.DatabaseExists())
context.DeleteDatabase();
context.CreateDatabase();
}
Not very. Time to wrap things up.
What have we done here? We have taken a single representation of the data, which was well-suited for transferring it over WCF, and adapted it so that it can be used directly with our LINQ to SQL data accessor. We added information required for database normalization, but we didn't repeat ourselves: Not even one of the data properties appears in the program more than once. This non-intrusive capability of benefiting from LINQ to SQL is a key metric of its success.
LINQ to SQL is not a drag and drop technology. I've just started to embrace this notion - and it doesn't play well with the click-choose-click LINQ presentations. You don't have to wait for Visual Studio 2008 SP1 - you can embrace a fully-featured object-relational mapper in your applications, now - and since November 26, 2007, for that matter. It's there. It's waiting. It's called LINQ to SQL.