Replacing Entity Framework Classes With Record Types

I recently have been working on a mixed-language project (C# and F#) where the original implementation used the Entity Framework reverse POCO generator found here.  If you are not familiar, this Visual Studio add-in uses a T4 template to inspect a database and generate Entity Framework classes based on the tables that are located in the database.  The generator created two files for a given table: the entity and a class for configuration.

Capture

At the time, it probably was the best way to generate the hundreds of classes that were needed for the C# project.  However, now that we are introducing F# to the code base, it made sense to use the right tool for the job.

AFAIK, there are two ways you can use F# to do ORM.  Way #1 is to use type providers and way #2 is to have record types and hand-roll the connectivity.  I would have preferred to use a type provider to expose the database types in two lines of code, but since these types are being used by other C# and VB.NET projects, this was not possible.

Option 2 was to create record types.  Instead of hand writing the types, I decided to create a quick script that turns C# classes into F# types.  Since the each individual C# class was in its own file, the script traverses a directory and pull all of the C# files:

1 open System 2 open System.IO 3 open System.Collections.Generic 4 5 let path = @"C:\Git\..." 6 let folderInfo = System.IO.DirectoryInfo(path) 7 let files = folderInfo.GetFiles("*.cs")

Within each file, I needed to get both the type name and then the properties.  Getting the name was a text search for “public class” and the attributes was “get;set”

1 let parseClass (values: IEnumerable<string>) = 2 let className = 3 values 4 |> Seq.filter(fun l -> l.Contains("public class")) 5 let typeName = 6 match className |> Seq.length with 7 | 0 -> None 8 | _ -> Some (className |> Seq.head) 9 let propNames = 10 values 11 |> Seq.filter(fun l -> l.Contains("{ get; set; }")) 12 typeName, propNames

With the parsed values (1 class name and an array of attributes), I could then create the type name and the type attributes:

1 let createTypeName (className:string option) = 2 match className with 3 | Some cn -> 4 let typeName = cn.Replace(" public class"," ").Trim() 5 match(typeName.Contains("Configuration")) with 6 | true -> None 7 | false -> Some typeName 8 | None -> None 9 10 let reverseValues (typeAttribute:string) = 11 let tokens = typeAttribute.Split(' ') 12 match tokens.Length with 13 | 0 | 1 -> "" 14 | _ -> tokens.[1] + ":" + tokens.[0] 15 16 let createTypeAttributes (items: IEnumerable<string>) = 17 let temp = 18 items 19 |> Seq.map(fun i -> i.Replace("public","")) 20 |> Seq.map(fun i -> i.Replace("{ get; set; }","")) 21 |> Seq.map(fun i -> i, i.IndexOf("//")) 22 |> Seq.filter(fun (i,t) -> t > 0) 23 |> Seq.map(fun (i,t) -> i.Substring(0,t)) 24 |> Seq.map(fun i -> i.Trim()) 25 |> Seq.map(fun i -> reverseValues i) 26 match Seq.length temp with 27 | 0 -> "" 28 | _ -> Seq.reduce(fun acc elem -> acc + ";" + elem) temp

With those values, set, I could then create the types:

1 let createType (values:string option * IEnumerable<string>) = 2 createTypeName(fst values), 3 createTypeAttributes (snd values)

and then it was just a matter of putting it all together:

1 let printType (typeName:string) (typeAttributes: string) = 2 printfn "type %s {%s}" typeName typeAttributes 3 4 files 5 |> Array.map(fun f -> f.FullName) 6 |> Array.map(fun fn -> fn, File.ReadLines(fn)) 7 |> Array.map(fun (fn,c) -> parseClass c) 8 |> Array.map(fun x -> createType x) 9 |> Array.filter(fun (x,y) -> x.IsSome) 10 |> Array.map(fun (x,y) -> x.Value, y) 11 |> Array.iter(fun (x,y) -> printType x y)

With these values, I could create a single file and have all of my domain objects in 1 place – with about 95% less noise code.

Git is here

Basic Insert Operation Using F#

So the more I use F#,  the more I come to understand the benefits and limitations of the language.  Since I spend a majority of my day job in C# and JavaScript, it is a natural comparison between these two languages and F#.  One of the tenants of F# is ‘less noise, more signal’.  After looking at some projects, I am coming to the conclusion that Entity Framework, LinqToSql, <Any other ORM> is just noise.  It is expensive noise at that – if you have worked on a production app using EF and you to do anything outside of the examples on MSDN, you know what I mean.

So can EF type providers replace the overhead, code bloat, and additional costs of Entity Framework?  I decided to do a small test to see.  I needed to load into SqlServer 27,481 records of crash data that I get from the North Carolina DOT.  The records came to me in an Excel Spreadsheet which I pumped into MSAccess.  Then, instead of using SQl Server SSIS/Bulk Data Load functions, I decided to create an app that pulls that data from the Access database and load it into the SqlServer database via a type provider.

My first step was to look for a MSAccess type provider.  No luck.  I then hit up Stack Overflow and found this and this article for working with Access.  I coded up a solution to get the data into a DataReader like so

  1. static member GetCrashData =
  2.     let connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=E:\Documents\Road Alert\WakeCountyCrashes.accdb; Persist Security Info=False;"
  3.     use connection = new OleDbConnection(connectionString)
  4.     let commandText = "Select * from Data"
  5.     use command = new OleDbCommand(commandText,connection)
  6.     connection.Open()
  7.     use reader = command.ExecuteReader()

My first attempt to get the data from the read was a tuple like so:

  1. [while reader.Read() do
  2.      yield reader.GetInt32(0),
  3.      reader.GetFieldValue(1).ToString(),
  4.      reader.GetFieldValue(2).ToString(),
  5.      reader.GetDouble(3),
  6.      reader.GetFieldValue(4).ToString(),
  7.      reader.GetFieldValue(5).ToString(),
  8.      reader.GetFieldValue(6).ToString(),
  9.      reader.GetDateTime(7),
  10.      reader.GetDateTime(8),
  11.      reader.GetFieldValue(9).ToString(),
  12.      reader.GetFieldValue(10).ToString()
  13.  ]

Sure enough, it works like a champ from my C# UI (once I ran AccessDatabaseEngine.exe on my machine – sigh)

  1. private static void GetCrashData()
  2. {
  3.     var results = CrashDataLoader.GetCrashData;
  4.     Console.WriteLine(results.Count().ToString());
  5. }

Gives

image

The next thing I did was to create a Type Provider

image

And then create the method to insert the data into the database:

  1. static member LoadCrashData  =
  2.     let targetDatabase = targetSchema.GetDataContext()
  3.     let rows = CrashDataLoader.GetCrashData
  4.     targetDatabase.TrafficCrashes.InsertAllOnSubmit(rows)
  5.     targetDatabase.DataContext.SubmitChanges()
  6.     true

The problem is that I ran into was that the GetCrashData was returning a Tuple and the LoadCrashData was expecting a Typed CrashData element.  I searched for a bit and then gave up trying to figure out how to map the two without explicitly assigning each field.  So then I did it the old fashion way like so:

  1. static member TrafficCrashFromReader(reader: OleDbDataReader) =
  2.     let trafficCrash = new targetSchema.ServiceTypes.TrafficCrashes()
  3.     trafficCrash.NCDmvCrashId <- System.Nullable<float> (float (reader.GetFieldValue(0).ToString()))
  4.     trafficCrash.Municipality <- reader.GetFieldValue(1).ToString()  
  5.     trafficCrash.OnRoad <- reader.GetFieldValue(2).ToString()  
  6.     trafficCrash.Miles <- System.Nullable<double> (double (reader.GetFieldValue(3).ToString()))
  7.     trafficCrash.Direction <- reader.GetFieldValue(4).ToString()
  8.     trafficCrash.FromRoad <- reader.GetFieldValue(5).ToString()
  9.     trafficCrash.TowardRoad <- reader.GetFieldValue(6).ToString()
  10.     trafficCrash.DateOfCrash <- System.Nullable<DateTime> (reader.GetDateTime(7))
  11.     trafficCrash.TimeOfCrash <- System.Nullable<DateTime> (reader.GetDateTime(8))
  12.     trafficCrash.CrashType <- reader.GetFieldValue(9).ToString()
  13.     trafficCrash.CrashSeverity <- reader.GetFieldValue(10).ToString()
  14.     trafficCrash

The fact that I am using the <- symbol is a code smell, but I am not sure how to get around it.

In any event, once I ran it in my Console app:

  1. private static void LoadCrashData()
  2. {
  3.     Stopwatch stopWatch = new Stopwatch();
  4.     stopWatch.Start();
  5.     CrashDataLoader.LoadCrashData();
  6.     stopWatch.Stop();
  7.     Console.WriteLine("The load took " + stopWatch.Elapsed.TotalSeconds + " seconds.");
  8. }

I got nothing after 30m minutes!  Yikes.

I then went back and wrote a function to insert 1 row at a time:

  1. static member LoadCrashDataRow dataRow =
  2.     let targetDatabase = targetSchema.GetDataContext()
  3.     targetDatabase.TrafficCrashes.InsertOnSubmit(dataRow)
  4.     targetDatabase.DataContext.SubmitChanges()
  5.     true

And the consuming app:

  1. private static void LoadCrashData()
  2. {
  3.  
  4.     var crashRows = CrashDataLoader.GetCrashData;
  5.     Stopwatch stopWatch = new Stopwatch();
  6.     stopWatch.Start();
  7.     foreach (var crashRow in crashRows)
  8.     {
  9.         CrashDataLoader.LoadCrashDataRow(crashRow);
  10.         Console.WriteLine(crashRow.NCDmvCrashId + " loaded.");
  11.     }
  12.     stopWatch.Stop();
  13.     Console.WriteLine("The load took " + stopWatch.Elapsed.TotalSeconds + " seconds.");
  14. }

Sure enough, it works like a champ.

image

So it is slow – though I am not sure EF is any faster.  But not having to deal with that .edmx files, the .tt files, the whatever-else-we-throw-in files, I think further research is definitely warranted.  Also, there are some other things I think F# Type Providers need to have:

1) Ability to handle proxies

2) Making Plural tables singular (The table name is Crashes, the type should be Crash)

3) An MS Access TP would be great

4) An Azure Sql Database TP would be doubly great 

Self-Referencing Keys Gotcha in EF

Looking at the Employee table in Northwind, there is a self-referencing key on the ‘ReportsTo’ field

image

When a series of new employees are created, you might run into problems trying to commit those changes all at once.  For example:

  1. static void Main(string[] args)
  2. {
  3.     Console.WriteLine("Start");
  4.  
  5.     using (NorthwindEntities entities = new NorthwindEntities())
  6.     {
  7.         Employee employee1 = new Employee();
  8.         employee1.FirstName = "Test";
  9.         employee1.LastName = "Employee1";
  10.         entities.Employees.Add(employee1);
  11.  
  12.         Employee employee2 = new Employee();
  13.         employee2.FirstName = "Test";
  14.         employee2.LastName = "Employee2";
  15.         employee2.ReportsTo = employee1.EmployeeID;
  16.         entities.Employees.Add(employee2);
  17.  
  18.         entities.SaveChanges();
  19.     }
  20.  
  21.     Console.WriteLine("-Stop-");
  22.     Console.ReadKey();
  23. }

 

When you run this, you get this kind of exception:

image

 

The reason is that the PK of the 1st employee is 0 and the PK of the second employee is ….. 0.  Since both are created on the client and have not had the ‘real’ primary key assigned by the database, the FK gets confused because there are 2 Primary Keys with the same value.  And according to these calculations, that is impossible.

The way to fix this problem is to move the SaveChanges() to after each add like this:

  1. using (NorthwindEntities entities = new NorthwindEntities())
  2. {
  3.     Employee employee1 = new Employee();
  4.     employee1.FirstName = "Test";
  5.     employee1.LastName = "Employee1";
  6.     entities.Employees.Add(employee1);
  7.     entities.SaveChanges();
  8.  
  9.     Employee employee2 = new Employee();
  10.     employee2.FirstName = "Test";
  11.     employee2.LastName = "Employee2";
  12.     entities.Employees.Add(employee2);
  13.     employee2.ReportsTo = employee1.EmployeeID;
  14.     entities.SaveChanges();
  15.  
  16. }

Or to not associate the FK until the 1st bulk commit like this;

  1. using (NorthwindEntities entities = new NorthwindEntities())
  2. {
  3.     Employee employee1 = new Employee();
  4.     employee1.FirstName = "Test";
  5.     employee1.LastName = "Employee1";
  6.     entities.Employees.Add(employee1);
  7.  
  8.     Employee employee2 = new Employee();
  9.     employee2.FirstName = "Test";
  10.     employee2.LastName = "Employee2";
  11.     entities.Employees.Add(employee2);
  12.     entities.SaveChanges();
  13.  
  14.     employee2.ReportsTo = employee1.EmployeeID;
  15.     entities.SaveChanges();
  16.  
  17. }

Switch Databases Using EF: Part Deux

Consider you are a company where you want to keep each of your customer’s data in its own database on the same server.  Each of the databases has the identical schema.  For example:

image

In the Data Access Layer, you are using Entity Framework and you want to move between databases at run time.  I blogged about one possible solution to this problem here, but I was thinking about this some more and I have a different (better?) idea. 

The first thing to realize is that EF does not make this easy.  There is not a Contxext.ChangeDatabase(String) method and the constructor does not allow to pass in a connection string.  This tight-coupling smells of poor design, but fortunately this is not insurmountable.

The first step is to create a Model (NorthwindModel in this case).  When you inspect the context, you can see that Microsoft made the entities a partial class

image

So you can add a class to your project like so:

  1. public partial class NorthwindEntities
  2. {
  3.     public NorthwindEntities(String connectionString)
  4.     {
  5.         base.Database.Connection.ConnectionString = connectionString;
  6.     }
  7.  
  8. }

 

Then, in your consuming class, you can create a method like this:

  1. internal static String CreateConnectionStringWithCustomDatabase(String baseConnectionString, String database)
  2. {
  3.     SqlConnectionStringBuilder connectionStringBuilder = new SqlConnectionStringBuilder(baseConnectionString);
  4.     connectionStringBuilder.InitialCatalog = database;
  5.     return connectionStringBuilder.ToString();
  6. }

And then you can switch databases.  Just pass in the default connection string and the new database name like so:

  1. static void Main(string[] args)
  2. {
  3.     Console.WriteLine("Start");
  4.     String defaultConnectionString = "data source=DIXON12;initial catalog=TBD;integrated security=True;MultipleActiveResultSets=True";
  5.     String actualConnectionString = CreateConnectionStringWithCustomDatabase(defaultConnectionString, "Northwind");
  6.  
  7.     using (NorthwindEntities entities = new NorthwindEntities(actualConnectionString))
  8.     {
  9.         foreach (var product in entities.Products)
  10.         {
  11.             Console.WriteLine(product.ProductName);
  12.         }
  13.     }
  14.  
  15.     Console.WriteLine("-End-");
  16.     Console.ReadKey();
  17.  
  18. }

 

And you can change databases on the fly:

image

The ’gotcha’ to realize is that Entity Framework requires that you have a default value in the app/web.config file.  If not, you will get this exception:

image

This smells like too much coupling, but that is a common theme with EF.  So make sure you have a default EF connection string in the .config – even if you don’t even use it.

Entity Framework–> Change Database

I was working with Entity Frameworks over the weekend when I ran into some sloppy programming on Microsoft’s part.  I made a project that uses Entity Frameworks to connection to Northwind.  I then added method to change the database at runtime like so:

public class NorthwindFactory
{
    public static void ChangeDatabase(String newDatabaseName)
    {
        using (NorthwindEntities entities = new NorthwindEntities())
        {
            entities.Connection.ChangeDatabase(newDatabaseName);
        }
    }
}

When I checked out intellisense on the ChangeDatabase method, there was nothing to indicate anything unusual.

image

When I invoked the method, I got the following exception:

image

Sure enough, when I go to MSDN help online, ChangeDatabase() is not supported in 4.5 and 4.0, but it is supported in 3.5. 

image

This is a clear violation of the Liskov Substitution Principle.  It is a also sloppy – they could have marked it obsolete or, at least, changed the XML code comments to reflect the fact that it is not supported.  Instead, they left it up to the developer to get the exception the first time s/he invokes it.

So instead of using the ChangeDatabase() method, I needed to use the SqlConnectionStringBuilder and EntityConnectionStringBuilder like so:

SqlConnectionStringBuilder sqlConnectionStringBuilder = 
    new SqlConnectionStringBuilder();
sqlConnectionStringBuilder.DataSource = ".";
sqlConnectionStringBuilder.InitialCatalog = newDatabaseName;
sqlConnectionStringBuilder.IntegratedSecurity = true;

EntityConnectionStringBuilder entityConnectionStringBuilder = 
    new EntityConnectionStringBuilder();
entityConnectionStringBuilder.Metadata = 
    @"res://*/NorthwindModel.csdl|res://*/NorthwindModel.ssdl|res://*/NorthwindModel.msl";
entityConnectionStringBuilder.Provider = "System.Data.SqlClient";
entityConnectionStringBuilder.ProviderConnectionString = 
    sqlConnectionStringBuilder.ConnectionString;

The problem is that I can’t use that entityConnectionStringBuilder inside of the using block of the Entity that I want to change:

image

So to switch databases, I have to drop the “using” statement.  The below code snippet works

NorthwindEntities entities = new NorthwindEntities();
SqlConnectionStringBuilder sqlConnectionStringBuilder = 
    new SqlConnectionStringBuilder();
sqlConnectionStringBuilder.DataSource = ".";
sqlConnectionStringBuilder.InitialCatalog = newDatabaseName;
sqlConnectionStringBuilder.IntegratedSecurity = true;

EntityConnectionStringBuilder entityConnectionStringBuilder = 
    new EntityConnectionStringBuilder();
entityConnectionStringBuilder.Metadata = 
    @"res://*/NorthwindModel.csdl|res://*/NorthwindModel.ssdl|res://*/NorthwindModel.msl";
entityConnectionStringBuilder.Provider = "System.Data.SqlClient";
entityConnectionStringBuilder.ProviderConnectionString = 
    sqlConnectionStringBuilder.ConnectionString;

entities = new NorthwindEntities(entityConnectionStringBuilder.ConnectionString);

I am not sure Microsoft thought this one through…

Entity Framework Using Stored Procedures

Entity Framework is really designed to access your tables directly, but you can use stored procedures.  This is typical where you have a legacy system where there is lots of business logic embedded in your stored procedures and you can’t afford to do a rewrite.

Taking the Northwind database, I added stored procedures to handle the typical CRUD operations to the Product table:

Create procedure usp_GetProducts
as
Select * from Products
GO

Create procedure usp_UpdateProduct
@productId int,
@productName nvarchar(40)
as
Update Products
Set ProductName = @productName
Where ProductID = @productId
GO

Create procedure usp_DeleteProduct
@productId int
as
Delete Products
Where ProductID = @productId
GO

Create procedure usp_InsertProduct
@productName nvarchar(40)
as
Insert Products
Values(
@productName,null,null,null,null,
null,null,null,0)

GO

I then wired up a EF model in Visual Studio using the EF designer support pointing at the tables so I wouldn’t have to create the entities by hand:

 

image

I then added the four stored procedures that I just created – note that the .edmx designer does not show the sps graphically.

Taking the Product entity, I then selected the “Stored Procedure Mapping” context menu:

 

image

To get the following menu:

image

It was easy enough then to add the stored procedures that I just created to the entity:

image

The problem is then with the R of CRUD – there is no designer support for Selecting the Products.  You have to do this in code like so:

public NorthwindRepository()
{
    using (NorthwindEntities entities = new NorthwindEntities())
    {
        String commandText = "usp_GetProducts";
        var productQuery = entities.ExecuteStoreQuery<Product>(commandText,null);
        foreach (Product product in productQuery)
        {
            //TODO
        }
    }
}

Once you have that, you can use stored procedures in your project.  Note that many stored procedures are not simple select statements but bring in extra fields like this:

Create procedure usp_GetProducts
as
Select
P.ProductID,
P.ProductName,
P.SupplierID,
S.CompanyName
from Products as P
Inner join Suppliers as S
on P.SupplierID = S.SupplierID
GO

Is that is the case, you CAN’T just use the entity based on the table and pump in that stored procedure – the fields must match exactly – name, data types, etc.  If they don’t match, you get a exception like this:

image