Basic Insert Operation Using F#
January 28, 2014 4 Comments
So the more I use F#, the more I come to understand the benefits and limitations of the language. Since I spend a majority of my day job in C# and JavaScript, it is a natural comparison between these two languages and F#. One of the tenants of F# is ‘less noise, more signal’. After looking at some projects, I am coming to the conclusion that Entity Framework, LinqToSql, <Any other ORM> is just noise. It is expensive noise at that – if you have worked on a production app using EF and you to do anything outside of the examples on MSDN, you know what I mean.
So can EF type providers replace the overhead, code bloat, and additional costs of Entity Framework? I decided to do a small test to see. I needed to load into SqlServer 27,481 records of crash data that I get from the North Carolina DOT. The records came to me in an Excel Spreadsheet which I pumped into MSAccess. Then, instead of using SQl Server SSIS/Bulk Data Load functions, I decided to create an app that pulls that data from the Access database and load it into the SqlServer database via a type provider.
My first step was to look for a MSAccess type provider. No luck. I then hit up Stack Overflow and found this and this article for working with Access. I coded up a solution to get the data into a DataReader like so
- static member GetCrashData =
- let connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=E:\Documents\Road Alert\WakeCountyCrashes.accdb; Persist Security Info=False;"
- use connection = new OleDbConnection(connectionString)
- let commandText = "Select * from Data"
- use command = new OleDbCommand(commandText,connection)
- connection.Open()
- use reader = command.ExecuteReader()
My first attempt to get the data from the read was a tuple like so:
- [while reader.Read() do
- yield reader.GetInt32(0),
- reader.GetFieldValue(1).ToString(),
- reader.GetFieldValue(2).ToString(),
- reader.GetDouble(3),
- reader.GetFieldValue(4).ToString(),
- reader.GetFieldValue(5).ToString(),
- reader.GetFieldValue(6).ToString(),
- reader.GetDateTime(7),
- reader.GetDateTime(8),
- reader.GetFieldValue(9).ToString(),
- reader.GetFieldValue(10).ToString()
- ]
Sure enough, it works like a champ from my C# UI (once I ran AccessDatabaseEngine.exe on my machine – sigh)
- private static void GetCrashData()
- {
- var results = CrashDataLoader.GetCrashData;
- Console.WriteLine(results.Count().ToString());
- }
Gives
The next thing I did was to create a Type Provider
And then create the method to insert the data into the database:
- static member LoadCrashData =
- let targetDatabase = targetSchema.GetDataContext()
- let rows = CrashDataLoader.GetCrashData
- targetDatabase.TrafficCrashes.InsertAllOnSubmit(rows)
- targetDatabase.DataContext.SubmitChanges()
- true
The problem is that I ran into was that the GetCrashData was returning a Tuple and the LoadCrashData was expecting a Typed CrashData element. I searched for a bit and then gave up trying to figure out how to map the two without explicitly assigning each field. So then I did it the old fashion way like so:
- static member TrafficCrashFromReader(reader: OleDbDataReader) =
- let trafficCrash = new targetSchema.ServiceTypes.TrafficCrashes()
- trafficCrash.NCDmvCrashId <- System.Nullable<float> (float (reader.GetFieldValue(0).ToString()))
- trafficCrash.Municipality <- reader.GetFieldValue(1).ToString()
- trafficCrash.OnRoad <- reader.GetFieldValue(2).ToString()
- trafficCrash.Miles <- System.Nullable<double> (double (reader.GetFieldValue(3).ToString()))
- trafficCrash.Direction <- reader.GetFieldValue(4).ToString()
- trafficCrash.FromRoad <- reader.GetFieldValue(5).ToString()
- trafficCrash.TowardRoad <- reader.GetFieldValue(6).ToString()
- trafficCrash.DateOfCrash <- System.Nullable<DateTime> (reader.GetDateTime(7))
- trafficCrash.TimeOfCrash <- System.Nullable<DateTime> (reader.GetDateTime(8))
- trafficCrash.CrashType <- reader.GetFieldValue(9).ToString()
- trafficCrash.CrashSeverity <- reader.GetFieldValue(10).ToString()
- trafficCrash
The fact that I am using the <- symbol is a code smell, but I am not sure how to get around it.
In any event, once I ran it in my Console app:
- private static void LoadCrashData()
- {
- Stopwatch stopWatch = new Stopwatch();
- stopWatch.Start();
- CrashDataLoader.LoadCrashData();
- stopWatch.Stop();
- Console.WriteLine("The load took " + stopWatch.Elapsed.TotalSeconds + " seconds.");
- }
I got nothing after 30m minutes! Yikes.
I then went back and wrote a function to insert 1 row at a time:
- static member LoadCrashDataRow dataRow =
- let targetDatabase = targetSchema.GetDataContext()
- targetDatabase.TrafficCrashes.InsertOnSubmit(dataRow)
- targetDatabase.DataContext.SubmitChanges()
- true
And the consuming app:
- private static void LoadCrashData()
- {
- var crashRows = CrashDataLoader.GetCrashData;
- Stopwatch stopWatch = new Stopwatch();
- stopWatch.Start();
- foreach (var crashRow in crashRows)
- {
- CrashDataLoader.LoadCrashDataRow(crashRow);
- Console.WriteLine(crashRow.NCDmvCrashId + " loaded.");
- }
- stopWatch.Stop();
- Console.WriteLine("The load took " + stopWatch.Elapsed.TotalSeconds + " seconds.");
- }
Sure enough, it works like a champ.
So it is slow – though I am not sure EF is any faster. But not having to deal with that .edmx files, the .tt files, the whatever-else-we-throw-in files, I think further research is definitely warranted. Also, there are some other things I think F# Type Providers need to have:
1) Ability to handle proxies
2) Making Plural tables singular (The table name is Crashes, the type should be Crash)
3) An MS Access TP would be great
4) An Azure Sql Database TP would be doubly great
Pingback: F# Weekly #5, 2014 | Sergey Tihon's Blog
Nice post.
AFAIK the built-in SQL type provider uses SqlMetal underneath the bonnet – this is basically what LINQ to SQL uses, so you’re using an ORM anyway
As for the <- operator – you should be able to assign the values directly in the constructor e.g. new trafficCrashes(onRoad = "xyz", Miles = 10.) etc. etc.
By the way … Azure SQL databases supports TDS so theoretically you should be able to connect to a SQL database using the type provider (or any other ADO .NET wrapper) without any code changes.
AzureSql – I have tried a couple of times as it has not. I would love to be wrong.
I’ll give it a bash and get back to you 😊