Improving performance using in-memory caching
February 8, 2011
I was thinking about how to make my random name generator more performant. I thought about a caching strategy where the EF dataset is loaded into memory at startup and queries then run against the in-memory set. I would use a Singleton pattern to make sure only one set was created. After searching around a bit using Bing, I realized that EF does not lend itself to this pattern: every call could potentially go back to the database, and EF itself is rather heavy. I decided to go with a lightweight POCO using a Collection base.
My first stop was to create a Console application to see what kind of performance boost I could reasonably expect to obtain. I created the POCO and the collection like so:
using System.Collections.ObjectModel;

public class LastName
{
    public string Name { get; set; }
    public int Rank { get; set; }
    public double Frequency { get; set; }
    public double CumlFrequency { get; set; }
}

public class LastNameCollection : Collection<LastName>
{
}
I then made an EF class:
And then did a quick test.
using System;
using System.Diagnostics;
using System.Linq;

class Program
{
    public static LastNameCollection lastNameCollection = new LastNameCollection();

    static void Main(string[] args)
    {
        Console.WriteLine("---Start---");
        LoadAllData();

        Stopwatch stopWatch = Stopwatch.StartNew();
        Console.WriteLine("{0} was found remotely in {1} seconds", SearchForNameRemotely(), stopWatch.Elapsed.TotalSeconds);

        stopWatch.Restart();
        Console.WriteLine("{0} was found locally in {1} seconds", SearchForNameLocally(), stopWatch.Elapsed.TotalSeconds);
        stopWatch.Stop();

        Console.WriteLine("----End----");
        Console.ReadKey();
    }

    // Copy every row from the EF entity set into the in-memory collection.
    static void LoadAllData()
    {
        using (Tff.EntityFrameworkLoad.DB_9203_tffEntities entities = new DB_9203_tffEntities())
        {
            var lastNameQuery = from lastName in entities.Census_LastName
                                select lastName;
            foreach (Census_LastName censusLastName in lastNameQuery)
            {
                lastNameCollection.Add(new LastName
                {
                    Name = censusLastName.LastName,
                    Rank = censusLastName.Rank,
                    Frequency = censusLastName.Frequency,
                    // Copied from Frequency, as in the original test; the entity's
                    // cumulative-frequency column was probably intended here.
                    CumlFrequency = censusLastName.Frequency
                });
            }
        }
    }

    // Query the database through EF.
    static string SearchForNameRemotely()
    {
        using (Tff.EntityFrameworkLoad.DB_9203_tffEntities entities = new DB_9203_tffEntities())
        {
            Census_LastName lastNameFound = (from lastName in entities.Census_LastName
                                             where lastName.LastName == "Frankenstein" // Not case-sensitive
                                             select lastName).First();
            return lastNameFound.LastName;
        }
    }

    // Query the in-memory collection with LINQ to Objects.
    static string SearchForNameLocally()
    {
        LastName lastNameFound = (from lastName in lastNameCollection
                                  where lastName.Name.ToLower() == "frankenstein" // NOTE: case-sensitive, so lower-case first
                                  select lastName).First();
        return lastNameFound.Name;
    }
}
Sure enough, the in-memory gain was approximately 300%:
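The local search above walks the whole collection and lower-cases each name on the way. A possible refinement, not measured in this post, is to build a case-insensitive dictionary over the cached collection once and look names up by key. The sketch below assumes names are unique enough that keeping the first entry per name is acceptable; LastNameIndex and Build are hypothetical names, not part of the original project.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class LastName
{
    public string Name { get; set; }
    public int Rank { get; set; }
    public double Frequency { get; set; }
    public double CumlFrequency { get; set; }
}

public class LastNameCollection : Collection<LastName> { }

// Hypothetical helper, not part of the original project.
public static class LastNameIndex
{
    // Build a case-insensitive name -> entry lookup from the cached collection.
    public static Dictionary<string, LastName> Build(LastNameCollection names)
    {
        var index = new Dictionary<string, LastName>(StringComparer.OrdinalIgnoreCase);
        foreach (LastName entry in names)
        {
            // Assumption: if a name appears twice, the first entry wins.
            if (!index.ContainsKey(entry.Name))
            {
                index.Add(entry.Name, entry);
            }
        }
        return index;
    }
}
```

With the index built once after the load, a lookup would be something like index.TryGetValue("frankenstein", out found), which avoids both the scan and the ToLower() call on every row.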
I then layered on PLINQ to see if the additional processors would increase the speed. Last time, I showed that PLINQ did NOT have a positive impact on performance using the LINQ to EF queries.
I changed the functions to add in the AsParallel() extension method like this:
using (Tff.EntityFrameworkLoad.DB_9203_tffEntities entities = new DB_9203_tffEntities())
{
    Census_LastName lastNameFound = (from lastName in entities.Census_LastName.AsParallel()
                                     where lastName.LastName == "Frankenstein" // Not case-sensitive
                                     select lastName).First();
    return lastNameFound.LastName;
}
Interestingly, the EF LINQ then threw an exception:
When I just ran AsParallel() on the in-memory copy, the performance degraded:
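A plausible explanation (not verified against this data set) is that PLINQ adds partitioning and merging overhead, and for a cheap string comparison over a modest collection that overhead can outweigh any gain from extra cores. The harness below is a minimal sketch for comparing the two approaches side by side. It uses synthetic names rather than the census data, and ParallelComparison and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical harness; the names are synthetic, not the census data.
public static class ParallelComparison
{
    // Build a list of filler names with the target appended at the end.
    public static List<string> BuildNames(int count)
    {
        var names = new List<string>(count + 1);
        for (int i = 0; i < count; i++)
        {
            names.Add("Name" + i);
        }
        names.Add("Frankenstein");
        return names;
    }

    // Plain LINQ to Objects scan on one thread.
    public static string FindSequential(List<string> names, string target)
    {
        return names.First(n => string.Equals(n, target, StringComparison.OrdinalIgnoreCase));
    }

    // The same scan, partitioned across cores by PLINQ.
    public static string FindParallel(List<string> names, string target)
    {
        return names.AsParallel().First(n => string.Equals(n, target, StringComparison.OrdinalIgnoreCase));
    }

    // Time both searches; results will vary by machine and collection size.
    public static void Run()
    {
        List<string> names = BuildNames(100000);

        Stopwatch stopWatch = Stopwatch.StartNew();
        string sequential = FindSequential(names, "frankenstein");
        Console.WriteLine("{0} found sequentially in {1} seconds", sequential, stopWatch.Elapsed.TotalSeconds);

        stopWatch.Restart();
        string parallel = FindParallel(names, "frankenstein");
        Console.WriteLine("{0} found in parallel in {1} seconds", parallel, stopWatch.Elapsed.TotalSeconds);
    }
}
```

Calling ParallelComparison.Run() from a console Main and varying the count passed to BuildNames gives a rough feel for where, if anywhere, the parallel version starts to pay off.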
I started to implement this pattern in the Random project to see if I could get the performance boost without the parallelism. I assumed that with a web service you would create the collections in the Application_Start event handler and then query them during each method call. I put the load into Application_Start; however, performance did not increase in my unit tests. It looks like Application_Start was getting called each time, and I am not sure how to hold the application in memory without requests, since my hosting provider recycles frequently.
I then put the collections load into the singleton method of my Global class and my tests flew:
private static LastNameCollection _lastNameCollection = null;

// Load once on first access; later calls reuse the cached collection.
public static LastNameCollection LastNameCollection
{
    get
    {
        if (_lastNameCollection == null)
        {
            _lastNameCollection = new LastNameCollection();
            PopulateLastNameCollectionFromDatabase();
        }
        return _lastNameCollection;
    }
}
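One caveat with the property above: the field is assigned before it is populated, and nothing guards the null check, so two requests arriving at the same time could see a half-filled collection or trigger two loads. A minimal sketch of the same idea using Lazy<T> from .NET 4 follows. NameCache, LoadLastNames, and LoadCount are hypothetical names, and the loader is a placeholder standing in for the real database read.

```csharp
using System;
using System.Collections.ObjectModel;

public class LastName
{
    public string Name { get; set; }
}

public class LastNameCollection : Collection<LastName> { }

// Hypothetical cache class standing in for the Global class.
public static class NameCache
{
    // Counts loads so the load-once behaviour can be checked.
    public static int LoadCount = 0;

    // Lazy<T> with a factory defaults to thread-safe, run-once initialization.
    private static readonly Lazy<LastNameCollection> _lastNameCollection =
        new Lazy<LastNameCollection>(LoadLastNames);

    public static LastNameCollection LastNameCollection
    {
        get { return _lastNameCollection.Value; }
    }

    // Placeholder loader; the real one would read from the database.
    private static LastNameCollection LoadLastNames()
    {
        LoadCount++;
        var collection = new LastNameCollection();
        collection.Add(new LastName { Name = "Frankenstein" });
        return collection;
    }
}
```

Callers only ever see the collection after the loader has returned, so the fully populated set is published in one step. The cached collection itself is still a plain Collection, so this assumes it is treated as read-only once loaded.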
I then pushed it out to production and sure enough, performance increased when I did multiple calls.