Improvements Parallelism Can Make On The Random Service
February 1, 2011
Now that I have the Random Service set up and have gone through the parallel extensions lab, I thought I would apply what I learned about parallelism to the random generator. First, I needed a way to measure how long the functions take, and then to break the functions into the parts that can benefit from parallelism and the parts that can’t. I am also curious to see how my local machine compares to my provider in terms of the benefits of parallelism. My big assumption is that the number of records created by the random factory is fairly small – samples of 100, 200 and the like.
To that end, I created a quick performance test harness. I started with phone number generation because it is very straightforward:
static void GeneratePhoneNumbersPerformanceTestLocalMachine()
{
    int numberOfRecords = 1000;
    RandomFactory randomFactory = new RandomFactory();

    // Time only the call into the random factory itself
    Stopwatch stopwatch = Stopwatch.StartNew();
    randomFactory.GetPhoneNumbers(numberOfRecords);
    stopwatch.Stop();

    Trace.WriteLine(String.Format("{0} phone numbers were generated in {1} seconds on the local machine",
        numberOfRecords, stopwatch.Elapsed.TotalSeconds));
}
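Every generator gets a test method that follows this exact pattern, so the Stopwatch/Trace boilerplate can live in one helper. This is just a sketch (the Measure name is arbitrary, not part of the harness):

static void Measure(string description, int numberOfRecords, Action generate)
{
    // Wraps the Stopwatch/Trace pattern from the test above
    Stopwatch stopwatch = Stopwatch.StartNew();
    generate();
    stopwatch.Stop();
    Trace.WriteLine(String.Format("{0} {1} were generated in {2} seconds on the local machine",
        numberOfRecords, description, stopwatch.Elapsed.TotalSeconds));
}

// Usage:
// Measure("phone numbers", 1000, () => randomFactory.GetPhoneNumbers(1000));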
The problem with my first attempt is pretty clear: parallelism might help, but I am really not concerned about improving on thousandths of a second.
I then added the GetDate() and GetUniqueIds() functions, because these methods also do not hit the database or walk a large dataset:
Starting performance test at 1/31/2011 8:04:32 AM
1000 phone numbers were generated in 0.0022865 seconds on the local machine
1000 dates were generated in 0.0008908 seconds on the local machine
1000 unique ids were generated in 0.000591 seconds on the local machine
Ending performance test at 1/31/2011 8:04:32 AM
I then decided to test GetLastName() using 25%, 50%, 75%, and 100% prevalence thresholds (lower prevalence means fewer records to fetch and walk; the loop that drives the thresholds is sketched after the results):
Starting performance test at 1/31/2011 8:01:36 AM
1000 phone numbers were generated in 0.0023858 seconds on the local machine
1000 dates were generated in 0.0006502 seconds on the local machine
1000 unique ids were generated in 0.0006484 seconds on the local machine
1000 last names (25% prevalence) were generated in 1.2553884 seconds on the local machine
1000 last names (50% prevalence) were generated in 0.3628737 seconds on the local machine
1000 last names (75% prevalence) were generated in 1.3719554 seconds on the local machine
1000 last names (100% prevalence) were generated in 8.9350157 seconds on the local machine
Ending performance test at 1/31/2011 8:01:48 AM
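As promised, here is the loop that drives those four thresholds. It is only a sketch; the GetLastNames signature is inferred from the output above, so treat the parameter list as an assumption:

static void GenerateLastNamesPerformanceTestLocalMachine()
{
    int numberOfRecords = 1000;
    RandomFactory randomFactory = new RandomFactory();

    // Lower prevalence cut-offs mean fewer census rows to fetch and walk
    foreach (int prevalence in new int[] { 25, 50, 75, 100 })
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        randomFactory.GetLastNames(numberOfRecords, prevalence);   // signature assumed
        stopwatch.Stop();
        Trace.WriteLine(String.Format("{0} last names ({1}% prevalence) were generated in {2} seconds on the local machine",
            numberOfRecords, prevalence, stopwatch.Elapsed.TotalSeconds));
    }
}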
Interestingly, it looks like the connection is being re-used, which is why the 50% run is faster than the 25% run. Note the spike at 100%, though; perhaps parallelism might help there. I then finished the test suite for GetFirstNames(), GetAddresses(), GetPeople(), and GetEmployees(). Here are the final results:
Starting performance test at 1/31/2011 9:28:36 AM
1000 phone numbers were generated in 0.0022125 seconds on the local machine
1000 dates were generated in 0.000647 seconds on the local machine
1000 unique ids were generated in 0.0006895 seconds on the local machine
1000 last names (25% prevalence) were generated in 1.4208552 seconds on the local machine
1000 last names (50% prevalence) were generated in 0.3804186 seconds on the local machine
1000 last names (75% prevalence) were generated in 1.4271377 seconds on the local machine
1000 last names (100% prevalence) were generated in 11.5619451 seconds on the local machine
1000 male first names (25% prevalence) were generated in 0.126765 seconds on the local machine
1000 male first names (50% prevalence) were generated in 0.0956216 seconds on the local machine
1000 male first names (75% prevalence) were generated in 0.1013383 seconds on the local machine
1000 male first names (100% prevalence) were generated in 0.2053033 seconds on the local machine
1000 female first names (25% prevalence) were generated in 0.1130885 seconds on the local machine
1000 female first names (50% prevalence) were generated in 0.0998854 seconds on the local machine
1000 female first names (75% prevalence) were generated in 0.1070964 seconds on the local machine
1000 female first names (100% prevalence) were generated in 0.9740046 seconds on the local machine
1000 both first names (25% prevalence) were generated in 0.1893091 seconds on the local machine
1000 both first names (50% prevalence) were generated in 0.2349195 seconds on the local machine
1000 both first names (75% prevalence) were generated in 0.2078913 seconds on the local machine
1000 both first names (100% prevalence) were generated in 1.1015048 seconds on the local machine
1000 street addresses were generated in 12.6074157 seconds on the local machine
1000 people (100% prevalence, both genders) were generated in 28.0779342 seconds on the local machine
1000 employees (100% prevalence, both genders) were generated in 29.1355036 seconds on the local machine
Ending performance test at 1/31/2011 9:30:05 AM
So, now it’s time to parallelize. Taking the path of biggest bang for the CPU cycle, I decided to look at last names and street addresses. Diving into the code, I changed this:
for (int i = 0; i < numberOfNames; i++)
{
    randomIndex = random.Next(1, lastNameQuery.Count);
    selectedLastName = lastNameQuery[randomIndex - 1];
    lastNames.Add(selectedLastName.LastName);
}
To this:
Parallel.For(0, numberOfNames, (index) =>
{
    randomIndex = random.Next(1, lastNameQuery.Count);
    selectedLastName = lastNameQuery[randomIndex - 1];
    lastNames.Add(selectedLastName.LastName);
});
And the output was:
1000 last names (100% prevalence) were generated in 11.8207654 seconds on the local machine
It went up! OK, so the performance hit is not in the for loop; it is in fetching the records from the database.
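One aside while I am in this code: the straight swap above shares a single Random instance and the List&lt;string&gt; across threads, and neither is thread-safe. If the loop ever did become the hot spot, the shape would need to be something closer to this sketch (same surrounding variables assumed):

// Sketch only: per-thread Random instances plus a thread-safe collection
var bag = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.For(0, numberOfNames,
    () => new System.Random(Guid.NewGuid().GetHashCode()),   // one Random per thread
    (index, loopState, localRandom) =>
    {
        int localIndex = localRandom.Next(1, lastNameQuery.Count);
        bag.Add(lastNameQuery[localIndex - 1].LastName);
        return localRandom;
    },
    (localRandom) => { });                                    // nothing to merge per thread
lastNames.AddRange(bag);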
I changed this code:
List<string> lastNames = new List<string>();
var context = new Tff.Random.tffEntities();
List<Census_LastName> lastNameQuery =
    (from lastName in context.Census_LastName
     where lastName.CumlFrequency < pervalence
     select lastName).ToList<Census_LastName>();
To this:
List<string> lastNames = new List<string>();
var context = new Tff.Random.tffEntities();
List<Census_LastName> lastNameQuery =
    (from lastName in context.Census_LastName.AsParallel()
     where lastName.CumlFrequency < pervalence
     select lastName).ToList<Census_LastName>();
And the output was:
1000 last names (100% prevalence) were generated in 17.2297868 seconds on the local machine
Wow, I wonder if each thread is creating its own database connection? To find out, I fired up SQL Performance Monitor against a local instance of my database (my provider does not let me have sysadmin on the database; ah, some rain in those clouds). No surprise that once I moved the database to my local machine, performance improved dramatically:
1000 last names (100% prevalence) were generated in 2.1497559 seconds on the local machine
In any event, I slapped on SQL Performance Monitor. It looks like there is only one call being made; I expected four for my quad processor.
It looks like if I want to speed things up, I need to speed up the database call, and PLINQ can’t help there. The best way would be to take the entire datasets and cache them in memory when the application starts up.
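A minimal sketch of that caching idea (the CensusCache class here is invented for illustration; only the entity and context names come from the real code): load each census table once, lazily, and let the generators work against the in-memory list.

public static class CensusCache
{
    // Load the last name table exactly once, on first use; Lazy<T> is
    // thread-safe with its default settings
    private static readonly Lazy<List<Census_LastName>> lastNames =
        new Lazy<List<Census_LastName>>(() =>
        {
            using (var context = new Tff.Random.tffEntities())
            {
                return context.Census_LastName.ToList();
            }
        });

    public static List<Census_LastName> LastNames
    {
        get { return lastNames.Value; }
    }
}

The prevalence filter then becomes a plain LINQ to Objects where clause over CensusCache.LastNames, and the only database hit is the one-time load.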
I will confirm these ideas using some of the lab tools with VS2010 in the coming weeks…