Predicting Physician Gender Using AzureML and F#

December 2, 2014

I am working with a couple of friends in a two-week hackathon where the main subject is health care provider quality. One of the datasets that we are using is the national registry of physician information. One of the team members loaded it into Azure SQL Server, and it is a dog: it is about 1 GB of data and takes a couple of minutes to scan the entire dataset. I decided to take a small slice of the data (Connecticut physicians) and do some analysis on it.

My first step was to bring the data into AzureML via the Data Reader.

Note that it took about 3 minutes to bring the data down. I then saved this data as a local dataset to use in my experiments.

I then fired up another experiment using the dataset as the base. I first dragged in a Project Column module to grab only the columns I was interested in.

I then pulled in a Missing Values Scrubber module and configured it to drop any row with a missing value.
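Outside of AzureML, the same scrubbing idea amounts to filtering out incomplete rows. Here is a minimal F# sketch, assuming a hypothetical `RawPhysician` record with optional fields; the record, the helper names, and the sample rows are all made up for illustration and are not how the module is implemented:

```fsharp
// Hypothetical record: field names mirror the columns used later in this post.
type RawPhysician =
    { Gender: string option
      MedicalSchoolName: string option
      GraduationYear: string option
      PrimarySpecialty: string option }

// Small helper to build a row (illustrative only).
let row g s y p =
    { Gender = g; MedicalSchoolName = s; GraduationYear = y; PrimarySpecialty = p }

// A row is complete only when every column has a value.
let isComplete (p: RawPhysician) =
    [ p.Gender; p.MedicalSchoolName; p.GraduationYear; p.PrimarySpecialty ]
    |> List.forall Option.isSome

// Drop any row with at least one missing value.
let scrub (rows: RawPhysician list) = rows |> List.filter isComplete

// Made-up sample rows; the second one is missing its medical school.
let sample =
    [ row (Some "F") (Some "OTHER") (Some "1995") (Some "INTERNAL MEDICINE")
      row (Some "M") None (Some "1988") (Some "INTERNAL MEDICINE") ]

let cleaned = scrub sample
printfn "%d of %d rows kept" cleaned.Length sample.Length
```

Dropping rows is the bluntest option; it works here because the goal is a quick look at the data, not squeezing out every record.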

I then brought in a Metadata Editor module to change all of the fields to categorical data types.

With the data ready to go, I created a 70/30 (train/test) split of the data and added a Multiclass Decision Forest model with Gender as the dependent variable.
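For readers who want to see what a 70/30 split amounts to, here is a minimal F# sketch. It assumes a seeded random shuffle followed by a cut at the requested fraction; `splitData` is a hypothetical helper, and whether the AzureML module shuffles in exactly this way is not something this post verifies:

```fsharp
open System

// Shuffle with a fixed seed so the split is repeatable,
// then cut the list at the requested training fraction.
let splitData (trainFraction: float) (seed: int) (rows: 'a list) =
    let rng = Random(seed)
    let shuffled =
        rows
        |> List.map (fun r -> rng.Next(), r)   // attach a random key to each row
        |> List.sortBy fst                     // order by that key
        |> List.map snd                        // discard the key
    let trainCount = int (Math.Round(float shuffled.Length * trainFraction))
    List.splitAt trainCount shuffled

// Ten placeholder rows stand in for the physician records.
let train, test = splitData 0.7 42 [ 1 .. 10 ]
printfn "train=%d test=%d" train.Length test.Length
```

The test portion is held back until scoring, so the evaluation reflects rows the model has never seen.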

I then added a Score Model module and fed in the 30% test set. Finally, I added an Evaluate Model module.
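An evaluation step of this kind typically reports, for each class, the fraction of actual members that were predicted correctly. Here is a minimal F# sketch of that calculation, assuming the scored output can be reduced to (actual, predicted) label pairs; `perClassRecall` is a hypothetical helper and the pairs below are made up, not the experiment's results:

```fsharp
// For each actual class, compute the fraction of rows whose predicted label matches.
let perClassRecall (pairs: (string * string) list) =
    pairs
    |> List.groupBy fst
    |> List.map (fun (actual, group) ->
        let correct = group |> List.filter (fun (a, p) -> a = p) |> List.length
        actual, float correct / float group.Length)
    |> Map.ofList

// Made-up (actual, predicted) pairs for illustration only.
let scored =
    [ ("M", "M"); ("M", "M"); ("M", "M"); ("M", "F")
      ("F", "F"); ("F", "M") ]

let recall = perClassRecall scored
recall |> Map.iter (fun label r -> printfn "%s: %.2f" label r)
```

Looking at each class separately matters when the classes are unbalanced, because a single overall accuracy number can hide a weak result on the smaller class.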

The results were interesting, if not surprising:

Basically, if I know your age (via your graduation year), your specialty, and your medical school, I can predict that you are a man 85% of the time. Encouragingly, I can only do it 62% of the time for a woman. I then published the experiment and created a quick script to consume it:

#r @"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.5\System.Net.Http.dll"
#r @"..\packages\Microsoft.AspNet.WebApi.Client.5.2.2\lib\net45\System.Net.Http.Formatting.dll"

open System
open System.Net.Http
open System.Net.Http.Headers
open System.Net.Http.Formatting
open System.Collections.Generic

// Request payload types; the field names are serialized as-is into the JSON body.
type scoreData = {FeatureVector:Dictionary<string,string>;GlobalParameters:Dictionary<string,string>}
type scoreRequest = {Id:string; Instance:scoreData}

let invokeService () = async {
    // Paste the API key of the published service here.
    let apiKey = ""
    let uri = "https://ussouthcentral.services.azureml.net/workspaces/19a2e623b6a944a3a7f07c74b31c3b6d/services/6c4bbb43456e4d7e8a9196f2899f717d/score"
    use client = new HttpClient()
    client.DefaultRequestHeaders.Authorization <- new AuthenticationHeaderValue("Bearer",apiKey)
    client.BaseAddress <- new Uri(uri)

    // One physician to score. Gender is the column being predicted,
    // so "U" is only a placeholder value.
    let input = new Dictionary<string,string>()
    input.Add("Gender","U")
    input.Add("MedicalSchoolName","OTHER")
    input.Add("GraduationYear","1995")
    input.Add("PrimarySpecialty","INTERNAL MEDICINE")

    let instance = {FeatureVector=input; GlobalParameters=new Dictionary<string,string>()}
    let scoreRequest = {Id="score00001";Instance=instance}

    // POST the request as JSON to the base address and read the response body.
    let! response = client.PostAsJsonAsync("",scoreRequest) |> Async.AwaitTask
    let! result = response.Content.ReadAsStringAsync() |> Async.AwaitTask

    if response.IsSuccessStatusCode then
        printfn "%s" result
    else
        printfn "FAILED: %s" result
}

invokeService() |> Async.RunSynchronously

And I have a way of predicting genders:

U,OTHER,1995,INTERNAL MEDICINE,0.651031798112075,0.348968201887925,0,F
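To use a row like this in code, it can be split back into fields. The F# sketch below rests on an assumption about the layout: the first four values echo the inputs, the next two are scored probabilities, and the last value is the scored label. That reading is a guess from the output above, not documented behavior; the `ScoredRow` record and `parseScoredRow` helper are hypothetical, and the seventh column is ignored because its meaning is not stated:

```fsharp
open System
open System.Globalization

// Hypothetical record; the column meanings are an assumption (see the note above).
type ScoredRow =
    { Inputs: string list          // echoed input columns
      Probabilities: float list    // the two scored probability columns
      ScoredLabel: string }        // last column

// Split on commas. This would break if a field such as the school name
// itself contained a comma, which this sketch does not handle.
let parseScoredRow (line: string) =
    let fields = line.Split(',') |> List.ofArray
    { Inputs = fields |> List.take 4
      Probabilities =
        fields
        |> List.skip 4
        |> List.take 2
        |> List.map (fun s -> Double.Parse(s, CultureInfo.InvariantCulture))
      ScoredLabel = List.last fields }

let parsed =
    parseScoredRow "U,OTHER,1995,INTERNAL MEDICINE,0.651031798112075,0.348968201887925,0,F"
printfn "Scored label: %s" parsed.ScoredLabel
```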

Filed under Azure ML, F#