Consuming Sky Biometry’s Image Recognition API
June 24, 2014
I was looking at Sky Biometry a couple of days ago to do some facial recognition. I am not sure how many other companies out there do this, but working with their API from F# was a great experience.
Sky Biometry supports both JSON and XML for API calls, and they provide an example wrapper library in C#. I decided to use F# with the JSON type provider to explore their API and the quality of their recognition algorithms. Their documentation page is good, and it didn’t take long to get my account up and running and start making API calls.
I thought it would be fun to use an example from the first Terminator movie, when Ah-nold went around looking for Sarah Connor. I picked two images from the movie series. The first is this image of the photograph of Sarah Connor taken at the end of the first movie. I know that the Terminator didn’t have this photograph in the movie but, working off-script, pretend that it did. The second image is from the first movie, when she is in a bar. So if the Terminator were powered by Sky Biometry and found Sarah in the bar, how closely would it match her to the first photo?
The first thing I did was to fire up an F# project in Visual Studio and start scripting the API calls that I would need to do facial recognition.
With all of the calls working in the REPL, I then moved the code into my module. I declared the types at the top and then created a class that could be consumed by external projects.
You will notice that the type providers use a local copy of the JSON to infer the types in the module. I ran into a problem where the type provider, when pointed at the web call, did not capture the full graph in the type definition, so I took the JSON and made it local. This led to an interesting problem, because an F# project in Visual Studio does not support adding folders out of the box. To get around that, I went to my file system and added a folder.
I then created files in the folder that correspond to the type providers.
Next, I unloaded the F# project and edited it,
and in the project file, I added those files to the ItemGroup that brings in all of the files from disk:
    <ItemGroup>
      <Compile Include="SkyBiometryImageComparer.fs" />
      <None Include="Script.fsx" />
      <None Include="packages.config" />
      <None Include="app.config" />
      <None Include="SkyBiometryImageJson/AddTags.json" />
      <None Include="SkyBiometryImageJson/FaceDetection.json" />
      <None Include="SkyBiometryImageJson/FaceRecognition.json" />
      <None Include="SkyBiometryImageJson/FaceTraining.json" />
      <None Include="SkyBiometryImageJson/RemoveTags.json" />
    </ItemGroup>
Once the project is re-loaded, the folder and the files show up in Solution Explorer.
In each of the .json files, I pasted in the results from the service call that I did in the REPL. For example,
So then I swapped them out in the module, and now I get the full graph.
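For illustration, a type inferred from a local sample might look like the sketch below. It assumes the FSharp.Data package and a current F# Interactive (for the `#r "nuget: ..."` directive). The inline sample stands in for the contents of SkyBiometryImageJson/FaceTraining.json, and its single `status` field is a hypothetical cut-down of the real response, not the service's actual payload.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Inline sample used so the sketch runs on its own. In the module the static
// argument would presumably be the file path instead, e.g.
//   JsonProvider<"SkyBiometryImageJson/FaceTraining.json">
// The one-field shape below is an assumption made for illustration only.
type skybiometryFaceTraining = JsonProvider<""" { "status": "success" } """>

// Parse a response body against the type inferred from the sample.
let training = skybiometryFaceTraining.Parse(""" { "status": "success" } """)

// "status" in the JSON surfaces as a typed Status property.
printfn "%s" training.Status
```

Because the types come from the sample, any field missing from the sample is also missing from the generated type, which is why a fuller local copy gives a fuller graph.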
With that out of the way, I implemented a one-to-one match of supporting methods to the API calls:
    member this.DetectFace(imageUri:string) =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/detect.json?urls=") |> ignore
        stringBuilder.Append(imageUri) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let faceDetection = skybiometryFaceDetection.Load(stringBuilder.ToString())
        faceDetection.Photos.[0].Tags.[0].Tid

    member this.SaveTag(tid:string) =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/tags/save.json?uid=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&tids=") |> ignore
        stringBuilder.Append(tid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let tags = skybiometryAddTags.Load(stringBuilder.ToString())
        tags.Status

    member this.TrainFace() =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/train.json?uids=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let training = skybiometryFaceTraining.Load(stringBuilder.ToString())
        training.Status

    member this.RecognizeFace(imageUri:string) =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/recognize.json?uids=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&urls=") |> ignore
        stringBuilder.Append(imageUri) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let recognition = skybiometryFaceRecognition.Load(stringBuilder.ToString())
        if recognition.Photos.[0].Tags |> Seq.length > 0 then
            recognition.Photos.[0].Tags.[0].Attributes.Face.Confidence
        else
            0

    member this.RemoveTag(tid:string) =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/tags/remove.json?tids=") |> ignore
        stringBuilder.Append(tid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let tagRemoval = skybiometryRemoveTags.Load(stringBuilder.ToString())
        tagRemoval.Status
The StringBuilder makes the code pretty verbose, but I understand that it is the most efficient way to aggregate strings, so I went with it. Also note that this is happy-path programming with no error checking; I assume that the JSON coming back is well formed.
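One way to trim the repetition is a small helper that joins name/value pairs into a query string. This is a minimal sketch rather than the code from the module: `buildUri` is a hypothetical name, the base address is a placeholder, and escaping each value with `Uri.EscapeDataString` is an assumption about what the service accepts (the methods above pass values through unescaped).

```fsharp
open System

// Hypothetical helper: builds "<base><path>?k1=v1&k2=v2" with escaped values.
let buildUri (baseUri: string) (path: string) (parameters: (string * string) list) =
    let query =
        parameters
        |> List.map (fun (name, value) ->
            sprintf "%s=%s" name (Uri.EscapeDataString value))
        |> String.concat "&"
    sprintf "%s%s?%s" baseUri path query

// Placeholder base address and credentials, for illustration only.
let detectUri =
    buildUri "https://api.example.invalid" "/fc/faces/detect.json"
        [ "urls", "http://example.com/a b.jpg"
          "api_key", "KEY"
          "api_secret", "SECRET" ]

printfn "%s" detectUri
```

For a handful of short fragments per call, any difference in efficiency between this and a StringBuilder is unlikely to matter next to the network round trip.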
In any event, with the supporting methods done, it was just an exercise of calling each one in turn:
    member this.CalculateFacialRecognitionConfidence(baseUri:string, comparisionUri:string) =
        let tid = this.DetectFace(baseUri)
        if this.SaveTag(tid) = "success" then
            if this.TrainFace() = "success" then
                let confidence = this.RecognizeFace(comparisionUri)
                this.RemoveTag(tid) |> ignore
                confidence
            else
                0
        else
            0
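As written, the tag is removed only when training succeeds, and the detection step assumes a face is always found. The sketch below shows one possible way to cover both cases. It is not the module's code: the five steps are passed in as plain functions (a hypothetical signature) so that it runs without network access, and detection returns an option instead of indexing into the response.

```fsharp
// Hypothetical signature: each API step is supplied as a function.
let calculateConfidence
        (detectFace: string -> string option)
        (saveTag: string -> string)
        (trainFace: unit -> string)
        (recognizeFace: string -> int)
        (removeTag: string -> string)
        (baseUri: string)
        (comparisonUri: string) =
    match detectFace baseUri with
    | None -> 0 // no face found in the base image
    | Some tid ->
        if saveTag tid <> "success" then 0
        else
            try
                if trainFace () = "success" then recognizeFace comparisonUri
                else 0
            finally
                // Runs whether or not training and recognition succeed.
                removeTag tid |> ignore

// Stubbed usage: training fails, yet the saved tag is still cleaned up.
let removed = ResizeArray<string>()
let confidence =
    calculateConfidence
        (fun _ -> Some "tid-1")
        (fun _ -> "success")
        (fun () -> "failure")
        (fun _ -> 58)
        (fun tid -> removed.Add tid; "success")
        "base.jpg"
        "target.jpg"
```

Passing the steps in as functions is only a convenience for the sketch; in the class the same shape would call the existing members directly.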
And hopping over to a C# unit test project, I can call the F# code and run some tests. I created a test for each of the supporting methods and then three happy-path tests for CalculateFacialRecognitionConfidence: comparing the same image should be 100, comparing two completely unrelated images should be 0, and then our exercise of identifying Sarah Connor should be better than 50/50. Here is an example of the 100% match use case:
And here is the actual test of finding Sarah Connor:
        var imageComparer = new SkyBiometryImageComparer(skyBiometryUri, uid, apiKey, apiSecret);
        String basePhotoUri = "http://img2.wikia.nocookie.net/__cb20080226002749/terminator/images/thumb/7/75/T2_sarah_polaroid.jpg/300px-T2_sarah_polaroid.jpg";
        String targetPhotoUri = "http://emandrews2.files.wordpress.com/2013/04/sarah-connor-scared.jpg";
        Int32 confidence = imageComparer.CalculateFacialRecognitionConfidence(basePhotoUri, targetPhotoUri);
        bool expected = true;
        bool actual = confidence > 50;
        Assert.AreEqual(expected, actual);
    }
It runs green (the actual value is 58% for identifying Sarah Connor).
My guess is that the Terminator would start shooting… first at Sarah Connor and then at C#. The C# project that Sky Biometry provides has 2 classes of 1600 and 1420 lines of code to get functional equivalence of the 93 lines of F# code that I wrote (and the vast majority of that code deals with the string builder).