Consuming Sky Biometry’s Image Recognition API
June 24, 2014
I was looking at Sky Biometry a couple of days ago to do some facial recognition. I am not sure how many other companies out there do this, but working with their API from F# was a great experience.
Sky Biometry supports both JSON and XML for API calls. They also have an example C# library covering the API. I decided to use F# and the JSON type provider to explore their API and the quality of their recognition algorithms. They have a good documentation page, and it didn’t take long to get my account up and running and start making API calls.
I thought it would be fun to use an example from the first Terminator movie, when Ah-nold went around looking for Sarah Connor. I picked two images from the movie series. The first is this image of the photograph of Sarah Connor taken at the end of the first movie. I know that the Terminator didn’t have this photograph in the movie, but working off-script, pretend that the Terminator has that photograph. The second image is from the first movie, when she is in a bar. So if the Terminator was powered by Sky Biometry and found Sarah in the bar, how closely would it match her to the first photo?
The first thing I did was to fire up a FSharp project in Visual Studio and start scripting the API calls that I would need to do facial recognition.
With all of the calls working in the REPL, I then moved the code into my module. I declared the types at the top and then created a class that could be consumed by external projects.
You will notice that the type providers are using a local copy of the JSON to infer the type in the module. I ran into a problem where the type provider using the web call was not capturing the full graph in the type definition, so I took the JSON and made it local. This led to an interesting problem because an F# project out of the box in Visual Studio does not support adding folders. To get around that, I went to my file system and added a folder.
I then created files in the folder that correspond to the type providers.
Next, I unloaded the F# project and edited the project file.
In the project file, I added those files to the ItemGroup that brings in all of the files from disk:
    <ItemGroup>
      <Compile Include="SkyBiometryImageComparer.fs" />
      <None Include="Script.fsx" />
      <None Include="packages.config" />
      <None Include="app.config" />
      <None Include="SkyBiometryImageJson/AddTags.json" />
      <None Include="SkyBiometryImageJson/FaceDetection.json" />
      <None Include="SkyBiometryImageJson/FaceRecognition.json" />
      <None Include="SkyBiometryImageJson/FaceTraining.json" />
      <None Include="SkyBiometryImageJson/RemoveTags.json" />
    </ItemGroup>
Once the project is re-loaded, the folder and the files show up in Solution Explorer.
In each of the .json files, I pasted in the results from the service call that I did in the REPL.
I then swapped them out in the module, and now I get the full graph.
With that out of the way, I implemented a one-to-one match of supporting methods to the API calls:
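A minimal sketch of the local-sample idea, assuming the FSharp.Data package and an F# script run with dotnet fsi. The sample here is hypothetical and heavily trimmed: it contains only the fields that the code in this post reads (status, photos, tags, tid), and the values are placeholders rather than real service output. In the module the sample lives in a file such as SkyBiometryImageJson/FaceDetection.json; it is inlined here only so the script stands alone.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical, trimmed sample. The real response has many more fields,
// which is exactly why a complete local copy gives a fuller type.
[<Literal>]
let DetectionSample =
    """{ "status": "success", "photos": [ { "tags": [ { "tid": "sample-tid" } ] } ] }"""

// The provider infers the type from the sample at compile time.
type skybiometryFaceDetection = JsonProvider<DetectionSample>

// Parse works on a string; Load (used in the module) takes a file path or URL.
let detection = skybiometryFaceDetection.Parse(DetectionSample)
printfn "%s %s" detection.Status detection.Photos.[0].Tags.[0].Tid
```

Because the type comes from the sample and not from the live call, any property missing from the sample is also missing from the type, so the local file should be the most complete response you can capture.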
    member this.DetectFace(imageUri:string)=
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/detect.json?urls=") |> ignore
        stringBuilder.Append(imageUri) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let faceDetection = skybiometryFaceDetection.Load(stringBuilder.ToString())
        faceDetection.Photos.[0].Tags.[0].Tid

    member this.SaveTag(tid:string)=
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/tags/save.json?uid=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&tids=") |> ignore
        stringBuilder.Append(tid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let tags = skybiometryAddTags.Load(stringBuilder.ToString())
        tags.Status

    member this.TrainFace()=
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/train.json?uids=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let training = skybiometryFaceTraining.Load(stringBuilder.ToString())
        training.Status

    member this.RecognizeFace(imageUri:string)=
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/faces/recognize.json?uids=") |> ignore
        stringBuilder.Append(uid) |> ignore
        stringBuilder.Append("&urls=") |> ignore
        stringBuilder.Append(imageUri) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let recognition = skybiometryFaceRecognition.Load(stringBuilder.ToString())
        if recognition.Photos.[0].Tags |> Seq.length > 0 then
            recognition.Photos.[0].Tags.[0].Attributes.Face.Confidence
        else
            0

    member this.RemoveTag(tid:string) =
        let stringBuilder = new StringBuilder()
        stringBuilder.Append(skybiometryUri) |> ignore
        stringBuilder.Append("/fc/tags/remove.json?tids=") |> ignore
        stringBuilder.Append(tid) |> ignore
        stringBuilder.Append("&api_key=") |> ignore
        stringBuilder.Append(apiKey) |> ignore
        stringBuilder.Append("&api_secret=") |> ignore
        stringBuilder.Append(apiSecret) |> ignore
        let tagRemoval = skybiometryRemoveTags.Load(stringBuilder.ToString())
        tagRemoval.Status
The StringBuilder makes the code pretty verbose, but I understand that it is the most efficient way to aggregate strings so I went with it. Also note that this is happy path programming with no error checking and I assume that the json coming back is well formed.
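For a handful of fixed pieces like these, a small formatting helper is a reasonable alternative to StringBuilder. The sketch below is not part of the original module: buildUrl is a hypothetical helper, and the host, key and secret are placeholders. It also escapes each parameter value with Uri.EscapeDataString, which the methods above do not do, on the assumption that the service accepts standard URL-encoded query values.

```fsharp
open System

// Hypothetical helper: joins a base address, an API path and a list of
// (name, value) query parameters into one request URL.
let buildUrl (baseUri: string) (path: string) (parameters: (string * string) list) =
    let query =
        parameters
        // Escape each value so characters such as ':' , '/' and spaces are safe.
        |> List.map (fun (key, value) -> sprintf "%s=%s" key (Uri.EscapeDataString value))
        |> String.concat "&"
    sprintf "%s%s?%s" baseUri path query

// Placeholder values for illustration only.
let parameters =
    [ "urls", "http://example.com/a b.jpg"
      "api_key", "myKey"
      "api_secret", "mySecret" ]

let url = buildUrl "http://api.example.com" "/fc/faces/detect.json" parameters
printfn "%s" url
```

With a helper along these lines, each supporting method would shrink to building a parameter list and calling Load on the result.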
In any event, with the supporting methods done, it was just an exercise of calling each one in turn:
    member this.CalculateFacialRecognitionConfidence(baseUri:string, comparisionUri:string) =
        let tid = this.DetectFace(baseUri)
        if this.SaveTag(tid) = "success" then
            if this.TrainFace() = "success" then
                let confidence = this.RecognizeFace(comparisionUri)
                this.RemoveTag(tid) |> ignore
                confidence
            else
                0
        else
            0
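One thing to note about the flow above is that the saved tag is only removed when training succeeds. A possible variant, sketched here under assumptions, uses try/finally so the cleanup also runs when training fails or a call throws. The function calculateConfidence and its parameters are hypothetical: the five API steps are passed in as plain functions so the control flow can be exercised with stubs instead of the real service.

```fsharp
// Hypothetical sketch: same sequence of steps as the method above, but the
// tag is removed in a finally block once it has been saved.
let calculateConfidence
        (detectFace: string -> string)
        (saveTag: string -> string)
        (trainFace: unit -> string)
        (recognizeFace: string -> int)
        (removeTag: string -> string)
        (baseUri: string)
        (comparisonUri: string) =
    let tid = detectFace baseUri
    if saveTag tid <> "success" then
        0
    else
        try
            if trainFace () = "success" then recognizeFace comparisonUri else 0
        finally
            // Runs whether training succeeded, failed, or threw.
            removeTag tid |> ignore

// Stubbed usage: training fails, yet the tag is still cleaned up.
let removed = ResizeArray<string>()
let confidence =
    calculateConfidence
        (fun _ -> "tid-1")
        (fun _ -> "success")
        (fun () -> "failure")
        (fun _ -> 58)
        (fun tid -> removed.Add(tid); "success")
        "base.jpg"
        "target.jpg"
printfn "confidence=%d removed=%d" confidence removed.Count
```

Whether that extra cleanup matters depends on how the service treats leftover tags, which this post does not explore.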
Hopping over to a C# unit test project, I can call the F# code and run some tests. I created a test for each of the supporting methods and then three happy path tests for CalculateFacialRecognitionConfidence: comparing the same image should give 100, comparing two completely unrelated images should give 0, and our exercise of identifying Sarah Connor should do better than 50/50. Here is an example of the 100% match use case:
And here is the actual test of finding Sarah Connor:
        var imageComparer = new SkyBiometryImageComparer(skyBiometryUri, uid, apiKey, apiSecret);
        String basePhotoUri = "http://img2.wikia.nocookie.net/__cb20080226002749/terminator/images/thumb/7/75/T2_sarah_polaroid.jpg/300px-T2_sarah_polaroid.jpg";
        String targetPhotoUri = "http://emandrews2.files.wordpress.com/2013/04/sarah-connor-scared.jpg";
        Int32 confidence = imageComparer.CalculateFacialRecognitionConfidence(basePhotoUri, targetPhotoUri);
        bool expected = true;
        bool actual = confidence > 50;
        Assert.AreEqual(expected, actual);
    }
It runs green (the actual value is 58% for identifying Sarah Connor).
My guess is that the Terminator would start shooting… first at Sarah Connor and then at C#. The C# project that Sky Biometry provides has two classes of 1600 and 1420 lines of code to get functional equivalence with the 93 lines of F# code that I wrote (and the vast majority of that code deals with the string builder).
Well, I am enthusiastic about F# too, but I really dislike the comparison approach that you have chosen. I checked their C# client code. First, they have lots of comments, at least 30% of the total line count. Second, they have ifdefs because they target multiple platforms. Third, for some reason they decided to create the REST client themselves. If they had used something like Refit, the code would be significantly smaller and would be much closer to the type provider usage.