Halloween Project (Garage Of Mystery)

For this year's Halloween, the kids and I decided to do something out of the opening scene of Indiana Jones, without the big rock.  We wanted to give kids a choice when they came to the house -> either get a small "fun" size candy bar or enter the garage of mystery for the chance at a full-sized candy bar.  (Incidentally, whoever thought it would be a good idea to name the smallest candy size on earth "fun" obviously was never a kid.  When I was growing up, we called it "four size," because it took four of them to make a normal candy bar.)

So if a kid wants to go into the garage of mystery, they have to get to the altar of Snickers without the motion detector or the laser-beam trip wires catching them.  The full-size Snickers would disappear if the kid was picked up by the Kinect motion detector or if they tripped too many beams.  In the diagram below, the red dots are the lasers crossing in front of the altar.

image

The first thing we did was construct the altar.

imageimage

Once the frame was set, we added a servo with a trap door to the top.  We control the servo via a Phidget Servo Controller with some basic code from the Phidget SDK (if the SDK, you know, had F# in it)

member this.servoController_Attached(args:Events.AttachEventArgs) =
    let _servoController = args.Device :?> AdvancedServo
    _servoController.servos.[0].Engaged <- true
    _servoController.servos.[0].Position <- 110.
    _isServoControllerReady <- true

member this.initializeController() =
    _servoController.Attach.Add(this.servoController_Attached)
    _servoController.``open``()

member this.moveController(position:float) =
    if _isServoControllerReady then
        _servoController.servos.[0].Position <- position

And you can see it in action here:

 

With the altar ready, we turned our attention to the laser trip wires.  We purchased a whole bunch of dollar-store pen lasers and got some Phidget light sensors.  We then created a frame for each side of the garage -> one to mount the lasers and one to mount the light sensors.

imageimage

And then we added some basic code from the Phidget SDK (if the SDK, you know, had F# in it)

member this.interfaceKit_Attached(args: Events.AttachEventArgs) =
    let _interfaceKit = args.Device :?> InterfaceKit
    _interfaceKit.sensors
    |> Seq.cast
    |> Seq.map(fun s -> s :> InterfaceKitAnalogSensor)
    // Seq.iter eagerly applies the side effect (a lazy Seq.map piped to ignore would never run)
    |> Seq.iter(fun s -> s.Sensitivity <- 20)
    _isInterfaceKitReady <- true

member this.interfaceKit_SensorChange(e: SensorChangeEventArgs) =
    let eventArgs = new LightSensorChangeEventArgs(e.Index, e.Value)
    lightSensorChange.Trigger(eventArgs)

member this.initializeInterfaceKit() =
    _interfaceKit.Attach.Add(this.interfaceKit_Attached)
    _interfaceKit.SensorChange.Add(this.interfaceKit_SensorChange)
    _interfaceKit.``open``()
    _interfaceKit.waitForAttachment()

Note that we are trapping the event from the light sensor and then re-raising it through our own event.
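Here is that plumbing in isolation (a minimal sketch; the Garage class and OnSensorChange member are stand-ins, but LightSensorChangeEventArgs mirrors the code above):

// A sketch of the event plumbing: define our own EventArgs, publish an F# Event,
// and re-raise the translated Phidget event through it. The WPF code-behind then
// subscribes to LightSensorChange instead of the raw Phidget event.
type LightSensorChangeEventArgs(sensorIndex:int, lightAmount:int) =
    inherit System.EventArgs()
    member this.SensorIndex = sensorIndex
    member this.LightAmount = lightAmount

type Garage() =
    let lightSensorChange = new Event<LightSensorChangeEventArgs>()
    [<CLIEvent>]
    member this.LightSensorChange = lightSensorChange.Publish
    // called from the Phidget SensorChange handler
    member this.OnSensorChange(index:int, value:int) =
        lightSensorChange.Trigger(new LightSensorChangeEventArgs(index, value))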

With the light sensors in place, we turned our attention to the Kinect motion sensor.  I first considered Rob Miles's idea of comparing successive color frames to see if there was movement, but because I am using F# and F# does not support pointers the way C# does, the performance was too choppy.  You can see the Stack Overflow thread here.  So I could either jump over to C# or figure out a different way in F#.  I went with option B and used the skeleton frame, which has a Z index.  By comparing the Z index over time, I can see how fast a person is moving toward the altar.  The Kinect code was pretty much from the SDK (if the SDK, you know, had F# in it)

member this.kinectSensor_ColorFrameReady(args: ColorImageFrameReadyEventArgs) =
    use colorFrame = args.OpenColorImageFrame()
    if not (colorFrame = null) then
        let colorData = Array.zeroCreate<byte> colorFrame.PixelDataLength
        colorFrame.CopyPixelDataTo(colorData)
        let width = colorFrame.Width
        let height = colorFrame.Height
        let stride = colorFrame.Width * colorFrame.BytesPerPixel
        let eventArgs = new ColorDataReadyEventArgs(colorData, width, height, stride)
        colorDataReady.Trigger(eventArgs)

member this.KinectSensor_SkeletonFrameReady(args: SkeletonFrameReadyEventArgs) =
    use skeletonFrame = args.OpenSkeletonFrame()
    if not (skeletonFrame = null) then
        let skeletons = Array.zeroCreate<Skeleton> skeletonFrame.SkeletonArrayLength
        skeletonFrame.CopySkeletonDataTo(skeletons)
        let skeletons1 = skeletons |> Array.filter (fun s -> s.TrackingState = SkeletonTrackingState.Tracked)
        if skeletons1.Length > 0 then
            skeletonChanged.Trigger(skeletons1.[0])

member this.initializeKinect() =
    _kinectSensor.ColorStream.Enable()
    _kinectSensor.ColorFrameReady.Subscribe(this.kinectSensor_ColorFrameReady) |> ignore
    _kinectSensor.SkeletonStream.Enable()
    _kinectSensor.SkeletonFrameReady.Subscribe(this.KinectSensor_SkeletonFrameReady) |> ignore
    _kinectSensor.Start()

In the UI, I then checked for skeleton movement; if the person moved too fast, they would trigger the Snickers trap door to open:

void garage_SkeletonChanged(object sender, Skeleton skeleton)
{
    if (_skeletonPoint.Z > 0)
    {
        float zDelta = _skeletonPoint.Z - skeleton.Position.Z;
        if (zDelta >= _zDeltaThreshold)
        {
            _numberOfSkeletonHits += 1;
            skeletonChangedProgressBar.Dispatcher.Invoke(new Action(() => skeletonChangedProgressBar.Value = _numberOfSkeletonHits));
        }
        if (_numberOfSkeletonHits >= _numberOfHitsForAlarm)
        {
            _garage.moveController(_openPosition);
        }
        skeletonCanvas.Children.Clear();
        drawSkelton(skeleton);
    }
    _skeletonPoint = skeleton.Position;
}

With the result like this:

With the hard parts done, it was time to create a UI.  I went with C# here because I am using WPF, and the support for WPF and the Kinect is best in C#.  I created a WPF application and built a UI:

<Window x:Class="ChickenSoftware.Halloween.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="600" Width="650" >
    <Grid Height="600" Width="650" VerticalAlignment="Top" HorizontalAlignment="Left" >
        <Image Name="kinectVideo" Height="480" Width="640" Margin="10,0,0,120" />
        <Canvas Name="skeletonCanvas" Height="480" Width="640" Margin="10,0,0,120" />
        <Rectangle x:Name="sensor0Rectange" Fill="Lime" HorizontalAlignment="Left" Height="40" Margin="10,480,0,0" Stroke="Black" VerticalAlignment="Top" Width="82"/>
        <Rectangle x:Name="sensor1Rectange" Fill="Lime" HorizontalAlignment="Left" Height="40" Margin="189,480,0,0" Stroke="Black" VerticalAlignment="Top" Width="83"/>
        <Rectangle x:Name="sensor2Rectange" Fill="Lime" HorizontalAlignment="Left" Height="40" Margin="367,480,0,0" Stroke="Black" VerticalAlignment="Top" Width="82"/>
        <Rectangle x:Name="sensor3Rectange" Fill="Lime" HorizontalAlignment="Left" Height="40" Margin="537,480,0,0" Stroke="Black" VerticalAlignment="Top" Width="83"/>
        <ProgressBar x:Name="skeletonChangedProgressBar" HorizontalAlignment="Left" Height="40" Margin="10,528,0,0" VerticalAlignment="Top" Width="392" Foreground="#FFB00606"/>
        <Button x:Name="resetButton" Content="Reset" HorizontalAlignment="Left" Height="37" Margin="537,528,0,0"
                VerticalAlignment="Top" Width="83" Click="resetButton_Click"/>
        <Button x:Name="EjectButton" Content="Eject!" HorizontalAlignment="Left" Height="37" Margin="429,528,0,0"
                VerticalAlignment="Top" Width="83" Click="EjectButton_Click"/>
    </Grid>
</Window>

I then added some code to handle all of the events that the Phidgets and the Kinect send to the UI and do something useful with them.  For example, the light sensor change fills in the appropriate box on the screen (note that Phidget events arrive on a different thread, so you need Dispatcher.Invoke):

void garage_LightSensorChange(object sender, LightSensorChangeEventArgs args)
{
    // the switch already guarantees the index, so the threshold check is hoisted out of each case
    if (args.LightAmount < _lightSensorThreshold)
    {
        switch (args.SensorIndex)
        {
            case 0:
                _sensor0Tripped = true;
                sensor0Rectange.Dispatcher.Invoke(new Action(() => sensor0Rectange.Fill = new SolidColorBrush(Colors.Red)));
                break;
            case 1:
                _sensor1Tripped = true;
                sensor1Rectange.Dispatcher.Invoke(new Action(() => sensor1Rectange.Fill = new SolidColorBrush(Colors.Red)));
                break;
            case 2:
                _sensor2Tripped = true;
                sensor2Rectange.Dispatcher.Invoke(new Action(() => sensor2Rectange.Fill = new SolidColorBrush(Colors.Red)));
                break;
            case 3:
                _sensor3Tripped = true;
                sensor3Rectange.Dispatcher.Invoke(new Action(() => sensor3Rectange.Fill = new SolidColorBrush(Colors.Red)));
                break;
        }
    }
    CheckForIntruder();
}

With this associated method:

private void CheckForIntruder()
{
    Int32 numberOfSensorsTripped = 0;

    if (_sensor0Tripped == true)
        numberOfSensorsTripped += 1;
    if (_sensor1Tripped == true)
        numberOfSensorsTripped += 1;
    if (_sensor2Tripped == true)
        numberOfSensorsTripped += 1;
    if (_sensor3Tripped == true)
        numberOfSensorsTripped += 1;
    if (numberOfSensorsTripped >= _numberOfSensorsForAlarm)
        _garage.moveController(0);
}

This code would be so much better in F# using pattern matching, but because of the UI code, I kept it in C#.  I might refactor the non-visual components later.  The one thing that did surprise me is how hard the Kinect V1 SDK makes it to separate the UI components from the domain components.  Phidgets, on the other hand, have a very clear separation of concerns.
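For the curious, here is roughly what I mean; a sketch only, assuming the four tripped flags get folded into a list:

// A sketch of CheckForIntruder with pattern matching: fold the tripped
// flags into a count, then match on the count instead of chained ifs.
// _garage comes from the existing UI code.
let checkForIntruder (trippedSensors: bool list) (numberOfSensorsForAlarm: int) =
    let trippedCount = trippedSensors |> List.filter id |> List.length
    match trippedCount with
    | n when n >= numberOfSensorsForAlarm -> _garage.moveController(0.)  // drop the trap door
    | _ -> ()

// e.g. checkForIntruder [_sensor0Tripped; _sensor1Tripped; _sensor2Tripped; _sensor3Tripped] _numberOfSensorsForAlarm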

We then added some sides to the altar of Snickers

image

And we were good to go.  The final result looks like this (the smoke machine was an added touch):

All of the code is on GitHub here.  If you create your own garage of mystery, please drop me a line -> I would love to see what other makers come up with.

Parsing Microsoft MVP Pages Part 2

The final piece of the Terminator app (V1) is to associate MVP names with the pictures I uploaded to Sky Biometry via the MVPId.  I already blogged about how to parse the MVP search page and get the photos for Sky Biometry, and this was a similar task.  The key for each photo is the MVPId.  Once a person's photo is sent to Sky Biometry, the response contains the photo used to match and the person's Id.  Ideally, we would also show the person's name.

The first step was to parse the MVP list the same way I did before:

let getPageContents(pageNumber:int) =
    let uri = new Uri("http://mvp.microsoft.com/en-us/search-mvp.aspx?lo=United+States&sl=0&browse=False&sc=s&ps=36&pn=" + pageNumber.ToString())
    let request = WebRequest.Create(uri)
    request.Method <- "GET"
    let response = request.GetResponse()
    let stream = response.GetResponseStream()
    let reader = new StreamReader(stream)
    reader.ReadToEnd()

Next, once the page is loaded, I needed a way of parsing the name.  I used tags like <a href="/en-us/mvp/Jamie%20Dixon-5000814" to identify MVPs.  I then layered in a regex like this:

let getMVPInfoFromPageContents(pageContents:string) =
    let pattern = "(us\\/mvp\\/)([A-Z])(.+?)(-)(\\d+)"
    let matchCollection = Regex.Matches(pageContents, pattern)
    matchCollection
    |> Seq.cast
    |> Seq.map(fun (m:Match) -> m.Value)
    |> Seq.map(fun s -> s.Split('-'))
    |> Seq.map(fun a -> a.[0], a.[1])
    |> Seq.map(fun (n,i) -> n.Substring(7), n, i)
    |> Seq.map(fun (n,ln,i) -> n.Replace("%20"," "), ln, i)
    |> Seq.map(fun (n,ln,i) -> n, "mvp.microsoft.com/en-" + ln + "-" + i, i)
    |> Seq.distinctBy(fun (n,uri,i) -> n)

And this is a great site for building regexes.

With the parsing in place, I then pulled each page, collected the results, and saved them to disk:

let getGetMVPInfos(pageNumber: int) =
    let pageContents = getPageContents(pageNumber)
    getMVPInfoFromPageContents pageContents

let pageList = [1..17]
let mvpInfos =
    pageList
    |> Seq.collect(fun i -> getGetMVPInfos(i))

let outFile = new StreamWriter(@"c:\data\mvpList.csv")
mvpInfos |> Seq.iter(fun (n,uri,i) -> outFile.WriteLine(sprintf "%s,%s,%s" n uri i))
outFile.Flush()
outFile.Close()

And with that in place, the Terminator can use the F# CSV type provider to load the list (and also find Esther Lee, the one non-MVP the Terminator is scanning for):

namespace ChickenSoftware.Terminator.Core

open System
open FSharp.Data

type nameMappingContext = CsvProvider<"C:/data/mvpList.csv">

type LocalFileSystemMvpProvider () =
    member this.GetMVPInfo (mvpId:int) =
        if mvpId = 1 then
            new MVPInfo(1, "Esther Lee", "NA", "https://pbs.twimg.com/profile_images/2487129558/3DSC_0379.jpg")
        else
            let nameList = nameMappingContext.Load("C:/data/mvpList.csv")
            let foundInfo =
                nameList.Rows
                |> Seq.filter(fun r -> r.``21505`` = mvpId.ToString())
                |> Seq.map(fun r -> new MVPInfo(Int32.Parse(r.``21505``), r.``Bill Jelen``,
                                                r.``mvp.microsoft.com/en-us/mvp/Bill%20Jelen-21505``,
                                                "http://mvp.microsoft.com/private/en-us/PublicProfile/Photo/" + r.``21505``))
                |> Seq.toArray
            if foundInfo.Length > 0 then
                foundInfo.[0]
            else
                new MVPInfo(-1, "None", "None", "None")

And then compare the two photos and get the person's name:

LocalFileSystemMvpProvider mvpProvider = new LocalFileSystemMvpProvider();
var mvpInfo = mvpProvider.GetMVPInfo(mvpId);

compareImage.Source = new BitmapImage(new Uri(mvpInfo.PhotoUri));
facialRecognitionTextBox.Text = mvpInfo.FullName + " identified with a " + matchValue.Confidence + "% confidence.";

And it kinda works

image

and kinda not

image

Parsing Microsoft MVP Pages and Uploading Photos to Sky Biometry

As a piece of the Terminator project that I am bringing to the MVP Summit, I wanted to load all of the MVP photographs into Sky Biometry and, if a person matches a photo with high confidence, terminate them.  I asked my Microsoft contact if I could get all of the MVP photos to load into the app and was politely told no.

Not being one who takes no lightly, I decided to see if I could load the photos from the MVP website.  Each MVP has a profile photo like here, and all of the MVPs are listed here with their MVP IDs specified.  So if I can get the Id from the search page and then create a Uri to the photo, I can load it into Sky Biometry.

I first created a new F# project and fired up a script window.  I created a function that gets the entire contents of a page, with the only variable being the index number of the pagination:

let getPageContents(pageNumber:int) =
    let uri = new Uri("http://mvp.microsoft.com/en-us/search-mvp.aspx?lo=United+States&sl=0&browse=False&sc=s&ps=36&pn=" + pageNumber.ToString())
    let request = WebRequest.Create(uri)
    request.Method <- "GET"
    let response = request.GetResponse()
    let stream = response.GetResponseStream()
    let reader = new StreamReader(stream)
    reader.ReadToEnd()

I then parsed the page for all instances of the MVPId.  Fortunately, I found this post that helped me understand how pattern matching works in .NET.  Note that the regex for a tag like mvpid=123456 is "mvpid=\d+":

let getMVPIdsFromPageContents(pageContents:string) =
    let pattern = "mvpid=\d+"
    let matchCollection = Regex.Matches(pageContents, pattern)
    matchCollection
    |> Seq.cast
    |> Seq.map(fun (m:Match) -> m.Value)
    |> Seq.map(fun s -> s.Split('='))
    |> Seq.map(fun a -> a.[1])

With that out of the way, I could get a Seq of all MVP IDs (at least from America) and then collect the pages together:

let getGetMVPIds(pageNumber: int) =
    let pageContents = getPageContents(pageNumber)
    getMVPIdsFromPageContents pageContents

let pageList = [1..17]
let mvpIds =
    pageList
    |> Seq.collect(fun i -> getGetMVPIds(i))

So far, so good:

image

I could then create a function that generates the MVP photo Uri:

let getMvpImageUri(mvpId: int) =
    new Uri("http://mvp.microsoft.com/private/en-us/PublicProfile/Photo/" + mvpId.ToString())

With that out of the way, it was time to send the photos to Sky Biometry for facial detection and tagging.  I used the code found in this post, with a couple of changes to account for a face possibly not being found in the photo (hence the option type) and for bad things happening (like too big a photo):

type skybiometryFaceDetection = JsonProvider<".\SkyBiometryImageJson\FaceDetection.json">
type skybiometryAddTags = JsonProvider<".\SkyBiometryImageJson\AddTags.json">
type skybiometryFaceTraining = JsonProvider<".\SkyBiometryImageJson\FaceTraining.json">

let detectFace (imageUri:string) =
    let stringBuilder = new StringBuilder()
    stringBuilder.Append(skyBiometryUri) |> ignore
    stringBuilder.Append("/fc/faces/detect.json?urls=") |> ignore
    stringBuilder.Append(imageUri) |> ignore
    stringBuilder.Append("&api_key=") |> ignore
    stringBuilder.Append(skyBiometryApiKey) |> ignore
    stringBuilder.Append("&api_secret=") |> ignore
    stringBuilder.Append(skyBiometryApiSecret) |> ignore
    try
        let faceDetection = skybiometryFaceDetection.Load(stringBuilder.ToString())
        if faceDetection.Photos.[0].Tags.Length > 0 then
            Some faceDetection.Photos.[0].Tags.[0].Tid
        else
            None
    with | :? System.Exception -> None

I then added the other two functions, to tag and to train:

let saveTag(uid:string, tid:string) =
    let stringBuilder = new StringBuilder()
    stringBuilder.Append(skyBiometryUri) |> ignore
    stringBuilder.Append("/fc/tags/save.json?uid=") |> ignore
    stringBuilder.Append(uid) |> ignore
    stringBuilder.Append("&tids=") |> ignore
    stringBuilder.Append(tid) |> ignore
    stringBuilder.Append("&api_key=") |> ignore
    stringBuilder.Append(skyBiometryApiKey) |> ignore
    stringBuilder.Append("&api_secret=") |> ignore
    stringBuilder.Append(skyBiometryApiSecret) |> ignore
    let tags = skybiometryAddTags.Load(stringBuilder.ToString())
    tags.Status

let trainFace(uid:string) =
    let stringBuilder = new StringBuilder()
    stringBuilder.Append(skyBiometryUri) |> ignore
    stringBuilder.Append("/fc/faces/train.json?uids=") |> ignore
    stringBuilder.Append(uid) |> ignore
    stringBuilder.Append("&api_key=") |> ignore
    stringBuilder.Append(skyBiometryApiKey) |> ignore
    stringBuilder.Append("&api_secret=") |> ignore
    stringBuilder.Append(skyBiometryApiSecret) |> ignore
    let training = skybiometryFaceTraining.Load(stringBuilder.ToString())
    training.Status

Upon reflection, this would have been a perfect place for Scott Wlaschin's Railway Oriented Programming (ROP), but I just created a covering function:

let saveToSkyBiometry(mvpId:string, imageUri:string) =
    let tid = detectFace(imageUri)
    match tid with
    | Some x ->
        saveTag(mvpId + "@terminatorChicken", x) |> ignore
        trainFace(mvpId + "@terminatorChicken")
    | None -> "Failure"

let results =
    mvpIds
    |> Seq.map(fun mvpId -> mvpId, getMvpImageUri(Int32.Parse(mvpId)))
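For the curious, a railway-style version of the covering function might look something like this; just a sketch, with a hand-rolled result type (this predates a built-in one) and placeholder error strings:

// A sketch of Railway Oriented Programming over the same steps: each step
// returns Success or Failure, and bind short-circuits the rest of the
// pipeline as soon as one step fails.
type RopResult<'TSuccess> =
    | Success of 'TSuccess
    | Failure of string

let bind f = function
    | Success x -> f x
    | Failure e -> Failure e

let detectFace' imageUri =
    match detectFace imageUri with
    | Some tid -> Success tid
    | None -> Failure "no face detected"

let saveToSkyBiometry' (mvpId:string) (imageUri:string) =
    detectFace' imageUri
    |> bind (fun tid -> Success (saveTag(mvpId + "@terminatorChicken", tid)))
    |> bind (fun _ -> Success (trainFace(mvpId + "@terminatorChicken")))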

I then used Seq.map to process all of the photos in order, but I quickly ran into this:

Capture

So I changed my Seq.map to a loop so I could throttle the requests:

for (mvpId, uri) in results do
    let result = saveToSkyBiometry(mvpId, uri.ToString())
    printfn "%s" result
    Thread.Sleep(TimeSpan.FromMinutes(1.))

And sure enough

Capture1Capture2

And you can see the load every hour

Capture3

You can see the full code here.

Terminator Program: With The Kinect 2

I got my hands on a Kinect 2 last week, so I decided to re-write the Terminator program using the Kinect 2 API.  Microsoft made some major changes to the domain API (no more skeleton frame; it now uses a body), but the underlying logic is still the same.  Therefore, it was reasonably easy to port the code.  There are plenty of places in the V2 API that are not documented yet, but because I did some work in the V1 API, I could still get things done.  For example, the V2 API documentation and code samples use event handlers to work with each new frame that arrives from the Kinect.  This led to some pretty laggy code.  However, by using polling on a second thread, I was able to get the performance to where it needs to be.  Also, a minor annoyance is that you have to use Win8 with the Kinect 2.

So here is the Terminator application, Gen 2.  The UI is still just a series of UI controls:

<Window x:Class="ChickenSoftware.Terminator.Gen2.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="700" Width="650" Loaded="Window_Loaded">
    <Canvas Width="650" Height="700">
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
        <Canvas x:Name="bodyCanvas" Width="640" Height="480" />
        <Button x:Name="takePhotoButton" Canvas.Left="10"
                Canvas.Top="485" Height="40" Width="125" Click="takePhotoButton_Click">Take Photo</Button>
        <TextBox x:Name="facialRecognitionTextBox" Canvas.Left="10" Canvas.Top="540" Width="125" Height="40" FontSize="8" />
        <Image x:Name="currentImage" Canvas.Left="165" Canvas.Top="485" Height="120" Width="170" />
        <Image x:Name="compareImage" Canvas.Left="410" Canvas.Top="485" Height="120" Width="170" />
    </Canvas>
</Window>

In the code behind, I set up some class-level variables.  The only real difference is that the photo moves from 640x480 to 1920x1080:

KinectSensor _kinectSensor = null;
Boolean _isKinectDisplayActive = false;
Boolean _isTakingPicture = false;
WriteableBitmap _videoBitmap = null;
Int32 _width = 1920;
Int32 _height = 1080;

When the window is loaded, a new thread is spun up that handles rendering the Kinect data:

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    SetUpKinect();
    _isKinectDisplayActive = true;
    Thread videoThread = new Thread(new ThreadStart(DisplayKinectData));
    videoThread.Start();
}

Setting up the Kinect is a bit different (KinectSensor.GetDefault()) but intuitive:

internal void SetUpKinect()
{
    _videoBitmap = new WriteableBitmap(1920, 1080, 96, 96, PixelFormats.Bgr32, null);
    _kinectSensor = KinectSensor.GetDefault();
    _kinectSensor.Open();
}

The big change is in the DisplayKinectData method:

internal void DisplayKinectData()
{
    var colorFrameSource = _kinectSensor.ColorFrameSource;
    var colorFrameReader = colorFrameSource.OpenReader();
    var bodyFrameSource = _kinectSensor.BodyFrameSource;
    var bodyFrameReader = bodyFrameSource.OpenReader();

    while (_isKinectDisplayActive)
    {
        using (var colorFrame = colorFrameReader.AcquireLatestFrame())
        {
            if (colorFrame == null) continue;
            using (var bodyFrame = bodyFrameReader.AcquireLatestFrame())
            {
                if (bodyFrame == null) continue;
                //Color
                var colorFrameDescription = colorFrame.ColorFrameSource.CreateFrameDescription(ColorImageFormat.Bgra);
                var bytesPerPixel = colorFrameDescription.BytesPerPixel;
                var frameSize = colorFrameDescription.Width * colorFrameDescription.Height * bytesPerPixel;
                var colorData = new byte[frameSize];
                if (colorFrame.RawColorImageFormat == ColorImageFormat.Bgra)
                {
                    colorFrame.CopyRawFrameDataToArray(colorData);
                }
                else
                {
                    colorFrame.CopyConvertedFrameDataToArray(colorData, ColorImageFormat.Bgra);
                }
                //Body
                var bodies = new Body[bodyFrame.BodyCount];
                bodyFrame.GetAndRefreshBodyData(bodies);
                var trackedBody = bodies.FirstOrDefault(b => b.IsTracked);

                //Update
                if (_isTakingPicture)
                {
                    Dispatcher.Invoke(new Action(() => AnalyzePhoto(colorData)));
                }
                else
                {
                    if (trackedBody == null)
                    {
                        Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData)));
                    }
                    else
                    {
                        Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData, trackedBody)));
                    }
                }
            }
        }
    }
}

I am using a frame reader and frame source for both the color (the video image) and the body (the old skeleton).  The method to get the frame has changed -> I am now using AcquireLatestFrame().  It is nice that we are still using byte[] to hold the data.

With the data in the byte[] arrays, the display is updated.  There are two UpdateDisplay methods:

internal void UpdateDisplay(byte[] colorData)
{
    var rectangle = new Int32Rect(0, 0, _width, _height);
    _videoBitmap.WritePixels(rectangle, colorData, _width * 4, 0);
    kinectColorImage.Source = _videoBitmap;
}

internal void UpdateDisplay(byte[] colorData, Body body)
{
    UpdateDisplay(colorData);
    var drawingGroup = new DrawingGroup();
    using (var drawingContext = drawingGroup.Open())
    {
        var headPosition = body.Joints[JointType.Head].Position;
        if (headPosition.Z < 0)
        {
            headPosition.Z = 0.1f;
        }
        var adjustedHeadPosition = _kinectSensor.CoordinateMapper.MapCameraPointToDepthSpace(headPosition);
        bodyCanvas.Children.Clear();
        Rectangle headTarget = new Rectangle();
        headTarget.Fill = new SolidColorBrush(Colors.Red);
        headTarget.Width = 10;
        headTarget.Height = 10;
        Canvas.SetLeft(headTarget, adjustedHeadPosition.X + 75);
        Canvas.SetTop(headTarget, adjustedHeadPosition.Y);
        bodyCanvas.Children.Add(headTarget);
    }
}

This is pretty much like V1: the video byte[] is written to a WriteableBitmap and the body is drawn on the canvas.  Note that, as in V1, the coordinates of the body need to be adjusted to the color frame.  The API has a series of overloads that make it easy to do the translation.
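For reference, the overloads in question look like this (a sketch in F#; sensor and body are assumed to be in scope):

// A sketch of the v2 CoordinateMapper: the same camera-space joint (in meters)
// can be projected into the color frame (1920x1080) or the depth frame (512x424).
let mapper = sensor.CoordinateMapper
let headPosition = body.Joints.[JointType.Head].Position         // CameraSpacePoint
let colorPoint = mapper.MapCameraPointToColorSpace(headPosition) // ColorSpacePoint
let depthPoint = mapper.MapCameraPointToDepthSpace(headPosition) // DepthSpacePoint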

With the display working, I added in taking the photo, sending it to Azure blob storage, and having Sky Biometry analyze the results.  This code is identical to V1, with the connection strings for Azure and Sky Biometry broken out into their own methods and the sensitive values placed in the app.config:

internal void AnalyzePhoto(byte[] colorData)
{
    var bitmapSource = BitmapSource.Create(_width, _height, 96, 96, PixelFormats.Bgr32, null, colorData, _width * 4);
    JpegBitmapEncoder encoder = new JpegBitmapEncoder();
    encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
    var photoImage = UploadPhotoImage(encoder);
    CompareImages(photoImage);
    _isTakingPicture = false;
}

internal PhotoImage UploadPhotoImage(JpegBitmapEncoder encoder)
{
    using (MemoryStream memoryStream = new MemoryStream())
    {
        encoder.Save(memoryStream);
        var photoImage = new PhotoImage(Guid.NewGuid(), memoryStream.ToArray());

        var customerUniqueId = new Guid(ConfigurationManager.AppSettings["customerUniqueId"]);
        var connectionString = GetAzureConnectionString();

        IPhotoImageProvider provider = new AzureStoragePhotoImageProvider(customerUniqueId, connectionString);
        provider.InsertPhotoImage(photoImage);
        memoryStream.Close();
        return photoImage;
    }
}

internal void CompareImages(PhotoImage photoImage)
{
    String skyBiometryUri = ConfigurationManager.AppSettings["skyBiometryUri"];
    String uid = ConfigurationManager.AppSettings["skyBiometryUid"];
    String apiKey = ConfigurationManager.AppSettings["skyBiometryApiKey"];
    String apiSecret = ConfigurationManager.AppSettings["skyBiometryApiSecret"];
    var imageComparer = new SkyBiometryImageComparer(skyBiometryUri, uid, apiKey, apiSecret);

    String basePhotoUri = GetBasePhotoUri();
    String targetPhotoUri = GetTargetPhotoUri(photoImage);
    currentImage.Source = new BitmapImage(new Uri(targetPhotoUri));
    compareImage.Source = new BitmapImage(new Uri(basePhotoUri));

    var matchValue = imageComparer.CalculateFacialRecognitionConfidence(basePhotoUri, targetPhotoUri);
    facialRecognitionTextBox.Text = "Match Value Confidence is: " + matchValue.Confidence.ToString();
}

With the code in place, I can then run the Terminator Gen 2:

image

I think I am doing the Sky Biometry recognition incorrectly, so I will look at that later.  In any event, working with the Kinect V2 was fairly easy because it is close enough to the V1 that the concepts translate.  I look forward to adding the targeting system this weekend!!!

Terminator Program: Part 2

Following up on my last post, I decided to send the entire photograph to Sky Biometry and have them parse the photograph and identify the individual people.  This ability is built right into their API.  For example, if you pass them this picture, you get the following JSON back.

image

I added the red highlight to show that Sky Biometry can recognize multiple people (it is an array of uids) and that each face tag has a center.x and center.y.  Reading the API documentation, this point is the center of the face tag, expressed as a percentage of the photo's width and height.

image
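In other words, getting from a face tag center back to pixels is just multiplication. A quick sketch for the 640x480 photos used here:

// A sketch of converting a Sky Biometry center point (percentages of the
// photo) back into pixel coordinates.
let toPixels (centerX:float) (centerY:float) (photoWidth:float) (photoHeight:float) =
    (centerX / 100. * photoWidth), (centerY / 100. * photoHeight)

// e.g. toPixels 10. 13.33 640. 480. evaluates to roughly (64.0, 63.98)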

So I need to translate the center point of the skeleton from the Kinect to the equivalent center point of the Sky Biometry recognition output, and then I should be able to identify individual people within the Kinect's field of vision.  Going back to the Kinect code, I ditched the DrawBoxAroundHead method and altered the UpdateDisplay method like so:

private void UpdateDisplay(byte[] colorData, Skeleton[] skeletons)
{
    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    }
    _videoBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), colorData, 640 * 4, 0);
    kinectColorImage.Source = _videoBitmap;
    var selectedSkeleton = skeletons.FirstOrDefault(s => s.TrackingState == SkeletonTrackingState.Tracked);
    if (selectedSkeleton != null)
    {
        var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
        var adjustedHeadPosition =
            _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
        var adjustedSkeletonPosition = _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(selectedSkeleton.Position, ColorImageFormat.RgbResolution640x480Fps30);

        skeletonCanvas.Children.Clear();
        Rectangle headRectangle = new Rectangle();
        headRectangle.Fill = new SolidColorBrush(Colors.Blue);
        headRectangle.Width = 10;
        headRectangle.Height = 10;
        Canvas.SetLeft(headRectangle, adjustedHeadPosition.X);
        Canvas.SetTop(headRectangle, adjustedHeadPosition.Y);
        skeletonCanvas.Children.Add(headRectangle);

        Rectangle skeletonRectangle = new Rectangle();
        skeletonRectangle.Fill = new SolidColorBrush(Colors.Red);
        skeletonRectangle.Width = 10;
        skeletonRectangle.Height = 10;
        // position the second rectangle at the skeleton position (the head rectangle above uses the head position)
        Canvas.SetLeft(skeletonRectangle, adjustedSkeletonPosition.X);
        Canvas.SetTop(skeletonRectangle, adjustedSkeletonPosition.Y);
        skeletonCanvas.Children.Add(skeletonRectangle);

        String skeletonInfo = headPosition.X.ToString() + " : " + headPosition.Y.ToString() + " -- ";
        skeletonInfo = skeletonInfo + adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " -- ";
        skeletonInfo = skeletonInfo + adjustedSkeletonPosition.X.ToString() + " : " + adjustedSkeletonPosition.Y.ToString();

        skeletonInfoTextBox.Text = skeletonInfo;
    }
}

Notice that there are two rectangles because I was not sure whether the Head.Position or the Skeleton.Position would match Sky Biometry.  It turns out that I want the Head.Position for Sky Biometry (besides, the Terminator would want head shots only).

image

So I ditched the Skeleton.Position.  I then needed a way to translate Head.Position.X to SkyBiometry.X and Head.Position.Y to SkyBiometry.Y.  Fortunately, I know the size of each photograph (640x480), so calculating the percent is an exercise in altering UpdateDisplay:

private void UpdateDisplay(byte[] colorData, Skeleton[] skeletons)
{
    Int32 photoWidth = 640;
    Int32 photoHeight = 480;

    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(photoWidth, photoHeight, 96, 96, PixelFormats.Bgr32, null);
    }
    _videoBitmap.WritePixels(new Int32Rect(0, 0, photoWidth, photoHeight), colorData, photoWidth * 4, 0);
    kinectColorImage.Source = _videoBitmap;
    var selectedSkeleton = skeletons.FirstOrDefault(s => s.TrackingState == SkeletonTrackingState.Tracked);
    if (selectedSkeleton != null)
    {
        var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
        var adjustedHeadPosition =
            _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);

        skeletonCanvas.Children.Clear();
        Rectangle headRectangle = new Rectangle();
        headRectangle.Fill = new SolidColorBrush(Colors.Blue);
        headRectangle.Width = 10;
        headRectangle.Height = 10;
        Canvas.SetLeft(headRectangle, adjustedHeadPosition.X);
        Canvas.SetTop(headRectangle, adjustedHeadPosition.Y);
        skeletonCanvas.Children.Add(headRectangle);

        var skyBiometryX = ((float)adjustedHeadPosition.X / photoWidth) * 100;
        var skyBioMetryY = ((float)adjustedHeadPosition.Y / photoHeight) * 100;

        String skeletonInfo = adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " -- ";
        skeletonInfo = skeletonInfo + Math.Round(skyBiometryX, 2).ToString() + " : " + Math.Round(skyBioMetryY, 2).ToString();

        skeletonInfoTextBox.Text = skeletonInfo;
    }
}

And so now I have

image

The next step is to get the Kinect photo to Sky Biometry.  I decided to use Azure blob storage as my intermediate location.  I updated the architectural diagram like so:

image

At this point, it made sense to move the project over to F# so I could better concentrate on the work that needs to be done and also get the important code out of the UI code-behind.  I fired up an F# project in my solution and added a couple of different implementations for storing photos.  To keep things consistent, I created a data structure and an interface:

namespace ChickenSoftware.Terminator.Core

open System

type public PhotoImage (uniqueId:Guid, imageBytes:byte[]) =
    member this.UniqueId = uniqueId
    member this.ImageBytes = imageBytes

type IPhotoImageProvider =
    abstract member InsertPhotoImage : PhotoImage -> unit
    abstract member DeletePhotoImage : Guid -> unit
    abstract member GetPhotoImage : Guid -> PhotoImage

My first stop was to replicate what Miles did with the save-file dialog, using a file system provider.  It is very much like a C# implementation:

namespace ChickenSoftware.Terminator.Core

open System
open System.IO
open System.Drawing
open System.Drawing.Imaging

type LocalFileSystemPhotoImageProvider(folderPath: string) =

    member this.GetPhotoImageUri(uniqueIdentifier: Guid) =
        let fileName = uniqueIdentifier.ToString() + ".jpg"
        Path.Combine(folderPath, fileName)

    interface IPhotoImageProvider with
        member this.InsertPhotoImage(photoImage: PhotoImage) =
            let fullPath = this.GetPhotoImageUri(photoImage.UniqueId)
            use memoryStream = new MemoryStream(photoImage.ImageBytes)
            let image = Image.FromStream(memoryStream)
            image.Save(fullPath)

        member this.DeletePhotoImage(uniqueIdentifier: Guid) =
            let fullPath = this.GetPhotoImageUri(uniqueIdentifier)
            File.Delete(fullPath)

        member this.GetPhotoImage(uniqueIdentifier: Guid) =
            let fullPath = this.GetPhotoImageUri(uniqueIdentifier)
            use fileStream = new FileStream(fullPath, FileMode.Open)
            let image = Image.FromStream(fileStream)
            use memoryStream = new MemoryStream()
            image.Save(memoryStream, ImageFormat.Jpeg)
            new PhotoImage(uniqueIdentifier, memoryStream.ToArray())

To call the save method, I altered the SavePhoto method in the C# project to use a MemoryStream instead of a FileStream:

private void SavePhoto(byte[] colorData)
{
    var bitmapSource = BitmapSource.Create(640, 480, 96, 96, PixelFormats.Bgr32, null, colorData, 640 * 4);
    JpegBitmapEncoder encoder = new JpegBitmapEncoder();
    encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
    using (MemoryStream memoryStream = new MemoryStream())
    {
        encoder.Save(memoryStream);
        PhotoImage photoImage = new PhotoImage(Guid.NewGuid(), memoryStream.ToArray());

        String folderUri = @"C:\Data";
        IPhotoImageProvider provider = new LocalFileSystemPhotoImageProvider(folderUri);

        provider.InsertPhotoImage(photoImage);
        memoryStream.Close();
    }
    _isTakingPicture = false;
}

And sure enough, it saves the photo to disk:

image

One problem that took me 20 minutes to uncover is that if you get your file system path wrong, you get the unhelpful exception:

image

This has been well-bitched about on Stack Overflow, so I won't comment further.

With the file system up and running, I turned my attention to Azure.  Like the file system provider, it is very close to a C# implementation:

namespace ChickenSoftware.Terminator.Core

open System
open System.IO
open Microsoft.WindowsAzure.Storage
open Microsoft.WindowsAzure.Storage.Blob

type AzureStoragePhotoImageProvider(customerUniqueId: Guid, connectionString: string) =

    member this.GetBlobContainer(blobClient:Blob.CloudBlobClient) =
        let container = blobClient.GetContainerReference(customerUniqueId.ToString())
        if not (container.Exists()) then
            container.CreateIfNotExists() |> ignore
            let permissions = new BlobContainerPermissions()
            permissions.PublicAccess <- BlobContainerPublicAccessType.Blob
            container.SetPermissions(permissions)
        container

    member this.GetBlockBlob(uniqueIdentifier: Guid) =
        let storageAccount = CloudStorageAccount.Parse(connectionString)
        let blobClient = storageAccount.CreateCloudBlobClient()
        let container = this.GetBlobContainer(blobClient)
        let photoUri = this.GetPhotoImageUri(uniqueIdentifier)
        container.GetBlockBlobReference(photoUri)

    member this.GetPhotoImageUri(uniqueIdentifier: Guid) =
        uniqueIdentifier.ToString() + ".jpg"

    interface IPhotoImageProvider with
        member this.InsertPhotoImage(photoImage: PhotoImage) =
            let blockBlob = this.GetBlockBlob(photoImage.UniqueId)
            use memoryStream = new MemoryStream(photoImage.ImageBytes)
            blockBlob.UploadFromStream(memoryStream)

        member this.DeletePhotoImage(uniqueIdentifier: Guid) =
            let blockBlob = this.GetBlockBlob(uniqueIdentifier)
            blockBlob.Delete()

        member this.GetPhotoImage(uniqueIdentifier: Guid) =
            let blockBlob = this.GetBlockBlob(uniqueIdentifier)
            if blockBlob.Exists() then
                blockBlob.FetchAttributes()
                use memoryStream = new MemoryStream()
                blockBlob.DownloadToStream(memoryStream)
                let photoArray = memoryStream.ToArray()
                new PhotoImage(uniqueIdentifier, photoArray)
            else
                failwith "photo not found"

And when I pop it into the WPF application,

private void SavePhoto(byte[] colorData)
{
    var bitmapSource = BitmapSource.Create(640, 480, 96, 96, PixelFormats.Bgr32, null, colorData, 640 * 4);
    JpegBitmapEncoder encoder = new JpegBitmapEncoder();
    encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
    using (MemoryStream memoryStream = new MemoryStream())
    {
        encoder.Save(memoryStream);
        PhotoImage photoImage = new PhotoImage(Guid.NewGuid(), memoryStream.ToArray());

        Guid customerUniqueId = new Guid("7282AF48-FB3D-489B-A572-2EFAE80D0A9E");
        String connectionString =
            "DefaultEndpointsProtocol=http;AccountName=XXX;AccountKey=XXX";
        IPhotoImageProvider provider = new AzureStoragePhotoImageProvider(customerUniqueId, connectionString);

        provider.InsertPhotoImage(photoImage);
        memoryStream.Close();
    }
    _isTakingPicture = false;
}

I can now write my images to Azure.

image

With that out of the way, I can now have Sky Biometry pick up my photo, analyze it, and push the results back.  I went ahead and added the .fs module that I had already created for this blog post.  I then added FSharp.Data via NuGet and was ready to roll.  In the SavePhoto event handler, after saving the photo to blob storage, it calls Sky Biometry to compare against a base image that has already been trained:

private void SavePhoto(byte[] colorData)
{
    var bitmapSource = BitmapSource.Create(640, 480, 96, 96, PixelFormats.Bgr32, null, colorData, 640 * 4);
    JpegBitmapEncoder encoder = new JpegBitmapEncoder();
    encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
    PhotoImage photoImage = UploadPhotoImage(encoder);

    String skyBiometryUri = "http://api.skybiometry.com";
    String uid = "Kinect@ChickenFace";
    String apiKey = "XXXX";
    String apiSecret = "XXXX";

    var imageComparer = new SkyBiometryImageComparer(skyBiometryUri, uid, apiKey, apiSecret);
    String basePhotoUri = "XXXX.jpg";
    String targetPhotoUri = "XXXX/" + photoImage.UniqueId + ".jpg";

    currentImage.Source = new BitmapImage(new Uri(basePhotoUri));
    compareImage.Source = new BitmapImage(new Uri(targetPhotoUri));

    var matchValue = imageComparer.CalculateFacialRecognitionConfidence(basePhotoUri, targetPhotoUri);
    FacialRecognitionTextBox.Text = "Match Value is: " + matchValue.ToString();
    _isTakingPicture = false;
}

And I am getting a result back from Sky Biometry.

image

Finally, I added in the Sky Biometry X and Y coordinates for the photo and compared them to the ones calculated from the Kinect skeleton tracking:

currentImage.Source = new BitmapImage(new Uri(basePhotoUri));
compareImage.Source = new BitmapImage(new Uri(targetPhotoUri));

var matchValue = imageComparer.CalculateFacialRecognitionConfidence(basePhotoUri, targetPhotoUri);

var selectedSkeleton = skeletons.FirstOrDefault(s => s.TrackingState == SkeletonTrackingState.Tracked);
if (selectedSkeleton != null)
{
    var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
    var adjustedHeadPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);

    var skyBiometryX = ((float)adjustedHeadPosition.X / 640) * 100;
    var skyBioMetryY = ((float)adjustedHeadPosition.Y / 480) * 100;

    StringBuilder stringBuilder = new StringBuilder();
    stringBuilder.Append("Match Value is: ");
    stringBuilder.Append(matchValue.Confidence.ToString());
    stringBuilder.Append("Sky Biometry X: ");
    stringBuilder.Append(matchValue.X.ToString());
    stringBuilder.Append("Sky Biometry Y: ");
    stringBuilder.Append(matchValue.Y.ToString());
    stringBuilder.Append("Kinect X: ");
    stringBuilder.Append(Math.Round(skyBiometryX, 2).ToString());
    stringBuilder.Append("Kinect Y: ");
    stringBuilder.Append(Math.Round(skyBioMetryY, 2).ToString());
    FacialRecognitionTextBox.Text = stringBuilder.ToString();
}

_isTakingPicture = false;

And the results are encouraging -> it looks like I can use the X and Y to identify different people on the screen:

Match Value is: 53
Sky Biometry X: 10
Sky Biometry Y: 13.33

Kinect X: 47.5
Kinect Y: 39.79
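Since both points are percentages of the same photo, pairing a skeleton with a face tag should be a nearest-neighbor check. A sketch of the idea in F# (the faceTags list of (uid, x, y) tuples is hypothetical):

// A sketch of matching the Kinect head position to the nearest Sky Biometry
// face tag; both are percentages of the photo, so Euclidean distance works.
let findNearestFace (kinectX:float) (kinectY:float) (faceTags:(string * float * float) list) =
    faceTags
    |> List.map (fun (uid, x, y) ->
        uid, sqrt ((x - kinectX) ** 2. + (y - kinectY) ** 2.))
    |> List.minBy snd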

Up next will be pointing the laser and the target…


Terminator Program: Part 1

I am starting to work on a new Kinect application for TRINUG's code camp.  I wanted to extend the facial recognition application I did using Sky Biometry and have the Kinect identify people in its field of view.  Then, I want to give the verbal command "Terminate XXX", where XXX is the name of a recognized person.  That would activate a couple of servos via a Netduino and point a laser pointer at that person, and perhaps make a blaster sound.  The <ahem> architectural diagram </ahem> looks like this

image

Not really worrying about how far I will get (the fun is in the process, no?), I picked up Rob Miles’s excellent book Start Here: Learn The Kinect API and plugged in my Kinect.

The first thing I did was see if I could get running video from the Kinect -> which was very easy.  I created a new C#/WPF application and replaced the default markup with this:

<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
    </Grid>
</Window>

And in the code-behind, I added the following code.  The only thing that is kinda tricky is that there are two threads: the main UI thread and the thread that processes the Kinect data.  Interestingly, it is easy to pass data from the Kinect thread to the main UI thread -> just invoke the delegate and pass in the byte array.

Boolean _isKinectDisplayActive = false;
KinectSensor _sensor = null;
WriteableBitmap _videoBitmap = null;

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    SetUpKinect();
    Thread videoThread = new Thread(new ThreadStart(DisplayKinectData));
    _isKinectDisplayActive = true;
    videoThread.Start();
}

private void Window_Closing(object sender, System.ComponentModel.CancelEventArgs e)
{
    _isKinectDisplayActive = false;
}

private void SetUpKinect()
{
    _sensor = KinectSensor.KinectSensors[0];
    _sensor.ColorStream.Enable();
    _sensor.Start();
}

private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            var colorData = new byte[colorFrame.PixelDataLength];
            colorFrame.CopyPixelDataTo(colorData);
            Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData)));
        }
    }
    _sensor.Stop();
}

private void UpdateDisplay(byte[] colorData)
{
    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    }
    _videoBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), colorData, 640 * 4, 0);
    kinectColorImage.Source = _videoBitmap;
}

And I have a live-feed video

image

With that out of the way, I went to add picture-taking capability.  I altered the XAML like so:

<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
        <Button x:Name="takePhotoButton" Margin="0,466,435,10" Click="takePhotoButton_Click">Take Photo</Button>
    </Grid>
</Window>

And added this to the code behind:

Boolean _isTakingPicture = false;
BitmapSource _pictureBitmap = null;

private void takePhotoButton_Click(object sender, RoutedEventArgs e)
{
    _isTakingPicture = true;
    SaveFileDialog dialog = new SaveFileDialog();
    dialog.FileName = "Snapshot";
    dialog.DefaultExt = ".jpg";
    dialog.Filter = "Pictures (.jpg)|*.jpg";

    if (dialog.ShowDialog() == true)
    {
        String fileName = dialog.FileName;
        using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
        {
            JpegBitmapEncoder encoder = new JpegBitmapEncoder();
            encoder.Frames.Add(BitmapFrame.Create(_pictureBitmap));
            encoder.Save(fileStream);
        }
    }
}

 

And altered the DisplayKinectData method to poll the _isTakingPicture flag:

private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            var colorData = new byte[colorFrame.PixelDataLength];
            colorFrame.CopyPixelDataTo(colorData);
            Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData)));

            if (_isTakingPicture)
            {
                Dispatcher.Invoke(new Action(() => SavePhoto(colorData)));
            }
        }
    }
    _sensor.Stop();
}

And now I have screen capture ability.

image

With that out of the way, I needed a way of identifying the people in the Kinect's field of vision and taking their pictures individually.  I altered the XAML like so:

<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
        <Button x:Name="takePhotoButton" Margin="0,466,435,10" Click="takePhotoButton_Click">Take Photo</Button>
        <Canvas x:Name="skeletonCanvas" Width="640" Height="480" />
        <TextBox x:Name="skeletonInfoTextBox" Margin="205,466,10,10" />
    </Grid>
</Window>

And altered the Setup method like so:

private void SetUpKinect()
{
    _sensor = KinectSensor.KinectSensors[0];
    _sensor.ColorStream.Enable();
    _sensor.SkeletonStream.Enable();
    _sensor.Start();
}

And then I altered the UpdateDisplay method to take in both the color byte array and the skeleton array and display the head and skeleton locations.  Note that there is a built-in function called MapSkeletonPointToColorPoint(), which takes the skeleton coordinate position and translates it to the color coordinate position.  I know that it is needed, but I have no idea how it works -> magic, I guess.  (Under the hood, it projects the 3D skeleton point, measured in meters, into 2D color-frame pixel coordinates using the sensor's calibration.)

private void UpdateDisplay(byte[] colorData, Skeleton[] skeletons)
{
    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    }
    _videoBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), colorData, 640 * 4, 0);
    kinectColorImage.Source = _videoBitmap;
    var selectedSkeleton = skeletons.FirstOrDefault(s => s.TrackingState == SkeletonTrackingState.Tracked);
    if (selectedSkeleton != null)
    {
        var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
        var adjustedHeadPosition =
            _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
        var adjustedSkeletonPosition = _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(selectedSkeleton.Position, ColorImageFormat.RgbResolution640x480Fps30);

        String skeletonInfo = headPosition.X.ToString() + " : " + headPosition.Y.ToString() + " -- ";
        skeletonInfo = skeletonInfo + adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " -- ";
        skeletonInfo = skeletonInfo + adjustedSkeletonPosition.X.ToString() + " : " + adjustedSkeletonPosition.Y.ToString();

        skeletonInfoTextBox.Text = skeletonInfo;
    }
}

And the invocation of UpdateDisplay now looks like this:

private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            using (SkeletonFrame skeletonFrame = _sensor.SkeletonStream.OpenNextFrame(10))
            {
                if (skeletonFrame == null) continue;

                var colorData = new byte[colorFrame.PixelDataLength];
                var skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];

                colorFrame.CopyPixelDataTo(colorData);
                skeletonFrame.CopySkeletonDataTo(skeletons);

                if (_isTakingPicture)
                {
                    Dispatcher.Invoke(new Action(() => SavePhoto(colorData)));
                }
                Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData, skeletons)));
            }
        }
    }
    _sensor.Stop();
}

And the results are what you expect:

image

With the ability to identify individuals, I then wanted to take individual photos of each person and feed them to Sky Biometry.  To that end, I added a method to draw a rectangle around each person and then (somehow) take a snapshot of the contents within the rectangle.  Drawing the rectangle was a straightforward WPF exercise:

private void DrawBoxAroundHead(Skeleton selectedSkeleton)
{
    skeletonCanvas.Children.Clear();
    var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
    var shoulderCenterPosition = selectedSkeleton.Joints[JointType.ShoulderCenter].Position;

    var adjustedHeadPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var adjustedShoulderCenterPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(shoulderCenterPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var delta = adjustedHeadPosition.Y - adjustedShoulderCenterPosition.Y;
    var centerX = adjustedHeadPosition.X;
    var centerY = adjustedHeadPosition.Y;

    Line topLine = new Line();
    topLine.Stroke = new SolidColorBrush(Colors.Red);
    topLine.StrokeThickness = 5;
    topLine.X1 = centerX + (delta * -1);
    topLine.Y1 = centerY - (delta * -1);
    topLine.X2 = centerX + delta;
    topLine.Y2 = centerY - (delta * -1);
    skeletonCanvas.Children.Add(topLine);

    Line bottomLine = new Line();
    bottomLine.Stroke = new SolidColorBrush(Colors.Red);
    bottomLine.StrokeThickness = 5;
    bottomLine.X1 = centerX + (delta * -1);
    bottomLine.Y1 = centerY + (delta * -1);
    bottomLine.X2 = centerX + delta;
    bottomLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(bottomLine);

    Line rightLine = new Line();
    rightLine.Stroke = new SolidColorBrush(Colors.Red);
    rightLine.StrokeThickness = 5;
    rightLine.X1 = centerX + (delta * -1);
    rightLine.Y1 = centerY - (delta * -1);
    rightLine.X2 = centerX + (delta * -1);
    rightLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(rightLine);

    Line leftLine = new Line();
    leftLine.Stroke = new SolidColorBrush(Colors.Red);
    leftLine.StrokeThickness = 5;
    leftLine.X1 = centerX + delta;
    leftLine.Y1 = centerY - (delta * -1);
    leftLine.X2 = centerX + delta;
    leftLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(leftLine);
}

And then adding that call in UpdateDisplay:

if (selectedSkeleton != null)
{
    var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
    var adjustedHeadPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var adjustedSkeletonPosition = _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(selectedSkeleton.Position, ColorImageFormat.RgbResolution640x480Fps30);

    DrawBoxAroundHead(selectedSkeleton);

    String skeletonInfo = headPosition.X.ToString() + " : " + headPosition.Y.ToString() + " -- ";
    skeletonInfo = skeletonInfo + adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " -- ";
    skeletonInfo = skeletonInfo + adjustedSkeletonPosition.X.ToString() + " : " + adjustedSkeletonPosition.Y.ToString();

    skeletonInfoTextBox.Text = skeletonInfo;
}

Gives me this:

image

Which is great, but now I am stuck.  I need a way of isolating the contents of that rectangle in the byte array that I am feeding to the bitmap encoder, and I don't know how to trim the array.  Instead of trying to learn any more WPF and graphics programming, I decided to take a different tack and send the photograph in its entirety to Sky Biometry and let it figure out the people in the photograph.  How I did that is the subject of my next blog post…
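For anyone who does want to trim the array, it amounts to copying the rectangle out row by row.  Here is a rough, untested sketch; the 640x480 Bgr32 frame layout and the rectangle are assumptions on my part:

// Untested sketch: copy the pixels inside a rectangle out of a 640x480
// Bgr32 frame (4 bytes per pixel) into a smaller array. The rectangle is
// assumed to lie entirely within the frame. (Int32Rect is from System.Windows.)
private static byte[] CropColorData(byte[] source, int frameWidth, Int32Rect rect)
{
    const int bytesPerPixel = 4; // Bgr32
    var cropped = new byte[rect.Width * rect.Height * bytesPerPixel];
    for (int row = 0; row < rect.Height; row++)
    {
        int sourceIndex = ((rect.Y + row) * frameWidth + rect.X) * bytesPerPixel;
        int targetIndex = row * rect.Width * bytesPerPixel;
        Array.Copy(source, sourceIndex, cropped, targetIndex, rect.Width * bytesPerPixel);
    }
    return cropped;
}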
Programming the Kinect

I recently went back to programming the Kinect for a little off-base fun.  I wanted to see how hard it would be to move the Kinect up and down and then capture a picture.  Forgoing TDD, I created a WPF project that referenced the Kinect SDK.

I then added some buttons to control the up/down of the Kinect and a button to take a picture of the streaming image.  I threw in two image controls: one to show the streaming image and one to show the still picture that was taken.

image

Here is the designer for those of you who are XAML-Impaired:

image

I then wired up the code behind to stream the images:

public MainWindow()
{
    InitializeComponent();          
    kinectSensor.DepthStream.Enable();
    kinectSensor.ColorStream.Enable();
    kinectSensor.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(kinectSensor_AllFramesReady);
    kinectSensor.Start();
}
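One thing the snippet doesn't show is where kinectSensor comes from; presumably it is a field that grabs the first connected sensor, something like this (an assumption on my part, along with the takePicture flag used later):

// Assumed field declarations (not shown in the post): grab the first
// connected sensor and declare the take-picture flag.
// (FirstOrDefault requires a using System.Linq; directive.)
private readonly KinectSensor kinectSensor =
    KinectSensor.KinectSensors.FirstOrDefault(s => s.Status == KinectStatus.Connected);
private bool takePicture = false;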

void kinectSensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    // dispose the frame when we are done with it so the stream doesn't stall
    using (ColorImageFrame currentFrame = e.OpenColorImageFrame())
    {
        if (currentFrame != null)
        {
            byte[] pixelData = new byte[currentFrame.PixelDataLength];
            currentFrame.CopyPixelDataTo(pixelData);
            BitmapSource bitMapSource = BitmapImage.Create(currentFrame.Width,
                currentFrame.Height, 96, 96, PixelFormats.Bgr32, null,
                pixelData, currentFrame.Width * currentFrame.BytesPerPixel);
            this.streamingVideoImage.Source = bitMapSource;
        }
    }
}

And here is the output:

image

I then added in the ability to take a picture:

void kinectSensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    // Same code as before, then:
    this.streamingVideoImage.Source = bitMapSource;
    if (takePicture)
    {
        this.takePictureImage.Source = bitMapSource;
        takePicture = false;
    }
}

And here is the output:

image
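The click handler that flips the flag isn't shown; it is presumably just this (the button name is assumed):

// Assumed handler for the take-picture button: set the flag and let the
// next AllFramesReady callback capture the frame.
private void takePictureButton_Click(object sender, RoutedEventArgs e)
{
    takePicture = true;
}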

Feelin’ good, I then went to add the ability to move the Kinect up and down.  I read this article, whose point of emphasis is that you shouldn’t run the motor often or continuously.  I thought of moving the sensor in steps of 5 degrees at a time like so:

private void upButton_Click(object sender, RoutedEventArgs e)
{
    
    if (kinectSensor.ElevationAngle < kinectSensor.MaxElevationAngle - 5)
    {
        kinectSensor.ElevationAngle = kinectSensor.ElevationAngle + 5;
    }
}

With a complementary function for down.  Sure enough, it worked.

imageimageimage

The only thing I don’t like is that the picture freezes during the adjustment.  In addition, if you follow the MSDN article, where you should throw in a Thread.Sleep(1000) after each change, the effect is kinda hokey.  I then thought about putting in an image like this image while the camera is adjusting and the thread sleeps, just so the user knows that the camera is adjusting.  I whipped up a function like this:

private void downButton_Click(object sender, RoutedEventArgs e)
{
    
    if (kinectSensor.ElevationAngle > kinectSensor.MinElevationAngle + 5)
    {
        kinectSensor.ElevationAngle = kinectSensor.ElevationAngle - 5;
    }
    XOutTheStreamingImage();
    Thread.Sleep(1000);
}
private void XOutTheStreamingImage()
{
    this.InvalidateVisual();
    BitmapImage xImage = new BitmapImage();
    xImage.BeginInit();
    xImage.UriSource = new Uri(@"C:\Users\Jamie\Documents\Visual Studio 2010\Projects\Tff.KinectExample_Solution\Tff.KinectExample\X.PNG");
    xImage.EndInit();
    this.streamingVideoImage.Source = xImage;
    this.InvalidateVisual();
}
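As an aside, that hard-coded path would break on any machine but mine; if X.PNG were added to the project as a Resource, a pack URI would be more portable (an aside, not what the post does):

// Assumes X.PNG is in the project with Build Action = Resource
xImage.UriSource = new Uri("pack://application:,,,/X.PNG");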

One problem is that it doesn’t work.  A larger problem is that I don’t know why.  The image never shows, though the thread is sleeping.  I then put the XOutTheStreamingImage call before the sensor change.  Nope.  I then detached the event handler and then reattached it:

private void upButton_Click(object sender, RoutedEventArgs e)
{
    kinectSensor.AllFramesReady -= new EventHandler<AllFramesReadyEventArgs>(kinectSensor_AllFramesReady);
    XOutTheStreamingImage();
    if (kinectSensor.ElevationAngle < kinectSensor.MaxElevationAngle - 5)
    {
        kinectSensor.ElevationAngle = kinectSensor.ElevationAngle + 5;
    }
    Thread.Sleep(1000);
    kinectSensor.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(kinectSensor_AllFramesReady);
}

Nope.  I then detached it and left it detached.  It then kinda worked.  If I hit up/down a couple of times, I got it to work:

image

This kind of random behavior smells like a multi-threading issue.  If the AllFramesReady event fires on a different thread, then the X image would only show for 1/6 of a second.  If the elevation angle change also happens on a different thread, then the sleep wouldn’t matter.  I fired up ILSpy and, sure enough, check out the locks:

image

And after searching for “Thread”, I found this:

image

Sure enough, the frame is processed on a different thread.  And check out the Initialize method:

image

So it looks like I can’t coordinate my X image between the elevation change and the frameReady event.  Instead, I can unhook the event handler and show the image:

kinectSensor.AllFramesReady -= new EventHandler<AllFramesReadyEventArgs>(kinectSensor_AllFramesReady);
XOutTheStreamingImage();
if (kinectSensor.ElevationAngle < kinectSensor.MaxElevationAngle - 5)
{
    kinectSensor.ElevationAngle = kinectSensor.ElevationAngle + 5;
}

Sure enough, that works, but the X only flashes for an instant.  I then tried displaying the X for a second after the screen is invalidated:
private void XOutTheStreamingImage()
{
    this.InvalidateVisual();
    BitmapImage xImage = new BitmapImage();
    xImage.BeginInit();
    xImage.UriSource = new Uri(@"C:\Users\Jamie\Documents\Visual Studio 2010\Projects\Tff.KinectExample_Solution\Tff.KinectExample\X.PNG");
    xImage.EndInit();
    this.streamingVideoImage.Source = xImage;
    this.InvalidateVisual();
    Thread.Sleep(1000);
}

However, that doesn’t work.  The thread sleeps BEFORE the screen refreshes, so I am stuck with the last image from the Kinect.  The way the API is set up, I am thinking there is no way to:

  • Stop the Kinect from capturing images
  • Update the screen
  • Move the Kinect angle
  • Start the Kinect capturing images again

One idea that might sidestep the problem is sketched below.
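The only workaround I can think of (untested, so take it as a sketch) is to never block the UI thread at all: swap in the X image, move the motor, and use a DispatcherTimer to re-attach the frame handler a second later, giving WPF a chance to render in between:

// Untested sketch: avoid Thread.Sleep so the UI thread stays free to render
// the X image; a DispatcherTimer (System.Windows.Threading) re-attaches the
// frame handler after one second.
private void upButton_Click(object sender, RoutedEventArgs e)
{
    kinectSensor.AllFramesReady -= kinectSensor_AllFramesReady;
    XOutTheStreamingImage();
    if (kinectSensor.ElevationAngle < kinectSensor.MaxElevationAngle - 5)
    {
        kinectSensor.ElevationAngle = kinectSensor.ElevationAngle + 5;
    }
    var timer = new DispatcherTimer { Interval = TimeSpan.FromSeconds(1) };
    timer.Tick += (s, args) =>
    {
        timer.Stop();
        kinectSensor.AllFramesReady += kinectSensor_AllFramesReady;
    };
    timer.Start();
}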

I think I need my thinking chair…

Kinect SDK

I purchased a Kinect last week so that I could start messing around with its API.

1) There are two versions of the Kinect: the XBOX 360 one and the Windows one.  The only difference between the two that I could gather is that the Windows one comes with the SDK pre-loaded and allows you to distribute your software commercially.  Since I am just a hobbyist, I went with the XBOX one, which is $100 cheaper.

2) The Kinect for the XBOX 360 requires an additional power cord to connect to your computer.  You don’t need to buy the cord separately, though, as it comes included.  I made that mistake (and compounded it by buying from the Microsoft store at a premium).

3) There are a couple of different SDKs floating around out there: the 1.0 SDK and the 1.5 SDK.  You will want to use the 1.5 one (because newer is always better), and there is a HUGE difference in the APIs between the two versions, to the point that anything you wrote against 1.0 is useless.

4) I started digging into programming the Kinect with this book.  After reading the SDK samples and documentation, though, the book really isn’t necessary.  The SDK is really well documented and is probably the best place to start learning about the technology.

5) Once I dove into programming the Kinect, I realized that this is no small task.  For C#, the amount of code you need to write and its complexity are higher than in any other Microsoft technology I have seen.  You will need to know about bit shifts, the low-level details of graphics classes, and advanced data structures.  For example, here is a snippet from the Kinect Explorer solution:

// Converts a 16-bit grayscale depth frame which includes player indexes into a 32-bit frame
// that displays different players in different colors
private void ConvertDepthFrame(short[] depthFrame, DepthImageStream depthStream)
{
    int tooNearDepth = depthStream.TooNearDepth;
    int tooFarDepth = depthStream.TooFarDepth;
    int unknownDepth = depthStream.UnknownDepth;

    // Test that the buffer lengths are appropriately correlated, which allows us to use only one
    // value as the loop condition.
    if ((depthFrame.Length * 4) != this.depthFrame32.Length)
    {
        throw new InvalidOperationException();
    }

    for (int i16 = 0, i32 = 0; i32 < this.depthFrame32.Length; i16++, i32 += 4)
    {
        int player = depthFrame[i16] & DepthImageFrame.PlayerIndexBitmask;
        int realDepth = depthFrame[i16] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        
        if (player == 0 && realDepth == tooNearDepth)
        {
            // white 
            this.depthFrame32[i32 + RedIndex] = 255;
            this.depthFrame32[i32 + GreenIndex] = 255;
            this.depthFrame32[i32 + BlueIndex] = 255;
        }
        else if (player == 0 && realDepth == tooFarDepth)
        {
            // dark purple
            this.depthFrame32[i32 + RedIndex] = 66;
            this.depthFrame32[i32 + GreenIndex] = 0;
            this.depthFrame32[i32 + BlueIndex] = 66;
        }
        else if (player == 0 && realDepth == unknownDepth)
        {
            // dark brown
            this.depthFrame32[i32 + RedIndex] = 66;
            this.depthFrame32[i32 + GreenIndex] = 66;
            this.depthFrame32[i32 + BlueIndex] = 33;
        }
        else
        {
            // transform 13-bit depth information into an 8-bit intensity appropriate
            // for display (we disregard information in most significant bit)
            byte intensity = (byte)(~(realDepth >> 4));

            // tint the intensity by dividing by per-player values
            this.depthFrame32[i32 + RedIndex] = (byte)(intensity >> IntensityShiftByPlayerR[player]);
            this.depthFrame32[i32 + GreenIndex] = (byte)(intensity >> IntensityShiftByPlayerG[player]);
            this.depthFrame32[i32 + BlueIndex] = (byte)(intensity >> IntensityShiftByPlayerB[player]);
        }
    }
}
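To unpack the bit twiddling above: each 16-bit depth value packs a player index into its low bits, with the actual depth shifted above it.  A toy example using the SDK’s bitmask constants (the sample values are made up):

// Toy illustration of the packed depth format: the low bits hold the player
// index, the remaining bits hold the depth.
short packed = (short)((1200 << DepthImageFrame.PlayerIndexBitmaskWidth) | 2);
int player = packed & DepthImageFrame.PlayerIndexBitmask;          // 2
int realDepth = packed >> DepthImageFrame.PlayerIndexBitmaskWidth; // 1200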

My goal is to have enough to work with to present at TriNug’s code camp in November.  That might be a stretch…