Terminator Program: Part 1
July 8, 2014
I am starting to work on a new Kinect application for TRINUG’s code camp. I wanted to extend the facial recognition application I did using Sky Biometry and have the Kinect identify people in its field of view. Then, I want to give the verbal command “Terminate XXX”, where XXX is the name of a recognized person. That would activate a couple of servos via a Netduino, point a laser pointer at that person, and perhaps make a blaster sound. The <ahem> architectural diagram </ahem> looks like this:
Not really worrying about how far I will get (the fun is in the process, no?), I picked up Rob Miles’s excellent book Start Here: Learn The Kinect API and plugged in my Kinect.
The first thing I did was see if I could get running video from the Kinect, which was very easy. I created a new C#/WPF application and replaced the default markup with this:
<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
    </Grid>
</Window>
And in the code-behind, I added the following code. The only thing that is kinda tricky is that there are two threads: the main UI thread and the thread that processes the Kinect data. Interestingly, it is easy to pass data from the Kinect thread to the main UI thread: just call Dispatcher.Invoke with a delegate and pass in the byte array.
Boolean _isKinectDisplayActive = false;
KinectSensor _sensor = null;
WriteableBitmap _videoBitmap = null;

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    SetUpKinect();
    // Process Kinect frames on a background thread so the UI stays responsive
    Thread videoThread = new Thread(new ThreadStart(DisplayKinectData));
    _isKinectDisplayActive = true;
    videoThread.Start();
}

private void Window_Closing(object sender, System.ComponentModel.CancelEventArgs e)
{
    // Signal the background thread to exit its loop and stop the sensor
    _isKinectDisplayActive = false;
}

private void SetUpKinect()
{
    _sensor = KinectSensor.KinectSensors[0];
    _sensor.ColorStream.Enable();
    _sensor.Start();
}

private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            var colorData = new byte[colorFrame.PixelDataLength];
            colorFrame.CopyPixelDataTo(colorData);
            // Marshal the frame bytes back to the UI thread
            Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData)));
        }
    }
    _sensor.Stop();
}

private void UpdateDisplay(byte[] colorData)
{
    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    }
    // Bgr32 is 4 bytes per pixel, so the stride is 640 * 4
    _videoBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), colorData, 640 * 4, 0);
    kinectColorImage.Source = _videoBitmap;
}
And I have a live video feed.
With that out of the way, I went to add picture taking capability. I altered the XAML like so:
<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
        <Button x:Name="takePhotoButton" Margin="0,466,435,10" Click="takePhotoButton_Click">Take Photo</Button>
    </Grid>
</Window>
And added this to the code behind:
Boolean _isTakingPicture = false;
BitmapSource _pictureBitmap = null;

private void takePhotoButton_Click(object sender, RoutedEventArgs e)
{
    _isTakingPicture = true;
    SaveFileDialog dialog = new SaveFileDialog();
    dialog.FileName = "Snapshot";
    dialog.DefaultExt = ".jpg";
    dialog.Filter = "Pictures (.jpg)|*.jpg";
    if (dialog.ShowDialog() == true)
    {
        String fileName = dialog.FileName;
        using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
        {
            JpegBitmapEncoder encoder = new JpegBitmapEncoder();
            encoder.Frames.Add(BitmapFrame.Create(_pictureBitmap));
            encoder.Save(fileStream);
        }
    }
}
And altered the DisplayKinectData method to poll the _isTakingPicture flag:
private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            var colorData = new byte[colorFrame.PixelDataLength];
            colorFrame.CopyPixelDataTo(colorData);
            Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData)));
            if (_isTakingPicture)
            {
                Dispatcher.Invoke(new Action(() => SavePhoto(colorData)));
            }
        }
    }
    _sensor.Stop();
}
And now I have screen capture ability.
With that out of the way, I needed a way of identifying the people in the Kinect’s field of vision and taking their picture individually. I altered the XAML like so
<Window x:Class="ChickenSoftware.Terminiator.UI.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="545" Width="643"
        Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Image x:Name="kinectColorImage" Width="640" Height="480" />
        <Button x:Name="takePhotoButton" Margin="0,466,435,10" Click="takePhotoButton_Click">Take Photo</Button>
        <Canvas x:Name="skeletonCanvas" Width="640" Height="480" />
        <TextBox x:Name="skeletonInfoTextBox" Margin="205,466,10,10" />
    </Grid>
</Window>
And altered the Setup method like so:
private void SetUpKinect()
{
    _sensor = KinectSensor.KinectSensors[0];
    _sensor.ColorStream.Enable();
    _sensor.SkeletonStream.Enable();
    _sensor.Start();
}
And then altered the UpdateDisplay method to take in both the color byte array and the skeleton array and display the head and skeleton locations. Note that there is a built-in method called MapSkeletonPointToColorPoint() that translates a skeleton-space coordinate into a color-image coordinate. It is needed because the skeleton data comes from the depth camera, which is physically offset from the color camera and has a different field of view, so the SDK uses its calibration data to map between the two coordinate spaces.
private void UpdateDisplay(byte[] colorData, Skeleton[] skeletons)
{
    if (_videoBitmap == null)
    {
        _videoBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
    }
    _videoBitmap.WritePixels(new Int32Rect(0, 0, 640, 480), colorData, 640 * 4, 0);
    kinectColorImage.Source = _videoBitmap;
    var selectedSkeleton = skeletons.FirstOrDefault(s => s.TrackingState == SkeletonTrackingState.Tracked);
    if (selectedSkeleton != null)
    {
        var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
        var adjustedHeadPosition =
            _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
        var adjustedSkeletonPosition =
            _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(selectedSkeleton.Position, ColorImageFormat.RgbResolution640x480Fps30);
        String skeletonInfo = headPosition.X.ToString() + " : " + headPosition.Y.ToString() + " — ";
        skeletonInfo = skeletonInfo + adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " — ";
        skeletonInfo = skeletonInfo + adjustedSkeletonPosition.X.ToString() + " : " + adjustedSkeletonPosition.Y.ToString();
        skeletonInfoTextBox.Text = skeletonInfo;
    }
}
And the invocation of the UpdateDisplay now looks like this:
private void DisplayKinectData()
{
    while (_isKinectDisplayActive)
    {
        using (ColorImageFrame colorFrame = _sensor.ColorStream.OpenNextFrame(10))
        {
            if (colorFrame == null) continue;
            using (SkeletonFrame skeletonFrame = _sensor.SkeletonStream.OpenNextFrame(10))
            {
                if (skeletonFrame == null) continue;
                var colorData = new byte[colorFrame.PixelDataLength];
                var skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
                colorFrame.CopyPixelDataTo(colorData);
                skeletonFrame.CopySkeletonDataTo(skeletons);
                if (_isTakingPicture)
                {
                    Dispatcher.Invoke(new Action(() => SavePhoto(colorData)));
                }
                Dispatcher.Invoke(new Action(() => UpdateDisplay(colorData, skeletons)));
            }
        }
    }
    _sensor.Stop();
}
And the results are what you expect:
With the ability to identify individuals, I then wanted to take an individual photo of each person and feed it to Sky Biometry. To that end, I added a method to draw a rectangle around each person and then (somehow) take a snapshot of the contents within the rectangle. Drawing the rectangle was a straightforward WPF exercise:
private void DrawBoxAroundHead(Skeleton selectedSkeleton)
{
    skeletonCanvas.Children.Clear();
    var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
    var shoulderCenterPosition = selectedSkeleton.Joints[JointType.ShoulderCenter].Position;
    var adjustedHeadPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var adjustedShoulderCenterPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(shoulderCenterPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var delta = adjustedHeadPosition.Y - adjustedShoulderCenterPosition.Y;
    var centerX = adjustedHeadPosition.X;
    var centerY = adjustedHeadPosition.Y;

    Line topLine = new Line();
    topLine.Stroke = new SolidColorBrush(Colors.Red);
    topLine.StrokeThickness = 5;
    topLine.X1 = centerX + (delta * -1);
    topLine.Y1 = centerY - (delta * -1);
    topLine.X2 = centerX + delta;
    topLine.Y2 = centerY - (delta * -1);
    skeletonCanvas.Children.Add(topLine);

    Line bottomLine = new Line();
    bottomLine.Stroke = new SolidColorBrush(Colors.Red);
    bottomLine.StrokeThickness = 5;
    bottomLine.X1 = centerX + (delta * -1);
    bottomLine.Y1 = centerY + (delta * -1);
    bottomLine.X2 = centerX + delta;
    bottomLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(bottomLine);

    Line rightLine = new Line();
    rightLine.Stroke = new SolidColorBrush(Colors.Red);
    rightLine.StrokeThickness = 5;
    rightLine.X1 = centerX + (delta * -1);
    rightLine.Y1 = centerY - (delta * -1);
    rightLine.X2 = centerX + (delta * -1);
    rightLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(rightLine);

    Line leftLine = new Line();
    leftLine.Stroke = new SolidColorBrush(Colors.Red);
    leftLine.StrokeThickness = 5;
    leftLine.X1 = centerX + delta;
    leftLine.Y1 = centerY - (delta * -1);
    leftLine.X2 = centerX + delta;
    leftLine.Y2 = centerY + (delta * -1);
    skeletonCanvas.Children.Add(leftLine);
}
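As an aside, the same box could be drawn with a single WPF Rectangle instead of four Lines. A sketch (not from the original post), reusing the delta, centerX, and centerY values computed above, and taking Math.Abs because delta is negative when the head sits above the shoulders in screen coordinates:

```csharp
// Equivalent box using one Rectangle element (sketch)
var halfSide = Math.Abs(delta);
var box = new System.Windows.Shapes.Rectangle
{
    Stroke = new SolidColorBrush(Colors.Red),
    StrokeThickness = 5,
    Width = halfSide * 2,
    Height = halfSide * 2
};
Canvas.SetLeft(box, centerX - halfSide);
Canvas.SetTop(box, centerY - halfSide);
skeletonCanvas.Children.Add(box);
```

One element instead of four also means one thing to position and one thing to clear on the next frame.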
And then called it from UpdateDisplay:
if (selectedSkeleton != null)
{
    var headPosition = selectedSkeleton.Joints[JointType.Head].Position;
    var adjustedHeadPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(headPosition, ColorImageFormat.RgbResolution640x480Fps30);
    var adjustedSkeletonPosition =
        _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(selectedSkeleton.Position, ColorImageFormat.RgbResolution640x480Fps30);
    DrawBoxAroundHead(selectedSkeleton);
    String skeletonInfo = headPosition.X.ToString() + " : " + headPosition.Y.ToString() + " — ";
    skeletonInfo = skeletonInfo + adjustedHeadPosition.X.ToString() + " : " + adjustedHeadPosition.Y.ToString() + " — ";
    skeletonInfo = skeletonInfo + adjustedSkeletonPosition.X.ToString() + " : " + adjustedSkeletonPosition.Y.ToString();
    skeletonInfoTextBox.Text = skeletonInfo;
}
Gives me this:
Which is great, but now I am stuck. I need a way of isolating the contents of that rectangle in the byte array that I am feeding to the bitmap encoder, and I don’t know how to trim the array. Instead of trying to learn more WPF and graphics programming, I decided to take a different tack and send the photograph in its entirety to Sky Biometry, letting it figure out the people in the photograph. How I did that is the subject of my next blog post…
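For the record, trimming the array is mostly stride arithmetic: each row of the 640x480 Bgr32 frame is 640 * 4 bytes, so cropping means copying the slice of each row that falls inside the box into a smaller array. A sketch of a hypothetical helper (CropColorData is my name, not an SDK call):

```csharp
// Hypothetical helper: copy the pixels inside (x, y, width, height) out of a
// 640x480 Bgr32 frame. Bgr32 is 4 bytes per pixel, so the source stride is 640 * 4.
private static byte[] CropColorData(byte[] colorData, int x, int y, int width, int height)
{
    const int bytesPerPixel = 4;
    const int sourceStride = 640 * bytesPerPixel;
    int croppedStride = width * bytesPerPixel;
    var cropped = new byte[croppedStride * height];
    for (int row = 0; row < height; row++)
    {
        // Start of this row's slice in the full frame
        int sourceOffset = ((y + row) * sourceStride) + (x * bytesPerPixel);
        Buffer.BlockCopy(colorData, sourceOffset, cropped, row * croppedStride, croppedStride);
    }
    return cropped;
}
```

The cropped array (with its croppedStride) could then be handed to BitmapSource.Create in place of the full frame.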