Kinect – Getting Started – Become The Incredible Hulk
My last two posts, Kinect .NET SDK–Getting Started and Kinect – Getting Started – Control Camera Angle, introduced the Kinect SDK. Now it’s time to do some cool things with the Kinect Sensor.
Now I’ll show you how to become The Incredible Hulk using Skeleton Tracking.
Download Demo Project
One of the big strengths of the Kinect for Windows SDK is its ability to discover the skeleton joints of a human standing in front of the sensor. The recognition system is very fast and requires no training to use.
The NUI Skeleton API provides information about the location of up to two players standing in front of the Kinect sensor array, with detailed position and orientation information.
The data is provided to application code as a set of points, called skeleton positions, that compose a skeleton, as shown in the picture below. This skeleton represents a user’s current position and pose.
Applications that use skeleton data must indicate this at NUI initialization and must enable skeleton tracking.
The Vitruvian Man has 20 points, which are called Joints in the Kinect SDK.
Step 1: Register To SkeletonFrameReady
Make sure you call Initialize with UseSkeletalTracking; otherwise Skeleton Tracking will not work.
_kinectNui.Initialize(RuntimeOptions.UseColor | RuntimeOptions.UseSkeletalTracking);
_kinectNui.SkeletonFrameReady += new EventHandler<SkeletonFrameReadyEventArgs>(SkeletonFrameReady);
The Kinect NUI cannot track more than 2 Skeletons. The check if (SkeletonTrackingState.Tracked != data.TrackingState) continue; skips every Skeleton that is not tracked. Untracked Skeletons only give their position without the Joints. Also, a Skeleton will be rendered if the full body fits in the frame.
Debugging isn’t a simple task when developing for Kinect: you have to get up each time you want to test it.
Skeleton Joints are identified by the JointID enum, which defines each Joint’s reference position:
public enum JointID
Step 2: Get Joint Position
The Joint position is defined in Camera Space, so we need to translate it to our own size and position.
Depth Image Space
Image frames of the depth map are 640x480, 320x240, or 80x60 pixels in size, with each pixel representing the distance, in millimeters, to the nearest object at that particular x and y coordinate. A pixel value of 0 indicates that the sensor did not find any objects within its range at that location. The x and y coordinates of the image frame do not represent physical units in the room, but rather pixels on the depth imaging sensor. The interpretation of the x and y coordinates depends on specifics of the optics and imaging sensor. For discussion purposes, this projected space is referred to as the depth image space.
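To make the "0 means nothing found" rule concrete, here is a minimal sketch. It assumes the depth frame has already been unpacked into one ushort per pixel (millimeters, row-major order); the class and method names are hypothetical and are not part of the Kinect SDK.

```csharp
using System;

public static class DepthMapSketch
{
    // Returns the smallest non-zero depth (in millimeters) in the frame,
    // or null when every pixel is 0 (the sensor found nothing in range).
    public static int? NearestDepthMm(ushort[] depthMm)
    {
        int? nearest = null;
        foreach (ushort d in depthMm)
        {
            if (d == 0) continue; // 0 = no object found at this pixel
            if (nearest == null || d < nearest.Value) nearest = d;
        }
        return nearest;
    }

    // Converts an (x, y) pixel coordinate into an index of a row-major frame.
    public static int PixelIndex(int x, int y, int width)
    {
        return y * width + x;
    }
}
```

Note that x and y here are sensor pixels, not physical units, so PixelIndex only locates a reading in the frame; the physical distance is the value stored at that index.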
Skeleton Space
Player skeleton positions are expressed in x, y, and z coordinates. Unlike the coordinate of depth image space, these three coordinates are expressed in meters. The x, y, and z axes are the body axes of the depth sensor. This is a right-handed coordinate system that places the sensor array at the origin point with the positive z axis extending in the direction in which the sensor array points. The positive y axis extends upward, and the positive x axis extends to the left (with respect to the sensor array), as shown in Figure 5. For discussion purposes, this expression of coordinates is referred to as the skeleton space.
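Because skeleton space is measured in meters, the distance between two positions is a real-world distance. The sketch below illustrates this with a hypothetical Point3 struct standing in for an SDK position; none of these names come from the Kinect SDK.

```csharp
using System;

public static class SkeletonSpaceSketch
{
    // Hypothetical stand-in for a skeleton-space position (meters).
    public struct Point3
    {
        public float X, Y, Z;
        public Point3(float x, float y, float z) { X = x; Y = y; Z = z; }
    }

    // Euclidean distance between two skeleton-space points, in meters.
    public static double DistanceMeters(Point3 a, Point3 b)
    {
        double dx = a.X - b.X, dy = a.Y - b.Y, dz = a.Z - b.Z;
        return Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }

    // True when a is higher than b (the positive y axis extends upward).
    public static bool IsAbove(Point3 a, Point3 b)
    {
        return a.Y > b.Y;
    }
}
```

A check such as IsAbove(hand, head) could, for example, detect a raised hand, since y grows upward in this space.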
private Point getDisplayPosition(Joint joint)
{
    float depthX, depthY;
    _kinectNui.SkeletonEngine.SkeletonToDepthImage(joint.Position, out depthX, out depthY);
    depthX = Math.Max(0, Math.Min(depthX * 320, 320)); // convert to 320, 240 space
    depthY = Math.Max(0, Math.Min(depthY * 240, 240)); // convert to 320, 240 space
    int colorX, colorY;
    ImageViewArea iv = new ImageViewArea();
    // only ImageResolution.Resolution640x480 is supported at this point
    _kinectNui.NuiCamera.GetColorPixelCoordinatesFromDepthPixel(ImageResolution.Resolution640x480, iv,
        (int)depthX, (int)depthY, (short)0, out colorX, out colorY);
    // map back to imageContainer.Width & imageContainer.Height
    return new Point((int)(imageContainer.Width * colorX / 640.0) - 30, (int)(imageContainer.Height * colorY / 480) - 30);
}
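The scaling arithmetic in getDisplayPosition can be tried out in isolation. The sketch below assumes, as the multiplication by 320 and 240 suggests, that SkeletonToDepthImage returns coordinates in the 0 to 1 range. The class and method names are hypothetical, and the SDK calls themselves are left out.

```csharp
using System;

public static class DisplayMappingSketch
{
    // Scales a normalized coordinate (assumed 0..1) to a pixel range
    // and clamps the result to [0, size].
    public static float ToDepthPixels(float normalized, int size)
    {
        return Math.Max(0, Math.Min(normalized * size, size));
    }

    // Rescales a color-image pixel coordinate (e.g. out of 640 or 480)
    // to a container of the given size.
    public static int ToContainer(int colorCoord, double colorSize, double containerSize)
    {
        return (int)(containerSize * colorCoord / colorSize);
    }
}
```

For instance, a color pixel at x = 320 in a 640-wide image lands at x = 400 in an 800-wide container. The clamp keeps joints that drift outside the frame pinned to its edge instead of producing out-of-range pixels.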
Step 3: Place Image Based On Joint Type
Each skeleton has a position of type Vector4 (x, y, z, w) that indicates the center of mass for that skeleton. The first three attributes define the position in camera space. The last attribute (w) gives the quality level (between 0 and 1) of the position.
This value is the only available positional value for passive players.
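Here is a minimal sketch of how the w quality level can be used to discard unreliable positions. Vec4 is a hypothetical stand-in for the SDK's Vector4, and the threshold is an arbitrary value supplied by the caller.

```csharp
using System;
using System.Collections.Generic;

public static class JointQualitySketch
{
    // Hypothetical stand-in for the SDK's Vector4: x, y, z plus quality w (0..1).
    public struct Vec4
    {
        public float X, Y, Z, W;
        public Vec4(float x, float y, float z, float w) { X = x; Y = y; Z = z; W = w; }
    }

    // A position is treated as reliable when its quality reaches the threshold.
    public static bool IsReliable(Vec4 p, float threshold)
    {
        return p.W >= threshold;
    }

    // Keeps only the reliable positions and skips the rest.
    public static List<Vec4> Reliable(IEnumerable<Vec4> positions, float threshold)
    {
        var result = new List<Vec4>();
        foreach (Vec4 p in positions)
        {
            if (!IsReliable(p, threshold)) continue; // skip low-quality readings
            result.Add(p);
        }
        return result;
    }
}
```

Skipping a single low-quality position with continue lets the remaining positions still be processed in the same frame.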
void SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    foreach (SkeletonData data in e.SkeletonFrame.Skeletons)
    {
        if (SkeletonTrackingState.Tracked != data.TrackingState) continue;
        foreach (Joint joint in data.Joints)
        {
            if (joint.Position.W < 0.6f) return; // Quality check
            var heanp = getDisplayPosition(joint);
            var rhp = getDisplayPosition(joint);
            var lhp = getDisplayPosition(joint);
        }
    }
}