Developing interactive HTML5-based games with Kinect

Video gaming today exists on many platforms and in many forms: consoles, PCs, handhelds, and mobile devices. But beyond casual and competitive play, video games can also be educational.


The project we worked on for the Sheikh Abdullah Al Salem Cultural Centre, one of the largest museum complexes in the world, included developing games that accompanied the physical exhibits with engaging content and material for learning about the exhibited subject. Because our goal was to get the viewer interested in the content, we needed to present it in a fun and interactive way. Most of the games were built for large touchscreens and supported multiplayer, with gameplay that adapted to the number of players. But three of the games had to be developed for Microsoft Kinect, so we needed to adjust our mindset and perspective (literally), as all three Kinect games were set up with the sensor facing the floor.

Vertical placement of Kinect

The standard, and the only officially supported, way of placing the Kinect sensor is on a flat surface; a slight tilt on each axis is tolerated, but the flatter the better. If the sensor is placed in any other way, the official detection API will not work. Since we had the Kinect mounted vertically, we had to implement our own algorithms for tracking hands, feet and people in general. We still needed a way to access the data the sensor was generating, and fortunately there is an open source Node.js library called kinect2 that enables just that. Because the official algorithms inside the SDK don't support tracking in a vertical position, we used only the raw depth and color frames from the kinect2 library.
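As a rough illustration, consuming the raw frames with kinect2 can look something like this (the reader and event names follow the library's documented usage; the tracking function is just a placeholder for our own logic):

```js
const Kinect2 = require('kinect2');
const kinect = new Kinect2();

if (kinect.open()) {
  // A Kinect v2 depth frame is a 512x424 grid of distance values.
  kinect.on('depthFrame', (frame) => {
    trackFromDepth(frame); // placeholder for custom vertical-tracking logic
  });
  kinect.openDepthReader();
  // A color reader can be opened in the same event-based way when needed.
}
```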


Development setup

The tracking implementation we developed was written mainly in JavaScript, with some parts in TypeScript. There were subtle differences in the implementation of each of the three apps, but all of them shared the same concept for transferring the tracking data to the front-end application. As the official Kinect SDK only works on Windows, we had to use a Windows PC with Kinect Studio to record samples.

Kinect Studio has built-in functionality for playing back recorded data, which greatly improved the development experience because a recording could be played in a loop. The data streamed from the Kinect was then processed by our tracking software and transferred via WebSockets to the client app, along with the information needed to render each object in the scene.
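As a sketch of that pipeline (the ws package, the port and the message shape below are illustrative assumptions, not the project's exact setup):

```js
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8081 }); // port chosen for illustration

// Push the latest processed tracking data to every connected client app.
function broadcast(trackedObjects) {
  const payload = JSON.stringify({ type: 'tracking', objects: trackedObjects });
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.send(payload);
    }
  });
}
```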

A journey to performance

The UI aspect of these applications was built with HTML5 technologies. For each of the three apps, we chose frameworks that best suited the game’s requirements.

One of the most challenging applications was the visually simplest one – an interactive forest-floor projection. The whole gameplay was based on user interaction, and we knew we needed to make the physics feel as real as possible, otherwise the app would be unconvincing. The forest contained three surfaces – ground covered with leaves, snow and a pond. Each section looked and behaved differently, so we approached each one with different techniques.

To create the leaf particles and the water surface, we used Three.js – a JavaScript 3D library with a great API and plenty of examples that serve as a good starting point. As Three.js is a WebGL library, it can potentially use a lot of computer resources, so we focused heavily on rendering performance. The leaf scene did not contain a huge number of particles, but the ones that were rendered were visually complex and needed to be movable by the players' actions. Since the rendering itself, even without movement, already affected performance, we knew we had to be careful when implementing the physics. We did not use a physics engine; instead, we developed our own particle-movement algorithms, which fit better with the art direction we were going for.
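To illustrate the general idea (a simplified sketch, not the algorithm we actually shipped), custom leaf movement can be as simple as pushing particles away from a tracked foot position and damping their velocity each frame:

```js
import * as THREE from 'three';

const COUNT = 500;
const positions = new Float32Array(COUNT * 3);
const velocities = new Float32Array(COUNT * 3);

for (let i = 0; i < COUNT; i++) {
  positions[i * 3] = (Math.random() - 0.5) * 10;     // x
  positions[i * 3 + 1] = 0;                          // y (the floor)
  positions[i * 3 + 2] = (Math.random() - 0.5) * 10; // z
}

const geometry = new THREE.BufferGeometry();
geometry.setAttribute('position', new THREE.BufferAttribute(positions, 3));
const leaves = new THREE.Points(geometry, new THREE.PointsMaterial({ size: 0.2 }));

// Per-frame update: leaves near a tracked foot get pushed away, then settle.
function update(foot) {
  for (let i = 0; i < COUNT; i++) {
    const ix = i * 3;
    const iz = i * 3 + 2;
    const dx = positions[ix] - foot.x;
    const dz = positions[iz] - foot.z;
    const dist = Math.hypot(dx, dz) || 0.001;
    if (dist < 1) {                        // only nearby leaves react
      velocities[ix] += (dx / dist) * 0.05;
      velocities[iz] += (dz / dist) * 0.05;
    }
    positions[ix] += velocities[ix];
    positions[iz] += velocities[iz];
    velocities[ix] *= 0.9;                 // damping
    velocities[iz] *= 0.9;
  }
  geometry.attributes.position.needsUpdate = true;
}
```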

Presenting the movement

The exhibit needed to support multiple people simultaneously, so we had to optimize the amount of data we sent from the Kinect to the UI, since the streams produced by the sensor's SDK are huge. Our Kinect detection algorithm triggered movement events too frequently, and we needed to throttle the number of bytes we transferred. The solution we introduced was to process the data and transfer it to the UI at an interval that introduced no perceptible delay for the user, yet left plenty of headroom in the system for other calculations. Because the number of movement events was reduced, we also found we needed to predict the user's behavior and movement so that position changes on screen wouldn't look too linear and abrupt – tracking the positions of the steps the user made and combining them with a transition resulted in a more realistic experience.
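In simplified form (the interval and smoothing factor here are illustrative values, tuned per application in practice), the idea looks like this:

```js
// Server side: send tracked positions at a fixed interval instead of per frame.
const SEND_INTERVAL_MS = 50; // illustrative value
let latestPositions = [];

setInterval(() => {
  if (latestPositions.length) {
    broadcast(latestPositions); // e.g. the WebSocket broadcast sketched earlier
  }
}, SEND_INTERVAL_MS);

// Client side: ease the on-screen position toward the latest received target,
// so the reduced event rate doesn't produce sharp, linear jumps.
function smoothTowards(current, target, factor = 0.2) {
  return {
    x: current.x + (target.x - current.x) * factor,
    y: current.y + (target.y - current.y) * factor,
  };
}
```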

HTML as a 2D surface

Based on the requirements of the other Kinect applications we needed to develop, we realized we didn't need 3D rendering. The design and the general look and feel could be represented with animations, shadows, different backgrounds and similar tricks inside a two-dimensional space, mimicking a three-dimensional one while still maintaining the visual identity. The design was crafted in such a way that the most practical way to create game components was to use regular HTML elements and code them as if they represented 2D models on a canvas.

Furthermore, those components needed to represent a model in space, and our goal was to let the user interact with them by hovering, dragging and grabbing within a certain area. To give those elements mass and a touch of gravity, we reached for a popular JavaScript 2D physics engine. The engine enabled us to attach physical attributes to HTML elements and treat them as if they were entities inside a game engine. By combining this with the user-gesture events captured with Kinect, we created a very natural experience for the player that felt intuitive and fun.
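Taking Matter.js as one popular example of such an engine (the element, sizes and world setup below are illustrative assumptions), the coupling between a physics body and an HTML element can be sketched like this:

```js
import Matter from 'matter-js';

const engine = Matter.Engine.create();
const element = document.querySelector('.game-card');    // hypothetical game component
const card = Matter.Bodies.rectangle(200, 0, 120, 160);  // body matching the element's size
const floor = Matter.Bodies.rectangle(200, 600, 800, 40, { isStatic: true });
Matter.Composite.add(engine.world, [card, floor]);

// Every animation frame: step the simulation, then mirror the body's position
// and rotation onto the HTML element with a CSS transform.
function render() {
  Matter.Engine.update(engine, 1000 / 60);
  element.style.transform =
    `translate(${card.position.x}px, ${card.position.y}px) rotate(${card.angle}rad)`;
  requestAnimationFrame(render);
}
requestAnimationFrame(render);
```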

Wrapping up

Programming applications with Kinect can be a very refreshing experience for a developer, as it encourages innovation and makes you rethink the way we interact with user interfaces.

By combining creative programming with data such as the positions of hands or feet, you can build applications that look conceptually futuristic but at the same time feel very familiar to the user. Although the Kinect sensor is no longer officially supported by Microsoft, it can still be used as a robust sensor for capturing depth data from the 3D world. With adequate open source libraries, developing Kinect-based applications can actually be a very entertaining process.