After several attempts to detect gestures by relying on skin color, I tried object detection with the well-known Viola & Jones method, as implemented in the OpenCV library.
The experiment was about detecting a gesture used to select a displayed item. Among all possible hand movements, the one shown in the figure opposite seemed the most intuitive and recognizable.
The Viola & Jones method is well known for detecting faces efficiently. But nothing is provided for hands, even less for the back of the hand. The software must be trained to identify this particular shape.
Computing the shape's specific Haar-like features used by the detection algorithm requires many samples: a set of pictures focused on the main object (the positives), and a set of random scenes that do not contain the shape to be detected (the negatives).
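To make the idea of a Haar-like feature concrete, here is a minimal sketch in plain Python. It is not OpenCV's implementation: the function names are hypothetical, and it only covers one feature type (a two-rectangle edge feature: left half minus right half), computed through an integral image so that each rectangle sum costs four lookups.

```python
def integral_image(img):
    """Summed-area table of size (h+1) x (w+1), with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Edge feature: sum of the left half minus sum of the right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Tiny grayscale patch: bright on the left, dark on the right.
patch = [
    [10, 10, 0, 0],
    [10, 10, 0, 0],
]
ii = integral_image(patch)
print(two_rect_feature(ii, 0, 0, 4, 2))  # strong response on a vertical edge
```

Training amounts to picking, among many such rectangles at many positions and scales, the ones whose response best separates the positive pictures from the negative ones.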
- This software uses the webcam stream.
- A ten-second countdown allows the user to fit an object into the capture area.
- At the end of the countdown, a snapshot is resized and saved.
- Positive pictures are rectangular; this is not a problem.
- The picture is then processed by an OpenCV utility (named createsamples), which creates more images, slightly distorted: modified contrast, scale or rotation, negative colours, …
- For every supplied picture, 20 are created.
Reaching my goal of 1500 positive images took 10 minutes.
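The capture steps above could look roughly like the sketch below. It is an assumption about how such a tool might be written, not the actual program: the function names, the output size and the camera index are all hypothetical. It relies only on standard OpenCV Python calls (cv2.VideoCapture, cv2.resize, cv2.imwrite), and also works out how many snapshots a given target requires when each one is expanded into 20 images.

```python
import math
import time


def snapshots_needed(target_positives, variants_per_snapshot=20):
    """Number of webcam snapshots needed if each one yields
    `variants_per_snapshot` distorted images."""
    return math.ceil(target_positives / variants_per_snapshot)


def capture_snapshot(out_path, countdown_s=10, size=(40, 20), camera_index=0):
    """Show the webcam stream during a countdown, then save one resized frame.

    `size` is (width, height). The default is a placeholder, not the
    sample size used in the experiment.
    """
    import cv2  # imported lazily so snapshots_needed() works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("could not open the webcam")
    try:
        deadline = time.monotonic() + countdown_s
        frame = None
        while time.monotonic() < deadline:
            ok, frame = cap.read()
            if not ok:
                raise RuntimeError("could not read a frame from the webcam")
            # Draw the remaining seconds on a copy so the saved frame stays clean.
            remaining = math.ceil(deadline - time.monotonic())
            preview = frame.copy()
            cv2.putText(preview, str(remaining), (10, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
            cv2.imshow("capture", preview)
            cv2.waitKey(1)
        if frame is None:
            raise RuntimeError("countdown ended before any frame was read")
        # Save the last frame seen, resized to the sample size.
        cv2.imwrite(out_path, cv2.resize(frame, size))
    finally:
        cap.release()
        cv2.destroyAllWindows()


print(snapshots_needed(1500))  # prints 75
```

With 20 generated images per snapshot, a 1500-image target comes down to 75 snapshots, which is why the collection step stays short.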
1500 positive images and 3500 negatives were enough for OpenCV to compute the Haar-like features. The OpenCV library provides the “haartraining” utility, which does the job for us.
The goal was to detect the shape with wide tolerances, allowing a few false positives. It took the computer 8 hours to work out a 10-stage detector.
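The trade-off between tolerance and false positives in a staged detector can be illustrated with a little arithmetic. In a cascade, a window must pass every stage, so per-stage rates multiply. The per-stage values below (0.995 minimum hit rate, 0.5 maximum false alarm rate) are illustrative assumptions, not the settings used for this experiment, and the function name is hypothetical.

```python
def cascade_rates(stages, min_hit_rate, max_false_alarm):
    """Overall bounds for a cascade in which every stage must accept a window."""
    overall_hit = min_hit_rate ** stages        # lower bound on the detection rate
    overall_false = max_false_alarm ** stages   # upper bound on the false alarm rate
    return overall_hit, overall_false


hit, false_alarm = cascade_rates(10, 0.995, 0.5)
print(f"overall hit rate    >= {hit:.3f}")
print(f"overall false alarm <= {false_alarm:.6f}")
```

Under these assumptions, 10 stages keep roughly 95% of the true shapes while letting through about one background window in a thousand. Each extra stage halves the false alarms, at the cost of more training time.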
Below are some of the features involved in detection:
The results are quite interesting given the short training duration. The initial shape is detected correctly and efficiently, and false positives seem to be rare.
This shape recognition could be one part of a detection process, used simply to locate features that the software then investigates to identify a user interaction.