I’ve been working on a method for automatic head-pose tracking, and along the way have come to model facial appearances. I start by initializing a facial bounding box with the Viola-Jones detector, a well-known and robust detector trained to find objects such as faces. This lets me localize the face. Once I know where the 2D plane of the face lies in an image, I can register an Active Shape Model like so:
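The Viola-Jones step gets its speed from the integral image, which reduces any rectangular Haar feature to a four-lookup sum. A minimal pure-Python sketch of that underlying trick (not the detector itself, and not the code I actually use):

```python
def integral_image(img):
    """Compute the summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle (x0,y0)-(x1,y1): four lookups."""
    s = ii[y1][x1]
    if x0 > 0:
        s -= ii[y1][x0 - 1]
    if y0 > 0:
        s -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        s += ii[y0 - 1][x0 - 1]
    return s
```

Every Haar feature the cascade evaluates is a difference of a few such rectangle sums, which is why detection runs in real time.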
After collecting multiple views of the possible appearance variations of my face, including slight rotations, I construct an appearance model.
The idea I am working with is using the first components of variations of this appearance model for determining pose. Here I show the first two basis vectors and the images they reconstruct:
As you may notice, these two basis vectors very neatly encode rotation. By projecting a face onto them and examining the resulting coefficients, you can also estimate pose.
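Concretely, reading pose off the model amounts to subtracting the mean appearance and taking dot products against the basis vectors; the resulting coefficients vary smoothly with rotation. A toy sketch (the tiny vectors and axis-aligned basis in the test are made up for illustration, not from my model):

```python
def project(image_vec, mean_vec, basis):
    """Project a vectorized face onto an orthonormal appearance basis.
    Returns one coefficient per basis vector; with a rotation-encoding
    basis, the first coefficients track head pose."""
    centered = [p - m for p, m in zip(image_vec, mean_vec)]
    return [sum(b_i * c_i for b_i, c_i in zip(b, centered)) for b in basis]
```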
Archived entries for opencv
Facial Appearance Modeling/Tracking
Keyframe based modeling
Playing with MSERs while trying to implement an algorithm for feature-based object tracking. The algorithm first finds MSERs, warps them to circles, describes them with a SIFT descriptor, and then indexes keyframes of SIFT vectors using vocabulary trees. That is a ridiculously simplified explanation, of course, but look at what it’s capable of:
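The vocabulary-tree step can be pictured as quantizing each descriptor to its nearest “visual word” and building a bag-of-words histogram per keyframe. A flat (tree-less) Python sketch of that idea, with made-up toy descriptors in place of real SIFT vectors:

```python
def quantize(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word (a flat version of
    the vocabulary-tree lookup) and return a bag-of-words histogram."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        best = min(range(len(vocabulary)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(d, vocabulary[i])))
        hist[best] += 1
    return hist
```

A real vocabulary tree does this lookup hierarchically, descending one cluster level at a time, so indexing stays fast even with a vocabulary of millions of words.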
OpenCV 2.0 Introduces CPP Style Coding and More
No more void *‘s apparently. And they’ve got constructors and destructors.
Check out the documentation here: http://opencv.willowgarage.com/documentation/cpp/index.html
Notably, the memory management seems to be much nicer as destructors are called when there are no more references to the object.
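That reference-counting behavior can be mimicked with a toy container; this is only a sketch of the semantics, not OpenCV’s actual implementation:

```python
class SharedMat:
    """Toy model of cv::Mat's reference counting: sharing bumps a counter
    held jointly by all headers, and the pixel buffer is 'freed' only when
    the last header releases it."""
    def __init__(self, data):
        self.data = data
        self.refcount = [1]   # list so all headers see the same counter

    def share(self):
        """Cheap copy: new header, same buffer (like Mat's copy constructor)."""
        m = SharedMat.__new__(SharedMat)
        m.data, m.refcount = self.data, self.refcount
        self.refcount[0] += 1
        return m

    def release(self):
        """Drop one reference; deallocate the buffer on the last release."""
        self.refcount[0] -= 1
        if self.refcount[0] == 0:
            self.data = None  # buffer would be deallocated here
```

In real OpenCV 2.0 the release happens automatically in the destructor, which is what makes the new interface so much less leak-prone than manual cvReleaseImage calls.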
As well, quickly accessing data row or plane major seems to be much easier now:
e.g. plane access:

    // split the image into separate color planes
    vector<Mat> planes;
    split(img, planes);

    // access with iterators:
    MatIterator_<uchar> it = planes[0].begin<uchar>(),
                        it_end = planes[0].end<uchar>();
    for (; it != it_end; ++it)
        *it = saturate_cast<uchar>(*it * 2);
As well, they have implemented STL-like class traits for easily declaring matrices with the normal c++ primitives without having to remember CV_64F etc…
    Mat A(30, 40, DataType<float>::type);
    Mat B = Mat_<std::complex<double> >(3, 3);
There is a whole lot more introduced including a revamped interface system, many more machine learning and computer vision algorithms, OMP integration, and probably a lot more. I’ll be playing with it to see what else is going on. Hopefully the openframeworks community will pick up on it as well and integrate it into their next major release. I know I’ll be doing so for my projects.
OpenCV 1.2.0 (2.0 Beta)
Taken from the changelog:
>>> New functionality, features: <<<
* The brand-new C++ interface for most of OpenCV functionality
(cxcore, cv, highgui) has been introduced.
Generally it means that you will need to do less coding to achieve the same results;
it brings automatic memory management and many other advantages.
See the C++ Reference section in opencv/doc/opencv.pdf and opencv/include/opencv/*.hpp.
The previous interface is retained and still supported.
* The source directory structure has been reorganized; now all the external headers are placed
in a single directory on all platforms.
* The primary build system is CMake, http://www.cmake.org (2.6.x is the preferable version).
+ In the Windows package the project files for Visual Studio, makefiles for MSVC,
Borland C++ or MinGW are not supplied anymore; please generate them using CMake.
+ On Mac OS X users can generate project files for Xcode.
+ On Linux and any other platform users can generate project files for
cross-platform IDEs, such as Eclipse or Code::Blocks,
or makefiles for building OpenCV from the command line.
* OpenCV repository has been converted to Subversion, hosted at SourceForge:
where the very latest snapshot is at
and the more or less stable version can be found at
– CXCORE, CV, CVAUX:
* CXCORE now uses Lapack (CLapack 3.1.1.1 in OpenCV 2.0) in its various linear algebra functions
(such as solve, invert, SVD, determinant, eigen etc.) and the corresponding old-style functions
(cvSolve, cvInvert etc.)
* Lots of new feature and object detectors and descriptors have been added
(there is no documentation on them yet), see cv.hpp and cvaux.hpp:
+ FAST – the fast corner detector, submitted by Edward Rosten
+ MSER – maximally stable extremal regions, submitted by Liu Liu
+ LDetector – fast circle-based feature detector by V. Lepetit (a.k.a. YAPE)
+ Fern-based point classifier and the planar object detector -
based on the works by M. Ozuysal and V. Lepetit
+ One-way descriptor – a powerful PCA-based feature descriptor,
(S. Hinterstoisser, O. Kutter, N. Navab, P. Fua, and V. Lepetit,
“Real-Time Learning of Accurate Patch Rectification”).
Contributed by Victor Eruhimov
+ Spin Images 3D feature descriptor – based on the A. Johnson PhD thesis;
implemented by Anatoly Baksheev
+ Self-similarity features – contributed by Rainer Lienhart
+ HOG people and object detector – the reimplementation of Navneet Dalal framework
(http://pascal.inrialpes.fr/soft/olt/). Currently, only the detection part is ported,
but it is fully compatible with the original training code.
See cvaux.hpp and opencv/samples/c/peopledetect.cpp.
+ Extended variant of the Haar feature-based object detector – implemented by Maria Dimashova.
It now supports Haar features and LBPs (local binary patterns);
other features can be more or less easily added
+ Adaptive skin detector and the fuzzy meanshift tracker – contributed by Farhad Dadgostar,
see cvaux.hpp and opencv/samples/c/adaptiveskindetector.cpp
* The new traincascade application complementing the new-style HAAR+LBP object detector has been added.
* The powerful library for approximate nearest neighbor search FLANN by Marius Muja
is now shipped with OpenCV, and the OpenCV-style interface to the library
is included into cxcore. See cxcore.hpp and opencv/samples/c/find_obj.cpp
* The bundle adjustment engine has been contributed by PhaseSpace; see cvaux.hpp
* Added dense optical flow estimation function (based on the paper
“Two-Frame Motion Estimation Based on Polynomial Expansion” by G. Farnebäck).
See cv::calcOpticalFlowFarneback and the C++ documentation
* Image warping operations (resize, remap, warpAffine, warpPerspective)
now all support bicubic and Lanczos interpolation.
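For reference, the Lanczos interpolation mode mentioned above weights neighboring pixels with the Lanczos kernel (window parameter a, commonly 3); a small sketch of just the kernel:

```python
import math

def lanczos(x, a=3):
    """Lanczos windowed-sinc kernel: sinc(x) * sinc(x/a) for |x| < a,
    zero outside the window. Used as the resampling weight for a pixel
    at distance x from the interpolation point."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)
```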
* Most of the new linear and non-linear filtering operations (filter2D, sepFilter2D, erode, dilate …)
support arbitrary border modes and can use the valid image pixels outside of the ROI
(i.e. the ROIs are not “isolated” anymore), see the C++ documentation.
* The data can now be saved to and loaded from GZIP-compressed XML/YML files, e.g. by passing a filename ending in .xml.gz.
* Added the Extremely Random Trees, which train much faster
than Boosting or Random Trees (by Maria Dimashova).
* The decision tree engine and the classes based on it
(Decision Tree itself, Boost, Random Trees)
have been reworked and now:
+ they consume much less memory (up to 200% savings)
+ the training can be run in multiple threads (when OpenCV is built with OpenMP support)
+ the boosting classification on numerical variables is especially
fast because of the specialized low-overhead branch.
* mltest has been added. While far from being complete,
it contains correctness tests for some of the MLL classes.
* [Linux] The support for stereo cameras (currently Videre only) has been added.
There is now a uniform interface for capturing video from two-, three- … n-head cameras.
* Images can now be compressed to or decompressed from buffers in the memory,
see the C++ HighGUI reference manual
* The reference manual has been converted from HTML to LaTeX (by James Bowman and Caroline Pantofaru),
so there is now:
+ opencv.pdf for reading offline
+ and the online up-to-date documentation
(as the result of LaTeX->Sphinx->HTML conversion) available at
– Samples, misc.:
* Better eye detector has been contributed by Shiqi Yu,
* sample LBP cascade for the frontal face detection
has been created by Maria Dimashova,
* Several high-quality body parts and facial feature detectors
have been contributed by Modesto Castrillon-Santana,
* Many of the basic functions and the image processing operations
(like arithmetic operations, geometric image transformations, filtering etc.)
have got SSE2 optimization, so they are several times faster.
– The model of IPP support has been changed. Now IPP is supposed to be
detected by CMake at the configuration stage and linked against OpenCV.
(In the beta it is not implemented yet though).
* PNG encoder performance improved by a factor of 4 by tuning the parameters.
I’ve recently finished a project in collaboration with a glass artist, Agelos Papadakis. We built a structure of 25 glass neurons, each about the size of a face, and chained them together into a 3x3x5 meter sculpture. Two cameras hidden in the piece tracked people’s faces, and a projector created visualizations of the recorded faces, resembling a cloud of neurons firing in different patterns. We presented it first in Edinburgh at Lauriston Castle’s Glasshouse, and then at the Passing Through exhibit in the James Taylor Gallery in Hackney: http://jamestaylorgallery.co.uk/exhibitions/2009/03/passing-through.html
It’s a bit tricky trying to film the piece since it uses projection onto glass. Sadly I’m left with only a few images that try to portray what went on.
Here’s the code, http://ccrma.stanford.edu/~pkmital/share/Memory.zip It makes use of the openframeworks library so you will need to be familiar with how to setup an XCode project with the openframeworks library if you plan on using it.
The original idea was to use glass balls, which is why the code says glassBalls instead of, say, glassNeurons. If you manage to get it running, press ‘d’ to see the live video input. As it detects faces, it fills up each “glassBall”’s image buffer. Once all the glassBalls are loaded with images of faces, the visualization begins: neurons “fire”, and the brightness value of each neuron rises and falls according to a Gaussian function; these values are sent to the brightness shader.
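The Gaussian envelope driving each neuron’s brightness can be sketched like this (parameter names are mine, not the project’s):

```python
import math

def firing_brightness(t, t_fire, sigma):
    """Brightness envelope for one neuron at time t: peaks at 1.0 when the
    neuron 'fires' at t_fire, then falls off as a Gaussian with width sigma.
    This is the value that would be handed to the brightness shader."""
    return math.exp(-((t - t_fire) ** 2) / (2.0 * sigma ** 2))
```

Staggering each neuron’s t_fire produces the cascading fire-and-fade patterns across the sculpture.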
It was a bit of a sculptural challenge: placing all the glass pieces within view of the projector, keeping other glass pieces and chains out of its line of sight, fitting it within our vision for the piece, and doing it all in the time we had. We did the projection mapping by simply creating each object with a bounding box. Once you moused over an image of a person’s face, you could drag it around and use ‘-‘, ‘=’, ‘_’, or ‘+’ to resize it. I used a text file to store (‘w’) and read (‘r’) the positions of the faces in case I had to reload the program. I think memo has an MSAInteractiveObject class now which would be 1000x nicer to use.
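The store (‘w’) / read (‘r’) step can be sketched as one “x y w h” line per face rectangle in a plain text file (the format here is my guess for illustration, not the project’s actual one):

```python
import os
import tempfile

def save_positions(path, rects):
    """Write each face rectangle as 'x y w h' on its own line."""
    with open(path, "w") as f:
        for (x, y, w, h) in rects:
            f.write("%d %d %d %d\n" % (x, y, w, h))

def load_positions(path):
    """Read the rectangles back in the order they were saved."""
    rects = []
    with open(path) as f:
        for line in f:
            x, y, w, h = map(int, line.split())
            rects.append((x, y, w, h))
    return rects
```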
Just a note as well: I wrote this against of573. I also ended up extending ofxCvColorImage with a few functions, so I’ve included all of the ofxCv* files in the project. There may well be other functions I’ve edited and forgotten about, so please let me know if you have any problems with it. Have a look through and let me know what you think!
OpenCV with Processing using Eclipse
My students for the Digital Media Studio Project here at the University of Edinburgh have asked me to present a small workshop on using some aspects of the Processing.org environment. I’ve worked up something and thought I could share it online as well. I’ve setup a google code repository with the necessary files. The code simply highlights what you could find throughout the Processing.org discourse and the OpenCV example files though is more thoroughly commented and organized.
A few notes: I really dislike the Processing IDE. Maybe it’s just because I’ve used IDEs like VS, NetBeans, Eclipse, and Xcode, and I haven’t played with Processing enough to have a well-founded sense of the functions available. I believe going through a few extra steps to set up an IDE like Eclipse makes the task of programming much easier, though at the cost of a bulky editor that may not be so easy to set up at first.
Eclipse is an Integrated Development Environment (IDE) for many coding languages, one of which is Java. Some advantages:
- code completion – automatically see possible choices for all members belonging to a class definition, such as functions and their arguments.
- javadocs – javadoc is a format for writing code comments. By following a simple format, javadoc can produce a nice HTML document outlining all the functions, members, arguments, expected behavior, etc. While coding, having the ability to see javadocs is invaluable, as memorizing all the members of a class is rarely practical.
- browsing libraries – along the lines of javadocs, being able to see the definitions of a class is much easier than having to memorize all the functions belonging to something like processing.core.PImage – and in the Eclipse environment, you can view the javadocs alongside the libraries.
- debug – step through your program and view the stack trace, threads, and all the messy hex numbers.
The biggest disadvantages are that it takes time to setup a project, include files, and write the class definitions, none of which you will have to do in the Processing IDE. Luckily, there is a nice tutorial for setting up Eclipse to use the Processing libraries: http://processing.org/learning/tutorials/eclipse/ – I recommend going through this thoroughly.
OpenCV is an open source, cross-platform library developed by Intel and used widely by researchers in fields such as medical imaging, artificial intelligence, and interactive art. There is a nice port available that includes a minimal though nice set of functions for the Processing and Java environment: http://ubaa.net/shared/processing/opencv/ – This page provides detailed instructions and very nice documentation on setting up the OpenCV environment. If you are on OS X and are looking for the Java Extensions folder, try the folder: /Library/Java/Extensions.