Monday, September 9, 2013

Real-Time Video Processing on iPhone

I was really impressed with the features integrated into the new Snapdragon-enabled Android devices. They have really changed the market, particularly the facial processing features. I wanted to take a look at how facial processing is implemented. Before leaning on any one framework, I had a quick glance at some of the SDKs and libraries available for facial processing. I found many falling into this category, most with desktop implementation samples as well.
Like OpenCV, SimpleCV, Faint, and so on. I also found a number of commercial vendors that provide facial recognition packages, such as Cybula, NeuroTechnology, Pittsburgh Pattern Recognition, and Sensible Vision.

Freezing the requirement helped me filter the choices. I wanted to explore and find a library or framework for facial processing on iOS; maybe not exactly like Snapdragon's facial processing for Android, but something close to it. Before going for open source or third-party libraries, it is always wise to look at the native iOS SDK options. For anyone who just needs to detect a face in real time or in an image, CIDetector from the Core Image framework is an easy choice. A couple of lines of code will get the work done.

Have a look at Apple's documentation for the native implementation to get this done here.
But for anyone who needs to do much more than just detecting the face, eyes, and mouth, a good open source library designed for the purpose is a better choice.
OpenCV provides a framework for iOS that can be downloaded here. Clean and clear API documentation is available on the OpenCV website, along with sample desktop and mobile implementations that are useful for a quick start.

Real-time video processing requires you to write some code to capture frames at the rate you need before they get processed. Apple's documentation is a wonderful place to learn the basics and get started here.
With the motive of detecting the face and eyes in real time, I started by extending the OpenCV sample for iOS found here.

Face detection is easily possible using a cascade classifier for Haar features. A cascade classifier basically tells OpenCV what to search for in the image you specify. I used the lbpcascade_frontalface classifier in my project to detect the face. There are many cascade classifiers available on the internet, and you can also train your own cascade classifier to tell OpenCV to recognize an object of your choice.
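To make the "cascade" idea concrete, here is a hypothetical sketch (not OpenCV's actual implementation): a cascade runs a sequence of cheap stage tests over each candidate window and rejects the window as soon as any stage fails, so most windows exit early and only promising ones pay the full cost. The `Window`, `passesCascade`, and stage predicates below are illustrative names, not OpenCV API.

```cpp
#include <functional>
#include <vector>

// A candidate detection window in the image.
struct Window { int x, y, w, h; };

// Run the window through every stage in order; reject at the first
// failed stage (early rejection), accept only if all stages pass.
bool passesCascade(const Window& win,
                   const std::vector<std::function<bool(const Window&)>>& stages) {
    for (const auto& stage : stages) {
        if (!stage(win)) return false;  // early rejection: stop immediately
    }
    return true;  // survived every stage -> treated as a detection
}
```

In the real classifier each stage evaluates boosted Haar or LBP features rather than simple geometric checks, but the control flow is the same.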
Extending the project, I detected the eyes within the reduced face area by loading the haarcascade_lefteye_2splits classifier. But even though OpenCV is one of the fastest machine vision libraries, speed becomes a bottleneck on mobile devices. I found it also depends on the device capability: processing speed differs from an iPod to an iPhone 4 to an iPhone 5, and processing is noticeably faster on the latest iPhone 5. Still, we have some control over how the frames are processed. Loading more cascades, to identify the face, left eye, and right eye, slows down the process; reducing the number of cascades loaded improves speed. Reducing the image area to search is also a best practice for improving speed.
Reducing the search area is all about fixing your region of interest (ROI): you don't have to search for an eye in the lower half of the face. But to reduce the use of classifiers, something more has to be done. Template matching is one thing that can be tried. Of course it has its limitations and constraints, but it really speeds up the processing. This is how I got it done.
 
Load the specific classifiers.

NSString *faceCascadePath = [[NSBundle mainBundle] pathForResource:@"lbpcascade_frontalface" ofType:@"xml"];
NSString *lefteyeCascadePath = [[NSBundle mainBundle] pathForResource:@"haarcascade_lefteye_2splits" ofType:@"xml"];
NSString *righteyeCascadePath = [[NSBundle mainBundle] pathForResource:@"haarcascade_righteye_2splits" ofType:@"xml"];
if (!_faceCascade.load([faceCascadePath UTF8String])) {
    NSLog(@"Could not load face cascade: %@", faceCascadePath);
}
if (!_leftEyeCascade.load([lefteyeCascadePath UTF8String])) {
    NSLog(@"Could not load leftEye cascade: %@", lefteyeCascadePath);
}
if (!_rightEyeCascade.load([righteyeCascadePath UTF8String])) {
    NSLog(@"Could not load rightEye cascade: %@", righteyeCascadePath);
}

Implement the delegate that delivers each captured frame as a cv::Mat. This is where your additional processing code goes.

// Convert the RGBA frame to grayscale before detection.
cvtColor(image, image, CV_RGBA2GRAY);

std::vector<cv::Rect> faces;
_faceCascade.detectMultiScale(image, faces, 1.1, 2, kHaarEyeOptions, cv::Size(40, 40));

Loop through the array of detected faces.

for (int i = 0; i < faces.size(); i++) {
    std::vector<cv::Rect> lefteyes;
    std::vector<cv::Rect> righteyes;
    cv::Rect faceRect = faces[i];
    // Draw the face rect on screen.
    cv::rectangle(image, faceRect.tl(), faceRect.br(), cv::Scalar(255, 0, 0));
 
    // Right-eye search area: skip a 1/16-width margin, start below the
    // forehead (about 1/4.5 down the face), take a third of the face height.
    CGRect eyearea_right = CGRectMake(faces[i].x + faces[i].width/16,
                                      faces[i].y + (faces[i].height/4.5),
                                      (faces[i].width - 2*faces[i].width/16)/2,
                                      faces[i].height/3.0);
    cv::Rect eROI(eyearea_right.origin.x, eyearea_right.origin.y,
                  eyearea_right.size.width, eyearea_right.size.height);
    cv::Mat croppedReye = image(eROI);          // view into the frame
    cv::Mat cropReye = image(eROI).clone();     // deep copy for template matching

    // Left-eye search area: the other half of the same strip.
    CGRect eyearea_left = CGRectMake(faces[i].x + faces[i].width/16 + (faces[i].width - 2*faces[i].width/16)/2,
                                     faces[i].y + (faces[i].height/4.5),
                                     (faces[i].width - 2*faces[i].width/16)/2,
                                     faces[i].height/3.0);
    cv::Rect eLOI(eyearea_left.origin.x, eyearea_left.origin.y,
                  eyearea_left.size.width, eyearea_left.size.height);
    cv::Mat croppedLeye = image(eLOI);
    cv::Mat cropLeye = image(eLOI).clone();
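The eye-area arithmetic above can be checked with plain integers. The sketch below is a hypothetical restatement of those CGRect formulas as a small C++ function (`Rect` and `eyeSearchArea` are names I introduce for illustration; they are not part of the project or of OpenCV).

```cpp
// Minimal rectangle type standing in for CGRect/cv::Rect.
struct Rect { int x, y, w, h; };

// Eye search area inside a detected face rectangle, mirroring the blog's
// formulas: skip a 1/16-width margin on each side, split the remaining
// width in half, start ~1/4.5 down the face, keep a third of its height.
Rect eyeSearchArea(const Rect& face, bool leftHalf) {
    int margin = face.w / 16;
    int innerW = (face.w - 2 * margin) / 2;   // half the face, minus margins
    int x = face.x + margin + (leftHalf ? innerW : 0);
    int y = face.y + static_cast<int>(face.h / 4.5);
    int h = static_cast<int>(face.h / 3.0);
    return Rect{x, y, innerW, h};
}
```

For a 160x160 face at (100, 50) this gives two 70x53 strips starting 35 pixels below the top of the face, which is far smaller than scanning the whole face for each eye.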


Learn a few frames to build templates of both eyes, which are used later for the template matching implementation.

if (learnTemplate < 50) {
    _lEyeTemplate = [self getTemplate:croppedReye];
    _rEyeTemplate = [self getTemplate:croppedLeye];
}

This function takes the reduced eye area and creates a template for the eye. The cascade is used to find the eye within the reduced left and right eye areas. From that area the iris spot is located with OpenCV's simple min/max calculation, which finds the darkest spot in the image. This area is then cropped to get the exact eye area, excluding the eyebrow, and saved as the eye template.

-(cv::Mat)getTemplate:(const cv::Mat &)croppedEyearea {
    std::vector<cv::Rect> eyes;
    cv::Mat EyeTemplate;
    cv::Mat temlMat;
    _leftEyeCascade.detectMultiScale(croppedEyearea, eyes, 1.1, 2, kHaarEyeareaOptions, cv::Size(30, 30));
    for (int i = 0; i < eyes.size(); i++) {
        // Keep only the lower 60% of the detection to exclude the eyebrow.
        CGRect eye_only_rectangle = CGRectMake(eyes[i].tl().x,
                                               eyes[i].tl().y + eyes[i].height*0.4,
                                               eyes[i].width,
                                               eyes[i].height*0.6);
        cv::Rect mylROI(eye_only_rectangle.origin.x, eye_only_rectangle.origin.y,
                        eye_only_rectangle.size.width, eye_only_rectangle.size.height);
        temlMat = croppedEyearea(mylROI);

        // The darkest spot in the eye area is taken as the iris.
        double minval, maxval;
        cv::Point minloc, maxloc;
        CGPoint iris;
        cv::minMaxLoc(temlMat, &minval, &maxval, &minloc, &maxloc);
        iris.x = minloc.x + eye_only_rectangle.origin.x;
        iris.y = minloc.y + eye_only_rectangle.origin.y;

        // Crop a 30x30 patch around the iris as the eye template.
        cv::Rect newlROI((int)iris.x - 12, (int)iris.y - 12, 30, 30);
        EyeTemplate = croppedEyearea(newlROI).clone();
    }
    learnTemplate++;
    return EyeTemplate;
}
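The "darkest spot" step above is just cv::minMaxLoc applied to a grayscale patch. As a hypothetical sketch of what that call computes (the `Point` type and `darkestSpot` function are illustrative names, not OpenCV API), on a plain 2D array it is a simple scan for the minimum:

```cpp
#include <vector>

// Pixel coordinate within the patch.
struct Point { int x, y; };

// Scan a row-major grayscale patch and return the location of the
// minimum intensity -- the "darkest spot", assumed here to be the iris.
Point darkestSpot(const std::vector<std::vector<int>>& gray) {
    Point best{0, 0};
    int minVal = gray[0][0];
    for (int y = 0; y < (int)gray.size(); ++y)
        for (int x = 0; x < (int)gray[y].size(); ++x)
            if (gray[y][x] < minVal) { minVal = gray[y][x]; best = Point{x, y}; }
    return best;
}
```

cv::minMaxLoc returns the minimum and maximum values and locations in one pass; only the minimum location is used here.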
 
So what happens after the templates are learned? We no longer need to run the eye classifiers at all; instead, we can use the learned templates just to find the eyes in the face frame. This speeds up processing after the learning phase.

cv::Mat res(cropReye.rows - _lEyeTemplate.rows + 1, cropReye.cols - _lEyeTemplate.cols + 1, CV_32FC1);
cv::matchTemplate(cropReye, _lEyeTemplate, res, CV_TM_CCOEFF_NORMED);

double minval, maxval;
cv::Point minloc, maxloc;
cv::minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);

// Best-match corner points, translated back to full-frame coordinates.
cv::Point px(maxloc.x + eyearea_right.origin.x, maxloc.y + eyearea_right.origin.y);
cv::Point py(maxloc.x + _lEyeTemplate.cols + eyearea_right.origin.x, maxloc.y + _lEyeTemplate.rows + eyearea_right.origin.y);
cv::rectangle(image, px, py, cv::Scalar(255, 255, 0, 255));
Do the same to find the other eye.
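To show what CV_TM_CCOEFF_NORMED is doing under the hood, here is a hypothetical, simplified sketch of normalized correlation matching on plain 2D arrays (the `Image`, `Match`, and `bestMatch` names are mine, not OpenCV's; the real cv::matchTemplate is far more optimized). The template slides over the image, each offset is scored by the normalized correlation of the mean-subtracted patches, and the highest score marks the best match, which is what maxloc from cv::minMaxLoc picks out above.

```cpp
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<double>>;
struct Match { int x, y; double score; };

// Mean of the w x h patch of img whose top-left corner is (ox, oy).
static double patchMean(const Image& img, int ox, int oy, int w, int h) {
    double s = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) s += img[oy + y][ox + x];
    return s / (w * h);
}

// Slide tpl over img; score each offset by normalized correlation of the
// mean-subtracted patches (roughly CV_TM_CCOEFF_NORMED) and keep the best.
Match bestMatch(const Image& img, const Image& tpl) {
    int th = (int)tpl.size(), tw = (int)tpl[0].size();
    int ih = (int)img.size(), iw = (int)img[0].size();
    double tMean = patchMean(tpl, 0, 0, tw, th);
    Match best{0, 0, -2.0};
    for (int oy = 0; oy + th <= ih; ++oy) {
        for (int ox = 0; ox + tw <= iw; ++ox) {
            double pMean = patchMean(img, ox, oy, tw, th);
            double num = 0, dImg = 0, dTpl = 0;
            for (int y = 0; y < th; ++y) {
                for (int x = 0; x < tw; ++x) {
                    double a = img[oy + y][ox + x] - pMean;
                    double b = tpl[y][x] - tMean;
                    num += a * b; dImg += a * a; dTpl += b * b;
                }
            }
            double denom = std::sqrt(dImg * dTpl);
            double score = denom > 0 ? num / denom : 0;  // flat patch -> 0
            if (score > best.score) best = Match{ox, oy, score};
        }
    }
    return best;
}
```

An exact copy of the template in the image scores 1.0 at its location, which is why the learned eye template reliably locks onto the eye without running a classifier.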


So what's next is to really make something out of these results. Stay tuned for the next post, with a detailed description of blink detection using OpenCV on iPhone.
