I was really impressed with the features integrated into the new Snapdragon-enabled Android devices. They have brought real change to the market, particularly the facial processing features. I wanted to get a sense of how facial processing is implemented, so before leaning on any one framework I took a quick look at the SDKs and libraries available for it. Plenty fall into this category, most with desktop sample implementations as well.
Examples include OpenCV, SimpleCV, Faint, and the list goes on. I also found a number of commercial vendors that provide facial recognition packages, such as Cybula, NeuroTechnology, Pittsburgh Pattern Recognition, and Sensible Vision.
Freezing the requirement helped me filter the choices. I wanted to explore and find a library or framework that does facial processing on iOS; maybe not exactly Snapdragon's facial processing for Android, but something comparable. Before reaching for open source or third-party libraries, it is always wise to look at the native iOS SDK options first. For anyone who just needs to detect a face, in real time or from an image, CIDetector in the Core Image framework is an easy choice; a couple of lines of code will get the work done.
Have a look at Apple's documentation on the native implementation to get this done here.
But for anyone who needs to do much more processing than just detecting the face, eyes, and mouth, a good open source library designed for the purpose is the better choice.
OpenCV provides a framework for iOS that can be downloaded here, and clean, clear API documentation is available on the OpenCV website. They also have sample implementations for both desktop and mobile, useful for getting a quick start.
Real-time video processing requires writing some code to capture frames at the rate you need before they get processed. Apple's documentation is an excellent place to learn the basics and get started here.
With the aim of detecting the face and eyes in real time, I started by extending the OpenCV sample for iOS found here.
Face detection is easily done using a cascade classifier trained on Haar or LBP features. A cascade classifier essentially tells OpenCV what to search for in the image you supply. I used the lbpcascade_frontalface classifier in my project to detect the face. Many cascade classifiers are available on the internet, and you can also train your own cascade classifier to make OpenCV recognize an object of your choice.
Extending the project, I then detected the eyes within the reduced face area by loading the haarcascade_lefteye_2splits classifier. But even though OpenCV is one of the fastest machine vision libraries, speed becomes the bottleneck on mobile devices. I found it also depends on the device's capability: processing speed differs from an iPod touch to an iPhone 4 to an iPhone 5, with the latest iPhone 5 noticeably faster. Still, we have some control over how the frames are processed. Loading more cascades, for the face, left eye, and right eye, slows things down, so reducing the number of cascades loaded improves speed. Reducing the image area to be searched is another best practice for improving speed.
Reducing the search area is all about fixing your region of interest (ROI): you don't have to search for an eye in the lower half of the face. But to reduce the use of classifiers, something more is needed. Template matching is one technique worth trying; of course it has its limitations and constraints, but it really speeds up the processing. This is how I got it done.
Load the specific classifiers.
// Note: the right eye needs its own classifier, not a second copy of the left-eye one.
NSString *faceCascadePath = [[NSBundle mainBundle] pathForResource:@"lbpcascade_frontalface" ofType:@"xml"];
NSString *lefteyeCascadePath = [[NSBundle mainBundle] pathForResource:@"haarcascade_lefteye_2splits" ofType:@"xml"];
NSString *righteyeCascadePath = [[NSBundle mainBundle] pathForResource:@"haarcascade_righteye_2splits" ofType:@"xml"];
if (!_faceCascade.load([faceCascadePath UTF8String])) {
    NSLog(@"Could not load face cascade: %@", faceCascadePath);
}
if (!_leftEyeCascade.load([lefteyeCascadePath UTF8String])) {
    NSLog(@"Could not load leftEye cascade: %@", lefteyeCascadePath);
}
if (!_rightEyeCascade.load([righteyeCascadePath UTF8String])) {
    NSLog(@"Could not load rightEye cascade: %@", righteyeCascadePath);
}
Implement the delegate that captures each frame as a cv::Mat. This is where your additional processing code goes.
cvtColor(image, image, CV_RGBA2GRAY);
std::vector<cv::Rect> faces;
_faceCascade.detectMultiScale(image, faces, 1.1, 2, kHaarEyeOptions, cv::Size(40, 40));
Loop through the detected faces.
for (int i = 0; i < faces.size(); i++) {
    std::vector<cv::Rect> lefteyes;
    std::vector<cv::Rect> righteyes;
    cv::Rect faceRect = faces[i];
Draw
the face rect on screen.
    cv::rectangle(image, faceRect.tl(), faceRect.br(), cvScalar(255, 0, 0));
    CGRect eyearea_right = CGRectMake(faces[i].x + faces[i].width/16,
                                      faces[i].y + (faces[i].height/4.5),
                                      (faces[i].width - 2*faces[i].width/16)/2,
                                      faces[i].height/3.0);
    cv::Rect eROI(eyearea_right.origin.x, eyearea_right.origin.y,
                  eyearea_right.size.width, eyearea_right.size.height);
    cv::Mat croppedReye = image(eROI);
    cv::Mat cropReye = image(eROI).clone();

    CGRect eyearea_left = CGRectMake(faces[i].x + faces[i].width/16 + (faces[i].width - 2*faces[i].width/16)/2,
                                     faces[i].y + (faces[i].height/4.5),
                                     (faces[i].width - 2*faces[i].width/16)/2,
                                     faces[i].height/3.0);
    cv::Rect eLOI(eyearea_left.origin.x, eyearea_left.origin.y,
                  eyearea_left.size.width, eyearea_left.size.height);
    cv::Mat croppedLeye = image(eLOI);
    cv::Mat cropLeye = image(eLOI).clone();
Learn a few frames to build a template of each eye, which is used later in the template matching step.
    if (learnTemplate < 50) {
        _lEyeTemplate = [self getTemplate:croppedReye];
        _rEyeTemplate = [self getTemplate:croppedLeye];
    }
This function takes the reduced eye area and creates the template for that eye. The cascade is used to find the eye within the reduced eye area, left and right respectively. From that area, the iris spot is located with a simple cv::minMaxLoc call, which finds the darkest spot in an image. That area is then cropped to the exact eye region, excluding the eyebrow, and saved as the eye template.
- (cv::Mat)getTemplate:(const cv::Mat&)croppedEyearea {
    std::vector<cv::Rect> eyes;
    cv::Mat EyeTemplate;
    cv::Mat temlMat;
    _leftEyeCascade.detectMultiScale(croppedEyearea, eyes, 1.1, 2, kHaarEyeareaOptions, cv::Size(30, 30));
    for (int i = 0; i < eyes.size(); i++) {
        // Keep the lower 60% of the detected eye rect to exclude the eyebrow.
        CGRect eye_only_rectangle = CGRectMake(eyes[i].tl().x,
                                               eyes[i].tl().y + eyes[i].height*0.4,
                                               eyes[i].width,
                                               eyes[i].height*0.6);
        cv::Rect mylROI(eye_only_rectangle.origin.x, eye_only_rectangle.origin.y,
                        eye_only_rectangle.size.width, eye_only_rectangle.size.height);
        temlMat = croppedEyearea(mylROI);
        // The darkest pixel in the eye area is a good guess for the iris.
        double minval, maxval;
        cv::Point minloc, maxloc;
        cv::minMaxLoc(temlMat, &minval, &maxval, &minloc, &maxloc);
        CGPoint iris;
        iris.x = minloc.x + eye_only_rectangle.origin.x;
        iris.y = minloc.y + eye_only_rectangle.origin.y;
        // Crop a 30x30 patch around the iris. Note: this rect should be
        // clamped to the bounds of croppedEyearea to avoid an out-of-range crop.
        cv::Rect newlROI((int)iris.x - 12, (int)iris.y - 12, 30, 30);
        EyeTemplate = croppedEyearea(newlROI).clone();
    }
    learnTemplate++;
    return EyeTemplate;
}
So what happens after the templates are learned? We no longer have to load the eye classifiers at all; instead we can use the learned templates to find the eyes within the face frame. This speeds up processing after the learning phase.
cv::Mat res(cropReye.rows - _lEyeTemplate.rows + 1,
            cropReye.cols - _lEyeTemplate.cols + 1,
            CV_32FC1);
cv::matchTemplate(cropReye, _lEyeTemplate, res, CV_TM_CCOEFF_NORMED);
double minval, maxval;
cv::Point minloc, maxloc;
cv::minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);
// The best match is at maxloc; translate it back to full-image coordinates.
cv::Point px(maxloc.x + eyearea_right.origin.x,
             maxloc.y + eyearea_right.origin.y);
cv::Point py(maxloc.x + _lEyeTemplate.cols + eyearea_right.origin.x,
             maxloc.y + _lEyeTemplate.rows + eyearea_right.origin.y);
cv::rectangle(image, px, py, cv::Scalar(255, 255, 0, 255));
Do the same to find the other eye as well.
So what's next is to actually make something out of these results. Stay tuned for the next post, with a detailed description of blink detection using OpenCV on the iPhone.