Practical approach for using face recognition in software applications

Face recognition has become an important tool, and many interesting use cases for it have emerged. One is that the Chinese government is using face recognition to catch criminals.

I came across another use case on Twitter: face recognition was used to display departure gate details to travelers at Chengdu Shuangliu International Airport in China. It seems like a pretty amazing feat of technology.

Face recognition using a passport photo

The question is how to use face recognition in application software. The approach can depend on the number of customers (or faces) among which you need to perform face recognition. Some software applications have millions of records (faces). Some of the questions that are important to look at:

  • Is it practical to perform face recognition across millions of records?

Here I restrict my discussion to the newer set of techniques used for face recognition.

Algorithms based on deep learning

One of the most popular techniques is to use deep learning for face recognition. Deep learning does not require any features to be defined upfront; the network learns all the features on its own.

Davis King's dlib library has adopted the deep learning approach and provides a convolutional neural network for face detection and recognition, modified and improved from an existing architecture (more about that later).

  • The dlib library takes a photo as input.

Face recognition is a similarity check

Face recognition does not perform an equality check.

Let us take an example. For any image you feed to the face recognition algorithm, it analyses the image and gives back a vector. Say you take an image of Virat Kohli (an Indian cricketer) and feed it into the algorithm. You might get the vector below.

Virat Kohli (without cap)

Vector of first image: v = [0.05, 0.06, 0.07, 0.08]

In reality, the dlib library emits a vector of 128 numbers (called a 128D vector), but for simplicity the vector v above is 4D.

Now take a second image of Virat Kohli and feed it into the face recognition algorithm. Please note that the second image is not the same as the first.

Virat Kohli (with cap)

In the second photograph, Virat Kohli is wearing a cap. Feeding this image to the neural network also outputs a vector, although this time the vector is different because the image is different. It would be as follows.

Vector of second image: [0.06, 0.06, 0.07, 0.07]

You can observe that the two vectors are not the same. If you feed another image to the neural network, say that of another popular cricketer (Rohit Sharma), the network will emit yet another vector.

Rohit Sharma, another popular Indian cricketer.

Say the vector emitted is [0.78, 0.80, 0.90, 0.96]. If we want to check which photos are similar to each other, we simply need to check which vectors are closer to each other. In other words, we need to find the distance between the vectors. One of the simplest distance formulas is the Euclidean distance.

Euclidean distance

Euclidean distance is the straight-line distance between two points. Let's take a simple example. The distance between the two points (2, -1) and (-2, 2) would be as follows.

dist((2, -1), (-2, 2))

= √((2 - (-2))² + ((-1) - 2)²)

= √((2 + 2)² + (-1 - 2)²)

= √((4)² + (-3)²) = √(16 + 9) = √25 = 5.
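The same calculation can be written as a small Python function. This is only a sketch: the name euclidean_distance is chosen here for illustration and does not come from dlib.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((2, -1), (-2, 2)))  # 5.0
```

The function works for any number of dimensions, so it applies to 4D toy vectors and 128D face vectors alike.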

Euclidean distance between the two photos of Virat Kohli

= √((0.05 - 0.06)² + (0.06 - 0.06)² + (0.07 - 0.07)² + (0.08 - 0.07)²)

= √0.0002 ≈ 0.0141

Euclidean distance between the photos of Virat Kohli and Rohit Sharma

= √((0.05 - 0.78)² + (0.06 - 0.80)² + (0.07 - 0.90)² + (0.08 - 0.96)²)

= √2.5438 ≈ 1.5949

The two photos of Virat Kohli are far closer to each other than either is to the photo of Rohit Sharma.
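As a quick check, the distances can be recomputed directly from the three example vectors. The sketch below uses math.dist from the Python standard library (available from Python 3.8); the variable names are made up for this illustration.

```python
import math

# The 4D toy vectors from the example (real dlib vectors have 128 numbers).
kohli_no_cap = [0.05, 0.06, 0.07, 0.08]
kohli_cap = [0.06, 0.06, 0.07, 0.07]
rohit = [0.78, 0.80, 0.90, 0.96]

same_person = math.dist(kohli_no_cap, kohli_cap)
different_person = math.dist(kohli_no_cap, rohit)

print(round(same_person, 4))       # 0.0141
print(round(different_person, 4))  # 1.5949
```

The distance between the two photos of the same person comes out roughly a hundred times smaller than the distance between photos of different people, which is what makes a similarity check workable.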

Airport face detection or face detection anywhere

Put simply, these are the steps for face recognition:

  • Step 1: For each photo in your system, use a pre-trained model (such as the one dlib provides, or build your own network first!) to get the vector.
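Once every stored photo has a vector, matching a new photo comes down to scanning the stored vectors for the closest one. Below is a minimal sketch of that scan. It assumes the vectors have already been computed, and the names find_best_match, gallery and threshold are hypothetical. The threshold of 0.1 only suits these 4D toy vectors; a real system would need a value tuned to its model.

```python
import math

def find_best_match(query_vector, gallery, threshold):
    """Return (name, distance) for the closest stored vector.

    The name is None when even the closest vector is farther than `threshold`.
    """
    best_name, best_distance = None, math.inf
    for name, stored_vector in gallery.items():
        distance = math.dist(query_vector, stored_vector)
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance > threshold:
        return None, best_distance
    return best_name, best_distance

# Hypothetical gallery: one precomputed vector per known person.
gallery = {
    "virat_kohli": [0.05, 0.06, 0.07, 0.08],
    "rohit_sharma": [0.78, 0.80, 0.90, 0.96],
}
query = [0.06, 0.06, 0.07, 0.07]  # the "with cap" photo
name, distance = find_best_match(query, gallery, threshold=0.1)
print(name)  # virat_kohli
```

Note that the loop touches every stored vector once, so the cost of one lookup grows linearly with the number of records.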

But that means comparing against all the records in your database. Let's try to estimate this (roughly) for the airport example.

The airport in the example is Chengdu Shuangliu International Airport. Shuangliu Airport handled 42.2 million passengers in 2015 and was among the world's 30 busiest airports that year, according to Wikipedia. Let's round the number of travelers per year up to 50 million. That is approximately 4.2 million in a month.

  • In a day, approximately 140,000 unique visitors.
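The back-of-the-envelope numbers are easy to reproduce. The sketch below starts from the rounded 50 million travelers per year and assumes, purely for illustration, one stored 128D vector per daily traveler and a plain linear scan.

```python
passengers_per_year = 50_000_000  # rounded up from 42.2 million in 2015

per_month = passengers_per_year / 12
per_day = passengers_per_year / 365

print(f"{per_month:,.0f} per month")  # 4,166,667 per month
print(f"{per_day:,.0f} per day")      # 136,986 per day

# Assumption: one 128D vector per daily traveler, every one of them compared.
dimensions = 128
coordinate_operations = round(per_day) * dimensions
print(f"{coordinate_operations:,} coordinate differences per lookup")  # 17,534,208 coordinate differences per lookup
```

So a single lookup against one day's travelers already involves millions of subtract-and-square operations, before counting the lookups made by every traveler who walks up to a screen.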

If you have limited computing power, it makes sense to limit the number of vector comparisons to a small number, maybe a few thousand. This ensures that the experience for an online user does not suffer. How to limit this number depends on the context of the specific use case.
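One way to cut down the comparisons, sketched here for the airport case only, is to compare against travelers whose flights depart soon. Everything in this example is an assumption made for illustration: the record layout, the four-hour window, and the function names candidates_departing_soon and closest_traveller.

```python
import math
from datetime import datetime, timedelta

def candidates_departing_soon(travellers, now, window_hours):
    """Keep only travellers whose flight departs within the next `window_hours`."""
    latest = now + timedelta(hours=window_hours)
    return [t for t in travellers if now <= t["departure"] <= latest]

def closest_traveller(query_vector, candidates):
    """Linear scan over the reduced list; returns None when the list is empty."""
    return min(candidates, key=lambda t: math.dist(query_vector, t["vector"]), default=None)

now = datetime(2018, 6, 1, 9, 0)
# Hypothetical records with 4D toy vectors.
travellers = [
    {"name": "A", "gate": "G12", "departure": datetime(2018, 6, 1, 10, 30), "vector": [0.05, 0.06, 0.07, 0.08]},
    {"name": "B", "gate": "G03", "departure": datetime(2018, 6, 1, 11, 15), "vector": [0.78, 0.80, 0.90, 0.96]},
    {"name": "C", "gate": "G20", "departure": datetime(2018, 6, 1, 22, 45), "vector": [0.06, 0.05, 0.07, 0.08]},
]

shortlist = candidates_departing_soon(travellers, now, window_hours=4)
match = closest_traveller([0.06, 0.06, 0.07, 0.07], shortlist)
print(len(shortlist), match["gate"])  # 2 G12
```

The filter runs on ordinary data (departure times) before any vector is compared, so the expensive distance calculations only happen for the shortlist.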

Information on the model

The dlib library internally uses a pre-trained model. This model is an altered version of the famous ResNet-34 architecture. Please refer to the blog post by the author for more details.

If you want to use the model in Node.js applications, please refer to the Node wrapper over the dlib library. Although its author (Vincent Mühler) has moved on to authoring a different library with different models (the face-api.js library and the SSD MobileNet model), I still prefer the dlib implementation for server-side face recognition.

Summary

  • Face recognition is a similarity match.

Interests: software design, architecture, search, open banking, machine learning, mobility