用10几行代码自己写个人脸识别程序

用10几行代码自己写个人脸识别程序

CV (Computer Vision)

最近在研究CV的一些开源库(OpenCV),有一个体会就是在此领域,除了一些非常学术的_机器学习_, 深度学习_等概念外,其实还有一些很有趣的_现实的_应用场景。比如之前很流行的微软的 https://how-old.net, 你使用自己指定或者上传的照片进行面部识别_猜年龄。 如下图所示:

细想一下这个很吸引眼球的程序,其实技术本身打散了就包括两大块,一是从图片中扫描并进行面部识别,二是对找到的人脸根据算法去猜个年龄。大家可以猜猜实现第一个功能需要多少核心代码量?其实不用上万行,在这里我就使用短短几行代码(去除空格换行什么的,有效代码只要10行)就实现一个_高大上_面部识别的功能。在此文容我细述一下具体实现代码以及我对机器识别图像领域技术的理解。

面部识别,刷脸

_人脸识别_技术大家应该都不陌生,之前大家使用的数码相机,或者现在很多手机自带的相机都有人脸识别的功能。其效果就像是下图这样。近的看,_剁手节_刚刚过了没有多久 , 背后的马老板一直在力推的刷脸支付也是一个此领域的所谓“黑科技”。比如在德国汉诺威电子展上,马云用支付宝“刷脸”买了一套纪念邮票。人脸识别应用市场也从爆发。随后,各大互联网巨头也纷纷推出了刷脸相关的应用。

如果要加个定义,人脸识别又叫做人像识别、面部识别,是一种通过用摄像机或摄像头采集含有人脸的图像或视频流,并自动在图像中检测和跟踪人脸,进而对检测到的人脸进行脸部的一系列相关技术。

我的十行代码程序

OK,长话短说,先上 干货 ,下面就是此程序的_带注释_ 版本,完整的程序以及相关配套文件可以在 这个github库 https://github.com/CloudsDocker/pyFacialRecognition 中找到,有兴趣可以_fork_ 下来玩玩。下面是整个程序的代码样子,后面我会逐行去解释分析。

就这短短的十行代码代码?seriously?“有图有真相”,我们先来看下运行的效果:

首先是原始的图片

运行程序后识别出面部并高亮显示的结果

请注意 K歌二人组 的脸上的红色框框,这就是上面十行代码的成果。

代码解析

准备工作

因为此程序使用是的Python,因此你需要去安装Python。这里就不赘述了。除此之外,还需要安装 OpenCV (http://opencv.org/downloads.html)。 多说一句,这个 OpenCV正如其名,是一个开源的机器识别的深度学习框架。这是Intel(英特尔)实验室里的一个俄罗斯团队创造的,目前在开源社区非常的活跃。

特别提一下,对于Mac的用户,推荐使用brew去安装 (下面第一条语句可能会执行报错,我当时也是搞了好久。如果遇到第一条命令不过可以通过文尾的方式联系作者)

brew tap homebrew/science
brew install opencv

安装完成之后,在python的命令行中输入如下代码验证,如果没有报错就说明安装好了。

>>> import cv2

程序代码“庖丁解牛”

# -*- coding: utf-8 -*-
import cv2,sys
  • 由于这里注释及窗口标题中使用了中文,因此加上utf-8字符集的支持
  • 引入Opencv库以及Python的sys内建库,用于解析输入的图片参数
inputImageFile=sys.argv[1]
  • 在运行程序时将需要测试的照片文件名作为一个参数传进来
faceClassifier=cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
  • 加载OpenCV中自带预先培训好的人脸识别层级分类器 HAAR Casscade Classifier,这个会用来对我们输入的图片进行人脸判断。

这里有几个在深度学习及机器图像识别领域中的几个概念,稍微分析一下,至于深入的知识,大家可以自行搜索或者联系作者。

Classifer

在机器深度学习领域,针对识别不同物体都有不同的classifier,比如有的classifier来识别洗车,还有识别飞机的classifier,有classifier来识别照片中的笑容,眼睛等等。而我们这个例子是需要去做人脸识别,因此需要一个面部识别的classifier。

物体识别的原理

一般来说,比如想要机器学习着去识别“人脸”,就会使用大量的样本图片来事先培训,这些图片分为两大类,positive和negative的,也就是分为包“含有人脸”的图片和“不包含人脸”的图片,这样当使用程序去一张一张的分析这些图片,然后分析判断并对这些图片“分类” (classify),即合格的图片与不合格的图片,这也就其为什么叫做 classifier , 这样学习过程中积累的”知识”,比如一些判断时的到底临界值多少才能判断是positive还是negative什么的,都会存储在一个个XML文件中,这样使用这些前人经验(这里我们使用了 哈尔 分类器)来对新的图片进行‘专家判断’分析,是否是人脸或者不是人脸。

Cascade

这里的 Cascade是 层级分类器 的意思。为什么要 分层 呢?刚才提到在进行机器分析照片时,其实是对整个图片从上到下,从左到右,一个像素一个像素的分析,这些分析又会涉及很多的 特征分析 ,比如对于人脸分析就包含识别眼睛,嘴巴等等,一般为了提高分析的准确度都需要有成千上万个特征,这样对于每个像素要进行成千上万的分析,对于整个图片都是百万甚至千万像素,这样总体的计算量会是个天文数字。但是,科学家很聪明,就想到分级的理念,即把这些特征分层,这样分层次去验证图片,如果前面层次的特征没有通过,对于这个图片就不用判断后面的特征了。这有点像是系统架构中的 FF (Fail Fast),这样就提高了处理的速度与效率。

objImage=cv2.imread(inputImageFile)
  • 使用OpenCV库来加载我们传入的测试图片
cvtImage=cv2.cvtColor(objImage,cv2.COLOR_BGR2GRAY)
  • 首先将图片进行灰度化处理,以便于进行图片分析。这种方法在图像识别领域非常常见,比如在进行验证码的机器识别时就会先灰度化,去除不相关的背景噪音图像,然后再分析每个像素,以便抽取出真实的数据。不对针对此,你就看到非常多的验证码后面特意添加了很多的噪音点,线,就是为了防止这种程序来灰度化图片进行分析破解。
foundFaces=faceClassifier.detectMultiScale(cvtImage,scaleFactor=1.3,minNeighbors=9,minSize=(50,50),flags = cv2.cv.CV_HAAR_SCALE_IMAGE)
  • 执行detectMultiScale方法来识别物体,因为我们这里使用的是人脸的cascade classifier分类器,因此调用这个方法会来进行面部识别。后面这几个参数来设置进行识别时的配置,比如
  • scaleFactor: 因为在拍照,尤其现在很多都是自拍,这样照片中有的人脸大一些因为离镜头近,而有些离镜头远就会小一些,因为这个参数用于设置这个因素,如果你在使用不同的照片时如果人脸远近不同,就可以修改此参数,请注意此参数必须要大于1.0
  • minNeighbors: 因为在识别物体时是使用一个移动的小窗口来逐步判断的,这个参数就是决定是不是确定找到物体之前需要判断多少个周边的物体
  • minSize:刚才提到识别物体时是合作小窗口来逐步判断的,这个参数就是设置这个小窗口的大小
print(" 在图片中找到了 {} 个人脸".format(len(foundFaces)))
  • 显示出查找到多少张人脸,需要提到的识别物体的方法返回的一个找到的物体的位置信息的列表,因此使用 len 来打印出找到了多少物体。
for (x,y,w,h) in foundFaces:
    cv2.rectangle(objImage,(x,y),(x+w,y+h),(0,0,255),2)
  • 遍历发现的“人脸”,需要说明的返回的是由4部分组成的位置数据,即这个“人脸”的横轴,纵轴坐标,宽度与高度。
  • 然后使用 OpenCV 提供的方法在原始图片上画出个矩形。其中 (0,0,255) 是使用的颜色,这里使用的是R/G/B的颜色表示方法,比如 (0,0,0)表示黑色,(255,255,255)表示白色,有些网页编程经验的程序员应该不陌生。
cv2.imshow(u'面部识别的结果已经高度框出来了。按任意键退出'.encode('gb2312'), objImage)
cv2.waitKey(0)
  • 接下来是使用 opencv 提供的imshow方法来显示这个图片,其中包括我们刚刚画的红色的识别的结果
  • 最后一个语句是让用户按下键盘任意一个键来退出此图片显示窗口

总结

好了,上面是这个程序的详细解释以及相关的知识的讲解。其实这个只是个_抛砖引玉_的作用,还用非常多的应用场景,比如程序解析网页上的图片验证码,雅虎前几个月开源的 NSFW, Not Suitable for Work (NSFW),即判断那些不适合工作场所的图片,内容你懂的。 :-)

最后,再提一下,所有这些源代码及相关文件都开源在 https://github.com/CloudsDocker/pyFacialRecognition ,在fork并下载到本地后执行下面代码来测试运行

git clone https://github.com/CloudsDocker/pyFacialRecognition.git
cd pyFacialRecognition
./run.sh

如果有任何建议或者想法,请联系我。

联系我:

  • phray.zhang@gmail.com (email/邮件,whatsapp, linkedin)
  • helloworld_2000 (wechat/微信)
  • 微博: cloudsdocker
  • github
  • [简书 jianshu](http://www.jianshu.com/users/a9e7b971aafc)
  • 微信公众号:vibex

Reference

2022

Linux Tips

Remember, some things have to end for better things to begin.

Back to Top ↑

2021

How to user fire extinguisher

Summary As you know, staff and your safety is paramount. So what if emergency take place, such as fire in office, how to help yourself and your colleagues by...

Deep dive into Kubernetes Client API

Summary To talk to K8s for getting data, there are few approaches. While K8s’ official Java library is the most widely used one. This blog will look into thi...

Whitelabel Error Page

Summary Whitelabel Error Page is the default error page in Spring Boot web app. It provide a more user-friently error page whenever there are any issues when...

Google maps no photos reviews

Summary I found a weird problem of the app Google Maps of my Oppo Android phone. That’s when you search a place in Google map, say “Central Park”, ideally th...

Debts in a nutshell

A debt security represents a debt owed by the issuer to an investor. Here, the investor acts as a lender to the issuer which may be a government, organisatio...

Back to Top ↑

2020

Debug Stuck IntelliJ

What happened to a debug job hanging in IntelliJ (IDEAS) IDE? You may find when you try to debug a class in Intellij but it stuck there and never proceed, e....

Awesome Kotlin

Difference with Scala Kotlin takes the best of Java and Scala, the response times are similar as working with Java natively, which is a considerable advantag...

JVM热身

此文是作者英文原文的翻译文章,英文原文在:http://todzhang.com/posts/2018-06-10-jvm-warm-up/

Mock in kotlin

Argument Matching & Answers For example, you have mocked DOC with call(arg: Int): Intfunction. You want to return 1 if argument is greater than 5 and -1 ...

Mock in kotlin

Argument Matching & Answers For example, you have mocked DOC with call(arg: Int): Intfunction. You want to return 1 if argument is greater than 5 and -1 ...

Curl

Linux Curl command

AOP

The concept of join points as matched by pointcut expressions is central to AOP, and Spring uses the AspectJ pointcut expression language by default.

Micrometer notes

As a general rule it should be possible to use the name as a pivot. Dimensions allow a particular named metric to be sliced to drill down and reason about th...

Awesome SSL certificates and HTTPS

What’s TLS TLS (Transport Layer Security) and its predecessor, SSL (Secure Sockets Layer), are security protocols designed to secure the communication betwee...

JVM warm up by Escape Analysis

Why JVM need warm up I don’t know how and why you get to this blog. But I know the key words in your mind are “warm” for JVM. As the name “warm up” suggested...

Java Concurrent

This blog is about noteworthy pivot points about Java Concurrent Framework Back to Java old days there were wait()/notify() which is error prone, while fr...

Back to Top ↑

2019

Conversations with God

Feelings is the language of the soul. If you want to know what’s true for you about something, look to how your’re feeling about.

Kafka In Spring

Enable Kafka listener annotated endpoints that are created under the covers by a AbstractListenerContainerFactory. To be used on Configuration classes as fol...

Mifid

FX Spot is not covered by the regulation, as it is not considered to be a financial instrument by ESMA, the European Union (EU) regulator. As FX is considere...

Foreign Exchange

currency pairs Direct ccy: means USD is part of currency pair Cross ccy: means ccy wihtout USD, so except NDF, the deal will be split to legs, both with...

Back to Top ↑

2018

Guice

A new type of Juice Put simply, Guice alleviates the need for factories and the use of new in your Java code. Think of Guice’s @Inject as the new new. You wi...

YAML

Key points All YAML files (regardless of their association with Ansible or not) can optionally begin with — and end with …. This is part of the YAML format a...

Sudo in a Nutshell

Sudo in a Nutshell Sudo (su “do”) allows a system administrator to give certain users (or groups of users) the ability to run some (or all) commands as root...

Zoo-keeper

ZK Motto the motto “ZooKeeper: Because Coordinating Distributed Systems is a Zoo.”

Cucumber

Acceptance testing vs unit test It’s sometimes said that unit tests ensure you build the thing right, whereas acceptance tests ensure you build the right thi...

akka framework of scala

philosophy The actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-o...

Apache Camel

Camel’s message model In Camel, there are two abstractions for modeling messages, both of which we’ll cover in this section. org.apache.camel.Message—The ...

JXM

Exporting your beans to JMX The core class in Spring’s JMX framework is the MBeanExporter. This class is responsible for taking your Spring beans and registe...

Solace MQ

Solace PubSub+ It is a message broker that lets you establish event-driven interactions between applications and microservices across hybrid cloud environmen...

Apigee

App deployment, configuration management and orchestration - all from one system. Ansible is powerful IT automation that you can learn quickly.

Ansible

Ansible: What Is It Good For? Ansible is often described as a configuration management tool, and is typically mentioned in the same breath as Chef, Puppet, a...

flexbox

How Flexbox works — explained with big, colorful, animated gifs

KDB

KDB However kdb+ evaluates expressions right-to-left. There are no precedence rules. The reason commonly given for this behaviour is that it is a much simple...

Agile and SCRUM

Key concept In Scrum, a team is cross functional, meaning everyone is needed to take a feature from idea to implementation.

Strategy-Of-Openshift-Releases

Release & Testing Strategy There are various methods for safely releasing changes to Production. Each team must select what is appropriate for their own ...

NodeJs Notes

commands to read files var lineReader = require(‘readline’).createInterface({ input: require(‘fs’).createReadStream(‘C:\dev\node\input\git_reset_files.tx...

CORS :Cross-Origin Resource Sharing

Cross-Origin Request Sharing - CORS (A.K.A. Cross-Domain AJAX request) is an issue that most web developers might encounter, according to Same-Origin-Policy,...

ngrx

Why @Effects? In a simple ngrx/store project without ngrx/effects there is really no good place to put your async calls. Suppose a user clicks on a button or...

iOS programming

View A view is also a responder (UIView is a subclass of UIResponder). This means that a view is subject to user interactions, such as taps and swipes. Thus,...

Back to Top ↑

2017

cloud computering

openshift vs openstack The shoft and direct answer is `OpenShift Origin can run on top of OpenStack. They are complementary projects that work well together....

cloud computering

Concepts Cloud computing is the on-demand demand delivery of compute database storage applications and other IT resources through a cloud services platform v...

Redux

whats @Effects You can almost think of your Effects as special kinds of reducer functions that are meant to be a place for you to put your async calls in suc...

reactive programing

The second advantage to a lazy subscription is that the observable doesn’t hold onto data by default. In the previous example, each event generated by the in...

Container

The Docker project was responsible for popularizing container development in Linux systems. The original project defined a command and service (both named do...

promise vs observiable

The drawback of using Promises is that they’re unable to handle data sources that produce more than one value, like mouse movements or sequences of bytes in ...

JDK source

interface RandomAccess Marker interface used by List implementations to indicate that they support fast (generally constant time) random access. The primary ...

SSH SFTP

Secure FTP SFTP over FTP is the equivalant of HTTPS over HTTP, the security version

AWS Tips

After establishing a SSH session, you can install a default web server by executing sudo yum install httpd -y. To start the web server, type sudo service htt...

Oracle

ORA-12899: Value Too Large for Column

Kindle notes

#《亿级流量网站架构核心技术》目录一览 TCP四层负载均衡 使用Hystrix实现隔离 基于Servlet3实现请求隔离 限流算法 令牌桶算法 漏桶算法 分布式限流 redis+lua实现 Nginx+Lua实现 使用sharding-jdbc分库分表 Disruptor+Redis...

Java Security Notes

Java Security well-behaved: programs should be prevent from consuming too much system resources

R Language

s<-read.csv("C:/Users/xxx/dev/R/IRS/SHH_SCHISHG.csv") # aggregate s2<-table(s$Original.CP) s3<-as.data.frame(s2) # extract by Frequency ordered s3...

SSH and Cryptography

SFTP versus FTPS SS: Secure Shell An increasing number of our customers are looking to move away from standard FTP for transferring data, so we are ofte...

Eclipse notes

How do I remove a plug-in? Run Help > About Eclipse > Installation Details, select the software you no longer want and click Uninstall. (On Macintosh i...

Maven-Notes

Maven philosophy “It is important to note that in the pom.xml file you specify the what and not the how. The pom.xml file can also serve as a documentatio...

Java New IO

Notes JDK 1.0 introduced rudimentary I/O facilities for accessing the file system (to create a directory, remove a file, or perform another task), accessi...

IT-Architect

SOA SOA is a set of design principles for building a suite of interoperable, flexible and reusable services based architecture. top-down and bottom-up a...

Algorithm

This page is about key points about Algorithm

Java-Tricky-Tech-Questions.md

What is the difference between Serializable and Externalizable in Java? In earlier version of Java, reflection was very slow, and so serializaing large ob...

Compare-In-Java

Concepts If you implement Comparable interface and override compareTo() method it must be consistent with equals() method i.e. for equal object by equals(...

Java Collections Misc

Difference between equals and deepEquals of Arrays in Java Arrays.equals() method does not compare recursively if an array contains another array on oth...

HashMap in JDK

Hashmap in JDK Some note worth points about hashmap Lookup process Step# 1: Quickly determine the bucket number in which this element may resid...

Java 8 Tips

This blog is listing key new features introduced in Java 8

Back to Top ↑

2016

Java GC notes

verbose:gc verbose:gc prints right after each gc collection and prints details about each generation memory details. Here is blog on how to read verbose gc

Hash Code Misc

contract of hashCode : Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consis...

Angulary Misc

Dependency Injection Angular doesn’t automatically know how you want to create instances of your services or the injector to create your service. You must co...

Java new features

JDK Versions JDK 1.5 in 2005 JDK 1.6 in 2006 JDK 1.7 in 2011 JDK 1.8 in 2014 Sun之前风光无限,但是在2010年1月27号被Oracle收购。 在被Oracle收购后对外承诺要回到每2年一个realse的节奏。但是20...

Simpler chronicle of CI(Continuous Integration) “乱弹系列”之持续集成工具

引言 有句话说有人的地方就有江湖,同样,有江湖的地方就有恩怨。在软件行业历史长河(虽然相对于其他行业来说,软件行业的历史实在太短了,但是确是充满了智慧的碰撞也是十分的精彩)中有一些恩怨情愁,分分合合的小故事,比如类似的有,从一套代码发展出来后面由于合同到期就分道扬镳,然后各自发展成独门产品的Sybase DB和微...

浅谈软件单元测试中的“断言” (assert),从石器时代进步到黄金时代。

大家都知道,在软件测试特别是在单元测试时,必用的一个功能就是“断言”(Assert),可能有些人觉得不就一个Assert语句,没啥花头,也有很多人用起来也是懵懵懂懂,认为只要是Assert开头的方法,拿过来就用。一个偶然的机会跟人聊到此功能,觉得还是有必要在此整理一下如何使用以及对“断言”的理解。希望可以帮助大家...

Kubernetes 与 Docker Swarm的对比

Kubernetes 和Docker Swarm 可能是使用最广泛的工具,用于在集群环境中部署容器。但是这两个工具还是有很大的差别。

http methods

RFC origion http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.2)

Spark-vs-Storm

The stark difference among Spark and Storm. Although both are claimed to process the streaming data in real time. But Spark processes it as micro-batches; wh...

微服务

可以想像一下,之前的传统应用系统,像是一个大办公室里面,有各个部门,销售部,采购部,财务部。办一件事情效率比较高。但是也有一些弊端,首先,各部门都在一个房间里。

kibana, view layer of elasticsearch

What’s Kibana kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on...

kibana, view layer of elasticsearch

What’s Kibana kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on...

iConnect

UI HTML5, AngularJS, BootStrap, REST API, JSON Backend Hadoop core (HDFS), Hive, HBase, MapReduce, Oozie, Pig, Solr

Data Structure

Binary Tree A binary tree is a tree in which no node can have more than two children. A property of a binary tree that is sometimes important is that th...

Something about authentication

It’s annoying to keep on repeating typing same login and password when you access multiple systems within office or for systems in external Internet. There a...

SQL

Differences between not in, not exists , and left join with null

Github page commands notes

404 error for customized domain (such as godday) 404 There is not a GitHub Pages site here. Go to github master branch for gitpages site, manually add CN...

RenMinBi International

RQFII RQFII stands for Renminbi Qualified Foreign Institutional Investor. RQFII was introduced in 2011 to allow qualified foreign institutional investors to ...

Load Balancing

Concepts LVS means Linux Virtual Server, which is one Linux built-in component.

Python

(‘—–Unexpected error:’, <type ‘exceptions.TypeError’>) datetime.datetime.now()

Microservices vs. SOA

Microservice Services are organized around capabilities, e.g., user interface front-end, recommendation, logistics, billing, etc. Services are small in ...

Java Class Loader

Codecache The maximum size of the code cache is set via the -XX:ReservedCodeCacheSize=N flag (where N is the default just mentioned for the particular com...

Back to Top ↑