
tend to saturate rather quickly as the volume of the training
set grows significantly.
Recently, there has been a surge of interest in neural networks [18, 20]. In particular, deep and large networks have exhibited impressive results once (1) they have been applied to large amounts of training data and (2) scalable computational resources such as thousands of CPU cores [11] and/or GPUs [18] have become available. Most notably, Krizhevsky et al. [18] showed that very large and deep convolutional networks [20] trained by standard backpropagation [24] can achieve excellent recognition accuracy when trained on a large dataset.
Face recognition state of the art Face recognition error rates have decreased over the last twenty years by three orders of magnitude [12] when recognizing frontal faces in still images taken in consistently controlled (constrained) environments. Many vendors deploy sophisticated systems for border control and smart biometric identification. However, these systems have been shown to be sensitive to various factors, such as lighting, expression, occlusion and aging, which substantially deteriorate their performance when recognizing people in unconstrained settings.
Most current face verification methods use hand-crafted features. Moreover, these features are often combined to improve performance, even in the earliest LFW contributions. The systems that currently lead the performance charts employ tens of thousands of image descriptors [5, 7, 2]. In contrast, our method is applied directly to RGB pixel values, producing a very compact and even sparse descriptor.
Deep neural nets have also been applied in the past to face detection [23], face alignment [26] and face verification [8, 15]. In the unconstrained domain, Huang et al. [15] used LBP features as input and showed improvement when combining them with traditional methods. In our method we use raw images as our underlying representation, and to emphasize the contribution of our work, we avoid combining our features with engineered descriptors. We also provide a new architecture that pushes further the limit of what is achievable with these networks by incorporating 3D alignment, customizing the architecture for aligned inputs, scaling the network by almost two orders of magnitude, and demonstrating a simple knowledge transfer method once the network has been trained on a very large labeled dataset.
Metric learning methods are used heavily in face verification. In several cases existing methods are successfully employed, but this is often coupled with task-specific innovation [25, 28, 6]. Currently, the most successful system that uses a large dataset of labeled faces [5] employs a clever transfer learning technique which adapts a Joint Bayesian model [6] learned on a dataset containing 99,773 images from 2,995 different subjects to the LFW image domain.
main. Here, in order to demonstrate the effectiveness of the
(a) (b) (c) (d)
(e) (f) (g) (h)
Figure 1. Alignment pipeline. (a) The detected face, with 6 initial fidu-
cial points. (b) The induced 2D-aligned crop. (c) 67 fiducial points on
the 2D-aligned crop with their corresponding Delaunay triangulation, we
added triangles on the contour to avoid discontinuities. (d) The reference
3D shape transformed to the 2D-aligned crop image-plane. (e) Triangle
visibility w.r.t. to the fitted 3D-2D camera; black triangles are less visible.
(f) The 67 fiducial points induced by the 3D model that are using to direct
the piece-wise affine warpping. (g) The final frontalized crop. (h) A new
view generated by the 3D model (not used in this paper).
features, we keep the distance learning step trivial.
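To make "trivial" concrete, one natural parameter-free choice for such a step is the inner product of L2-normalized feature vectors; the sketch below illustrates that idea and is not necessarily the exact metric used in this system.

```python
import numpy as np

def trivial_similarity(f1, f2):
    """Inner product of L2-normalized descriptors: a deliberately
    simple, parameter-free similarity between two face representations."""
    f1 = np.asarray(f1, dtype=np.float64)
    f2 = np.asarray(f2, dtype=np.float64)
    return float(np.dot(f1 / np.linalg.norm(f1), f2 / np.linalg.norm(f2)))

# Verification then reduces to thresholding this score, with the
# threshold chosen on held-out data (illustrative, not the paper's setup).
```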
2. Face Alignment
Existing aligned versions of several face databases (e.g., LFW-a [28]) help to improve recognition algorithms by providing a normalized input [25]. However, aligning faces in the unconstrained scenario is still considered a difficult problem that has to account for many factors, such as pose (due to the non-planarity of the face) and non-rigid expressions, which are hard to decouple from a person's identity-bearing facial morphology. Recent methods have shown successful ways to compensate for these difficulties by using sophisticated alignment techniques. These methods use one or more of the following: (1) employing an analytical 3D model of the face [27], (2) searching for similar fiducial-point configurations in an external dataset to infer from [4], and (3) unsupervised methods that find a similarity transformation for the pixels [16, 14].
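As an illustration of approach (3), a similarity transformation can be recovered from as few as two fiducial points. The sketch below assumes OpenCV and NumPy; the template eye coordinates and output size are hypothetical placeholders, not values taken from any of the cited methods.

```python
import numpy as np
import cv2

def similarity_align(image, left_eye, right_eye, out_size=152):
    """Warp a face crop with a 2D similarity transform (rotation, scale,
    translation) that moves the detected eye centers onto fixed template
    locations. Template coordinates below are illustrative."""
    # Where the eyes should land in the output crop (hypothetical template).
    dst = np.float32([[0.35 * out_size, 0.40 * out_size],
                      [0.65 * out_size, 0.40 * out_size]])
    src = np.float32([left_eye, right_eye])

    # Fit the 4-DoF similarity transform from the two point pairs.
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    return cv2.warpAffine(image, M, (out_size, out_size))
```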
While alignment is widely employed, no complete physically correct solution currently exists in the context of unconstrained face verification. 3D models have fallen out of favor in recent years, especially in unconstrained environments. However, since faces are 3D objects, we believe that 3D alignment, done correctly, is the right way. In this paper, we describe a system that uses analytical 3D modeling of the face, based on fiducial points, to warp a detected facial crop to a 3D frontal mode (frontalization).
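The piece-wise affine warping named in Figure 1(f) can be sketched generically: each Delaunay triangle of fiducial points is mapped by its own affine transform. The code below is a minimal sketch of that standard technique, assuming SciPy and OpenCV; it is not the authors' implementation.

```python
import numpy as np
import cv2
from scipy.spatial import Delaunay

def piecewise_affine_warp(src_img, src_pts, dst_pts, out_shape):
    """Warp src_img so that fiducial points src_pts land on dst_pts,
    applying one affine map per Delaunay triangle of the target points."""
    triangles = Delaunay(dst_pts).simplices  # triangulate in the target frame
    out = np.zeros(out_shape, dtype=src_img.dtype)
    for tri in triangles:
        s = np.float32(src_pts[tri])
        d = np.float32(dst_pts[tri])
        # Affine map taking this source triangle to its target triangle.
        M = cv2.getAffineTransform(s, d)
        warped = cv2.warpAffine(src_img, M, (out_shape[1], out_shape[0]))
        # Copy only the pixels that fall inside the target triangle.
        mask = np.zeros(out_shape[:2], dtype=np.uint8)
        cv2.fillConvexPoly(mask, np.int32(np.round(d)), 1)
        out[mask == 1] = warped[mask == 1]
    return out
```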
Similar to much of the recent alignment literature, our
alignment solution is based on using fiducial point detectors
to direct the alignment process. Localizing fiducial points
in unconstrained images is known to be a difficult problem,