DeepFace Recognition

Face Recognition System

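Most photos contain more than just a face, so face detection is used first to crop out the face region.
Face alignment then adjusts the crop so that the face looks straight ahead.
The aligned face passes through face preprocessing, which also acts as augmentation, and becomes the model input.
During training, a CNN extracts features from the face and is optimized with a loss suited to the task.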
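
As a rough illustration of the alignment step, the sketch below handles only the simplest case: an in-plane rotation that makes the line between the eyes horizontal. The post does not say which alignment method is used, so this is an assumption, and the function names and eye coordinates are hypothetical.

```python
import math

def roll_angle_degrees(left_eye, right_eye):
    """Angle of the eye line relative to horizontal, in image
    coordinates (x to the right, y downward)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(point, center, angle_degrees):
    """Rotate `point` around `center` by `angle_degrees`."""
    theta = math.radians(angle_degrees)
    x = point[0] - center[0]
    y = point[1] - center[1]
    return (
        center[0] + x * math.cos(theta) - y * math.sin(theta),
        center[1] + x * math.sin(theta) + y * math.cos(theta),
    )

# Hypothetical eye landmarks from a detector (pixels).
left_eye, right_eye = (30.0, 40.0), (70.0, 60.0)
angle = roll_angle_degrees(left_eye, right_eye)

# Rotating by the negative angle levels the eye line.
leveled_right_eye = rotate_point(right_eye, left_eye, -angle)
```

In practice the same rotation would be applied to the whole image crop, not only to the landmark points.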

Closed-set Face Recognition

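Face recognition is divided into closed-set and open-set settings. In the closed-set setting, every label in the test set also appears in the training set; face recognition for the employees of one specific company is a good mental model. It can be solved the same way as an ordinary classification problem in deep learning.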
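
A minimal sketch of the closed-set case, assuming a feature vector has already been extracted: a linear layer plus softmax picks one of the known identities. The names, weights, and feature values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def classify_closed_set(feature, weights, bias, names):
    """Return the most likely known identity and all class probabilities."""
    probs = softmax(feature @ weights + bias)
    return names[int(np.argmax(probs))], probs

# Hypothetical identities and a toy 2-D feature space.
names = ["alice", "bob", "carol"]
weights = np.array([[2.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0]])
bias = np.zeros(3)

predicted, probs = classify_closed_set(np.array([1.0, 0.0]), weights, bias, names)
```

Note that the classifier always answers with one of the known names, which is exactly why it does not carry over to faces outside the training labels.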

Open-set Face Recognition

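This setting applies when the system must recognize faces whose labels are not in the training set. Plain classification cannot solve it, so the model is trained to embed images into a discriminative feature space. In other words, it learns a metric for mapping face images into the feature space, and that mapping must be trained to separate different faces well (that is, to maximize the discriminative margin).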

Development Trends in Face Recognition Systems

Term

The Gallery: Target IDs

The Probe: Test ID

Identification: One-to-many (which of the employees is this?)

Verification: One-to-one (is this person the employee they claim to be, or not?)
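
The terms above can be sketched with embeddings and a distance threshold. This is an illustration under assumptions: cosine distance, the 0.4 threshold, and the gallery contents are all hypothetical choices, not values from any of the papers.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe, claimed, threshold=0.4):
    """One-to-one: does the probe match the single claimed identity?"""
    return cosine_distance(probe, claimed) <= threshold

def identify(probe, gallery, threshold=0.4):
    """One-to-many: nearest gallery ID, or None if nothing is close enough."""
    best_id, best_dist = min(
        ((name, cosine_distance(probe, emb)) for name, emb in gallery.items()),
        key=lambda pair: pair[1],
    )
    return best_id if best_dist <= threshold else None

# Hypothetical gallery of target IDs and two probes.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known_probe = [0.9, 0.1, 0.0]
stranger_probe = [0.0, 0.0, 1.0]

who = identify(known_probe, gallery)
unknown = identify(stranger_probe, gallery)
is_alice = verify(known_probe, gallery["alice"])
```

Returning None for a probe that is far from every gallery entry is what makes this usable in the open-set setting.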

DeepFace (2014)

DeepFace fine-tunes AlexNet and trains it as an image classification problem (softmax loss). Verification is then performed by taking the embedding vector from the last layer and computing its chi-square distance to the embedding of a new image.

Source: https://stats.stackexchange.com/questions/496693/using-chi2-distance-for-histogram-comparison-with-0-valued-elements-leads-to
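
A minimal sketch of a chi-square distance between two feature vectors, assuming the features are non-negative. The optional per-dimension weights and the handling of zero-sum elements (they contribute 0 instead of dividing by zero) are choices made for this illustration.

```python
import numpy as np

def chi_square_distance(f1, f2, weights=None, eps=1e-12):
    """Sum over dimensions of (f1 - f2)^2 / (f1 + f2).

    Dimensions where f1 + f2 is (near) zero are skipped, which avoids
    the 0/0 case when both features are zero.
    """
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    numerator = (f1 - f2) ** 2
    denominator = f1 + f2
    terms = np.divide(
        numerator,
        denominator,
        out=np.zeros_like(numerator),
        where=denominator > eps,
    )
    if weights is not None:
        terms = terms * np.asarray(weights, dtype=float)
    return float(terms.sum())

# Toy non-negative feature vectors; the middle element is zero in both.
distance = chi_square_distance([1.0, 0.0, 3.0], [1.0, 0.0, 1.0])
```

Verification would then compare this distance against a threshold chosen on validation pairs.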

FaceNet (2015)

Before FaceNet, a model had to be trained first and then fine-tuned with joint Bayesian or metric learning. FaceNet, by contrast, is trained with metric learning alone, with no separate fine-tuning stage. It introduced the triplet loss.

Joint Bayesian, source: https://www.nature.com/articles/s41598-024-60002-z

Metric learning, source: https://www.researchgate.net/figure/Working-mechanism-of-metric-learning-The-metric-learning-method-aims-to-find-a_fig3_355201453

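Triplet loss uses three samples: an anchor, a positive (data with the same label as the anchor), and a negative (data with a different label from the anchor). The loss is optimized to minimize the L2 distance between the anchor image and the positive while maximizing the distance between the anchor and the negative.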

Triplet loss function
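
For reference, the triplet loss is usually written as below. The notation is an assumption introduced here: f is the embedding network, x^a, x^p, and x^n are the anchor, positive, and negative images of the i-th triplet, alpha is the margin, and [.]_+ means max(., 0).

```latex
L = \sum_{i=1}^{N} \Big[ \, \lVert f(x_i^{a}) - f(x_i^{p}) \rVert_2^2
      - \lVert f(x_i^{a}) - f(x_i^{n}) \rVert_2^2 + \alpha \, \Big]_{+}
```

A triplet contributes zero loss once the negative is farther from the anchor than the positive by at least the margin.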

The drawbacks are that a large batch size is required to sample anchor-positive-negative triplets in a balanced way, and that hard positives and hard negatives can cause problems.

hard positive: a sample with the same ID as the anchor that is hard to recognize as the same person

hard negative: a sample with a different ID that looks similar to the anchor
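
To make hard positives and hard negatives concrete, here is a small sketch that picks the hardest pair for one anchor inside a batch and evaluates the triplet loss on it. Always taking the single hardest sample is a simplification chosen for illustration; the margin of 0.2 and the toy embeddings are assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(a, p) - d(a, n) + margin, 0) with squared L2 distances."""
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    return max(d_pos - d_neg + margin, 0.0)

def hardest_triplet(embeddings, labels, anchor_idx):
    """Indices of the farthest same-ID and nearest other-ID sample.

    Assumes the batch holds at least one other sample with the anchor's
    label and at least one sample with a different label.
    """
    dists = np.sum((embeddings - embeddings[anchor_idx]) ** 2, axis=1)
    same = labels == labels[anchor_idx]
    same[anchor_idx] = False  # the anchor is not its own positive
    different = labels != labels[anchor_idx]
    pos_candidates = np.where(same)[0]
    neg_candidates = np.where(different)[0]
    hard_pos = pos_candidates[np.argmax(dists[pos_candidates])]
    hard_neg = neg_candidates[np.argmin(dists[neg_candidates])]
    return int(hard_pos), int(hard_neg)

# Toy batch: three samples of ID 0 and two samples of ID 1.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.5, 0.0], [3.0, 0.0]])
labels = np.array([0, 0, 0, 1, 1])

hard_pos, hard_neg = hardest_triplet(embeddings, labels, anchor_idx=0)
loss = triplet_loss(embeddings[0], embeddings[hard_pos], embeddings[hard_neg])
```

With a small batch there may be no informative positive or negative for a given anchor at all, which is one reason large batches are needed.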

The dataset is not public and is too large, so the results cannot be reproduced.

VGGFace (2015)

The dataset is public, and the model reaches performance similar to FaceNet with less data. It uses triplet loss together with softmax loss.

By this point, accuracy had already reached about 99%. This was when it became clear that the dataset and data preprocessing matter more than the algorithm.

Characteristics of FR

Small inter-class variations: human faces are structurally similar, so the variation between identities is small compared with other object categories.

Large intra-class variations: even for a single person, appearance varies widely with mood, pose, and age.

ArcFace (2018)

ArcFace proposes the Additive Angular Margin Loss (ArcFace) to obtain highly discriminative feature embeddings for open-set face recognition.

SphereFace paper

The weight matrix of a neural network's last fully connected layer can be read as a representation of the center of each class.

Each row of the normalized weight matrix can be regarded as the vector that represents the center of one class.

The takeaway is that margin-based losses work well, and ArcFace is a loss built on that idea.
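
Written out, the additive angular margin loss takes the form below. The symbols are assumptions introduced for this note: x_i is the embedding of sample i with label y_i, W_j is the weight vector (class center) of class j, theta_j is the angle between them, s is the feature scale, and m is the angular margin.

```latex
\cos\theta_j = \frac{W_j^{\top} x_i}{\lVert W_j \rVert \, \lVert x_i \rVert},
\qquad
L = -\frac{1}{N} \sum_{i=1}^{N}
    \log \frac{e^{\, s \cos(\theta_{y_i} + m)}}
              {e^{\, s \cos(\theta_{y_i} + m)} + \sum_{j \neq y_i} e^{\, s \cos\theta_j}}
```

Setting m = 0 reduces this to a softmax loss over normalized features and weights.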

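In short, training adds an extra angular margin when the inner product is computed, so that embeddings that should be close in angle are pulled even closer and embeddings that should be far apart are pushed even farther.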
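
The idea can be sketched in NumPy as follows. This is an illustration, not a training-ready implementation: it ignores the case where the angle plus the margin exceeds pi, and the default values s=64 and m=0.5 are taken as assumptions about typical settings.

```python
import numpy as np

def arcface_logits(features, weights, labels, s=64.0, m=0.5):
    """Scaled cosine logits with an angular margin added to the true class.

    features: (N, D) embeddings; weights: (C, D), one row per class center;
    labels: (N,) integer class indices.
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(x @ w.T, -1.0, 1.0)  # cosine between sample and class center
    rows = np.arange(len(labels))
    theta_target = np.arccos(cos[rows, labels])
    logits = cos.copy()
    logits[rows, labels] = np.cos(theta_target + m)  # widen the true-class angle
    return s * logits

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Toy example: one sample sitting exactly on its class center.
features = np.array([[1.0, 0.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0])

loss_with_margin = cross_entropy(arcface_logits(features, weights, labels, s=1.0, m=0.5), labels)
loss_without_margin = cross_entropy(arcface_logits(features, weights, labels, s=1.0, m=0.0), labels)
```

Even for a sample that already sits on its class center, the margin keeps the loss higher than plain normalized softmax, so the network keeps being pushed toward tighter clusters.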

While working on an AI project recently, I found a lightweight Python framework on GitHub that bundles these models together, which is what led me to study them.

https://github.com/serengil/deepface?tab=readme-ov-file

I plan to use DeepFace in this project to run gender, age-group, and emotion analysis in a single pass.
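
A minimal sketch of what that call might look like, assuming the deepface package is installed (pip install deepface). The helper name summarize_face and the image path are hypothetical, and the result keys are read defensively because the return format has differed between library versions.

```python
def summarize_face(img_path):
    """Run age, gender, and emotion analysis on one image with DeepFace.

    Returns one summary dict per detected face. The key names used below
    are assumptions about the library's output and may need adjusting
    for the installed version.
    """
    from deepface import DeepFace  # imported here so the module loads without it

    results = DeepFace.analyze(
        img_path=img_path,
        actions=["age", "gender", "emotion"],
    )
    if isinstance(results, dict):  # some versions return a single dict
        results = [results]
    return [
        {
            "age": face.get("age"),
            "gender": face.get("dominant_gender", face.get("gender")),
            "emotion": face.get("dominant_emotion"),
        }
        for face in results
    ]

# Usage (hypothetical path): summarize_face("face.jpg")
```

The first call typically downloads pretrained model weights, so it can take noticeably longer than later calls.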
