Characters Are Like Faces

Haoyu Deng, Zhaoteng Ye, Yule Duan

University of Electronic Science and Technology of China

[Code][PDF]

Abstract

Chinese has over 100,000 characters, yet only four thousand of them are used in daily life. Cultural researchers, however, frequently encounter the remaining Rarely Used Characters (RUCs), and Optical Character Recognition (OCR) technology would make these characters far easier for them to work with. Nevertheless, current OCR methods, whether regression-based or classification-based, struggle to recognize such a huge number of characters. In this work, we simply treat characters like human faces and adopt MobileFaceNetV3 to recognize over 74,000 Chinese characters included in Unicode. A demo can be seen at http://risingentropy.top/OCR.html. All source code: https://github.com/RisingEntropy/Characters-Are-Like-Faces

Note that the web demo is currently unavailable because I am migrating my website to GitHub Pages.
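
To make the "characters as faces" idea from the abstract concrete, here is a minimal sketch of how face-recognition-style matching is commonly done: each image is mapped to an embedding vector, and a query is assigned to the reference whose embedding has the highest cosine similarity. This is an illustration only, not the repository's actual inference code. The function names (`l2_normalize`, `cosine_similarity`, `recognize`), the three-dimensional toy embeddings, and the `gallery` dictionary are all assumptions made for the example; in practice the embeddings would come from the trained network and be much higher-dimensional.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def recognize(query_embedding, gallery):
    """Return the (character, score) pair whose reference embedding best matches the query.

    `gallery` is a hypothetical mapping from each character to one reference embedding.
    """
    best_char, best_score = None, -2.0  # cosine similarity is always >= -1
    for char, ref in gallery.items():
        score = cosine_similarity(query_embedding, ref)
        if score > best_score:
            best_char, best_score = char, score
    return best_char, best_score


# Toy 3-dimensional embeddings (made up for illustration, not model output).
gallery = {
    "永": [0.9, 0.1, 0.0],
    "水": [0.1, 0.9, 0.1],
    "氷": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.05]  # stands in for the embedding of an unknown character image

best_char, best_score = recognize(query, gallery)
print(best_char, round(best_score, 3))
```

One appeal of this style of matching, under the assumptions above, is that supporting a new character only requires adding a reference embedding to the gallery rather than retraining a fixed-size classification layer.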

Main diagram
