Research on Automatic Annotation Algorithm for Character-level Oracle-Bone Images Based on Anchor Points

SHI Xian-jin; CAO Shuang; ZHANG Chong-sheng; TAO Yue-feng; LÜ Ling-ling; SHEN Xia-jiong

doi:10.12263/DZXB.20201191

PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 49, Issue 10, Pages: 2020-2031 (2021)
- Author affiliations:
  1. School of Computer and Information Engineering, Henan University; Yellow River Cultural Heritage Laboratory, Henan University, Kaifeng, Henan 475004
  2. Henan Provincial Audio-Visual Education Center, Zhengzhou, Henan 450004
  3. School of Electric Power, North China University of Water Resources and Electric Power, Zhengzhou, Henan 450045
- DOI: 10.12263/DZXB.20201191
  CLC: TP311.5; TP391.1
- Received: 26 October 2020; Revised: 29 September 2021; Published: 25 October 2021
SHI Xian-jin, CAO Shuang, ZHANG Chong-sheng, et al. Research on Automatic Annotation Algorithm for Character-level Oracle-Bone Images Based on Anchor Points [J]. ACTA ELECTRONICA SINICA, 2021, 49(10): 2020-2031. DOI: 10.12263/DZXB.20201191.

Abstract

Oracle-bone inscriptions are the earliest systematic writing in China and the earliest mature Chinese characters discovered to date. Their study is of great significance to historical research and cultural inheritance. At present, however, character-level annotation of oracle-bone images can only be done manually by experienced oracle-bone experts, which is both labor-intensive and inefficient. To address this problem, and building on the oracle-bone character image recognition model from our previous work, this paper proposes an automatic annotation algorithm for oracle-bone character images. Following a "group into columns first, then segment" strategy, the algorithm first assigns each character image on an oracle-bone rubbing to a specific column. Then, taking anchor oracle-bone characters as reference points, it uses spatial nearest-neighbor relations to find the character image that corresponds to each character in the original oracle-bone text, thereby labeling the character images automatically. The labeled character images are then added to the sample data set, and the recognition model is retrained on the augmented data set (6 to 10 times larger), which helps improve the recognition accuracy of deep-learning-based oracle-bone character recognition. The approach greatly increases the number of samples at a small cost and saves experts considerable time and labor.
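The two stages named in the abstract (column assignment, then anchor-based matching) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The `Box` type, the `group_into_columns` and `annotate_column` helpers, and the `tol` parameter are hypothetical names introduced here. The sketch also assumes that character images are already available as bounding boxes, that columns are read top to bottom, that an anchor is a box whose label is already known (for instance from a recognition model), and that spatial neighbors of the anchor are simply the boxes above and below it in the same column.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Box:
    """Hypothetical bounding box of one character image on a rubbing."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height
    label: Optional[str] = None  # pre-set only for anchor characters

    @property
    def cx(self) -> float:
        return self.x + self.w / 2


def group_into_columns(boxes: Sequence[Box], tol: float = 0.5) -> List[List[Box]]:
    """Greedy column assignment (an assumption, not the paper's rule):
    a box joins a column when its centre x lies within tol * width of
    that column's mean centre x. Columns come out left to right, and
    each column is sorted top to bottom."""
    columns: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.cx):
        for col in columns:
            mean_cx = sum(b.cx for b in col) / len(col)
            if abs(box.cx - mean_cx) <= tol * box.w:
                col.append(box)
                break
        else:
            columns.append([box])
    for col in columns:
        col.sort(key=lambda b: b.y)
    return columns


def annotate_column(column: List[Box], text: Sequence[str]) -> List[Box]:
    """Label the boxes of one column from its transcription `text`.
    The first box whose known label occurs exactly once in `text`
    serves as the anchor; its neighbours above and below receive the
    characters before and after the anchor character."""
    for i, box in enumerate(column):
        if box.label is not None and list(text).count(box.label) == 1:
            offset = list(text).index(box.label) - i
            break
    else:
        return column  # no usable anchor: leave the column unlabeled
    for i, box in enumerate(column):
        j = i + offset
        if 0 <= j < len(text):
            box.label = text[j]
    return column


# Letters stand in for transcribed oracle-bone characters.
boxes = [
    Box(100, 10, 30, 30), Box(100, 60, 30, 30, label="B"), Box(100, 110, 30, 30),
    Box(40, 10, 30, 30), Box(40, 60, 30, 30),
]
columns = group_into_columns(boxes)
right_column = annotate_column(columns[-1], "ABC")
print([b.label for b in right_column])  # ['A', 'B', 'C']
```

Aligning by position relative to the anchor means that a column with missing or damaged characters can still be labeled, as long as the boxes that remain are contiguous around the anchor. A fuller treatment would need to handle gaps and multiple anchors per column.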
