

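1. Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan 475001, China
2. Laboratory of Yellow River Cultural Heritage, Henan University, Kaifeng, Henan 475001, China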
Received: 01 April 2021
Revised: 12 January 2023
Published: 25 April 2023
ZHANG Chong-sheng, WANG Bin. Oracle Bone Fragments Conjugation Based on Sequence Matching[J]. ACTA ELECTRONICA SINICA, 2023, 51(04): 860-869. DOI: 10.12263/DZXB.20210429.
Rejoining oracle bone fragments is among the most urgent and fundamental tasks in oracle bone studies: piecing fragments back together restores more complete original material for research on oracle bone inscriptions (OBI). Although computer-aided rejoining methods have been investigated for decades, their accuracy has been too low for practical use, so OBI researchers still rejoin fragments manually, which is slow and laborious. To study machine rejoining more rigorously, we use OB-Rejoin, a relatively large-scale benchmark dataset of about one thousand oracle bone rubbing images that incorporates a large number of rejoinings already established by OBI experts, which serve as ground truth for evaluation. On this dataset we propose SUM (Slope United Sequence Matching for Oracle Bone Fragments Conjugation), a rejoining algorithm based on matching sequences of slope variations. SUM recasts the matching of fracture-edge images, a computer vision problem whose solutions are not yet precise enough, as a numerical sequence similarity problem, for which data science offers more mature techniques. It converts each numerical fracture-edge sequence into a slope-variation sequence and then into a character sequence, and finally matches fracture edges by fuzzy string matching. In experiments, we compare SUM with classic sequence similarity methods in terms of precision, recall, and miss rate, and with two recent deep learning-based algorithms for sequence matching and shape matching. SUM achieves a Top-15 rejoining recall of 95.181% on OB-Rejoin, outperforming the compared algorithms. Because the accurate restoration of important unearthed documents is a real and pressing need in history and paleography, this work is useful not only for rejoining oracle bone fragments but also as a reference for restoring other unearthed documents, in particular Qin and Han bamboo and wooden slips and the Dunhuang manuscripts.
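The pipeline described in the abstract (edge points, then slope variations, then characters, then fuzzy string matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the seven-letter alphabet, the `max_turn` clipping range, the use of direction angles in place of raw slopes (to avoid division by zero on vertical segments), and the choice of Python's `difflib.SequenceMatcher` as the fuzzy matcher are all assumptions made for illustration. It also assumes both edges have already been sampled in the same direction along the shared break.

```python
import math
from difflib import SequenceMatcher

ALPHABET = "abcdefg"  # hypothetical 7-bin alphabet; 'd' means "no change in direction"


def slope_changes(points):
    """Map ordered (x, y) edge points to the change in direction between segments."""
    # Direction angle of each segment (an assumed stand-in for the slope).
    angles = [math.atan2(y2 - y1, x2 - x1)
              for (x1, y1), (x2, y2) in zip(points, points[1:])]
    changes = []
    for a, b in zip(angles, angles[1:]):
        # Wrap the difference into [-pi, pi) so a small turn stays small.
        changes.append((b - a + math.pi) % (2 * math.pi) - math.pi)
    return changes


def to_string(changes, max_turn=math.pi / 2):
    """Quantize each direction change into one character of ALPHABET."""
    n = len(ALPHABET)
    chars = []
    for d in changes:
        clipped = max(-max_turn, min(max_turn, d))
        idx = min(n - 1, int((clipped + max_turn) / (2 * max_turn) * n))
        chars.append(ALPHABET[idx])
    return "".join(chars)


def fuzzy_score(s, t):
    """Similarity in [0, 1] between two character sequences."""
    return SequenceMatcher(None, s, t, autojunk=False).ratio()


def rank_candidates(query, candidates, k=15):
    """Return the names of the k candidate edges most similar to the query."""
    scored = [(fuzzy_score(query, s), name) for name, s in candidates.items()]
    scored.sort(key=lambda pair: -pair[0])  # highest score first, stable on ties
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    edge = [(0, 0), (1, 0), (2, 0), (2, 1), (3, 1)]
    query = to_string(slope_changes(edge))
    pool = {"frag_A": query, "frag_B": "aaa"}  # hypothetical candidate edges
    print(query, rank_candidates(query, pool, k=2))
```

Returning the k best candidates mirrors the Top-15 evaluation reported above: a query counts as recalled when its true partner appears in the returned list.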