Components Decomposition of Tibetan Words Based on Mealy Machines

CAI Rang-zhuoma, LI Yong-ming, CAI Zhi-jie

Acta Electronica Sinica ›› 2015, Vol. 43 ›› Issue (5) : 935-939.

DOI: 10.3969/j.issn.0372-2112.2015.05.016
Research article


Abstract

Component decomposition of Tibetan words is foundational to Tibetan information processing, with significant theoretical value and broad application prospects. To address the complexity and variety of Tibetan word components, this paper analyzes the word-formation rules and structural features of modern Tibetan and studies the process of component decomposition. It uses the one-to-one correspondence between each output character and each state transition of a Mealy machine to describe the behavioral semantics of Tibetan word components. It then gives theorems for deciding whether an arbitrary string can be decomposed by the Mealy machine, together with a Mealy-machine-based component decomposition algorithm. A component decomposition system built on this machine was designed and implemented, and it verifies the feasibility of the algorithm.
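As a rough illustration of the one-output-per-transition property the abstract relies on, the following Python sketch runs a toy Mealy machine over the Unicode code points of a Tibetan syllable and labels each one. It is not the paper's algorithm: the state names, the three character classes, the labels, and the `decompose` function are all assumptions made for this example. The labels reflect Unicode encoding classes only, not the grammatical roles (prefix, root, suffix) that a full decomposition would assign. A string that reaches an undefined transition is rejected, which loosely mirrors the question of whether a string can be decomposed at all.

```python
# Toy Mealy machine: each transition consumes one input symbol and emits
# exactly one output symbol. States, classes, and labels are hypothetical
# and are not taken from the paper.

def char_class(ch):
    """Map a character to a coarse class based on its Unicode code point."""
    cp = ord(ch)
    if 0x0F40 <= cp <= 0x0F6C:
        return "C"  # full-form Tibetan consonant letter
    if 0x0F90 <= cp <= 0x0FBC:
        return "S"  # subjoined (stacked) consonant
    if cp in (0x0F72, 0x0F74, 0x0F7A, 0x0F7C):
        return "V"  # vowel signs i, u, e, o
    return None     # anything else is outside this toy alphabet

# Transition/output table: (state, input class) -> (next state, output label)
DELTA = {
    ("start", "C"): ("stack", "base"),
    ("stack", "C"): ("stack", "base"),
    ("stack", "S"): ("stack", "subjoined"),
    ("stack", "V"): ("vowel", "vowel"),
    ("vowel", "C"): ("stack", "base"),
}

def decompose(text):
    """Return a list of (character, label) pairs, or None if rejected."""
    state = "start"
    result = []
    for ch in text:
        key = (state, char_class(ch))
        if key not in DELTA:
            return None  # undefined transition: the string is not accepted
        state, label = DELTA[key]
        result.append((ch, label))
    return result

if __name__ == "__main__":
    syllable = "\u0f56\u0f66\u0f92\u0fb2\u0f74\u0f56\u0f66"
    for ch, label in decompose(syllable):
        print(f"U+{ord(ch):04X} {label}")
```

The output list always has the same length as the accepted input, because every transition emits exactly one label. A string that begins with a vowel sign has no transition from the start state, so `decompose` returns `None` for it.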

Key words

Tibetan information processing / Mealy automata / components / component decomposition

Cite this article

CAI Rang-zhuoma, LI Yong-ming, CAI Zhi-jie. Components Decomposition of Tibetan Words Based on Mealy Machines[J]. Acta Electronica Sinica, 2015, 43(5): 935-939. https://doi.org/10.3969/j.issn.0372-2112.2015.05.016
Chinese Library Classification number: TP391


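Funding

National Natural Science Foundation of China (No.61262051, No.11271237, No.61163018); National Social Science Fund of China (No.13BYY141, No.14BYY1322); Ministry of Education "Chunhui Plan" cooperative research project (No.Z2012093); Program for Changjiang Scholars and Innovative Research Team in University, innovative research team grant (No.IRT1068)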
