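1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093
2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049
3. Faculty of Information Technology, Beijing University of Technology, Beijing 100124
4. Institute of Electronic and Information Engineering of UESTC in Guangdong, Dongguan, Guangdong 523808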
Published online: 2018-09-25
Published in print: 2018
LÜ Pin, LI Quangang, LIU Tingwen, et al. Towards Typosquatting Abuse Detection using Bi-directional LSTM[J]. Acta Electronica Sinica, 2018, 46(9): 2081-2086. DOI: 10.3969/j.issn.0372-2112.2018.09.006.
Existing approaches to detecting typosquatting abuse rely mainly on the edit distance between pairs of domain names. They do not fully exploit the context information in a domain name, and they tend to produce many false positives for short domains. Actively collecting additional information about the candidate domains can improve detection, but it introduces considerable overhead. We therefore adopt a lightweight detection strategy that works on the domain name string alone, and introduce a bi-directional long short-term memory (LSTM) model to make full use of domain context. We also design a locality-sensitive hashing function for domain names to speed up typosquatting detection over large-scale domain sets. Experiments on real-world data show that the proposed method overcomes the shortcomings of edit-distance-based detection and detects typosquatting abuse effectively.
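To make the short-domain problem concrete, the following is a minimal sketch of the edit-distance baseline that the abstract criticizes, not the method proposed in the paper. The function names `edit_distance` and `flag_candidates`, the threshold `max_dist`, and the sample domain labels are all assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def flag_candidates(target: str, observed: list[str], max_dist: int = 1) -> list[str]:
    """Hypothetical baseline: flag every observed label within max_dist of target."""
    return [d for d in observed
            if d != target and edit_distance(d, target) <= max_dist]


# A long target: only the near-miss spelling is flagged.
print(flag_candidates("facebook", ["facebok", "twitter"]))

# A short target: every three-letter neighbour is flagged, including
# labels that may belong to unrelated, legitimate sites.
print(flag_candidates("bbc", ["abc", "nbc", "bbs", "google"]))
```

With a threshold of one edit, a three-letter label has many neighbours that are ordinary names in their own right, so a pure distance threshold cannot separate them from real typos. This is the kind of false positive the abstract attributes to short domains.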