SHEN De-rong, LIU Li-nan, KOU Yue, NIE Tie-zheng, YU Ge
Vol. 38, Issue 2, Pages: 275-281(2010)
Abstract: Duplicate records are multiple distinct records that describe the same real-world entity. Because records extracted from different Deep Web sources in the same domain often contain duplicates, this paper focuses on duplicate record identification and proposes an identification model based on a known global schema and the mapping between that global schema and the interface attributes of each Deep Web data source. For the semi-structured data extracted from Deep Web data sources, the attributes to which the data correspond are annotated using a query probing method, and the dominance of each global-schema attribute is determined by analyzing the extracted instance data. Multiple estimators and multiple similarity algorithms are then applied to identify duplicates. Experimental results show that the proposed duplicate record identification model is feasible and efficient.
Keywords: duplicate record identification; Deep Web; data extraction
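The idea of combining several similarity algorithms under attribute weights can be sketched as follows. This is a minimal illustration, not the paper's model: the attribute names, the weights standing in for attribute "dominance", the two similarity functions, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated tokens, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical weights standing in for the "dominance" of each
# global-schema attribute; the paper derives these from instance data.
WEIGHTS = {"title": 0.6, "author": 0.3, "year": 0.1}


def record_similarity(r1: dict, r2: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of per-attribute scores.

    Each attribute score is the larger of the two similarity measures,
    which is one simple way to combine multiple algorithms (assumed here).
    """
    total = 0.0
    for attr, w in weights.items():
        v1, v2 = r1.get(attr, ""), r2.get(attr, "")
        total += w * max(edit_similarity(v1, v2), token_jaccard(v1, v2))
    return total / sum(weights.values())


def is_duplicate(r1: dict, r2: dict, threshold: float = 0.85) -> bool:
    """Flag two records as duplicates when their similarity reaches the threshold."""
    return record_similarity(r1, r2) >= threshold
```

Taking the maximum of the two measures makes the score tolerant of both small typos and reordered words; a learned estimator could replace this rule.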
Abstract: End-to-end multi-connection and multi-path transmission technologies have attracted considerable research attention because they can improve throughput and enhance the security and reliability of data transmission. In the Universal Network, mapping mechanisms from service to connection and from connection to path are provided, so that concurrent multi-connection, multi-path data transmission can be realized. This paper analyzes the mapping mechanism of the service layer of the Universal Network and proposes mathematical models, based on network utility maximization, for the many-to-many mappings from service to connection and from connection to path. Mapping from service to path via connection amounts to allocating path capacity among all services so that their aggregate utility is globally optimized. General mapping models are also presented for networks carrying both inelastic and elastic services, so that inelastic services can be guaranteed a certain QoS.
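The simplest special case of such a capacity allocation can be sketched in a few lines. The sketch assumes a single shared path, weighted logarithmic utilities for elastic services, and fixed rate demands for inelastic services. These are assumptions made for illustration; the paper's many-to-many models are more general. Under these assumptions, maximizing the sum of w_i log x_i subject to the capacity constraint has the closed-form solution x_i = w_i R / sum(w), where R is the capacity left after the inelastic demands are reserved.

```python
def allocate_path_capacity(capacity, elastic_weights, inelastic_demands=None):
    """Allocate one path's capacity among services.

    Inelastic services (hypothetical fixed demands) are reserved first so
    their rate is guaranteed. The remaining capacity is split among elastic
    services in proportion to their weights, which is the optimum of
    sum(w_i * log(x_i)) subject to sum(x_i) <= remaining.
    """
    inelastic_demands = inelastic_demands or {}
    reserved = sum(inelastic_demands.values())
    if reserved > capacity:
        raise ValueError("inelastic demands exceed path capacity")

    remaining = capacity - reserved
    total_weight = sum(elastic_weights.values())

    allocation = dict(inelastic_demands)
    for name, weight in elastic_weights.items():
        allocation[name] = remaining * weight / total_weight if total_weight > 0 else 0.0
    return allocation
```

For example, with a capacity of 10, one inelastic service demanding 2, and elastic weights 1 and 3, the elastic services receive 2 and 6 respectively.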
ZHANG Wen-ge, LIU Fang, JIAO Li-cheng, ZHANG Xiang-rong
Vol. 38, Issue 2, Pages: 290-294(2010)
Abstract: By modeling the orthogonal bandelet coefficients in each quad-tree subsquare with a Generalized Gaussian Distribution, a formula for an adaptive, local, subsquare-wise threshold is derived within a Bayesian framework, and the best range of the parameter needed to compute this threshold is determined. On this basis, a subsquare-wise threshold denoising algorithm for natural images in the bandelet domain is proposed. Because it makes full use of the local statistical information of the image, the proposed algorithm outperforms BayesShrink and other threshold-based methods in both visual quality and evaluation criteria for natural image denoising.
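For context, the BayesShrink baseline named above can be sketched as a per-block threshold. This illustrates the standard BayesShrink rule (threshold = noise variance divided by estimated signal standard deviation) applied to one block of coefficients; it is not the subsquare-wise formula derived in the paper, which the abstract does not give. The function names and the flat-list representation of a block are assumptions.

```python
import math
from statistics import median


def estimate_noise_sigma(finest_coeffs):
    """Robust noise estimate: median absolute coefficient divided by 0.6745."""
    return median(abs(c) for c in finest_coeffs) / 0.6745


def bayes_shrink_threshold(block, noise_sigma):
    """BayesShrink threshold T = sigma_n^2 / sigma_x for one coefficient block."""
    var_y = sum(c * c for c in block) / len(block)    # observed variance (zero mean assumed)
    var_x = max(var_y - noise_sigma ** 2, 0.0)        # estimated signal variance
    if var_x == 0.0:
        # Block is noise-dominated: threshold high enough to zero everything.
        return max(abs(c) for c in block)
    return noise_sigma ** 2 / math.sqrt(var_x)


def soft_threshold(block, t):
    """Shrink each coefficient toward zero by t, keeping its sign."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in block]
```

Computing the threshold block by block, rather than once per subband, is what lets a method of this kind adapt to local image statistics.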
Abstract: By analyzing the tendency of neutrals in a vote model, a new similarity measure between interval-valued data based on Gaussian distribution functions is proposed, and the distance between two interval-valued data is defined. A novel fuzzy clustering algorithm for interval-valued data is then presented. Examples show that this algorithm achieves better performance than existing methods.
Abstract: To counter the adverse effect of the reduced number of particles available for state estimation and model recognition when model information is introduced into the particle sampling process, a novel multiple model particle filter algorithm based on particle optimization is proposed. In the new algorithm, every particle is combined with an extended Kalman filter, whose prediction and update mechanism is used to make reasonable use of the latest observation. This improves the effectiveness of each single particle in approximating the true system state and model. Theoretical analysis and simulation results show that the new method clearly outperforms the interacting multiple model particle filter and the standard multiple model particle filter in both the filtering precision of the system state and the accuracy of model recognition.