Abstract: In mobile augmented reality (MAR) applications, users interact with nearby smart objects to complete collaboration or interaction tasks, and the efficiency and user experience of these tasks are determined by the underlying directional interaction technology. However, current directional interaction technologies are inefficient. As the interaction medium, they rely on wireless technologies such as Wi-Fi and BLE, which propagate omnidirectionally and thus cannot use the user's spatial context (i.e., location and direction) to shorten the interaction time, which costs the user unnecessary effort. As for the interaction interface, current vision-based interfaces suffer from low reliability and low scalability, which further limits the adaptability and efficiency of the system. To address these issues, we developed RetroAR, an optical-sensing solution that leverages visible light backscatter communication to support directional interaction with smart objects on commodity smartphones. RetroAR exploits the directional propagation of light to preserve the user's spatial context, which enables fast, connection-free directional interaction between the user and the target devices. RetroAR instruments objects with custom retro-reflective markers called ViTags. When users interact with these smart objects, the ViTags communicate with the camera on the mobile reader by backscattering the flashlight beam. We first conducted a system evaluation, which showed that RetroAR works reliably at distances of up to 4 meters and view angles of up to 100 degrees, and achieves 6-DoF 3D tracking with errors as low as 1 cm in translation and 4.7 degrees in rotation. To evaluate interaction performance, we then conducted a user study with 12 participants, which demonstrated that RetroAR reduces the interaction time of MAR contactless control by at least a factor of two compared with Wi-Fi-based solutions.
RetroAR uses visible light backscatter communication to exploit the user's spatial context and keep the interaction process intuitive. Users can interact with multiple targets in a point-and-control manner, which reduces interaction costs and provides a natural, intuitive interaction experience.
Abstract: Temporal information and subtle lip changes are crucial for lip reading. However, existing lip-reading methods neither capture temporal information accurately nor focus on subtle movements. In response, we propose a lip-reading method named DMT-GhostNet that emphasizes minor lip variations and enhances temporal information. We introduce the decoupled spatio-temporal enhancement block (DSTE) to decouple a single 3D convolution into a temporal convolution and a spatial convolution. Based on motion excitation (ME) and the Ghost bottleneck block, we introduce the micro-motion bottleneck (M-Ghost) to detect subtle lip motions. The transformer multi-scale temporal convolution network (TransMS-TCN) is proposed to focus on important temporal sequences and to restrict irrelevant information from flowing into the MS-TCN. Experimental results show that DMT-GhostNet achieves an accuracy of 89.21% on the LRW dataset, an increase of 3.91% over mainstream ResNet-based methods, while reducing the parameter count by nearly 6 M. This indicates that DMT-GhostNet effectively utilizes temporal information and focuses on lip details, significantly improving lip-reading performance.
Abstract: Android virtualization applications are host applications that support dynamic loading of the functional modules required by users in the form of plugins. Malicious developers exploit this feature to hide their real attack intent in plugin applications, thereby evading detection aimed at the host applications. However, plugins are numerous and difficult to obtain and analyze, and existing pattern-based solutions for detecting malicious Android virtualization applications can detect only a limited range of application types. We propose a method based on the contexts of conditional statements for detecting malicious Android virtualization applications and implement a prototype tool named MVFinder. The method takes as its entry point the context in the virtualization application's code that triggers the loading or calling of plugin programs, and uses it to uncover hidden maliciousness. This avoids consuming large amounts of resources to obtain different kinds of plugin programs in real time or to parse the loading and running mode of each plugin one by one. In addition, the method uses anomaly detection to discover samples whose conditional contexts differ significantly from those of most benign applications, and thus identifies the targeted malware without the limitations of predefined rules. The experimental results show that this method outperforms current representative schemes, including VAHunt, Drebin, and Difuzer, in terms of accuracy and F1 score for detecting malicious Android virtualization applications. Unlike VAHunt, MVFinder also identifies variants of the HummingBad and PluginPhantom malware families.
Abstract: Graph convolutional networks have been widely applied in multi-behavior recommender systems due to their powerful ability to learn high-order collaborative signals. However, most existing graph convolution-based multi-behavior recommendation methods fail to effectively model the relationships between different user-item nodes and various behaviors. The sparsity of target behaviors also makes it difficult to further improve the performance of multi-behavior recommendation algorithms. To address these problems, we propose the multi-behavior graph contrastive learning recommendation model with self-attention mechanism (SA-MBGCL). This method combines user-item node embeddings with behavior embeddings and employs a self-attention mechanism to enhance the embedding representations, effectively modeling the dependencies between different nodes and behaviors. Meanwhile, a graph contrastive learning approach is constructed that treats the target behavior and auxiliary behaviors of the same user as positive pairs and those of different users as negative pairs, thereby reinforcing behavioral differences among users to alleviate the sparsity of target behaviors. The proposed model combines the unsampled recommendation task with multi-behavior graph contrastive learning to perform multi-task joint optimization. It was compared with 6 single-behavior models and 10 multi-behavior models on two public datasets, Beibei and Taobao. The results show that SA-MBGCL achieves an average improvement of 5.21% in Hit Ratio (HR) and 8.30% in Normalized Discounted Cumulative Gain (NDCG), which demonstrates the effectiveness of the proposed method.
Keywords: self-attention mechanism; graph contrastive learning; graph convolutional network; multi-task; multi-behavior; recommender system
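The positive/negative pairing described in the SA-MBGCL abstract can be illustrated with a generic InfoNCE-style contrastive loss. This is a minimal sketch, not the paper's loss: the function names, the cosine similarity, and the temperature value are assumptions made for illustration. Row i of `target` and row i of `auxiliary` are taken to be embeddings of the same user under the target and an auxiliary behavior.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Cosine similarity between two non-zero vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def info_nce(target, auxiliary, temperature=0.2):
    """Generic InfoNCE loss (hypothetical stand-in for the paper's objective).

    (target[i], auxiliary[i]) is the positive pair for user i;
    (target[i], auxiliary[j]) with j != i are the negative pairs.
    """
    n = len(target)
    total = 0.0
    for i in range(n):
        logits = [cosine(target[i], auxiliary[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_denominator - logits[i]
    return total / n


users_target = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # same user has similar embeddings
swapped = [[0.0, 1.0], [1.0, 0.0]]   # embeddings match the wrong user
print(info_nce(users_target, aligned), info_nce(users_target, swapped))
```

The loss is small when each user's target-behavior embedding is closest to that same user's auxiliary-behavior embedding, and large when it is closer to another user's.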
Abstract: Unmanned aerial vehicle (UAV)-assisted wireless sensor networks (WSNs) suffer from single-source data collection and uneven energy supplement. In this article, we first investigate the problem of fairness in data collection and energy supplement and develop a mathematical model for it. Then, a novel deep reinforcement learning algorithm, named DPDQN (Double Parametrized Deep Q-Networks), is designed to solve the proposed problem. The DPDQN algorithm incorporates a hybrid discrete-continuous action strategy with two components: a discrete action network and a continuous action network. The former schedules the order in which the UAV visits the sensors in the WSN, and the latter optimizes the UAV's hover position around each visited sensor. Numerical results demonstrate that the DPDQN algorithm outperforms three existing solutions in data collection fairness, energy replenishment fairness, and flying distance, as well as across four factors that influence fairness. Furthermore, the results confirm that our algorithm is robust and stable.
Keywords: fair data collection and energy supplement; unmanned aerial vehicle path planning; deep reinforcement learning; wireless sensor networks
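The hybrid discrete-continuous action strategy in the DPDQN abstract can be sketched as follows. This is a toy illustration only: both "networks" are replaced by hand-written stand-in functions, and every name, the 1 m hover offset, and the scoring rule are hypothetical rather than taken from the paper. The pattern shown is the one common to parametrized Q-learning: propose one continuous parameter per discrete choice, score each (choice, parameter) pair, and take the best.

```python
import math


def hover_points(uav_xy, sensors):
    """Stand-in for the continuous action network (hypothetical rule):
    hover 1 m from each sensor, on the line from the sensor to the UAV."""
    points = []
    for sx, sy in sensors:
        dx, dy = uav_xy[0] - sx, uav_xy[1] - sy
        d = math.hypot(dx, dy) or 1.0  # avoid dividing by zero at the sensor
        points.append((sx + dx / d, sy + dy / d))
    return points


def q_value(uav_xy, hover_xy, collected):
    """Stand-in for the discrete action network (hypothetical score):
    prefer sensors with little data collected so far and short flights."""
    return -collected - 0.1 * math.dist(uav_xy, hover_xy)


def select_action(uav_xy, sensors, collected):
    # One continuous parameter (hover point) per discrete action (sensor).
    hovers = hover_points(uav_xy, sensors)
    scores = [q_value(uav_xy, h, c) for h, c in zip(hovers, collected)]
    k = max(range(len(sensors)), key=scores.__getitem__)
    return k, hovers[k]


sensor_index, hover_xy = select_action(
    uav_xy=(0.0, 0.0),
    sensors=[(10.0, 0.0), (0.0, 5.0)],
    collected=[0.0, 3.0],
)
print(sensor_index, hover_xy)
```

Here the farther sensor is chosen because it has received no service yet, which is the kind of trade-off between fairness and flying distance that the abstract describes.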
Abstract: In recent years, U-shaped convolutional neural networks (CNNs) have achieved remarkable progress in image dehazing. However, most U-shaped dehazing networks pass encoder features directly to the decoder at the corresponding scale and thus fail to make effective use of multi-scale features. In addition, the channel attention widely used in dehazing networks is restricted by its receptive field and cannot sufficiently leverage contextual information, which adversely affects the learning of channel weights. To address these issues, this paper proposes a novel dehazing algorithm with cross-layer attentive feature interaction and multi-scale channel attention. Specifically, the cross-layer attentive feature interaction module learns hierarchical weights for multi-scale encoder features and aggregates these cross-layer features for transfer to the decoder, thereby reducing feature dilution while the dehazing network reconstructs clear images. Moreover, to uncover channel information that is critical for dehazing, we devise a multi-scale channel attention mechanism that extracts multi-scale features with dilated convolutions of different dilation rates, forming a parallel channel attention scheme with multi-scale contexts that allocates weights to the network's features more effectively. Experimental results demonstrate that the proposed dehazing algorithm achieves better objective metrics and visual quality than 12 existing methods on 4 public datasets. The code for this paper is available at http://github.com/bohuisir/AAFMAF.
Abstract: Due to the variability of the network environment, video playback is prone to lag and bit rate fluctuations, which seriously affect the quality of the end-user experience. To optimize network resource allocation and enhance the viewing experience, it is crucial to evaluate video quality accurately. Existing video quality evaluation methods mainly focus on the visual perception characteristics of short videos and give little consideration to the ability of human memory to store and express visual information or to the interaction between visual perception and memory. When users watch long videos, however, quality must be evaluated dynamically, taking both perceptual and memory elements into account. To better evaluate the quality of long videos, we introduce a deep network model to explore the impact of video perception and memory characteristics on users' viewing experience, and propose a dynamic quality evaluation model for long videos based on these two characteristics. First, we design subjective experiments to investigate the influence of visual perceptual features and human memory features on user experience quality under different video playback modes, and construct a video quality database with perception and memory (PAM-VQD). Second, based on the PAM-VQD database, deep learning is used to extract deep perceptual features of videos, combined with a visual attention mechanism, in order to accurately evaluate the impact of perception on user experience quality. Finally, three features output by the front-end network, namely the perceptual quality score, the playback status, and the self-lag interval, are fed into a long short-term memory network to establish the temporal dependency between visual perception and memory features.
The experimental results show that the proposed quality assessment model can accurately predict the user experience quality under different video playback modes with good generalization performance.
Keywords: visual perceptual properties; memory effect; quality of experience (QoE); deep learning; attention mechanism
Abstract: As an important spatio-temporal data mining task, user trajectory identification is widely used in location-based personalized service recommendation, itinerary planning, criminal behavior detection, and target tracking. However, its prediction accuracy remains low, mainly because trajectory data are sparse and sampled at low rates and because the number of trajectory categories is huge. To fill this research gap, a user trajectory identification model based on an expandable self-attention spatio-temporal graph convolutional neural network (ESAST-GCNN) is proposed, which adopts a spatio-temporal graph convolutional neural network to mine the relationship between temporal and spatial features in depth in order to predict and expand the sequence. The model incorporates a self-attention mechanism to capture the internal correlations of user trajectory feature vectors and identify user trajectories. Tests on two real datasets show that, compared with TUL via Embedding and RNN (TULER-GRU), ESAST-GCNN improves accuracy by 13.95% on Geolife and 10.63% on Gowalla. The experimental results show that ESAST-GCNN is superior to the other models compared, with better identification performance and wider applicability.
Abstract: Database parameter tuning is one of the crucial tasks in improving the performance of database systems. Database parameters can be classified by scope and functionality, and it is essential to investigate the mutual influence of parameters within a category and between categories. However, existing methods do not take this aspect into consideration. A collaborative multi-agent model called DBT-MADDPG (DataBase Tuning-Multi-Agent Deep Deterministic Policy Gradient) is proposed for database parameter tuning. A single-agent pre-training model called SA (Single Agent), a multi-agent joint training model called JAM (Joint Action Model), and a joint training model based on probabilistic selection called JAPM (Joint Action Probability Model) are designed for tuning the database parameters at different stages. The experimental results show that the DBT-MADDPG model can tune database parameters at different functional and parameter levels, matches the performance of mainstream algorithms during the training stage of the SA model, and obtains the optimal configuration 17.85% faster than state-of-the-art algorithms.
Keywords: multi-agent; parameter tuning; database system; self-learning; joint training model
Abstract: Semantic community discovery and evolution analysis in dynamic attributed networks has important research value. It requires accomplishing dynamic community discovery, community semantic interpretation, and community evolution analysis simultaneously, a goal that existing methods struggle to achieve. In view of this, this paper proposes DAN-NMF (NMF for Dynamic Attributed Networks), a method based on joint nonnegative matrix factorization. DAN-NMF integrates network topology information, attribute information, and smoothness constraints from community evolution in a unified way, and derives iterative update rules for the related factor matrices using the majorization-minimization optimization framework, which allows it to obtain the results of dynamic community discovery, community semantic interpretation, and community evolution analysis directly. Extensive experiments are conducted on multiple synthetic and real-world dynamic attributed networks. The results show that DAN-NMF improves accuracy by at least 7.3% over the best baseline. Moreover, the analysis of real-world dynamic attributed networks demonstrates that DAN-NMF can effectively discover the evolution patterns of dynamic communities and provide rich semantic interpretations of communities.
Abstract: Low-latency block ciphers are attracting growing interest in the cryptographic community, and the development of low-latency S-boxes is a pivotal line of research. Using minimal-latency gate circuits and a novel two-layer tree structure, we construct balanced Boolean functions, together with their extended bit permutation equivalence classes, that exhibit desirable cryptographic properties across various latency thresholds. Using these low-latency Boolean functions as coordinate functions, we build vectorial Boolean functions to construct low-latency S-boxes. Our research not only provides S-boxes optimized for latency and hardware implementation area but also, for the first time, combines sets of low-latency S-boxes with their corresponding inverse sets to search for S-boxes with low latency in both directions. The low-latency S-boxes obtained outperform existing benchmarks and offer more choices: latency is reduced by 20% and 33% relative to MANTIS and PRINCE, respectively, and hardware area is reduced by 6.68% compared with MANTIS and by 17.69% compared with PRINCE.
Abstract: The 3D unmanned aerial vehicle (UAV) path planning problem aims to plan an optimal flight path for the UAV while satisfying safety conditions. In this paper, a cost function for UAV path planning is constructed through mathematical modeling, so that the path planning problem is transformed into a multi-constrained optimization problem, which is then solved with metaheuristic algorithms. To address the slow convergence of the artificial rabbit optimization algorithm and its tendency to fall into local optima, this paper develops an improved Artificial Rabbit Optimization algorithm based on Levy flight, adaptive Cauchy mutation, and an elite population Genetic strategy (LCGARO). Multifaceted comparison experiments are conducted between LCGARO and six classical and advanced heuristic algorithms on 29 CEC2017 test functions and six 3D UAV path-planning terrain scenarios of varying complexity. The results show that LCGARO achieves better optimization accuracy on 22 of the CEC2017 test functions. In the UAV path planning experiments, LCGARO plans the flight path with the smallest total cost function value in five of the terrain scenarios.
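Two of the ingredients named in the LCGARO abstract, Levy flight and Cauchy mutation, can be sketched in generic form. This is an illustration under stated assumptions, not the paper's operators: the Levy step uses Mantegna's well-known sampling scheme, and the "adaptive" part is modeled here as a hypothetical linear decay of the mutation scale over iterations. All function and parameter names are invented for the example.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's algorithm (1 < beta <= 2)."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # practically never happens; avoids division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cauchy_mutation(x, lower, upper, iteration, max_iterations, rng=random):
    """Perturb each coordinate with Cauchy noise and clamp to the bounds.

    The scale shrinks linearly with the iteration count; this decay rule
    is an assumption made for the sketch, not the schedule from the paper.
    """
    scale = 1.0 - iteration / max_iterations
    mutated = []
    for xi in x:
        # Inverse-CDF sample of a standard Cauchy variate.
        step = scale * math.tan(math.pi * (rng.random() - 0.5))
        mutated.append(min(upper, max(lower, xi + step)))
    return mutated


rng = random.Random(0)
step = levy_step(rng=rng)
candidate = cauchy_mutation([0.0, 5.0], -10.0, 10.0, iteration=3, max_iterations=10, rng=rng)
print(step, candidate)
```

Both operators produce occasional large jumps, which is the usual motivation for using them to escape local optima; late in the run the shrinking scale makes the mutation progressively more local.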
Abstract: The transferability of adversarial samples is crucial for attacking unknown models and makes adversarial attacks feasible in practical scenarios. Existing transfer attacks tend to distort features indiscriminately to degrade the prediction accuracy of the source model, overlooking the intrinsic features of the objects in the images. Inspired by existing work on feature importance extraction, this paper proposes a method termed multi-layer accumulated gradient attack, which disrupts the crucial object-aware features that dominate the model's decision. Specifically, this paper introduces iterative accumulated gradients to quantify feature importance; these gradients are highly correlated with the target object and help to improve transfer attacks. Furthermore, by combining attacks across several intermediate layers, this paper arrives at the multi-layer accumulated gradient attack. Experimental results show that, compared with the best-performing existing method, the proposed method achieves comparable attack success rates against normally trained models and success rates 2.6 percentage points higher against defense models.
Abstract: The rapid development of artificial intelligence technology has given autonomous air combat strategies the potential to surpass human experts. Existing intelligent air combat strategies fall into two categories according to what drives them: knowledge-based strategies, which rely heavily on application scenarios and expert knowledge, and data-driven strategies, represented by reinforcement learning, which have poor interpretability and weak generalization. In this study, focusing on the multi-agent cooperative air combat scenario of the air intelligence game (AIG), a strategy design method that integrates knowledge-based and data-driven approaches is proposed. The knowledge-based part uses expert knowledge to design a parameterized and stylized knowledge-based artificial intelligence (AI) system, which generates high-quality offline data and initializes the strategy. The data-driven part employs graph attention networks to selectively represent information about teammates and opponents, aiming to improve training efficiency and convergence performance. Furthermore, a dynamic opponent matching mechanism is introduced into multi-agent reinforcement learning training to enhance strategy generalization. The proposed strategy achieved a statistical winning rate of over 70% against 12 of the top 16 teams in AIG. These teams all adopt the latest knowledge-based or data-driven methods, have diverse styles, and possess strong combat capabilities.
Keywords: reinforcement learning; knowledge and data integration; air combat; multi-opponent; generalization
Abstract: With the deepening informatization of water treatment systems, those integrated with industrial internet technology face increasingly severe challenges from abnormal-behavior intrusions. To address problems of traditional anomaly detection methods such as single-threshold detection, low detection accuracy, and high false alarm rates, a progressive anomaly detection method for water treatment systems that integrates autoencoders and isolation forests is proposed. First, duplicate data are filtered out by downsampling, which speeds up the training and testing of the progressive anomaly detection model. Second, the hidden-layer neurons of the autoencoder are constructed to capture the key features of the data, the weights and biases of the autoencoder are optimized, and a threshold on the reconstruction error, which measures the difference between the input and its reconstruction, is set for basic detection. Finally, isolation trees are constructed with the average path length as the anomaly measurement threshold to form an isolation forest, and the isolation trees are traversed to perform advanced detection on the anomalous data found by basic detection. This two-stage progressive anomaly detection improves detection performance. The experimental results show that the accuracy and F1 score of the proposed method on the secure water treatment dataset exceed 95%; compared with traditional methods, the accuracy is improved by 31.86 percentage points on average and, in particular, the false positive rate of anomaly detection is reduced to 0.30%. In the generalization analysis on the water distribution dataset, the precision, recall, and other indicators all exceed 94%. In terms of overall performance, the training and testing time of the model also compares favorably with that of the comparison methods.
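The two-stage decision described in this abstract can be sketched as a pair of threshold tests. This is a simplified illustration, not the paper's implementation: the autoencoder and the isolation forest are assumed to have been run already, so the function takes their outputs (reconstruction errors and mean isolation-tree path lengths) as plain lists. The normalizing term c(n) is the standard average path length of an unsuccessful binary search tree lookup used by isolation forests; using it directly as the stage-two threshold, and all names below, are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """c(n): expected path length for n samples in an isolation tree."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def progressive_detect(recon_errors, mean_path_lengths, error_threshold, n_samples):
    """Stage 1 flags candidates by reconstruction error; stage 2 confirms
    only candidates that are isolated faster than average."""
    c = average_path_length(n_samples)
    flags = []
    for err, depth in zip(recon_errors, mean_path_lengths):
        candidate = err > error_threshold      # basic detection
        confirmed = candidate and depth < c    # advanced detection
        flags.append(confirmed)
    return flags


flags = progressive_detect(
    recon_errors=[0.01, 0.90, 0.80],
    mean_path_lengths=[12.0, 4.0, 11.5],
    error_threshold=0.5,
    n_samples=256,
)
print(flags)
```

In this toy input the third sample passes the reconstruction-error test but is not easily isolated, so the second stage rejects it; this is how a second stage of this kind can lower the false positive rate.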
Abstract: Ultrasound image segmentation plays a key role in disease diagnosis and treatment, but accurately segmenting regions of interest remains challenging because of the low contrast and noise in ultrasound images and the variability in shape, size, and location of lesions. To address this problem, we propose a dual-channel self-attention U-shaped network (SwinT-Unet), which uses a Swin-Transformer and a Unet encoder to extract features simultaneously. To effectively fuse the features of different levels extracted by the Swin-Transformer and the Unet encoder, we also propose a gated dual-layer feature fusion module (GDFF), which fuses global and local features through a gating mechanism, thereby improving the accuracy and robustness of the segmentation results. We conduct experiments on two different ultrasound image datasets, and the results show that the proposed model outperforms existing convolutional neural network and Transformer-based models in segmentation accuracy and robustness. This work provides a new method for ultrasound image segmentation and offers more accurate and reliable support for clinical diagnosis and treatment.
Abstract: Existing neural network-based high range resolution (HRR) radar target recognition methods are data-driven black-box models, which makes it hard to interpret or assess their hidden representations of the data. When target-aspect coverage is incomplete, neural network-based methods suffer from poor feature generalization and rapid degradation of recognition performance. To address these issues, this paper develops a physically interpretable auto-encoder model (PIAEM). By incorporating the scattering center model of radar targets into the network, PIAEM learns scattering center features that have physical meaning. In particular, since the scattering center features reflect the target structure according to radar imaging theory, they are robust when target-aspect coverage is incomplete. Moreover, this paper designs a recognition scheme that predicts the category of test samples based on the minimum reconstruction error criterion. Experiments on a measured HRR radar dataset validate the effectiveness of the model in learning interpretable features and achieving robust recognition, and PIAEM improves the recognition rate by 10.27% compared with traditional radar target recognition methods.
Keywords: radar target recognition; interpretable neural network; scattering center model; variational inference; auto-encoder; minimum reconstruction error
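The minimum reconstruction error criterion mentioned in the PIAEM abstract can be illustrated with a small sketch. It assumes one reconstructor per class, which the abstract does not state explicitly, and it replaces the auto-encoder with a toy least-squares projection onto a class template. The helper names and the templates are hypothetical.

```python
def reconstruction_error(x, x_hat):
    # Squared Euclidean distance between a sample and its reconstruction.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def project_onto(template):
    """Toy reconstructor: best scalar multiple of a fixed class template.
    A stand-in for a trained per-class model, not the paper's network."""
    norm_sq = sum(t * t for t in template)

    def reconstruct(x):
        coef = sum(a * t for a, t in zip(x, template)) / norm_sq
        return [coef * t for t in template]

    return reconstruct


def classify(sample, reconstructors):
    """Return the label whose reconstructor reproduces the sample best."""
    return min(
        reconstructors,
        key=lambda label: reconstruction_error(sample, reconstructors[label](sample)),
    )


models = {
    "A": project_onto([1.0, 0.0, 0.0]),
    "B": project_onto([0.0, 1.0, 1.0]),
}
label = classify([0.9, 0.1, 0.0], models)
print(label)
```

The sample is assigned to class "A" because that class's model explains it with the smallest residual.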
Abstract: The exponential increase in IoT devices has accelerated the interconnection of heterogeneous wireless devices, and cross-technology communication (CTC) enables wireless devices that operate in the same band but use different underlying protocols to connect directly without gateways. Nevertheless, systematic research on two-way CTC between heterogeneous mobile devices is still lacking. This paper proposes MobiCTC, a CTC scheme based on energy sensing that supports bidirectional CTC between mobile WiFi and ZigBee devices. In the WiFi-to-ZigBee direction, the scheme uses RSSI as the decoding information and an energy-level mapping scheme to decode it. In the ZigBee-to-WiFi direction, the scheme adopts CSI as the decoding information, fully exploits the amplitude and phase information of CSI, and uses a machine learning method for decoding. Finally, this paper designs and implements MobiCTC on the TelosB node and the USRP X310 platform and verifies it experimentally. The experimental results show that, in the mobile state, the WiFi-to-ZigBee throughput is 139.535 bps, which is 1.82 times higher than WiZig, and the symbol error rate is 0.016, which is basically the same as WiZig; the ZigBee-to-WiFi throughput is 250 bps, which is 15.7% higher than FreeBee, and the symbol error rate is 0.0516, which is 23.21% lower than ZigFi.
Keywords: heterogeneous devices; cross-technology communication; wireless communication; received signal strength indicator; channel state information
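The idea of decoding RSSI through an energy-level mapping, as described for the WiFi-to-ZigBee direction, can be sketched generically. The abstract does not give the actual mapping, so everything concrete here is hypothetical: the three energy levels, the fixed number of RSSI samples per symbol, and the nearest-level rule.

```python
def decode_rssi(samples_dbm, levels_dbm, samples_per_symbol):
    """Map each window of RSSI samples to the index of the nearest energy level.

    levels_dbm[i] is the nominal RSSI (in dBm) that encodes symbol i;
    these values are illustrative, not taken from MobiCTC.
    """
    symbols = []
    last_start = len(samples_dbm) - samples_per_symbol
    for start in range(0, last_start + 1, samples_per_symbol):
        window = samples_dbm[start:start + samples_per_symbol]
        mean = sum(window) / len(window)  # average out per-sample noise
        symbol = min(range(len(levels_dbm)), key=lambda i: abs(levels_dbm[i] - mean))
        symbols.append(symbol)
    return symbols


symbols = decode_rssi(
    samples_dbm=[-89, -91, -52, -49, -71, -68],
    levels_dbm=[-90, -70, -50],
    samples_per_symbol=2,
)
print(symbols)
```

Averaging several RSSI readings per symbol trades throughput for robustness; a mobile receiver would presumably also need to re-estimate the levels as the channel changes, which this sketch does not attempt.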
Abstract: The collision attack is one of the main analysis techniques against scalar multiplication, and its success rate depends on the accuracy of collision detection in operations such as point addition and point doubling. Due to the influence of random operands and branching statements, collision detection is often little better than random guessing. How to detect collisions for point addition and point doubling effectively has become an urgent problem. To solve it, we focus on point addition and doubling in Jacobian coordinates on Weierstrass curves and propose a collision detection method for scalar multiplication based on modular similarity detection. First, according to the operation flow of point addition and point doubling, the modular multiplications used in collision detection are identified, and a new collision relationship is constructed between them, which converts the attack into modular multiplication collision detection. Second, we find that some modular multiplications are completely determined by the coordinates in Jacobian coordinates. Building on this finding, we propose modular similarity detection and convert the attack into detecting whether two modular multiplication operations are the same, thereby avoiding the influence of random operands on collision detection. Finally, we conduct collision detection experiments on a hardware implementation of scalar multiplication. By compressing the curves with principal component analysis, the accuracy of collision detection for point addition and doubling is improved to 99%. The proposed collision detection method remains effective for scalar multiplications protected by masking and branch balancing.
Abstract: A high-efficiency harmonic-tuned radio-frequency (RF) oscillator with a composite right/left-handed (CRLH) cell in the feedback path is presented. The optimum load impedances at the fundamental and harmonic frequencies are obtained through harmonic load-pull simulation. A CRLH cell in the feedback path provides a feedback signal of appropriate amplitude and phase and controls the oscillation frequency. In addition, the load network is synthesized to control the second- and third-harmonic terminations and thus improve efficiency. For demonstration, an RF oscillator using a GaN high electron mobility transistor (HEMT) is designed and fabricated at 3.5 GHz. The measured results show that the power oscillator achieves a maximum efficiency of 65.5% with an output power of 39.7 dBm at 3.49 GHz, which validates the proposed circuit topology and design method. Moreover, the proposed oscillator has the smallest circuit size among the reported high-efficiency oscillators it is compared with.
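The reported figures can be put into absolute terms with a short calculation. The sketch below converts the output power from dBm to watts and, assuming that the stated efficiency is the DC-to-RF conversion efficiency (the abstract does not define it), estimates the implied DC input power. The function names are illustrative.

```python
def dbm_to_watts(p_dbm):
    # 0 dBm is 1 mW, so 30 dBm is 1 W.
    return 10 ** ((p_dbm - 30) / 10)


def dc_power_watts(p_out_dbm, efficiency):
    """DC input power implied by an output power and a DC-to-RF efficiency
    (the efficiency definition is an assumption of this sketch)."""
    return dbm_to_watts(p_out_dbm) / efficiency


p_out = dbm_to_watts(39.7)
p_dc = dc_power_watts(39.7, 0.655)
print(f"{p_out:.2f} W RF output from about {p_dc:.2f} W DC input")
```

Under that assumption, 39.7 dBm corresponds to roughly 9.3 W of RF output drawn from a little over 14 W of DC power.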
Abstract: Phase-change integrated photonic devices are widely considered strong competitors to conventional electronic devices because of their large bandwidth, short delay, multiplexing capability, and strong immunity to interference. However, current phase-change integrated photonic devices consume a great deal of energy, which severely limits their commercial prospects. To address this issue, this paper proposes a silicon dioxide (SiO2) / magnesium fluoride (MgF2) based photonic architecture to replace mainstream silicon-based devices. The device uses Ge2Sb2Te5 (GST) and indium tin oxide (ITO), both widely used today, as the functional and microheater materials, respectively, and its programming and readout processes are simulated with an independently developed model that couples the electro-thermal and phase-change processes. The results indicate that the energy consumption for crystallization and amorphization is 78 aJ/nm³ and 90 aJ/nm³, respectively, much lower than that of the majority of silicon-based devices. The device also exhibits good light propagation in the near-infrared band (1550 nm), multilevel operation with more than 5 intermediate states, and a short pulse width of 50 ns. Additionally, further study suggests that a photonic neural network constructed from the proposed device can classify the iris dataset with an accuracy of up to 90%, close to that of conventional neural networks (~94.7%). This work provides a new strategy for developing emerging phase-change photonic devices with low power consumption and in-memory and neuromorphic computing functionality, and it is significant for non-von-Neumann computing schemes that combine the advantages of electronics and photonics.
Abstract: As the heat flux of chips continues to increase, heat dissipation from the package poses a new challenge. Liquid-cooled microchannel heat dissipation is an important research direction for package thermal management. In this paper, microchannels embedded in a ceramic substrate are proposed to meet the heat dissipation requirements of high-power chips. This structure shortens the distance between the chip and the heat sink and reduces thermal resistance. It also reduces the volume of the thermal control system and integrates thermal control, electrical interconnection, and structure. The design of the embedded microchannel structure must take into account the dense wiring and through-holes in the ceramic substrate to avoid interference between them. An embedded microchannel structure that accounts for wiring and through-holes is designed, and an integrated processing method for microchannels, wiring, and vias is proposed. The test results show that the location of the embedded microchannels matters more than their structural parameters. The convergence layer and the heat dissipation layer of the main channel of the embedded microchannels should be placed in the same layer. The embedded microchannels for ceramic packaging substrates can meet the heat dissipation requirement of a 100 W chip.