ZENG Kai, WAN Zi-xin, WANG Ming-tao, SHEN Tao
Online available: 2024-12-23
Restoring the weight distribution, activation distribution, and gradients of a binary network as closely as possible to those of the original full-precision network can greatly improve its inference ability. However, existing methods apply the restoration operation directly to the binary data in forward propagation, and the gradient approximation functions used in backpropagation are fixed or manually designed, so the restoration efficiency of binary networks still leaves room for improvement. To address this problem, an efficient restoration method for binary neural networks is investigated. First, a distribution restoration method that maximizes information entropy is proposed: by shifting the mean of the original full-precision weights and scaling their modulus, the quantized binary weights directly exhibit maximum distribution restoration, while a simple statistical shift-and-scale factor greatly improves the restoration efficiency of weights and activations. Furthermore, a gradient function based on adaptive distribution approximation is proposed, which dynamically determines the update range of the current gradient at the P-th percentile of the actual distribution of the current full-precision data; it adaptively changes the shape of the approximation function to update gradients efficiently during training, thereby improving the convergence of the model. Theoretical analysis confirms that, while improving execution efficiency, the proposed method achieves maximum restoration of the binary data. Compared with existing state-of-the-art binary network models, the experimental results show excellent performance, reducing the computation time of the distribution-restoration quantization by 60% for ResNet-18 and 67% for ResNet-20.
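The entropy-maximizing restoration described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `binarize_max_entropy` is hypothetical, and it realizes the stated idea in the standard way, shifting the weights by their mean (so the +1/-1 sign distribution is balanced, maximizing its entropy) and restoring magnitude with a cheap statistical scaling factor (the mean absolute deviation).

```python
import numpy as np

def binarize_max_entropy(w):
    """Hypothetical sketch: binarize weights after mean-shifting so that
    +1 and -1 are (approximately) equally likely, i.e. the binary sign
    distribution has maximum information entropy; a simple statistical
    scaling factor restores the magnitude of the original weights."""
    w_shifted = w - w.mean()          # shift: balance the sign distribution
    alpha = np.abs(w_shifted).mean()  # scale: cheap statistical factor
    return alpha * np.sign(w_shifted), alpha

# Even for an all-positive weight vector, the shifted signs split evenly:
w = np.array([0.1, 0.2, 0.3, 0.4])
wb, alpha = binarize_max_entropy(w)   # signs: [-1, -1, +1, +1], alpha = 0.1
```

Because the shift and scale are simple first-order statistics, the restoration costs one pass over the weights, which is consistent with the reported reduction in quantization time.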
An accuracy of 93.0% is achieved for binary-quantized VGG-Small on the CIFAR-10 dataset and 61.9% for binary-quantized ResNet-18 on the ImageNet dataset, both of which are the best performance among current binary neural networks. The relevant code is available at https://github.com/sjmp525/IA/tree/ER-BNN.
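The adaptive distribution-approximation gradient described in the abstract can likewise be sketched under stated assumptions. The function `adaptive_ste_grad` below is hypothetical: it models the idea as a straight-through-style estimator whose clipping range is not fixed but is the P-th percentile of the current full-precision data's magnitude, so the shape of the effective approximation function adapts to the data distribution at each step.

```python
import numpy as np

def adaptive_ste_grad(w, grad_out, p=99):
    """Hypothetical sketch of an adaptive straight-through estimator:
    gradients pass through only where |w| lies within the p-th percentile
    of the current full-precision distribution, so the update range tracks
    the actual data instead of a fixed, manually chosen threshold."""
    t = np.percentile(np.abs(w), p)      # data-dependent update range
    return grad_out * (np.abs(w) <= t)   # zero the gradient outside the range
```

A fixed clip (e.g. always |w| <= 1) ignores how wide the weight distribution actually is at a given training step; tying the threshold to a percentile keeps a constant fraction of weights updatable as the distribution evolves.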