
Online first

The manuscripts published below will continue to be available from this page until they are assigned to an issue.
  • CHEN Ping-ping, LIN Hu, CHEN Hong-hui, XIE Zhao-peng
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240919
    Online available: 2025-05-07

    In end-to-end text spotting in complex natural scenes, text is hard to distinguish from the background, so the location information obtained by detection often does not match the semantic information obtained by recognition, and the correlation between detection and recognition cannot be exploited effectively. To address this problem, this paper proposes multi-party synergetic information with dual-domain awareness text spotting (MSIDA). By enhancing text region features and edge textures, MSIDA uses the synergy between text detection and recognition features to improve end-to-end text recognition performance. First, a dual-domain awareness (DDA) module that integrates text spatial and directional information is designed to enhance the visual features of text instances. Second, a multi-party explicit information synergy (MEIS) is proposed to extract explicit information from the encoded features and to generate candidate text instances by matching and allocating the position, classification, and character information used for detection and recognition. Finally, the cooperative features guide learnable query sequences through the decoders to obtain the text detection and recognition results. Compared with the recent decoder with explicit points solo (DeepSolo) method, MSIDA improves accuracy by 0.8%, 0.8%, and 0.4% on the Total-Text, ICDAR 2015, and CTW1500 datasets, respectively. The code and datasets are available at https://github.com/msida2024/MSIDA.git .

  • LIU Ying, XUE Jia-hao, ZHANG Wei-dong, XU Zhi-jie
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240754
    Online available: 2025-04-30

    An image classification algorithm based on coordinate importance pooling and decoupled class alignment distillation is proposed to improve the classification accuracy of convolutional neural networks while keeping the network lightweight. First, a coordinate importance pooling module is designed and embedded into ResNet34 to make full use of the positional information of image pixels and strengthen the ability to discriminate important features. Second, BlurPool is used to mitigate the performance impact of degraded shift equivariance during down-sampling, and the teacher network is constructed. Finally, a decoupled class alignment distillation algorithm is constructed to efficiently transfer image classification knowledge from the teacher network to the lightweight MobileNetV3 network; it treats the knowledge of target and non-target classes separately and introduces correlation information between classes. Experimental results on different datasets show that the proposed teacher network effectively improves classification performance, and that the distillation-trained student network achieves better overall performance than other networks of the same magnitude, making it better suited to practical scenarios with limited computing and storage resources.
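
As a rough illustration of treating target and non-target class knowledge separately, the sketch below splits a temperature-softened distillation loss into a target-class term and a non-target-class term. It follows the generic decoupled knowledge distillation idea and is not the paper's decoupled class alignment distillation; the function names, the temperature value, and the absence of any class-correlation term are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def decoupled_kd_terms(teacher_logits, student_logits, target, temperature=4.0):
    """Return (target_term, non_target_term) of a decoupled distillation loss.

    target_term compares the binary split [p_target, 1 - p_target];
    non_target_term compares the distributions renormalized over the
    remaining classes. Hypothetical helper, not the paper's exact loss.
    """
    pt = softmax(teacher_logits, temperature)
    ps = softmax(student_logits, temperature)
    target_term = kl_divergence(
        [pt[target], 1.0 - pt[target]],
        [ps[target], 1.0 - ps[target]],
    )
    rest_teacher = [p / (1.0 - pt[target]) for i, p in enumerate(pt) if i != target]
    rest_student = [p / (1.0 - ps[target]) for i, p in enumerate(ps) if i != target]
    non_target_term = kl_divergence(rest_teacher, rest_student)
    return target_term, non_target_term


if __name__ == "__main__":
    teacher = [5.0, 1.0, 0.5, 0.2]
    student = [3.0, 2.0, 0.1, 0.4]
    tckd, nckd = decoupled_kd_terms(teacher, student, target=0)
    print(f"target term: {tckd:.4f}, non-target term: {nckd:.4f}")
```

A weighted sum of the two terms would then be added to the usual cross-entropy loss; the weights are tuning choices and are left out here.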

  • JIANG Wen-bo, ZHAO Gui-hua
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20230911
    Online available: 2025-04-30

    The television production and presentation system has undergone an evolution from black and white to color, and from analog to digital. Currently, it is in a stage of rapid development from high definition (HD) to ultra-high definition (UHD). The signal transmission rate of the traditional HD baseband system is only 1.5 Gbps, which is unable to carry 4K/8K UHD signals (48 Gbps@8K, 12 Gbps@4K). Moreover, the luminance dynamic range of HD television is only 10³, while the human eye's visible range without pupil adjustment is 10⁵. Therefore, UHD television should enhance the luminance dynamic range to 10⁵ in accordance with the human eye's recognition ability. Focusing on the technical challenges such as uncompressed cross-domain multi-address Internet Protocol (IP) switching and high dynamic range (HDR) production and presentation for UHD, this paper comprehensively introduces the UHD television production and presentation system and key technologies, with a particular emphasis on the innovative points of UHD IP signal switching, 8K UHD video imaging and image processing, video intelligent enhancement, extended reality (XR) virtual-real fusion production, Audio Vivid, heterogeneous network audio-video synchronization transmission, and 4K/8K UHD terminal display, as well as the full-process program production and presentation capabilities of UHD high dynamic characteristics.
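
The link rates quoted above can be sanity-checked with a short calculation of the uncompressed active-picture payload. The sketch assumes 10-bit 4:2:2 sampling (two samples per pixel on average) and the frame rates shown; these parameters are assumptions for illustration and are not stated in the abstract. Blanking, audio, and ancillary data are ignored, which is why the payloads come out below the nominal link rates.

```python
def active_video_gbps(width, height, fps, bit_depth=10, samples_per_pixel=2):
    """Uncompressed active-picture payload in Gbit/s.

    samples_per_pixel=2 corresponds to 4:2:2 chroma subsampling
    (one luma sample plus, on average, one chroma sample per pixel).
    """
    return width * height * fps * bit_depth * samples_per_pixel / 1e9


# Assumed example formats: (width, height, frames per second).
FORMATS = {
    "HD 1080 @ 30 fps": (1920, 1080, 30),
    "4K UHD @ 60 fps": (3840, 2160, 60),
    "8K UHD @ 60 fps": (7680, 4320, 60),
}

if __name__ == "__main__":
    for name, (w, h, fps) in FORMATS.items():
        print(f"{name}: {active_video_gbps(w, h, fps):.2f} Gbit/s")
```

Under these assumptions the payloads are about 1.24, 9.95, and 39.81 Gbit/s, which fit within the 1.5, 12, and 48 Gbps link rates respectively, and show why a 1.5 Gbps HD link cannot carry 4K or 8K signals.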

  • ZHANG Sen, PAN Cheng-wu, LI Hao-yu, HE Nai-long, MA Jie, ZHANG Long, LIU Si-yang, SUN Wei-feng
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240571
    Online available: 2025-04-30

    A potential control technique for 1 200 V isolation structures is proposed. The technique is realized with potential delivering field plates (PDFPs). The PDFPs are equally spaced in the high voltage junction termination (HVJT) region, and their spacing is adjusted starting in the P-type isolation ring region. On the N/P-channel lateral double-diffused metal-oxide-semiconductor (LDMOS), the spacing of the PDFPs near the source and drain sides is widened, and the spacing in the middle of the drift region is narrowed. The PDFPs deliver the potential of the HVJT to the LDMOS region, which regulates the surface potential distribution of the LDMOS and prevents premature breakdown. Experimental results indicate that the proposed isolation structure achieves a 467% improvement in breakdown voltage compared with the isolation structure without PDFPs.

  • KANG Ran-lan, LI Yu-rong, SHI Wu-xiang, LI Ji-xiang
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240885
    Online available: 2025-04-30

    To address the low spatial resolution and susceptibility to noise of traditional single-modality brain-computer interface (BCI) technologies based on electroencephalography (EEG), an increasing number of studies have focused on BCI research that combines EEG signals with functional near-infrared spectroscopy (fNIRS) signals. However, integrating these two heterogeneous signals poses challenges. This paper proposes an end-to-end signal fusion method based on deep learning and evidence theory for motor imagery (MI) classification. The spatiotemporal features of EEG signals are extracted using dual-scale temporal convolution and depthwise separable convolution, with a hybrid attention module introduced to enhance the network's ability to perceive important features. For fNIRS signals, spatial convolution across all channels explores activation differences between brain regions, while parallel temporal convolution and a gated recurrent unit (GRU) capture richer temporal features. In the decision fusion stage, the decision outputs obtained from decoding each signal are first used to estimate uncertainty through Dirichlet distribution parameter estimation. Subsequently, Dempster-Shafer theory (DST) is employed for dual-layer reasoning, merging evidence from the two basic belief assignment (BBA) methods and from the different modalities to obtain the decoding results. The proposed model is evaluated on the publicly available TU-Berlin-A dataset and achieves an average accuracy of 83.26%, a 3.78% improvement over the state-of-the-art research. This provides new ideas and approaches for fusion studies based on EEG and fNIRS signals.
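
The decision-fusion step can be pictured with a minimal sketch: each modality's non-negative class evidence is turned into belief masses plus an uncertainty mass via Dirichlet parameters, and two such opinions are merged with a reduced form of Dempster's combination rule. This is a common formulation in evidential multi-view learning and is only an assumed stand-in; the paper's dual-layer reasoning and its two BBA methods are not reproduced, and all names and numbers below are hypothetical.

```python
def opinion_from_evidence(evidence):
    """Map non-negative class evidence to (belief masses, uncertainty).

    Dirichlet parameters are alpha_k = e_k + 1 with strength S = sum(alpha);
    then b_k = e_k / S and u = K / S, so that sum(b) + u = 1.
    """
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    beliefs = [e / strength for e in evidence]
    return beliefs, num_classes / strength


def combine_opinions(opinion_a, opinion_b):
    """Reduced Dempster's rule for two opinions over the same classes."""
    beliefs_a, u_a = opinion_a
    beliefs_b, u_b = opinion_b
    agreement = sum(x * y for x, y in zip(beliefs_a, beliefs_b))
    # Conflict is the mass assigned to different classes by the two sources.
    conflict = sum(beliefs_a) * sum(beliefs_b) - agreement
    scale = 1.0 - conflict
    fused_beliefs = [
        (x * y + x * u_b + y * u_a) / scale
        for x, y in zip(beliefs_a, beliefs_b)
    ]
    return fused_beliefs, (u_a * u_b) / scale


if __name__ == "__main__":
    eeg = opinion_from_evidence([4.0, 1.0])    # hypothetical EEG evidence
    fnirs = opinion_from_evidence([3.0, 0.0])  # hypothetical fNIRS evidence
    beliefs, uncertainty = combine_opinions(eeg, fnirs)
    print(beliefs, uncertainty)
```

When both modalities support the same class, the fused uncertainty is lower than either input uncertainty, which is the property that makes this kind of rule attractive for combining EEG and fNIRS decisions.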

  • LI Bo, LI Ze-chao, XING Peng, TANG Jin-hui
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240733
    Online available: 2025-04-27

    Anomaly detection has been widely studied and applied to various visual scenes. Mainstream unsupervised anomaly detection schemes are currently based on distillation or reconstruction, but both have limitations. In distillation models, the student network usually inherits the strong representation ability of the teacher network and therefore fails to represent abnormal regions differently. In reconstruction models, the encoder-decoder can easily learn a restoration shortcut and recover features indiscriminately. To address these challenges, we propose 𝒩-Net, which combines the advantages of the two approaches and alleviates their limitations through the bidirectional distillation module and the multistage filtration mechanism. Specifically, in the teacher-student network, we first propose distilling adaptive domain features instead of original domain features, which ensures efficient alignment of normal adaptive domain features through bidirectional distillation branches. Then, we propose a multilevel filtering module that filters abnormal features through query and compression, further enhancing the ability to learn the distribution of normal semantic features and improving anomaly detection performance. Finally, extensive experiments are carried out on two benchmark anomaly detection datasets, MVTec and VisA. The results show that the proposed method achieves advanced performance in anomaly detection and localization tasks.

  • LI Yue-zhou, NIU Yu-zhen, LI Fu-sheng, KE Xiao, SHI Yi-qing
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240375
    Online available: 2025-04-26

    Images captured in low-light scenes are susceptible to multiple degradations such as darkness, noise, and blur, resulting in poor visibility and visual perception. Multi-degraded low-light image enhancement challenges existing image enhancement methods in two ways. On the one hand, low-light image enhancement or deblurring methods alone cannot handle all three types of degradation, and combining them is limited by increased computational cost and error accumulation. On the other hand, existing multi-degraded low-light image enhancement methods enhance brightness first and then remove blur, and this sequential processing increases the risk of losing feature cues and is not conducive to detail recovery. To cope with these challenges, this paper proposes the progressive edge-aware interactive enhancement network (PEIE-Net), which reduces the loss of feature details through a step-by-step enhancement process. Specifically, the network consists of an image enhancement branch and an edge information prediction branch. In each enhancement stage of the image enhancement branch, a self-attention modulation prediction module extracts global information, which is used for adaptive modulation in the channel modulation module and the multi-scale restoration module. In the edge information prediction branch, a spatial-frequency domain feature transformation module is developed to extract edge perceptual information. This information is used to predict the edges of high-quality images and is also fused with the image enhancement features, simulating the interaction between different perceptions within the human visual system. In addition, we propose a scene brightness estimation loss to coordinate the multiple progressive enhancement stages. Experiments on synthetic and real datasets demonstrate the effectiveness and advanced performance of our method for enhancing low-light, noisy, and blur-degraded images; the method can also be used for low-light image enhancement and super-resolution tasks.

  • XIE Li-xia, WEI Chen-yang, YANG Hong-yu, HU Ze, CHENG Xiang
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240746
    Online available: 2025-04-25

    Existing malware detection methods suffer from inadequate extraction of sample features, excessive reliance on domain expert knowledge, and operational behavior monitoring, significantly impacting detection and classification performance. To address these issues, we propose a malware detection method based on multidimensional dynamic weighted alpha image fusion and feature enhancement. Standardized sample sets are obtained through invalid sample cleaning and outlier processing. High-quality fused image samples are then generated using a three-channel image generation and multidimensional dynamic weighted alpha image fusion method. The puppet optimization algorithm is employed for data reconstruction to mitigate the impact of data class imbalance on detection results, and image enhancement is performed on the reconstructed data samples. A spatial attention enhancement network based on dual-branch feature extraction and fusion channel information representation is used to extract and enhance image and text features, thereby improving feature representation capabilities. The enhanced image and text features are fused using a weighted fusion method to achieve malware family detection and classification. Experimental results show that the proposed method achieves a malware detection classification accuracy of 99.72% on the BIG2015 dataset, representing an improvement of 0.22%~2.5% over existing detection methods.

  • GU Mei-ying, LI Hang, ZHANG Jia-wei, BAI Xiao, ZHENG Jin
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240699
    Online available: 2025-04-22

    As the cost of unmanned aerial vehicles (UAVs) decreases, they have attracted increasing research interest. UAVs are now widely applied in various fields, including agriculture, firefighting, surveying, aerial photography, and recreational applications. These applications require UAVs to perform autonomous flights with precise self-localization, typically relying heavily on global navigation satellite systems (GNSS). However, GNSS has multiple shortcomings related to long-distance radio communications, such as non-line-of-sight reception, multi-path effects, and spoofing. This has driven the development of new methods to supplement or replace satellite navigation. Vision-based UAV localization and navigation methods, utilizing onboard visual sensors for autonomous localization and navigation, have become crucial in addressing this issue. This review contributes to the field by systematically reviewing vision-based UAV localization and navigation technologies, providing a comprehensive summary of the current research landscape and developmental trends. First, it introduces vision-based UAV localization methods, which are categorized into image retrieval and feature matching approaches. The technical characteristics, applicable scenarios, relevant datasets, and evaluation metrics of these methods are analyzed in detail. Second, this review elaborates on vision-based UAV navigation methods, distinguishing between obstacle detection and avoidance techniques and path planning methods based on their functional objectives, while highlighting the strengths and limitations of existing technologies. Finally, this review further discusses the potential challenges faced by vision-based UAV localization and navigation methods, including the lack of publicly available datasets, the need for hardware acceleration, the complexity of operating environments, real-time processing requirements, energy constraints, and the gap between simulated and real-world environments.

  • WANG Jin-zhong, DAI Shun, ZHANG Xiu-wei, TIAN Xue-tao, XING Yin-hui, WANG Fang, YIN Han-lin, ZHANG Yan-ning
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240602
    Online available: 2025-04-21

    Unmanned aerial vehicle (UAV)-based multispectral object detection using both visible (RGB) and thermal infrared (T) images makes all-weather, all-day target monitoring possible and plays a critical role in military and civilian applications. However, because data acquisition and processing are complex, there is currently a lack of publicly available UAV-based RGB-T multispectral object detection datasets, which to some extent limits research and application. Meanwhile, UAV operational scenarios are complex and variable, with rapid changes in flight altitude, speed, focal length, and background. As a result, the captured targets exhibit diverse scales, uneven (dense/sparse) distributions, and category imbalance, which presents significant challenges for accurate detection. Furthermore, real-time performance must be guaranteed in applications such as reconnaissance and traffic monitoring. Balancing accuracy and speed is therefore key to the algorithmic design of a UAV RGB-T object detector. To address these issues, this paper introduces a large-scale UAV-based RGB-T multispectral dataset named UAV-RGBT, which spans seasons and day-night cycles and includes multiple categories and scales. Specifically, UAV-RGBT comprises 20 categories with 5 117 pairs of RGB-T images and over 110 000 annotations, which is conducive to advancing research on UAV-based multispectral object detection algorithms. Moreover, based on the YOLOv8n model, the UAV-based dual-branch multispectral object detection (UAV-DMDet) model is proposed to promote deep fusion of multispectral features through a multi-modal cross-attention fusion module and a multi-modal feature decomposition combination module. This approach achieves a better trade-off among model parameter size, detection speed, and accuracy. Experimental results demonstrate that the UAV-DMDet model improves the mAP@0.5 on the UAV-RGBT dataset by 3.61% and 11.03% in the visible and thermal modalities, respectively, and improves the mAP@0.5:0.95 by 0.84% and 6.76%, respectively. On the DroneVehicle dataset, the UAV-DMDet model outperforms the mainstream algorithm I2MDet, with mAP@0.5 and mAP@0.5:0.95 improvements of 2.66% and 12.36%, respectively. Furthermore, with 640×640 resolution images as input, the UAV-DMDet model achieves an FP32 inference speed of 31 frames per second on a GeForce RTX 3090 GPU and an FP16 inference speed of 58 frames per second on a Huawei Ascend 710 processor, making it effectively applicable to real-time UAV-based RGB-T multispectral object detection tasks.
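
For readers unfamiliar with the metrics, mAP@0.5 counts a detection as correct when its intersection over union (IoU) with a same-class ground-truth box is at least 0.5, and mAP@0.5:0.95 averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below computes IoU for axis-aligned boxes; the `(x1, y1, x2, y2)` box format and the function name are assumptions for this example.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_x1 = max(box_a[0], box_b[0])
    inter_y1 = max(box_a[1], box_b[1])
    inter_x2 = min(box_a[2], box_b[2])
    inter_y2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero for non-overlapping boxes.
    inter = max(0.0, inter_x2 - inter_x1) * max(0.0, inter_y2 - inter_y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    prediction = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    iou = box_iou(prediction, ground_truth)
    print(f"IoU = {iou:.3f}, match at 0.5: {iou >= 0.5}")
```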

  • HOU Jie, CHEN Xi, TAO Shi-fei, WANG Hao, HOU Zhi-yang, HU Nai-jie
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240687
    Online available: 2025-04-19

    To address the slow responsiveness and poor real-time performance of traditional spectrum resource allocation management in battlefield environments, a blockchain-enabled distributed spectrum allocation architecture is first established. Constraints such as spectrum satisfaction, conflicts, and priority are taken into consideration to formulate an optimization objective model. A distributed Gale-Shapley (D-GS) spectrum allocation algorithm based on matching theory is proposed, transforming battlefield spectrum allocation from static and centralized to dynamic and distributed and significantly improving spectrum allocation performance. By incorporating a greedy mechanism and a satisfaction threshold for spectrum units, combat units can notably enhance spectrum allocation satisfaction even when spectrum demands are submitted late. Simulation results demonstrate that, under limited spectrum resources, the proposed method ensures unit satisfaction and maximizes the real-time performance of distributed spectrum management. Compared with static centralized spectrum allocation methods, the time overhead is reduced by more than one order of magnitude. The algorithm exhibits superior time performance and spectrum allocation efficiency, leading to improved spectrum allocation in battlefield scenarios.
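
The matching-theory core can be illustrated with the textbook Gale-Shapley deferred-acceptance procedure, sketched below for a one-to-one matching between combat units and spectrum units. This is the classical centralized algorithm, not the paper's D-GS variant: the blockchain layer, greedy mechanism, and satisfaction threshold are omitted, and the unit and channel names and preference lists are invented for the example.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching (one-to-one).

    proposer_prefs / receiver_prefs map each participant to a list of
    the other side's members, most preferred first. Receivers are
    assumed to rank every proposer.
    """
    rank = {
        receiver: {proposer: i for i, proposer in enumerate(prefs)}
        for receiver, prefs in receiver_prefs.items()
    }
    next_choice = {proposer: 0 for proposer in proposer_prefs}
    holder = {}  # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        proposer = free.pop()
        if next_choice[proposer] >= len(proposer_prefs[proposer]):
            continue  # preference list exhausted, proposer stays unmatched
        receiver = proposer_prefs[proposer][next_choice[proposer]]
        next_choice[proposer] += 1
        current = holder.get(receiver)
        if current is None:
            holder[receiver] = proposer
        elif rank[receiver][proposer] < rank[receiver][current]:
            holder[receiver] = proposer
            free.append(current)  # displaced proposer proposes again
        else:
            free.append(proposer)  # rejected, tries the next choice
    return {proposer: receiver for receiver, proposer in holder.items()}


if __name__ == "__main__":
    unit_prefs = {
        "U1": ["C1", "C2", "C3"],
        "U2": ["C1", "C3", "C2"],
        "U3": ["C2", "C1", "C3"],
    }
    channel_prefs = {
        "C1": ["U2", "U1", "U3"],
        "C2": ["U1", "U3", "U2"],
        "C3": ["U3", "U2", "U1"],
    }
    print(gale_shapley(unit_prefs, channel_prefs))
```

The result is stable in the sense that no unit and channel both prefer each other over their assigned partners; in a real spectrum setting the preference lists would be derived from the satisfaction, conflict, and priority constraints mentioned in the abstract.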

  • JIANG Wei-jin, DU Xi-chen, JIANG Yi-rong, YANG Xuan, NIE Cai-yan, LIU Qian
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240946
    Online available: 2025-04-16

    With the rapid development of industrialization and urbanization, environmental monitoring is becoming increasingly important. However, traditional monitoring methods are limited by high cost, difficult deployment, and maintenance challenges, which makes comprehensive, real-time monitoring hard to achieve. Crowd sensing, an emerging environmental monitoring method, uses widely deployed smart devices and their integrated sensors for large-scale collection and real-time transmission of environmental data. However, existing studies seldom consider data privacy protection, workload balance, and system cost at the same time, which makes it difficult to achieve the expected results in practical applications. To solve this problem, this paper proposes a low-cost, high-efficiency crowd sensing method for environmental monitoring, the Adaptive Federated Learning based Crowd Sensing algorithm for Environmental Monitoring (AFL-CSEM). Specifically, we first consider the challenges of resource constraints, device heterogeneity, and non-independent and identically distributed (non-IID) data in the system, model the system by combining crowd sensing and federated learning techniques, and train the model locally on users' devices, sharing only the model parameters to protect data privacy. Next, a convergence analysis of the system is carried out, and convergence bounds of the federated-learning-based crowd sensing algorithm are obtained for non-IID data. Then, to reduce the impact of device heterogeneity, an adaptive control method is designed on the basis of the convergence analysis to dynamically adjust the local update frequency and batch size to suit the heterogeneous and dynamic monitoring environment. Experimental results on real datasets consistently demonstrate the effectiveness of the proposed algorithm: AFL-CSEM improves the efficiency and accuracy of model training while reducing computation and communication overhead and lowering the economic cost. It provides a novel and informative solution for environmental monitoring in resource-constrained edge computing environments.
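
The idea of training locally and sharing only model parameters can be sketched with a single federated averaging (FedAvg-style) aggregation step, in which the server averages client parameter vectors weighted by local sample counts. The abstract does not specify the aggregation rule, so FedAvg is an assumption here; the adaptive control of local update frequency and batch size is not modeled, and the numbers are invented.

```python
def federated_average(client_params, client_sample_counts):
    """Sample-count-weighted average of client parameter vectors.

    client_params is a list of equal-length lists of floats; only these
    parameters, never the raw sensing data, reach the server.
    """
    if not client_params or len(client_params) != len(client_sample_counts):
        raise ValueError("need one sample count per client")
    total = sum(client_sample_counts)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sample_counts)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical devices with 1 and 3 local samples respectively.
    local_models = [[1.0, 2.0], [3.0, 4.0]]
    sample_counts = [1, 3]
    print(federated_average(local_models, sample_counts))
```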

  • LI Yong-ming, LI Wen-zheng, ZHANG Xiao-heng, WANG Pin, HU Jie
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240562
    Online available: 2025-04-15

    In complex environments, short-term pedestrian trajectory prediction finds extensive applications in autonomous driving, social robotics, intelligent security, and smart city infrastructures. Interactions among pedestrians and between pedestrians and their environment exhibit multi-scale complexities and uncertainties, posing substantial challenges. Although current deep learning models are effective in uncovering complex pedestrian interactions, they typically assume uniform motion patterns across various scenes, thereby neglecting potential distributional discrepancies. While domain adaptation models partially address this issue, they often overlook the multi-level characteristics of pedestrian interactions and environmental influences. To address these challenges, this study proposes a pedestrian trajectory prediction model founded on hierarchical envelope domain adaptation. We design a local-level envelope sample construction module by establishing local-level pedestrian adjacency relationships. An individual-level envelope sample construction module is devised based on individual pedestrian relationships. These two modules are subsequently integrated to form a bi-level envelope sample construction module. Leveraging the bi-level envelope sample construction module, we compute the spatio-temporal feature distribution of all pedestrian trajectories to construct global-level envelope samples. Employing the attention mechanism and cross-domain distribution alignment, we design the local-level and global-level envelope domain adaptation modules, respectively. These modules are then integrated into a unified framework using a weighted prediction loss function, which is jointly optimized. Experiments use two representative public datasets and compare the proposed model with five representative models. Comprehensive validation is conducted through ablation studies, parameter analysis, method comparison, and trajectory visualization. The experimental results on the ETH and UCY datasets show that, compared with T-GNN, the average displacement error is reduced by 22.7% and the final displacement error is reduced by 19.8%. For the full version of the article, please refer to the link: https://github.com/LWZ9910/MESC-HEDA.git.
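
The two reported metrics have standard definitions: average displacement error (ADE) is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and final displacement error (FDE) is that distance at the last step. A minimal sketch for a single trajectory follows; the function name and the toy coordinates are assumptions for illustration.

```python
import math


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory of (x, y) points."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(errors) / len(errors), errors[-1]


if __name__ == "__main__":
    predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    true_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    ade, fde = ade_fde(predicted_path, true_path)
    print(f"ADE = {ade:.2f}, FDE = {fde:.2f}")
```

Benchmark figures are usually averaged over all pedestrians and scenes, so a 22.7% reduction means the averaged ADE is 0.773 times the baseline value.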

  • FANG Xin, CHEN Zhe, LIU Zhan-wen, LI Xiao-peng, SU Yu-xin
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240713
    Online available: 2025-04-15

    RGB and thermal infrared (RGBT) tracking is a multi-modal object tracking method that integrates different information from visible light and thermal infrared sensors. It aims to overcome the limitations of a single sensor under specific conditions and to increase the robustness and accuracy of object tracking by fusing data from multiple sensors. However, most current RGBT tracking methods directly fuse features extracted from thermal infrared and visible light images, ignoring the homogeneity and heterogeneity of the two modalities. In addition, RGBT tracking is often affected by multiple challenging factors such as fast object motion, scale variation, illumination variation, thermal crossover, and occlusion. Existing work often relies on a single model to solve all challenges simultaneously, which requires a highly complex model and extensive training data. This paper proposes a novel network called CMHHNet (facing different Challenges and combining Multi-modal Homogeneous and Heterogeneous information separation and integration Network) for RGBT tracking. In this network, a challenge-aware module is deployed in each layer of the backbone to fuse the visible light and thermal infrared features under each challenge separately and to adaptively aggregate the fused features across all challenges. In addition, an attention enhancement module and a multi-scale auxiliary module are added to strengthen the features extracted by the backbone network. Finally, according to the homogeneity and heterogeneity of thermal infrared and visible light, their unique and common features are extracted separately and adaptively fused. Extensive experiments on the GTOT, RGBT234, and LasHeR datasets demonstrate that the proposed tracker is highly competitive with existing RGBT tracking methods.

  • WU Hai-yang, YU Ning-mei
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240624
    Online available: 2025-04-15

    It is crucial to monitor ethanol in real time due to the safety risks posed by its high volatility and flammability. However, current methods for improving the performance of SnO2 ethanol sensors often hinder the miniaturization of devices. To address this, the paper designs an intrinsic SnO2 ethanol sensor with a field-effect transistor structure and employs magnetron sputtering to fabricate the sensitive film. The study systematically investigates the influence of gate voltage on the gas-sensing performance of the sensor. Experimental results indicate that the SnO2 sensor prepared by sputtering is an n-channel depletion-mode device. Gas-sensing tests reveal significant differences in the sensor’s response under different operating gate voltages: at a gate voltage of 10 V, the sensor current in 100 ppm ethanol changes by a factor of 2.40, while at a gate voltage of -30 V, the channel current change rises to a factor of 3.42, a 42% improvement over 10 V. Further investigation shows that the gas-sensing properties of SnO2 arise from the modulation of carrier concentration in the channel by the surface adsorption of ethanol molecules. This effect is significantly enhanced under negative gate voltage but suppressed under positive gate voltage. However, a positive gate voltage of 10 V induces more electrons in the channel, effectively accelerating the adsorption and desorption of ethanol. As a result, the sensor's response and recovery times to 100 ppm ethanol are reduced to 8 s and 17 s, respectively, demonstrating faster dynamic characteristics. The findings indicate that the degree and rate of the ethanol vapor reaction on the SnO2 surface are significantly regulated by the sensor’s gate voltage. This research provides a new approach for optimizing the gas-sensing performance of SnO2 sensors and contributes to advancing their application in miniaturized, fast-response, and high-precision gas sensing.

  • WANG Bing, YANG Yi-chuan, DING Huai, XIA Bi-jun, ZHU Lei, DENG Ming-long
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240474
    Online available: 2025-04-09

    This paper studies the amplitude-comparison direction-finding (DF) system of a monopulse radar in the presence of cooperative coherent dual sources. The authors describe the general model of the amplitude-comparison DF system and analyze three typical cases. Then, the theoretical errors of the estimated angle caused by the system errors and the synchronous phase difference error, together with their accuracies, are derived. Finally, numerical simulations are provided to analyze the influence of the signal parameters on the estimated angle, including the synchronous phase difference, the power ratio, and the signal angles. The errors of the estimated angle and their accuracies are also provided. The results show that the estimated angle lies between the angles of the two sources; that the smaller the errors, the more accurate the error models of the estimated angle; and that the higher the working frequency, the stricter the time synchronization requirement. This paper fills some gaps in DF theory for the dual-source condition and provides guidance for performance analysis and scheme optimization of DF systems.

  • WU Qi, WANG Zi-tong, ZHANG Dong-liang, XIA Si-yu, FAN Wen-qi, CHEN Yi-long
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240933
    Online available: 2025-04-02

    The measurement and control of advanced air vehicles requires multiple functions such as telemetry, remote control, communication, and tracking. Traditionally, a measurement and control system is composed of multiple wireless transceiver systems and discrete antennas, and the conflict between its volume, weight, cost, and installation requirements and the limited resources of the air vehicle is becoming increasingly prominent. Antenna aperture synthesis enables a single multi-functional antenna aperture to perform the functions of multiple dedicated antenna apertures. This greatly reduces the number of antenna apertures and significantly eases the pressure on antenna aperture layout on the air vehicle platform, offering a new way to enhance system-level electromagnetic compatibility. This paper systematically elaborates on the technical route of antenna aperture synthesis for air vehicle measurement and control communication. It focuses on multi-band and multi-polarization antenna technology for the synthesis of multiple discrete antennas, diplexer antenna technology for the synthesis of transmitting and receiving antennas, shared-aperture antenna technology for the synthesis of multiple antennas in the same aperture, and coupling suppression technology for the integration of same-frequency antenna arrays. In addition, based on the working characteristics of software-defined radio systems, it analyzes the advantages and feasibility of applying software-defined radio in air vehicle measurement and control communication systems. Finally, this paper discusses the outlook for antenna aperture synthesis technology in air vehicle measurement and control communication systems and suggests possible development directions.

  • CHAI Rong, WANG Bing-yan, SUN Rui-jin, JING Xiao-rong
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240656
    Online available: 2025-04-02

    This paper addresses the scenario of multi-antenna unmanned aerial vehicle(UAV)-assisted sensing and communication. Target perception and user communication performance are jointly considered, and the system cost function is defined as the weighted sum of system energy consumption and user data rate. The joint design of the communication and sensing scheduling strategy, perception precoding, and UAV flight trajectory is formulated as a constrained system cost function optimization problem. Because this problem is highly coupled and non-convex, it is challenging to solve directly. To tackle it, the formulated problem is decomposed into three subproblems, i.e., a UAV flight trajectory optimization subproblem, a communication and sensing scheduling subproblem, and a radar perception precoding subproblem, and an iterative nested method is proposed to solve them. For the UAV flight trajectory optimization subproblem, a Markov decision process(MDP) is modeled, and a UAV trajectory optimization algorithm is designed based on double deep Q-networks. Given the state of the MDP model, the communication and sensing scheduling subproblem is solved using Lagrange dual transformation and quadratic transformation methods, and the radar perception precoding subproblem is addressed by applying equivalent transformations, i.e., introducing auxiliary variables and converting optimization constraints. Based on the obtained communication and sensing scheduling strategy and radar perception precoding, the reward function of the MDP model is updated and the UAV flight trajectory is determined, so as to achieve the joint optimization of communication and sensing scheduling, perception precoding design, and UAV flight trajectory. The effectiveness of the proposed algorithm is verified through simulations.
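
    The trajectory subproblem above relies on double deep Q-networks. As a minimal sketch of the standard double DQN target only (not the authors' reward or state design), the online network picks the next action and the target network scores it. The function name, arguments, and the plain lists standing in for network outputs are hypothetical.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard double DQN bootstrap target for one transition.

    next_q_online / next_q_target are assumed to hold the Q-values of every
    action in the next state, as produced by the online and target networks.
    """
    if done:
        # Terminal transition: no bootstrapping.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]
```

    Decoupling selection from evaluation is what distinguishes double DQN from plain DQN, which would take the maximum over the target network's own values.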

  • BIE Meng-ni, LI Wei, FU Qiu-xing, CHEN Tao, DU Yi-ran, NAN Long-mei
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20241036
    Online available: 2025-03-31

    Given the rapid evolution of post-quantum cryptography and the need for both flexibility and efficiency, we propose a parallel reconfigurable sampling accelerator for various lattice-based post-quantum cryptographic algorithms. We analyze seven sampling processes involved in lattice-based post-quantum cryptography and, based on mathematical derivations, propose an efficient parallel implementation model for each of them. We then extract four common operational logics from these models. With these four common operational logics as the core, we introduce data rearrangement to limit the effective bit width of the operation data, which improves the acceptance rate of rejection sampling and eliminates the complex modular reduction operations in finite field operations. On this basis, we propose a highly energy-efficient reconfigurable parallel sampling algorithm. To enhance the hardware implementation efficiency of the sampling algorithm, we adopt a butterfly transform network to complete the parallel splitting, merging, and lookup of data with any effective bit width within a single clock cycle, efficiently parallelizing the algorithm's pre- and post-processing, and construct a parameterized parallel reconfigurable sampling accelerator architecture model. Aiming for high energy efficiency and guided by logic synthesis results, we determine the optimal parallel degree parameters of the architecture model and propose a parallel reconfigurable sampling accelerator with a data bandwidth of 1 024 bits. Experimental results show that, using a 40 nm CMOS process library and performing post-simulation under the ss, 125 ℃ process corner, the circuit reaches a maximum operating frequency of 667 MHz with an average power consumption of 0.54 W. A 256-point uniform sampling takes 6 ns, a 256-point rejection sampling with a rejection value less than 216 takes only 22.5 ns on average, a 256-point binary sampling within 8 bits takes 18 ns, a 509-point simple ternary sampling takes 36 ns, a 701-point non-negative correlated ternary sampling takes 124.5 ns, a 509-point fixed-weight ternary sampling takes 11.18 μs, and one discrete Gaussian sampling in the Falcon algorithm takes 3 ns. Compared with existing research, the proposed sampler reduces the energy consumption of a uniform-rejection sampling by about 30.23% and that of a binary sampling by about 31.6%.
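
    The link between effective bit width and the acceptance rate of rejection sampling can be illustrated in software. The sketch below is not the accelerator's datapath: it draws candidates of a chosen bit width and rejects those not below the modulus, so the acceptance rate is q / 2^bits. The modulus 3329 used in the usage note is an assumed example value, and all function names are hypothetical.

```python
import secrets
from typing import Callable, List, Optional


def acceptance_rate(q: int, bits: int) -> float:
    """Probability that a uniform `bits`-bit candidate is below q."""
    return q / (1 << bits)


def sample_uniform_mod_q(
    q: int,
    count: int,
    bits: Optional[int] = None,
    rand_bits: Callable[[int], int] = secrets.randbits,
) -> List[int]:
    """Draw `count` uniform values in [0, q) by rejection sampling."""
    if bits is None:
        # Smallest width that can represent every value in [0, q).
        bits = (q - 1).bit_length()
    samples: List[int] = []
    while len(samples) < count:
        candidate = rand_bits(bits)
        if candidate < q:  # accept in-range candidates, reject the rest
            samples.append(candidate)
    return samples
```

    For example, with the assumed modulus q = 3329, 12-bit candidates are accepted with probability 3329/4096 (about 0.81), whereas 16-bit candidates would be accepted only about 5% of the time, and no modular reduction is needed in either case.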

  • ZHANG Peng-fei, ZHAI Rui-chen, CHENG Xiang, ZHANG Zhi-kun, LIU Xi-meng, WANG Jie
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240938
    Online available: 2025-03-29

    In spatial crowdsourcing, task allocation is a crucial prerequisite for subsequent location-aware data collection. To tackle potential location privacy breaches, researchers often adopt geo-indistinguishability (Geo-I). Existing approaches that satisfy Geo-I are often designed for one-to-one scenarios, implicitly assume that workers can perform any task, and focus on minimizing the average travel distance rather than maximizing the number of allocated tasks. Furthermore, these studies often use the planar Laplacian (PL) mechanism to achieve Geo-I. However, because PL noise is random and unbounded, it can add excessive noise to the location data uploaded by workers, significantly degrading the utility of task allocation and leading to either long travel distances or unassigned tasks. To address these problems, we propose MONITOR (Many-to-many task allOcation under geo-iNdIsTinguishability for spatial crOwdsouRcing), a new privacy-preserving task allocation approach for many-to-many scenarios. The general idea of MONITOR is to upload the distances from each worker’s true location to the obfuscated locations of the worker's preferred tasks, instead of uploading each worker’s obfuscated location. In MONITOR, to collect the distances for subsequent task allocation, we design an obfuscated distance collection method called GroCol (Group-based obfuscated distance Collection). To improve the utility of task allocation, we develop a parameter-independent obfuscated distance comparison method called ParCom (Parameter-free obfuscated distance Comparison). To illustrate the effectiveness of MONITOR, we first theoretically analyze its privacy guarantee, task utility, and computational complexity. We then empirically show on two real-world datasets and one synthetic dataset that MONITOR achieves a number of assigned tasks similar to that of non-private task allocation and reduces the average travel distance by more than 20% compared to the baseline approaches.
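
    The unbounded noise of the planar Laplacian mechanism mentioned above can be sketched as follows. This illustrates the baseline mechanism only, not MONITOR, GroCol, or ParCom. It relies on the fact that the noise radius of the planar Laplace density follows a Gamma distribution with shape 2 and scale 1/epsilon. The function names and the flat (x, y) coordinate model are assumptions.

```python
import math
import random
from typing import Tuple


def planar_laplace_noise(epsilon: float, rng: random.Random) -> Tuple[float, float]:
    """Sample a 2D noise vector with density proportional to exp(-epsilon * r)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)          # direction is uniform
    radius = rng.gammavariate(2.0, 1.0 / epsilon)    # radial pdf: eps^2 * r * exp(-eps * r)
    return radius * math.cos(theta), radius * math.sin(theta)


def obfuscate(location: Tuple[float, float], epsilon: float,
              rng: random.Random) -> Tuple[float, float]:
    """Return a perturbed copy of `location` (planar coordinates assumed)."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return location[0] + dx, location[1] + dy
```

    The expected displacement is 2/epsilon, but individual samples can land much farther away, which is the source of the utility loss described in the abstract.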

  • ZHOU Tao, NIU Yu-xia, YE Xin-yu, LIU Long, LU Hui-ling
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240642
    Online available: 2025-03-25

    Recognition of lung tumors in 3D multimodal positron emission tomography/computed tomography(PET/CT) images using deep learning is an important research area. In medical images of lung tumors, the spatial shape of lesions is irregular and the boundary between the lesions and the surrounding tissues is blurred, which makes it difficult for the model to fully extract tumor features, and the computational complexity of the model is higher in three-dimensional tasks. To solve the above problems, a cross-modal Light-3Dformer 3D lung tumor recognition model is proposed in this paper. The main contributions of this paper are as follows. Firstly, the backbone network extracts PET/CT image features, and the auxiliary network extracts PET image features and CT image features; multi-modal feature enhancement and interactive learning are realized by lightweight cross-modal collaborative attention. Secondly, the Light-3Dformer module is designed. In this module, the two matrix multiplication operations of the Transformer are replaced by the linear element-wise multiplication operation of Lightformer; a cascaded Lightformer structure is designed, in which the output feature map of the cascade and the initial input feature map are fused, so that parallel fusion of deep and shallow features yields a lightweight model with rich gradient information; and a parameter-free attention is designed, which enhances lung tumor feature extraction from three aspects: channel, space, and tomographic image. Thirdly, a lightweight cross-modal collaborative attention module(LCCAM) is designed, which can fully learn the cross-modal advantage information of 3D multi-modal images and carry out interactive learning of deep and shallow features. Finally, ablation experiments and comparative experiments are carried out. On the self-built 3D multi-modal lung tumor dataset, the accuracy and area under the curve(AUC) of the model are 90.19% and 89.81%, respectively, under the premise of optimal computation and running time. Compared with the 3D-SwinTransformer-S model, the computation quantity is reduced by a factor of 117 and the calculation quantity by a factor of 400. The experimental results show that the model can better extract multi-modal information of lung tumor lesions, which provides a new idea for lightweight and multi-modal interaction of deep learning 3D models.

  • LIU Qi-hang, LEI Qian-qian, XIONG Jian-hui, ZHANG Xu-dong
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240241
    Online available: 2025-03-25

    To solve the compatibility problem of multi-band on a single chip in the RF front-end of the receiver, this paper proposes a new bandwidth-reconfigurable low noise amplifier(LNA) structure for UWB applications. This LNA is based on switchable reconfigurable design methods, embedding the switchable design in the load of the cascaded LNA circuit. The design achieves switching of in-band input impedance matching and gain curves for different UWB operating modes by controlling the position of low-frequency impedance resonance point and corresponding gain pole through the reconfigurable design of the load inductance of the resistive parallel negative feedback structure. Compared with the design methods of introducing switches in the input/output matching path, placing switches at the load optimizes gain and noise performance without affecting impedance matching. The resistors and inductors in the traditional inductive peaking technique are adjustable to consider gain flatness within different operating bandwidths. Based on SMIC 28 nm CMOS technology, the simulation results of electromagnetic modeling demonstrate that the LNA operates in three modes: 3.1~10.6 GHz, 6~10.6 GHz, and 3.1~5 GHz, with in-band voltage gain(S21) above 16.59 dB and minimum noise figure below 3 dB. Under 0.8 V power supply voltage, all three modes exhibit input and output matching(S11, S22) below -10 dB, with a static power consumption of only 9.03 mW; after introducing MOS switches, the noise figure degradation of the LNA in all three bandwidths is less than 0.2 dB.

  • ZHAO Wen-xu, WANG Jia-xiang, HU Zhi-yuan, SU Kai-ming, LI Tie, YANG Zhuo-qing
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240975
    Online available: 2025-03-24

    Breathing, as a crucial physiological process for sustaining life, is closely related to various respiratory diseases such as sleep apnea and asthma. To meet the increasing demand for health monitoring, this paper innovatively proposes a dual-channel wearable MEMS(Micro-Electro-Mechanical Systems) respiratory monitoring microsystem integrated with a flexible nasal expander. This microsystem incorporates a flexible nasal expander, a respiratory sensor, and a signal processing module, enabling continuous real-time monitoring of airflow within the nasal cavity. The sensor’s sensitive element adopts a folded metal resistor structure, deposited on a glass substrate through planar MEMS technology, utilizing the thermoresistive effect to achieve signal measurement. When embedded in the flexible nasal expander, the sensor can simultaneously monitor breathing signals from both sides of the nasal cavity, making it especially suitable for long-term continuous monitoring. Signal simulation and performance testing results demonstrate that the sensor exhibits excellent sensitivity, response speed, and anti-interference capability. In tests simulating respiratory conditions such as sleep apnea and asthma, the sensor accurately differentiates between normal and abnormal breathing patterns, supporting further analysis of various respiratory diseases. Based on this, the paper develops a dual-channel wearable MEMS respiratory monitoring device integrated with a flexible nasal expander, aimed at continuous, real-time, and long-term respiratory monitoring, particularly suitable for abnormal breathing screening and health monitoring during sleep. Additionally, this system captures changes in the nasal cycle, providing new data dimensions for in-depth analysis of breathing patterns and physiological rhythms, highlighting its potential application value in long-term health management.

  • Ning XI, ZHOU Xiao-Lin, SUN Cong, LI Qiao-Yang, MA Jian-Feng, GUO Xin-Yu
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240890
    Online available: 2025-03-19

    As typical cyber-physical system (CPS) equipment, UAVs are easy to use, place low requirements on the working environment, and are highly flexible, and they have been widely used in agriculture, industry, the military, and other fields. The flight control system is the core basic service of a UAV, ensuring the effective implementation of applications such as telemetry perception, communication coverage, surveying, mapping, and disaster relief. However, the changeable physical environment and complex functional structure make it easy to introduce various software security problems during the development of the UAV flight control system, resulting in serious consequences such as hijacking, crashing, and loss of control of the UAV. Detecting the security of UAV flight control software has therefore become very important. Most existing UAV anomaly detection technologies rely on inputs constructed in the digital world, which makes it difficult to find UAV logic security problems in time. This paper therefore proposes a security detection method for UAV flight control software that supports physical interaction and combines static analysis, dynamic analysis, and fuzz testing to test the security of UAV flight control software. The results show that the method can test the safety of UAV flight control tasks with a high coverage rate of 97%, and UAV feature data are extracted according to the test results. Based on the feature data, a machine learning method is used to train a double anomaly detection model. In comparisons with existing detection methods on multiple datasets, the proposed method finds abnormal UAV conditions with an accuracy of 97.5% and effectively detects known safety problems in the UAV flight control software system.

  • LI Jun-yi, XING Li-juan, LI Zhuo
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240496
    Online available: 2025-03-19

    To reduce the latency of the parity-check successive cancellation list(PC-SCL) decoding algorithm for parity-check polar codes, a fast parity-check successive cancellation list(Fast-PC-SCL) decoding algorithm is proposed. Firstly, two types of special nodes in parity-check polar(PC-Polar) codes are analyzed: PC-Repetition(PC-REP) nodes and PC-single parity-check(PC-SPC) nodes. It is theoretically proved that PC-REP nodes exhibit cyclic repetition of codeword sequences, while PC-SPC nodes have the property that the sum of the codeword bits equals a specific value. Secondly, based on these properties, codeword list estimation methods are given for these two types of nodes. This enables the decoding of these nodes to be executed in parallel, significantly reducing decoding latency. Finally, by combining the codeword list estimation methods for these two types of nodes, the Fast-PC-SCL decoding algorithm is presented. This algorithm can decode without completely traversing the successive cancellation(SC) decoding tree, while fully retaining the effect of PC bit checks. Compared to the PC-SCL algorithm, it significantly reduces decoding latency without sacrificing performance. Experimental data show that it can reduce latency by up to 55.13%.
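
    The node properties above generalize two well-known facts about plain polar codes, which the following sketch checks numerically. It covers ordinary REP and SPC nodes only and does not reproduce the PC-REP and PC-SPC variants or the list estimation of the paper. The transform is the Kronecker-power polar transform without bit reversal, and the function names are hypothetical.

```python
from typing import List


def polar_transform(u: List[int]) -> List[int]:
    """Compute x = u * F^{(tensor) n} over GF(2), with F = [[1, 0], [1, 1]]."""
    x = list(u)
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # butterfly: (a, b) -> (a xor b, b)
        step *= 2
    return x


def parity(bits: List[int]) -> int:
    """XOR of all bits."""
    result = 0
    for b in bits:
        result ^= b
    return result
```

    With only the last input bit carrying information (a REP node), the codeword is that bit repeated in every position. For any input, the parity of the codeword equals the first input bit, so an SPC node whose first bit is fixed always has a known codeword sum. These constraints are what allow such nodes to be decoded at the codeword level instead of bit by bit.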

  • LU Xiangkui, WU Jun
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240783
    Online available: 2025-03-17

    To protect user privacy, many platforms offer anonymous login options, limiting recommendation systems to accessing only user behavior records within the current session, thereby leading to the development of session-based recommendation(SBR). Existing SBR approaches mainly follow the traditional paradigms of non-anonymous user behavior modeling, focusing on learning session representations through sequential modeling. However, when sessions are short, the performance of these techniques drops significantly, making it challenging to address real-world SBR scenarios dominated by short sessions. To this end, we propose a method called counterfactual inference by frequent pattern guided long sequence generation (CLSG), which aims to answer the counterfactual question: “what would be the model’s prediction if the session contained richer interactions?” CLSG follows the classical three-stage counterfactual inference process of “induction-action-prediction”. The induction stage constructs a frequent pattern knowledge base from the observed session set. The action stage generates counterfactual long sessions with the guide of the knowledge base. The prediction stage measures the discrepancy between the predictions of the observed and counterfactual sessions, and incorporates such discrepancy as a regularization term into the objective function to achieve representation consistency. Notably, CLSG is model-agnostic and can be easily applied to enhancing current SBR models. Experimental results on three benchmark datasets demonstrate that CLSG significantly improves the recommendation performance of five existing SBR models, with an average improvement of 6% in terms of both hit rate (HR) and mean reciprocal rank (MRR) metrics.

  • ZHANG An-ran, WANG Xing-fen, ZHAO Yu-han, LI Li-bo
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240767
    Online available: 2025-03-14

    To address the significant time overhead and free-rider effect in most GNN-based community search methods, this paper proposes an efficient community search based on graph combinatorial optimization(CS-ROMF). CS-ROMF designs a GNN-based community locator to quickly pinpoint potential communities of the query nodes, thereby reducing time overhead. On this basis, CS-ROMF further designs an RL-based community optimizer to adjust the structure of candidate communities, mitigating the free-rider effect. Experiments conducted on five real-world datasets with true communities demonstrate that CS-ROMF outperforms advanced community search methods across all evaluation metrics. Specifically, compared to the best baseline model, CS-ROMF achieves maximum improvements of 14.99%, 20.67%, and 21.37% in F1 score, Jaccard score, and NMI, respectively. Additionally, CS-ROMF can significantly improve search efficiency, running up to 10 times faster than the baseline model based on GNN.

  • QIN Jia-qi, JIANG Ze-tao, LEI Xiao-chun
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240648
    Online available: 2025-03-14

    Images obtained in low light environments often have low brightness, low contrast, and uneven lighting, resulting in weakened and blurred image features that are difficult to extract. At the same time, the limited features that can be extracted contain a large amount of noise, making it difficult to detect and recognize objects. As a result, there has been very little work on low light object detection. This paper proposes a low illumination object detection method based on the Illumination Correction and Feature Interaction Enhancement (ICFIE-YOLO) network to address the difficulty of extracting features from low illumination objects and the large noise in the feature space. This method first uses the proposed Multi Scale Illumination Correction Network (MSICN) inside ICFIE-YOLO to correct low illumination images, highlighting the blurry features of objects hidden in the image’s background and enabling the feature extraction module to better extract object features. Secondly, to fully utilize effective feature information and filter out noise interference in feature maps, a Feature Interacted Enhancement (FIE) detection head is proposed. Through feature attention interaction, it establishes spatial and semantic correlations between features in different regions of low illumination images, thereby suppressing the interference of noise on effective features and achieving feature enhancement. Finally, on the basis of the enhanced and denoised features, an improved detection head is used to achieve high-precision object detection. Experiments on the ExDark and DarkFace datasets show that the proposed method improves mAP by over 2.1% compared to mainstream object detection models, increases recall by over 4.2% compared to existing low light object detection methods, and improves recall by 2.6% compared to baseline models. The proposed method has good generalization performance.

  • ZHENG Hang, SHI Zhi-guo, WANG Yong, ZHOU Cheng-wei
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240504
    Online available: 2025-03-14

    With the continuous construction of new information infrastructures, multi-dimensional array signal processing plays a fundamental role in the fields of radar, wireless communication, remote sensing, and so on. Multi-dimensional array signals contain rich spatial/temporal/frequency/polarization parametric information, offering great economic and social value. To deal with the structural information loss inherent in traditional vector/matrix models, tensor algebra has been adopted to effectively retrieve multi-dimensional signal features. However, as the dimension of signals increases, the volume of tensor signals sampled according to the Nyquist sampling theorem expands exponentially. Meanwhile, the computation resources of the system are approaching their physical limit, resulting in computational overload and high latency. To address these issues, sparse sensing theory has been developed to exploit the spatial sparsity of signals for sub-Nyquist processing. The extension from one-dimensional sparse sensing to multi-dimensional sparse sensing is a promising solution for efficient tensor signal processing. Moreover, by imposing a structured sparse sensing paradigm such as coprime or nested sensing, the performance of the system can be enhanced via augmented coarray signal processing. Thus, to pursue economical multi-dimensional array signal processing, this paper is devoted to research on structured sparse tensor signal processing for sensor arrays. In particular, the paper introduces the statistical theory of sub-Nyquist tensor signals. By deriving the augmented coarray tensor model and devising the corresponding strategy of source identifiability enhancement, this theory facilitates Nyquist matching in the virtual domain and underdetermined parameter estimation. Based upon this theory, this paper introduces a coarray tensor completion algorithm for sparse array DOA estimation, which exploits the full information of the discontinuous virtual array to achieve high accuracy and resolution. This paper also introduces a coprime tensor weights optimization algorithm for sparse array beamforming, which yields a beampattern with a sharper mainlobe and lower sidelobes and increases the output signal-to-interference-plus-noise ratio. Furthermore, this paper introduces a resource-efficient tensorized neural network for robust sparse tensor signal processing, which compensates for the performance deterioration of model-driven methods in non-ideal conditions by efficiently learning tensor signal features.

  • MA Ling, YANG Xiao-chun, WANG Bin, SONG Xiao-shi, LI Fa-ming
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240686
    Online available: 2025-03-14

    With the advancement of modern communication and information technology, intelligent transportation systems(ITS) have emerged as a prominent area of research. The vehicular ad hoc network(VANET), serving as its pivotal technology, plays a crucial role in facilitating real-time road information sharing and inter-vehicle communication. However, existing clustering algorithms for VANET are plagued by issues such as low stability and high overhead. To address these challenges, this paper proposes a VANET clustering algorithm that leverages end-cloud collaboration. In the end-cloud collaboration phase, vehicles upload their feature data to the cloud via road side units(RSU), where the cloud performs dynamic stability classification based on changes in vehicle features. Nodes exhibiting stable behavior demonstrate higher reliability and longer connection durations. In the end-to-end coordination phase, factors including relative node mobility and cluster coverage are taken into account during cluster-head election to streamline the process while enhancing cluster stability. Furthermore, this paper introduces a neighbor discovery and update mechanism aimed at restricting HELLO message forwarding operations to reduce overhead and optimize resource utilization. Experimental results demonstrate that the proposed algorithm surpasses baseline algorithms across key performance metrics such as cluster stability, quantity of clusters formed, and clustering costs—highlighting its potential applicability in real-world traffic scenarios.

  • XU Xin, GAN Zhi-gang, YE Tian-yu
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240193
    Online available: 2025-03-11

    This paper proposes a semiquantum private comparison (SQPC) protocol based on entanglement swapping of a GHZ-like state and a Bell state, which allows classical participants to compare the equality of their secret messages with the help of a semi-honest third party (TP). TP is allowed to misbehave but cannot collude with anyone else. This paper provides a detailed proof of the protocol’s complete robustness against attacks by external eavesdroppers and analyzes its security against dishonest internal participants. This paper also presents simulations of the protocol's flow and output correctness using IBM’s Qiskit. The analysis confirms the security of the proposed protocol and shows that it can effectively prevent various kinds of attacks.

  • LU Qi-peng, LIU Ya-li, LIU Chang-geng, ZENG Cong-ai, CHEN Dong-dong, NING Jian-ting
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240111
    Online available: 2025-03-11

    Transferring products to an entity that is not trusted by the administrator may lead to problems such as product counterfeiting, smuggling, product loss, and privacy leakage. Therefore, this paper proposes BPOTS, a blockchain-based product transfer scheme for RFID-enabled supply chains. Firstly, this paper proposes a secret value sharing and verification algorithm based on the Chinese remainder theorem and the Pedersen commitment to achieve the transfer of products between designated sets of new owners. To improve system efficiency, we further propose a method for transferring products in batches based on the homomorphism of the Pedersen commitment. Secondly, to balance the transparency and privacy of the supply chain, this paper proposes a pseudo ID generation algorithm based on symmetric encryption. Thirdly, security analysis and performance evaluation are conducted on the BPOTS scheme. The results show that BPOTS effectively strikes a balance between the transparency and privacy of the supply chain and improves product transfer efficiency by about 12 times compared with existing product ownership transfer schemes. Finally, the BPOTS scheme is implemented on the ChainMaker platform and made available as open source on GitHub. The test results indicate that the efficiency of transferring products in the BPOTS scheme is about 70.4% higher than that of transferring products serially. Moreover, the BPOTS scheme effectively reduces the costs of supply chain nodes.
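
    The additive homomorphism of the Pedersen commitment that the batch transfer relies on can be illustrated with a toy example. This is a sketch of the general property only, not the BPOTS algorithm. The tiny group parameters (P = 23, Q = 11, G = 4, H = 9) are assumptions chosen so the numbers are easy to check, and they are far too small to be secure. In practice the discrete logarithm of H with respect to G must be unknown to the committer.

```python
# Toy parameters (assumed, insecure): G and H both have prime order Q modulo P.
P = 23   # prime modulus
Q = 11   # prime order of the subgroup, Q divides P - 1
G = 4    # generator of the order-Q subgroup
H = 9    # second generator of the same subgroup


def commit(message: int, blinding: int) -> int:
    """Pedersen commitment C = G^message * H^blinding mod P."""
    return (pow(G, message % Q, P) * pow(H, blinding % Q, P)) % P


def combine(c1: int, c2: int) -> int:
    """Multiply two commitments; the result commits to the sum of the messages."""
    return (c1 * c2) % P
```

    Multiplying commit(m1, r1) by commit(m2, r2) gives commit(m1 + m2, r1 + r2), so several committed values can be aggregated and verified with one check instead of one check per value.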

  • ZHU Song-hao, TAN Shao-han
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240816
    Online available: 2025-03-06

    Lane detection, as the cornerstone of intelligent driving systems, plays a crucial role in driver assistance functions such as lane keeping and adaptive cruise control. Given its importance in improving road safety and promoting the development of intelligent driving and intelligent transportation, research on lane detection technology has profound academic value and application prospects. However, due to the diversity of lane categories, road conditions, and weather environments, as well as the different aspect ratios of lane lines, lane detection algorithms face many challenges. This paper proposes a multi-dimensional feature refinement method for complex scene lane detection based on start point guidance. Firstly, a global feature optimization module is used to enhance the global feature representation capability, and a lane line perception aggregation module is used to enhance the correlation of local features, which helps to improve the semantic understanding ability and prediction accuracy of the model. Secondly, a starting point coordinate prediction module is used to predict the starting point coordinates of lane lines and generate flexible anchors under various complex scenarios. Finally, a more general penalty lane intersection to union ratio is selected as the loss function to represent lane lines with variable virtual widths, which helps to improve the detection accuracy of the model. Compared with the CLRNet-DLA34 method, which has the highest accuracy among current lane detection algorithms, the proposed method improves the detection accuracy in terms of F1@50 on the CULane and CurveLanes datasets by 0.62% and 0.73%, respectively, reaching 81.39% and 86.83%. The experimental results demonstrate that the proposed method achieves good detection performance in complex scene lane detection tasks and has strong competitiveness among existing methods.

  • GAO Yun-long, SHI Shu-guang, ZHAO Zhi-xiang, CAO Chao, PAN Jin-yan
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240682
    Online available: 2025-03-06

    Due to the curse of dimensionality, effectively discarding redundant features while retaining critical information in high-dimensional data has become a key issue. Unsupervised feature selection, which performs dimensionality reduction without any prior class information, has attracted increasing attention. However, existing unsupervised feature selection methods ignore two common issues: (1) fuzziness is a common characteristic of data, but most existing methods based on regularized regression ignore it, resulting in suboptimal feature subsets; (2) most methods fail to effectively distinguish between normal and noisy samples and are susceptible to noise. To tackle these issues, robust unsupervised feature selection with double fuzzy(DFRFS) learning is proposed. Specifically, DFRFS learning introduces fuzzy membership into unsupervised feature selection based on regularized regression, allowing data to be shared among multiple clusters and thereby better reflecting the complex structure and uncertainty of the data. Additionally, DFRFS learning assigns different weights to samples through a robust weight learning framework, thus suppressing the impact of noise while retaining the effect of normal samples. Experiments on toy and real-world datasets demonstrate the effectiveness of the proposed DFRFS learning method.

  • LI Pu-fei, WANG Pin, LI Yong-ming, ZHANG Jin-hua, YAN Fang
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240793
    Online available: 2025-03-04

    Unsupervised domain adaptation (UDA) is a significant research area that aims to transfer knowledge from source data, which is well-labeled but distributed differently, to unlabeled target data. Existing methods consider only aligning the distributions of the original source and target domain samples, and suffer from the large difference between these distributions. In recent years, semantic-based UDA has integrated category information on this basis. However, the category information is too coarse and cannot fully reflect the distributions of the source and target domains. To solve this problem, a joint hierarchical granularity envelope (JHGE) discriminative feature learning approach is proposed, which integrates information from original sample pairs, categories, and granular envelopes at three levels. This method can reflect the distribution from coarse to fine. Firstly, the "knowledge pyramid" theory is incorporated into the UDA framework to realize multi-sample granularity-based semantic representation. In addition, granular envelopes are created to connect the original samples with class centers, establishing three layers of sample granularity, which replace the single layer of original samples in existing UDA methods. Secondly, an iterative clustering approach is developed to uncover associative information between samples, generating granular envelopes between original samples and class centers. This three-layer sample granularity enriches the informative content of existing UDA methods. Thirdly, a bagging ensemble mode is implemented to integrate the three-layer granularity spaces. The different layers of granularity are weighted to ensure satisfactory accuracy. Experimental results on benchmark datasets demonstrate that this method effectively reduces differences across domains and outperforms state-of-the-art domain adaptation methods.

  • GAO Ning, LI Yu-rong, CHEN Hong, CHEN Wen-sheng, JIA Zi-hao
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240998
    Online available: 2025-03-04

    Atrial fibrillation (AF) is a common arrhythmia often associated with cardiovascular diseases such as stroke and heart failure. Although numerous researchers have made substantial progress in AF detection using deep learning methods in recent years, most of these methods require extensive computational resources. Moreover, the clinical application of these models is challenging due to the black-box nature of deep learning models. Therefore, this paper proposes a lightweight AF detection model based on feature fusion and conducts an interpretability study. The model comprises an electrocardiogram (ECG) backbone network and an R-R interval (RRI) branch. The ECG backbone network uses depthwise separable convolutions along with a few standard convolutions to extract deep morphological features of the ECG signals, while the RRI branch employs multi-scale convolutions to extract deep rhythm features of the RRI. The network learns robust feature representations by fusing morphological features and rhythm features to detect AF accurately. For interpretability analysis, Grad-CAM++ is utilized to visualize the contribution of different features to the classification results. Training and internal testing are conducted on the LTAFDB, achieving an accuracy of 97.99%. To validate the generalization performance of the model, external testing experiments are conducted on the AFDB and the CPSC2021, achieving accuracies of 95.17% and 93.81%, respectively. Experimental results demonstrate that the proposed method is lightweight, stable, and accurate, and the incorporation of interpretable deep-learning techniques suggests that the proposed method holds significant potential for the clinical diagnosis of AF.
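
The parameter saving from depthwise separable convolutions can be checked with simple arithmetic. The sketch below counts the weights of a 1-D convolution layer (bias terms ignored); the channel counts and kernel size are illustrative assumptions, not the configuration used in the paper.

```python
def conv1d_params(c_in, c_out, k):
    """Weights in a standard 1-D convolution: one k-tap filter per (in, out) pair."""
    return c_in * c_out * k

def separable_conv1d_params(c_in, c_out, k):
    """Depthwise (one k-tap filter per input channel) plus pointwise 1x1 mixing."""
    return c_in * k + c_in * c_out

# Hypothetical layer: 64 input channels, 128 output channels, kernel size 7.
standard = conv1d_params(64, 128, 7)             # 57344 weights
separable = separable_conv1d_params(64, 128, 7)  # 8640 weights
ratio = standard / separable                     # about 6.6 times fewer weights
```

The saving grows with the kernel size and the number of output channels, which is why the technique suits lightweight models for long 1-D signals.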

  • HUANG Guang-yuan, HUANG Rong, ZHOU Shu-bo, JIANG Xue-qin
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240780
    Online available: 2025-03-04

    The attention mechanism and its variants have been widely applied in the field of image inpainting. They divide corrupted images into complete and missing regions, and capture long-range contextual information only within the complete regions to fill in the missing regions. As the area of missing regions increases, the features of complete regions decrease, which limits the performance of the attention mechanisms and leads to suboptimal inpainting results. In order to extend the context range of the attention mechanism, we employ a vector-quantized codebook to learn visual atoms. These visual atoms, which describe the structure and texture of image patches, constitute external features for image inpainting and thus compensate for the internal features of the image. On this basis, we propose a dual-stream attention image inpainting method based on interacting and fusing internal-external features. Based on internal and external information sources, we design an internal mask attention module and an internal-external cross attention module. These two attention modules form a dual-stream attention to facilitate interaction among internal features and between internal and external features, thereby generating internal-source and external-source inpainting features. The internal mask attention shields the interference of missing region features with a mask. It captures contextual information exclusively within the complete regions, thereby generating internal-source inpainting features. The internal-external cross attention lets internal and external features interact by calculating the similarity between internal features and external features composed of visual atoms, thereby generating external-source inpainting features. In addition, we design a controllable feature fusion module that generates spatial weight maps based on the correlation between internal-source and external-source inpainting features. These spatial weight maps fuse internal and external features by element-wise weighting of the internal-source and external-source inpainting features. Extensive experimental results on the Places2, FFHQ and Paris StreetView datasets demonstrate that the proposed method achieves average improvements of 3.45%, 1.34%, 13.91%, 13.64%, and 16.92% for the PSNR, SSIM, L1, LPIPS, and FID metrics respectively, compared with the state-of-the-art methods. Visualization experimental results demonstrate that both internal features and external features composed of visual atoms are beneficial for repairing corrupted images.
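
The element-wise fusion step can be sketched as a per-pixel convex combination of the two feature maps. In the sketch below, the weight map comes from a sigmoid over the channel-wise dot product of the two features. That choice is an assumption standing in for the paper's learned fusion module, and the function name `fuse_features` is hypothetical.

```python
import numpy as np

def fuse_features(internal, external):
    """Fuse two (C, H, W) feature maps with a spatial weight map.

    The weight map is derived from a per-pixel correlation (channel-wise dot
    product) squashed by a sigmoid. This is an illustrative stand-in, not the
    paper's controllable feature fusion module.
    """
    internal = np.asarray(internal, dtype=float)
    external = np.asarray(external, dtype=float)
    corr = (internal * external).sum(axis=0, keepdims=True)  # shape (1, H, W)
    weight = 1.0 / (1.0 + np.exp(-corr))                     # values in (0, 1)
    # Element-wise weighting; the weight map broadcasts over the channel axis.
    fused = weight * internal + (1.0 - weight) * external
    return fused, weight

rng = np.random.default_rng(0)
f_int = rng.normal(size=(8, 4, 4))
f_ext = rng.normal(size=(8, 4, 4))
fused, weight = fuse_features(f_int, f_ext)
```

Because the weights lie strictly between 0 and 1, every fused value stays between the corresponding internal and external values.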

  • ZHANG Si-ya, CHAI Rong, LIANG Cheng-chao, CHEN Qian-bin
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240116
    Online available: 2025-02-28

    Multibeam satellite communication systems have received widespread attention due to their high throughput and efficient resource utilization. This paper investigates the beam scheduling and resource allocation problem in multibeam satellite communication systems. By jointly considering user position and service characteristics, an optics-based initial user grouping algorithm is proposed. To enhance beam coverage performance, a minimum circle algorithm is proposed to optimally design satellite beam positions and coverage radii. Given the determined user grouping strategy, a system cost function is defined, and the joint beam scheduling, subchannel allocation and power allocation problem is formulated as a system cost function minimization problem. To solve the formulated optimization problem, aggregate nodes are introduced to describe the characteristics of user groups, and a parameterized deep Q-network-based joint beam scheduling and power allocation algorithm is proposed. Based on the obtained user group beam scheduling and power allocation strategy, joint subchannel and power allocation strategies based on a double deep Q-network algorithm and on proximal policy optimization are proposed. Simulation results validate the effectiveness of the proposed algorithms.

  • SHANG Bi-yun, WEI Xing, ZHOU Shi-jun, JI Wen-guan, DONG Shi-qi, TU Yao-feng, DONG Zhen-jiang
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240868
    Online available: 2025-02-26

    With the widespread adoption of the Internet of Things (IoT) and smart devices, the volume of data generated at the edge has far exceeded the computational and storage capabilities of edge nodes. This creates an urgent need for cloud-edge collaborative processing to meet the real-time analysis demands of large-scale data. With the decoupling of computation, memory, and storage, the shared-cache architecture has become a critical solution for addressing the processing requirements of massive edge data. However, several issues remain in the shared-cache architecture. First, in transactional processing scenarios, when hotspot cached data frequently migrates between nodes, the log persistence mechanisms of existing databases generate a large number of log write operations, thereby degrading system performance. Second, the existing cache write-invalidation mechanism can lead to frequent eviction of some hotspot cached data, causing slower transactions to fail to retrieve target data from the shared cache in time. This can trigger a large number of cache reloads, resulting in system performance degradation. To address these issues, this paper proposes a dependency-table-based delayed log flushing mechanism. By consolidating multiple log write operations and deferring them until the log buffer is full or a transaction is committed, the mechanism reduces the frequency of log flushing and the overhead of disk writes. In addition, this paper introduces a cache delayed invalidation mechanism that incorporates asynchronous replay of invalidation messages, page visibility determination, and optimized cache replacement. This approach effectively extends the service time of cached data, improving cache hit rates and overall system performance. Based on these mechanisms, this paper implements a high-performance shared-cache database system called EBASE-T.
Experimental results show that, compared to its pre-optimized version, EBASE-T achieves a 19.5% increase in throughput and a 13.1% reduction in latency. In TPC-C (online transaction processing system benchmarks) tests, EBASE-T demonstrates significant performance advantages over most shared-cache database systems.
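
The flush rule stated above (write only when the log buffer is full or a transaction commits) can be sketched as a small batching buffer. This is a simplified illustration: the class name, the list used as a stand-in for disk, and the capacity are assumptions, and the dependency table that the actual mechanism relies on is not modeled.

```python
class DelayedLogBuffer:
    """Batch log records and flush only when full or on commit (toy model)."""

    def __init__(self, capacity, disk):
        self.capacity = capacity  # max buffered records before a forced flush
        self.disk = disk          # a plain list standing in for persistent storage
        self.pending = []
        self.flush_count = 0

    def append(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.capacity:
            self.flush()

    def commit(self):
        # A commit must make all buffered records durable.
        self.flush()

    def flush(self):
        if not self.pending:
            return  # nothing to write, so no disk operation is issued
        self.disk.append(list(self.pending))  # one batched write for many records
        self.pending.clear()
        self.flush_count += 1

disk = []
log = DelayedLogBuffer(capacity=4, disk=disk)
for i in range(3):
    log.append(f"update-{i}")
log.commit()  # three records reach disk in a single write
```

Writing each record immediately would have cost three disk writes here; deferring them to the commit costs one.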

  • YIN Jia-yuan, WU Jian, DENG Jing-ya, ZHOU Shi-gang, CAO Xin-yue
    ACTA ELECTRONICA SINICA. https://doi.org/10.12263/DZXB.20240447
    Online available: 2025-02-26

    A substrate integrated waveguide (SIW) periodic leaky-wave antenna (PLWA) with increased gain and continuous beam scanning through broadside is proposed by loading slow-wave (SW) structures. The slow-wave structure, in the form of periodic blind via-holes with loaded patches, decelerates the phase velocity of the electromagnetic wave traveling in the SIW, reducing the guided wavelength by 50%. Compared with a conventional SIW PLWA, the distance between adjacent radiating slots in the proposed SW-SIW PLWA is halved, allowing twice as many radiating slots in an SW-SIW PLWA of the same length. Therefore, the radiation efficiency and the gain of the SW-SIW PLWA can be significantly increased. Furthermore, the loaded patches of the slow-wave structure under the radiating slots are extended to improve the impedance match of the radiating slots and suppress the reflected wave, so the open-stopband (OSB), which is a common drawback of the PLWA, is suppressed. Consequently, the radiation beam can scan continuously from the backward to the forward direction. A prototype of the proposed SW-SIW PLWA is manufactured and measured. The scanning angle of the proposed PLWA reaches 72.7° over the operating frequency range of 13.4 to 15.4 GHz, with a maximum gain of 8.47 dBi. The measured results agree well with the simulations.