
Most accessed

  • SURVEY AND REVIEW
    WANG Qian-fan, YANG Jia-yi, WANG Yin-chu, CAI Sui-hua, MA Xiao
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2913-2932. https://doi.org/10.12263/DZXB.20240167
    Abstract (2655) Download PDF (575) HTML (2574)

    Streaming communication is an essential scenario in optical fiber and mobile communication networks. Unlike traditional intermittent or block-oriented communication, data transmission in streaming communication exhibits typical continuous streaming characteristics. Compared with traditional block codes, coupled codes show significant performance improvement in streaming scenarios while retaining low encoding and decoding latency. These advantages make coupled codes an important candidate for channel coding in streaming communication. This paper first reviews existing coupled LDPC codes, including product-like coupled LDPC codes, partially re-encoded coupled LDPC codes, spatially coupled LDPC (SC-LDPC) codes, and globally coupled LDPC (GC-LDPC) codes. It then introduces a series of improved designs for coupled LDPC codes based on free-ride codes, and presents a new class of coupled LDPC codes based on block Markov superposition transmission (BMST) techniques. Finally, the paper concludes with a discussion of future prospects and research directions for coupled LDPC codes.
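
    As an illustration of the coupling idea discussed above (not any of the specific constructions surveyed), the sketch below assembles a spatially coupled protograph base matrix by splitting a block base matrix into components and placing them along a diagonal band; the component matrices, coupling width, and coupling length are hypothetical choices.

```python
import numpy as np

def spatially_couple(B_components, L):
    """Assemble a spatially coupled (SC-LDPC style) protograph base matrix.

    B_components: list of (b_c x b_v) arrays B_0..B_w whose element-wise sum
    is the uncoupled base matrix B; L is the coupling length. Placing the
    components on a diagonal band yields the familiar "staircase" structure.
    """
    w = len(B_components) - 1
    b_c, b_v = B_components[0].shape
    H = np.zeros(((L + w) * b_c, L * b_v), dtype=int)
    for t in range(L):                       # time index of each coupled block
        for i, Bi in enumerate(B_components):
            H[(t + i) * b_c:(t + i + 1) * b_c, t * b_v:(t + 1) * b_v] = Bi
    return H

# Hypothetical example: split the (2,4)-regular base matrix B = [[2, 2]]
# into B_0 = [[1, 1]] and B_1 = [[1, 1]] (coupling width w = 1), length L = 5.
H_sc = spatially_couple([np.array([[1, 1]]), np.array([[1, 1]])], L=5)
print(H_sc)
```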

  • PAPERS
    LI Xin, LU Wei, MA Zhao-yi, ZHU Pan, KANG Bin
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2799-2810. https://doi.org/10.12263/DZXB.20230515
    Abstract (2448) Download PDF (654) HTML (2396)

    Current graph Transformers mainly add auxiliary modules to the traditional Transformer framework to model graph data. However, these methods do not improve the original Transformer architecture, and their modeling accuracy needs further enhancement. This paper therefore proposes a node classification method based on graph attention and an improved Transformer. In the proposed framework, a topology-enhancement-based node embedding is constructed for graph structure reinforcement learning. Then, a secondary-mask-based multi-head attention is developed for aggregation and update. Finally, pre-norm and skip connections are introduced to improve the interlayer structure of the Transformer, which avoids the over-smoothing problem caused by feature convergence. Experimental results demonstrate that, compared with 6 typical baseline models, the proposed method achieves the best results on all evaluation indicators. Moreover, it handles node classification on both small and medium datasets and comprehensively improves classification performance.
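
    To make the interlayer change concrete, the sketch below shows a generic pre-norm multi-head-attention block with a skip connection (layer normalization applied before attention, residual added after). It is a minimal PyTorch sketch under assumed dimensions, not the authors' network: the secondary mask, graph-attention embedding, and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class PreNormAttentionBlock(nn.Module):
    """Generic pre-norm multi-head attention block with skip connections."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x, attn_mask=None):
        # Pre-norm: normalize first, attend, then add the residual.
        h = self.norm1(x)
        h, _ = self.attn(h, h, h, attn_mask=attn_mask)
        x = x + h
        x = x + self.ffn(self.norm2(x))
        return x

# Hypothetical usage: one batch of 8 nodes with 64-dimensional embeddings.
block = PreNormAttentionBlock(dim=64)
out = block(torch.randn(1, 8, 64))
```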

  • LI Xue-long
    ACTA ELECTRONICA SINICA. 2024, 52(4): 1041-1082. https://doi.org/10.12263/DZXB.20230698
    Abstract (2436) Download PDF (1903) HTML (2219)

    Approximately 71% of the Earth's surface is covered by water in the form of rivers, lakes, and seas, and terrestrial imaging is likewise affected by water as clouds, snow, rain, and fog. Nevertheless, contemporary machine vision research and application systems concentrate mainly on visual tasks in air and vacuum environments, and visual tasks in various aquatic settings have not been systematically investigated. Water-related vision, the branch of water-based optical technology within the vision field, studies the scientific problems of light-water interaction and cross-medium light propagation, the intelligent processing and analysis of visual image signals in aquatic settings, and the engineering and technical problems involved in developing advanced, intelligent water-related vision equipment. Starting from the fundamental scientific question "Why is the ocean blue?", this paper surveys how the light absorption, scattering, and attenuation of seawater affect underwater visual tasks, and systematically reviews current methods for processing and enhancing underwater images. Exploiting the optical properties of water and the factors that cause image degradation, the paper summarizes our team's milestones in key technologies for underwater imaging and image analysis. Substantial progress has been made in developing underwater observation and analysis equipment, including the full-ocean-depth ultra-high-definition camera "Haitong," the full-ocean-depth 3D camera, and the full-ocean-depth high-definition video camera. This work has established a comprehensive and systematic capability for underwater optical detection covering color, intensity, polarization, and spectrum, filling the gap in China's full-ocean-depth optical detection technology and advancing exploration and technological innovation in water-related vision, with notable application value and societal benefits.

  • PAPERS
    FANG Shuai, WAN Qi, CAO Yang
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2037-2052. https://doi.org/10.12263/DZXB.20221147
    Abstract (2221) Download PDF (382) HTML (2075)

    The trade-off between the spatial and temporal resolution of satellite images leads to spatial-temporal contradictions in image sequences. Spatiotemporal image fusion provides a way to generate images with both high spatial and high temporal resolution for various Earth observation applications. Spatiotemporal fusion based on sparse representation establishes the relationship between high and low spatial resolution images by jointly training the dictionary and the sparse coding representation, providing a unified fusion framework for phenological change and type change. However, multi-source remote sensing images come from different sensors, and the relationship model between high and low spatial resolution images implicitly contains the sensor mapping, which inevitably makes the model device dependent. To solve this problem, we decompose multi-source remote sensing spatiotemporal fusion into two sub-problems: device-dependent sensor bias correction and device-independent spatiotemporal fusion. The sensor bias correction can be used as a preprocessing module to improve the universality and accuracy of the subsequent fusion model. When there is a large scale gap between the high and low spatial resolution images, the assumption that "the sparse coefficients of high and low spatial resolution images are the same" introduces significant fusion errors. To address this, we optimize the objective function of the sparse representation with a cross-scale similarity prior, and construct intermediate-scale images to reduce the ambiguity of cross-scale similar patches and improve their matching accuracy. Experimental results in three typical scenarios demonstrate the generalization ability of our algorithm. Compared with the second-best results, SSIM (Structural SIMilarity) is improved by 4.2% and SAM (Spectral Angle Mapper) by 4.6% on the BOREAS dataset; SSIM by 2.7% and SAM by 12.8% on the CIA dataset; and SSIM by 7.1% and SAM by 16.3% on the LGC dataset. Our algorithm is superior to the compared methods in both spatial and spectral performance.
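
    For reference, the SAM metric quoted in the results measures the angle between corresponding spectral vectors of the fused and reference images; a minimal sketch follows (image shapes and the averaging convention are assumptions).

```python
import numpy as np

def spectral_angle_mapper(fused, reference, eps=1e-12):
    """Mean spectral angle (radians) between two (H, W, bands) images."""
    f = fused.reshape(-1, fused.shape[-1]).astype(float)
    r = reference.reshape(-1, reference.shape[-1]).astype(float)
    cos = np.sum(f * r, axis=1) / (np.linalg.norm(f, axis=1) *
                                   np.linalg.norm(r, axis=1) + eps)
    return np.mean(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical usage with random 6-band patches.
sam = spectral_angle_mapper(np.random.rand(16, 16, 6), np.random.rand(16, 16, 6))
print(sam)
```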

  • PAPERS
    LIU Wen-xi, ZHANG Jia-bang, LI Yue-zhou, LAI Yu, NIU Yu-zhen
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2279-2290. https://doi.org/10.12263/DZXB.20230668
    Abstract (2146) Download PDF (945) HTML (2102)

    Camouflaged object detection aims to detect highly concealed objects hidden in complex environments and has important application value in fields such as medicine and agriculture. Existing methods that incorporate boundary priors overemphasize the boundary area and lack the ability to represent the internal information of camouflaged objects, so the internal regions of camouflaged objects are detected inaccurately. At the same time, existing methods do not effectively mine the foreground features of camouflaged objects, so background areas are mistakenly detected as camouflaged objects. To address these issues, this paper proposes a camouflaged object detection method based on boundary feature fusion and foreground guidance, which consists of feature extraction, boundary feature fusion, backbone feature enhancement, and prediction stages. In the boundary feature fusion stage, boundary features are first obtained by the boundary feature extraction module and a boundary mask is predicted. The boundary feature fusion module then fuses the boundary features and boundary mask with the lowest-level backbone features, enhancing both the boundary position and the internal region features of the camouflaged object. In addition, a foreground guidance module is designed to enhance the backbone features using the predicted camouflaged object mask: the mask predicted from the previous layer of features is used as foreground attention for the current layer, and spatial interaction is performed on the features to strengthen the network's ability to recognize spatial relationships, enabling it to focus on fine and complete camouflaged object regions. Extensive experiments on four widely used benchmark datasets show that the proposed method outperforms the 19 compared mainstream methods and has stronger robustness and generalization ability for camouflaged object detection.

  • SURVEY AND REVIEW
    TONG Kang, WU Yi-quan
    ACTA ELECTRONICA SINICA. 2024, 52(3): 1016-1040. https://doi.org/10.12263/DZXB.20230624
    Abstract (2101) Download PDF (2499) HTML (2062)

    Small object detection is an extremely challenging task in computer vision and is widely used in remote sensing, intelligent transportation, national defense and military, daily life, and other fields. Compared with other visual tasks such as image segmentation, action recognition, object tracking, generic object detection, image classification, video captioning, and human pose estimation, the research progress of small object detection has been relatively slow. We believe the constraints are mainly two: the intrinsic difficulty of learning small object features and the scarcity of small object detection benchmarks. The scarcity of benchmarks, in turn, has two aspects: the scarcity of small object detection datasets and the difficulty of establishing evaluation metrics for small object detection. To gain a deeper understanding of small object detection, this article conducts, for the first time, a thorough investigation of small object detection benchmarks based on deep learning. The existing 35 small object detection datasets are introduced across 7 application scenarios: remote sensing images, traffic sign and traffic light detection, pedestrian detection, face detection, synthetic aperture radar and infrared images, daily life, and others. Meanwhile, the definition of small objects is comprehensively summarized from both the relative-scale and absolute-scale perspectives. The absolute-scale definitions mainly fall into 3 categories: the width or height of the object bounding box, the product of the width and height of the bounding box, and the square root of the bounding box area. The focus is on exploring the evaluation metrics of small object detection in detail from 3 aspects: metrics based on IoU (Intersection over Union) and its variants, metrics based on average precision and its variants, and other evaluation metrics. These typical evaluation metrics can be further subdivided into metrics combined with the definition of small objects and metrics combined with a single object category. More concretely, the metrics combined with the definition of objects can be divided into 4 categories: average precision plus the definition of objects, miss rate plus the definition of objects, DoR-AP-SM (Degree of Reduction in Average Precision between Small objects and Medium objects), and DoR-AP-SL (Degree of Reduction in Average Precision between Small objects and Large objects). The metrics combined with a single object category mainly include 2 types: average precision plus a single object category and OLRP (Optimal Localization Recall Precision) plus a single object category. In addition, an in-depth analysis and comparison of the performance of representative small object detection algorithms under typical evaluation metrics is conducted on 6 datasets. These representative methods mainly involve anchor mechanisms, scale awareness and fusion, context information, super-resolution techniques, and other improvement ideas. Finally, we point out possible future trends from 6 aspects: a new benchmark for small object detection, a unified definition of small objects, a new framework for small object detection, multi-modal small object detection algorithms, rotated small object detection, and high-precision and real-time small object detection. We hope that this paper provides a timely and comprehensive review of the research progress of deep-learning-based small object detection benchmarks and inspires researchers to further advance this field.
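
    Since several of the metrics surveyed above build on IoU, a minimal bounding-box IoU computation is sketched below; the [x1, y1, x2, y2] box convention is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A 20x20 ground-truth box and a prediction shifted by 5 pixels.
print(iou([0, 0, 20, 20], [5, 5, 25, 25]))  # about 0.39
```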

  • PAPERS
    ZHANG Ze-wei, BAO Wei-min, FANG Hai-yan, SU Jian-yu, LI Xiao-ping, YAO Yun-feng
    ACTA ELECTRONICA SINICA. 2024, 52(9): 2939-2949. https://doi.org/10.12263/DZXB.20221003
    Abstract (1935) Download PDF (163) HTML (1903)

    A high-precision X-ray photon arrival time conversion model is crucial to the accuracy of X-ray pulsar-based navigation. Because the complete model is complex while existing simplified models have limited accuracy, this paper proposes a fast simplified model whose accuracy is no lower than that of existing simplified models. Starting from the existing complete model, the influence of each delay term on model accuracy is analyzed theoretically, and it is shown that the Roemer delay remains the key to the accuracy of the simplified model. The fast simplified model is obtained by rewriting the expression of the Roemer delay and its second-order expansion, taking into account how easily the relevant physical quantities can be obtained in practical applications. The accuracy and computational efficiency of the proposed model are evaluated by using the complete model and the proposed simplified model to time-transform measured photon data from the NICER (Neutron star Interior Composition Explorer) and HXMT (Hard X-ray Modulation Telescope) satellites. Furthermore, the influence of orbital altitude and pulsar angular position measurement errors on the accuracy of the simplified model is analyzed by numerical simulation, and its accuracy and computational efficiency in Earth orbits at different altitudes are discussed. The results show that the computational efficiency of the proposed simplified model is improved by 50% over Sheikh's simplified model and by 10% over Fei's model, without any loss of accuracy.
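
    For orientation, the first- and second-order Roemer-delay terms referred to above are commonly written as follows, where r is the spacecraft position relative to the solar system barycenter, n̂ the pulsar direction unit vector, D0 the pulsar distance, and c the speed of light; this is the standard textbook form, not necessarily the paper's exact simplification:

```latex
\Delta t_{\mathrm{Roemer}} \;\approx\; \frac{\hat{\mathbf{n}}\cdot\mathbf{r}}{c}
\;+\; \frac{(\hat{\mathbf{n}}\cdot\mathbf{r})^{2}-\lVert\mathbf{r}\rVert^{2}}{2\,c\,D_{0}}
```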

  • PAPERS
    HUA Qing-long, ZHANG Yun, REN Hang, JIANG Yi-cheng, XU Dan
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2900-2912. https://doi.org/10.12263/DZXB.20230465
    Abstract (1923) Download PDF (317) HTML (1866)

    In synthetic aperture radar (SAR) systems, the three-dimensional rotation of ship targets in medium and high sea states leads to a time-varying Doppler spectrum and image defocusing, which adversely affects the subsequent interpretation of ship targets in SAR images. Aiming at the refocusing of three-dimensionally rotating ship targets, this paper proposes a SAR refocusing method based on the minimum entropy criterion and a generative adversarial network, and designs the network structures of the generator and discriminator. The generator transforms the defocused complex SAR ship image into the range-Doppler domain, estimates the phase error coefficients for each range cell using a phase error coefficient estimation network, and compensates for multi-order phase errors. The discriminator is a complex-valued convolutional neural network, with all its elements, including the convolution layers, activation functions, feature maps, and parameters, extended to the complex domain. The minimum entropy criterion and adversarial loss are introduced into the loss function to enable unsupervised training and avoid the difficulty of obtaining labeled samples of non-cooperative ship targets. Experiments on simulated data and Gaofen-3 data show that the proposed method achieves significant improvements in both refocusing accuracy and efficiency.

  • SURVEY AND REVIEW
    QIU Jing, CHEN Rong-rong, ZHU Hao-jin, XIAO Yan-jun, YIN Li-hua, TIAN Zhi-hong
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2529-2556. https://doi.org/10.12263/DZXB.20231057
    Abstract (1898) Download PDF (1014) HTML (1803)

    Investigating network attacks is crucial for implementing proactive defenses and formulating tracing countermeasures. With the rise of sophisticated and stealthy network threats, developing efficient and automated investigation methods has become pivotal to advancing intelligent network attack and defense capabilities. Existing studies model system audit logs as provenance graphs that represent the causal dependencies of attack events. Leveraging the powerful associative analysis and semantic representation capabilities of provenance graphs, complex and stealthy network attacks can be investigated effectively, with results superior to conventional methods. This paper offers a systematic review of the literature on provenance-graph-based attack investigation, categorizing the methodologies into three principal groups: causality analysis, deep representation learning, and anomaly detection. For each category, the paper presents the workflows and the core frameworks that underpin these methodologies. It also examines optimization techniques for provenance graphs and traces the evolution of these technologies from theoretical constructs to industrial application. The study further aggregates and reviews the datasets commonly used in attack investigation research, and offers a comparative analysis of representative provenance-graph-based techniques alongside their reported performance. Finally, it outlines prospective directions for future research and development in this field, providing a structured roadmap for advancing both its academic and practical applications.
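
    As a toy illustration of how audit events map onto a provenance graph for backward causality analysis, the sketch below inserts subject-to-object edges into a directed graph; the event fields and names are hypothetical and far simpler than real audit logs.

```python
import networkx as nx

# Hypothetical audit events: (subject, action, object, timestamp).
events = [
    ("bash", "exec", "curl", 1),
    ("curl", "connect", "10.0.0.5:443", 2),
    ("curl", "write", "/tmp/payload", 3),
    ("bash", "exec", "/tmp/payload", 4),
]

G = nx.DiGraph()
for subj, action, obj, ts in events:
    # Each edge records a causal dependency: who did what to what, and when.
    G.add_edge(subj, obj, action=action, ts=ts)

# Backward causality analysis from a suspicious node: which entities could
# have influenced it?
print(nx.ancestors(G, "/tmp/payload"))  # e.g. {'bash', 'curl'}
```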

  • PAPERS
    LIU Xin, HAI Yang, DAI Wei
    ACTA ELECTRONICA SINICA. 2024, 52(9): 3052-3064. https://doi.org/10.12263/DZXB.20230957
    Abstract (1867) Download PDF (925) HTML (1847)

    The state space model is a common and important model structure in automation and control. This paper investigates the robust identification of nonlinear state-space models corrupted by outliers. Outliers in both the state transition process and the output measurement process are considered, and a more comprehensive robust identification algorithm is proposed. To ensure robustness, two independent heavy-tailed Student's t-distributions are used to describe the state noise and the output noise, respectively. The particle smoothing method is then applied to estimate the posterior distribution of the unknown states, and the expectation-maximization algorithm is used to solve the parameter estimation problem. The mathematical decomposition of the Student's t-distribution is employed in the identification process, which brings two main advantages: (1) it facilitates the derivation and implementation of the proposed algorithm; (2) it provides a clearer explanation of the algorithm's robustness. The usefulness of the proposed algorithm is demonstrated via numerical and mechanical examples.
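
    The decomposition mentioned above is usually the Gaussian scale-mixture form of the Student's t-distribution (quoted here in its standard form, with ν the degrees of freedom), which is what makes the EM derivation tractable and exposes how large residuals are down-weighted:

```latex
x \mid \lambda \sim \mathcal{N}\!\left(\mu,\ \sigma^{2}/\lambda\right),\qquad
\lambda \sim \mathrm{Gamma}\!\left(\tfrac{\nu}{2},\ \tfrac{\nu}{2}\right)
\;\;\Longrightarrow\;\;
x \sim \mathrm{St}\!\left(\mu,\ \sigma^{2},\ \nu\right)
```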

  • PAPERS
    PENG Zi-ran, XU Huai-shun, XIAO Shen-ping
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2418-2428. https://doi.org/10.12263/DZXB.20240236
    Abstract (1822) Download PDF (624) HTML (1793)

    Most photovoltaic (PV) power stations are located in remote areas with complex terrain, where they are exposed to the external environment and prone to various faults. Traditional PV array fault diagnosis methods suffer from low accuracy and low utilization of PV data. To address these problems, this paper first improves the sparrow search algorithm (SSA) by introducing a Levy flight strategy and a dynamic step-factor adjustment strategy, which reduces the risk of SSA falling into local optima and improves its optimization ability. The improved Levy-adjustment sparrow search algorithm (LASSA) is then used to optimize the key hyperparameters of the CatBoost model, and a PV array fault diagnosis model, LASSA-CatBoost, is proposed for the accurate diagnosis of short-circuit, open-circuit, aging, and shadow-masking faults in PV arrays. Experimental results show that the fault diagnosis accuracy of the LASSA-CatBoost model is 99.7%, 3.6% higher than that of the unoptimized CatBoost model. Compared with existing PV array fault diagnosis models, LASSA-CatBoost achieves higher accuracy and stability.
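
    For concreteness, Levy-flight steps of the kind used to perturb search positions are often drawn with Mantegna's algorithm; the sketch below is a generic implementation with a hypothetical stability index, not the paper's exact SSA update rule.

```python
import numpy as np
from math import gamma, sin, pi

def levy_step(dim, beta=1.5, rng=np.random.default_rng()):
    """Draw one Levy-flight step via Mantegna's algorithm."""
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

# Hypothetical sparrow-position update: current position plus a scaled Levy step.
pos = np.array([0.2, -0.4, 0.7])
pos = pos + 0.01 * levy_step(pos.size)
print(pos)
```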

  • PAPERS
    MA Zhe-yi-pei, JIANG Chao, LIU Yan-qiong, LI Jia-le
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2668-2678. https://doi.org/10.12263/DZXB.20230609
    Abstract (1808) Download PDF (343) HTML (1768)

    In this study, a design method for a multilayer composite absorber based on a double-layer metasurface is proposed. The designed composite absorber consists of two metasurface layers, a top absorption-enhancing skin, and several supporting dielectric slabs. The unit cells of metasurfaces Ⅰ and Ⅱ are, respectively, irregular-shaped metal patches connected by chip resistors and hexagonal metal rings loaded with chip resistors; the top absorption-enhancing skin is a fiberglass-reinforced epoxy laminate; the supporting dielectric slabs are PMI foam. Simulation results indicate that the absorption bands with reflection coefficients below -10 dB and -20 dB are 2.80~23.64 GHz and 3.56~22.56 GHz, respectively. Measurement results show that the corresponding bands are 2.36~23.87 GHz and 3.17~23.16 GHz, respectively; the simulated and measured reflection coefficient curves agree well, verifying the effectiveness of the design method. Simulation and measurement also show that the -10 dB reflection-coefficient band at 50° oblique incidence is essentially consistent with that at normal incidence, with the start and stop frequencies shifted by less than 0.8 GHz; moreover, at an oblique incidence angle of 60°, the fractional bandwidth of the reflection coefficient below -10 dB reaches 141.8%, indicating that the designed composite absorber is stable over a wide range of incidence angles. In addition, the mechanism of the ultrawideband, high-absorption behavior and the influence of the main structural parameters are analyzed; the results show that the top absorption-enhancing skin can improve the absorptivity of the whole structure by up to 0.2 (where 1.0 represents 100% absorption), and that the complementary design of the two metasurface absorption bands clearly improves oblique-incidence stability.

  • PAPERS
    JIN Xiao-zhong, LIU Hai-kun, LAI Hao, MAO Fu-bing, ZHANG Yu, LIAO Xiao-fei, JIN Hai
    ACTA ELECTRONICA SINICA. 2024, 52(9): 3038-3051. https://doi.org/10.12263/DZXB.20221257
    Abstract (1802) Download PDF (1127) HTML (1783)

    Heterogeneous memory systems composed of traditional dynamic random access memory (DRAM) and emerging non-volatile memory (NVM) can be organized in a horizontal or a hierarchical architecture. The horizontal DRAM/NVM architecture usually requires page migration to improve memory access performance, but hot-page monitoring and migration implemented in the operating system cause significant software overhead. The hardware-supported hierarchical architecture, with its deeper memory hierarchy, can even increase memory access latency for big-data applications with poor data locality. To this end, this paper proposes a reconfigurable heterogeneous memory architecture that can be switched between the horizontal and hierarchical architectures at runtime to dynamically adapt to the memory access characteristics of different applications. We design a DRAM/NVM heterogeneous memory controller (HMC) based on the new RISC-V (Reduced Instruction Set Computing-V) instruction set architecture. The HMC uses a small number of hardware counters for memory access monitoring and analysis, and achieves dynamic address mapping and efficient page migration between DRAM and NVM. Experimental results show that the DRAM/NVM hybrid memory controller can improve application performance by 43%.
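
    As a rough software analogue of the hot-page monitoring described above (the paper implements this with hardware counters in the HMC), the toy sketch below counts per-page accesses and flags NVM pages that cross a hotness threshold for migration to DRAM; the page size, threshold, and trace are hypothetical.

```python
from collections import Counter

PAGE_SIZE = 4096        # hypothetical page size
HOT_THRESHOLD = 8       # hypothetical hotness threshold

access_counts = Counter()
nvm_pages = {0x10, 0x11, 0x12}   # pages currently resident in NVM
dram_pages = {0x01, 0x02}

def migrate_to_dram(page):
    nvm_pages.discard(page)
    dram_pages.add(page)
    access_counts[page] = 0      # reset the counter after migration

def on_access(addr):
    page = addr // PAGE_SIZE
    access_counts[page] += 1
    if page in nvm_pages and access_counts[page] >= HOT_THRESHOLD:
        migrate_to_dram(page)

# A hot NVM page being hammered repeatedly triggers migration.
for addr in [0x10 * PAGE_SIZE + 64] * 10:
    on_access(addr)
print(0x10 in dram_pages)  # True: page migrated to DRAM
```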

  • PAPERS
    ZHONG Yu-bin, YANG Peng, DOU Lei
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2112-2122. https://doi.org/10.12263/DZXB.20230162
    Abstract (1760) Download PDF (390) HTML (1662)

    Because the target deforms irregularly during tracking, a scale model with a fixed aspect ratio cannot accurately estimate the target scale. This paper proposes an aspect-ratio-based correlation filter tracking algorithm to address this problem. Building on the fDSST (fast Discriminative Scale Space Tracking) algorithm, an aspect-ratio model is first trained to update the target's aspect ratio, which yields a more accurate target scale. On this basis, a smoothing correction scheme and an adaptive learning-rate mechanism are designed to alleviate model drift and achieve more accurate tracking. Comparative experiments on the OTB100, VOT2016, and VOT2018 datasets show that the proposed algorithm improves on the baseline algorithm; in particular, its overall precision and success rate on OTB100 are 9.6% and 6.2% higher than those of fDSST.

  • PAPERS
    CHONG Yi-ning, LI Jue, QIAO Ming
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2271-2278. https://doi.org/10.12263/DZXB.20230845
    Abstract (1748) Download PDF (264) HTML (1702)

    In this paper, a high-voltage super junction power MOS (Metal Oxide Semiconductor) device is designed using a semi-super junction structure. The super junction cell structure is designed on the Sentaurus TCAD (Technology Computer Aided Design) simulation platform, the breakdown voltage and on-resistance of the device are optimized, and the parasitic capacitance characteristics are then explored. Finally, based on a multiple-epitaxy process, a high-voltage super junction power MOS device with a simulated breakdown voltage of 1 658 V, a process-simulated breakdown voltage of 1 598 V, and a specific on-resistance of 303 mΩ·cm2 is designed independently, reducing the specific on-resistance by about 50% compared with devices of the same voltage rating. The influence of four main structural parameters, namely the doping concentration and thickness of the super junction and the doping concentration and thickness of the voltage-support layer, on the parasitic capacitance characteristics of the device is also explored.

  • PAPERS
    CHEN Zhe, WANG Pin-qing, ZHOU Pei-gen, CHEN Ji-xin, HONG Wei
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2161-2169. https://doi.org/10.12263/DZXB.20230645
    Abstract (1738) Download PDF (506) HTML (1712)

    This paper presents the design of a millimeter-wave dual-band low-phase-noise voltage-controlled oscillator in a 45 nm CMOS SOI (Complementary Metal Oxide Semiconductor Silicon On Insulator) process, covering the 24.25~27.5 GHz and 37~43.5 GHz bands for 5G millimeter-wave communications. Exploiting the high performance of SOI transistors as RF switches, a switched cap-bank and switched inductor topology is proposed to enhance the quality factor Q of the wideband tuning inductance and capacitance, increase the VCO (Voltage Controlled Oscillator) operating bandwidth, and reduce the phase noise. Meanwhile, a switched capacitor is also adopted in the output matching network for good matching and stable output power in both bands. Measured results show that the designed VCO covers the 24.25~27.5 GHz and 37~43.5 GHz bands for the 5G millimeter-wave communication standards defined in WRC-19, with output power of -4.8~0 dBm in the low band and -6.4~-2.3 dBm in the high band. The measured phase noise is -105.1 dBc/Hz at 1 MHz offset for the 24.482 GHz carrier and -95.3 dBc/Hz at 1 MHz offset for the 43.308 GHz carrier. The DC power consumption of the core circuit is 15.3~18.5 mW, and the core area is 0.198 mm2. The corresponding FoM (Figure of Merit) and FoMT for the low (high) band are -181.3 dBc/Hz (-175.4 dBc/Hz) and -194.3 dBc/Hz (-188.3 dBc/Hz), respectively.

  • PAPERS
    YUAN Zhi-qiang, YANG Si-chun, RUAN Yue, XUE Xi-ling, TAO Tao
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2025-2036. https://doi.org/10.12263/DZXB.20220784
    Abstract (1674) Download PDF (466) HTML (1559)

    Quantum approximate optimization algorithm (QAOA) is an algorithmic framework for solving combinatorial optimization problems and is regarded as one of the promising candidates for demonstrating the advantages of quantum computing in the near term. Within the QAOA framework, the symmetries of the quantum states induced by the binary encoding scheme restrain the performance of QAOA. Inspired by the Dicke state preparation algorithm, we propose a new encoding scheme that eliminates the symmetry of the quantum states representing solutions. Beyond that, we also propose a novel evolution operator, the star graph (SG) mixer, and the corresponding SG algorithm. A quantum circuit implementation of the SG algorithm on IBM Q shows that it achieves an average performance improvement of about 25.3% over the standard QAOA algorithm in solving the graph partitioning problem.
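
    For orientation, the classical objective that such an ansatz optimizes for graph partitioning is the number of cut edges under a balanced-split constraint; the brute-force toy below makes that constraint explicit (the fixed-Hamming-weight filter is exactly the subspace that Dicke-state-style encodings preserve). The graph is hypothetical.

```python
import itertools

# Toy graph: edges of a 4-node graph to be split into two equal halves.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4

def cut_size(assignment):
    """Number of edges whose endpoints land in different partitions."""
    return sum(assignment[u] != assignment[v] for u, v in edges)

# Restrict the search to balanced assignments (Hamming weight n // 2).
best = min(
    (bits for bits in itertools.product([0, 1], repeat=n) if sum(bits) == n // 2),
    key=cut_size,
)
print(best, cut_size(best))
```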

  • PAPERS
    XING Chang-da, WANG Mei-ling, XU Yong-chang, WANG Zhi-sheng
    ACTA ELECTRONICA SINICA. 2024, 52(9): 3010-3022. https://doi.org/10.12263/DZXB.20230077
    Abstract (1667) Download PDF (1030) HTML (1654)

    Feature extraction is a key operation in hyperspectral image (HSI) classification. Current classification approaches usually ignore information preservation and spatial distribution during feature extraction, which can produce features with low information utilization and disordered distribution, leading to unsatisfactory predictions. To remedy these deficiencies, a novel method based on structure-wise feature reconstruction is proposed for HSI classification. The method reduces information loss and improves information preservation during feature extraction, and the spatial distribution is also fully considered to enhance discriminability and separability. Combining the reconstruction idea with self-expression theory, a structure-wise feature reconstruction model is constructed to extract HSI features, which improves the utilization of the original information and describes a structure that reflects a well-ordered distribution. An optimization with alternating updates is presented to solve the constructed model. A support vector machine is finally used to classify the extracted features and predict the HSI labels. The Salinas, Pavia Center, Botswana, and Houston datasets are used for experimental validation. Results show that the proposed method achieves better classification performance than several state-of-the-art approaches, with average improvements of 2.6%, 3.9%, and 3.3% in OA (Overall Accuracy), AA (Average Accuracy), and the Kappa coefficient, respectively.

  • PAPERS
    YANG Jing, LIU Cheng-cheng, HUANG Jie, LI Xia
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2091-2102. https://doi.org/10.12263/DZXB.20221276
    Abstract (1664) Download PDF (498) HTML (1575)

    A convex localization algorithm based on semidefinite relaxation is proposed for moving-target localization from time delay, Doppler shift, and angle-of-arrival measurements in distributed multiple-input multiple-output (MIMO) radar. The algorithm alleviates the threshold effect whereby the positioning error deviates from the Cramer-Rao lower bound (CRLB) when the measurement error is large. First, the localization problem is formulated as a maximum likelihood estimation problem, which is reformulated as a constrained weighted least squares problem by introducing auxiliary variables and then relaxed into a convex semidefinite programming (SDP) problem. The SDP problem is solved efficiently with the interior-point method to obtain the target position and velocity estimates. Since the local optimum of a convex optimization problem is the global optimum, the proposed algorithm has good global convergence. Simulation results demonstrate that the proposed algorithm approaches the CRLB and achieves higher localization accuracy and robustness than existing algorithms at relatively large measurement noise levels.
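
    The semidefinite relaxation step can be summarized generically: the rank-one equality that makes the constrained weighted least squares problem non-convex is relaxed to a positive semidefinite constraint via the Schur complement (generic notation below; the paper's specific variables and weights are omitted):

```latex
X = \mathbf{x}\mathbf{x}^{\mathsf T}
\quad\xrightarrow{\ \text{relax}\ }\quad
\begin{bmatrix} X & \mathbf{x}\\ \mathbf{x}^{\mathsf T} & 1 \end{bmatrix} \succeq 0
\;\;\Longleftrightarrow\;\; X \succeq \mathbf{x}\mathbf{x}^{\mathsf T}
```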

  • PAPERS
    JIA Xi-bin, YU Gao-yuan, WANG Luo, DENG Yu-hui, YANG Da-wei, YANG Zheng-han
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2053-2066. https://doi.org/10.12263/DZXB.20220919
    Abstract (1663) Download PDF (487) HTML (1538)

    Microvascular invasion (MVI) is an important factor for early recurrence and poor long-term prognosis in patients with hepatocellular carcinoma (HCC) after resection or transplantation, so preoperative evaluation of whether MVI is present is of great clinical value. In recent years, deep learning has provided valuable solutions for MVI image diagnosis and evaluation. Nevertheless, due to the difficulty of data annotation and collection, current research mostly uses single-modality sequences from computed tomography (CT) or magnetic resonance imaging (MRI) independently, and lacks comprehensive use of the multimodal sequences available across imaging methods. To make more effective use of multimodal CT and MRI data and improve diagnostic efficiency in few-shot scenarios, an efficient multimodal contribution-aware network is proposed in this paper. Its modality-grouping convolution and efficient multimodal adaptive weighting module learn the diagnostic contribution of each CT or MRI modality under complex and diverse MVI presentations, with little additional computational cost. Experiments are carried out on a clinical dataset collected by a tertiary hospital. Results show that, with only a small amount of labeled data, our method achieves better MVI diagnostic performance than many attention-based deep neural networks, providing an effective reference for clinicians' diagnostic analysis.

  • PAPERS
    JIANG Shun-rong, SHI Kun, ZHOU Yong
    ACTA ELECTRONICA SINICA. 2024, 52(9): 3023-3037. https://doi.org/10.12263/DZXB.20221299
    Abstract (1655) Download PDF (1129) HTML (1642)

    A micro-grid is a distributed small-scale power generation and distribution system that realizes the circular flow of electricity through energy trading among adjacent prosumers according to their different needs. To develop optimal pricing and transaction strategies for energy trading in micro-grids, we propose a double sealed bid (DSB) auction scheme based on the characteristics of consortium blockchains. Besides satisfying key economic properties (individual rationality, budget balance, and so on), the scheme determines the final winners based on users' offers, bids, volumes, average price, and other factors. Meanwhile, to protect users' personal privacy during the auction, we propose a blockchain-based differential privacy (BDP) algorithm built on differential privacy theory and the characteristics of the DSB auction scheme, and show through privacy analysis and data validity analysis that it satisfies differential privacy and mean validity. Finally, we apply the BDP algorithm to the DSB auction scheme and realize a safe and efficient privacy-preserving double energy auction scheme, differential privacy-based double auction on blockchain (DPDAB), which develops optimal pricing and transaction strategies while protecting users' privacy during the auction. In addition, we analyze the influence of the BDP algorithm on the auction data and the computational time overhead it adds to the auction scheme, and demonstrate the effectiveness of DPDAB in terms of average benefit, user satisfaction, and social welfare through comparative experiments.

  • PAPERS
    LI Zi-qiang, YANG Wei, YANG Xian-feng, LUO Lin
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2891-2899. https://doi.org/10.12263/DZXB.20230648
    Abstract (1641) Download PDF (328) HTML (1599)

    Deep active learning (DAL) has achieved outstanding success in labeling classification data, but how to select samples so as to improve model performance remains a difficult problem. We propose a semi-automatic classification data labeling method based on weak-label dispute (Dispute about Weak Label-based Deep Active Learning, DWLDAL), which iteratively selects the samples that are hardest for the model to distinguish and has them manually annotated. The method contains a pseudo-label generator and weak-label generators: the pseudo-label generator is trained on accurately annotated data to produce pseudo labels for unlabeled data, while each weak-label generator is trained on a random subset of the pseudo-labeled data. A committee of weak-label generators then determines which unlabeled data are the most controversial and should be manually annotated. We conduct experimental validation on the common text classification datasets IMDB (Internet Movie Database), 20NEWS (20NEWSgroup), and chnsenticorp (chnsenticorp_htl_all), evaluating three different voting decision methods from the perspectives of data annotation accuracy and classification accuracy. The F1 score for data annotation of DWLDAL is 30.22%, 14.07%, and 2.57% higher, respectively, than that of the existing method Snuba, and the F1 score for the classification task is 1.01%, 22.72%, and 4.83% higher, respectively.

  • PAPERS
    SHI Qing, YANG Fei-ran, CHEN Xian-mei, YANG Jun
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2131-2140. https://doi.org/10.12263/DZXB.20230339
    Abstract (1627) Download PDF (373) HTML (1413)

    The performance of existing sampling rate offset (SRO) estimation algorithms degrades significantly in low signal-to-noise ratio (SNR) conditions. To address this problem, we propose a frequency-sliding double-cross-correlation processing (FS-DXCP) algorithm that estimates the SRO from the subband secondary generalized cross-correlation (SGCC) function. The proposed algorithm uses a frequency-domain sliding window to construct the subband SGCC function matrix of the sensor signals. Then, by means of the singular value decomposition (SVD), it adaptively mitigates the influence of low-SNR frequency bins on the estimate of the SGCC function. Finally, a higher-precision SRO estimate is obtained by tracking the maximum of the estimated SGCC function. Computer simulations show that the root mean squared error of the proposed method is 4.21 ppm at an SNR of -5 dB, about 8.17 ppm lower than that of the double-cross-correlation processing with phase transform (DXCP-PHAT) algorithm. The proposed algorithm effectively improves SRO estimation accuracy in low SNR conditions.
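
    For context, the generalized cross-correlation with phase transform that DXCP-style processing builds on can be computed per frame as below; this is a generic GCC-PHAT sketch with hypothetical signals, not the proposed subband FS-DXCP algorithm itself.

```python
import numpy as np

def gcc_phat(x, y, eps=1e-12):
    """Generalized cross-correlation with phase transform (one frame)."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + eps          # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    shift = np.argmax(np.abs(cc))
    return cc, (shift if shift < n // 2 else shift - n)  # lag in samples

# Hypothetical usage: y is x delayed by 3 samples.
x = np.random.randn(256)
y = np.concatenate([np.zeros(3), x[:-3]])
_, lag = gcc_phat(x, y)
print(lag)  # about -3
```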

  • LI Jia-ning, YAO Peng, JIE Lu, TANG Jian-shi, WU Dong, GAO Bin, QIAN He, WU Hua-qiang
    ACTA ELECTRONICA SINICA. 2024, 52(4): 1103-1117. https://doi.org/10.12263/DZXB.20230967
    Abstract (1621) Download PDF (1924) HTML (1597)

    The von Neumann computer architecture faces the "memory wall" bottleneck, which hinders further performance improvement in AI (Artificial Intelligence) computing. Computing-in-memory (CIM) breaks this limitation and greatly improves AI computing performance. At present, CIM schemes have been implemented in a variety of storage media and, according to the type of computation signal, can be divided into digital CIM and analog CIM. Although CIM has greatly improved AI computing performance, its further development still faces major challenges. This article provides a detailed comparative analysis of CIM schemes in different signal domains, pointing out the main advantages and disadvantages of each scheme as well as the challenges CIM faces. We believe that with cross-level collaborative research and development spanning process integration, devices, circuits, architecture, and software toolchains, CIM will provide more powerful and efficient computing for AI at both the edge and the cloud.

  • PAPERS
    FU Dong-lai, GAO Ze-an
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2407-2417. https://doi.org/10.12263/DZXB.20230565
    Abstract (1618) Download PDF (344) HTML (1549)

    Multi-graph learning is an important learning paradigm. Compared with multi-instance learning, in multi-graph learning a bag represents an object and each graph in the bag corresponds to a sub-object, a representation that can express the structural information of the sub-objects. However, existing multi-graph learning methods not only implicitly assume that the graphs in a bag are independent and identically distributed, but also mostly transform the multi-graph learning problem into a multi-instance learning problem, which easily loses the structural information of the individual graphs and the relationships between graphs. To address these problems, a structure-aware multi-graph learning method is proposed to effectively learn both the structural information of each graph and the relationships between graphs. The method uses graph kernels to retain the structural information of each graph by computing graph-to-graph similarities, expresses the structural information between graphs by generating bag-level graphs, and designs a bag encoder to learn this inter-graph structure effectively. Experimental results on the NCI(1), NCI(109), and AIDB datasets show that, compared with existing methods, the proposed method improves accuracy, precision, F1, and AUC by 5.97%, 3.44%, 4.48%, and 2.56%, respectively, while the recall rate decreases by 2.12%.

  • SURVEY AND REVIEW
    ZHANG Jin-feng, ZHANG Jin-cheng, REN Ze-yang, SU Kai, HAO Yue
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2151-2160. https://doi.org/10.12263/DZXB.20240103
    Abstract (1600) Download PDF (377) HTML (1499)

    Diamond surface-channel field-effect transistors use the two-dimensional hole gas (2DHG) on the hydrogen-terminated diamond surface as the channel to realize control of the output current by the input voltage, and they are the mainstream structure of diamond electronic devices. The 2DHG conductivity offers a wide, controllable range of sheet density and a high saturation drift velocity. This paper reviews the research progress of diamond field-effect transistors in DC, frequency, and power characteristics, and shows that low mobility is the main factor limiting the development of diamond-based low-power high-speed digital circuits, high-frequency devices, and high-power microwave devices. It also summarizes recent theoretical and experimental research on a new doping mechanism, similar to modulation doping, for the diamond surface conductivity. At room temperature, the 2DHG Hall mobility has increased to 680 cm2/Vs and the corresponding sheet resistance has decreased from about 10 kΩ/sq to 1.4 kΩ/sq, which is expected to bring a great improvement in the performance of diamond field-effect transistors.

  • PAPERS
    KANG Hai-yan, WANG Xiao-shi
    ACTA ELECTRONICA SINICA. 2024, 52(6): 1963-1976. https://doi.org/10.12263/DZXB.20220892
    Abstract (1596) Download PDF (952) HTML (1473)

    In deep learning privacy protection based on differential privacy, the length of the training period and the allocation of the privacy budget directly constrain the utility of the model. In existing methods that combine deep learning with differential privacy, the training period is limited and the privacy budget for the large number of features is allocated unreasonably, which leads to poor security and availability. We propose a deep learning method based on data feature relevance and adaptive differential privacy (RADP). First, on a pre-trained model, the method uses the layer-wise relevance propagation algorithm to calculate the average relevance between each feature parameter and the output on the original dataset, and uses an information-entropy-based method to compute the privacy metric, according to which Laplace noise is adaptively added to the average relevance. On this basis, the privacy budget is allocated according to the average relevance of each feature parameter, and Laplace noise is added to the feature parameters accordingly. Theoretical analysis shows that the proposed method satisfies ε-differential privacy while balancing security and availability. Experimental results on 3 real datasets, MNIST, Fashion-MNIST, and CIFAR-10, show that the accuracy and average loss of RADP are better than those of the AdLM (Adaptive Laplace Mechanism), DPSGD (Differential Privacy with Stochastic Gradient Descent), and DPDLIGDO (Differentially Private Deep Learning with Iterative Gradient Descent Optimization) methods, while RADP remains stable.
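
    The Laplace mechanism underlying the noise addition can be sketched as follows; the sensitivity, epsilon values, and the per-group budget split are illustrative stand-ins for the relevance-weighted allocation described above.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=np.random.default_rng()):
    """Release value plus Laplace noise with scale sensitivity / epsilon."""
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon,
                               size=np.shape(value))

# Hypothetical relevance-weighted budget split over three parameter groups:
# more relevant groups receive a larger share of the total budget (less noise).
relevance = np.array([0.6, 0.3, 0.1])
total_epsilon = 1.0
per_group_eps = total_epsilon * relevance / relevance.sum()

params = [np.array([0.12, -0.05]), np.array([0.40]), np.array([-0.22, 0.08])]
noisy = [laplace_mechanism(p, sensitivity=1.0, epsilon=e)
         for p, e in zip(params, per_group_eps)]
print(noisy)
```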

  • PAPERS
    WANG Xin-rui, JI Yuan, ZHANG Yin, CHEN Hong-gang, MU Ting-zhou
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2291-2299. https://doi.org/10.12263/DZXB.20230049
    Abstract (1585) Download PDF (323) HTML (1597)

    Based on super-pixel technology, a digital driving strategy for color silicon-based OLED (Organic Light Emitting Diode) micro-displays is proposed. By reusing adjacent pixel information, a single physical pixel can serve the imaging of multiple adjacent display pixels, greatly improving the display resolution. A digital driving circuit for color OLEDoS (Organic Light Emitting Diode on Silicon) micro-displays is designed: at a 120 Hz frame rate it achieves 256 grey levels and 4K display resolution while the circuit area and data transmitted per second are only 50% of those of the traditional driving mode. Test results show that the average OLED pixel current realized by the driving circuit ranges from 13.1 pA to 3.74 nA, which meets the requirements of near-eye micro-display applications.

  • PAPERS
    WANG Yu, WANG Zhen, WEN Li-qiang, LI Wei-ping, ZHAO Wen
    ACTA ELECTRONICA SINICA. 2024, 52(9): 2950-2960. https://doi.org/10.12263/DZXB.20221187
    Abstract (1564) Download PDF (107) HTML (1554)

    Document-level relation extraction aims to extract facts from the multiple sentences of unstructured documents and is a key step in building domain knowledge graphs and question-answering applications. The task requires the model not only to capture the complex interactions between entities based on document structure, but also to handle a severe long-tailed category distribution. Existing table-based relation extraction models attempt to address this, but they mainly model documents in a two-dimensional "entity/entity" space and use multi-layer convolutional networks or restricted self-attention to extract entity interaction features, so they cannot avoid the influence of category overlap or capture the directional features of relations, and the interaction features lack decoupled semantic information. To address these challenges, this paper proposes a new document-level relation extraction model, DRE-3DC (Document-Level Relation Extraction with Three-Dimensional Representation Combination Modeling), which extends "entity/entity" modeling to a three-dimensional "entity/entities/relationship" formulation. Using deformable convolution combined with a triple attention mechanism, the model effectively distinguishes and integrates interaction features from different semantic spaces and adaptively captures document structure. At the same time, a multi-task learning method is proposed to enhance the model's perception of the relation category combinations in a document and thereby alleviate the long-tailed distribution problem. Experimental results show improved scores on the DocRED and Revisit-DocRED datasets, and the effectiveness of the proposed method is further verified by ablation experiments, comparative analysis, and case studies.

  • PAPERS
    ZHANG Xin-yi, FANG Yi-hong, HUANG Xi-heng, ZENG Yan, QIN Yu-wen, XU Ou, LI Jiang-ping
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2074-2082. https://doi.org/10.12263/DZXB.20230078
    Abstract (1553) Download PDF (504) HTML (1475)

    An all-fiber few-mode erbium-doped amplifier was built to compare the effects of different pump modes and pump directions on the gain characteristics of three signal modes: LP01, LP11a, and LP11b. The experimental results show that the amplifier performs best under forward LP11 pumping: across the whole C band the signal gain exceeds 20 dB, the differential modal gain (DMG) is less than 0.9 dB, and the noise figure is less than 9.6 dB. At a signal input power of -10 dBm per mode, the gain of all three signal modes at 1 550 nm exceeds 20.8 dB, the DMG is as low as 0.3 dB, the noise figure of the LP01 signal light is below 6.2 dB, and that of the LP11 signal light is below 9.6 dB. Comparing the pump directions under the four pumping schemes shows that forward pumping gives the smallest noise figure but also lower gain for the three signal modes, whereas backward pumping increases the gain of the higher-order signal modes at the cost of a larger noise figure. Comparing the pump modes shows that, relative to LP01 pumping, LP11 pumping significantly increases the gain of the LP11 signal light while having little effect on the gain of the LP01 signal light, and can therefore reduce the DMG.

  • PAPERS
    GUO Yue-hao, WANG Xian-peng, LAN Xiang, SU Ting
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2103-2111. https://doi.org/10.12263/DZXB.20230172
    Abstract (1537) Download PDF (297) HTML (1464)

    Frequency diverse array (FDA) radar was proposed by Antonik and Wicks in 2006. Because there is a frequency offset between adjacent antennas of an FDA radar, the transmit array exhibits a two-dimensional dependence on range and angle. For bistatic FDA-multiple-input multiple-output (MIMO) radar, the direction of departure (DOD), direction of arrival (DOA), and range information are coupled in the transmit steering vector, and how to decouple these three parameters has become a focus of research. Aiming at target parameter estimation for bistatic FDA-MIMO radar, this paper proposes a reduced-dimension multiple signal classification (RD-MUSIC) parameter estimation algorithm based on a tensor framework. First, to decouple the DOD and range information in the transmit array, the transmit array is divided into subarrays; the signal subspace is then obtained by higher-order singular value decomposition and a two-dimensional spatial spectrum function is constructed. Second, the dimension of the spatial spectrum is reduced with the Lagrange multiplier method so that it depends only on the DOA, from which the DOA estimate is obtained. The frequency increment between subarrays is then used to decouple the DOD and range information and, at the same time, eliminate phase ambiguity. Finally, DOD and range estimates that are automatically paired with the DOA estimates are obtained. The proposed algorithm exploits the multidimensional structure of the high-dimensional data to improve estimation accuracy, while the reduced-dimension step effectively lowers the computational complexity. Numerical experiments show the superiority of the proposed algorithm.
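
    For readers unfamiliar with the range-angle coupling, the transmit steering vector of an M-element uniform linear FDA with carrier wavelength λ, frequency increment Δf, and element spacing d is commonly approximated as below (small cross terms dropped); this standard form is shown for orientation and is not the paper's subarray formulation:

```latex
\left[\mathbf{a}_{t}(\theta_{t}, r_{t})\right]_{m}
\;\approx\;
\exp\!\left\{ j2\pi\left( \frac{m\, d \sin\theta_{t}}{\lambda}
\;-\; \frac{m\,\Delta f\, r_{t}}{c} \right)\right\},
\qquad m = 0,1,\ldots,M-1
```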

  • PAPERS
    ZHANG Zi-xu, JIAN Zhi-hua
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2141-2150. https://doi.org/10.12263/DZXB.20230246
    Abstract (1532) Download PDF (244) HTML (1426)

    In any-to-any voice conversion, an encoder is usually used to disentangle speech from the same speaker and a decoder is used for self-reconstruction during training, but in the conversion phase the decoder must combine the content information of the source speech with the speaker characteristics of the target speech. This mismatch between the decoder's behavior in the conversion phase and in the training phase degrades conversion performance. This paper proposes a voice conversion method named DERS-VC (Double Exchange Representation Separation Voice Conversion) based on double exchange representation separation. In the self-reconstruction process of the training phase, the proposed method uses speech from the same speaker to simulate the voices of different target speakers for self-supervised training. Meanwhile, a conversion invariance loss and a cycle consistency loss are introduced, and the cyclic separation process is carried out via double exchange representation separation to make the self-reconstructed speech closer to the original speech. Experimental results show that, compared with the AGAIN-VC (Activation Guidance and Adaptive Instance Normalization Voice Conversion) method, DERS-VC reduces MCD (Mel-Cepstral Distortion) by 4.03% on average and increases MOS (Mean Opinion Score) by 3.62%, improving both the quality and the similarity of the converted speech. This shows that double exchange representation separation can reduce the decoder mismatch and improve the performance of any-to-any voice conversion.

  • PAPERS
    ZOU Guang-nan, YOU Qi-di, JIN Xing-hu, MA Yong-chun, LI Jie-yu
    ACTA ELECTRONICA SINICA. 2024, 52(6): 1903-1910. https://doi.org/10.12263/DZXB.20230661
    Abstract (1511) Download PDF (522) HTML (1433)   Knowledge map   Save

    Cloud-edge computing for the Internet of vehicles (CEIoV) can support real-time access and service requests from large-scale fleets of vehicles. To ensure the security of its internal resources, a vehicle's identity usually needs to be validated before it is allowed to access the CEIoV. However, because a vehicle is in motion and its computing, storage, and communication resources are limited, existing identity authentication protocols cannot be applied directly to authenticate a running vehicle in the CEIoV. This paper therefore proposes a lightweight continuous authentication (LCA) protocol that realizes vehicle authentication and guarantees the security of CEIoV internal resources. LCA is designed on the basis of a chameleon hash function; its implementation requires only simple cryptographic operations and is easy to deploy on resource-limited devices. The semantic security of LCA is rigorously proved in the random oracle model. Experimental results also show that, compared with prior schemes, LCA incurs lower computational and communication costs during continuous authentication.
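
    For readers unfamiliar with chameleon hashes, the toy sketch below shows the textbook discrete-log construction and its trapdoor-collision property with deliberately tiny, insecure parameters. It illustrates the primitive only and is not the LCA protocol itself; all names and parameter values are illustrative.

    ```python
    # Toy discrete-log chameleon hash (textbook construction, tiny insecure
    # parameters; shows only the trapdoor-collision property, not LCA itself).
    p = 1019          # small safe prime: q = (p - 1) // 2 is also prime
    q = (p - 1) // 2  # 509, order of the subgroup we work in
    g = 4             # generator of the order-q subgroup (4 = 2^2 mod p)
    x = 123           # trapdoor (secret key), 0 < x < q
    h = pow(g, x, p)  # public key

    def ch(m, r):
        """Chameleon hash CH(m, r) = g^m * h^r mod p."""
        return (pow(g, m, p) * pow(h, r, p)) % p

    def collide(m1, r1, m2):
        """With the trapdoor x, find r2 such that CH(m2, r2) == CH(m1, r1)."""
        return (r1 + (m1 - m2) * pow(x, -1, q)) % q   # needs Python 3.8+ for pow(x, -1, q)

    m1, r1 = 42, 7
    m2 = 99
    r2 = collide(m1, r1, m2)
    assert ch(m1, r1) == ch(m2, r2)   # same digest, different message
    ```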

  • PAPERS
    WU Yu-xuan, YU Hui-qun, FAN Gui-sheng
    ACTA ELECTRONICA SINICA. 2024, 52(8): 2878-2890. https://doi.org/10.12263/DZXB.20230523
    Abstract (1507) Download PDF (426) HTML (1468)   Knowledge map   Save

    Since traffic flow is affected by multiple factors such as periodic characteristics and unexpected conditions, the prediction accuracy of existing models cannot satisfy practical requirements. Against this background, this paper proposes a multimodal collaborative traffic flow prediction model based on error compensation (MCEC). To address the problem that traditional prediction models cannot jointly account for the time series and its covariates, a feature expansion method based on wavelet analysis is proposed: a clustering algorithm is introduced to obtain holiday labeling features, the congestion index, traffic accident map, and weather information are used as expanded features, and all of them are decomposed on multiple scales. In the training phase, a multimodal collaborative training scheme is designed that combines an ARIMA (AutoRegressive Integrated Moving Average) model, an LSTM (Long Short-Term Memory) network, a restricted dynamic time warping technique, and a self-attention mechanism, so that each part of the data is fully learned and matched to the most suitable model. In the error compensation stage, the corresponding intermediate process values are fed into an SVR (Support Vector Regression) based error compensation module, which learns and compensates the error of each component and reconstructs the prediction results. MCEC is validated on a publicly available real highway data set. Extensive comparison experiments at multiple time intervals show that the MAPE (Mean Absolute Percentage Error) of MCEC in traffic flow prediction reaches 17.02%, a higher prediction accuracy than that of prediction models such as LSTM-SVR, ConvLSTM (Convolutional Long Short-Term Memory network), ST-GCN (Spatial Temporal Graph Convolutional Networks), MFFB (Multi-stream Feature Fusion Block), and Transformer, indicating the validity and reasonableness of the MCEC model.
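
    The error-compensation stage can be pictured as fitting a regressor to the residuals of a base forecast, as in the hedged sketch below. The split into ARIMA/LSTM components, wavelet features, and time-warping matching used by MCEC is not reproduced; the function name, data interfaces, and SVR hyperparameters are assumptions.

    ```python
    # Sketch of the error-compensation idea (base forecast + SVR on residuals).
    # Illustrative only; MCEC's component decomposition is not reproduced here.
    from sklearn.svm import SVR

    def compensated_forecast(y_train, base_pred_train, base_pred_test, X_train, X_test):
        """base_pred_*: predictions of some base model; X_*: expanded feature matrices."""
        residuals = y_train - base_pred_train          # errors the base model made
        svr = SVR(kernel="rbf", C=10.0, epsilon=0.1)   # hyperparameters are assumptions
        svr.fit(X_train, residuals)                    # learn to predict the error
        return base_pred_test + svr.predict(X_test)    # compensate the forecast
    ```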

  • PAPERS
    CUI Yi-han, LIANG Yan, SONG Qian-qian, ZHANG Hui-xia, WANG Fan
    ACTA ELECTRONICA SINICA. 2024, 52(9): 2961-2970. https://doi.org/10.12263/DZXB.20230440
    Abstract (1477) Download PDF (1001) HTML (1471)   Knowledge map   Save

    With the increasing complexity of the modern battlefield environment and the upgrading of aviation equipment technology, massive multi-source heterogeneous sensor data inevitably suffer from inconsistency and incompleteness. Traditional multi-sensor fusion methods ignore the correlation among sensor features and form a closed, purely data-driven recognition system. Since knowledge such as expert cognition, domain experience, and attribute rules can guide model construction and inference in comprehensive target recognition, in the form of expert experience, rule constraints, and so on, this paper presents a knowledge-assisted integrated identification method for aerial targets. First, a military combat knowledge graph of typical aerial target features is constructed, and key feature parameters are extracted to establish a target identification framework model. Then, the basic belief assignment of the data and the credibility of evidence conflict are constructed at the recognition level and the decision level, respectively. In addition, a time-domain fusion rule for highly conflicting evidence is formulated, which uses historical data to adjust the temporal fusion weights. Finally, multi-sensor type recognition is realized hierarchically through static reasoning and dynamic fusion. The recognition accuracy of the proposed method is higher than that of existing algorithms on typical aerial target recognition tasks, demonstrating its effectiveness.
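
    The evidence-combination step underlying such fusion can be illustrated with Dempster's rule for two basic belief assignments over a small frame of discernment, as sketched below. The paper's conflict-credibility weighting and time-domain fusion rule are not reproduced; the hypothesis names and masses are made up for illustration.

    ```python
    # Dempster's rule of combination for two basic belief assignments
    # (generic evidence-theory step, not the paper's weighted fusion scheme).
    from itertools import product

    def dempster_combine(m1, m2):
        """m1, m2: dicts mapping frozenset hypotheses to masses that sum to 1."""
        combined, conflict = {}, 0.0
        for (a, wa), (b, wb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb                     # mass falling on the empty set
        k = 1.0 - conflict                              # normalization constant
        return {h: v / k for h, v in combined.items()}

    m1 = {frozenset({"fighter"}): 0.6, frozenset({"fighter", "bomber"}): 0.4}
    m2 = {frozenset({"fighter"}): 0.5, frozenset({"bomber"}): 0.3,
          frozenset({"fighter", "bomber"}): 0.2}
    print(dempster_combine(m1, m2))
    ```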

  • PAPERS
    GUO Zi-yue, QUAN Hui-min, PENG Zi-shun, DAI Yu-xing
    ACTA ELECTRONICA SINICA. 2024, 52(9): 3000-3009. https://doi.org/10.12263/DZXB.20230094
    Abstract (1476) Download PDF (1003) HTML (1458)   Knowledge map   Save

    Si/SiC cascaded H-bridge inverters combine different devices to achieve both low output current total harmonic distortion (THD) and high device efficiency. However, this also raises the problem of switching and assigning the Si and SiC cells. In this paper, a model predictive control (MPC) scheme with variable weight is designed to select the total switch state and assign the cell switch combination. In this method, a variable weight based on the switching loss of the devices is introduced into the cost function used to select the inverter's total switching state and the switching combination of the Si/SiC cells, improving the inverter's efficiency and reducing its output current harmonic distortion. The effectiveness of the variable-weight MPC is verified on a five-level Si/SiC cascaded H-bridge inverter: compared with fixed-weight MPC, the output current THD is reduced by up to 2.05% and the device loss by up to 4.53%.
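
    A variable-weight MPC cost of the kind described above can be sketched as a tracking-error term plus a switching-loss term whose weight is adapted online. The candidate enumeration, loss model, and weight-update law below are illustrative assumptions, not the authors' controller.

    ```python
    # Sketch of a variable-weight MPC cost: current-tracking error plus a
    # weighted switching-loss term. All names and the weight law are assumptions.
    import numpy as np

    def evaluate_candidates(i_ref, i_pred, e_sw, lam):
        """i_pred[k]: predicted current for candidate switch state k;
        e_sw[k]: estimated switching energy of candidate k; lam: variable weight."""
        cost = np.abs(i_ref - i_pred) + lam * e_sw
        return int(np.argmin(cost))                    # index of the best switch state

    def update_weight(tracking_error, lam_max=0.5, err_ref=0.1):
        # Example heuristic: shrink the loss penalty when tracking error grows,
        # so current quality is never sacrificed for efficiency.
        return min(lam_max, lam_max * err_ref / max(tracking_error, 1e-9))
    ```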

  • PAPERS
    JIANG Wen-tao, GAO Yuan, YUAN Heng, LIU Wan-jun
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2393-2406. https://doi.org/10.12263/DZXB.20240104
    Abstract (1470) Download PDF (611) HTML (1442)   Knowledge map   Save

    To extract more expressive and discriminative key features, reduce the loss of key features during network transmission, and improve the image classification ability of neural networks, a new gating-mechanism image classification network (GMNet) is proposed. Firstly, shallow features are extracted with gated convolution, in which the gating mechanism selectively performs the convolution operation and improves the network's ability to extract key features of the original image. Secondly, an interpolation gated convolution (IGC) module is designed, which combines Lanczos interpolation with gated convolution to enhance shallow features while extracting more discriminative ones, improving the nonlinear expression ability of the features. Then, a large kernel gated attention mechanism (LGAM) module is designed, which combines large kernel attention with gated convolution to achieve selective enhancement and fusion of features and to increase the contribution of key region features. Finally, the LGAM module is embedded into the residual branch so that the model learns the input data's features and contextual information more effectively, reducing the loss of key features during information transmission and improving the network's classification ability. The method achieves classification accuracies of 97.05%, 83.68%, 97.68%, 90.60%, and 83.05% on the CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof datasets, respectively, average improvements of 3.26%, 7.08%, 3.44%, 2.65%, and 5.02% over current advanced methods. Compared with existing mainstream network models, the proposed gated-mechanism image classification network enhances the nonlinear expression ability of features, extracts more expressive and discriminative key features, reduces the loss of key features, increases the contribution of key region features, and effectively improves the image classification ability of neural networks.
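
    The gating idea at the core of GMNet can be illustrated with a minimal gated-convolution block in PyTorch, shown below; it is a generic formulation and does not reproduce the IGC or LGAM modules, and the layer sizes are arbitrary.

    ```python
    # Minimal gated-convolution block (generic formulation of the gating idea;
    # GMNet's IGC and LGAM modules are not reproduced here).
    import torch
    import torch.nn as nn

    class GatedConv2d(nn.Module):
        def __init__(self, in_ch, out_ch, k=3):
            super().__init__()
            self.feature = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)  # feature branch
            self.gate = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)     # gating branch

        def forward(self, x):
            # The sigmoid gate decides, per position and channel, how much of
            # the convolved feature is passed on.
            return torch.sigmoid(self.gate(x)) * self.feature(x)

    x = torch.randn(1, 3, 32, 32)
    print(GatedConv2d(3, 16)(x).shape)   # torch.Size([1, 16, 32, 32])
    ```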

  • PAPERS
    ZHANG Yu-tong, DENG Xin, XU Mai
    ACTA ELECTRONICA SINICA. 2024, 52(1): 264-273. https://doi.org/10.12263/DZXB.20220893
    Abstract (1469) Download PDF (1293) HTML (1340)   Knowledge map   Save

    In recent years, significant progress has been made in multi-exposure image fusion in dynamic scenes. In particular, deep learning based methods have shown strong visual performance in dynamic multi-exposure image fusion and have become the mainstream approach to high dynamic range (HDR) imaging. However, most current deep learning based methods are implemented in a supervised manner and rely heavily on ground-truth images, which makes it difficult for them to work in real scenes. In this paper, we propose a self-supervised multi-exposure image fusion network for dynamic scenes. The main contributions are as follows: we design a self-supervised fusion network to explore the latent relationship between HDR and low dynamic range (LDR) images; we propose an attention mechanism based global deghosting module to reduce the ghosting artifacts caused by moving objects; we propose a merging reconstruction module with residual and dense connections to improve reconstruction details; and we design a motion mask guided self-supervised loss function to train the proposed network efficiently. Experimental results demonstrate the effectiveness of the proposed method: compared with state-of-the-art methods, it achieves higher objective and subjective quality on reconstructed HDR images with faster running speed.
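
    The attention-based deghosting idea can be sketched as reweighting the features of a non-reference exposure with an attention map computed against the reference exposure, as below. This is a common generic formulation, not the authors' exact module; channel sizes and layer choices are assumptions.

    ```python
    # Sketch of attention-guided deghosting: non-reference features are reweighted
    # by an attention map computed jointly with the reference-exposure features.
    import torch
    import torch.nn as nn

    class DeghostAttention(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.att = nn.Sequential(
                nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())

        def forward(self, f_ref, f_nonref):
            # Attention is low where the two exposures disagree (likely motion),
            # suppressing features that would otherwise cause ghosting.
            a = self.att(torch.cat([f_ref, f_nonref], dim=1))
            return f_nonref * a
    ```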

  • PAPERS
    HUANG He, MA Rui-hua
    ACTA ELECTRONICA SINICA. 2024, 52(7): 2300-2306. https://doi.org/10.12263/DZXB.20240018
    Abstract (1464) Download PDF (565) HTML (1422)   Knowledge map   Save

    In this paper, a wideband, dual-polarized antenna with an extremely low profile is developed for base station applications. The antenna evolves from two crossed fan-shaped dipoles. By adding annular branches and metallized through-holes at the ends of the dipoles, the port input impedance is increased while the antenna occupies a lower height. In addition, the flare angle of the fan-shaped arms is increased so that a second resonance is generated, which widens the bandwidth. The dual-polarized antenna provides a bandwidth of 22% over the 2.17~2.7 GHz band. Because the two dipoles are highly symmetric about the geometric center, both the isolation and the cross-polarization discrimination are high within the working band: the simulated isolation reaches 51 dB, and the simulated cross-polarization discrimination in the 0° direction reaches 48 dB. In addition, the simulated peak gain of the antenna is as high as 9.6 dBi. With its high isolation, high cross-polarization discrimination, and high gain, the antenna has good application prospects in base station systems.
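
    As a quick arithmetic check of the quoted figures, the snippet below computes the fractional bandwidth implied by the 2.17~2.7 GHz band using the usual (f_high - f_low)/f_center definition; it is an editorial verification, not part of the paper.

    ```python
    # Fractional bandwidth implied by the reported 2.17-2.7 GHz band.
    f_low, f_high = 2.17e9, 2.70e9                 # band edges in Hz
    f_center = (f_low + f_high) / 2                # 2.435 GHz
    fractional_bw = (f_high - f_low) / f_center
    print(f"fractional bandwidth = {fractional_bw:.1%}")   # ~21.8%, consistent with the quoted 22%
    ```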

  • PAPERS
    YOU Chun-xia, HU Qing-song, LI Shi-dang
    ACTA ELECTRONICA SINICA. 2024, 52(6): 2083-2090. https://doi.org/10.12263/DZXB.20220193
    Abstract (1463) Download PDF (486) HTML (1374)   Knowledge map   Save

    To reduce the large fluctuation of the optical signal-to-noise ratio (SNR) over the receiving plane in indoor visible light communication systems and to improve overall system performance, a light-source power optimization algorithm based on the fireworks algorithm, a swarm intelligence optimization method, is proposed. The algorithm takes the SNR factor of the wireless light-receiving plane as the optimization objective and optimizes the transmit power of each LED (Light Emitting Diode) light source, thereby obtaining the optimal SNR factor on the receiving plane and effectively reducing the fluctuation of the optical signal over it. The results show that with 16 point light sources, the SNR factor on the receiving plane is reduced by 45% compared with equal-power light sources, which significantly reduces the fluctuation of the received optical signal and ensures that optical communication users at different positions obtain the same communication quality. The method is applicable to any number and placement of LED light sources. Comparative analysis further shows that as the number of light sources increases, the SNR factor on the receiving plane becomes smaller and the SNR distribution becomes more uniform.
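
    The kind of objective such a power optimizer would minimize can be sketched as a fluctuation measure of the received optical power over the plane under a simplified Lambertian line-of-sight model, as below. The geometry, the omission of receiver area and field-of-view terms, and the exact definition of the fluctuation factor are illustrative assumptions, not the paper's setup.

    ```python
    # Sketch of a receiving-plane uniformity objective under a simplified
    # Lambertian line-of-sight model (constant factors such as detector area
    # are omitted since they do not affect the fluctuation measure).
    import numpy as np

    def received_power(powers, led_xy, grid_xy, h=2.15, m=1):
        """powers[i]: transmit power of LED i; Lambertian order m; LED height h (m)."""
        d_xy = np.linalg.norm(grid_xy[:, None, :] - led_xy[None, :, :], axis=-1)
        d = np.hypot(d_xy, h)                          # LED-to-receiver distances
        cos = h / d                                    # irradiance angle = incidence angle
        gain = (m + 1) / (2 * np.pi * d**2) * cos**(m + 1)
        return gain @ powers                           # total received power per grid point

    def fluctuation_factor(powers, led_xy, grid_xy):
        p = received_power(np.asarray(powers, float), led_xy, grid_xy)
        return (p.max() - p.min()) / p.mean()          # smaller = more uniform plane
    ```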