This paper proposes a contactless human vital-sign sensing method for non-line-of-sight (NLoS) scenarios, which integrates reconfigurable intelligent surfaces (RISs) into wireless human sensing and achieves NLoS vital-sign sensing by exploiting the flexible manipulation capability of RISs over electromagnetic waves. First, through a vision-aided mechanism, human targets are located using the deep learning model Yolo-v7. Then, the metasurface optimizes its coding matrix based on the estimated human location, thereby altering the propagation path of the sensing signals to achieve human sensing in NLoS scenarios. In addition, to address the shortcomings of the traditional variational mode decomposition (VMD) algorithm in human vital-sign sensing, we further propose an improved VMD algorithm for precise estimation of breathing and heartbeat rates. Experimental results demonstrate that the vision-aided module can accurately locate the human chest, enabling precise beamforming controlled by the intelligent metasurface. The proposed filtering-based VMD algorithm achieves precise estimation of human vital signs, with average estimation errors of 0.6 RPM for respiration and 5.3 BPM for heartbeat, respectively. Further analysis also demonstrates the effectiveness and accuracy of the RIS-based sensing scheme for human vital-sign sensing in NLoS scenarios.
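The abstract does not detail the improved filtering-based VMD, so the following is only a minimal sketch of the rate-estimation step it ultimately relies on: once a chest displacement (or phase) signal has been recovered, the breathing and heartbeat rates can be read off as dominant spectral peaks in their respective frequency bands. The signal, sampling rate, and band edges below are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def estimate_rate(signal, fs, band):
    """Estimate a periodic rate in cycles per minute from a 1-D chest signal by
    band-pass filtering and picking the dominant FFT peak inside `band` (Hz)."""
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return 60.0 * freqs[mask][np.argmax(spectrum[mask])]

# synthetic chest signal: 0.25 Hz breathing + weak 1.2 Hz heartbeat, sampled at 20 Hz
fs = 20.0
t = np.arange(0, 60, 1.0 / fs)
chest = np.sin(2 * np.pi * 0.25 * t) + 0.1 * np.sin(2 * np.pi * 1.2 * t)
print(estimate_rate(chest, fs, (0.1, 0.5)))   # ~15 RPM (respiration band)
print(estimate_rate(chest, fs, (0.8, 2.0)))   # ~72 BPM (heartbeat band)
```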
With the development of mobile communication technology, 6G (6th-Generation) networks, as the next generation of intelligent digital information infrastructure, will no longer focus only on the transmission and reproduction of signals. Instead, they will need to achieve efficient perception and understanding of the surrounding environment based on electromagnetic propagation, and acquire semantic knowledge to assist intelligent communication agents in prediction, decision-making, beamforming, and other tasks. Therefore, compared with traditional channel models, the ability to understand, reconstruct, and express the semantics of the physical environment has become an important characteristic of intelligent wireless channel models. This paper proposes a method for semantic analysis and modeling of wireless channels, which includes three levels of semantics: state semantics, behavior semantics, and event semantics, corresponding to the instantaneous multipath of the channel, the time-varying trajectory of the channel, and the topological structure of the channel, respectively. In addition, based on a vehicular integrated sensing and communication (ISAC) channel measurement platform, this paper conducts semantics-oriented wireless channel measurements at 28 GHz. The channel semantics are decomposed, identified, and modeled based on the measured data, with a focus on analyzing the multipath distribution characteristics of the channel under the three semantics and completing semantics-guided channel generation. The results show that the channel semantics model can generate more accurate channels while expressing richer semantic information. This work explores new methods for intelligent channel modeling at the semantic level, promoting the ability of communication systems to understand and recognize the environment by deeply mining the semantic features of wireless channels, thereby improving communication efficiency and quality.
Tactile sensors are developed to mimic human tactile perception: they detect and quantify mechanical or thermal stimuli generated during physical contact and convert the information into electrical signals, endowing electronic systems with the ability to sense touch. As a result, tactile sensors have attracted significant attention in the fields of human-machine interfaces and robotics. With the development of nanotechnology, materials science, and information technology, tactile sensors are advancing toward flexibility and miniaturization to adapt to applications on complex curved and movable surfaces. Screen printing, as a mature planar patterning process, has been widely used in the fabrication of flexible electronics. Owing to its flexible material selection, low processing cost, and fast production speed, screen printing has great potential to promote the large-scale application of tactile sensors. In this paper, we review the current research status and recent progress of screen-printed tactile sensors from three aspects: “the role of screen printing”, “the mechanism of screen-printed tactile sensors”, and “the methods to improve the sensitivity of screen-printed tactile sensors”. In the first aspect, the roles of screen printing are classified according to the function of the screen-printed part, including printing the “conductive layer”, “active layer”, and “structural layer”. In the second aspect, the two types of sensing mechanisms, active and passive sensing, are summarized separately, further demonstrating the compatibility of the screen-printing process. In the third aspect, four kinds of structures fabricated by screen printing are summarized to show the potential of screen-printed tactile sensors for high sensitivity. As a result of the review, the methods of tactile-sensor fabrication based on screen printing are summarized, and the advantages of screen printing are revealed. Finally, based on the challenges faced by screen-printed tactile sensors, an outlook on the future direction of their development is given to provide a reference for related research.
The quality factor is one of the key parameters of whispering gallery mode (WGM) optical microcavities; a high quality factor means that the microcavity can exhibit better characteristics and broader application prospects in nonlinear optics, coherent optical communication, and microwave photonics. To improve the quality factor, this paper controls the fabrication of the microcavity with an automated program and optimizes the polishing and annealing processes. The prepared silica microrod cavity features an ultra-high quality factor, low cost, and high fabrication efficiency, with a quality factor of up to 3×10⁹. Based on a modular packaging scheme with integrated temperature control, the anti-interference capability of the microcavity coupling system is improved, and the frequency drift over one hour is reduced by a factor of 10. Exploiting the ultra-low nonlinear threshold power (as low as 266 μW) enabled by the ultra-high quality factor, this paper conducts sensing experiments based on the Kerr optical frequency comb generated by the microrod cavity and realizes multi-mode ambient temperature sensing with a sensitivity of 8.40 pm/℃ and a measurement range of more than 30 ℃. These results provide a powerful tool for high-precision sensing, large-capacity optical communication, low-threshold lasers, and other applications.
To reduce the high analysis cost associated with the integrated structural-electromagnetic optimization design of mesh antennas, a multi-fidelity method based on adaptive space mapping is proposed. Based on the connection relationships between the cables and trusses, the analysis models of mesh antennas are classified into high-fidelity and low-fidelity models. Using a space-mapping matrix, high-fidelity samples are mapped into the low-fidelity space, thereby enhancing the correlation between the high- and low-fidelity analyses. A multi-fidelity model is then established from the low-fidelity samples and the mapped high-fidelity samples, and finally applied to mesh antennas. Compared with traditional multi-fidelity models, the space-mapping-based multi-fidelity model achieves an average success-rate increase of 47.3% on test functions with spatial bias. In the application case of form design for mesh antennas, compared with traditional particle swarm optimization (PSO), the optimization results are improved by an average of 0.515 dB at the same cost. Furthermore, compared with optimization using the traditional multi-fidelity model, the optimization results are improved by an average of 0.321 dB. The effectiveness of the method is validated through numerical experiments and a practical application of the integrated structural-electromagnetic optimization design of mesh antennas.
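As a rough illustration of the space-mapping idea (not the paper's adaptive algorithm), the sketch below fits a linear mapping matrix by least squares so that high-fidelity samples are carried into the low-fidelity space before the multi-fidelity surrogate is built; all sample data here are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical paired samples: the same 20 designs described by 6 variables in the
# high-fidelity model and 6 variables in the low-fidelity model
X_hf = rng.random((20, 6))
X_lf = X_hf @ rng.random((6, 6)) + 0.05 * rng.standard_normal((20, 6))

# least-squares space-mapping matrix B such that X_hf @ B ≈ X_lf
B, *_ = np.linalg.lstsq(X_hf, X_lf, rcond=None)

# high-fidelity samples carried into the low-fidelity space, ready to be pooled with
# the cheap low-fidelity samples when fitting the multi-fidelity surrogate
X_hf_mapped = X_hf @ B
```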
We investigate the optimal intersection problem of a direction-finding cross-location system composed of 1D and 2D passive sensors. Using closed-form solutions for localization accuracy, extremum analysis, and geometric intersection analysis, we identify the global optimal intersection point and explore the spatial distribution characteristics of optimal intersection positions, as well as their influencing factors and underlying principles. The study reveals that the global optimal intersection point lies in the horizontal plane of the baseline (or 2D sensor). The optimal intersection locations are jointly determined by the geometric intersection characteristics and the distance-diffusion effect of measurement errors; they are distributed around an arc on the horizontal plane centered at the midpoint of the baseline with the baseline length as the diameter, collapsing towards the baseline. Variations in sensor positions do not affect the position of the optimal intersection location relative to the baseline; once the variance ratio of the baseline and angular measurement errors is established, the optimal intersection location is determined. Furthermore, case analysis suggests that the optimal intersection area converges towards the sensor with the larger angular measurement error. In practical engineering applications, the optimal intersection area holds greater utility than the optimal intersection point; matching the optimal intersection locations with target detection results or estimated positions can effectively enhance the system’s positioning performance.
With the development of the intelligent era, more and more devices are equipped with cameras and display screens that use various video formats and interfaces. To bridge these gaps, video bridging is widely required. Previous solutions adopted field programmable gate arrays (FPGAs), graphics processing units (GPUs), and application specific integrated circuits (ASICs). However, they struggle to meet the requirements of low cost, ultra-low power consumption, and miniaturization, especially in the field of mobile display. This paper proposes a novel heterogeneous architecture that seamlessly integrates FPGA, microcontroller unit (MCU), ASIC, and memory into a single silicon chip. The chip not only achieves miniaturization but also offers low cost and low power consumption; more importantly, it supports bridging requirements for different interfaces and video formats. In addition, this paper provides evaluation methods and solutions for different algorithm applications, providing a basis for the architecture design. The chip has been successfully taped out in an industrial 22 nm process. It supports video input formats with a resolution of 3 840×2 160 at a refresh rate of 144 Hz, as well as video output formats with a resolution of 1 080×2 340 at a refresh rate of 90 Hz. The experimental results show that, when supporting similar functions, the overall chip size is about 4 mm×4 mm and the total power consumption is only about 200 mW, both of which are less than one tenth of those of the AMD XC7K325T and Zynq Z7035. In other words, for video bridging scenarios, our solution offers significant optimization over traditional commercial FPGAs in terms of cost and power consumption.
The post-quantum cryptography (PQC) standardization process of the National Institute of Standards and Technology (NIST) has entered its fourth round. Bit Flipping Key Encapsulation (BIKE) is one of the four candidates currently under evaluation. In BIKE key generation, polynomial multiplication consumes considerable time and area resources; it is also one of the slowest and most area-consuming operations in most cryptographic systems. In this work, we propose an overlap-free polynomial multiplier based on the Karatsuba algorithm (KA), which can efficiently implement polynomial multiplication of tens of thousands of bits with low latency, high performance, and small area. The multiplier is applied to the BIKE key generation algorithm, which is implemented in a hardware architecture on a field programmable gate array (FPGA), improving the original compact polynomial multiplication and polynomial inversion algorithms. The proposed multiplier can adapt to different area and delay requirements by using different operand bit widths. Compared with BIKE’s original design, the improved design reduces the delay of the key generation module by 36.54% and the area-delay product (ADP) by 10.4%.
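For intuition only, here is a minimal software sketch of Karatsuba multiplication over GF(2), where polynomial addition is XOR; the paper's overlap-free hardware multiplier is architecturally different, and the threshold below is an arbitrary illustrative choice.

```python
def gf2_karatsuba(x, y, threshold=64):
    """Carry-less (GF(2)) multiplication of two bit-polynomials stored in Python
    ints, using the Karatsuba split; addition of polynomials is XOR."""
    n = max(x.bit_length(), y.bit_length())
    if n <= threshold:                      # schoolbook shift-and-XOR base case
        result = 0
        while y:
            if y & 1:
                result ^= x
            x <<= 1
            y >>= 1
        return result
    m = n // 2
    mask = (1 << m) - 1
    xl, xh = x & mask, x >> m
    yl, yh = y & mask, y >> m
    zl = gf2_karatsuba(xl, yl, threshold)               # low * low
    zh = gf2_karatsuba(xh, yh, threshold)               # high * high
    zm = gf2_karatsuba(xl ^ xh, yl ^ yh, threshold)     # (low+high) * (low+high)
    return (zh << (2 * m)) ^ ((zm ^ zh ^ zl) << m) ^ zl

# (x^3+x+1)(x^3+x^2+1) = x^6+x^5+x^4+x^3+x^2+x+1 over GF(2)
assert gf2_karatsuba(0b1011, 0b1101, threshold=2) == 0b1111111
```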
The column-level readout circuit is the most direct way to improve the readout efficiency of planar image sensors. However, for high-speed readout of the large data volumes and heavy loads of sensors with over one hundred million pixels, the design of the column buffer for the parallel-to-serial conversion from the column level to the output level faces great challenges. In this paper, we propose a dual-feedback-loop column buffer design method, which effectively suppresses the impact of the oversized parasitic parameters of the column bus on the settling time by implementing a dual feedback loop between the proximal and distal outputs of the column buffer, while ensuring the accuracy of the analogue signals under low-noise and high-dynamic-range conditions. Based on a 55 nm complementary metal oxide semiconductor (CMOS) process, the design has been successfully applied in a 12 288 × 12 288-pixel infrared image sensor. The results show that, compared with the traditional column buffer, the proposed dual-feedback-loop column buffer shortens the rise settling time by 23.4% and the fall settling time by 21.9%, and the frame rate of the hundred-megapixel high-speed image sensor can be improved by more than 29.6%.
In this paper, the angle-selection phenomenon of InSb materials with near-zero refractive index is studied. The multilayer structure is composed of a main structure and an anti-reflection structure. The main structure is used to generate the angle-selection phenomenon, while the anti-reflection structure is used to suppress the leakage of electromagnetic waves. The results show that for the THz TE wave, a significant angle-selection window appears near 2.65 THz, while for the THz TM wave, the window appears near 10.5 THz. For different polarization forms, temperature control can significantly adjust the angular range of the selection window. In addition, the critical-angle characteristic is very sensitive to changes in the refractive index of the background medium, and the measurement range can be extended by temperature control. At temperatures of 300 K, 298 K, and 296 K, the measurement ranges are 1.1~1.3 RIU, 1.3~1.5 RIU, and 1.5~1.7 RIU, respectively. Compared with constant-temperature measurement, the refractive index measurement range is expanded by 200%. This structure uses a novel critical-angle principle for refractive index sensing and extends the detection range through temperature control, which provides a new idea for the development of optical sensors.
In this paper, a millimeter-wave ultra-wideband Wilkinson power divider with an impedance compensation technique is presented. By adding parallel LC resonance networks at the output ports of a traditional lumped-element power divider, the bandwidth can be increased effectively by compensating the input-output impedance and introducing additional matching and lossless frequency points. The proposed power divider is implemented in 65 nm complementary metal oxide semiconductor (CMOS) technology with a compact core area of 0.021 mm². The measurement results indicate an insertion loss of 1.35~1.55 dB with a 0.2-dB amplitude bandwidth of 44~96 GHz, while the input/output return loss and isolation are better than 11 dB across the entire operating frequency band. The proposed power divider achieves a wider amplitude bandwidth with good port matching compared with previously published power dividers.
Traditional phased arrays, owing to their high cost, can no longer meet the growing demand for widespread applications; consequently, non-traditional phased-array technologies based on sparse arrays, subarrays, and related techniques have received widespread attention and research. How to partition subarrays effectively and optimize the subarray computation process are key issues in improving computational efficiency and performance. This article proposes a nested iterative optimization method that integrates swarm intelligence optimization and clustering techniques to solve the subarray partitioning problem for arbitrarily shaped beams. The method consists of two nested iterative optimization loops: (i) the outer loop uses a swarm intelligence optimization method to obtain a reference array for any user-defined pattern, and derives multiple sets of different element excitations (determined by the roots of the Schelkunoff polynomial distributed off the unit circle) using the Schelkunoff polynomial and basic algebraic theory; (ii) based on an excitation matching strategy, the inner loop obtains the optimal subarray layout and the corresponding subarray excitation coefficients of the phased array through K-means clustering, ultimately generating a beam pattern that approximates that of the reference array. The effectiveness of the proposed method is verified by comparing it with traditional K-means clustering and particle swarm optimization methods in terms of pattern approximation, excitation matching error, pattern matching error, array performance parameters, and computational efficiency.
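The inner-loop excitation-matching step can be illustrated with a small sketch: cluster the reference element excitations in the complex plane with K-means, assign each element to the subarray of its cluster, and take the cluster centroid as the shared subarray excitation. The array size, number of subarrays, and random excitations below are assumptions for illustration, and the Schelkunoff-based outer loop is not shown.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# hypothetical reference excitations of a 64-element array (complex amplitudes)
excitations = rng.random(64) * np.exp(1j * 2 * np.pi * rng.random(64))

# cluster the elements in the (Re, Im) plane; each cluster becomes one subarray
features = np.column_stack([excitations.real, excitations.imag])
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(features)

subarray_of_element = km.labels_                                   # element -> subarray index
subarray_excitations = km.cluster_centers_[:, 0] + 1j * km.cluster_centers_[:, 1]
```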
In unsupervised domain adaptation tasks, the source and target domains usually do not satisfy the independent and identically distributed assumption. To generate usable labels for the target domain, classical domain adaptation methods select the category with the highest classifier prediction probability as the pseudo-label of each target sample. The pseudo-labels therefore inevitably contain noise, which may cause negative transfer in the domain adaptation model. In addition, traditional adversarial domain adaptation methods usually consider only the global distribution between domains and ignore the category information of samples; how to extract discriminative category-level features in domain adaptation tasks is thus also an important problem. Therefore, an unsupervised adversarial domain adaptation method using feature anomaly detection and pseudo-label regression is proposed. Target samples predicted by the classifier as the same class form a category subdomain within the target domain. A Gaussian-uniform mixture model is used to detect subdomain samples whose distance from the class mean is abnormal; the posterior probability of each sample is calculated to measure the correctness of its pseudo-label within the subdomain and is used as a loss factor to limit the influence of pseudo-labels on the model during training. Meanwhile, a pseudo-label regression function is used to reduce the difference between the predicted label and the high-confidence pseudo-label of the classifier, and the category constraint on the unlabeled target domain is adopted to improve the distinguishability of feature categories. Experimental results show that the average recognition accuracies of the proposed method on the Office-31, Image-CLEF, and Office-Home datasets are 90.2%, 89.6%, and 69.5%, respectively, all higher than those of related popular algorithms.
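A minimal sketch of the Gaussian-uniform mixture step, under the assumption that pseudo-label credibility is judged from each sample's feature distance to its class mean: correct labels are modeled by a Gaussian component and outliers by a uniform component, and the EM posterior of the Gaussian component serves as the per-sample loss factor. Initialization and iteration counts are illustrative choices.

```python
import numpy as np

def gaussian_uniform_posterior(d, n_iter=50):
    """Posterior probability that each distance d[i] comes from the Gaussian component
    (pseudo-label likely correct) rather than the uniform outlier component."""
    d = np.asarray(d, dtype=float)
    u_density = 1.0 / (d.max() - d.min() + 1e-12)       # uniform outlier component
    pi, mu, sigma = 0.9, d.mean(), d.std() + 1e-6       # initial mixing weight / Gaussian params
    for _ in range(n_iter):
        g = np.exp(-0.5 * ((d - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
        resp = pi * g / (pi * g + (1 - pi) * u_density + 1e-12)   # E-step responsibilities
        pi = resp.mean()                                          # M-step updates
        mu = (resp * d).sum() / (resp.sum() + 1e-12)
        sigma = np.sqrt((resp * (d - mu) ** 2).sum() / (resp.sum() + 1e-12)) + 1e-6
    return resp

# distances of subdomain samples to their class mean (synthetic: mostly close, a few far)
d = np.concatenate([np.abs(np.random.default_rng(0).normal(1.0, 0.2, 50)), [3.0, 4.5, 5.2]])
credibility = gaussian_uniform_posterior(d)   # small values flag likely-wrong pseudo-labels
```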
The advance of urban digitalization has generated a large amount of data. Through the integrated analysis of traffic flow data and weather data, urban traffic congestion caused by various weather conditions can be effectively alleviated. However, existing traffic flow prediction algorithms do not fully consider the potential spatial relationships in traffic flow and ignore prediction errors caused by external factors such as weather, which greatly affects prediction accuracy. To address these problems, this paper proposes TCM-DTFP (Two-graph Convolution Mechanism-based Digital Twin Flow Prediction), a digital twin traffic flow prediction method based on a two-graph convolution mechanism. The algorithm builds an augmented matrix that integrates traffic flow features and weather features, adding weather features to the traffic flow data to mitigate the impact of complex weather conditions on traffic flow prediction and improve the robustness of the algorithm. Meanwhile, to improve the algorithm’s ability to capture the spatial correlation of traffic flow, a two-graph convolution mechanism based on TCN (Temporal Convolutional Networks) is proposed to comprehensively consider the dynamic interaction among temporal correlation, spatial correlation, and regional flow in their influence on traffic flow. Finally, extensive experiments on two real datasets, TaxiBJ and PeMSD4, demonstrate the effectiveness of the method.
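A toy sketch of building the augmented matrix that attaches weather features to gridded traffic flow; the grid size, time steps, and feature dimensions are hypothetical, and the TCN-based two-graph convolution itself is not shown.

```python
import numpy as np

# hypothetical shapes: inflow/outflow on a 32x32 city grid over 48 time steps,
# plus 4 per-step weather features (e.g. temperature, rainfall, wind, visibility)
T, H, W = 48, 32, 32
traffic = np.random.rand(T, H, W, 2)
weather = np.random.rand(T, 4)

# broadcast each step's weather vector to every grid cell and stack it onto the flow channels
weather_grid = np.broadcast_to(weather[:, None, None, :], (T, H, W, 4))
augmented = np.concatenate([traffic, weather_grid], axis=-1)   # (T, H, W, 6)
```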
Social recommender systems based on graph neural networks (GNNs) have achieved promising performance. However, GNN-based social recommendation models face challenges: the neighborhood aggregation operation amplifies the noise in users' implicit behaviors, resulting in suboptimal user and item representations, and the heterogeneity of edges in the user-item graph and the user social relationship graph causes the user representations to be learned in two different semantic spaces, so that direct fusion also yields suboptimal representations. To address these issues, this paper proposes a social recommendation model based on self-supervised graph convolution and an attention mechanism to achieve implicit feedback denoising. The model captures users' true interests from the original user-item graph, generating a denoised user-item interaction graph, and a novel user-vector fusion method is introduced to integrate the heterogeneous user vector representations. Experimental results on two public datasets demonstrate that the proposed model significantly improves recommendation performance over the baseline models: on the lastfm dataset, the improvement ranges from 1.18% to 3.87%, while on the ciao dataset, it ranges from 3.56% to 7.31%. The effectiveness of each module is verified through ablation experiments.
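One plausible way to fuse the two heterogeneous user embeddings with attention (a sketch under assumptions; the paper's exact fusion module is not specified in the abstract) is to score each embedding, softmax the scores, and take the weighted sum:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse a user's interaction-graph and social-graph embeddings with learned
    attention weights instead of simple concatenation or averaging."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1, bias=False)

    def forward(self, e_interact, e_social):
        stacked = torch.stack([e_interact, e_social], dim=1)   # (B, 2, d)
        weights = torch.softmax(self.score(stacked), dim=1)    # (B, 2, 1)
        return (weights * stacked).sum(dim=1)                  # (B, d)

fused = AttentionFusion(64)(torch.randn(4, 64), torch.randn(4, 64))   # (4, 64)
```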
In the process of discovering program defects and malicious code, it is necessary to analyze the behavioral similarity of binary programs. Currently, syntax-based similarity analysis methods often ignore the execution semantics of the program, resulting in low analysis accuracy, while semantics-based analysis methods frequently call constraint solvers to compare the symbolic logic formulas they generate, resulting in significant time overhead. This article proposes a fuzzy code similarity matching method for binary programs based on statistical inference. Starting from instruction-level similarity, the semantic similarity between basic blocks and then between functions is inferred step by step. First, the binary code is divided into a set of fragments in a normalized form according to certain rules, and dynamic programming is used at the basic-block granularity to construct a longest-common-subsequence table of fragments with the same execution semantics, thereby obtaining an initial semantic mapping of instructions between basic blocks. Then, the mapping is extended to the target code through neighborhood search, during which the execution semantics of the fragments are learned. Finally, statistical analysis is performed on the matched fragments to compute the similarity of the binary codes. In the experiments, an unsupervised pre-training analysis method was used to improve the accuracy of code similarity analysis by tuning the pre-trained model parameters. Experiments were conducted on 13 mainstream open-source projects across platforms and optimization options. The results show that, compared with the comparison tools, the analysis accuracy of our method improves by an average of 7.26%; meanwhile, ablation experiments show that the proposed pre-trained model effectively improves the semantic matching performance for binary programs.
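A minimal sketch of the instruction-level starting point: with instructions normalized so that registers and immediates become placeholders, a dynamic-programming longest-common-subsequence table yields an initial similarity between two basic blocks. The normalization scheme and the similarity ratio below are illustrative assumptions.

```python
def lcs_length(a, b):
    """Dynamic-programming longest common subsequence between two sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def block_similarity(block_a, block_b):
    """Similarity of two basic blocks given as lists of normalized instructions."""
    if not block_a or not block_b:
        return 0.0
    return 2.0 * lcs_length(block_a, block_b) / (len(block_a) + len(block_b))

# instructions normalized so registers/immediates become placeholders
a = ["mov reg, reg", "add reg, imm", "cmp reg, imm", "jne addr"]
b = ["mov reg, reg", "sub reg, imm", "cmp reg, imm", "jne addr"]
print(block_similarity(a, b))   # 0.75
```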
Vehicle trajectory anomaly detection provides important security support for various location-based services. Machine learning-based methods, as the mainstream detection approach, have been widely applied in fields such as transportation and the military. However, due to noisy labels, existing anomaly detection methods perform poorly in practical applications. To solve this problem, this paper proposes a vehicle trajectory anomaly detection method based on noise-label re-weighting (RW-TAD). The method uses a self-supervised approach to construct a sample weight estimator, which evaluates the credibility of the given labels by calculating the probability of trajectory generation. A detector based on a weighted loss is then used to detect anomalous trajectories. During training, the RW-TAD model uses a collaborative optimization strategy based on a dual-layer loss to jointly learn the sample weight estimator and the detector. Experimental results show that the method effectively alleviates the interference of noisy samples in model training and achieves good performance; compared with existing methods, it substantially improves detection accuracy and performance stability.
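A small sketch of the re-weighting idea, assuming the weight estimator outputs a per-trajectory credibility in [0, 1]: the detector's per-sample loss is simply scaled by that credibility so that suspect labels contribute less.

```python
import torch
import torch.nn.functional as F

def reweighted_detection_loss(logits, labels, weights):
    """Per-sample binary cross-entropy scaled by estimated label credibility, so
    trajectories with suspect (noisy) labels contribute less to training."""
    per_sample = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    return (weights * per_sample).sum() / (weights.sum() + 1e-8)

# toy usage: 4 trajectories with credibility weights from the weight estimator
logits = torch.randn(4)
labels = torch.tensor([0.0, 1.0, 0.0, 1.0])
weights = torch.tensor([0.9, 0.2, 1.0, 0.7])
loss = reweighted_detection_loss(logits, labels, weights)
```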
Six degrees of freedom (6DoF) video, which allows users to experience a scene from omnidirectional and arbitrary perspectives, is the development direction of next-generation immersive video systems. Windowed 6DoF video with limited degrees of freedom has been a hot research topic in recent years. This paper proposes a subjective database and an objective quality assessment method for windowed 6DoF synthesized video. For the subjective part, we build a subjective quality database called Windowed-6DoF, which contains 128 windowed 6DoF synthesized videos covering discomfort caused by two viewpoint-switching paths, distortions caused by four rendering schemes, and four levels of compression; subjective quality tests are then conducted on the database and the results are analyzed. For objective quality assessment, we design a no-reference quality assessment method for windowed 6DoF synthesized video that fuses multilayer features. Tchebichef moments are used to extract low-level shape features from temporal video slices, and the ResNet-50 network is used to extract high-level semantic features of the video in the temporal and spatial domains while reducing the feature dimensionality. Finally, a random forest is used to fuse the low-level shape features and high-level semantic features and to train the quality assessment model for windowed 6DoF synthesized video. We test the method on the proposed Windowed-6DoF database and the IRCCyN/IVC DIBR database; the Pearson linear correlation coefficients of the proposed method are 0.932 7 and 0.858 1, respectively. The predicted scores of the objective method are consistent with the subjective assessment scores.
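The final fusion stage can be sketched as training a random forest regressor on the concatenated low-level and high-level features against subjective scores; the feature dimensions and data below are placeholders, not the paper's actual features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# placeholder features: per-video Tchebichef-moment shape features (low level) and
# ResNet-50 semantic features (high level), with subjective MOS labels
shape_feat = rng.random((128, 20))
semantic_feat = rng.random((128, 256))
mos = rng.random(128) * 5

X = np.hstack([shape_feat, semantic_feat])        # feature-level fusion
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, mos)
predicted_quality = model.predict(X)
```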
With the popularity of social networks and the rapid growth of multimedia data, efficient cross-modal retrieval has attracted increasing attention. Hashing is widely used in cross-modal retrieval tasks due to its high retrieval efficiency and low storage cost. However, most deep learning-based cross-modal hashing retrieval methods use an image network and a text network to separately generate the hash codes of the corresponding modalities, making it difficult to obtain more effective hash codes and to further reduce the gap between data of different modalities. To improve cross-modal hashing retrieval performance, this paper proposes cross-modal dual hashing based on transfer knowledge (CDHTK). CDHTK performs cross-modal hashing retrieval by combining an image network, a knowledge transfer network, and a text network. For the image modality, CDHTK combines the hash codes generated by the image network and the knowledge transfer network to produce discriminative hash codes; for the text modality, it fuses the hash codes generated by the text network and the knowledge transfer network to produce effective hash codes. CDHTK jointly optimizes the hash code generation process with a cross-entropy loss for label prediction, a joint triplet quantization loss for hash code generation, and a differential loss for knowledge transfer, thereby improving retrieval performance. Experiments on two commonly used datasets (IAPR TC-12, MIR-Flickr 25K) verify the effectiveness of CDHTK, which outperforms the current state-of-the-art cross-modal hashing method ALECH (Adaptive Label correlation based asymmEtric Cross-modal Hashing) by 6.82% and 5.13%, respectively.
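As a hedged illustration of two of the loss terms (common formulations, not necessarily the paper's exact definitions): a quantization loss pulls relaxed codes toward binary values, and a triplet margin loss separates matched from mismatched cross-modal pairs.

```python
import torch
import torch.nn.functional as F

def quantization_loss(h):
    """Encourage relaxed (tanh-like) codes to sit near the +/-1 vertices."""
    return ((h - torch.sign(h)) ** 2).mean()

def triplet_hash_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet margin loss on relaxed hash codes."""
    return F.triplet_margin_loss(anchor, positive, negative, margin=margin)

# toy usage with 16-bit codes for a batch of 8 samples
h_img, h_pos, h_neg = (torch.tanh(torch.randn(8, 16)) for _ in range(3))
loss = triplet_hash_loss(h_img, h_pos, h_neg) + 0.1 * quantization_loss(h_img)
```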
With the continuous development and maturity of personalized and diversified network applications and services, data volumes and computing demands are growing exponentially. Cloud computing, edge computing, and intelligent terminal devices have developed rapidly, and computing resources are increasingly deployed in a ubiquitous and decentralized manner. How to use these ubiquitous computing resources efficiently and collaboratively to meet the increasing computational demands has become an important new topic in the networking field. The edge compute-first network focuses on the network edge, near the data sources, and combines heterogeneous computing resources and network resources to improve resource utilization and task execution efficiency through resource awareness, service positioning, and task scheduling, while maintaining low latency and low cost and realizing the optimal configuration of distributed computing resources. The edge compute-first network usually adopts a distributed task scheduling mode, in which each node makes local decisions based on local information; this shortens decision time and effectively relieves the computation and communication pressure on the central controller. However, the locality and asymmetry of information limit the global optimization performance of distributed scheduling, resulting in inadequate task coverage. This paper focuses on distributed task scheduling in the edge compute-first network. Supported by game theory and multi-objective optimization methods, a distributed task scheduling algorithm based on optimal dynamic response is designed, which introduces communication and consensus-elimination mechanisms within a two-hop range. While minimizing interaction costs and scheduling delays, it maximizes the task coverage of distributed scheduling and converges to a Nash equilibrium point. A dynamic game model based on the optimality and consistency of distributed decision-making, with consensus elimination within a two-hop range as one of the optimization objectives, is established. The theoretical derivation demonstrates the asymptotic equivalence between local and global decisions, providing an effective theoretical basis for the existence of Nash equilibria and the convergence of distributed scheduling. Finally, the effectiveness and optimization benefits of the proposed algorithm are validated through simulations and comparisons with a classical distributed decision-making algorithm and the global optimal solutions.
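For intuition about best-response-style distributed scheduling, the toy congestion game below lets each task repeatedly switch to the server with the lowest current cost until no one wants to deviate, i.e. a pure Nash equilibrium; the two-hop communication and consensus-elimination mechanisms of the paper are not modeled.

```python
import numpy as np

def best_response_dynamics(n_tasks, n_servers, base_cost, max_rounds=100, seed=0):
    """Toy congestion game: each task repeatedly moves to the server minimizing
    base_cost[server] + load currently placed by the other tasks. Iterated best
    responses converge to a pure Nash equilibrium (this is a potential game)."""
    rng = np.random.default_rng(seed)
    choice = rng.integers(n_servers, size=n_tasks)
    for _ in range(max_rounds):
        changed = False
        for i in range(n_tasks):
            load = np.bincount(np.delete(choice, i), minlength=n_servers)
            best = int(np.argmin(base_cost + load))
            if best != choice[i]:
                choice[i] = best
                changed = True
        if not changed:          # no task wants to deviate: Nash equilibrium reached
            break
    return choice

assignment = best_response_dynamics(12, 4, np.array([1.0, 2.0, 1.5, 3.0]))
```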
With the deployment of ultra-high-definition (UHD) imaging technology, generating high-quality UHD images typically involves fusing multiple UHD images with varying exposure levels. However, current deep learning-based multi-exposure image fusion (MEF) methods directly fuse feature maps extracted from images with different exposure levels; they fail to fully exploit the feature information in differently exposed images, which is essential for successful MEF. To address this problem, we develop a UHD multi-exposure image fusion approach that incorporates both local and long-range characteristics of images and aims to mine the dependencies among images with different exposure levels. By enforcing translation invariance and self-attention on images with varying exposure levels, we extract higher-level semantics and features. Furthermore, we aggregate the resulting features of different granularities by utilizing shortcut connections at various levels. Finally, we propose a Gate-MLP with a gating mechanism that filters noisy features to generate a high-quality UHD image. To better support the UHD MEF task, we also establish a UHD image dataset for this task. Extensive experimental results demonstrate that our approach significantly outperforms existing approaches for UHD multi-exposure image fusion on a single GPU with 24 GB of memory.
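A minimal sketch of a gated MLP of the kind described, assuming a sigmoid gate that suppresses noisy feature channels and a residual path; layer sizes are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GateMLP(nn.Module):
    """Gated MLP: a sigmoid gate down-weights noisy feature channels; a residual
    path preserves the original features."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.value = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x):
        return self.gate(x) * self.value(x) + x

out = GateMLP(64, 128)(torch.randn(2, 64))   # (2, 64)
```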
In cross-scene classification tasks, most domain adaptation (DA) methods focus on transfer tasks in which the source and target domain data are obtained with the same sensor and share the same land-cover classes. However, adaptation performance drops significantly when new classes are present in the target data. Moreover, many hyperspectral image (HSI) classification methods rely on a global representation mechanism, in which representation learning is performed on samples with fixed-size windows, limiting their ability to effectively represent ground-object classes. A framework called local representation few-shot learning (LrFSL) is proposed to overcome the limitations of global representation by constructing a local representation mechanism in few-shot learning. In the proposed framework, meta-tasks are created from all labeled source-domain data and a few labeled target-domain data, and episodic training is performed simultaneously using a meta-learning strategy. In addition, an intra-domain local representation block (ILR-block) is designed to extract semantic information from multiple local representations within each sample, and an inter-domain local alignment block (ILA-block) is designed to align class-wise distributions across domains, thereby mitigating the impact of domain shift on few-shot learning. Experimental results on three publicly available HSI datasets demonstrate that the proposed method outperforms state-of-the-art methods by a significant margin.
With the development of edge computing, the training of deep learning models increasingly relies on private data generated by a large number of edge devices. In this context, federated learning has drawn extensive attention from both academia and industry due to its strong privacy-protection capabilities. In practice, however, federated learning faces challenges such as inefficient training and suboptimal model quality caused by data heterogeneity and limited computational resources. Inspired by knowledge distillation, this paper proposes an efficient federated learning algorithm named efficient federated learning with lightweight self-knowledge distillation (FedSKD). The algorithm uses lightweight self-distillation to extract intrinsic knowledge during local training, alleviating local model overfitting and enhancing generalization. It then aggregates the generalization capability of local models into a global model through server-side parameter aggregation, improving the quality and convergence speed of the global model. In addition, a dynamic synchronization mechanism further improves the accuracy and training efficiency of the global model. Experimental results demonstrate that, under non-identically distributed data partition strategies, FedSKD improves model accuracy and training efficiency while reducing computational cost. On CIFAR-10/100, compared with the latest baseline FedMLD, FedSKD achieves an average accuracy improvement of 2% and reduces training cost by an average of 56%.
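A common self-distillation objective that matches this description (a sketch, not necessarily FedSKD's exact loss) combines cross-entropy on hard labels with a temperature-scaled KL term against the model's own softened predictions, e.g. from a deeper exit or an earlier round:

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels, T=3.0, alpha=0.5):
    """Cross-entropy on hard labels plus a temperature-scaled KL term that distills
    the model's own softened predictions (the 'teacher' is not a separate model)."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd

# toy usage: logits from the current layer/round vs. softened self-targets
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = self_distillation_loss(student, teacher, labels)
```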
Data on the application, acceptance, and funding of National Natural Science Foundation of China (NSFC) projects in the “Semiconductor Science and Information Devices” (F04) discipline for the year 2024 are analyzed in this paper. First, it outlines the reform measures implemented by the NSFC in 2024 and analyzes the application and funding status of various project types within the “Semiconductor Science and Information Devices” discipline, including General Programs, Young Scientists Fund projects, Regional Science Fund projects, Key Programs, Excellent Young Scientists Fund projects, and the National Science Fund for Distinguished Young Scholars. Second, it summarizes the distribution of supporting institutions and the pilot work of the Responsibility, Credibility, Contribution (RCC) project review mechanism during the application and funding process. Finally, the paper offers a conclusion and outlook for project applications in the field of “Semiconductor Science and Information Devices”.
This report comprehensively reviews the application and funding statistics of the Information Acquisition and Processing area under Division I of the Information Science Department of the National Natural Science Foundation of China in 2024, focusing mainly on the General Program, Young Scientists Fund, Fund for Less-Developed Regions, Key Program, Excellent Young Scientists Fund, National Science Fund for Distinguished Young Scholars, Science Fund for Creative Research Groups, Basic Science Center Program, and Young Student Basic Research Program. The development trends of research directions in the field are analyzed, and prospects for next year's work are provided.