RGB-Infrared person re-identification (re-ID) aims to match RGB and infrared (IR) images of the same person. However, the modality discrepancy between RGB and IR images poses a significant challenge for re-ID. To address this issue, this paper proposes a Proxy-based Embedding Alignment (PEA) method to align the RGB and IR modalities in the embedding space. PEA introduces modality-specific identity proxies and uses sample-to-proxy relations to train the model. Specifically, PEA focuses on three types of alignment: intra-modality alignment, inter-modality alignment, and cycle alignment. Intra-modality alignment aligns sample features and proxies of the same identity within a modality. Inter-modality alignment aligns sample features and proxies of the same identity across different modalities. Cycle alignment requires that a proxy be aligned with itself after it is traced along a cross-modality cycle (e.g., IR→RGB→IR). By integrating these alignments into the training process, PEA effectively mitigates the impact of modality discrepancy and learns discriminative features across modalities. We conduct extensive experiments on several RGB-IR re-ID datasets, and the results show that PEA outperforms current state-of-the-art methods. Notably, on the SYSU-MM01 dataset, PEA achieves 71.0% mAP under the multi-shot setting of the indoor-search protocol, surpassing the best-performing method by 7.2%.
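The cycle alignment idea can be illustrated with a small sketch. The abstract does not give the loss formulation, so everything here is an assumption made for illustration: proxies are plain vectors, soft assignment between modalities is a temperature-scaled softmax over dot products, and the loss is the negative log-probability of returning to the starting proxy after an IR→RGB→IR round trip. The names `transition`, `cycle_alignment_loss`, and `temperature` are hypothetical and not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transition(src, dst, temperature=0.1):
    """Row-stochastic matrix: soft assignment of each src proxy to the dst proxies.

    Assumption: similarity is a dot product scaled by a temperature.
    """
    return [softmax([dot(s, d) / temperature for d in dst]) for s in src]


def cycle_alignment_loss(ir_proxies, rgb_proxies, temperature=0.1):
    """Mean negative log-probability that an IR proxy returns to itself
    after the round trip IR -> RGB -> IR (illustrative form only)."""
    ir_to_rgb = transition(ir_proxies, rgb_proxies, temperature)
    rgb_to_ir = transition(rgb_proxies, ir_proxies, temperature)
    n = len(ir_proxies)
    loss = 0.0
    for i in range(n):
        # Probability of landing back on proxy i, summed over all RGB stops.
        p_return = sum(
            ir_to_rgb[i][k] * rgb_to_ir[k][i] for k in range(len(rgb_proxies))
        )
        loss -= math.log(p_return)
    return loss / n


# Two identities, 2-D proxies (toy values).
ir = [[1.0, 0.0], [0.0, 1.0]]
rgb_separated = [[1.0, 0.0], [0.0, 1.0]]
rgb_collapsed = [[1.0, 0.0], [1.0, 0.0]]

loss_separated = cycle_alignment_loss(ir, rgb_separated)
loss_collapsed = cycle_alignment_loss(ir, rgb_collapsed)
```

In this toy setup the loss is small when each IR proxy has a distinct RGB counterpart and large when the RGB proxies collapse onto one point. Note that a cycle term of this form does not by itself tie a proxy to the correct identity index, which is presumably why it is combined with the intra- and inter-modality alignments.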
A Brain-Computer Interface (BCI) aims to provide a new way for people to communicate with computers. Brain signal classification is challenging owing to the high dimensionality of the data and the low Signal-to-Noise Ratio (SNR). In this paper, a novel method based on sparse representation is proposed to address this problem for the P300 speller paradigm. This work makes two key contributions. First, we investigate sparse coding and its feasibility for brain signal classification. Training signals are used to learn the dictionaries, and test signals are classified according to their sparse representations and reconstruction errors. Second, sample selection and a channel-aware dictionary are proposed to reduce the effect of noise, which improves performance and computational efficiency simultaneously. A novel classification method from the sample set perspective is proposed to exploit channel correlations. Specifically, the brain signal of each channel is classified jointly with those of its spatially neighboring channels, and a novel weighted regulation strategy is proposed to overcome outliers in the group. Experimental results demonstrate that our methods are highly effective. We achieve state-of-the-art recognition rates of 72.5%, 88.5%, and 98.5% at 5, 10, and 15 epochs, respectively, on BCI Competition III Dataset II.
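Classification by sparse representation and reconstruction error can be sketched as follows. This is a minimal illustration under stated assumptions, not the method from the paper: the abstract does not name the sparse solver or the dictionary learning procedure, so greedy matching pursuit stands in for the solver, and the per-class dictionaries are small hand-written sets of unit-norm atoms rather than learned ones. The function names and class labels are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit_residual(signal, dictionary, n_atoms=2):
    """Greedy matching pursuit with at most n_atoms selections.

    Assumes every atom has unit norm. Returns the L2 norm of the residual,
    i.e. the reconstruction error of the sparse approximation.
    """
    residual = list(signal)
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        best = max(dictionary, key=lambda atom: abs(dot(residual, atom)))
        coef = dot(residual, best)
        residual = [r - coef * a for r, a in zip(residual, best)]
    return math.sqrt(dot(residual, residual))


def classify(signal, dictionaries, n_atoms=2):
    """Assign the label whose dictionary reconstructs the signal best."""
    errors = {
        label: matching_pursuit_residual(signal, atoms, n_atoms)
        for label, atoms in dictionaries.items()
    }
    return min(errors, key=errors.get), errors


# Toy per-class dictionaries (hand-written stand-ins for learned ones).
dictionaries = {
    "target": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "non_target": [[0.0, 0.0, 1.0]],
}

label, errors = classify([0.9, 0.1, 0.05], dictionaries)
```

Here the test signal lies almost entirely in the span of the "target" atoms, so that dictionary leaves the smaller residual and its label is chosen. Sample selection, the channel-aware dictionary, and the joint classification over neighboring channels described above are not modeled in this sketch.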