Recent advances in neural network architectures reveal the importance of diverse representations. However, simply integrating more branches or increasing the width to gain diversity would inevitably increase model complexity, leading to prohibitive inference costs. In this paper, we revisit the learnable parameters in neural networks and show that it is feasible to disentangle learnable parameters into latent sub-parameters, which focus on different patterns and representations. This finding leads us to study further the aggregation of diverse representations in a network structure. To this end, we propose Parameter Disentanglement for Diverse Representations (PDDR), which considers diverse patterns in parallel during training and aggregates them into one for efficient inference. To further enhance the diverse representations, we develop a lightweight refinement module in PDDR, which adaptively refines the combination of diverse representations according to the input. PDDR can be seamlessly integrated into modern networks, significantly improving the learning capacity of a network while maintaining the same complexity for inference. Experimental results show clear improvements on various tasks: PDDR improves on Residual Network 50 (ResNet50) by 1.47% on ImageNet and improves the detection results of Retina Residual Network 50 (Retina ResNet50) by 1.7% mean Average Precision (mAP). When PDDR is integrated into recent lightweight vision transformer models, the resulting model outperforms related works by a clear margin. The code is available at: PDDR
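
The idea of training several parallel branches and aggregating them into one for inference can be illustrated with a minimal sketch. It is not the PDDR implementation: it assumes purely linear branches and fixed, input-independent mixing coefficients, and the names `merge_branches`, `branches`, and `coeffs` are hypothetical. Under those assumptions, linearity guarantees that a weighted sum of branch outputs equals the output of a single layer whose weight is the weighted sum of the branch weights.

```python
import numpy as np

def merge_branches(weights, coeffs):
    """Collapse parallel linear branches into one weight matrix (hypothetical helper)."""
    return sum(c * w for c, w in zip(coeffs, weights))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                               # batch of 4 inputs, 8 features
branches = [rng.normal(size=(8, 16)) for _ in range(3)]   # assumed latent sub-parameters
coeffs = [0.5, 0.3, 0.2]                                  # assumed fixed mixing coefficients

# Training-time view: evaluate every branch, then combine the outputs.
y_parallel = sum(c * (x @ w) for c, w in zip(coeffs, branches))

# Inference-time view: combine the weights once, then run a single layer.
w_merged = merge_branches(branches, coeffs)
y_merged = x @ w_merged

print(np.allclose(y_parallel, y_merged))
```

The merged layer has the same shape as any single branch, which is why inference cost does not grow with the number of branches in this simplified setting. An input-dependent refinement of the coefficients, as described above, would require computing `coeffs` per sample before merging.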

An approach for evaluating natural forest site quality based on site productive potential has been proposed by Fu et al. In theory, this approach can quantitatively evaluate the productivity of different site types by combining environmental and stand variables. This study focuses on the forest growth models used in this approach, taking stand basal area across Jilin province as the response, and proposes detailed modelling procedures for the growth models, including model selection, model parameterization, parameter estimation and model evaluation.
The formula for the current annual increment of basal area was derived mathematically for two cases, in which the basal area model contains age either directly or indirectly, under the assumption that all trees in a stand grow identically. A criterion was proposed for testing whether the relationship between basal area current annual increment and stand density and site is a monotonic function. The effects of different categorical variables on basal area were described using the dummy variable approach (model parameterization). The parameters of the parameterized basal area growth model were estimated by a modified least squares method. Stand basal area models were developed from four consecutive measurements of 3 634 permanent plots, each 0.06 hm² in size, in Jilin province.
The approach proposed in this study effectively tested whether the relationship between basal area annual increment and stand density and site is monotonic, which provides an important criterion for model selection. Including parameterization in the developed model not only explained the differences among the levels of the categorical variables effectively but also improved the precision of the models. The parameters of the dummy variable models could be estimated effectively by the modified least squares approach.
The approach for developing natural forest growth models proposed in this study can serve as technical support for the natural site quality evaluation approach of Fu et al.
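
The dummy variable parameterization and the current annual increment described above can be sketched as follows. This is an illustration only and rests on several assumptions not stated in the abstract: a Schumacher-type model form, ln G = b0 + b1/t + sum_k c_k D_k, where G is stand basal area, t is stand age and D_k are site-type dummies; ordinary least squares standing in for the unspecified modified least squares method; and noise-free synthetic data in place of the plot measurements. All function and variable names are hypothetical.

```python
import numpy as np

def design_matrix(age, site_type, n_types):
    """Columns: intercept, 1/age, and one dummy per site type (type 0 is the baseline)."""
    dummies = np.zeros((age.size, n_types - 1))
    for k in range(1, n_types):
        dummies[:, k - 1] = (site_type == k)
    return np.column_stack([np.ones(age.size), 1.0 / age, dummies])

def predict_basal_area(age, site_type, beta, n_types):
    """Assumed model: G = exp(b0 + b1 / t + site-type effect)."""
    return np.exp(design_matrix(age, site_type, n_types) @ beta)

def current_annual_increment(age, site_type, beta, n_types):
    """Derivative of the assumed model with respect to age: dG/dt = G * (-b1 / t^2)."""
    g = predict_basal_area(age, site_type, beta, n_types)
    return g * (-beta[1] / age**2)

# Synthetic, noise-free data (not the Jilin plot data).
rng = np.random.default_rng(1)
age = rng.uniform(10.0, 100.0, size=300)
site_type = rng.integers(0, 3, size=300)
true_beta = np.array([3.5, -20.0, 0.15, -0.10])
X = design_matrix(age, site_type, 3)
log_g = X @ true_beta

# Ordinary least squares on the log-linear form (stand-in for the modified method).
beta_hat, *_ = np.linalg.lstsq(X, log_g, rcond=None)
print(beta_hat)
```

In this form each dummy coefficient shifts the intercept for one site type relative to the baseline, which is one simple way categorical effects can enter a growth model. Whether the increment is monotonic in a given variable can then be inspected from the sign of the corresponding derivative over the range of interest.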

Few-shot Action Recognition (FSAR) has been a hot topic in various areas, such as computer vision and forest ecosystem security. FSAR aims to recognize previously unseen classes using limited labeled video examples. A principal challenge in the FSAR task is to obtain more category-related action semantics from a few samples for classification. Recent studies attempt to compensate for visual information through action labels. However, concise action category names lead to a less distinct semantic space and potential performance limitations. In this work, we propose a novel Semantic-guided Video Multimodal Fusion Network for FSAR (SVMFN-FSAR). We utilize a Large Language Model (LLM) to expand detailed textual knowledge of various action categories, enhancing the distinctiveness of the semantic space and alleviating, to some extent, the problem of insufficient samples in FSAR tasks. We compute a matching metric between the extracted distinctive semantic information and the visual information of unknown-class samples to capture the overall semantics of the video for preliminary classification. In addition, we design a novel semantic-guided temporal interaction module based on Transformers, which makes the LLM-expanded knowledge and the visual information complement each other along both the temporal and channel dimensions and improves the quality of the feature representation of samples. Experimental results on three few-shot benchmarks, Kinetics, UCF101, and HMDB51, consistently demonstrate the effectiveness and interpretability of the proposed method.
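
The preliminary classification step, matching semantic information against visual information, can be sketched as below. This is not the SVMFN-FSAR implementation: it assumes the matching metric is cosine similarity, that frame features are mean-pooled over time, and that text and visual features already share one embedding space. The function `semantic_match`, the `temperature` parameter, and the toy embeddings are all hypothetical.

```python
import numpy as np

def l2_normalize(v, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return v / (np.linalg.norm(v, axis=axis, keepdims=True) + eps)

def semantic_match(frame_feats, class_text_embs, temperature=0.1):
    """Return (predicted class index, class probabilities) for one video.

    frame_feats:     (num_frames, dim) visual features of the query video
    class_text_embs: (num_classes, dim) embeddings of the expanded class descriptions
    """
    video_feat = l2_normalize(frame_feats.mean(axis=0))   # temporal average pooling
    text = l2_normalize(class_text_embs)
    sims = text @ video_feat                              # cosine similarity per class
    logits = sims / temperature
    probs = np.exp(logits - logits.max())                 # numerically stable softmax
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

# Toy inputs: three classes in a 5-dimensional space, eight identical frames
# whose features point mostly along the third class direction.
class_text_embs = np.eye(3, 5)
frame_feats = np.tile(np.array([0.1, 0.1, 1.0, 0.0, 0.0]), (8, 1))
pred, probs = semantic_match(frame_feats, class_text_embs)
print(pred, probs)
```

Richer class descriptions would change only `class_text_embs` in this sketch; the more the class embeddings differ from one another, the more the similarity scores separate, which is the intuition behind expanding short category names.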