[1]
Z. Yang, Y. Liu, and C. Ouyang, Causal intervention-based few-shot named entity recognition, in Proc. Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 2023, doi: 10.18653/v1/2023.findings-emnlp.1046.
[3]
X. Ma and E. Hovy, End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF, arXiv preprint arXiv: 1603.01354, 2016.
[4]
G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer, Neural architectures for named entity recognition, arXiv preprint arXiv: 1603.01360, 2016.
[5]
M. E. Peters, W. Ammar, C. Bhagavatula, and R. Power, Semi-supervised sequence tagging with bidirectional language models, arXiv preprint arXiv: 1705.00108, 2017.
[6]
J. Snell, K. Swersky, and R. S. Zemel, Prototypical networks for few-shot learning, arXiv preprint arXiv: 1703.05175, 2017.
[7]
A. Fritzler, V. Logacheva, and M. Kretov, Few-shot classification in named entity recognition task, in Proc. 34th ACM/SIGAPP Symp. on Applied Computing. Limassol, Cyprus, 2019, pp. 993–1000.
[8]
Y. Yang and A. Katiyar, Simple and effective few-shot named entity recognition with structured nearest neighbor learning, arXiv preprint arXiv: 2010.02405, 2020.
[9]
Y. Hou, W. Che, Y. Lai, Z. Zhou, Y. Liu, H. Liu, and T. Liu, Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network, arXiv preprint arXiv: 2006.05702, 2020.
[11]
P. Wang, R. Xu, T. Liu, Q. Zhou, Y. Cao, B. Chang, and Z. Sui, An enhanced span-based decomposition method for few-shot sequence labeling, arXiv preprint arXiv: 2109.13023, 2021.
[12]
D. Yu, L. He, Y. Zhang, X. Du, P. Pasupat, and Q. Li, Few-shot intent classification and slot filling with retrieved examples, arXiv preprint arXiv: 2104.05763, 2021.
[13]
T. Ma, H. Jiang, Q. Wu, T. Zhao, and C.-Y. Lin, Decomposed meta-learning for few-shot named entity recognition, arXiv preprint arXiv: 2204.05751, 2022.
[14]
J. Pearl, Causality. Cambridge, UK: Cambridge University Press, 2009.
[15]
S. Thrun, Is learning the n-th thing any easier than learning the first? in Proc. 8th Int. Conf. Neural Information Processing Systems, Denver, CO, USA, 1995, pp. 640–646.
[16]
R. French, Catastrophic forgetting in connectionist networks, Trends Cogn. Sci., vol. 3, no. 4, pp. 128–135, 1999.
[17]
C. de Masson d’Autume, S. Ruder, L. Kong, and D. Yogatama, Episodic memory in lifelong language learning, in Proc. 33rd Int. Conf. Neural Information Processing Systems, arXiv preprint arXiv: 1906.01076, 2019.
[18]
M. Chen, W. Zhang, W. Zhang, Q. Chen, and H. Chen, Meta relational learning for few-shot link prediction in knowledge graphs, arXiv preprint arXiv: 1909.01515, 2019.
[19]
T. Gao, A. Fisch, and D. Chen, Making pre-trained language models better few-shot learners, arXiv preprint arXiv: 2012.15723, 2020.
[21]
T. Schick and H. Schütze, It’s not just size that matters: Small language models are also few-shot learners, arXiv preprint arXiv: 2009.07118, 2020.
[24]
X. Han, H. Zhu, P. Yu, Z. Wang, Y. Yao, Z. Liu, and M. Sun, FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation, arXiv preprint arXiv: 1810.10147, 2018.
[25]
R. Geng, B. Li, Y. Li, X. Zhu, P. Jian, and J. Sun, Induction networks for few-shot text classification, arXiv preprint arXiv: 1902.10482, 2019.
[26]
P. Wang, R. Xu, T. Liu, D. Dai, B. Chang, and Z. Sui, Behind the scenes: An exploration of trigger biases problem in few-shot event classification, in Proc. 30th ACM Int. Conf. Information & Knowledge Management, Virtual Event, 2021, pp. 1969–1978.
[27]
J. Y. Lim, K. M. Lim, C. P. Lee, and Y. X. Tan, SSL-ProtoNet: Self-supervised learning prototypical networks for few-shot learning, Expert Syst. Appl., vol. 238, p. 122173, 2024.
[29]
I. Sucholutsky and T. Griffiths, Alignment with human representations supports robust few-shot learning, in Proc. 37th Int. Conf. Neural Information Processing Systems, arXiv preprint arXiv: 2301.11990, 2024.
[30]
Y. Tang, Z. Lin, Q. Wang, P. Zhu, and Q. Hu, AMU-Tuning: Effective logit bias for CLIP-based few-shot learning, in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), arXiv preprint arXiv: 2404.08958, 2024.
[31]
L. Zhu, T. Chen, D. Ji, J. Ye, and J. Liu, LLaFS: When large language models meet few-shot segmentation, in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), arXiv preprint arXiv: 2311.16926, 2024.
[32]
H. Ma, C. Zhang, Y. Bian, L. Liu, Z. Zhang, P. Zhao, S. Zhang, H. Fu, Q. Hu, and B. Wu, Fairness-guided few-shot prompting for large language models, in Proc. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), arXiv preprint arXiv: 2303.13217, 2024.
[33]
M. Geng, S. Wang, D. Dong, H. Wang, G. Li, Z. Jin, X. Mao, and X. Liao, Large language models are few-shot summarizers: Multi-intent comment generation via in-context learning, in Proc. IEEE/ACM 46th Int. Conf. Software Engineering, Lisbon, Portugal, 2024, pp. 1–13.
[34]
J. Ye, N. Xu, Y. Wang, J. Zhou, Q. Zhang, T. Gui, and X. Huang, LLM-DA: Data augmentation via large language models for few-shot named entity recognition, arXiv preprint arXiv: 2402.14568, 2024.
[35]
H. Liu, W. Zhang, J. Xie, B. Kim, Z. Zhang, and Y. Chai, Few-shot learning for chronic disease management: Leveraging large language models and multi-prompt engineering with medical knowledge injection, arXiv preprint arXiv: 2401.12988, 2024.
[36]
B. Kulis, Metric learning: A survey, Found. Trends® Mach. Learn., vol. 5, no. 4, pp. 287–364, 2013.
[37]
O. Vinyals, C. Blundell, T. Lillicrap, and D. Wierstra, Matching networks for one shot learning, in Proc. 30th Int. Conf. Neural Information Processing Systems, arXiv preprint arXiv: 1606.04080, 2016.
[40]
S. Rao, J. Huang, and Z. Tang, RDProtoFusion: Refined discriminative prototype-based multi-task fusion for cross-domain few-shot learning, Neurocomputing, vol. 599, p. 128117, 2024.
[44]
N. Ding, G. Xu, Y. Chen, X. Wang, X. Han, P. Xie, H.-T. Zheng, and Z. Liu, Few-NERD: A few-shot named entity recognition dataset, arXiv preprint arXiv: 2105.07464, 2021.
[45]
L. Cui, Y. Wu, J. Liu, S. Yang, and Y. Zhang, Template-based named entity recognition using BART, arXiv preprint arXiv: 2106.01760, 2021.
[46]
S. S. S. Das, A. Katiyar, R. J. Passonneau, and R. Zhang, CONTaiNER: Few-shot named entity recognition via contrastive learning, arXiv preprint arXiv: 2109.07589, 2021.
[47]
B. Athiwaratkun, C. N. dos Santos, J. Krone, and B. Xiang, Augmented natural language for generative sequence labeling, arXiv preprint arXiv: 2009.13272, 2020.
[48]
Y. Wang, H. Chu, C. Zhang, and J. Gao, Learning from language description: Low-shot named entity recognition via decomposed framework, arXiv preprint arXiv: 2109.05357, 2021.
[49]
G. Dong, Z. Wang, J. Zhao, G. Zhao, D. Guo, D. Fu, T. Hui, C. Zeng, K. He, X. Li, et al., A multi-task semantic decomposition framework with task-specific pre-training for few-shot NER, in Proc. 32nd ACM Int. Conf. Information and Knowledge Management, Birmingham, UK, 2023, pp. 430–440.
[50]
Y. Li, Y. Yu, and T. Qian, Type-aware decomposed framework for few-shot named entity recognition, arXiv preprint arXiv: 2302.06397, 2023.
[51]
W. Chen, L. Zhao, P. Luo, T. Xu, Y. Zheng, and E. Chen, HEProto: A hierarchical enhancing ProtoNet based on multi-task learning for few-shot named entity recognition, in Proc. 32nd ACM Int. Conf. Information and Knowledge Management, Birmingham, UK, 2023, pp. 296–305.
[52]
S. Bogdanov, A. Constantin, T. Bernard, B. Crabbé, and E. Bernard, NuNER: Entity recognition encoder pre-training via LLM-annotated data, arXiv preprint arXiv: 2402.15343, 2024.
[53]
X. Yang, H. Zhang, G. Qi, and J. Cai, Causal attention for vision-language tasks, in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 9842–9852.
[54]
Y. Wu, K. Kuang, Y. Zhang, X. Liu, C. Sun, J. Xiao, Y. Zhuang, L. Si, and F. Wu, De-biased court’s view generation with causality, in Proc. 2020 Conf. Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 2020, pp. 763–780.
[55]
A. Coucke, A. Saade, A. Ball, T. Bluche, A. Caulier, D. Leroy, C. Doumouro, T. Gisselbrecht, F. Caltagirone, T. Lavril, et al., Snips voice platform: An embedded spoken language understanding system for private-by-design voice interfaces, arXiv preprint arXiv: 1805.10190, 2018.
[56]
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv: 1810.04805, 2018.
[57]
I. Loshchilov and F. Hutter, Decoupled weight decay regularization, arXiv preprint arXiv: 1711.05101, 2017.
[58]
J. Ma, Z. Yan, C. Li, and Y. Zhang, Frustratingly simple few-shot slot tagging, in Proc. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Virtual Event, 2021, pp. 1028–1033.
[59]
M. Henderson and I. Vulić, ConVEx: Data-efficient and few-shot slot labeling, arXiv preprint arXiv: 2010.11791, 2020.