Few-shot Named Entity Recognition (NER) systems aim to identify new entity categories from a limited number of labeled examples. A major challenge for these systems is overfitting, which is far more pronounced than in tasks with ample samples and stems largely from spurious correlations introduced by biases in selecting a small sample set. To address this challenge, we propose a causal intervention-based method for few-shot NER. Building on prototypical networks, our method intervenes on the context to block the indirect association between the context and the label. In the 1-shot setting, where contextual intervention is infeasible, the method instead uses incremental learning to intervene at the prototype level, which both counters overfitting and alleviates catastrophic forgetting. In addition, we employ entity detection for coarse-grained pre-classification of entity types and, given the distinct characteristics of the source and target domains in few-shot tasks, introduce sample reweighting to aid model transfer and generalization. Experiments on multiple benchmark datasets show that our approach consistently achieves new state-of-the-art results, demonstrating its efficacy for few-shot NER.
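For context, the prototypical-network backbone that the abstract builds on can be sketched in a few lines. This is the standard formulation (class prototypes as mean support-set embeddings, nearest-prototype classification), not the paper's causal-intervention extension; the function names and toy data are illustrative only.

```python
import numpy as np

def build_prototypes(support_emb, support_labels):
    """Standard prototypical networks: one prototype per class,
    computed as the mean of that class's support-set embeddings."""
    labels = np.asarray(support_labels)
    classes = sorted(set(support_labels))
    protos = np.stack([support_emb[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def nearest_prototype(query_emb, classes, protos):
    """Label each query embedding with its nearest prototype (squared Euclidean)."""
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]

# Toy 5-way, 5-shot example with random 32-d "token embeddings".
rng = np.random.default_rng(0)
support = rng.normal(size=(25, 32))
labels = [c for c in range(5) for _ in range(5)]
classes, protos = build_prototypes(support, labels)
print(nearest_prototype(rng.normal(size=(3, 32)), classes, protos))
```

The paper's intervention then operates on top of this backbone, perturbing the context (or, in the 1-shot case, the prototypes themselves) rather than changing the nearest-prototype decision rule.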
Word embeddings have drawn considerable attention due to their usefulness in many NLP tasks. To date, however, the neural-network-based word embedding algorithms that have been proposed do not consider the effects of pronouns in the training corpus. In this paper, we propose using co-reference resolution to improve word embeddings by extracting better contexts. We evaluate four word embeddings trained with co-reference resolution and compare embedding quality on word analogy and word similarity tasks across multiple data sets. Experiments show that using co-reference resolution improves word embedding performance on the word analogy task by around 1.88%.
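As an illustration of the pipeline the abstract describes, the following minimal sketch replaces pronouns with their resolved antecedents before training a standard word2vec model. The `resolve_coreferences` function is a hypothetical placeholder for any off-the-shelf co-reference resolver, and the `Word2Vec` call assumes the gensim 4.x API.

```python
from gensim.models import Word2Vec

def resolve_coreferences(sentences):
    """Hypothetical placeholder: replace each pronoun with the surface form
    of its resolved antecedent, so antecedents appear in more contexts.
    A real implementation would call an off-the-shelf co-reference resolver."""
    resolved = []
    for tokens in sentences:
        # e.g. ["mary", "said", "she", "left"] -> ["mary", "said", "mary", "left"]
        resolved.append(tokens)  # identity here; substitute resolver output
    return resolved

corpus = [["mary", "bought", "a", "car"],
          ["she", "drives", "it", "every", "day"]]

# Train skip-gram word2vec on the co-reference-resolved corpus (gensim 4.x).
model = Word2Vec(resolve_coreferences(corpus), vector_size=100,
                 window=5, min_count=1, sg=1)
print(model.wv.most_similar("mary", topn=2))
```

The design intuition is that substituting antecedents for pronouns puts content words into the contexts that pronouns would otherwise occupy, giving the embedding model more informative co-occurrence statistics.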