Few-Shot Graph Classification with Structural-Enhanced Contrastive Learning for Graph Data Copyright Protection

Kainan Zhang; DongMyung Shin; Daehee Seo; Zhipeng Cai

doi:10.26599/TST.2023.9010071

| Sign up

PDF (7.7 MB)

Cite

EndNote(RIS) BibTeX

Collect

Submit Manuscript

Show Outline

Figures (4)

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Tables (4)

Table 1

Table 2

Table 3

Table 4

Open Access

Few-Shot Graph Classification with Structural-Enhanced Contrastive Learning for Graph Data Copyright Protection

Kainan Zhang^¹, DongMyung Shin^², Daehee Seo^³, Zhipeng Cai^¹()

1Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA

2LSWare Inc., Seoul 08504, Republic of Korea

3College of Intelligence Information Engineering, Sangmyung University, Seoul 03016, Republic of Korea

Show Author Information

Abstract

Open-source licenses can promote the development of machine learning by allowing others to access, modify, and redistribute the training dataset. However, not all open-source licenses may be appropriate for data sharing, as some may not provide adequate protections for sensitive or personal information such as social network data. Additionally, some data may be subject to legal or regulatory restrictions that limit its sharing, regardless of the licensing model used. Hence, obtaining large amounts of labeled data can be difficult, time-consuming, or expensive in many real-world scenarios. Few-shot graph classification, as one application of meta-learning in supervised graph learning, aims to classify unseen graph types by only using a small amount of labeled data. However, the current graph neural network methods lack full usage of graph structures on molecular graphs and social network datasets. Since structural features are known to correlate with molecular properties in chemistry, structure information tends to be ignored with sufficient property information provided. Nevertheless, the common binary classification task of chemical compounds is unsuitable in the few-shot setting requiring novel labels. Hence, this paper focuses on the graph classification tasks of a social network, whose complex topology has an uncertain relationship with its nodes’ attributes. With two multi-class graph datasets with large node-attribute dimensions constructed to facilitate the research, we propose a novel learning framework that integrates both meta-learning and contrastive learning to enhance the utilization of graph topological information. Extensive experiments demonstrate the competitive performance of our framework respective to other state-of-the-art methods.

Keywords

few-shot learning contrastive learning data copyright protection

References

[1]

Zheng

and Z.

Cai

, Privacy-preserved data sharing towards multiple parties in industrial IoTs, IEEE J. Sel. Areas Commun., vol. 38, no. 5, pp. 968–979, 2020.

Crossref Google Scholar

[2]

Cai

and X.

Zheng

, A private and efficient mechanism for data uploading in smart cyber-physical systems, IEEE Trans. Netw. Sci. Eng., vol. 7, no. 2, pp. 766–775, 2020.

Crossref Google Scholar

[3]

Cai

, X.

Zheng

, J.

Wang

, and Z.

, Private data trading towards range counting queries in Internet of Things, IEEE Trans. Mob. Comput., vol. 22, no. 8, pp. 4881–4897, 2023.

Crossref Google Scholar

[4]

Cai

, Z.

, X.

Guan

, and Y.

, Collective data-sanitization for preventing sensitive information inference attacks in social networks, IEEE Trans. Dependable Secure Comput., vol. 15, no. 4, pp. 577–590, 2018.

Google Scholar

[5]

Liang

, Z.

Cai

, J.

, Q.

Han

, and Y.

, Deep learning based inference of private information using embedded sensors in smart devices, IEEE Netw., vol. 32, no. 4, pp. 8–14, 2018.

Crossref Google Scholar

[6]

Huang

, Y. J.

, and Z.

Cai

, Security and privacy in metaverse: A comprehensive survey, Big Data Mining and Analytics, vol. 6, no. 2, pp. 234–247, 2023.

Crossref Google Scholar

[7]

I. K.

Nti

, J. A.

Quarcoo

, J.

Aning

, and G. K.

Fosu

, A mini-review of machine learning in big data analytics: Applications, challenges, and prospects, Big Data Mining and Analytics, vol. 5, no. 2, pp. 81–97, 2022.

Crossref Google Scholar

[8]

, S.

Pan

, F.

Chen

, G.

Long

, C.

Zhang

, and P. S.

, A comprehensive survey on graph neural networks, IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 1, pp. 4–24, 2021.

Crossref Google Scholar

[9]

B. M.

Oloulade

, J.

Gao

, J.

Chen

, T.

Lyu

, and R.

Al-Sabri

, Graph neural architecture search: A survey, Tsinghua Science and Technology, vol. 27, no. 4, pp. 692–708, 2022.

Crossref Google Scholar

[10]

Zhang

, Z.

Cui

, M.

Neumann

, and Y.

Chen

, An end-to- end deep learning architecture for graph classification, Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, pp. 4438–4445, 2018.

Crossref Google Scholar

[11]

Duan

, J.

Wang

, H.

, and Y.

Sun

, Residual convolutional graph neural network with subgraph attention pooling, Tsinghua Science and Technology, vol. 27, no. 4, pp. 653–663, 2022.

Crossref Google Scholar

[12]

Errica

, M.

Podda

, D.

Bacciu

, and A.

Micheli

, A fair comparison of graph neural networks for graph classification, arXiv preprint arXiv: 1912.09893, 2019.

Google Scholar

[13]

Zhao

, H.

Chen

, L.

, and H.

Wan

, Understanding social relationships with person-pair relations, Big Data Mining and Analytics, vol. 5, no. 2, pp. 120–129, 2022.

Crossref Google Scholar

[14]

Eksombatchai

, P.

Jindal

, J. Z.

Liu

, Y.

Liu

, R.

Sharma

, C.

Sugnet

, M.

Ulrich

, and J.

Leskovec

, Pixie: A system for recommending 3+ billion items to 200+ million users in real-time, in Proc. 2018 World Wide Web Conference, 2018, pp. 1775–1784.

Crossref Google Scholar

[15]

Wang

, Q.

Yao

, J. T.

Kwok

, and L. M.

, Generalizing from a few examples: A survey on few-shot learning, ACM Comput. Surv., vol. 53, no. 3, p. 63, 2020.

Crossref Google Scholar

[16]

Chen

, Z.

Liu

, H.

, T.

Darrell

, and X.

Wang

, Meta-baseline: Exploring simple meta-learning for few-shot learning, in Proc. 2021 IEEE/CVF Int. Conf. Computer Vision (ICCV), Montreal, Canada, 2021, pp. 9042–9051.

Crossref Google Scholar

[17]

Hospedales

, A.

Antoniou

, P.

Micaelli

, and A.

Storkey

, Meta-learning in neural networks: A survey, IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 9, pp. 5149–5169, 2021.

Crossref Google Scholar

[18]

Vinyals

, C.

Blundell

, T.

Lillicrap

, K.

Kavukcuoglu

, and D.

Wierstra

, Matching networks for one shot learning, in Proc. 30th Int. Conf. Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 3637–3645.

Google Scholar

[19]

Snell

, K.

Swersky

, and R.

Zemel

, Prototypical networks for few-shot learning, in Proc. 31st Int. Conf. Neural Information Processing Systems, 2017, pp. 4080–4090.

Google Scholar

[20]

Finn

, P.

Abbeel

, and S.

Levine

, Model-agnostic meta-learning for fast adaptation of deep networks, in Proc. 34th Int. Conf. Machine Learning - Volume 70, Sydney, Australia, 2017, pp. 1126–1135.

Google Scholar

[21]

, J.

Yang

, Z.

Zhang

, C.

Yao

, Z.

, S.

Zhou

, and X.

Yan

, Adaptivestep graph meta-learner for few-shot graph classification, in Proc. 29th ACM Int. Conf. Information & Knowledge Management, Virtual Event, 2020, pp. 1055–1064.

Crossref Google Scholar

[22]

Chauhan

, D.

Nathani

, and M.

Kaul

, Few-shot learning on graphs via super-classes based on graph spectral measures, arXiv preprint arXiv: 2002.12815, 2020.

Google Scholar

[23]

Jiang

, F.

Feng

, W.

Chen

, X.

, and X.

, Structure-enhanced meta-learning for few-shot graph classification, AI Open, vol. 2, pp. 160–167, 2021.

Crossref Google Scholar

[24]

, W.

, J.

Leskovec

, and S.

Jegelka

, How powerful are graph neural networks? arXiv preprint arXiv: 1810.00826v1, 2019.

Google Scholar

[25]

Hassani

, Cross-domain few-shot graph classification, arXiv preprint arXiv: 2201.08265, 2022.

Google Scholar

[26]

Chen

, S.

Kornblith

, M.

Norouzi

, and G.

Hinton

, A simple framework for contrastive learning of visual representations, in Proc. 37th Int. Conf. Machine Learning, Vienna, Austria, 2020, pp. 1597–1607.

Google Scholar

[27]

You

, T.

Chen

, Y.

Sui

, T.

Chen

, Z.

Wang

, and Y.

Shen

, Graph contrastive learning with augmentations, in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 5812–5823.

Google Scholar

[28]

You

, T.

Chen

, Y.

Shen

, and Z.

Wang

, Graph contrastive learning automated, arXiv preprint arXiv: 2106.07594, 2021.

Google Scholar

[29]

Zhu

, Y.

, F.

, Q.

Liu

, S.

, and L.

Wang

, Graph contrastive learning with adaptive augmentation, in Proc. Web Conf. 2021, Ljubljana, Slovenia, 2021, pp. 2069–2080.

Crossref Google Scholar

[30]

Zhu

, Y.

, F.

, Q.

Liu

, S.

, and L.

Wang

, Deep graph contrastive representation learning, arXiv preprint arXiv: 2006.04131, 2020.

Google Scholar

[31]

, W.

Cheng

, D.

Luo

, H.

Chen

, and X.

Zhang

, InfoGCL: Information-aware graph contrastive learning, arXiv preprint arXiv: 2110.15438, 2021.

Google Scholar

[32]

Suresh

, P.

, C.

Hao

, and J.

Neville

, Adversarial graph augmentation to improve graph contrastive learning, arXiv preprint arXiv: 2106.05819v4, 2021.

Google Scholar

[33]

Lin

, C.

Liu

, P.

Zhou

, Z. Y.

, S.

Wang

, R.

Zhao

, Y.

Zheng

, L.

Lin

, E.

Xing

, and X.

Liang

, Prototypical graph contrastive learning, IEEE Trans. Neural Netw. Learn. Syst., .

Crossref Google Scholar

[34]

, H.

Yin

, X.

Xia

, T.

Chen

, L.

Cui

, and Q. V. H.

Nguyen

, Are graph augmentations necessary? Simple graph contrastive learning for recommendation, in Proc. 45th Int. ACM SIGIR Conf. Research and Development in Information Retrieval, Madrid, Spain, 2022, pp. 1294–1303.

Crossref Google Scholar

[35]

Zhu

, Y.

, Q.

Liu

, and S.

, An empirical study of graph contrastive learning, arXiv preprint arXiv: 2109.01116v2, 2021.

Google Scholar

[36]

Kolouri

, N.

Naderializadeh

, G. K.

Rohde

, and H.

Hoffmann

, Wasserstein embedding for graph learning, arXiv preprint arXiv: 2006.09430, 2020.

Google Scholar

[37]

, H.

Fan

, Y.

, S.

Xie

, and R.

Girshick

, Momentum contrast for unsupervised visual representation learning, in Proc. 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 9726–9735.

Crossref Google Scholar

[38]

Kersting

, N. M.

Kriege

, C.

Morris

, P.

Mutzel

, and M.

Neumann

, Benchmark data sets for graph kernels, https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets, 2016.

[39]

McAuley

, R.

Pandey

, and J.

Leskovec

, Inferring networks of substitutable and complementary products, in Proc. 21th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Sydney, Australia, 2015, pp. 785–794.

Crossref Google Scholar

[40]

Tang

, J.

Zhang

, L.

Yao

, J.

, L.

Zhang

, and Z.

, ArnetMiner: Extraction and mining of academic social networks, in Proc. 14th ACM SIGKDD Int. Conf. Knowledge discovery and data mining, Las Vegas, NV, USA, 2008, pp. 990–998.

Crossref Google Scholar

[41]

Wang

, M.

Luo

, K.

Ding

, L.

Zhang

, J.

, and Q.

Zheng

, Graph few-shot learning with attribute matching, in Proc. 29th ACM Int. Conf. Information & Knowledge Management, Virtual Event, 2020, pp. 1545–1554.

Crossref Google Scholar

[42]

Riesen

and H.

Bunke

, IAM graph database repository for graph based pattern recognition and machine learning, https://doi.org/10.1007/978-3-540-89689-0_33, 2008.

Crossref

[43]

Shervashidze

, P.

Schweitzer

, E. J.

Van Leeuwen

, K.

Mehlhorn

, and K. M.

Borgwardt

, Weisfeiler-lehman graph kernels, J. Mach. Learn. Res., vol. 12, pp. 2539–2561, 2011.

Google Scholar

[44]

, C.

Ruan

, E.

Korpeoglu

, S.

Kumar

, and K.

Achan

, Inductive representation learning on temporal graphs, arXiv preprint arXiv: 2002.07962v1, 2020.

Google Scholar

[45]

T. N.

Kipf

and M.

Welling

, Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv: 1609.02907, 2016.

Google Scholar

[46]

Velikovi

, G.

Cucurull

, A.

Casanova

, A.

Romero

, P.

Liò

, and Y.

Bengio

, Graph attention networks, arXiv preprint arXiv: 1710.10903v3, 2018.

Google Scholar

[47]

F. M.

Bianchi

, D.

Grattarola

, and C.

Alippi

, Mincut pooling in graph neural networks, arXiv preprint arXiv: 1907.00481v2, 2020.

Google Scholar

Tsinghua Science and Technology

Volume 29 Issue 2,
April 2024

Pages 605-616

DOI: 10.26599/TST.2023.9010071

Cite this article:

Zhang K, Shin D, Seo D, et al. Few-Shot Graph Classification with Structural-Enhanced Contrastive Learning for Graph Data Copyright Protection. Tsinghua Science and Technology, 2024, 29(2): 605-616. https://doi.org/10.26599/TST.2023.9010071

3.2 Proposed framework

Figure 2 illustrates the framework of our proposed method. Two complementary classification tasks are performed simultaneously to learn the main encoder $ℱ_{θ} (\cdot)$ , which is a GNN for projecting a graph into an embedding. The first learning module is metric-based meta-learning, which utilizes explicit label information to generate the graph embedding and compute the similarity between the support set and query set. The second learning module is contrastive learning, which is a self-supervised instance-level classification task to improve the representation result. For self-supervised learning, we design a strategy to generate a pair of positive and negative augmentation views of the input graph automatically, which contributes to data copyright protection by mitigating the risk of unauthorized reproduction or misuse of the original data.

10.26599/TST.2023.9010071.F002 Fig. 2Overview of SE-GCL. The framework consists of two main processes: graph meta-learning and contrastive learning. Given a support set of input graphs, we use a graph encoder to extract robust feature representation and derive reliable prototypes for each class. The Wasserstein metric measures the similarity between the query graph and the prototype. Further, we impose the contrastive loss on the query set to improve the model’s generalizability. The complete workflow of all modules is an end-to-end solution. More details could be in the section of the proposed framework.

During meta-learning, the main encoder $ℱ_{θ} (\cdot)$ maps each graph into a latent representation as its graph embedding $h_{G_{i}} = ℱ_{θ} (G_{i})$ . Specifically, GNNs compute graph embedding via a message-passing framework:

(1) $h_{u}^{(l + 1)} = COM (h_{u}^{(l)}, [AGG ({h_{u^{'}}^{(l)} | \forall u^{'} \in U^{'}})])$

(2) $h_{G} = READOUT (h_{u}^{(l)} | \forall u \in U)$

where $h_{u}^{(l)}$ denotes the embedding of node u at the l-th GNN layer; $U^{'}$ is the neighbor set of node u; AGG $(\cdot)$ is the neighbor aggregation function; COM $(\cdot)$ is the combination function; and READOUT $(\cdot)$ is the graph-level pooling function. Then all support graph embeddings in the same class $y_{n}$ are aggregated into one prototype representation $z_{n}$ by computing the average, which is formulated as

(3) $z_{n} = \frac{1}{K} \sum_{i = 1}^{K} h_{G_{i}} (G_{i} \in D_{S} (G, y_{n}), n \in [1, N])$

To predict the label of the query graph, the similarity between query graph embedding and the prototype representation is measured by the p-th Wasserstein distance following the work in Ref. [ 36 ], which is the optional cost of moving mass between two graph embeddings. The classification loss $ℒ_{Meta}$ is defined as the average cross entropy between true labels and predictions based on the similarity, which can be formulated as

(4) $ℒ_{Meta} (D_{S}, D_{Q}, θ) =$ $- \frac{1}{M} \sum_{(G, y) \in D_{Q}} \log \frac{e^{sim (ℱ_{θ} (G), z_{y})}}{\sum_{i = 1}^{N} e^{sim (ℱ_{θ} (G), 𝒛_{i})}}$

where sim denotes the Wasserstein similarity metric.

Because contrastive learning can maximize the agreement between the input data and its positive view while minimizing the agreement with the negative view, two automatic augmentations are employed to generate a pair of differentiable views for the respective goals, which reduces the need to unauthorized operations of the original data. Expressly, the positive augmentation operation preserves the original topology of the sample graph $G_{i}$ and masks all the node features to form a positive view $G_{i}^{mask}$ , which aims to mediate the overwhelming of the node features over the graph structure information in the representation learning. On the other hand, the negative augmentation operation generates a negative view $G_{i}^{neg}$ by random node-dropping and edge-perturbation. Both operations follow an i.i.d. uniform distribution with node-dropping ratio $η$ and edge-perturbation ratio $1 - η$ . For edge-perturbation, it randomly drops $1 - η$ existing edges, then adds the same amount of random edges back into $G_{i}$ . To form $G_{i}^{neg}$ as a small subgraph from $G_{i}$ with a few noisy edges, $η$ is set to 0.8 by default. Moreover, it is stated in Ref. [ 23 ] that the structural information of graph data consists of both local and global dimensions, which means some attributes of a graph depend on the substructure of the graph while some consider the global structure more. As generalization is the main challenge for meta-learning to test novel domains, randomly treating a small subgraph as the negative example helps predictive models generalize beyond the limited training data. It should be noted that the negative view of one sample graph is also treated as the negative view of the rest samples (i.e., for a query set containing M samples, there are M negative views for each sample graph). Introduced in Ref. [ 37 ], we apply a momentum encoder $ℱ_{ω} (\cdot)$ for projecting the contrastive views, which behaves similarly as the main encoder as its parameter $ω$ is a moving average of $θ$ . Given $ℱ_{θ} (G_{i})$ , the contrastive loss aims to maximize its agreement with $ℱ_{ω} (G_{i}^{mask})$ while minimizing the agreement with all the negative views $ℱ_{ω} (G_{j}^{neg}), j \in M$ , which can be formulated as

(5) $ℒ_{con} (D_{S}, D_{Q}, θ, ω, η) =$ $- \frac{1}{M} \sum_{G \in D_{Q}} \log \frac{e^{sim (ℱ_{θ} (G), ℱ_{ω} (G^{mask}))}}{\sum_{j = 1}^{M} e^{sim (ℱ_{θ} (G), ℱ_{ω} (G_{j}^{neg}))}}$

where M denotes the size of the query set, and $η$ is the perturbation ratio. By minimizing $ℒ_{con}$ w.r.t. $θ$ , we force the main encoder $ℱ_{θ} (\cdot)$ to maintain the complete structural information in the embedding and produce more generalized prototypical networks. Thus, the overall loss is the combination of the classification loss and the contrastive loss:

(6) $ℒ_{total} = ℒ_{Meta} (D_{S}, D_{Q}, θ) + β ℒ_{con} (D_{S}, D_{Q}, θ, ω, η)$

where $β$ is a hyper-parameter that balances two terms. The detailed learning process is described in Algorithm 1. And all notations used in this paper are listed in Table 1 .

10.26599/TST.2023.9010071.T001 Table 1List of notations used in this paper.

Symbol	Description
$G$	Undirected unweighted graph
U	Set of nodes
$U^{'}$	Set of node’s neighbors
E	Set of edges
A	Set of node attributes
$y$	Graph label
$D_{S}$	Support dataset
$D_{Q}$	Query dataset
$ℱ_{θ} (\cdot)$	Main graph encoder
$ℱ_{ω} (\cdot)$	Momentum graph encoder
$h_{G}$	Graph embedding
$h_{u}^{(l + 1)}$	Node embedding at the l-th GNN layer
$z_{n}$	Graph prototype representation
$G^{mask}$	Graph positive augmentation view
$G^{neg}$	Graph negative augmentation view
$η$	Perturbation ratio of $G^{neg}$
$β$	Regularization hyper-parameter

Return

10.26599/TST.2023.9010071.T001Table 1List of notations used in this paper.

Symbol	Description
$G$	Undirected unweighted graph
U	Set of nodes
$U^{'}$	Set of node’s neighbors
E	Set of edges
A	Set of node attributes
$y$	Graph label
$D_{S}$	Support dataset
$D_{Q}$	Query dataset
$ℱ_{θ} (\cdot)$	Main graph encoder
$ℱ_{ω} (\cdot)$	Momentum graph encoder
$h_{G}$	Graph embedding
$h_{u}^{(l + 1)}$	Node embedding at the l-th GNN layer
$z_{n}$	Graph prototype representation
$G^{mask}$	Graph positive augmentation view
$G^{neg}$	Graph negative augmentation view
$η$	Perturbation ratio of $G^{neg}$
$β$	Regularization hyper-parameter

10.26599/TST.2023.9010071.T002Table 2Statistics of datasets. We show each dataset with the number of graphs $| G |$ , the average number of nodes $Avg$ . $| U |$ , the average number of edges $Avg$ . $| E |$ , the dimensions of node attributes $| A |$ , and the number of classes for training over testing $| y_{train} | / | y_{test} |$ .

Dataset	$\| G \|$	$A v g . \| U \|$	$A v g . \| E \|$	$\| A \|$	$\frac{\| y_{train} \|}{\| y_{test} \|}$
Amazon-Clothing	2000	32.15	192.50	9034	10/10
DBLP	2000	47.25	318.45	7202	10/10
Letter-High	2250	4.67	4.50	2	11/4
TRIANGLES	2000	20.85	35.50	1	7/3

10.26599/TST.2023.9010071.T003Table 3Accuracy with a standard deviation of baselines and our method. We tested 100 N-way-K-shot tasks on both Amazon-Clothing and DBLP datasets. The best results are highlighted in bold.

Method	Accuracy (%)
	Amazon-Clothing			DBLP
	5-way 5-shot	5-way 10-shot	8-way 5-shot	5-way 5-shot	5-way 10-shot	8-way 5-shot
WL kernel	56.40 $\pm$ 2.23	65.24 $\pm$ 1.37	49.47 $\pm$ 2.64	57.12 $\pm$ 2.44	65.52 $\pm$ 1.71	50.35 $\pm$ 2.39
GIN	63.25 $\pm$ 1.63	71.24 $\pm$ 1.57	55.47 $\pm$ 3.34	66.10 $\pm$ 2.41	72.38 $\pm$ 1.44	57.13 $\pm$ 2.88
MAML (GCN)	70.72 $\pm$ 3.88	76.62 $\pm$ 2.35	60.70 $\pm$ 4.53	73.12 $\pm$ 4.65	77.69 $\pm$ 2.89	63.19 $\pm$ 5.12
MAML (GAT)	70.66 $\pm$ 3.53	76.68 $\pm$ 2.51	60.27 $\pm$ 4.49	74.10 $\pm$ 4.19	78.03 $\pm$ 3.44	62.80 $\pm$ 3.99
PN (GCN)	70.18 $\pm$ 1.19	77.43 $\pm$ 1.87	63.17 $\pm$ 2.14	74.32 $\pm$ 2.49	79.79 $\pm$ 2.19	64.49 $\pm$ 3.19
PN (GAT)	71.22 $\pm$ 2.43	77.06 $\pm$ 2.15	63.89 $\pm$ 2.94	74.91 $\pm$ 3.29	80.29 $\pm$ 2.34	64.52 $\pm$ 3.52
SE-GCL (GCN)	74.98 $\pm$ 2.01	80.22 $\pm$ 1.55	66.37 $\pm$ 1.99	77.31 $\pm$ 2.17	83.40 $\pm$ 1.14	67.59 $\pm$ 2.86
SE-GCL (GAT)	75.02 $\pm$ 2.90	81.76 $\pm$ 2.36	66.92 $\pm$ 2.43	78.16 $\pm$ 3.09	84.75 $\pm$ 1.82	68.25 $\pm$ 3.20

10.26599/TST.2023.9010071.T004Table 4Accuracy of GSM and our method. We tested 100 N-way-K-shot tasks on both Letter-High (4-way) and TRIANGLES datasets (3-way).

Method	Accuracy (%)
Method	$K$ -shot	Letter-High	TRIANGLES
GSM	5	69.91 $\pm$ 5.90	71.40 $\pm$ 4.34
	10	73.28 $\pm$ 3.46	75.60 $\pm$ 3.67
	20	77.38 $\pm$ 1.58	80.04 $\pm$ 2.20
SE-GCL	5	74.34 $\pm$ 1.03	77.36 $\pm$ 1.25
	10	79.42 $\pm$ 0.84	83.14 $\pm$ 1.07
	20	84.15 $\pm$ 0.77	89.17 $\pm$ 0.85