[1]
A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Proc. 25th Int. Conf. Neural Information Processing Systems, Lake Tahoe, NV, USA, 2012, pp. 1097–1105.
[2]
K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, in Proc. 3rd Int. Conf. Learning Representations, San Diego, CA, USA, 2015.
[3]
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, Going deeper with convolutions, in 2015 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 1–9.
[5]
K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in 2016 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770–778.
[9]
A. Oliver, A. Odena, C. Raffel, E. D. Cubuk, and I. J. Goodfellow, Realistic evaluation of deep semi-supervised learning algorithms, in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2018, pp. 3239–3250.
[10]
V. Verma, A. Lamb, J. Kannala, Y. Bengio, and D. Lopez-Paz, Interpolation consistency training for semi-supervised learning, in Proc. 28th Int. Joint Conf. Artificial Intelligence, Macao, China, 2019, pp. 3635–3641.
[11]
X. Zhu, Z. Ghahramani, and J. Lafferty, Semi-supervised learning using Gaussian fields and harmonic functions, in Proc. 20th Int. Conf. Machine Learning, Washington, DC, USA, 2003, pp. 912–919.
[12]
O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning (Adaptive Computation and Machine Learning), Cambridge, MA, USA: MIT Press, 2006.
[13]
J. Turian, L. A. Ratinov, and Y. Bengio, Word representations: A simple and general method for semi-supervised learning, in Proc. 48th Ann. Meeting of the Association for Computational Linguistics, Uppsala, Sweden, 2010, pp. 384–394.
[14]
S. Laine and T. Aila, Temporal ensembling for semi-supervised learning, arXiv preprint arXiv: 1610.02242, 2016.
[15]
A. Tarvainen and H. Valpola, Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 1195–1204.
[17]
D. Berthelot, N. Carlini, I. Goodfellow, A. Oliver, N. Papernot, and C. Raffel, MixMatch: A holistic approach to semi-supervised learning, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, p. 454.
[23]
T. Clanuwat, M. Bober-Irizar, A. Kitamoto, A. Lamb, K. Yamamoto, and D. Ha, Deep learning for classical Japanese literature, arXiv preprint arXiv: 1812.01718, 2018.
[25]
M. Sajjadi, M. Javanmardi, and T. Tasdizen, Regularization with stochastic transformations and perturbations for deep semi-supervised learning, in Proc. 30th Int. Conf. Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 1171–1179.
[26]
P. Bachman, O. Alsharif, and D. Precup, Learning with pseudo-ensembles, in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3365–3373.
[27]
S. Park, J. Park, S. J. Shin, and I. C. Moon, Adversarial dropout for supervised and semi-supervised learning, in Proc. 32nd AAAI Conf. Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conf. and Eighth AAAI Symp. Educational Advances in Artificial Intelligence, New Orleans, LA, USA, 2018, p. 480.
[28]
D. Berthelot, N. Carlini, E. D. Cubuk, A. Kurakin, K. Sohn, H. Zhang, and C. Raffel, ReMixMatch: Semi-supervised learning with distribution alignment and augmentation anchoring, arXiv preprint arXiv: 1911.09785, 2020.
[29]
T. Joachims, Transductive learning via spectral graph partitioning, in Proc. 20th Int. Conf. Machine Learning, Washington, DC, USA, 2003, pp. 290–297.
[30]
T. Joachims, Transductive inference for text classification using support vector machines, in Proc. 16th Int. Conf. Machine Learning, San Francisco, CA, USA, 1999, pp. 200–209.
[31]
Y. Bengio, O. Delalleau, and N. Le Roux, Label propagation and quadratic criterion, in Semi-Supervised Learning, O. Chapelle, B. Schölkopf, and A. Zien, eds. Cambridge, MA, USA: MIT Press, 2006, pp. 192–216.
[32]
D. P. Kingma, D. J. Rezende, S. Mohamed, and M. Welling, Semi-supervised learning with deep generative models, in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3581–3589.
[33]
A. Odena, Semi-supervised learning with generative adversarial networks, arXiv preprint arXiv: 1606.01583, 2016.
[34]
D. H. Lee, Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks, in Proc. 30th Int. Conf. Machine Learning, Atlanta, GA, USA, 2013, p. 2.
[35]
Y. Grandvalet and Y. Bengio, Semi-supervised learning by entropy minimization, in Proc. 17th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2005, pp. 529–536.
[41]
M. Andrychowicz, M. Denil, S. G. Colmenarejo, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. De Freitas, Learning to learn by gradient descent by gradient descent, in Proc. 30th Int. Conf. Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 3981–3989.
[42]
S. Ravi and H. Larochelle, Optimization as a model for few-shot learning, in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017, pp. 1–11.
[43]
M. Ren, E. Triantafillou, S. Ravi, J. Snell, K. Swersky, J. B. Tenenbaum, H. Larochelle, and R. S. Zemel, Meta-learning for semi-supervised few-shot classification, in Proc. 6th Int. Conf. Learning Representations, Vancouver, Canada, 2018.
[44]
C. Finn, P. Abbeel, and S. Levine, Model-agnostic meta-learning for fast adaptation of deep networks, in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia, 2017, pp. 1126–1135.
[45]
M. Ren, W. Zeng, B. Yang, and R. Urtasun, Learning to reweight examples for robust deep learning, in Proc. 35th Int. Conf. Machine Learning, Stockholmsmässan, Sweden, 2018, pp. 4334–4343.
[47]
Y. Liu, J. Lee, M. Park, S. Kim, E. Yang, S. Hwang, and Y. Yang, Learning to propagate labels: Transductive propagation network for few-shot learning, in Proc. 7th Int. Conf. Learning Representations, New Orleans, LA, USA, 2019.
[48]
X. Li, Q. Sun, Y. Liu, S. Zheng, Q. Zhou, T. S. Chua, and B. Schiele, Learning to self-train for semi-supervised few-shot classification, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, p. 922.
[49]
Z. Yu, L. Chen, Z. Cheng, and J. Luo, TransMatch: A transfer-learning scheme for semi-supervised few-shot learning, in 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 12853–12861.
[50]
P. Rodríguez, I. Laradji, A. Drouin, and A. Lacoste, Embedding propagation: Smoother manifold for few-shot classification, in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 121–138.
[51]
A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pascanu, S. Osindero, and R. Hadsell, Meta-learning with latent embedding optimization, in Proc. 7th Int. Conf. Learning Representations, New Orleans, LA, USA, 2019.
[52]
Q. Sun, Y. Liu, T. S. Chua, and B. Schiele, Meta-transfer learning for few-shot learning, in 2019 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 403–412.
[53]
H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, mixup: Beyond empirical risk minimization, in Proc. 6th Int. Conf. Learning Representations, Vancouver, Canada, 2018.
[54]
S. J. Reddi, A. Hefny, S. Sra, B. Póczós, and A. Smola, Stochastic variance reduction for nonconvex optimization, in Proc. 33rd Int. Conf. Machine Learning, New York, NY, USA, 2016, pp. 314–323.
[55]
A. Krizhevsky, Learning multiple layers of features from tiny images, https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf, 2009.
[56]
Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, Reading digits in natural images with unsupervised feature learning, presented at the 25th Conf. Neural Information Processing Systems Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, 2011.
[57]
A. Coates, A. Ng, and H. Lee, An analysis of single-layer networks in unsupervised feature learning, in Proc. 14th Int. Conf. Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 2011, pp. 215–223.
[58]
B. Athiwaratkun, M. Finzi, P. Izmailov, and A. G. Wilson, There are many consistent explanations of unlabeled data: Why you should average, in Proc. 7th Int. Conf. Learning Representations, New Orleans, LA, USA, 2019.
[59]
T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, A simple framework for contrastive learning of visual representations, in Proc. 37th Int. Conf. Machine Learning, Virtual Event, 2020, p. 149.
[60]
K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, Momentum contrast for unsupervised visual representation learning, in 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 9726–9735.
[61]
X. Chen and K. He, Exploring simple Siamese representation learning, in 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 15745–15753.
[62]
Y. Luo, J. Zhu, M. Li, Y. Ren, and B. Zhang, Smooth neighbors on teacher graphs for semi-supervised learning, in 2018 IEEE Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 8896–8905.
[63]
I. Loshchilov and F. Hutter, SGDR: Stochastic gradient descent with warm restarts, in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017.
[64]
G. Huang, Y. Li, G. Pleiss, Z. Liu, J. E. Hopcroft, and K. Q. Weinberger, Snapshot ensembles: Train 1, get M for free, in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017.
[65]
J. Zhao, M. Mathieu, R. Goroshin, and Y. LeCun, Stacked what-where auto-encoders, arXiv preprint arXiv: 1506.02351, 2015.
[66]
E. Denton, S. Gross, and R. Fergus, Semi-supervised learning with context-conditional generative adversarial networks, arXiv preprint arXiv: 1611.06430, 2016.
[67]
K. Lee, S. Maji, A. Ravichandran, and S. Soatto, Meta-learning with differentiable convex optimization, in 2019 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 10649–10657.
[68]
Z. Wu, Y. Xiong, S. X. Yu, and D. Lin, Unsupervised feature learning via non-parametric instance discrimination, in 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 3733–3742.
[69]
W. Huang, M. Yi, and X. Zhao, Towards the generalization of contrastive self-supervised learning, arXiv preprint arXiv: 2111.00743, 2021.
[70]
J. Li, C. Xiong, and S. C. H. Hoi, CoMatch: Semi-supervised learning with contrastive graph regularization, in 2021 IEEE/CVF Int. Conf. Computer Vision (ICCV), Montreal, Canada, 2021, pp. 9455–9464.
[71]
D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA, USA: Athena Scientific, 1999.