Multi-task learning (MTL) can boost the performance of individual tasks through mutual learning among multiple related tasks. However, when these tasks have diverse complexities, their corresponding losses in the MTL objective inevitably compete with one another, ultimately biasing the learning toward simple tasks at the expense of complex ones. To address this imbalanced learning problem, we propose a novel MTL method that equips multiple existing deep MTL model architectures with a sequential cooperative distillation (SCD) module. Specifically, we first introduce an efficient mechanism to measure the similarity between tasks and group similar tasks into the same block, allowing them to learn cooperatively from each other. The grouped task blocks are then sorted into a queue that determines the learning sequence, according to task complexities estimated with a defined performance indicator. Finally, distillation between the individual task-specific models and the MTL model is performed block by block, in order from complex to simple, achieving a balance between competition and cooperation in learning multiple tasks. Extensive experiments demonstrate that our method is significantly more competitive than state-of-the-art methods, ranking first in average performance across multiple datasets, with improvements of 12.95% and 3.72% over OMTL and MTLKD, respectively.
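The pipeline sketched above (group similar tasks into blocks, then order the blocks by estimated complexity to form the distillation queue) can be illustrated with a minimal sketch. Note that the similarity measure (cosine similarity over per-task feature vectors), the greedy grouping threshold, and the scalar complexity scores below are illustrative placeholders, not the paper's actual definitions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def group_tasks(task_vecs, threshold=0.5):
    """Greedily place each task into the first block whose representative
    task is similar enough; otherwise start a new block."""
    blocks = []
    for name, vec in task_vecs.items():
        for block in blocks:
            if cosine(vec, task_vecs[block[0]]) >= threshold:
                block.append(name)
                break
        else:
            blocks.append([name])
    return blocks

def order_blocks(blocks, complexity):
    """Sort blocks from most to least complex (mean per-task score),
    giving the complex-to-simple distillation queue."""
    return sorted(blocks, key=lambda b: -sum(complexity[t] for t in b) / len(b))

# Toy example: feature vectors and complexity scores are made up.
vecs = {"seg": [1.0, 0.9, 0.0], "depth": [0.95, 1.0, 0.0], "cls": [0.0, 0.1, 1.0]}
scores = {"seg": 0.8, "depth": 0.7, "cls": 0.2}
queue = order_blocks(group_tasks(vecs), scores)
print(queue)  # → [['seg', 'depth'], ['cls']]
```

Distillation would then visit the queue in order, transferring knowledge between the task-specific models and the shared MTL model one block at a time.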