Regular Paper

FedBone: Towards Large-Scale Federated Multi-Task Learning

Beijing Key Laboratory of Mobile Computing and Pervasive Devices, Institute of Computing Technology Chinese Academy of Sciences, Beijing 100190, China
University of Chinese Academy of Sciences, Beijing 100190, China

Abstract

Federated multi-task learning (FMTL) has emerged as a promising framework for learning multiple tasks simultaneously with client-aware personalized models. While most studies have focused on the non-independent and identically distributed (Non-IID) characteristics of client datasets, the issue of task heterogeneity has largely been overlooked. Handling task heterogeneity often requires complex models, which makes such approaches impractical for federated learning in resource-constrained environments. In addition, the varying nature of these heterogeneous tasks introduces inductive biases, leading to interference during aggregation and potentially resulting in biased global models. To address these issues, we propose a hierarchical FMTL framework, referred to as FedBone, to facilitate the construction of large-scale models with improved generalization. FedBone leverages server-client split learning and gradient projection to split the entire model into two components: 1) a large-scale general model on the cloud server, and 2) multiple task-specific models (referred to as client models) on edge clients, accommodating devices with limited compute power. To enhance the robustness of the large-scale general model, we incorporate the conflicting gradient projection technique into FedBone to rectify the skewed gradient direction caused by aggregating gradients from heterogeneous tasks. The proposed FedBone framework is evaluated on three benchmark datasets and one real ophthalmic dataset. Comprehensive experiments demonstrate that FedBone efficiently adapts to the heterogeneous local tasks of each client and outperforms existing federated learning algorithms on various dense prediction and classification tasks while utilizing off-the-shelf computational resources on the client side.
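
For readers unfamiliar with conflicting gradient projection, the sketch below illustrates the general idea in the spirit of PCGrad-style gradient surgery: when two per-task gradients point in conflicting directions (negative inner product), one is projected onto the normal plane of the other before aggregation. This is a minimal NumPy illustration under assumed conditions (flattened per-task gradient vectors; the function name project_conflicting_gradients is hypothetical), not FedBone's actual server-side implementation.

import numpy as np

def project_conflicting_gradients(task_grads, seed=0):
    # task_grads: list of flattened per-task gradient vectors for the
    # shared general model (one vector per client task).
    rng = np.random.default_rng(seed)
    rectified = []
    for i, g_i in enumerate(task_grads):
        g = g_i.copy()
        # Visit the other tasks in random order, as in gradient surgery.
        for j in rng.permutation(len(task_grads)):
            if j == i:
                continue
            g_j = task_grads[j]
            dot = g @ g_j
            if dot < 0:
                # Conflict: remove the component of g along g_j by
                # projecting g onto the normal plane of g_j.
                g -= (dot / (g_j @ g_j + 1e-12)) * g_j
        rectified.append(g)
    # Average the rectified gradients into a single update direction
    # for the large-scale general model on the server.
    return np.mean(rectified, axis=0)

# Toy example: two tasks whose gradients conflict (g1 @ g2 < 0).
g1 = np.array([1.0, 0.5])
g2 = np.array([-1.0, 0.8])
print(project_conflicting_gradients([g1, g2]))

After projection, the averaged update no longer contains components that directly oppose any individual task's gradient, which is the property the abstract refers to as rectifying the skewed gradient direction.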

Electronic Supplementary Material

Video: JCST-3639-Video.mp4
Highlights: JCST-2308-13639-Highlights.pdf (731.3 KB)

Journal of Computer Science and Technology
Pages 1040-1057
Cite this article:
Chen Y-Q, Zhang T, Jiang X-L, et al. FedBone: Towards Large-Scale Federated Multi-Task Learning. Journal of Computer Science and Technology, 2024, 39(5): 1040-1057. https://doi.org/10.1007/s11390-024-3639-x


Received: 03 August 2023
Accepted: 13 March 2024
Published: 05 December 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024