Open Access

Large-Scale Model Meets Federated Learning: A Hierarchical Hybrid Distributed Training Mechanism for Intelligent Intersection Large-Scale Model

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
Network Security Center, State Grid Henan Electric Power Company Information Communication Branch, Zhengzhou 450052, China

Abstract

The large-scale model (LSM) can handle large-scale data and complex problems, effectively improving the intelligence level of urban intersections. However, traffic conditions at intersections are becoming increasingly complex, so intelligent intersection LSMs (I2LSMs) must be continuously trained and updated. Traditional cloud-based training incurs significant computational and storage overhead and carries a risk of data leakage. The combination of edge artificial intelligence and federated learning provides an efficient computing mode with strong privacy protection. We therefore propose a hierarchical hybrid distributed training mechanism for I2LSMs. First, relying on an intelligent intersection system with cloud-network-terminal integration, we construct an I2LSM hierarchical hybrid distributed training architecture. Then, we propose a hierarchical hybrid federated learning (H2Fed) algorithm that combines the advantages of centralized and decentralized federated learning. Furthermore, we propose an adaptive compressed sensing algorithm to reduce the communication overhead. Finally, we analyze the convergence of the H2Fed algorithm. Experimental results show that H2Fed reduces communication overhead by 21.6% while maintaining model accuracy.
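The overall pipeline the abstract describes — clients train locally, compress their model updates via compressed-sensing-style measurements, and a server aggregates and reconstructs — can be sketched as a toy round of federated averaging. This is an illustrative sketch only, not the paper's H2Fed algorithm or its adaptive compressed sensing scheme: the Gaussian measurement matrix, the least-squares recovery (a stand-in for a proper sparse-recovery algorithm), and the mock local training step are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, lr=0.1):
    """Placeholder for local training: perturb weights with a mock gradient."""
    return weights - lr * rng.normal(scale=0.01, size=weights.shape)

def compress(update, phi):
    """Compressed-sensing style measurement: project the d-dim update to m values."""
    return phi @ update

def decompress(measurement, phi):
    """Minimum-norm least-squares reconstruction (stand-in for CS recovery)."""
    return np.linalg.lstsq(phi, measurement, rcond=None)[0]

d, m, n_clients = 64, 32, 4                  # m < d: each client sends m floats, not d
phi = rng.normal(size=(m, d)) / np.sqrt(m)   # shared measurement matrix
global_w = np.zeros(d)

for _round in range(3):
    # Client tier: each client trains locally and uploads a compressed update.
    measurements = [
        compress(local_update(global_w.copy()) - global_w, phi)
        for _ in range(n_clients)
    ]
    # Aggregation tier: average in the compressed domain, reconstruct once.
    avg_meas = np.mean(measurements, axis=0)
    global_w = global_w + decompress(avg_meas, phi)
```

In this sketch each client uploads m = 32 values instead of d = 64 per round, which is where the communication saving comes from; an adaptive scheme like the one the abstract mentions would vary the measurement dimension per round rather than fix it.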

Big Data Mining and Analytics
Pages 1031-1049
Cite this article:
Liu C, Guo S, Dang F, et al. Large-Scale Model Meets Federated Learning: A Hierarchical Hybrid Distributed Training Mechanism for Intelligent Intersection Large-Scale Model. Big Data Mining and Analytics, 2024, 7(4): 1031-1049. https://doi.org/10.26599/BDMA.2024.9020029