Quality-Aware User Recruitment Based on Federated Learning in Mobile Crowd Sensing

Wei Zhang; Zhuo Li; Xin Chen

doi:10.26599/TST.2020.9010046

Open Access

Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, and School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China

School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China

Abstract

With the rapid development of mobile devices, the use of Mobile Crowd Sensing (MCS) mode has become popular to complete more intelligent and complex sensing tasks. However, large-scale data collection may reduce the quality of sensed data. Thus, quality control is a key problem in MCS. With the emergence of the federated learning framework, the number of complex intelligent calculations that can be completed on mobile devices has increased. In this study, we formulate a quality-aware user recruitment problem as an optimization problem. We predict the quality of sensed data from different users by analyzing the correlation between data and context information through federated learning. Furthermore, the lightweight neural network model located on mobile terminals is used. Based on the prediction of sensed quality, we develop a user recruitment algorithm that runs on the cloud platform through terminal-cloud collaboration. The performance of the proposed method is evaluated through simulations. Results show that compared with existing algorithms, i.e., Random Adaptive Greedy algorithm for User Recruitment (RAGUR) and Context-Aware Tasks Allocation (CATA), the proposed method improves the quality of sensed data by 23.5% and 38.8%, respectively.

Keywords

crowd sensing; federated learning; quality aware; user recruitment

References

[1] Estrin, K. M. Chandy, R. M. Young, L. Smarr, A. Odlyzko, D. Clark, V. Reding, T. Ishida, S. Sharma, V. G. Cerf, et al., Participatory sensing: Applications and architecture, IEEE Internet Computing, vol. 14, no. 1, pp. 12-42, 2010.

[2] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Arcas, Communication-efficient learning of deep networks from decentralized data, in Proc. 20th Int. Conf. Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 2017, pp. 1273-1282.

[3] Tomasoni, A. Capponi, C. Fiandrino, D. Kliazovich, F. Granelli, and P. Bouvry, Profiling energy efficiency of mobile crowdsensing data collection frameworks for smart city applications, in Proc. 2018 6th IEEE Int. Conf. Mobile Cloud Computing, Services, and Engineering, Bamberg, Germany, 2018, pp. 1-8.

[4] S. Z. Liu, Z. Z. Zheng, F., S. J. Tang, and G. H. Chen, Context-aware data quality estimation in mobile crowdsensing, presented at IEEE INFOCOM 2017-IEEE Conf. Computer Communications, Atlanta, GA, USA, 2017, pp. 1-9.

[5] Hassani, P. D. Haghighi, and P. P. Jayaraman, Context-aware recruitment scheme for opportunistic mobile crowdsensing, in Proc. IEEE 21st Int. Conf. Parallel and Distributed Systems, Melbourne, Australia, 2015, pp. 266-273.

[6] Lease, On quality control and machine learning in crowdsourcing, in Proc. 11th AAAI Conf. Human Computation, California, CA, USA, 2011, pp. 97-102.

[7] Fiandrino, F. Anjomshoa, B. Kantarci, D. Kliazovich, P. Bouvry, and J. N. Matthews, Sociability-driven framework for data acquisition in mobile crowdsensing over fog computing platforms for smart cities, IEEE Transactions on Sustainable Computing, vol. 2, no. 4, pp. 345-358, 2017.

[8] C. S. Meng, W. J. Jiang, Y. L., J. Gao, L., H. Ding, and Y. Cheng, Truth discovery on crowd sensing of correlated entities, in Proc. 13th ACM Conf. Embedded Networked Sensor Systems, Seoul, Republic of Korea, 2015, pp. 169-182.

[9] G. R., S. C. Peng, C. Wang, J. W. Niu, and Y. Yuan, An energy-efficient data collection scheme using denoising autoencoder in wireless sensor networks, Tsinghua Science and Technology, vol. 24, no. 1, pp. 86-96, 2019.

[10] Yang, F., S. J. Tang, X. F. Gao, B. Yang, and G. H. Chen, On designing data quality-aware truth estimation and surplus sharing method for mobile crowdsensing, IEEE Journal on Selected Areas in Communications, vol. 35, no. 4, pp. 832-847, 2017.

[11] X. X. Yin, J. W. Han, and P. S., Truth discovery with multiple conflicting information providers on the web, IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 6, pp. 796-808, 2008.

[12] Y. L., J. Gao, L., B. Zhao, M. Demirbas, W. Fan, and J. W. Han, A confidence-aware approach for truth discovery on long-tail data, Proceedings of the VLDB Endowment, vol. 8, no. 4, pp. 425-436, 2014.

[13] X. L. Dong, L. Berti-Equille, and D. Srivastava, Truth discovery and copying detection in a dynamic world, Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 562-573, 2009.

[14] Zhao, B. I. P. Rubinstein, J. Gemmell, and J. W. Han, A Bayesian approach to discovering truth from conflicting sources for data integration, Proceedings of the VLDB Endowment, vol. 5, no. 6, pp. 550-561, 2012.

[15] X. X. Yin and W. Z. Tan, Semi-supervised truth discovery, in Proc. 20th Int. Conf. World Wide Web, Hyderabad, India, 2011, pp. 217-226.

[16] Ribeiro, D. Florêncio, C. Zhang, and M. Seltzer, CROWDMOS: An approach for crowdsourcing mean opinion score studies, in Proc. 2011 IEEE Int. Conf. Acoustics, Speech and Signal Processing, Prague, Czech Republic, 2011, pp. 2416-2419.

[17] Y. Q. Zheng, G. B. Shen, L. Q., C. S. Zhao, M., and F. Zhao, Travi-Navi: Self-deployable indoor navigation system, IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 2655-2669, 2017.

[18] Reddy, A. Parker, J. Hyman, J. Burke, D. Estrin, and M. Hansen, Image browsing, processing, and clustering for participatory sensing: Lessons from a DietSense prototype, in Proc. 4th Workshop on Embedded Networked Sensors, Cork, Ireland, 2007, pp. 13-17.

[19] E. T. H. Chu, C. Y. Lin, P. H. Tsai, and J. W. S. Liu, Participant selection for crowdsourcing disaster information, in Proc. 3rd Int. Conf. Disaster Management and Human Health Disaster Management, Vienna, Austria, 2013, pp. 231-240.

[20] Tomasoni, A. Capponi, C. Fiandrino, D. Kliazovich, F. Granelli, and P. Bouvry

[21] D. Q. Zhang, H. Y. Xiong, L. Y. Wang, and G. L. Chen, CrowdRecruiter: Selecting participants for piggyback crowdsensing under probabilistic coverage constraint, in Proc. 2014 ACM Int. Joint Conf. Pervasive and Ubiquitous Computing, Seattle, WA, USA, 2014, pp. 703-714.

[22] S. Pack, and V. C. M. Leung, Coverage-guaranteed and energy-efficient participant selection strategy in mobile crowdsensing, IEEE Internet of Things Journal, vol. 6, no. 2, pp. 3202-3211, 2019.

[23] Azzam, R. Mizouni, H. Otrok, A. Ouali, and S. Singh, GRS: A group-based recruitment system for mobile crowd sensing, Journal of Network & Computer Applications, vol. 72, pp. 38-50, 2016.

[24] Y. L. Zhang, H. Y. Qin, B., J. Wang, S. Lee, and Z. Q. Huang, Truthful mechanism for crowdsourcing task assignment, Tsinghua Science and Technology, vol. 23, no. 6, pp. 645-659, 2018.

[25] A. K. Sahu, A. Talwalkar, and V. Smith, Federated learning: Challenges, methods, and future directions, IEEE Signal Processing Magazine, vol. 37, no. 3, pp. 50-60, 2020.

[26] L. N. Liu, X. Chen, Z. M., L. H. Wang, and X. M. Wen, Mobile-edge computing framework with data compression for wireless network in energy internet, Tsinghua Science and Technology, vol. 24, no. 3, pp. 271-280, 2019.

[27] X. F. Wang, Y. W. Han, C. Y. Wang, Q. Y. Zhao, X. Chen, and M. Chen, In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning, IEEE Network, vol. 33, no. 5, pp. 156-165, 2019.

[28] Yang, Y. Liu, T. J. Chen, and Y. X. Tong, Federated machine learning: Concept and applications, ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, p. 12, 2019.

[29] Liu, S. J. Tang, X. G. Sun, Q. Y. Chen, J. X. Cao, J. Z. Luo, and S. S. Zhao, Context-aware social media user sentiment analysis, Tsinghua Science and Technology, vol. 25, no. 4, pp. 528-541, 2020.

[30] Haruhiko, K. Hidehiko, and H. Terumine, A study on the simple penalty term to the error function from the viewpoint of fault tolerant training, in Proc. 2004 IEEE Int. Joint Conf. Neural Networks, Budapest, Hungary, 2004, pp. 1045-1050.

Tsinghua Science and Technology

Volume 26, Issue 6, December 2021

Pages 869-877

DOI: 10.26599/TST.2020.9010046

Cite this article:

Zhang W, Li Z, Chen X. Quality-Aware User Recruitment Based on Federated Learning in Mobile Crowd Sensing. Tsinghua Science and Technology, 2021, 26(6): 869-877. https://doi.org/10.26599/TST.2020.9010046

Fig. 1 Federated learning-based user recruitment in MCS.

4.2 Prediction model for sensing data quality

In Internet of Things networks, wearable devices, autonomous vehicles, or smart homes may contain numerous sensors that allow them to collect large amounts of data in real time for sensing some special scenarios and events^{[25]}. However, building analysis models in these scenarios and events may be difficult due to the private nature of data and the limited connectivity of devices. Liu et al.^{[26]} proposed a novel framework that utilizes the local area network to collect data from users and reduce transmission latency without extra energy consumption overhead. Some research works have attempted to use new intelligent methods to build the model. Wang et al.^{[27]} designed the "In-Edge AI" framework to utilize collaboration among devices and edge nodes intelligently and thus exchange the learning parameters for the improved training and inference of the models. Yang et al.^{[28]} proposed building data networks among organizations on the basis of federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.

In this section, we use the federated learning framework to build the prediction model of sensing quality. The system architecture is shown in Fig. 1. We divide it into two stages. First, a lightweight neural network is used to construct the relationship between context and sensing data quality. Afterward, a federated learning framework is adopted to solve the relationship model and predict the sensing data quality of mobile users. We predict the data quality of user $u_{i}$ in a sensing task through the context information $C_{i}$. Context is any information that can be used to model the situation of a specific user^{[29]}. This scenario has many context features, such as user identity, time, location, and activity. Federated learning serves as an on-device distributed training system, which learns a shared global model from distributed mobile devices while keeping the training data on each device. This novel distributed learning paradigm harnesses the benefits of low latency and low power consumption while preserving user privacy. Federated learning gathers mobile users to participate in server-side model training, and the optimization model $f(w)$ with constant parameter $λ$ and variable parameter $w$ is as follows:

$\min_{w} f (w) = \sum_{i = 1}^{m} {(y_{i} - w^{T} C_{i} - \frac{λ}{n} \sum_{i} {‖ w_{i} ‖}^{n})}^{2},$

where $y_{i}$ is the sensing data quality calculated by the above algorithm.

By training the model parameters, the relationship between context and sensing data quality is established. The standard Back Propagation (BP) algorithm uses the squared error as the objective function and minimizes it by gradient descent. As the number of iterations increases, the function approximation slows down; hence, the accuracy of approximation for highly nonlinear samples cannot be guaranteed. Considering that contextual fields differ in their relevance to the sensing data quality, we introduce a regularized penalty function to quickly filter out the contextual data with low relevance.

(5) $r (w) = \frac{λ}{n} \sum_{i} {‖ w_{i} ‖}^{n}$

The penalty function $r(w)$ in Eq. (5) was proposed in Ref. [30] to prevent possible overfitting of the error terms, which would reduce the generalization ability of the network; it thereby improves the fault tolerance of the model. This lightweight neural network is built locally on the mobile terminal and trained with the federated gradient descent algorithm. A dataset $D_{i} = \{(X_{i}, y_{i}) : i = 1, 2, \dots, n\}$, whose size differs from user to user, is located on the mobile terminal of user $u_{i}$. The loss function of the local training model $f_{i}(w)$ is $δ(x_{i}, y_{i}, w_{i})$. Federated learning uses an effective parameter aggregation algorithm to train the model: mobile users can update all model training parameters without uploading local data, which greatly reduces the overhead of the entire sensing system. The typical gradient descent algorithm computes the gradient of the global model, $\frac{\partial f(w_{t})}{\partial w_{t}}$, in each round $t$. With a fixed learning rate $η$, the model weights are updated as in Formula (6):

(6) $w_{t + 1} \leftarrow w_{t} - η \frac{\partial f (w_{t})}{\partial w_{t}}$
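As a rough illustration of Eq. (5) and Formula (6), the following Python sketch evaluates the penalty $r(w)$, the objective $f(w)$ as written above, and one gradient descent step. The linear predictor, the toy data, the choice $n = 2$, and the finite-difference gradient (used here in place of back propagation) are assumptions made for the example, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence


def penalty(w: Sequence[float], lam: float, n: int = 2) -> float:
    """Eq. (5): r(w) = (lam / n) * sum_i |w_i|^n (n = 2 is an assumed default)."""
    return lam / n * sum(abs(w_i) ** n for w_i in w)


def objective(w: Sequence[float], contexts: Sequence[Sequence[float]],
              qualities: Sequence[float], lam: float, n: int = 2) -> float:
    """f(w) as displayed in the text: sum_i (y_i - w^T C_i - r(w))^2."""
    r = penalty(w, lam, n)
    total = 0.0
    for c_i, y_i in zip(contexts, qualities):
        prediction = sum(w_k * c_k for w_k, c_k in zip(w, c_i))
        total += (y_i - prediction - r) ** 2
    return total


def numerical_gradient(f: Callable[[List[float]], float],
                       w: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Central finite differences; an illustrative stand-in for back propagation."""
    grad = []
    for k in range(len(w)):
        up, down = list(w), list(w)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradient_step(f: Callable[[List[float]], float],
                  w: Sequence[float], eta: float) -> List[float]:
    """Formula (6): w_{t+1} <- w_t - eta * df(w_t)/dw_t."""
    grad = numerical_gradient(f, w)
    return [w_k - eta * g_k for w_k, g_k in zip(w, grad)]


# Toy data (hypothetical): three context vectors and their quality labels.
contexts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
qualities = [1.0, 2.0, 3.0]


def f(w: Sequence[float]) -> float:
    return objective(w, contexts, qualities, lam=0.1)


w0 = [0.0, 0.0]
w1 = gradient_step(f, w0, eta=0.01)
```

With a sufficiently small learning rate, as in this toy setting, one step lowers the objective; in practice the step size would need tuning.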

Meanwhile, the average federated gradient descent algorithm calculates $g_{i} = \frac{\partial f_{i}(w_{t})}{\partial w_{t}}$ on each user's local dataset, and the central cloud server aggregates the results of the $m$ users, weighting the result of user $u_{i}$, whose local data size is $n_{i}$, by $\frac{n_{i}}{n}$, to perform average gradient descent as in Formula (7).

(7) $w_{t + 1} \leftarrow w_{t} - η \sum_{i = 1}^{m} \frac{n_{i}}{n} g_{i}$

The local parameters of each user $u_{i}$ are updated as $w_{t + 1}^{i} \leftarrow w_{t} - η g_{i}$ and then aggregated as follows: $w_{t + 1} \leftarrow \sum_{i = 1}^{m} \frac{n_{i}}{n} w_{t + 1}^{i}$. The parameter updating algorithm for the federated learning model is summarized as Algorithm 1.
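One aggregation round of Formula (7) might look like the sketch below. It assumes, for illustration only, that each user's local loss is the mean squared error of a linear model and that $n$ is the total number of samples over all users, as in federated averaging [2]; the function names are hypothetical. The second function averages locally updated parameters instead of gradients, and under these assumptions both give the same global weights.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (feature vector, target)


def local_gradient(w: Sequence[float], data: Sequence[Sample]) -> List[float]:
    """g_i: gradient of the mean squared error of a linear model on one user's data."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(w_k * x_k for w_k, x_k in zip(w, x)) - y
        for k, x_k in enumerate(x):
            grad[k] += 2.0 * err * x_k / len(data)
    return grad


def fedavg_gradients(w: Sequence[float], user_data: Sequence[Sequence[Sample]],
                     eta: float) -> List[float]:
    """Formula (7): w_{t+1} <- w_t - eta * sum_i (n_i / n) * g_i."""
    n = sum(len(data) for data in user_data)  # assumed: total sample count
    agg = [0.0] * len(w)
    for data in user_data:
        g_i = local_gradient(w, data)
        for k in range(len(w)):
            agg[k] += len(data) / n * g_i[k]
    return [w_k - eta * a_k for w_k, a_k in zip(w, agg)]


def fedavg_parameters(w: Sequence[float], user_data: Sequence[Sequence[Sample]],
                      eta: float) -> List[float]:
    """Each user takes a local step; the server averages the resulting parameters."""
    n = sum(len(data) for data in user_data)
    new_w = [0.0] * len(w)
    for data in user_data:
        g_i = local_gradient(w, data)
        local_w = [w_k - eta * g_k for w_k, g_k in zip(w, g_i)]
        for k in range(len(w)):
            new_w[k] += len(data) / n * local_w[k]
    return new_w


# Two hypothetical users holding one and two samples, respectively.
user_data = [
    [([1.0], 2.0)],
    [([1.0], 4.0), ([2.0], 8.0)],
]
w_grad = fedavg_gradients([0.0], user_data, eta=0.1)
w_param = fedavg_parameters([0.0], user_data, eta=0.1)
```

Only gradients or parameters leave the device in either variant; the raw samples in `user_data` are never pooled.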

4.3 User recruitment algorithm

On the basis of federated learning, we can analyze the relationship between context and sensing data quality and predict the sensing data quality of a user from the user's context. Using the predicted sensing quality of different users, we can recruit appropriate users for the sensing tasks. To minimize the sensing cost, we consider the physical location of the users. We design a Quality-Aware User Recruitment (QAUR) algorithm to decide which users are recruited. For every user, we decide whether to recruit the user or not. If the answer is yes, we update the current recruited group and the group's sensed data quality with the predicted results; moreover, the travel cost constraint is updated. If the answer is no, the current state remains the previous optimal solution. The detailed recruitment process is shown in Algorithm 2.

Dynamic programming is used to solve the problem. In each step, $p(i, j, U)$ is the maximum sensed quality obtainable when $i$ sensing users are considered for recruitment under a total distance $j$, with $U$ the optimal user set for recruitment. Each candidate user presents two choices: either user $u_{i}$ is selected to participate in the sensing task or it is not. We describe the concepts used in the recursion process. On the one hand, among the subsets that do not include user $u_{i}$, the sensed quality of the optimal subset is $p(i - 1, j, U \setminus u_{i})$. On the other hand, among the subsets that include user $u_{i}$, the optimal subset consists of this user and the optimal subset of $p(i - 1, j - l_{i}, U \setminus u_{i})$, and its sensed quality is $q(u_{i}, C_{i}) + p(i - 1, j - l_{i}, U \setminus u_{i})$. $q_{i}$ is the real sensed quality of user $u_{i}$. Therefore, the optimal solution for $p(i, j, U)$ is the greater of these two values. If user $u_{i}$ is not recruited, then the sensing quality of the best subset is equal to the sensing quality of the best subset selected from the previous $i - 1$ users. Thus, the following recurrence is obtained:

(8) $p(i, j, U) = \begin{cases} \max \{ p(i - 1, j, U \setminus u_{i}), q(u_{i}, C_{i}) + p(i - 1, j - l_{i}, U \setminus u_{i}) \}, & j - l_{i} \geqslant 0; \\ p(i - 1, j, U \setminus u_{i}), & j - l_{i} < 0 \end{cases}$

In the solution process, we use the user context information $(c_{1}, c_{2}, \dots, c_{n})$ to predict the user's current sensing quality $q(u_{i}, C_{i}) = f(w)$ and thereby approximate the optimal solution of the problem. The classical dynamic programming approach works from the bottom up, filling a table with the solutions to all smaller subproblems. Each subproblem is solved once, but some of these solutions are not needed for a given problem; we therefore combine the advantages of the top-down and bottom-up approaches. Only the necessary subproblems are solved, each only once, and their results are stored in a memory function that is consulted every time users are recruited. The user recruitment algorithm combined with federated learning not only makes full use of system resources but also effectively uses user context information to estimate future sensing quality.
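The recurrence in Eq. (8), solved top-down with a memory function, can be sketched in Python as follows. The sketch assumes integer distances $l_{i}$, takes the predicted qualities $q(u_{i}, C_{i})$ as given numbers, and drops the set argument $U$ because it is determined by $i$; `recruit_users` and its parameters are hypothetical names, not the paper's Algorithm 2.

```python
from functools import lru_cache
from typing import Sequence, Tuple


def recruit_users(qualities: Sequence[float], distances: Sequence[int],
                  budget: int) -> Tuple[float, Tuple[int, ...]]:
    """Return (best total predicted quality, indices of recruited users).

    qualities[k] plays the role of q(u_i, C_i), distances[k] the role of l_i,
    and budget the role of the total distance j (all 0-indexed here).
    """

    @lru_cache(maxsize=None)  # memory function: each (i, j) is solved at most once
    def p(i: int, j: int) -> Tuple[float, Tuple[int, ...]]:
        if i == 0:
            return 0.0, ()
        skip = p(i - 1, j)             # user i is not recruited
        l_i = distances[i - 1]
        if j - l_i < 0:                # second case of Eq. (8)
            return skip
        prev_quality, prev_users = p(i - 1, j - l_i)
        take = (qualities[i - 1] + prev_quality, prev_users + (i - 1,))
        return take if take[0] > skip[0] else skip

    return p(len(qualities), budget)


# Hypothetical example: three candidate users and a distance budget of 4.
best_quality, recruited = recruit_users(
    qualities=[0.6, 0.9, 0.5], distances=[2, 3, 1], budget=4
)
```

Because the recursion starts from the full problem and caches results, only the subproblems actually reachable from `p(len(qualities), budget)` are evaluated, which is the saving the text attributes to combining the top-down and bottom-up approaches.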

Fig. 2 Sensing quality of data with different distance constraints.

Fig. 3 Sensing quality of data with different numbers of users.
