Federated Learning (FL) enables clients to securely share gradients computed on their local data with the server, eliminating the need to expose their sensitive local datasets directly. In traditional FL, however, the server may exploit its dominant position during model aggregation to infer sensitive information from the clients' shared gradients. At the same time, malicious clients may submit forged gradients during model training. Such behavior not only compromises the integrity of the global model but also diminishes the usability and reliability of the trained models. To address these privacy and security issues, this work proposes a Blockchain-based Privacy-preserving and Secure Federated Learning (BPS-FL) scheme, which employs threshold homomorphic encryption to protect the clients' local gradients. To resist malicious gradient attacks, we design a Byzantine-robust aggregation protocol for BPS-FL that performs secure model aggregation at the ciphertext level. Moreover, we use a blockchain as the underlying distributed architecture to record the entire learning process, which ensures the immutability and traceability of the data. Our extensive security analysis and numerical evaluation demonstrate that BPS-FL satisfies the privacy requirements and can effectively defend against poisoning attacks.
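
To make the idea of Byzantine-robust aggregation concrete, the following is a minimal sketch that uses the coordinate-wise median, a standard robust aggregator. It is not the BPS-FL protocol: it operates on plaintext gradients, whereas the scheme described above aggregates at the ciphertext level, and the function names and toy gradient values are assumptions made purely for illustration.

```python
from statistics import median


def coordinate_median(gradients):
    """Aggregate client gradients by taking the median of each coordinate.

    Illustrative stand-in for a Byzantine-robust aggregator; works on
    plaintext lists of floats, not on encrypted gradients.
    """
    if not gradients:
        raise ValueError("need at least one gradient")
    dim = len(gradients[0])
    if any(len(g) != dim for g in gradients):
        raise ValueError("all gradients must have the same length")
    return [median(g[i] for g in gradients) for i in range(dim)]


def coordinate_mean(gradients):
    """Plain averaging, shown only for comparison with the median."""
    n = len(gradients)
    return [sum(column) / n for column in zip(*gradients)]


# Hypothetical toy data: four honest clients and one poisoned update.
honest = [[0.10, -0.20], [0.12, -0.18], [0.08, -0.22], [0.11, -0.21]]
poisoned = honest + [[100.0, 100.0]]

robust = coordinate_median(poisoned)  # stays within the honest range
naive = coordinate_mean(poisoned)     # dragged far away by the outlier
```

In this toy setting a single forged gradient shifts the plain average by a large amount, while the median remains inside the range of the honest updates, which is the property a Byzantine-robust aggregator is meant to provide.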


The COVID-19 pandemic has hit the world hard. Reactions to pandemic-related issues have been pouring into social platforms such as Twitter. Many public officials and governments use Twitter to make policy announcements, and people closely follow this information and express their concerns about the policies on Twitter. Deriving important information or knowledge from such Twitter data is beneficial yet challenging. In this paper, we propose a Tripartite Graph Clustering for Pandemic Data Analysis (TGC-PDA) framework that builds on three components: (1) a tripartite graph representation, (2) non-negative matrix factorization with regularization, and (3) sentiment analysis. We collect tweets containing a set of keywords related to the coronavirus pandemic as the ground truth data. Our framework can detect communities of Twitter users and analyze the topics discussed in those communities. Extensive experiments show that the TGC-PDA framework can effectively and efficiently identify the topics and correlations within the Twitter data for monitoring and understanding public opinion, which would provide policy makers with useful information and statistics for decision making.
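
As an illustration of the second component, the sketch below factors a small non-negative matrix with multiplicative updates and an L2 (Frobenius) penalty on both factors. It is a generic regularized NMF, not the TGC-PDA model: the form of the regularizer, the parameter names (`rank`, `lam`, `iters`), and the toy user-by-term matrix are all assumptions made for this example.

```python
import numpy as np


def regularized_nmf(X, rank, lam=0.1, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative X (m x n) as W @ H.

    Minimizes ||X - W H||_F^2 + lam * (||W||_F^2 + ||H||_F^2) with
    multiplicative updates, which keep W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        # Update H with W fixed, then W with H fixed.
        H *= (W.T @ X) / (W.T @ W @ H + lam * H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + lam * W + eps)
    return W, H


# Hypothetical toy matrix: rows are users, columns are terms.
X = np.array([
    [5.0, 5.0, 0.0, 0.0],
    [5.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 3.0, 3.0],
])

W, H = regularized_nmf(X, rank=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
labels = W.argmax(axis=1)  # one possible way to read off a cluster per row
```

Reading each row's largest entry in `W` as its cluster label is one simple convention for turning the factorization into a clustering; the penalty weight `lam` trades reconstruction accuracy against the size of the factors.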