Open Access Issue
Estimating Multiple Socioeconomic Attributes via Home Location—A Case Study in China
Journal of Social Computing 2021, 2(1): 71-88
Published: 16 February 2021

Inferring people’s Socioeconomic Attributes (SEAs), including income, occupation, and education level, is an important problem for both social sciences and many networked applications like targeted advertising and personalized recommendation. Previous works mainly focus on estimating SEAs from people’s cyberspace behaviors and relationships, such as the content of tweets or the social networks between online users. Besides cyberspace data, alternative data sources about users’ physical behavior, like their home location, may offer new insights. More specifically, in this paper, we study how to predict a person’s income level, family income level, occupation type, and education level from his/her home location. As a case study, we collect people’s home locations and socioeconomic attributes through a survey involving 9 provinces and 85 cities in China. We further enrich home location with knowledge from real estate websites, government statistics websites, online map services, etc. To learn a shared representation from input features as well as attribute-specific representations for different SEAs, we propose H2SEA, a factorization machine-based multi-task learning method with an attention mechanism. Extensive experimental results show that: (1) Home location can clearly improve the estimation accuracy for all SEA prediction tasks (e.g., 80.2% improvement in terms of F1-score in estimating personal income level); (2) The proposed H2SEA model outperforms alternative models for SEA inference in terms of various evaluation metrics, such as Area Under Curve (AUC), F-measure, and specificity; (3) The performance of specific SEA prediction tasks (e.g., personal income) can be further improved if H2SEA only focuses on cities or villages due to the urban-rural gap in China; (4) Compared with online crawled housing price data, the area-level average income and Points Of Interest (POI) are more important features for SEA inferences in China.
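For readers unfamiliar with the evaluation metrics named in the abstract (F-measure, specificity, and AUC), the following is a minimal sketch of how they can be computed for a binary task. It is not taken from the paper: the function names `binary_metrics` and `auc` and the toy labels are hypothetical, and the binary setting is a simplifying assumption, since the paper's SEA tasks (e.g., income level) may be multi-class.

```python
def binary_metrics(y_true, y_pred):
    """Return F1-score and specificity for 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Specificity is the true-negative rate: TN / (TN + FP).
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {"f1": f1, "specificity": specificity}


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the paper's survey).
labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]

metrics = binary_metrics(labels, predictions)
area = auc(labels, scores)
```

The pairwise AUC above is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be the usual choice for large datasets.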

Open Access Issue
Understanding the Behavioral Differences Between American and German Users: A Data-Driven Study
Big Data Mining and Analytics 2018, 1(4): 284-296
Published: 02 July 2018

Given that the USA and Germany are the most populous countries in North America and Western Europe, respectively, understanding the behavioral differences between American and German users of online social networks is essential. In this work, we conduct a data-driven study based on the Yelp Open Dataset. We demonstrate the behavioral characteristics of both American and German users from different aspects, i.e., social connectivity, review styles, and spatiotemporal patterns. In addition, we construct a classification model to accurately recognize American and German users according to the behavioral data. Our model achieves high classification performance with an F1-score of 0.891 and AUC of 0.949.
