Research paper | Open Access

Crowdsourcing for search engines: perspectives and challenges

Young Researchers and Elite Club, Qazvin Branch, Islamic Azad University, Qazvin, Iran

Abstract

Purpose

As a relatively new computing paradigm, crowdsourcing has gained enormous attention over the past decade. Its alignment with Web 2.0 principles also opens unprecedented opportunities to empower related services and mechanisms by leveraging human intelligence and problem-solving abilities. Given the pivotal role of search engines on the Web and in the information community, this paper aims to investigate the advantages and challenges of incorporating people, as intelligent agents, into search engines' workflow.
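To make the idea concrete, the following is a minimal, hypothetical sketch (not taken from the paper) of one way a crowd could be folded into a search engine's workflow: when the engine's confidence in its own ranking is low, the top candidates are handed to crowd workers for a relevance vote. The names, threshold, and hard-coded votes are illustrative assumptions only.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical machine-ranked candidates for an ambiguous query,
# together with the engine's confidence in its own top result.
machine_results: List[str] = ["doc_a", "doc_b", "doc_c"]
machine_confidence: float = 0.42

CONFIDENCE_THRESHOLD = 0.7  # below this, defer to the crowd


def ask_crowd(query: str, candidates: List[str]) -> Dict[str, int]:
    """Stand-in for publishing a relevance-judgment task to a
    crowdsourcing platform; workers' votes are hard-coded here."""
    return {"doc_b": 3, "doc_a": 1, "doc_c": 1}


def crowd_powered_rank(query: str) -> List[str]:
    # If the engine is confident, skip the slower, costlier crowd step.
    if machine_confidence >= CONFIDENCE_THRESHOLD:
        return machine_results
    votes = Counter(ask_crowd(query, machine_results))
    # Re-rank candidates by the number of crowd votes they received.
    return [doc for doc, _ in votes.most_common()]


print(crowd_powered_rank("jaguar speed"))  # -> ['doc_b', 'doc_a', 'doc_c']
```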

Design/methodology/approach

To highlight the role of humans in computational processes, several closely related areas are first reviewed. Then, by examining current trends in the field of crowd-powered search engines and analyzing actual needs and requirements, the perspectives and challenges are discussed.

Findings

As research on this topic is still in its infancy, this study can serve as a roadmap for future work in the field. To that end, the current status and development trends are delineated through a general overview of the literature. Moreover, several recommendations are presented for extending the applicability and efficiency of the next generation of crowd-powered search engines. Awareness of the different aspects and challenges of constructing such search engines can guide the development of working systems that take these essential considerations into account.

Originality/value

The present study aims to portray the big picture of crowd-powered search engines and the possible challenges and issues. As one of the early works to provide a comprehensive report on the different aspects of the topic, it can be regarded as a reference point.

Cite this article:
Moradi M. Crowdsourcing for search engines: perspectives and challenges. International Journal of Crowd Science, 2019, 3(1): 49-62. https://doi.org/10.1108/IJCS-12-2018-0026