Research paper | Open Access

Crowdsourcing for search engines: perspectives and challenges

Young Researchers and Elite Club, Qazvin Branch, Islamic Azad University, Qazvin, Iran

Abstract

Purpose

As a relatively new computing paradigm, crowdsourcing has gained enormous attention over the past decade. Its alignment with Web 2.0 principles also opens unprecedented opportunities to empower related services and mechanisms by leveraging human intelligence and problem-solving abilities. Given the pivotal role of search engines on the Web and in the information community, this paper aims to investigate the advantages and challenges of incorporating people, as intelligent agents, into search engines' workflow.
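To make the idea concrete, the following is a minimal, hypothetical sketch (not taken from the paper) of one way a crowd could be folded into a search engine's workflow: when the engine's confidence in its own ranking is low, the top candidates are handed to crowd workers for a relevance vote. The names, threshold, and hard-coded votes are illustrative assumptions only.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical machine-ranked candidates for an ambiguous query,
# together with the engine's confidence in its own top result.
machine_results: List[str] = ["doc_a", "doc_b", "doc_c"]
machine_confidence: float = 0.42

CONFIDENCE_THRESHOLD = 0.7  # below this, defer to the crowd


def ask_crowd(query: str, candidates: List[str]) -> Dict[str, int]:
    """Stand-in for publishing a relevance-judgment task to a
    crowdsourcing platform; workers' votes are hard-coded here."""
    return {"doc_b": 3, "doc_a": 1, "doc_c": 1}


def crowd_powered_rank(query: str) -> List[str]:
    # If the engine is confident, skip the slower, costlier crowd step.
    if machine_confidence >= CONFIDENCE_THRESHOLD:
        return machine_results
    votes = Counter(ask_crowd(query, machine_results))
    # Re-rank candidates by the number of crowd votes they received.
    return [doc for doc, _ in votes.most_common()]


print(crowd_powered_rank("jaguar speed"))  # -> ['doc_b', 'doc_a', 'doc_c']
```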

Design/methodology/approach

To highlight the role of humans in computational processes, several closely related areas are first reviewed. Then, by examining current trends in the field of crowd-powered search engines and analyzing actual needs and requirements, the perspectives and challenges are discussed.

Findings

As research on this topic is still in its infancy, this study can serve as a roadmap for future work in the field. To that end, the current status and development trends are delineated through a general overview of the literature. Moreover, several recommendations are presented for extending the applicability and efficiency of the next generation of crowd-powered search engines. Awareness of the different aspects and challenges of constructing such search engines can guide the development of working systems that take these essential considerations into account.

Originality/value

The present study aims to portray the big picture of crowd-powered search engines and the possible challenges and issues. As one of the early works to provide a comprehensive report on the different aspects of the topic, it can be regarded as a reference point.

Cite this article:
Moradi M. Crowdsourcing for search engines: perspectives and challenges. International Journal of Crowd Science, 2019, 3(1): 49-62. https://doi.org/10.1108/IJCS-12-2018-0026