Mapping the Topics and Intellectual Structure of Web Science




This paper describes a mixed-methods analysis of papers published at the ACM Web Science Conference series from 2009 to 2016, using co-citation analysis, bibliographic coupling, natural language processing (NLP), topic modelling and network visualisation techniques. It studies the knowledge base of the Web Science community and the knowledge transfer from the ACM Web Science Conference series, revealing major themes and key authors as a map of Web Science. In particular, co-citation analysis of the authors cited by ACM Web Science papers reveals the foundations of the Web Science community, while NLP analysis reveals topical descriptors and application contexts of Web Science. Finally, author-based bibliographic coupling of papers published at Web Science reveals authors who have been influenced by the Web Science community. In sum, this paper presents a knowledge map of the Web Science discipline, visualising topical foci, methodological roots in various disciplines, and key players in Web Science research.

Author Biography

  • Cécile Robin, NUI Galway
    Insight Centre for Data Analytics

