Recreating the network of early modern natural philosophy: social, semantic and linguistic dimensions

Andrea Sangiacomo, Raluca Tanasescu, Silvia Donker and Hugo Hogenbirk

Time and Place: Thursday, 01.07., 11:35–11:55, Room 1
Session: History of Science

1. Background 

Early modern natural philosophy (the ancestor of today’s natural sciences) underwent dramatic transformations that completely reshaped its conceptual framework and set of practices. The master narrative about the seventeenth and eighteenth centuries, the Scientific Revolution, has often presented this as a more or less linear process, which progressed from the dismissal of Aristotelian natural philosophy to the establishment of a new Newtonian paradigm. Today’s scholarship is critical of this overly simplified reconstruction, but it struggles to find ways of delving into the actual historical complexity of the period. The difficulty is mostly due to the limitations of traditional methods and approaches, which are not well suited to handling the vast amount of material that should be taken into account in order to provide a more satisfying investigation into the evolution of the field.

Our long-term project aims to integrate traditional scholarship and network analysis in order to explore the co-evolution of the social and semantic dimensions that shaped early modern natural philosophy. In the first phase of the project, we reconstructed a large corpus of works related to natural philosophy, compiled from the point of view of how the discipline was taught, thus focusing on textbooks and other works connected with the early modern academic milieu.

In this paper, we explore the following question: how can we best combine the social and semantic dimensions of a network when we do not have access to explicit ties among the authors or works composing it?

2. Methods and data 

Building on our previous research, we compiled a corpus of 239 early modern printed books, containing approximately twenty million words, written in Latin, French, and English, all of which aim to provide a systematic and encompassing account of the changing field of natural philosophy between 1587 (Abraham de la Framboisière’s Methodicae Institutiones) and 1832 (John Robison’s A System of Mechanical Philosophy). The OCR quality of the corpus is at least 90% per page, which allows for reliable text mining. The compilation of this corpus was largely shaped by the bio-bibliographical information available in secondary scholarship and by web-scraping procedures applied to WorldCat. This provided access to a wealth of titles, most of which are obscure or entirely forgotten in today’s scholarship. We do not have access to explicit information about how particular authors or works were connected to one another (e.g. personal relationships or correspondence between authors, direct references among works). However much exciting research could be conducted on these materials, little can be done without finding suitable ways of representing this collection of scattered works as forming a coherent whole.

In this paper, we offer a method for creating a multifaceted representation of our corpus, which expresses key aspects or features of the corpus in terms of different but connected multiplex networks. In particular, we assume that a thorough study of our corpus should encompass at least three different dimensions: (i) social; (ii) semantic; (iii) linguistic (textual). The social dimension concerns the question of ‘who’ the authors of our works were, and how we can bind them together on the basis of social properties, such as having studied or worked at certain institutions or having interacted with certain publishers. The semantic dimension encompasses the way in which specific keywords were used in our corpus, from which we expect to derive information about how certain concepts were understood, reshaped, and disseminated by different authors or appropriated by different approaches and traditions. The linguistic dimension represents even broader features, such as the homogeneity of style and linguistic usage in the overall corpus, both among works written in the same language and across multiple languages. These three dimensions, then, tackle the potential ‘similarity’ between the authors and works in our corpus from different perspectives, and our method consists in using this threefold notion of similarity to build links between the authors and works by formalizing the relationships they establish as networks.
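
As an illustration of how links can be built for the social dimension without explicit ties, the following is a minimal sketch of a co-affiliation projection. It assumes hypothetical author and institution/publisher records; the names, the data structure and the function `co_affiliation_edges` are placeholders for illustration, not data or code from the project.

```python
from itertools import combinations

# Hypothetical records (assumption): author -> institutions or publishers.
affiliations = {
    "author_a": {"Leiden", "Elzevir"},
    "author_b": {"Leiden"},
    "author_c": {"Paris", "Elzevir"},
    "author_d": {"Cambridge"},
}


def co_affiliation_edges(affiliations):
    """Project two-mode author/affiliation data onto authors.

    Returns {(author_1, author_2): number of shared affiliations}
    for every pair of authors sharing at least one affiliation.
    """
    edges = {}
    for a, b in combinations(sorted(affiliations), 2):
        shared = affiliations[a] & affiliations[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


edges = co_affiliation_edges(affiliations)
for pair, weight in edges.items():
    print(pair, weight)
```

In this toy example, author_a is linked to author_b (shared institution) and to author_c (shared publisher), while author_d remains isolated. Each type of affiliation could equally be kept as a separate layer rather than merged into one weight.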

Since each of the three dimensions we consider is in itself complex and multifaceted, the network we construct for each of them cannot be single-layered; it is rather a multiplex network composed of several layers. Each multiplex network combines different computational approaches: co-affiliation and the assortativity coefficient for the social dimension, collocate analysis for the semantic dimension, and a combination of topic modelling, tf-idf and word embeddings for the linguistic dimension.
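
To give a concrete sense of how one layer of the linguistic dimension might be derived, here is a minimal sketch that turns tf-idf vectors into a similarity layer. The toy texts, the whitespace tokenization, the idf formula log(N / df) and the threshold value are all assumptions made for illustration; the actual pipeline and parameters of the project are not specified here.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy "works" (assumption); the real corpus has 239 books.
docs = {
    "work_1": "motion of bodies and the cause of motion",
    "work_2": "the cause of motion in heavy bodies",
    "work_3": "the soul and its faculties",
}


def tfidf_vectors(docs):
    """Return {doc_id: {term: tf * idf}} with raw counts and idf = log(N / df)."""
    tokens = {d: text.split() for d, text in docs.items()}
    n_docs = len(tokens)
    df = Counter(t for words in tokens.values() for t in set(words))
    return {
        d: {t: c * math.log(n_docs / df[t]) for t, c in Counter(words).items()}
        for d, words in tokens.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_layer(docs, threshold):
    """Link two works when their tf-idf cosine similarity reaches the threshold."""
    vectors = tfidf_vectors(docs)
    layer = {}
    for a, b in combinations(sorted(vectors), 2):
        sim = cosine(vectors[a], vectors[b])
        if sim >= threshold:
            layer[(a, b)] = sim
    return layer


layer = similarity_layer(docs, threshold=0.2)
print(layer)
```

With these toy texts only work_1 and work_2 end up linked. Layers based on topic modelling or word embeddings could be built in the same way by swapping in a different vector representation.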

In order to exemplify how our method works, we pick a small selection of books, which illustrate how human readers with some background knowledge would connect and group together different works included in the corpus at hand. We use these works as a reference, and throughout our discussion we compare where they are located and how they are represented in the networks we build. In this way, we offer a more direct insight into how our distant computational perspective adds to and integrates our initial expectations and assumptions as human readers. Our purpose here is not to advance any specific claim about the history of early modern natural philosophy and science that can be derived by using our method or studying these particular works, but rather to establish that the method is sound and effective and that it can be implemented for exploring our sources in new ways.

3. Findings 

The result of the method is that we can now represent our starting corpus from the point of view of three multiplex networks, which are connected with one another by virtue of being derived from the same entities (ultimately, the 239 books). This result is already sufficient to begin exploring the properties of this corpus and how it can be used to investigate the history of early modern natural philosophy. However, the method has even greater potential, since our three multiplex networks can themselves be combined into a complex multilayer network, which would then allow for a synoptic representation of the three dimensions described here as a single unified graph. In this sense, the methodology presented in this paper provides the groundwork for such further development. Given the technical and conceptual complexities involved in this research, we focus for now on the more technical and practical aspects of the methodology, in order to also demonstrate its potential for being applied to any other multilingual corpus relevant to other disciplines or time periods.
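
The idea of combining layers that share the same entities can be sketched as follows. This is only an illustrative data structure under stated assumptions: the layer contents are invented, and the functions `edge_overlap` and `multilayer_edges` are hypothetical helpers, not the implementation used in the project.

```python
from collections import Counter
from itertools import combinations

# Hypothetical layers over the same works (assumption); edges are unordered pairs.
layers = {
    "social": {frozenset({"work_1", "work_2"}), frozenset({"work_2", "work_3"})},
    "semantic": {frozenset({"work_1", "work_2"})},
    "linguistic": {frozenset({"work_1", "work_2"}), frozenset({"work_1", "work_3"})},
}


def edge_overlap(layers):
    """Count, for each pair of works, the number of layers that link them."""
    return Counter(edge for edges in layers.values() for edge in edges)


def multilayer_edges(layers):
    """Flatten the layers into edges over (work, layer) node copies.

    Intra-layer edges stay within their layer; inter-layer edges couple
    the copies of the same work in every pair of layers where it occurs.
    """
    intra = [
        ((a, name), (b, name))
        for name, edges in sorted(layers.items())
        for a, b in sorted(tuple(sorted(edge)) for edge in edges)
    ]
    nodes = {name: {w for e in edges for w in e} for name, edges in layers.items()}
    inter = [
        ((w, l1), (w, l2))
        for l1, l2 in combinations(sorted(layers), 2)
        for w in sorted(nodes[l1] & nodes[l2])
    ]
    return intra, inter


overlap = edge_overlap(layers)
intra, inter = multilayer_edges(layers)
print(overlap[frozenset({"work_1", "work_2"})], len(intra), len(inter))
```

Pairs of works linked in several layers at once (here work_1 and work_2) are natural candidates for closer reading, since they are similar from more than one perspective.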

4. Selected bibliography: 

Bianconi, Ginestra. 2018. Multilayer Networks: Structure and Function. Oxford: Oxford Scholarship Online.

Borgatti, S. & Halgin, D. 2014. “Analyzing affiliation networks.” In Scott, J., & Carrington, P. J. (eds.). The SAGE handbook of social network analysis, pp. 417-433. London: SAGE Publications.

Brezina, V., McEnery, T., & Wattam, S. 2015. “Collocations in context: A new perspective on collocation networks.” International Journal of Corpus Linguistics, 20(2), pp. 139-173.

de Bolla, P., Jones, E., Nulty, P., Recchia, G., & Regan, J. 2019. “Distributional Concept Analysis: A Computational Model for History of Concepts.” Contributions to the History of Concepts, pp. 66-92.

Dickison, Mark E., Matteo Magnani, and Luca Rossi. 2016. Multilayer Social Networks. Cambridge: Cambridge University Press.

Roth, Camille, and Jean-Philippe Cointet. 2010. “Social and Semantic Coevolution in Knowledge Networks.” Social Networks 32 (1), pp. 16–29.

Jo, Taeho. 2019. Text Mining: Concepts, Implementation, and Big Data Challenge. New York: Springer.