Big Humanities Data: About the role of Automatic Natural Language Processing Techniques in the Digital Humanities

Talk by Dr. Marco Büchler, Göttingen Centre for Digital Humanities and Leipzig University

Info about event


Friday 23 October 2015, 13:00 - 16:00


IMC Meeting Room, Jens Chr. Skous Vej 4, Building 1483, 312


Kristoffer Nielbo

The ever-evolving study of the humanities has led to a large-scale digitization of historical data. With digitally available information, researchers in the Digital Humanities are now able to further those studies by using quantitative methods.

This presentation introduces the work of three Digital Humanities projects. First, it focuses on a “recommender system” that draws on NLP techniques to propose candidates for missing words or fragmentary text in ancient papyri. Second, it introduces a graph mining technique that systematically identifies “serendipity” in data, explained with a visualization of test results. Third, it illustrates the most recent research on text reuse and the question of how the choice of algorithms and parameters depends on the proximity between the source and target texts. This last point will be illustrated with an example taken from the comparison of seven different editions of the Bible.

Dr. Marco Büchler (Göttingen Centre for Digital Humanities and Leipzig University) is visiting the Digital Text Laboratory (DTL) to discuss future collaborations. DTL is using the opportunity to host a workshop on data-intensive methods and knowledge discovery in cultural heritage databases, with a particular focus on unstructured (text-heavy) data. Everybody is welcome (researchers and students), especially if you have an interest in the topic or simply want to learn more.


Please write to Kristoffer Nielbo (kln@cas.au.dk) if you plan on participating.

Marco Büchler holds a Diploma in computer science. Since 2006 he has worked as a research associate in the Natural Language Processing Group at Leipzig University. From April 2008 to March 2011 Marco served as the technical project manager for the eAQUA project, and he has continued to work in that capacity for the eTraces project since July 2011. His research interests concern textual transmissions and related text mining techniques. In addition to his primary responsibilities, Marco manages the Medusa project as well as the Leipzig Linguistic Services. In March 2013 he received his PhD in the field of eHumanities.