
Knowledge Laboratory (DataLab)


The Library of Fundación Juan March is an active, dynamic space that aims to integrate the knowledge generated by the Foundation, building on internal and external collaborations.

The Knowledge Laboratory (DataLab) is a section of the Library dedicated to digital curation and analytics projects that use data produced by the Library and the rest of the Foundation. It builds on the experience of the Data Library at the former Center for Advanced Studies in Social Sciences (CEACS) of the Juan March Institute.

The DataLab takes advantage of data science technologies and methodologies to curate, analyze and visualize information in new ways. It collaborates with universities and similar centers, and its work is disseminated through conferences, academic articles and specialized workshops.



The digital curation projects aim to organize our digital knowledge and include a range of activities such as data cleansing, modelling, preservation, curation and dissemination of our printed, audio and digital content. The data analytics projects aim to better understand the activity of the Foundation and its users as well as support organizational decision-making.

  • Spanish Music Theatre Portal, built from a selection of works. It involves a wide range of curatorial tasks, including innovative OCR of printed music scores to produce MusicXML documents that allow the score to be played back with virtual instruments.
  • Automatic keyword generation for concerts and lectures. Machine learning algorithms are applied to automatically assign keywords to Foundation events.
  • Personal and organizational name disambiguation. Clustering techniques are used to identify names that refer to the same entity but are written slightly differently.
  • Web user behaviour analysis. To improve the user experience, data collected from web traffic is analysed to learn what users access, how they book events and what they search for.
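
As an illustration of the name disambiguation idea above, here is a minimal sketch in Python. It is not the DataLab's actual method, which this page does not describe. It assumes that name variants differ mainly in accents, punctuation, spacing and small typos, and it groups them by single-linkage clustering on a string similarity score from the standard library. The function names and the 0.85 threshold are hypothetical choices made for the example.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_names(names, threshold=0.85):
    """Group names whose pairwise similarity reaches the threshold.

    Single-linkage clustering via union-find: two names end up in the same
    group if a chain of sufficiently similar pairs connects them.
    The threshold is an assumed value and would need tuning on real data.
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())
```

Calling `cluster_names` on a list of name strings returns a list of groups, each holding the variants judged to be the same name. The pairwise comparison is quadratic in the number of names, so a real catalogue would likely need a blocking step (for example, comparing only names that share a surname initial) and manual review of borderline groups.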


The Library collaborates with universities on research activities and projects. It is currently taking part in a research project, led by the University of Salamanca, to develop a statistical framework for coincidence analysis; the project uses the statistical programming language R together with the D3 visualization library. There is also an ongoing collaboration with the Faculty of Mathematics of the Complutense University, under which MSc dissertations in Computational Statistics are carried out at the Foundation.