Research Infrastructure of Institute of Contemporary History

Code: I0-0013 SICRIS 

Period: January 1, 2022 – December 31, 2027

Sponsor: Javna agencija za raziskovalno dejavnost Republike Slovenije (ARRS)

Range in 2020: 6.5 FTE

Head: Dr. Mojca Šorn

Researchers: Neja Blaj Hribar, Ana Cvek, Karin Konda, Marko Kupljen, Mihael Ojsteršek, Dr. Andrej Pančur, Sergej Škofljanec, Darja Vipavc, Dr. Marta Rendla, Dr. Jure Gašparič, Ivan Smiljanić.

The infrastructure programme Research Infrastructure of Slovenian Historiography supports the research activities of the Institute of Contemporary History (ICH) and is primarily in the service of the national historiography and national scientific collections (DARIAH-SI, SIstory). It is also responsible for the national digital infrastructure for the humanities and arts (DARIAH-SI, SI-DIH), which is a part of the international infrastructure project ESFRI (DARIAH-EU). The infrastructure programme carries out its activities within various work packages (WP) in line with the Institute’s, national, and international infrastructure. It implements the work packages as a centre for IT infrastructure (WP1), library and documentary materials (WP2), digitisation (WP3), unstructured data (WP4), semi-structured data (WP5), structured data (WP6), digital editions (WP7), publishing (WP8), and digital storage and access (WP9). In the context of all these activities, the infrastructure programme carries out research and development activities in the field of digital humanities (WP10). In their interaction with other infrastructures and researchers, projects, and institutions, these activities are supported at the national level by the cooperation, training, and promotion service (WP11) and at the international level by the international cooperation service (WP12).

 

WP1: IT infrastructure centre

IT infrastructure represents the basis for all the services provided by the infrastructure programme. It carries out its activities mainly through the use of:

 

WP2: Library and documentary materials centre

Although the infrastructure programme is strongly digitally oriented, most of its digital collections are derived from original analogue collections of cultural and scientific heritage. As a specialised library for scientific and research activities, the centre also keeps more than 50,000 items of library materials, including the extensive D-collection (more than 15,000 items). Some of these materials have already been digitised and are available on the History of Slovenia – SIstory portal, while the rest are being digitised at an accelerated pace. In line with the vision of the Association of European Research Libraries (LIBER), the centre aims to follow the principles of open access and FAIR research data as far as possible.

The infrastructure programme also supports the Institute of Contemporary History’s collection of documentary materials relevant to the history of the Institute as a research organisation.

 

WP3: Digitisation centre

As digitally oriented research is only possible with data in machine-readable form, one of the key objectives of the research infrastructure is to digitise the original analogue materials. To this end, it performs the following tasks:

  • digitisation of analogue books and documentary and archival materials: additional image processing (margins, alignment, contrast), conversion to other formats, and, in the case of digitisation of printed/typed texts, optical character recognition (OCR);
  • recording of scientific and professional events in the field of historiography and humanities (lectures, round tables, conferences, etc.): image and sound processing and editing.

 

WP4: Unstructured data centre

Collecting, editing, and processing textual, image, audio, and video collections of unstructured data. Image collections can be small (e.g. just over 100 photos of death masks) or extensive (over 350,000 images of historical population censuses). The same is true of the even more numerous collections of texts, which can also be smaller (e.g. the collection of 42 printed items about the Carinthian plebiscite) or larger (e.g. the collection of the Poročevalec državnega zbora publication, consisting of 1,668 digital objects or 142,468 pages). The unstructured data centre performs the following tasks:

  • file-based editing of materials,
  • adding descriptive and technical metadata.
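
The two tasks above can be illustrated with a minimal Python sketch. It is not the centre's actual workflow: the function names and metadata field names are hypothetical and chosen only for illustration. Technical metadata (size, media type, checksum) are computed from the file itself, while descriptive metadata (title, creator, date) are supplied by the editor.

```python
import hashlib
import mimetypes
from pathlib import Path


def technical_metadata(path: Path) -> dict:
    """Compute basic technical metadata for one file (field names are hypothetical)."""
    data = path.read_bytes()
    # Guess the media type from the file extension; fall back to a generic type.
    mime_type, _ = mimetypes.guess_type(path.name)
    return {
        "file_name": path.name,
        "size_bytes": len(data),
        "mime_type": mime_type or "application/octet-stream",
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def describe(path: Path, title: str, creator: str, date: str) -> dict:
    """Combine hand-entered descriptive metadata with computed technical metadata."""
    return {
        "descriptive": {"title": title, "creator": creator, "date": date},
        "technical": technical_metadata(path),
    }
```

Storing a checksum alongside each file is a common way to detect later corruption; whether the programme records exactly these fields is an assumption.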

Unstructured data collections are the most common type of collection produced by the infrastructure programme. They are mostly accessible through the History of Slovenia – SIstory portal (see WP9: Digital storage and access centre), which currently contains more than 35 collections of archival and printed sources, literature, and events.

 

WP5: Semi-structured data centre

Collecting, editing, analysing, and encoding semi-structured data, mainly in XML format and according to the guidelines of the international Text Encoding Initiative (TEI) consortium. In collaboration with digital humanities researchers, the centre mainly performs more or less complex encodings of the structure and meaning of texts. The following databases and scientific publications are encoded under the infrastructure programme:

  • The Slovenian parliamentary corpus: the latest version, siParl 2.0, covers parliamentary debates from 1990 to 2018 and consists of 11,967 texts and almost 240 million words. The project titled Development of Slovene in a Digital Environment (DSDE) will extend the corpus to older minutes reaching back to 1947, and we also plan to encode the minutes of parliamentary sessions from before World War I and World War II.
  • The collection of political and party life in Slovenia: programmes of political parties and organisations.
  • The collection of Slovenian legal texts SI-IUS (in cooperation with PF UL, IJS, ZRC SAZU).
  • Collections of scientific texts: the scientific journal Prispevki za novejšo zgodovino/Contributions to Contemporary History (currently from 2014 to date) and monographs published by the ICH Press (currently 8 publications).
  • The collection of repertoriums of places (1817–1939): toponyms in Slovenia, geographical and statistical data.
  • Various prosopographical collections (Jews in Slovenia, victims of World War I and II, population censuses), which can be linked to the collections in the next work package.
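
To give a sense of what TEI encoding of a parliamentary text involves, the following Python sketch builds a minimal TEI document with the standard library. It is an illustration only and does not reproduce the siParl schema: the helper names, the `type="debate"` value, the speaker identifiers, and the sample text are all assumptions. The TEI namespace and the elements used (`teiHeader`, `fileDesc`, `text`, `body`, `div`, `u`) are part of the TEI Guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialise TEI elements in the default namespace (no prefix).
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title: str, speeches: list[tuple[str, str]]) -> ET.Element:
    """Build a minimal TEI document with one utterance per (speaker id, words) pair."""
    root = ET.Element(tei("TEI"))
    # Minimal header: title, publication, and source statements.
    file_desc = ET.SubElement(ET.SubElement(root, tei("teiHeader")), tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Unpublished illustrative example."
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = "Invented sample text."
    # Body: each speech becomes a <u> (utterance) pointing at its speaker.
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    div = ET.SubElement(body, tei("div"), {"type": "debate"})
    for speaker_id, words in speeches:
        ET.SubElement(div, tei("u"), {"who": f"#{speaker_id}"}).text = words
    return root


if __name__ == "__main__":
    doc = build_tei("Sample session", [("speaker1", "I open the session.")])
    print(ET.tostring(doc, encoding="unicode"))
```

Encoding who speaks separately from what is said is what makes such a corpus queryable by speaker, party, or period.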

(Free) access to the data contained in these scientific collections can be provided through GitHub and GitLab repositories, the digital editions centre (WP7), and the CLARIN.SI repository.

 

WP6: Structured data centre

Collecting, editing, analysing, and entering structured data into relational databases. The infrastructure programme supports and actively participates in the development of the following major collections:

Data from all these relational research databases are freely accessible through the web applications developed by WP6 members in collaboration with DARIAH-SI.

 

WP7: Digital editions centre

Additional encoding, editing, production, conversion, and issuing of electronic publications and digital scientific editions in accordance – insofar as possible – with the principles of the Endings Project:

  • Data: TEI XML, Git, data validation and diagnostics.
  • Web applications: static web pages, the possibility to add dynamic content, different versions of editions more or less in line with the principles of the Endings Project.
  • Processing: validation of the static web page processing.
  • Documentation: data model, copyrights.
  • Careful and verified management of new releases.

The SIstory TEI profile is used as a static web page generator. It needs to be adapted each time to the specific requirements of a particular digital release; these adaptations are added via the GitHub or GitLab repository of the digital release in question.
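
The general idea of generating a static page from TEI data can be sketched in Python as follows. This is not the SIstory TEI profile, whose internals are not described here; it is a standard-library stand-in, and the function name `render_static_page` and the HTML layout are assumptions. The sketch checks that the input is well-formed XML and then writes the title and paragraphs into a self-contained HTML page with no server-side dependencies.

```python
import html
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def render_static_page(tei_xml: str) -> str:
    """Turn a TEI document into one static HTML page (illustrative only)."""
    # Parsing doubles as a well-formedness check: ET.ParseError is raised on bad XML.
    root = ET.fromstring(tei_xml)
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="Untitled", namespaces=TEI
    )
    # Collect the plain text of every paragraph inside <text>, ignoring inline markup.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind(".//tei:text//tei:p", TEI)
    ]
    body = "\n".join(f"<p>{html.escape(text)}</p>" for text in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n<h1>{html.escape(title)}</h1>\n{body}\n</body></html>\n"
    )
```

Because the output is plain HTML, it can be served from any static host, which is the property the Endings Project principles aim for.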

These digital editions are mostly accessible through a GitHub Pages server, the SIstory portal, and the SI-DIH repository (see WP9). Currently, digital editions are also being published on a trial basis via TEI Publisher.

 

WP8: Publishing centre

The centre offers technical and professional support to the publishing activities of the Institute of Contemporary History as well as to other publishers in the field of historiography, especially by providing free access to scientific publications. To this end, it manages:

  • Open Journal Systems, an open-source application that enables the management of editorial processes and the publication of three current scientific journals: Contributions to Contemporary History (currently more than 2,000 articles), Historical Review (currently more than 1,100 articles), and Kronika (currently almost 400 articles);
  • Open Monograph Press, an open-source application that will enable the management and publishing of the ICH Press scientific monographs; it is currently under development. In the meantime, the publisher’s digitised monographs are available on the SIstory portal.

 

WP9: Digital storage and access centre

Ensuring the permanent and comprehensive digital storage of research data collections and scientific publications and making them freely accessible in the following systems:

  • sustainable storage of research data and digitised cultural heritage in the Archivematica digital storage system (not publicly available, the members of the infrastructure programme need to be contacted);
  • digital library of the History of Slovenia – SIstory portal: currently more than 45,000 freely accessible digital objects of scientific and cultural heritage and research results of Slovenian historiography; current annual traffic: 84,306 unique visitors, 443,924 page views, 53,741 file downloads;
  • Slovenian Digital Humanities Repository SI-DIH: 4 collections and the results of 4 projects, 1,871 intellectual entities, and 4,395 files are currently freely accessible;
  • permanent storage of text files and program code for digital editions and research software in Git version control repositories: Digital humanities research in Slovenia (currently 46 projects), GitHub (https://github.com/SIstory, https://github.com/DARIAH-SI, https://github.com/sidih, currently 50 Git repositories in total).

 

WP10: Research and development centre for digital humanities

The work package members develop and implement digital humanities methods, standards, tools, and services according to the needs of individual collections, projects, and the entire humanities community. They work with external contractors to develop tools, repositories, portals, databases, and web applications. WP10 acts as a hub for the development of the digital humanities research programme.

 

WP11: Cooperation, training, and promotion service

Cooperation with national research infrastructures: to ensure that the research data cycle runs smoothly and that scientific collections receive all the necessary support, WP11 collaborates closely with:

  • the institutions and individuals that preserve cultural and scientific heritage,
  • the research institutions that develop digital history and digital humanities,
  • related national infrastructures:
    • CLARIN.SI regarding the production and storage of language corpora and digital editions;
    • CESSDA regarding the research data storage planning;
    • the Slovenian national supercomputing network – SLING consortium with the aim of connecting to and benefitting from the supercomputing network;
  • the European digital infrastructures consortium DARIAH ERIC to set up a digital research infrastructure for the arts and humanities in Europe.

WP11 promotes the use of new digital technologies and methods in the development of scientific collections and in the conduct of digitally supported research:

  • supporting research projects and programmes,
  • (co-)organising individual and group workshops,
  • (co-)organising scientific conferences,
  • implementing an internship programme with a particular focus on training its own data scientists, data specialists, and other support staff to advise, plan, and implement research infrastructure activities.

 

WP12: International cooperation service

Cooperation with the European network for digital humanities DARIAH ERIC and other national DARIAH infrastructures. Monitoring developments and active participation:

Since 2008, the infrastructure programme of the Institute of Contemporary History has participated in the European project DARIAH, which is included in the roadmap of the European Strategy Forum on Research Infrastructures (ESFRI Roadmap); since 2010, it has done so in partnership with the Scientific Research Centre of the Slovenian Academy of Sciences and Arts. In the context of these pan-European links, Slovenia signed the accession to DARIAH-ERIC in 2013, and the Institute of Contemporary History became the DARIAH coordinator in Slovenia.

The infrastructure programme follows the general objectives of the ESFRI Roadmap, while WP12:

  • participates in outlining the digital humanities development policy in terms of international and interdisciplinary interoperability and long-term sustainability, with a focus on the activities related to contents and technology as well as the collaboration within working groups (WG) and virtual competence centres (VCC, especially 1–3),
  • integrates research in the arts and humanities with modern technologies and technological advances to achieve the following strategic orientations:
    • designing and managing activities and services within the humanities research community;
    • effective integration of resources, tools, and services with broader support in the existing research activities into the further development of infrastructures at the national and global level;
    • effective promotion of digital methods in humanities research to achieve sustainable development of digital humanities;
    • ensuring comprehensive access to data and services in an open and freely accessible infrastructure;
    • building a trusted infrastructure that includes partners, data, services, and processes.