HUN-REN Data Repository Platform

8th Hungarian Data Steward Meeting

The Hungarian Academy of Sciences hosted the first informal Data Steward Meeting of 2026, organized jointly by the ELTE National Laboratory for Digital Heritage (ELTE DH-LAB), the HUN-REN Data Repository Platform (HUN-REN ARP), and Pro-M Zrt. The event, held on January 23, focused on linguistic topics, primarily linguistic corpora.

Veronika Lipp, director-general of the ELTE Research Centre for Linguistics, welcomed the participants. In her opening speech, she praised the meeting series, assured the data stewards of her support, and emphasized the importance of the data steward profession and of the work data stewards do.

After the welcome speech, Bálint Sass (ELTE NYTK, Institute for Lexicology) spoke about large corpora, tokens, the vertical format, token annotation and structure annotation, spans, metadata, and gold and silver standards. Among many other things, his presentation mentioned the 1-billion-word Hungarian National Corpus and a giant corpus of 100 billion words, equivalent to 50 km of printed text, used for training artificial intelligence.

Ibolya Kristóf (ELTE NYTK, Institute for Lexicology) told the audience about the Hungarian Historical Corpus, which contains bibliographic data on Hungarian-language texts published in print between 1772 and 2020, compiled for the purpose of writing The Comprehensive Dictionary of Hungarian Language. In her presentation, she discussed handwriting recognition, indexing, the storage, use, and readability of index cards, Antal Horger and Joseph II, as well as the Prószéky codes. To date, eight volumes of The Comprehensive Dictionary of Hungarian Language have been published in print, while the entire collection of entries is also available in a freely accessible digital database.

After the presentations, those interested could view the exhibition "200 Years of Treasures" in the rooms of the MTA Art Collection on a tour guided by Zsuzsa Sidó (ELTE HTK MI - MTA Art Collection). The exhibition presents the history of the Academy in the 19th and 20th centuries and its outstanding personalities through paintings, manuscripts, and works of art.

Image: Lecture hall with people talking
Image: Veronika Lipp giving her speech
Image: Bálint Sass giving his presentation
Image: Ibolya Kristóf giving her presentation
Image: Lecture hall with people
Image: Zsuzsa Sidó presenting
Image: Paintings on the wall of the MTA gallery
Image: Group photo
The presentation of Bálint Sass:
The presentation of Ibolya Kristóf:

The aim of the Data Steward Meeting series is to give Hungarian data stewards an opportunity to meet informally, talk, get to know each other, and discuss the latest news, events, trends, and issues in the field. The series is open to anyone interested. The next event is expected to take place in the spring of 2026. For the exact date and location, please contact @email. The organizers look forward to seeing all interested data stewards at future events!

Reports from previous meetings:

 

Photos: Julia Egyed-Gergely (ELTE TK), Ákos Lencsés (Pro-M Zrt.), Zsuzsa Sidó (DH-LAB / ELTE HTK MI)