In this paper we discuss the project to digitize the Dictionary of the Serbo-Croatian Standard and Vernacular
Language. Scanning and character recognition were a particular challenge, since various non-standard
character set encodings were used in the course of the almost 60-year-long production of the dictionary. The first
aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized
text and transform it into structured data stored in a relational lexical database. This approach ...
... Paroubek, John Wiley & Sons, Inc.
Ivanović, N., Jakić, M., Ristić, S. (2016). Građa Rečnika SANU – potrebe i mogućnosti digitalizacije u svetlu
savremenih pristupa, u: S. Ristić i dr. (ed.), Leksikologija i leksikografija u svetlu savremenih pristupa, Beograd:
Institut za srpski jezik SANU, pp. 133–154. ... ... leaflets) began in 2016, and the first use of the two volumes that were
1 Упутство за обраду Речника, Београд: Институт за српск(охрватск)и језик САНУ (рукопис), 1959. и (допуњено) 2017
[A Handbook for Dictionary Processing, Belgrade: Institute for Serbo(-Croatian) language SASA (manuscript), 1959 and ... ... gramm. data -а hyphen
lemma палеòцēн bold и comma or
trigger begin
gramm. data -ена hyphen
gramm. data м item in a list
Etymology palaiós kainós Open parenthesis + item in a list, e.g. „(грч.“ closing parenthesis
Sense 1, 2, 3 or а, б, в or I, II or trigger begin terminological markers ...
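The trigger idea in the table above can be illustrated with a minimal sketch; it is not the project's actual parser. It assumes that bold text survives character recognition as `<b>…</b>` tags, that numbered senses look like `1. …`, and that a two-table SQLite schema stands in for the relational lexical database. The names `parse_entry`, `store`, `entry` and `sense` are hypothetical, and the sense texts are placeholders rather than dictionary content.

```python
import re
import sqlite3

# Hypothetical entry text: lemma, ending, gender and etymology values come from
# the table above; the sense texts are placeholders, not dictionary content.
ENTRY = ("<b>палеòцēн</b>, -ена м (грч. palaiós kainós) "
         "1. first sense text 2. second sense text")

HEAD_RE = re.compile(
    r"<b>(?P<lemma>[^<]+)</b>,?\s*"        # lemma: bold is the begin trigger
    r"(?P<ending>-\S+)?\s*"                 # gramm. data: hyphen trigger
    r"(?:(?P<gender>[мжс])(?=\s|$))?\s*"    # gramm. data: item from a closed list
    r"(?:\((?P<etym>[^)]*)\))?\s*"          # etymology: between parentheses
    r"(?P<rest>.*)",                        # everything else: the senses
    re.DOTALL,
)
# A sense starts at "<number>. " and runs until the next such marker or the end.
SENSE_RE = re.compile(r"(\d+)\.\s+(.*?)(?=\s+\d+\.\s|$)", re.DOTALL)


def parse_entry(text):
    """Split one dictionary article into its micro-structure elements."""
    m = HEAD_RE.match(text)
    if m is None:
        raise ValueError("no bold lemma found at the start of the entry")
    etym_lang, etym_source = None, None
    if m["etym"]:
        # First token is the language marker, the remainder the source words.
        parts = m["etym"].split(maxsplit=1)
        etym_lang = parts[0]
        etym_source = parts[1] if len(parts) > 1 else None
    return {
        "lemma": m["lemma"], "ending": m["ending"], "gender": m["gender"],
        "etym_lang": etym_lang, "etym_source": etym_source,
        "senses": SENSE_RE.findall(m["rest"]),
    }


def store(conn, entry):
    """Write a parsed entry into an assumed two-table relational schema."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS entry (
            id INTEGER PRIMARY KEY, lemma TEXT, ending TEXT, gender TEXT,
            etym_lang TEXT, etym_source TEXT);
        CREATE TABLE IF NOT EXISTS sense (
            entry_id INTEGER REFERENCES entry(id), label TEXT, body TEXT);
    """)
    cur = conn.execute(
        "INSERT INTO entry (lemma, ending, gender, etym_lang, etym_source) "
        "VALUES (?, ?, ?, ?, ?)",
        (entry["lemma"], entry["ending"], entry["gender"],
         entry["etym_lang"], entry["etym_source"]))
    conn.executemany(
        "INSERT INTO sense (entry_id, label, body) VALUES (?, ?, ?)",
        [(cur.lastrowid, label, body) for label, body in entry["senses"]])
    conn.commit()
    return cur.lastrowid


parsed = parse_entry(ENTRY)
conn = sqlite3.connect(":memory:")
entry_id = store(conn, parsed)
print(parsed)
print(conn.execute(
    "SELECT label, body FROM sense WHERE entry_id = ?", (entry_id,)).fetchall())
```

A single regular expression is enough here only because the sample entry is regular. Real articles with nested senses (а, б, в or I, II) and terminological markers would likely need a grammar or a staged parser instead.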
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database", in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana: Ljubljana University Press, Faculty of Arts (2018).