Search
83 items
-
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the problems encountered, using Serbian as an example. We have identified four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of support for some types of categories. At the same time, various descriptions often describe exactly the same ... today in Serbian) of the same verb ‘to establish’. Similarly, hleb and leb are two variants of the same noun ‘bread’ (the second one being non-literary). The Ijekavian lemma for the Ekavian lemma devojka ‘girl’ is djevojka. These lemma pairs were recorded in DELAS entries of e-dictionaries, in the ...
... existing SMD to LexInfo, as a catalog of data categories (e.g., to denote gender, number, part of speech, etc.). 3 Unitex is a lexically-based corpus processing suite that offers strong support for finite-state processing using morphological dictionaries: http://unitexgramlab.org/ Figure 1: ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020)
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ... 2004; Stanković et al., 2018) and the Rečnik Matice srpske I-VI: Rečnik srpskohrvatskog književnog jezika (Dictionary of the Serbo-Croatian Literary Language). Slovene (JSI) Slovene WordNet (Erjavec and Fiser, 2006) and Slovene Lexical Database (Gantar and Krek, 2011) were used. Slovene (ISJFR) ...
... (1240) 0 (0) 0 (0) 0 (0) 508 (3230) Serbian WordNet 691 (5864) 985 (6522) 92 (713) 0 (0) 0 (0) 1768 (13099) Serbian Dictionary of Serbo-Croatian Literary Language 289 (2360) 281 (1527) 29 (215) 0 (0) 0 (0) 599 (4102) Slovene WordNet 409 (1106) 303 (901) 237 (733) 44 (133) 0 (0) 993 (2873) Slovenian ...
... (LTC 2011), pages 126–130. Henrich, V., Hinrichs, E. W., and Suttner, K. (2012). Automatically linking GermaNet to Wikipedia for harvesting corpus examples for GermaNet senses. JLCL, 27(1):1–19. Henrich, V., Hinrichs, E., and Barkey, R. (2014). Aligning word senses in GermaNet and the DWDS ...
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
Developing Termbases for Expert Terminology under the TBX Standard
... integration with cascades for named entity recognition such as mining equipment, specific minerals and the like. Building of an aligned Serbian-English corpus of texts in the area of mining and geology from sources like the bilingual journal “Underground Mining” is underway. The possibility of searching ...
Ranka Stanković, Ivan Obradović, and Miloš Utvić. "Developing Termbases for Expert Terminology under the TBX Standard" in Natural Language Processing for Serbian - Resources and Applications, Belgrade: University of Belgrade, Faculty of Mathematics (2014)