Search
271 items
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020)
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ...... eu/MWSA. Keywords: lexical semantic resources, sense alignment, lexicography, language resource 1. Introduction Lexical semantic resources (LSRs) are knowledge repositories that provide the vocabulary of a language in a descriptive and structured way. One of the famous examples of LSRs are ...
... et al., 2012) and with the Digital Dictionary of the German Language (Digitales Wörterbuch der Deutschen Sprache (Klein and Geyken, 2010)) (Henrich et al., 2014). Gurevych et al. (2012) present UKB–a large-scale lexical-semantic resource containing pairwise sense alignments between a subset of ...
... meaning. The annotator found the most challenging aspect of the task to lie in the necessity of having to choose the type of match- Language Resource Nouns Verbs Adjectives Adverbs Other All Basque Wordnet 929 (6836) 0 (0) 0 (0) 0 (0) 0 (0) 929 (6836) Basque Euskal Hiztegia 971 (7754) 0 (0) ...
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
Wordnet Development Using a Multifunctional Tool
Ivan Obradović, Ranka Stanković (2007)
In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their synchronous use in various tasks. We focus here on the description of the possibilities this tool offers in the development of wordnets. Besides the wordnet module which enables parallel handling of two wordnets, other modules, such as the module for morphological dictionaries and the module for aligned texts, as well as available finite ...... 3 http://www.illc.uva.nl/EuroWordNet/sample.html 4 http://nlp.fi.muni.cz/projekty/visdic/ 3. A Multifunctional Language Resource Tool 3.1 Motivation The Human Language Technology group at the University of Belgrade has been developing various lexical resources over quite a long period ...
... finite state transducers, can also be used to aid the user in developing and refining the wordnet. Keywords Wordnet development, language resource integration, HLT tools 1. Introduction The first wordnet, namely the Princeton WordNet (PWN), or simply WordNet, was conceived in 1985 by ...
... task, the HLT group produced an integrated and easily adjustable tool, the workstation for language resources, labeled WS4LR, which greatly enhances the potential for manipulating each particular resource as well as several resources simultaneously. Exploiting the synergy of various resources ...
Ivan Obradović, Ranka Stanković. "Wordnet Development Using a Multifunctional Tool" in Proceedings of the International Workshop Computer Aided Language Processing (CALP) '2007, Borovets, Bulgaria, September 2007 (2007)
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age, digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bilingual, with papers simultaneously being in English ...... of Philology, Her scientific field is Human Language Technologies (HLT) and technology enhanced learning (TEL). She published one book and more than 100 scientific papers, most of them related to natural language processing, more specifically to language resources development and their application ...
... components In designing Bibliša, special attention is given to its language support component. It supports various aspects of multilingual libraries: its content is not only multilingual, but also aligned and it can be searched in any language. The proposed tool basically consists of the following components: ...
... The System retrieves terms that match the given keywords from the lexical resources of a query language and then finds their equivalents in another language based on interlingual relations established in the lexical resources. After refinement of a query (e.g. deleting or ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics — IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
From DELA Based Dictionary to Leximirka Lexical Database
Biljana Lazić, Mihailo Škorić (2020)
In this paper, we will present an approach to transforming Serbian morphological dictionaries from a DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data within a database when compared to storing them in textual documents, we will outline some of the functionality that the database has made possible. We will also show how hand-made rules, which use the category labels that lexical entries are marked with, can be used to link lexical entries. ...... Morphological dictionaries represent a significant linguistic resource for languages with rich inflection. Therefore, Serbian morphological dictionaries represent a significant resource for Serbian language processing. The importance of this resource is in its multiple applications. Although Serbian morphological ...
... for natural language processing - NLP. 3 TEI 4 LMF 5 Lemon 84 Infotheca Vol. 19, No. 2, December 2019 Scientific paper The LMF prescribes a standardized framework for recording linguistic information in computer lexicons and is based on the Standard ISO 24613: 2008 (Language Resource Management ...
... Framework - LMF). LMF is designed for lexicons specially designed for Natural Language Processing and Machine-Readable Dictionaries. LMF specification is represented as a subset of UML (Unified Modeling Language) language that provides linguistic description. The LMF consists of mandatory Core package ...
Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
-
Parallel Bidirectionally Pretrained Taggers as Feature Generators
In a setting where multiple automatic annotation approaches coexist and advance separately but none completely solve a specific problem, the key might be in their combination and integration. This paper outlines a scalable architecture for Part-of-Speech tagging using multiple standalone annotation systems as feature generators for a stacked classifier. It also explores automatic resource expansion via dataset augmentation and bidirectional training in order to increase the number of taggers and to maximize the impact of the composite system, which ...
Ranka Stanković, Mihailo Škorić, Branislava Šandrih Todorović. "Parallel Bidirectionally Pretrained Taggers as Feature Generators" in Applied Sciences, MDPI AG (2022). https://doi.org/10.3390/app12105028
-
Terminological and lexical resources used to provide open multilingual educational resources
Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of the Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browsing and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...... above mentioned, terminology now constitutes a very important field of Natural Language Processing, while the work that has been done in the field of terminology has come to be an indispensable, widely used resource. The standards related to terminology management are often used by the localization ...
... extraction, providing an invaluable education resource, applicable in all of its domains. In further work, bilingual terminology extraction will be considered. REFERENCES [1] I. Gurevych, D. Bernhard and A. Burchardt, “Educational Natural Language Processing,” Notes for ENLP tutorial held at ...
... resources, Natural Language Processing, Terminology 1. INTRODUCTION Natural Language Processing (NLP) has a two-faceted approach to education where one involves e-learning and computer-assisted learning and instruction and the other consists of NLP tools for analysis and use of language by machines [1] ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
-
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain
The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The concept of this network is presented, as well as the possibilities of its application. The lexical analysis applied in the FrameNet development project is also presented, and the differences between frame-based analysis and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, through which one can use ...... the risk domain. FrameNet data is also readily available through the Python API included in the NLTK (Natural Language Toolkit) suite, which provides a good natural language processing resource. The last chapter shows a corpus search of the noun risk in a mining-themed corpus. We also present its most ...
... it can be used for different purposes: as a dictionary for language learning (since it contains more than 13,000 LUs); as a valence dictionary; as a training dataset for semantic role labeling, which makes it a rich digital language resource (with over 200,000 manually annotated sentences linked to over ...
... reasoning. The NLTK system uses wrappers for other Python natural language processing and lexical resource libraries. One of the APIs available within NLTK is FrameNet and the accompanying program library designed for searching this resource, as well as for extracting information from it. As mentioned in ...
Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
-
Српски језик у дигиталном добу -- The Serbian Language in the Digital Age
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević (2012)
... society and assess the current state of language technology for the Serbian language. 3 THE SERBIAN LANGUAGE IN THE EUROPEAN INFORMATION SOCIETY 3.1 GENERAL FACTS Standard Serbian is the standard national language of Serbs and the official language in the Republic of Serbia. It was formed ...
... typical language technology applications. In the next chapter, we will present an overview of language technology and its core application areas as well as an evaluation of the current situation of language technology support for Serbian. 4 LANGUAGE TECHNOLOGY SUPPORT FOR SERBIAN Language technology ...
... stract rules, tables and examples. Humans acquire language skills in two different ways: learning from examples and learning the underlying language rules. Moving now to language technology, the two main types of systems acquire language capabilities in a similar manner. Statistical (or data-driven) ...
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević. "Српски језик у дигиталном добу -- The Serbian Language in the Digital Age" in META-NET White Paper Series, G. Rehm, H. Uszkoreit (eds.), Springer (2012)
-
Речници у дигиталном добу - информатичка подршка за српски језик [Dictionaries in the Digital Age: IT Support for the Serbian Language]
Биљана Рујевић (2022)
Morphological dictionaries of Serbian are an electronic language resource with a significant history of development and use in natural language processing. Because they were kept as files whose number kept growing, which made managing the dictionaries difficult, a need arose to store the dictionary information in a lexicographic database. To enable several users to work on dictionary development simultaneously, a need arose for a web application based on the lexicographic database. In order to consider ...
Биљана Рујевић. Речници у дигиталном добу - информатичка подршка за српски језик, Београд : [Б. Рујевић], 2022
-
Part of Speech Tagging for Serbian language using Natural Language Toolkit
Ranka Stanković, Boro Milovanović (2020)
While complex algorithms for NLP (natural language processing) are being developed, basic tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing NLP-based programs. We try to use this library to create a PoS (part-of-speech) tagger for contemporary Serbian. Eleven different models were created using the NLTK tagging API. The best models are transformed with the Brill tagger to improve accuracy. We trained the models on a tagged ...... 5 released in March 2020. Having a plethora of different algorithms makes this library a good choice for research. Serbian language belongs to a group of low-resource languages, so there is only modest research on this topic. First attempts to create an automatic PoS tagger for Serbian relied on a ...
... typology,” Proc. Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland, May 2014 [14] C. Krstev and D. Vitas, “Serbian Morphological Dictionary – SMD,” University of Belgrade, HLT Group and Jerteh, Lexical resource, 2.0, 2015 [15] A. Balvet, D. Stošić, and ...
Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanities, derogatory speech, and hate speech, has reached the level of a pandemic. A system able to detect such texts could help make the Internet and social media a better and more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...... and low-resource languages such as Serbian. The main contribution of this work is the creation of the AbCoSER, the first abusive speech corpus in Serbian, that will, together with abusive speech lexicon, enable the development of automatic abusive speech detection systems for the Serbian language. In the ...
... a general data set convenient for the detection of a broad range of abusive topics. We already used this resource for the detection of abusive triggers and the augmentation of the abusive language lexicon. D. Jokić, R. Stanković, C. Krstev, and B. Šandrih 13:3 1.2 Related work In the past two decades ...
... [52] and the number of false positives was high, indicating that lexicons are not a sufficient resource for hate speech detection. High-quality corpora of hate speech, offensive speech, and abusive language are very important as a first step in building an automated system for the detection of these ...
Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...
Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation
Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
-
The Dictionary of the Serbian Academy: from the Text to the Lexical Database
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character set encodings were used in the course of the almost 60-year long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach ...... database, language resources, dictionary, Serbian language 1 Introduction The first volume of the Dictionary of the Serbo-Croatian Standard and Vernacular Language (referred to as the Dictionary of Serbian Academy or DSA), prepared and compiled by the Institute for the Serbian Language of the Serbian ...
... the Lexical Database Ranka Stanković1, Rada Stijović2, Duško Vitas1, Cvetana Krstev1, Olga Sabo2 1University of Belgrade, 2Institute for Serbian Language, Serbian Academy of Sciences and Arts E-mail: ranka.stankovic@rgf.bg.ac.rs, rada.stijovic@isj.sanu.ac.rs, vitas@matf.bg.ac.rs, cvetana@matf.bg ...
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)
-
Karst wastewater as a high quality, renewable and within the circular economy water resource
Jovana Nikolić, Vesna Ristić Vakanjac (2021)
High quality drinking water in its natural state is becoming less and less available to the human population. Based on the expected climate changes, this resource is expected to become scarcer, both worldwide and in our region. In addition, accompanying pollutants that exceed the maximum allowable concentration are increasingly present in the waters. Even after water treatment, some components may remain in the drinking water, which adversely affects human health. ...
Jovana Nikolić, Vesna Ristić Vakanjac. "Karst wastewater as a high quality, renewable and within the circular economy water resource" in Book of proceedings of the 3rd International Scientific Conference on circular and bioeconomy CIBEK2021, Belgrade, Belgrade : School of Engineering Management (2021)
-
Towards a Mining Equipment Ontology
... software implementation of this resource, whereas in Section 4 we describe the mechanisms by which RudOnto, as a central resource, can be used for transformation of subsets of its concepts to ontologies for specific areas of mining engineering using OWL (Web Ontology Language). The final section features ...
... be derived from RudOnto for the area of Geostatistics, Mine safety, Mineral resource exploitation, Petroleum exploitation or Mining equipment. The structure of RudOnto can be described by a UML (Unified Modeling Language) model, as depicted in Figure 2. A brief description of this model follows. ...
... ultimately led to the idea of a general terminological resource for mining engineering. Hence RudOnto was conceived, as a complex terminological resource, aimed at covering the larger area of mining engineering and becoming the reference resource for mining terminology in Serbian. RudOnto is presently ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Towards a Mining Equipment Ontology" in Proceedings of the 12th International Conference Research and Development in Mechanical Industry, RaDMI 2012, September 2012, Vrnjačka Banja, Serbia no. 1, Vrnjačka Banja, Serbia : SaTCIP (Scientific and Technical Center for Intellectual Property) Ltd. (2012)
-
English for Geology Students. 2
Lidija Beko (2023)
... previous textbook with this one, putting their own principles of clarity and coherence as a way in which they wish to teach the subject of English language and geology. Six thematic units: 1. Landslides 2. Metamorphic rocks 3. Mineral deposits 4. Hydrological cycle and groundwater 5. Surface ...
... aking between registers while at the same time referring to active learning within the given context. Teaching vocabulary, which is the base of language knowledge, can be continued by creating paper vocabulary cards, and later even electronic cards, which would ensure continuity in vocabulary learning ...
Lidija Beko. English for Geology Students. 2, Belgrade : The Faculty of Mining and Geology, 2023
-
English for Geology Students. 1
Lidija Beko (2023)
Lidija Beko. English for Geology Students. 1, Belgrade : The Faculty of Mining and Geology, 2023
-
English for Geology Students 2 - Dyslexia friendly
Lidija Beko (2023)
Lidija Beko. English for Geology Students 2 - Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023
-
English for Geology Students 1 – Dyslexia friendly
Lidija Beko (2023)
Lidija Beko. English for Geology Students 1 – Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023