Претрага
15 items
-
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface
Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos (2024)Predstavljamo trenutne aktivnosti na definisanju interfejsa leksikona i korpusa koji će služiti kao referenca u prikazu polileksemskih jedinica - višečlanih izraza - (različitih tipova - imenskih, glagolskih, itd.) u specijalizovanim leksikonima i povezivanju ovih unosa sa njihovim pojavljivanjima u korpusima. Konačni cilj je korišćenje ovakvih resursa za automatsko identifikovanje višečlanih izraza u tekstu. Uključivanje nekoliko prirodnih jezika ima za cilj univerzalnost rešenja koje nije usredsređeno na određeni jezik, kao i prilagođavanje idiosinkrazijama. Raspravljaju se izazovi u leksikografskom opisu višerečnih ...Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos. "Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024) М33
-
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities
Овај рад представља активности на развоју корпуса ELEXIS-sr, српском додатку вишејезичном анотираном корпусу ELEXIS-а, који се састоји од семантичких анотација и репозиторија значења речи. ELEXIS је паралелни вишејезични анотирани корпус на десет европских језика, који може да се користи као вишејезички репер за евалуацију европских језика са мање и средње развијеним ресурсима. Фокус овог рада је на вишечланим изразима и именованим ентитетима, њиховом препознавању у скупу реченица ELEXIS-sr и поређењу са анотацијама на другим језицима. Разматрају се први кораци ...Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024) М33
-
Sentiment Analysis of Serbian Old Novels
In this paper we present first study of Sentiment Analysis (SA) of Serbian novels from the 1840-1920 period. The preparation of sentiment lexicon was based on three existing lexicons: NRC, AFFIN and Bing with additional extensive corrections. The first phase of dataset refinement included filtering the word that are not found in Serbian morphological dictionary and in second automatic POS tagging and lemma were manually corrected. The polarity lexicon was extracted and transformed into ontolex-lemon and published as initial ...Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović. "Sentiment Analysis of Serbian Old Novels" in Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data, June 2022, Marseille, France, European Language Resources Association (2022) М33
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Uvredljivi govor na društvenim medijima, uključujući psovke, pogrdni govor i govor mržnje, dostigao je nivo pandemije. Sistem koji bi bio u stanju da detektuje takve tekstove mogao bi da pomogne da internet i društveni mediji postanu bolji virtuelni prostor sa više poštovanja. Istraživanja i komercijalna primena u ovoj oblasti do sada su bili fokusirani uglavnom na engleski jezik. Ovaj rad predstavlja rad na izgradnji AbCoSER-a, prvog korpusa uvredljivog govora na srpskom jeziku. Korpus se sastoji od 6.436 ručno označenih ...Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13 М33
-
Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names
In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian news paper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annota tion, which were further used to train two Named Entity Recognition (NER) sys tems: Stanford and spaCy. All obtained models, together with a rule- and lexicon based system were evaluated on ...Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names" in Proceedings - Natural Language Processing in a Deep Learning World, Incoma Ltd., Shoumen, Bulgaria (2019). https://doi.org/10.26615/978-954-452-056-4_122 М33
-
Multi-word Expressions for Abusive Speech Detection in Serbian
Ovaj rad predstavlja istraživanja na usavršavanju i unapređenju srpske verzije rečnika Hurtlex, višejezičnog leksikona uvredljivih reči. Posebnu pažnju posvećujemo dodavanju izraza sa više reči (polileksemskih jedinica) koji se mogu smatrati uvredljivim, jer su takvi leksički zapisi veoma važni za postizanje dobrih rezultata u mnoštvu zadataka otkrivanja uvredljivog jezika. Srpski morfološki rečnici se koriste kao osnova za čišćenje podataka i stvaranje rečnika. Istaknuta je veza sa drugim leksičkim i semantičkim resursima na srpskom jeziku i predviđena je izgradnja sistema za ...Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020) М33
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015) M33
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA, : ACM (2017). https://doi.org/ M33
-
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...Cvetana Krstev, Ranka Stanković, Vitas Duško. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta : European Language Resources Association (2010) M33
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...LR web services, MultiWord Expressions & Collocations, Information Extraction, Information RetrievalKrstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008) М63
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615 М21
-
A Data Driven Approach for Raw Material Terminology
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021)The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ...sirovine, rudarstvo, terminologija, rečnik, terminološka aplikacija, mobilna aplikacija, digitizacija, leksički podaci, korpusi, otvoreni povezani podaciOlivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892 М22
-
Determining the Impact of Cutting Elements State on the Bucket–Wheel Excavator Vibration and Energy Consumption
Background There are two types of specifc sizes’ measurements. The frst is the measurement of energy consumption in a working technological process, and the second is the measurement of vibration, which is displayed over amplitudes of speed. There is an interplay between these two processes. Increasing the power consumption, the value of the vibration speed amplitudes increases. Purpose The process of the determination of machining parameters was based on minimizing the power consumption and vibration of the cutting tool, ...Filip Miletić, Predrag Jovančić, Miloš Milovančević, Miloš Tanasijević, Stevan Đenadić. "Determining the Impact of Cutting Elements State on the Bucket–Wheel Excavator Vibration and Energy Consumption" in Journal of Vibration Engineering & Technologies, Springer (2022). https://doi.org/10.1007/s42417-022-00482-3 М22
-
Football terminology: compilation and transformation into OntoLex-Lemon resource
У овом раду представља се пројекат који је у развоју, креирање првог дигиталног фудбалског речника на српском језику, као и да демонстрација примене модела OntoLex и љегових модула. OntoLex-FrAC модул укључује информације о учесталости и примерима употребе екстрахованих из корпуса. У овом случају, креиран је корпус за специфичан домен под називом СрФудКо, који садржи чланке вести о фудбалу на српском језику. Вишечлани термини аутоматски су екстраховани из српског корпуса, а затим ручно евалуирани и класификовани као спортски или ...Jelena Lazarević, Ranka Stanković, Mihailo Škorić, Biljana Rujević. "Football terminology: compilation and transformation into OntoLex-Lemon resource" in LDK 2023 – 4th Conference on Language, Data and Knowledge, 12-15 September in Vienna, Austria, Lisabon : NOVA FCSH - CLUNL (2023). https://doi.org/10.34619/srmk-injj М33
-
Оптимизација управљања процесима рекултивације површинских копова угља
Bojan S. Dimitrijević (2014-05-30)Ширење површинске експлоатације минералних сировина а нарочито енергетских ресурсафосилних горива, чине површинске копове угља просторно доминантним објектима великих рударскихбасена како у нас тако и у свету. Експлоатациона активност, прати негативне еколошке утицаје наживотну средину шире околине, што нас обавезује на интегрално планирање, ревитализацију,рекултивацију и уређење нарушеног простора за његово поновно хумано коришћење упостексплоатационом периоду. Постексплоатационо санирање и уређење предела површинских коповаи одлагалишта односно нарушеног простора изазваног рударским радовима и пратећих објеката,представља синтетички мултидисциплинарно-интердисциплинаран веома комплексан вишефазниинжењерски пројектни задатак.У уводном ...рекултивација, постексплоатациони предео, површински коп угља,вишеатрибутна анализа, одлучивање, процеси, рекултивационо решењеBojan S. Dimitrijević. "Оптимизација управљања процесима рекултивације површинских копова угља" in Универзитет у Београду, Универзитет у Београду, Рударско-геолошки факултет (2014-05-30)