Search
114 items
-
Integrisanje heterogenih leksičkih resursa [Integrating heterogeneous lexical resources]
The main activity of the Natural Language Processing Group at the Faculty of Mathematics, University of Belgrade, is the development of various resources for processing Serbian. Particularly important among them are the system of morphological dictionaries of Serbian developed within the RELEX network [1] and the semantic network (of the wordnet type) for Serbian developed within the international Balkanet project. These are two heterogeneous lexical resources, built on entirely different models, and they therefore contain different kinds of lexical information. By integrating these resources, the information ...
... Ranka Stanković, Cvetana Krstev, Duško Vitas, Ivan Obradović, Gordana Pavlović-Lažetić. Digital Repository of the Faculty of Mining and Geology, University of Belgrade [ДР РГФ] ...
... Integrisanje heterogenih leksičkih resursa Ranka Stanković, Rudarsko-geološki fakultet, Beograd Cvetana Krstev, Filološki fakultet, Beograd Duško Vitas, Matematički fakultet, Beograd Ivan Obradović, Rudarsko-geološki fakultet, Beograd Gordana Pavlović-Lažetić, Matematički fakultet, Beograd ...
... (2002). BALKANET: A Multilingual Semantic Network for Balkan Languages. Proceedings of 1st International Wordnet Conference, Mysore, India. [4] Vitas, D. et al. (2003). Resources and Basic Tools for the Processing of Serbian Written Texts. Proc. of the Workshop on Balkan Language Resources, 1st ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Ivan Obradović, Gordana Pavlović-Lažetić. "Integrisanje heterogenih leksičkih resursa" in Festivalski katalog 11. Festivala informatičkih dostignuća INFOFEST 2004, 26th September - 2nd October, 2004, Budva, Montenegro, INFOFEST (2004)
-
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković (2019)
In this paper we present a model for selecting good examples for a dictionary of Serbian, together with the development of the model's initial components. The method is based on a detailed analysis of various lexical and syntactic features in a corpus compiled from examples in five digitized volumes of the SASA dictionary. The initial feature set was inspired by similar approaches for other languages. The distribution of features of the examples in this corpus is compared with the feature distribution of sentence samples excerpted from corpora containing various texts. The analysis showed that ...
Keywords: Serbian, good dictionary examples, automation of dictionary production, feature extraction, machine learning
... Krstev1, Duško Vitas1, Aleksandra Marković2 1 University of Belgrade, Studentski trg 1, Belgrade, Serbia 2 Institute for Serbian Language, SASA, Knez Mihailova 36, Belgrade, Serbia E-mail: ranka@rgf.rs, branislava.sandrih@fil.bg.ac.rs, rada.stijovic@isj.sanu.ac.rs, cvetana@matf.bg.ac.rs, vitas@matf ...
... ideas how to modernize the work on the SASA dictionary came many years ago (Sabo & Vitas, 1989). These ideas were later revitalized and various possibilities for updating the work on this dictionary were considered (Vitas & Krstev, 2015; Ivanović et al., 2016). The modernization of work finally began ...
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković. "SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian" in Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, Lexical Computing CZ, s.r.o. (2019)
-
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities for enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. Keyword queries, initiated in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
-
Automatic construction of a morphological dictionary of multi-word units
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...
Keywords: electronic dictionary, Serbian, morphology, inflection, multiword units, noun phrases, query expansion
... al Inflection of Multi-Word Units - A Contrastive Study of Lexical Approaches. Linguistic Issues in Language Technologies 1 (2008) 4. Krstev, C., Vitas, D.: Finite State Transducers for Recognition and Generation of Compound Words. In Erjavec, T., Žganec Gros, J., eds.: IS-LTC 2006, Ljubljana, Slovenia ...
... 192–197 5. Savary, A.: Multiflex: A Multilingual Finite-State Tool for Multi-Word Units. In: CIAA. (2009) 237–240 6. Krstev, C., Stanković, R., Vitas, D., Obradović, I.: The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines. In: 6th LREC, Marrakech, Morocco ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić. "Automatic construction of a morphological dictionary of multi-word units" in Lecture Notes in Computer Science 6233, Advances in Natural Language Processing, Proceedings of the 7th International Conference on NLP, IceTAL 2010, Reykjavik, Iceland, August 2010, Springer (2010): 226-237. https://doi.org/10.1007/978-3-642-14770-8_26
-
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
-
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis
This paper presents a model that supports the collection, preparation, metadata description, management and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied to a web portal that gathers various texts drawn from journals of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the "Tara" and "Reiss" conferences, and several doctoral dissertations related to this field of research. After text processing, a corpus containing over 5,500 pages of plain text was created and ...
... Krstev, Duško Vitas, “Corpus and Lexicon - Mutual Incompletness”, in Proceedings of the Corpus Linguistics Conference, 14-17 July 2005, Birmingham, eds. Pernilla Danielsson and Martijn Wagenmakers, ISSN 1747-9398, http://www.corpus.bham.ac.uk/PCLC/, 2005 10 Cvetana Krstev, Ranka Stanković, Duško Vitas ...
... library. 15 Cvetana Krstev. Processing of Serbian – Automata, Text and Electronic Dictionaries, Faculty of philology, Belgrade, 2008 16 Duško Vitas, Cvetana Krstev, Ivan Obradović, Ljubomir Popović, Gordana Pavlović-Lažetić”, An Processing Serbian Written Texts: An Overview of Resources and ...
... be reached via a synchronized synsets. Figure 4. Sequence diagram a multilingual query expansions 17 Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović, “The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines”, in Proceedings of the Sixth ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
-
Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking
The paper presents the results of research on the preparation of parallel corpora, focusing on their transformation into RDF graphs using the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study, as well as of the process of part-of-speech tagging, lemmatization and named entity recognition (NER). We then describe named entity linking (NEL), the conversion of the data into RDF, and the incorporation of NIF annotations. The produced NIF files were evaluated by exploring the triplestore with SPARQL queries. Finally, the linking of Linked ... is discussed.
Keywords: parallel corpora, named entity linking, named entity recognition, NER, NEL, linked data, NIF, Wikidata
Ranka Stanković, Milica Ikonić Nešić, Olja Perisic, Mihailo Škorić, Olivera Kitanović. "Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ...
... Krstev, C., Stanković, R., Vitas, D., and Obradović, I. (2006). WS4LR: A Workstation for Lexical Resources. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, pages 1692–1697. Krstev, C., Stanković, R., and Vitas, D. (2010). A Description of M ...
... ac.rs ivano@rgf.bg.ac.rs Miljana Mladenović College for Preschool Teachers Bujanovac, Serbia ml.miljana@gmail.com Cvetana Krstev and Marko Vitas Faculty of Philology University of Belgrade, Serbia cvetana@matf.bg.ac.rs vitas.marko@gmail.com Abstract In this paper we present an approach ...
Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
-
Proširivanje upita zasnovano na leksičkim resursima [Query expansion based on lexical resources]
The paper describes how lexical resources for Serbian and software tools developed within the Language Technology Group of the University of Belgrade can be used to improve query formulation. Search results can be improved significantly by using various lexical resources, such as morphological dictionaries and semantic networks. The presented approach can also be applied in the System of Scientific, Technological and Business Information, since efficient searching of this valuable resource, given its heterogeneity and size as well as its predominantly textual content, ...
... Lexical Database, The MIT Press. [5] Maurel D., Vitas D., Krstev S., Koeva S., (2007) „Prolex: a lexical model for translation of proper names. Application to French, Serbian and Bulgarian“, BULAG n°32, 2007. [6] Krstev C., Stanković R., Vitas D., Obradović I., “WS4LR: A Workstation for Lexical ...
... Faculty of the University of Belgrade for many years now, so that a large number of different resources, developed to a considerable extent, are available today (Vitas et al., 2003). Besides the corpus of Serbian and the multilingual parallel corpora, of particular importance are the system of morphological dictionaries of Serbian ...
... its successful adaptation to various purposes, and thereby also open up possibilities for its use within SNTPI. REFERENCES [1] Vitas D., Pavlović-Lažetić G., Krstev C., Popović Lj., Obradović I. (2003): „Processing Serbian Written Texts: An Overview of Resources and Basic Tools“ ...
Ranka Stanković, Ivan Obradović, Cvetana Krstev. "Proširivanje upita zasnovano na leksičkim resursima" in SNTPI 09 - Naučno-stručni skup Sistem naučnih, tehnoloških i poslovnih informacija, Beograd 19. i 20. jun 2009, Beograd : Fakultet informacionih tehnologija (2009)
-
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... al electronic dictionaries Morphological electronic dictionaries of Serbian for NLP are being developed for many years now (Vitas et al., 1993) (Krstev, Cvetana and Vitas, Duško, 2015). They cover general lexica, proper names (persons and toponyms), general knowledge (famous or fictitious persons ...
... udžbenike. Koeva, S., Krstev, C., and Vitas, D. (2008). Morpho-semantic relations in wordnet–a case study for two slavic languages. In Proceedings of Global WordNet Conference 2008, pages 239–253. University of Szeged, Department of Informatics. Krstev, C. and Vitas, D. (2007). Extending the Serbian ...
... C., Vitas, D., and Erjavec, T. (2004). MULTEXT-East resources for Serbian. In Zbornik 7. mednarodne multikonference Informacijska druzba IS 2004 Jezikovne tehnologije 9-15 Oktober 2004, Ljubljana, Slovenija, 2004. Erjavec, Tomaž and Zganec Gros, Jerneja. Krstev, C., Stanković, R., Vitas, D., ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008)
... Krstev, C., Stanković, R., Vitas, D., Obradović, I. (2006). “WS4LR: A Workstation for Lexical Resources”. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 2006, pp. 1692–1697 [10] Krstev, C., Vitas D., Stanković R., Obradović ...
... [11] Krstev C., Pavlović-Lažetić G., Vitas D., Obradović I.: “Using Textual and Lexical Resources in Developing Serbian Wordnet”, Romanian J. Information Science and Technology, Romanian Academy, vol. 7, No. 1–2, pp. 147–161, (2004) [12] Krstev, C., Vitas, D., Maurel, D., Tran, M. (2005). “Mu ...
... Serbia” in the journal Zapisnici Srpskog geološkog društva, Srpsko geološko društvo, Beograd. [7] ESRI Developer network (http://edn.esri.com) [8] Vitas D., G. Pavlović-Lažetić, C. Krstev, Lj. Popović, I. Obradović (2003): „Processing Serbian Written Texts: An Overview of Resources and Basic Tools“ ...
Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
-
From DELA Based Dictionary to Leximirka Lexical Database
Biljana Lazić, Mihailo Škorić (2020)
In this paper, we present an approach to transforming Serbian language morphological dictionaries from a DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data within a database when compared to storing them in textual documents, we outline some of the functionality that the database has made possible. We also show how hand-made rules that use the category labels with which lexical entries are marked can be used to link lexical entries. ...
... Mining and Geology Belgrade, Serbia 1 Introduction Prof. Dr. Dusko Vitas and Prof. Dr. Cvetana Krstev started working on the development of Serbian morphological dictionaries more than 25 years ago (Vitas, 1993; Krstev, 1997; Vitas et al., 1993). Morphological dictionaries represent a significant linguistic ...
... no. 6 (2018): 993–1009, URL https://doi.org/10.1108/EL-11-2017-0239 Vitas, Duško. “Matematički model morfologije srpskohrvatskog jezika (imenska fleksija)”. Phdthesis, Univerzitet u Beogradu, Matematički fakultet, 1993 Vitas, Duško, Gordana Pavlovic-Lažetić and Cvetana Krstev. “Electronic ...
... “bibliotekar” is among the 10,000 most frequent words in the Serbian Corpus of the Serbian Language SrbCorp (version of 122 million words by Duško Vitas and Miloš Utvić)6. Information about the Corpus is stored in the KorpusMeta table. The LexicalRelation table stores information 6 Corpus of the Serbian ...
Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
-
Stabilnost terena kao uticajni faktor na urbanistička planiranja [Terrain stability as an influential factor in urban planning]
Duško Sunarić (1980)
Duško Sunarić. Stabilnost terena kao uticajni faktor na urbanistička planiranja, Beograd: Rudarsko-geološki fakultet, 1980
-
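Uticaj tehničko-organizacionih parametara na brzinu izrade podzemnih prostorija u rudnicima uglja Srbije [The influence of technical and organizational parameters on the rate of excavation of underground openings in the coal mines of Serbia]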
Duško Đukanović (2002)
Duško Đukanović. Uticaj tehničko-organizacionih parametara na brzinu izrade podzemnih prostorija u rudnicima uglja Srbije, Beograd: Rudarsko-geološki fakultet, 2002
-
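Geološko-tektonska građa i hidrogeološke prilike akumulacionog basena "Mavrovo" u vezi gubljenja vode iz akumulacije [Geological and tectonic structure and hydrogeological conditions of the "Mavrovo" reservoir basin in relation to water loss from the reservoir]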
Duško Đuzelkovski (1975)
Duško Đuzelkovski. Geološko-tektonska građa i hidrogeološke prilike akumulacionog basena "Mavrovo" u vezi gubljenja vode iz akumulacije, Beograd: Rudarsko-geološki fakultet, 1975
-
Stabilnost dolinskih padina u gornjem toku reke Drine [Stability of valley slopes in the upper course of the Drina River]
Duško Sunarić (1984)
Duško Sunarić. Stabilnost dolinskih padina u gornjem toku reke Drine, Beograd: Rudarsko-geološki fakultet, 1984
-
Српски језик у дигиталном добу -- The Serbian Language in the Digital Age
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević (2012)
... СРПСКИ ЈЕЗИК У ДИГИТАЛНОМ ДОБУ Duško Vitas Ljubomir Popović Cvetana Krstev Ivan Obradović Gordana Pavlović-Lažetić Mladen Stanojević White Paper Series THE SERBIAN LANGUAGE IN THE DIGITAL AGE Серија белих књига СРПСКИ ЈЕЗИК У ДИГИТАЛНОМ ДОБУ Duško Vitas University of Belgrade Ljubomir ...
... Sciences: Radovan Garabík Словенија Slovenia Jožef Stefan Institute: Marko Grobelnik Србија Serbia Univ. of Belgrade, Faculty of Mathematics: Duško Vitas, Cvetana Krstev, Ivan Obradović Pupin Institute: Sanja Vraneš Финска Finland Computational Cognitive Systems Research Group, Aalto Univ.: Timo ...
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević. "Српски језик у дигиталном добу -- The Serbian Language in the Digital Age" in META-NET White Paper Series, G. Rehm, H. Uszkoreit (eds.), Springer (2012)
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...
... evaluation. Terminology, 16(2), pp.141--158. Vitas, D., Popović, Lj., Krstev, C., Obradović, I., Pavlović-Lažetić, G. and Stanojević, M. (2012). The Serbian Language in the Digital Age. Berlin; Springer-Verlag. 8. Language Resource References Vitas D., Utvić M. (2015). SrpKor22M, Serbian au ...
... language resources such as morphological e-dictionaries and grammars developed within the University of Belgrade Human Language Technology Group (Vitas et al., 2012). For our approach, production of lemmas for various forms of MWTs extracted from a corpus is necessary for two main reasons. Firstly ...
... In Proc. of the Workshop on BSNLP: Information Extraction and Enabling Technologies, pp. 59--66. Krstev, C., Obradović, I., Stanković, R., and Vitas, D. (2013). An Approach to Efficient Processing of Multi-Word Units. In: Przepiórkowski, A., Piasecki, M., Jassem, K., Fuglewicz, P. (Eds.) Computational ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016)
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... production of the new tagger model for Serbian are: (a) Serbian morphological dictionaries (Cvetana Krstev, Duško Vitas, 2015) (SMD); (b) pre-annotated texts (Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić, 2019). 2.1. Serbian morphological dictionaries Serbian morphological ...
... Bidirectional LSTM-CRF Models for Sequence Tagging. Krstev, C., Vitas, D., and Erjavec, T. (2004). Morpho-Syntactic Descriptions in MULTEXT-East-the Case of Serbian. Informatica, 28(4):431–436. Krstev, C., Obradović, I., Utvić, M., and Vitas, D. (2014). A system for named entity recognition based on ...
... 12(2):36a–47a, December. 8. Language Resource References Cvetana Krstev, Duško Vitas. (2015). Serbian Morphological Dictionary - SMD. University of Belgrade, HLT Group and Jerteh, Lexical resource, 2.0. Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić. (2019). Sr-Basic: Annotated corpus ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system that would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...
Keywords: Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation
Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239