Претрага ⚒ Радови ⚒ Др РГФ - Репозиторијум РГФ

Претрага

Per page

Sort by

22 items

The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines

Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan (2008)

In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...

LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval

Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008) М63
Managing mining project documentation using human language technology

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja (2018)

Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...

Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja . "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239 М23
Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2017)

Large collections of textual documents represent an example of big data that requires the solution of three basic problems: the representation of documents, the representation of information needs and the matching of the two representations. This paper outlines the introduction of document indexing as a possible solution to document representation. Documents within a large textual database developed for geological projects in the Republic of Serbia for many years were indexed using methods developed within digital humanities: bag-of-words and named ...

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8 M33
Indexing of textual databases based on lexical resources: A case study for Serbian

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2015)

In this paper we describe an approach to improvement of information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied on a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15 M13
From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back)

Milica Ikonić Nešić, Ranka Stanković, Christof Schöch and Mihailo Škorić (2022)

In this paper we present the wikification of the ELTeC (European Literary Text Collection), developed within the COST Action ``Distant Reading for European Literary History'' (CA16204). ELTeC is a multilingual corpus of novels written in the time period 1840—1920, built to apply distant reading methods and tools to explore the European literary history. We present the pipeline that led to the production of the linked dataset, the novels’ metadata retrieval and named entity recognition, transformation, mapping and Wikidata population, ...

Wikidata, linked data, SPARQL, distant reading, literary corpus, named entity linking, ELTeC

Milica Ikonić Nešić, Ranka Stanković, Christof Schöch and Mihailo Škorić. "From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back)" in Proceedings of The 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022) М33
Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology

Mihailo Škorić, Mauro Dragoni (2019)

This paper is a result of a task that was presented to attendants of Keyword Search in Big Linked Data summer school, that was organized by Vienna University of Technology, under the Keystone COST action in the summer of 2017. It presents a specific approach to the classification via creation of minimal document surrogates based on the US National medical library’s MeSH ontology, which is derived from the Medical Subject Headings thesaurus. In a series of previously classified medically ...

document classification, MeSH, ontology, information extraction

Mihailo Škorić, Mauro Dragoni. "Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology" in Infotheca, Faculty of Philology, University of Belgrade (2019). https://doi.org/10.18485/infotheca.2019.19.1.3 М53
Bilingual lexical extraction based on word alignment for improving corpus search

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović (2019)

Library and Information Sciences,Computer Science Applications

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056 М23
Corpus-based bilingual terminology extraction in the power engineering domain

Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev (2022)

Ovaj rad predstavlja resurse i alate koji se koriste za ekstrkciju i evaluaciju dvojezične, englesko-srpske terminologije u domenu energetike. Resursi se sastoje od postojeće opšte i domenske leksike i domenskog paralelnog korpusa; alati uključuju ekstraktore termina za oba jezika i alat za poravnavanje segmenata koji pripadaju korpusnim rečenicama. Sistem je testiran variranjem funkcije podudaranja koja utvrđuje prisustvo ekstrahovanog termina u poravnatom segmentu (odsečak), u rasponu od veoma labavog do strogog. Procena rezultata je pokazala da je preciznost izdvajanja termina ...

Library and Information Sciences, Communication, Language and Linguistics

Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev. "Corpus-based bilingual terminology extraction in the power engineering domain" in Terminology, John Benjamins Publishing Company (2022). https://doi.org/10.1075/term.20038.iva М23
Geologic Information System of Serbia

Branislav Blagojević, Branislav Trivić, Ranka Stanković, Nenad Banjac, Olivera Kitanović (2011)

Geologic information system of Serbia (GeolISS) represents repository for digital archiving, query, retrieving, analysis and geologic data visualization. The GeolISS is implemented through ESRI ArcGIS technology, and is designed to operate as a personal geodatabase (MS Jet 4.0 Engine) and SDE enterprise geodatabase in MS SQL Server. The objective of GeolISS implementation is integration of existing geologic archives, data from published maps at different scales, newly acquired field data, as well as Web publishing of geologic information. Physical implementation ...

Geologic information system, Conceptual model, Logical model, Implementation, Geodatabase

Branislav Blagojević, Branislav Trivić, Ranka Stanković, Nenad Banjac, Olivera Kitanović. "Geologic Information System of Serbia" in Proceedings of the 17th Meeting of the Association of European Geological Societies, 14.-18. september 2011., Beograd : Srpsko geološko društvo (2011) M34
Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons

Mihailo Škorić (2017)

The goal of this paper is to draw attention to the possibility of using emoticon-riddled text on the web in language-neutral sentiment analysis. It introduces several innovations in the existing framework of research and tests their effectiveness. It also presents a software tool especially made for that purpose, explains how it builds a database with sentimental value of terms and offers the user manual. Finally, it presents a software tool that tests the new database and gives some examples ...

data mining, information extraction, emotions, text on the web

Mihailo Škorić. "Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons" in Infotheca, Faculty of Philology, University of Belgrade (2017). https://doi.org/10.18485/infotheca.2017.17.1.4 М53
Creation of a Training Dataset for Question-Answering Models in Serbian

Ranka Stanković, Jovana Rađenović, Maja Ristić, Dragan Stankov (2024)

Razvoj i primena veštačke inteligencije u jezičkim tehnologijama značajno su napredovali poslednjih godina, posebno u domenu zadatka odgovaranja na pitanja (Question Answering - QA). Dok su postojeći resursi za QA zadatke razvijeni za glavne svetske jezike, srpski jezik je relativno zanemaren u ovoj oblasti. Ovaj rad predstavlja inicijativu za kreiranje obimnog i raznovrsnog skupa podataka za obučavanje modela za odgovaranje na pitanja na srpskom jeziku, koji će doprineti unapređenju jezičkih tehnologija za srpski jezik. Pored brojnih istraživanja o jezičkim modelima ...

veštačka inteligencija, obrada prirodnog jezika, jezički resursi, anotirani skupovi, ekstrakcija informacija, odgovaranje na pitanja

Ranka Stanković, Jovana Rađenović, Maja Ristić, Dragan Stankov. "Creation of a Training Dataset for Question-Answering Models in Serbian" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024) М64
Preparation of Multimedia Document “YU Rock Scene”

Milena Obradović, Aleksandra Arsenijević, Mihailo Škorić (2017)

SUMMARY: This study will present the preparation process of a multimedia document entitled YU ROCK SCENE in which participants were senior students of undergraduate studies of the Department of Library and Information Science at the University of Belgrade Faculty of Philology during the academic year 2014/2015, as a part of the subject Multimedia Documents. This study gives an overview of the historical development of rock and roll in the territory of the former Yugoslavia, rock scene in Yugoslav republics, ...

multimedia document, library science, information science, rock and roll, music, Yugoslavia

Milena Obradović, Aleksandra Arsenijević, Mihailo Škorić. "Preparation of Multimedia Document “YU Rock Scene”" in Infotheca - Journal for Digital Humanities, Faculty of Philology, University of Belgrade (2017). https://doi.org/10.18485/infotheca.2016.16.1_2.6 М53
Combining Heterogeneous Lexical Resources

Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić (2004)

development of lexical resources, morphological dictionaries, WordNet

Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić. "Combining Heterogeneous Lexical Resources" in Proceedings of the Fourth Interantional Conference on Language Resources and Evaluation, Lisabon, Portugal , May 2004, vol. 4, ELRA - European Language Resources Association (2004) М33
Serbian ELTeC Sub-Collection in Wikidata

Milica Ikonić Nešić, Ranka Stanković, Biljana Rujević (2021)

This paper presents an example of integration of Wikidata with digital libraries and external systems, as well as some best practices for speeding up the process of data preparation and import to Wikidata, on the use case of SrpELTeC, Serbian subcollection of the ELTeC multilingual collection (European Literary Text Collection). After preliminary work on the manual Wikidata population with SrpELTeC novels, the goal was to automate the process of preparing and importing information, so different solutions were analysed and ...

Википодаци, удаљенои читање, књижевни корпус, повезивање именованих ентитета, ELTeC, SrpELTeC

Milica Ikonić Nešić, Ranka Stanković, Biljana Rujević. "Serbian ELTeC Sub-Collection in Wikidata" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.2.4 М53
Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection

Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić (2022)

In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the time period 1840-1920. The corpus is being built in order to test various distant reading methods and tools with the aim of re-thinking the European literary history. We present the various steps that led to the production of the Serbian sub-collection: the novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization and named entity recognition. The Serbian sub-collection was published ...

Corpus, Distant Reading, Digital Humanities, Linked Data, Named Entity Recognition, Text Analytics

Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić. "Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection" in Proceedings of the Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022) М33
A bilingual digital library for academic and entrepreneurial knowledge management

Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić (2015)

A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...

Knowledge management, Digital library, Multilingualism, Language Resources, Terminology

Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics — IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015) M33
Rule-based Automatic Multi-word Term Extraction and Lemmatization

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac (2016)

In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...

term extraction, terminology, multi-word units, lemmatization, finite-state transducers

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016) M33
Terminological and lexical resources used to provide open multilingual educational resources

Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić (2016)

Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...

otvoreni obrazovni resursi, leksički resursi, obrada prirodnih jezika, terminologija

Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016) М33
CC-PESTO: a novel GIS-based method for assessing the vulnerability of karst groundwater resources to the effects of climate change

Zoran Stevanović, Veljko Marinović, Jelena Krstajić (2020)

The new GIS-based CC-PESTO method is shown to successfully assess and map the vulnerability/resilience of karst aquifers to effects of climate change. Karst aquifers were chosen due to their importance at the global level and widespread utilisation in potable water supply and irrigation, but also because of their hydrogeological complexity. The method was developed to assess the intrinsic vulnerability of aquifers, without considering the direct impact of variable climate factors, but considering the adaptive capacity of aquifers in response ...

Klimatske promenem Karst, Ranjivost, Geografski informacioni sistemi, Crna Gora, Srbija

Zoran Stevanović, Veljko Marinović, Jelena Krstajić. "CC-PESTO: a novel GIS-based method for assessing the vulnerability of karst groundwater resources to the effects of climate change" in Hydrogeology Journal, Springer Science and Business Media LLC (2020). https://doi.org/10.1007/s10040-020-02251-6 М21
SrpELTeC: A Serbian Literary Corpus for Distant Reading

Ranka Stanković, Cvetana Krstev, Duško Vitas (2024)

U članku je predstavljen SrpELTeC, korpus razvijen u okviru akcije COST Distant Reading for European Literary History (CA16204). Svi romani u SrpELTeC-u su odabrani, pripremljeni i obeleženi korišćenjem zajedničkih principa uspostavljenih za sve jezičke zbirke u Evropskoj zbirci književnog teksta (ELTeC). Navedeni su izazovi i rešenja u pripremi SrpELTeC od nule. Svi romani su ručno kodirani u TEI sa bogatim metapodacima i strukturnim napomenama. Automatska anotacija je uključivala POS-označavanje, lematizaciju i imenovane entitete, oslanjajući se na resurse za obradu ...

digital humanities, Serbian literature, text corpora, distant reading , linked data, named entity recognition, text analytics

Ranka Stanković, Cvetana Krstev, Duško Vitas. "SrpELTeC: A Serbian Literary Corpus for Distant Reading" in Primerjalna književnost, Research Centre of the Slovenian Academy of Sciences and Arts (2024). https://doi.org/10.3986/pkn.v47.i2.03 М23

Претрага

22 items

The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines cite

Managing mining project documentation using human language technology cite

Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources cite

Indexing of textual databases based on lexical resources: A case study for Serbian cite

From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back) cite

Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology cite

Bilingual lexical extraction based on word alignment for improving corpus search cite

Corpus-based bilingual terminology extraction in the power engineering domain cite

Geologic Information System of Serbia cite

Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons cite

Creation of a Training Dataset for Question-Answering Models in Serbian cite

Preparation of Multimedia Document “YU Rock Scene” cite

Combining Heterogeneous Lexical Resources cite

Serbian ELTeC Sub-Collection in Wikidata cite

Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection cite

A bilingual digital library for academic and entrepreneurial knowledge management cite

Rule-based Automatic Multi-word Term Extraction and Lemmatization cite

Terminological and lexical resources used to provide open multilingual educational resources cite

CC-PESTO: a novel GIS-based method for assessing the vulnerability of karst groundwater resources to the effects of climate change cite

SrpELTeC: A Serbian Literary Corpus for Distant Reading cite

The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines

Managing mining project documentation using human language technology

Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources

Indexing of textual databases based on lexical resources: A case study for Serbian

From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back)

Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology

Bilingual lexical extraction based on word alignment for improving corpus search

Corpus-based bilingual terminology extraction in the power engineering domain

Geologic Information System of Serbia

Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons

Creation of a Training Dataset for Question-Answering Models in Serbian

Preparation of Multimedia Document “YU Rock Scene”

Combining Heterogeneous Lexical Resources

Serbian ELTeC Sub-Collection in Wikidata

Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection

A bilingual digital library for academic and entrepreneurial knowledge management

Rule-based Automatic Multi-word Term Extraction and Lemmatization

Terminological and lexical resources used to provide open multilingual educational resources

CC-PESTO: a novel GIS-based method for assessing the vulnerability of karst groundwater resources to the effects of climate change

SrpELTeC: A Serbian Literary Corpus for Distant Reading