Search
109 items
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...
...LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval
... ) This false retrieval occurs because two constituents of the multi-word term are treated separately, and neither nearness conditions nor grammatical agreement conditions are taken into account, which reduces precision. Conversely, if a literal search is performed as with “beli luk” then ...
... case the latter solution would not yield erroneous results either since for query expansion we need only correctly inflected forms and not grammatical categories. 6. In Serbian many compounds have a structure in which some of its components do not inflect (like X+noun or noun+X+X). When i ...
... the components that do not inflect, one of them being “if the word that follows a noun is possibly a preposition and the next word is in the grammatical case that is required by that preposition, neither of the word forms following the noun will inflect”. This rule would correctly determine that ...
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
-
Determining the Availability of Continuous Systems at Open Pits Applying Fuzzy Logic
This work presents a model for determining the availability of continuous systems at open pits by applying fuzzy logic and fuzzy inference systems. The applied model was formed by the synthesis of independent partial indicators of availability. The model is based on an expert system for assessing the availability of continuous mining systems. The availability of the system, as a complex state parameter, is decomposed into the partial indicators, reliability, and convenience of maintenance, and the fuzzy compositions, used ...
... systems, continuous mining system, open pit, mining, availability, fuzzy logic, max–min composition, min–max composition
Miljan Gomilanović, Miloš Tanasijević, Saša Stepanović. "Determining the Availability of Continuous Systems at Open Pits Applying Fuzzy Logic" in Energies, MDPI AG (2022). https://doi.org/10.3390/en15186786
-
Indexing of textual databases based on lexical resources: A case study for Serbian
In this paper we describe an approach to improvement of information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied on a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...
... When searching with the keyword zlato ‘gold’ the old system ranks the document as 125th with general search and as 84th when searching in the category mineral deposits because the keyword matches only that particular form of the word (two matches highlighted in green in Figure 1). The new system ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
-
Extensive vibrations of the belt conveyer drive electromotor of a bucket wheel excavator as a result of intensified wear-and-tear of its mount support
Vesna Damnjanović, Predrag Jovančić, Snežana Aleksandrović. "Extensive vibrations of the belt conveyer drive electromotor of a bucket wheel excavator as a result of intensified wear-and-tear of its mount support" in Journal of Vibroengineering (2017). https://doi.org/10.21595/jve.2016.17321
-
Integracija heterogenih tekstualnih resursa
Ranka Stanković, Ivan Obradović (2007)
The paper describes an approach to integrating heterogeneous textual resources for Serbian by means of a complex software tool developed specifically for this purpose. The structure and main components of the developed system are described. The paper also presents the possibilities, offered by the developed integrated environment, for improving the resources through mutual exchange of information. Finally, it describes how the integrated heterogeneous resources can be applied to query expansion and to text search in general, and outlines some directions for further development.
... currently enabled, because appropriate resources for those languages are not available. Besides the existing ... formats, such as MULTEXT-East, DCR (Data Category Registry), LMF (Lexical Markup Framework) and MAF (Morphological Annotation Framework). Incorporating derivations into WS4LR, which is also planned, would open ...
Ranka Stanković, Ivan Obradović. "Integracija heterogenih tekstualnih resursa" in Zbornik radova međunarodnog simpozijuma Razlike između bosanskog/bošnjačkog, hrvatskog i srpskog jezika, Graz, Austria, April 2007, - (2007)
-
Frequency and Length of Syllables in Serbian
Marija Radojičić, Biljana Lazić, Sebastijan Kaplar, Ranka Stanković, Ivan Obradović, Ján Mačutek, Lívia Leššová (2019)
Basic analyses of several properties of syllables (the rank-frequency distribution, the distribution of length, and the relation between length and frequency) in Serbian are presented. The syllabification algorithm used combines the maximum onset principle and the sonority hierarchy. Results indicate that syllables behave similarly to words as far as mathematical models are concerned, but values of parameters in models for syllables are quite different from those for words.
... with approximants and nasals being sonorants. Admittedly, this scale puts many consonants with different phonological characteristics into one category (e.g. stops and fricatives); however, according to Zec (1995, p.86), it „is not nearly as elaborate as some of the scales proposed in the literature ...
Marija Radojičić, Biljana Lazić, Sebastijan Kaplar, Ranka Stanković, Ivan Obradović, Ján Mačutek, Lívia Leššová. "Frequency and Length of Syllables in Serbian" in Glottometrics (2019)
-
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface
Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos (2024)
We present ongoing work on defining a lexicon-corpus interface that will serve as a reference for representing polylexical units, that is, multiword expressions of various types (nominal, verbal, etc.), in specialized lexicons and for linking these entries to their occurrences in corpora. The ultimate goal is to use such resources for the automatic identification of multiword expressions in text. Including several natural languages aims at a solution that is universal rather than centred on a particular language, while also accommodating idiosyncrasies. Challenges in the lexicographic description of multiword ...
Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos. "Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... years now. In the dictionary of lemmas (DELAS) each lemma is described in full detail so that the dictionary of forms containing all the necessary grammatical information (DELAF) can be generated from it, and subsequently used for various NLP tasks. Serbian e-dictionaries of simple forms have reached a ...
... in the hope of achieving this goal correctly most of the time. Serbian, like other Slavic languages, is a highly-inflected language, with complex grammatical rules that cannot be adequately expressed by stemming rules. However, for highly-inflected languages, lemmatization can hardly be avoided as each ...
... 50 parallel abstracts in the Serbian and English language including the average value, the minimal and maximal number of words in rows for each category presented by columns. The first column is related to the numbers of words in the text, the KW count lists the number of keywords given by an author ...
Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Keyword Extraction from Parallel Abstracts of Scientific Publications" in Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
Wordnet Development Using a Multifunctional Tool
Ivan Obradović, Ranka Stanković (2007)
In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their synchronous use in various tasks. We focus here on the description of the possibilities this tool offers in the development of wordnets. Besides the wordnet module which enables parallel handling of two wordnets, other modules, such as the module for morphological dictionaries and the module for aligned texts, as well as available finite ...
... (Figure 3). For inflected compound constituents additional information is needed: the lemma, its inflectional class code, as well as the list of grammatical categories of the form that appears in the compound lemma. For example, in the compound crno-beli film (black and white movie), the lemma for ...
... shouldn’t be altered. Similarly, when a dictionary type file is transformed, only lemmas and word forms are converted, not the part of speech and grammatical codes. The user can choose a conversion Perl or awk script suitable for the specific file type, or produce his/her own script easily. The module ...
... research projects [6]. PWN was formalized as a semantic network of concepts, abstract ideas or mental symbols that denote objects in a given category or class of entities, interactions, phenomena, or relationships between them. In PWN, concepts are lexicalized by one or more synonymous English ...
Ivan Obradović, Ranka Stanković. "Wordnet Development Using a Multifunctional Tool" in Proceedings of the International Workshop Computer Aided Language Processing (CALP) '2007, Borovets, Bulgaria, September 2007, - (2007)
-
Definition of groundwater genesis and circulation conditions of the complex hydrogeological karst system Mlava–Belosavac–Belosavac-2 (eastern Serbia)
Ljiljana Vasić, Saša Milanović, Zoran Stevanović, Laszlo Palcsu. "Definition of groundwater genesis and circulation conditions of the complex hydrogeological karst system Mlava–Belosavac–Belosavac-2 (eastern Serbia)" in Carbonates and Evaporites, Springer Science and Business Media LLC (2020). https://doi.org/10.1007/s13146-020-00550-3
-
Integrative GHG Assessment in Oil and Gas Industry
Reducing greenhouse gas emissions is one of the main targets of national strategies in European countries. As a main contributor to emissions, the energy sector is recognized as the most promising to apply measures and actions aimed to decrease GHG emissions. The Oil and Gas industry as a significant contributor to global greenhouse gas emissions is facing a growing need for estimating, mitigating, and reducing the impact of their operations on the atmosphere to stay competitive in a newly ...
... the exploration, production, processing, and delivery of fossil fuels to users. Methane leaks being the major source of GHG emissions under this category. Commonly used methodologies in estimating fugitive emissions in O&G operations include: • Direct Measurement: Requires usage of instrumentation ...
Aleksandar Mirković, Marija Živković, Stevan Đenadić, Darja Lubarda, Chinedu Anyanwa. "Integrative GHG Assessment in Oil and Gas Industry" in Energija, ekonomija, ekologija (2023). https://doi.org/10.46793/EEE23-1.51M
-
Geochemical evaluation of dolostone deposits in Montenegro: Implications for potential industrial applications
Darko Bozovic, Vladimir Simic, Dragan Radulovic, Slobodan Radusinovic, Vesna Matovic, Anja Terzic (2024)
This study presents a unique model for assessing the dependability of continuous parts of combined systems in open-pit mining through the application of fuzzy logic. Continuous sub-systems as part of the combined system of coal exploitation in surface mines have the basic function of ensuring safe operation, high capacity with high reliability, and low costs. These subsystems are usually part of the thermal power plant’s coal supply system and ensure stable fuel supply. The model integrates various independent partial ...
primary raw materials, mineralogy, physical-mechanical properties, technological properties, materials science
Darko Bozovic, Vladimir Simic, Dragan Radulovic, Slobodan Radusinovic, Vesna Matovic, Anja Terzic. "Geochemical evaluation of dolostone deposits in Montenegro: Implications for potential industrial applications" in Science of Sintering, Bor, August 2024, National Library of Serbia (2024). https://doi.org/10.2298/SOS240701029B
-
Using Metadata For Content Indexing Within An OER Network
Ranka Stanković, Olivera Kitanović, Ivan Obradović, Roberto Linzalone, Giovanni Schiuma, Daniela Carlucci (2014)
... "lesson", "module", "monitoring" and "evaluation techniques", "policy brief", "portal", "promotional material", or "reference material". The Lifecycle category describes the history and current state of a learning object. Lifecycle fields, version and status are taken from the LOM Standard. Version ...
Ranka Stanković, Olivera Kitanović, Ivan Obradović, Roberto Linzalone, Giovanni Schiuma, Daniela Carlucci. "Using Metadata For Content Indexing Within An OER Network" in Proceedings of the Fifth International Conference on e-Learning, eLearning 2014, September 2014, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2014)
-
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis
This paper presents a model that supports the collection, preparation, metadata description, management and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied in a web portal that gathers various texts from the journal of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the “Tara” and “Reiss” conferences, and some doctoral dissertations related to this research area. After text processing, a corpus containing over 5,500 pages of plain text was created and ...
... Morphological search includes searching for all inflected forms of the specified word, which are retrieved from SrpMD (Serbian morphological dictionary). For nouns, grammatical forms include case and number, for example for kuća (Eng. house): kuće, kućama, kući, etc.; for adjectives additionally comparison; for verbs person ...
... process). Further classification is possible, so categories can contain subcategories, to achieve better organisation of digital objects. For each category or subcategory, it is possible to define a specific collection that will display the entire content of the collection. Apart from navigation and ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
-
Development of a petrographic classification system for organic particles affected by self-heating in coal waste. (An ICCP Classification System, Self-heating Working Group – Commission III)
M. Misz-Kennan, J. Kus, D. Flores, C. Avila, Z. Büçkün, N. Choudhury, K. Christanis, J.P. Joubert, S. Kalaitzidis, A.I. Karayigit, M. Malecha, M. Marques, P. Martizzi, J.M.K. O'Keefe, W. Pickel, G. Predeanu, S. Pusz, J. Ribeiro, S. Rodrigues, A.K. Singh, I. Suárez-Ruiz, I. Sýkorová, N.J. Wagner, D. Životić (2020)
Self-heating of coal waste is a major problem in the leading coal-producing and consuming countries, independent of the recent or past coal exploitation history. The phenomenon of self-heating is dependent on many factors such as the properties of organic matter (maceral composition and rank), moisture and pyrite content, climate effects, and storage conditions (shape of the dump or compaction of the coal waste). Once deposited, coal waste undergoes oxidation, which can lead to self-heating with the overall temperatures ...
M. Misz-Kennan, J. Kus, D. Flores, C. Avila, Z. Büçkün, N. Choudhury, K. Christanis, J.P. Joubert, S. Kalaitzidis, A.I. Karayigit, M. Malecha, M. Marques, P. Martizzi, J.M.K. O'Keefe, W. Pickel, G. Predeanu, S. Pusz, J. Ribeiro, S. Rodrigues, A.K. Singh, I. Suárez-Ruiz, I. Sýkorová, N.J. Wagner, D. Životić. "Development of a petrographic classification system for organic particles affected by self-heating in coal waste. (An ICCP Classification System, Self-heating Working Group – Commission III)" in International Journal of Coal Geology, Elsevier BV (2020). https://doi.org/10.1016/j.coal.2020.103411
-
An Integrated Environment for Management and Exploitation of Linguistic Resources
Ranka Stanković, Ivan Obradović (2009)
... the inflectional class A2 with grammatical categories: aefs1g (positive, same written form, feminine, singular, nominative, both animate and non-animate), and “obaveza” is a noun belonging to the inflectional class N600 with corresponding grammatical categories: fs1q (feminine, ...
... contains data related to simple word forms that are parts of the compound (inflectional class code of each simple lemma, the set of its grammatical categories, etc). Finally, the module for management of dictionaries allows access to an editor of regular expressions, namely, ...
... are applied in the order they are listed. Conditions defined within each rule fall into two types: conditions of the first type specify grammatical categories of compound components and they usually apply to the components that inflect, whereas conditions belonging to the second ...
Ranka Stanković, Ivan Obradović. "An Integrated Environment for Management and Exploitation of Linguistic Resources" in Proceedings of the International Multiconference on Computer Science and Information Technology, Computational Linguistics – Applications Workshop (CLA09), Mrągowo, Poland, October 2009, Piscataway : IEEE (2009)
-
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++
Branislava Šandrih, Ranka Stanković (2020)
Terminology develops rapidly in science, industry and many research fields. Most often, the “lingua franca” of these fields is English. As a consequence, in many fields domain terms are coined in English and later translated into other languages. In this paper we present an approach to the automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language and a tool for chunk alignment. We examine the performance of the method on the domain ...
... previously harmonised to the best possible extent. For example, the grammatical category codes in the Serbian dictionary are a/b/c, for the positive/comparative/superlative forms. The Unitex English dictionary does not have a code for the positive, while the codes for the comparative and superlative are ...
... inflected forms with grammatical categories we used the English morphological dictionary from the Unitex distribution and the MULTEXT-East English lexicon. 4. In the final step Serbian and English inflected word forms were aligned taking into account the corresponding grammatical codes, which were previously ...
... graph output consists of 4 values for each recognised MWU, separated by “;”: graph label (grf04a or grf04b), followed by a label that indicates grammatical number (sin or plu); followed by recognised form (n.INFLECTED p np or n.INFLECTED ng1 ng2) and lemmatised inflective component followed by constant ...
Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
-
Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking
The paper presents the results of research on the preparation of parallel corpora, focusing on their transformation into RDF graphs using the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study, as well as of the process of part-of-speech tagging, lemmatization and named entity recognition (NER). We then describe named entity linking (NEL), the conversion of the data into RDF and the incorporation of NIF annotations. The NIF files produced were evaluated by exploring a triplestore with SPARQL queries. Finally, we discuss the linking of Linked ...
parallel corpora, named entity linking, named entity recognition, NER, NEL, linked data, NIF, Wikidata
Ranka Stanković, Milica Ikonić Nešić, Olja Perisic, Mihailo Škorić, Olivera Kitanović. "Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
Моделирање дисперзије прашине у фази одлагања флотацијске јаловине у руднику “Шупља стијена”
Вања Живановић (2024)
Mining, as one of the key industries in many countries of the world, plays an important role in creating economic value, but at the same time puts great pressure on the environment. Ore exploitation and processing often involve the use of various chemicals, which can pollute soil, water and air. Environmental protection is becoming an increasingly important issue in a world facing challenges such as climate change, land degradation and loss of biodiversity. Modelling the dispersion of environmental pollution is a process in which mathematical models are used ...
... constants (Table 3.1). [5] Table 3.1. Regression constants for Martin's curves for rural areas [table: constants by stability category A to D, for x < 1 km and for x > 1 km] ...
... dispersion. The McElroy-Pooler regression equation is given in Table 3.2. [5] Table 3.2. McElroy-Pooler curves for urban areas [table: expressions for σy (m) and σz (m) by stability category A to D] ...
Вања Живановић. Моделирање дисперзије прашине у фази одлагања флотацијске јаловине у руднику “Шупља стијена”, 2024
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...
... with the same meaning). In the example Testovi za AM, A1 I A2 kategoriju su vrhunska stvar koja može da ti se desi u životu, ‘Tests for AM, A1 and A2 category are the best thing that can happen to you in your life’ although some abbreviations were not identified, the tweet was identified as positive. Likewise ...
Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA: ACM (2017). https://doi.org/