Serbian NER&Beyond: The Archaic and the Modern Intertwinned
- Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić
- Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications
- Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov, Ivelina Nikolova-Koleva
- 2021
- U ovom radu predstavljamo srpski književni korpus koji se razvija pod okriljem COST Akcije „Distant Reading for European Literary History” CA16204. Koristeći ovaj korpus romana napisanih pre više od jednog veka, razvili smo i učinili javno dostupnim Sistem za prepoznavanje imenovanih entiteta (NER) obučen da prepozna 7 različitih tipova imenovanih entiteta, sa konvolucionom neuronskom mrežom (CNN), koja ima F1 rezultat od ≈91% na test skupu podataka. Ovaj model je dalje ocenjen na posebnom skupu podataka za evaluaciju. Završavamo poređenje razvijenog modela sa postojećim, nakon čega sledi diskusija o prednostima i nedostacima oba modela.
- In this work, we present a Serbian literary corpus that is being developed under the umbrella of the “Distant Reading for European Literary History” COST Action CA16204. Using this corpus of novels written more than a century ago, we have developed and made publicly available a Named Entity Recognizer (NER) trained to recognize 7 different named entity types, with a Convolutional Neural Network (CNN) architecture, achieving an F1 score of ≈91% on the test dataset. This model has been further assessed on a separate evaluation dataset. We wrap up with a comparison of the developed model with the existing one, followed by a discussion of the pros and cons of both models.
- 1252
- 1260
- 10.26615/978-954-452-072-4_141
- 978-954-452-072-4
- Creative Commons – Attribution-Share Alike 4.0 International
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). М33
