linis (1185431), page 3


In our case the share of ±1 class matches is 93.0%, which is comparable to Thelwall’s result of 96.9% (Thelwall et al., 2010). Negative classes are predicted better than positive ones (95% and 59% for the ‘−1’ and ‘−2’ classes vs. 82% and 19% for the ‘+1’ and ‘+2’ classes). As can also be seen, moderate classes are predicted much better than extreme classes, which are very small, while the dominant ‘0’ class yields 99.6% of ±1 class matches.

SA systems for Russian use different evaluation techniques. The closest to our case was the ROMIP SA competition held on texts from political news and from blogs containing customer opinions (Chetviorkin, Loukachevitch, 2013). As sentiment lexicons are domain-sensitive, it would be unfair to test our lexicon directly on texts of a different type and to compare it with approaches developed specifically for that type. It would be equally unfair to apply the ROMIP methods to our collection.
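The share of ±1 class matches reported earlier in this section can be illustrated with a short Python sketch. It assumes that both the human annotation and the lexicon prediction are integers on the five-point scale from −2 to +2, and that per-class shares are grouped by the human (gold) label; the text does not spell out either detail. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def within_one_share(gold, predicted):
    """Share of texts whose predicted class differs from the gold class by at most 1."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("no labels given")
    hits = sum(abs(g - p) <= 1 for g, p in zip(gold, predicted))
    return hits / len(gold)


def within_one_share_by_class(gold, predicted):
    """The same share, computed separately for every gold class (assumed grouping)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += abs(g - p) <= 1
    return {c: hits[c] / totals[c] for c in sorted(totals)}


# Toy data, invented for illustration only (not taken from the paper).
gold = [-2, -1, -1, 0, 0, 0, 1, 2]
pred = [0, -1, 0, 0, 1, 0, -1, 1]

overall = within_one_share(gold, pred)
by_class = within_one_share_by_class(gold, pred)
print(overall)
print(by_class)
```

On a real collection, small extreme classes contribute few texts to each per-class share, so those shares are much less stable than the overall figure.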

We therefore performed an indirect comparison of the results, using the same quality evaluation methodology as ROMIP. Its best participants in the three-class blog classification task exceeded their baseline by 12–27% in terms of recall and by 5–29% in terms of precision. In the news classification task the respective values were 23–28% and 43–49%. Having converted our data into three classes (positive, negative and neutral), we calculated our baseline, precision and recall (see Table 3).

Table 3. Three-class classification quality

                     Our lexicon   Baseline   Difference
Recall (macro)       0.43          0.33       0.10
Precision (macro)    0.44          0.18       0.26

The quality of our lexicon is comparable to that of the ROMIP approaches used in the blog classification task and is lower than the quality reached for news. It should be noted that the class distribution of the ROMIP news collection was much more balanced (Panicheva, 2013) than that of both its blog collection and our sample.
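The quantities in Table 3 can be illustrated with a minimal Python sketch of macro-averaged precision and recall against a baseline. Two points are assumptions rather than statements from the text: that five-point scores are collapsed into three classes by their sign, and that the baseline assigns every text to the most frequent class. The second assumption is at least consistent with the table, since such a baseline always gives a macro recall of 1/3 ≈ 0.33 for three classes. All names and the toy data are hypothetical.

```python
from collections import Counter

CLASSES = ("negative", "neutral", "positive")


def to_three_classes(score):
    """Collapse a five-point score (-2..+2) by its sign (assumed conversion rule)."""
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


def macro_precision_recall(gold, predicted, classes=CLASSES):
    """Unweighted mean of per-class precision and per-class recall."""
    precisions, recalls = [], []
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, predicted))
        predicted_c = sum(p == c for p in predicted)
        gold_c = sum(g == c for g in gold)
        # A class that is never predicted (or never occurs) contributes 0.
        precisions.append(tp / predicted_c if predicted_c else 0.0)
        recalls.append(tp / gold_c if gold_c else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


def majority_baseline(gold):
    """Assumed baseline: label every text with the most frequent gold class."""
    majority = Counter(gold).most_common(1)[0][0]
    return [majority] * len(gold)


# Toy data, invented for illustration only (not taken from the paper).
gold = [to_three_classes(s) for s in [-2, -1, 0, 0, 0, 0, 1, 0, -1, 2]]
pred = [to_three_classes(s) for s in [-1, 0, 0, 0, -1, 0, 1, 0, -1, 0]]

precision, recall = macro_precision_recall(gold, pred)
base_precision, base_recall = macro_precision_recall(gold, majority_baseline(gold))
print("difference in precision:", precision - base_precision)
print("difference in recall:", recall - base_recall)
```

Under this baseline, macro precision equals the majority-class share divided by the number of classes, so it falls as the sample becomes more balanced.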

The less balanced class distribution of the blog collections has made the task of exceeding the baseline more difficult in blog SA. In contrast to most ROMIP methods, our lexicon is publicly available and may be improved by the research community.

An Opinion Word Lexicon and a Training Dataset for Russian Sentiment Analysis of Social Media

5. Conclusion and future research

We have presented a lexicon for sentiment analysis of political and social Russian-language blogs. Its quality is comparable to the results obtained for English-language Twitter and for Russian-language blogs with customer opinions. We have also described the results of word and text annotation based on a crowdsourcing approach. The lexicon and the annotated collection are publicly available at our website linis-crowd.org, which allows further crowdsourcing of sentiment markup. This web resource is aimed at the widest research community. While the lexicon can already be used by social scientists, the collection may serve as a benchmark for testing new sentiment instruments. In particular, we are now using it to train machine learning SA algorithms that should help increase the quality of SA.

We also plan to improve the lexicon by replicating our research on a collection of blog comments, which are potentially much more emotional.

6. Acknowledgements

This work was supported by the Russian Foundation for Humanities, project ‘Development of a publicly available database and a crowdsourcing website for testing sentiment analysis instruments’, Grant No 14-04-1203.

References

1. Alexeeva S., Koltsova E., Koltcov S. (2015) Linis-crowd.org: A lexical resource for Russian sentiment analysis of social media [Linis-crowd.org: lexicheskij resurs dl’a analiza tonal’nosti sotsial’no-politicheskix tekstov], Computational Linguistics and computational ontologies: Proceedings of the XVIII joint Conference “Internet and modern society (IMS-2015)” [Kompyuternaya lingvistika i vyichislitelnyie ontologii: sbornik nauchnyih statey. Trudyi XVIII ob’edinennoy konferentsii «Internet i sovremennoe obschestvo» (IMS-2015)], St. Petersburg, pp. 25–34.
2. Bodrunova S., Koltsov S., Koltsova O., Nikolenko S., Shimorina A. (2013) Interval Semi-Supervised LDA: Classifying Needles in a Haystack, Proceedings of the 12th Mexican International Conference on Artificial Intelligence (MICAI 2013) Part I: Advances in Artificial Intelligence and Its Applications, Berlin: Springer Verlag, pp. 265–24.
3. Chetviorkin I. I., Braslavski P. I., Loukachevitch N. V. (2012) Sentiment Analysis Track at ROMIP 2011, Proceedings of International Conference Dialog, pp. 739–746.
4. Chetviorkin I., Loukachevitch N. (2012) Extraction of Russian Sentiment Lexicon for Product Meta-Domain, Proceedings of COLING 2012: Technical Papers, pp. 593–610.
5. Chetviorkin I., Loukachevitch N. (2013) Sentiment Analysis Track at ROMIP 2012, Proceedings of International Conference Dialog, Vol. 2, pp. 40–50.

Koltsova O. Yu., Alexeeva S. V., Kolcov S. N.

6. Esuli A., Sebastiani F. (2006) SentiWordNet: A publicly available lexical resource for opinion mining, Proceedings of 5th International Conference on Language Resources and Evaluation (LREC), Genoa, pp. 417–422.
7. Godbole N., Srinivasaiah M., Skiena S. (2007) Large Scale Sentiment Analysis for News and Blogs, ICWSM’2007, Boulder, Colorado, USA.
8. Hong Y., Kwak H., Baek Y., Moon S. (2013) Tower of Babel: a crowdsourcing game building sentiment lexicons for resource-scarce languages, Proceedings of the 22nd International World Wide Web Conference (WWW), pp. 549–556.
9. Hsueh P., Melville P., Sindhwani V. (2009) Data quality from crowdsourcing: a study of annotation selection criteria, Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing, Boulder, Colorado, pp. 27–35.
10. Hu M., Liu B. (2004) Mining and summarizing customer reviews, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), Seattle, WA, pp. 168–177.
11. Koltsova O., Koltcov S., Alexeeva S. (2014) Do ordinary bloggers really differ from blog celebrities? Proceedings of WebSci ’14 ACM Web Science Conference, Bloomington, IN, USA, NY: ACM, pp. 166–170.
12. Koltsova O., Shcherbak A. (2015) ‘LiveJournal Libra!’: The political blogosphere and voting preferences in Russia in 2011–2012, New Media and Society, vol. 17, no. 10, pp. 1715–1732.
13. Ku L.-W., Liang Y.-T., Chen H.-H. (2006) Opinion Extraction, Summarization and Tracking in News and Blog Corpora, Proceedings of the AAAI-CAAW’06.
14. Loukachevitch N. V., Blinov P. D., Kotelnikov E. V., Rubtsova Y. V., Ivanov V. V., Tutubalina E. (2015) SentiRuEval: testing object-oriented sentiment analysis systems in Russian, Proceedings of International Conference Dialog, Vol. 2.
15. Medhat W., Hassan A., Korashy H. (2014) Sentiment analysis algorithms and applications: a survey, Ain Shams Engineering Journal, Vol. 5, Issue 4, pp. 1093–1113.
16. Mihalcea R., Banea C., Wiebe J. (2007) Learning multilingual subjective language via cross-lingual projections, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, pp. 976–983.
17. Mohammad S., Dorr B., Hirst G., Turney P. (2011) Measuring degrees of semantic opposition, Technical report, National Research Council Canada.
18. Mohammad S. M., Turney P. D. (2013) Crowdsourcing a word-emotion association lexicon, Computational Intelligence, Vol. 29, no. 3, pp. 436–465.
19. Morkovkin V. V. (2003) Explanatory dictionary of Russian language: structural words: prepositions, conjunctions, particles, interjections, parentheses, pronouns, numbers, connections [Ob’jasnitelnyj slovar’ russkogo jazyka: Structurnyje slova: predlogi, sojuzy, chastitsy, mezhdometija, vvodnyje slova, mestoimenija, chislitelnyje, svjazannyje slova], Astrel, Moscow.
20. Nikolenko S., Koltcov S., Koltsova O. (2015) Topic Modeling for Qualitative Studies. Journal of Information Science (R&R).
21. Pang B., Lee L. (2004) A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts, 42nd Meeting of the Association for Computational Linguistics[C] (ACL-04), pp.
