Multilingual Opinion Mining
Author:
Supervisors: Montserrat Cuadros Oller (Vicomtech), Germán Rigau (University)
University: UPV/EHU
Date: 11.07.2017
Place: Donostia-San Sebastián
Every day, large volumes of text are generated across online media. Much of this text contains opinions about entities, products, services and so on. Given the growing need for automated means to analyse, process and exploit this information, sentiment analysis techniques have received a great deal of attention from industry and the scientific community over the past decade and a half. However, many of these techniques require supervised training on manually annotated examples, or other language resources tied to a specific language or application domain. This limits their applicability, since such resources and training examples are not easy to obtain. This thesis explores a series of methods for performing automatic text analyses in the context of sentiment analysis, including the automatic extraction of domain terms, of words expressing opinions, and of the sentiment polarity of those words (positive or negative). Finally, a method combining continuous word embeddings and topic modelling, inspired by the Latent Dirichlet Allocation (LDA) technique, is proposed and evaluated to obtain an aspect-based sentiment analysis (ABSA) system which needs only a few seed words to process texts from a given language or domain. In this way, adaptation to another language or domain is reduced to the translation of the corresponding seed words.
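To give a rough intuition of how a handful of seed words can drive polarity assignment, the following minimal sketch labels a word with the polarity of the seed words it is closest to in embedding space. It is not the method proposed in the thesis, which also involves LDA-inspired topic modelling. The vectors, the seed lists and the names `EMBEDDINGS`, `SEEDS`, `cosine` and `polarity` are all assumptions invented for illustration.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration (assumption);
# a real system would load embeddings trained on in-domain text.
EMBEDDINGS = {
    "excellent": [0.9, 0.1, 0.0],
    "terrible": [-0.9, 0.1, 0.0],
    "tasty": [0.8, 0.3, 0.1],
    "rude": [-0.7, 0.2, 0.3],
}

# Hypothetical seed words: one per polarity class.
SEEDS = {"positive": ["excellent"], "negative": ["terrible"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def polarity(word, embeddings=EMBEDDINGS, seeds=SEEDS):
    """Return the label whose seed words are, on average, most similar to `word`."""
    vec = embeddings[word]
    scores = {
        label: sum(cosine(vec, embeddings[s]) for s in words) / len(words)
        for label, words in seeds.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for w in ("tasty", "rude"):
        print(w, polarity(w))
```

Under these assumptions, moving to another language would only require replacing the entries in `SEEDS` with their translations and loading embeddings for that language; no annotated training examples are involved.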