This commit is contained in:
wea_ondara
2022-09-04 10:27:29 +02:00
parent a5e60a7335
commit 5154ed89e9
4 changed files with 38 additions and 4 deletions

View File

@@ -25,7 +25,7 @@ If these criteria improve after the change is introduced, the community is affec
%only when new contributor insicator is shown
\section{Vader}
To measure the effect on the sentiment of the change this thesis utilizes the Vader\cite{hutto2014vader} sentiment analysis tool. This decision is based on the performance in analyzing and categorizing microblog-like texts, the speed of processing, and the simplicity of use. Vader uses a lexicon of words, and rules related to grammar and syntax. This lexicon was manually created by \citeauthor{hutto2014vader} and is therefore considered a \emph{gold standard lexicon}. Each word has a sentiment value attached to it. Negative words, for instance, \emph evil, have negative values; good words, for instance, \emph brave, have positive values. The range of these values is continuous, so words can have different intensities, for instance, \emph bad has a higher value than \emph evil. This feature of intensity distinction makes Vader a valance-based approach.
However, just simply looking at the words in a text is not enough and therefore Vader also uses rules to determine how words are used in conjunction with other words. Some words can boost other words. For example, ``They did well.'' is less intense than ``They did extremely well.''. This works for both positive and negative sentences. Moreover, words can have different meanings depending on the context, for instance, ``Fire provides warmth.'' and ``Boss is about to fire an employee.'' This feature is called \emph{Word Sense Disambiguation}.