wea_ondara
2021-01-27 21:54:13 +01:00
parent 41a08f812b
commit 08455c69fb

@@ -348,6 +348,10 @@ Maximum Entropy (ME) is a more sophisticated algorithm. It uses a an exponential
%- long training period (other methods do not need training at all because lexica) (vader)
Support Vector Machines (SVMs) take a different approach. An SVM places datapoints in an $n$-dimensional space and separates them with a hyperplane (an $(n-1)$-dimensional plane), so that each datapoint falls into one of the two half-spaces on either side of the hyperplane. This approach is usually very memory- and computation-intensive, as each datapoint is represented by an $n$-dimensional vector, where $n$ denotes the number of trained features.
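As a minimal sketch of this idea (the symbols $w$, $b$, and $x$ are introduced here purely for illustration and are not taken from the cited works), a linear SVM can be thought of as classifying an $n$-dimensional datapoint $x$ according to the side of the hyperplane on which it lies:
\[
  \hat{y}(x) = \mathrm{sign}\left( w^{\top} x + b \right),
\]
where $w$ is a weight vector with one component per feature, $b$ is an offset, and the hyperplane itself is the set of points satisfying $w^{\top} x + b = 0$.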
%general remarks, transition to vader
In general, ML approaches do not provide an improvement over hand-crafted lexicon approaches, as they only shift the time-intensive work to the collection of training data sets. Furthermore, lexicon-based approaches seem to have progressed further in terms of coverage and feature weighting. However, many tools are not specifically tailored to social media text analysis and lack coverage in feature detection.
%vader (Valence Aware Dictionary for sEntiment Reasoning)(grob) \cite{hutto2014vader}
% - 2014
% - detects acronyms, ...
@@ -356,6 +360,10 @@ Support Vector Machines (SVM) uses a different approach. SVM put datapoints in a
% - context awareness
% - disambiguation of words if they have multiple meanings (contextual meaning)
This shortcoming was addressed by \citeauthor{hutto2014vader}, who introduced a new sentiment analysis tool: Valence Aware Dictionary for sEntiment Reasoning (VADER)~\cite{hutto2014vader}. \citeauthor{hutto2014vader} acknowledged the problems that many tools have and designed VADER to overcome these shortcomings. Their aim was to introduce a tool that works well in the social media domain, provides good coverage of features occurring in that domain (acronyms, initialisms, slang, etc.), and is able to work with online streams (live processing) of texts. VADER is also able to distinguish between different meanings of words (word-sense disambiguation, WSD), and it takes sentiment intensity into account. These properties make VADER an excellent choice for analysing sentiment in the social media domain.
%The authors used a lexicon-based approach as performance was one of the most important requirements.
%general
%dep on sentiment lexicons, more info in vader 2.1 Sentiment Lexicons
%vader not binary (pos, neg) but 3 categories