\chapter{Related Work}
% rewrite py to reflect how the new contributor indicator works https://meta.stackexchange.com/questions/314472/what-are-the-exact-criteria-for-the-new-contributor-indicator-to-be-shown ; change date = 2018-08-21T21:04:49.177
%read template notes again and adjust
% askubuntu analysis; look at stackexchange.com/sites to see what else to analyze
This chapter is divided into two parts. The first part explains what StackExchange is, how it has developed since its inception, and how it works. The second part surveys previous and related work. %TODO more

% first look at how StackExchange works in the background section
\section{Background}
StackExchange\footnote{\url{https://stackexchange.com}} is a community question answering (CQA) platform where users can ask and answer questions, accept an answer as the appropriate solution to a question, and up- or downvote questions and answers. StackExchange uses a community-driven knowledge creation process by allowing everyone who registers to participate in the community. Invested users also gain access to moderation tools to help maintain the vast community. All posts on the StackExchange platform are publicly visible, allowing non-users to benefit from the community as well. Posts are also accessible to web search engines, so users can find questions and answers easily with a simple web search. StackExchange keeps an archive of all questions and answers posted, creating a knowledge archive for future visitors to look into.

% all sentiment methods + vader
\subsection{Sentiment analysis}

Researchers have put forth many tools for sentiment analysis over the years. Each tool has its advantages and drawbacks, and there is no silver-bullet solution that fits all research questions. Researchers have to choose the tool that best fits their needs, and they need to be aware of the drawbacks of their choice. Sentiment analysis poses three important challenges:

\begin{itemize}
\item Coverage: detecting as many features as possible in a given piece of text
\item Weighting: assigning one or more values (value range and granularity) to detected features
\item Creation: creating and maintaining a sentiment analysis tool is a time- and labor-intensive process
\end{itemize}

% many different methods
% have to choose tool depending on task
% beware of the drawbacks
%challenges (vader)
% - coverage (e.g. of lexical features, important in microblog texts)
% - sentiment intensity (some of the following tools ignore intensity completely (just -1 or 1))
% - creating a human-validated gold standard lexicon is very time consuming/labor intensive, with sentiment valence scores, feature detection and context awareness,

In general, sentiment analysis tools can be grouped into two categories: hand-crafted and automated (machine learning).

%distinction into 2 groups: hand-crafted and automated tools

% polarity-based -> binary
% valence-based -> continuous

%%%%% hand-crafted - TODO order by sophistication, sentiwordnet last
%lexicon generation very time consuming
%generally fast sentiment computation
%relatively easy to update (added words, ...)
%comprehensible results

Creating hand-crafted tools is often a huge undertaking. They depend on a hand-crafted lexicon (a gold standard, human-curated lexicon) that maps features of a text to a value. In the simplest case, such a lexicon just maps a word to a binary value: -1 (negative word) or 1 (positive word). However, most tools use a more complex lexicon to capture more features of a piece of text. By design, hand-crafted tools allow fast computation of the sentiment of a given piece of text. Hand-crafted lexicons are also easy to update and extend. Furthermore, hand-crafted tools produce easily comprehensible results. The following paragraphs explain some of the analysis tools in this category.
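
To illustrate the general idea, the following Python sketch scores a text against a tiny polarity lexicon; the lexicon is a made-up toy example for illustration, not the word list of any tool discussed below.

\begin{verbatim}
# Toy polarity lexicon: word -> -1 (negative) or +1 (positive).
LEXICON = {"good": 1, "excellent": 1, "bad": -1, "terrible": -1}

def polarity_score(text):
    """Sum the lexicon values of all known words in the text."""
    return sum(LEXICON.get(w, 0) for w in text.lower().split())

print(polarity_score("This pizza is good"))       # 1
print(polarity_score("This pizza is excellent"))  # 1 (no intensity)
\end{verbatim}
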
%liwc (Linguistic Inquiry and Word Count) \cite{pennebaker2001linguistic,pennebakerdevelopment}, 2001 %TODO refs wrong?
% - well verified
% - ignores acronyms, initialisms, emoticons, or slang, which are known to be important for sentiment analysis of social text (vader)
% - ca 4500 words (uptodate?), ca 400 pos words, ca 500 neg words, lexicon proprietary (vader)
% - TODO list some application examples
% ...

Linguistic Inquiry and Word Count (LIWC) \cite{pennebaker2001linguistic,pennebakerdevelopment} is one of the more popular tools. Due to its widespread usage, LIWC is well verified, both internally and externally. Its lexicon consists of about 4500 words, each categorized into one or more of 76 defined categories. Approximately 400 words carry a positive and 500 words a negative emotion. %TODO ref for 400 500, list example see todo
However, the lexicon is proprietary, so .... %TODO ref or remove

LIWC also has some drawbacks: for instance, it does not capture acronyms, emoticons, or slang words. Furthermore, LIWC's lexicon uses a polarity-based approach, meaning that it cannot distinguish between the sentences ``This pizza is good'' and ``This pizza is excellent''. \emph{Good} and \emph{excellent} are both in the category of positive emotion, but LIWC does not distinguish between single words in the same category.
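
A valence-based lexicon, by contrast, attaches an intensity to each word. A minimal sketch, with made-up valence values chosen only for illustration:

\begin{verbatim}
# Toy valence lexicon: word -> continuous intensity (made-up values).
VALENCES = {"good": 1.9, "excellent": 3.4, "bad": -2.5, "terrible": -3.4}

def valence_score(text):
    return sum(VALENCES.get(w, 0.0) for w in text.lower().split())

print(valence_score("This pizza is good"))       # 1.9
print(valence_score("This pizza is excellent"))  # 3.4 (intensity captured)
\end{verbatim}
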
%General Inquirer (GI) \cite{stone1966general} 1966 TODO ref wrong?
% - 11k words, 1900 pos, 2300 neg, all approx (vader)
% - very old (1966), continuously refined, still in use (vader)
% - misses lexical feature detection (acronyms, ...) and sentiment intensity (vader)

General Inquirer (GI) \cite{stone1966general} is one of the oldest sentiment tools still in use. It was originally designed in 1966 and has been continuously refined; its lexicon now consists of about 11000 words, of which roughly 1900 are rated positive and 2300 negative. %TODO how does it work
Like LIWC, GI uses a polarity-based lexicon and is therefore not able to capture sentiment intensity. Also, GI does not recognize lexical features such as acronyms, initialisms, etc. %TODO ref
%Hu-Liu04 \cite{hu2004mining,liu2005opinion}, 2004
% - focuses on opinion mining, find features in multiple texts (eg reviews) and rate the opinion about the feature, pos/neg binary classification (hu2004mining)
% - does not summarize opinion texts but summarizes ratings (hu2004mining)
% - 6800 words, 2000 pos, 4800 neg, all approx values (vader)
% - better suited for social media text, misses emoticons and acronyms/initialisms (vader)
% - bootstrapped from wordnet (wellknown english lexical database) (vader, hu2004mining)

%TODO refs
Hu-Liu04 \cite{hu2004mining,liu2005opinion} is an opinion mining tool. It searches for features in multiple pieces of text, for instance product reviews, and rates the opinion about each feature using a binary classification \cite{hu2004mining}. Crucially, Hu-Liu04 does not summarize the texts themselves but summarizes the ratings of the opinions about features mentioned in the texts. Hu-Liu04 was bootstrapped from WordNet \cite{TODO} and then extended further. It now uses a lexicon consisting of about 6800 words, of which about 2000 carry a positive and 4800 a negative sentiment. %TODO ref
This tool is, by design, better suited for social media texts, although it too misses emoticons, acronyms, and initialisms.

%SenticNet \cite{cambria2010senticnet} 2010
% - concept-level opinion and sentiment analysis tool (vader)
% - sentic mining: combination of AI and Semantic Web (vader, senticnet)
% - graphmining and dimensionality reduction (vader, senticnet)
% - uses conceptnet: directed graph of concepts and relations (TODO reference)
% - lexicon: 14250 common-sense concepts, with polarity scores [-1,1] continuous, and many other values (vader)
% - TODO list some concepts (vader) or maybe not

SenticNet \cite{cambria2010senticnet} is also an opinion mining tool, but it focuses on concept-level opinions. SenticNet is based on a paradigm called \emph{Sentic Mining}, which combines techniques from artificial intelligence and the Semantic Web, specifically graph mining and dimensionality reduction. SenticNet's lexicon consists of about 14250 common-sense concepts, each rated on several scales, one of which is a polarity score with a continuous range from -1 to 1. This continuous range of polarity scores enables SenticNet to be aware of sentiment intensity.

%Word-Sense Disambiguation (WSD) \cite{akkaya2009subjectivity}, 2009
% - TODO
% - not a sentiment analysis tool per se but can be combined with a sentiment analysis tool to distinguish multiple meanings of a word (vader, akkaya2009subjectivity)
% - a word can have multiple meanings, pos neu neg depending on context (vader, akkaya2009subjectivity)
% - derive meaning from context -> disambiguation (vader, akkaya2009subjectivity)
% - distinguish subjective and objective word usage, a sentence can contain only negative words used in objective ways -> sentence not negative, TODO example sentence (akkaya2009subjectivity)

%ANEW (Affective Norms for English Words) \cite{bradley1999affective} 1999
% - tool introduced to compare and standardize research
% - lexicon: 1034 words, ranked by pleasure, arousal, and dominance (vader, bradley1999affective)
% - words get value 1-9 (neg-pos, continuous), 5 neutral (TODO maybe list word examples with associated value) (vader, bradley1999affective)
% - therefore captures sentiment intensity (vader, bradley1999affective)
% - misses lexical features (e.g. acronyms, ...) (vader)

Affective Norms for English Words (ANEW) \cite{bradley1999affective} is a sentiment analysis tool that was introduced to standardize research and offer a way to compare results. Its lexicon is fairly small and consists of only 1034 words, which are ranked by pleasure, arousal, and dominance. ANEW uses a continuous scale from 1 to 9, where 1 represents the negative end, 9 the positive end, and 5 is considered neutral. With this design, ANEW is able to capture sentiment intensity. However, ANEW still misses lexical features, for instance acronyms.
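
Because ANEW's ratings live on a 1 to 9 scale, one may want to rescale them to a symmetric range before combining them with scores from other tools; a minimal sketch (the example rating is illustrative, not an actual ANEW entry):

\begin{verbatim}
def anew_to_symmetric(rating):
    """Map an ANEW-style rating from [1, 9] to [-1, 1]; 5 maps to 0."""
    return (rating - 5.0) / 4.0

print(anew_to_symmetric(5.0))  #  0.0 (neutral)
print(anew_to_symmetric(8.2))  #  0.8 (made-up, strongly positive rating)
print(anew_to_symmetric(1.0))  # -1.0 (most negative)
\end{verbatim}
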
%wordnet \cite{miller1998wordnet} 1998, TODO maybe exclude or just mention briefly in sentiwordnet
% - well-known English lexical database (vader)
% - group synonyms (synsets) together (vader)
% - TODO

%sentiwordnet \cite{baccianella2010sentiwordnet}
% - extension of wordnet (vader, baccianella2010sentiwordnet)
% - 147k synsets (vader),
% - lexicon very noisy, most synset not pos or neg but mix (vader)
% - misses lexical features (vader)

SentiWordNet \cite{baccianella2010sentiwordnet} is an extension of WordNet and adds ... %TODO whats the difference
Its lexicon consists of about 147000 synsets, each having three values (positive, neutral, negative) attached to it. Each value has a continuous range from 0 to 1, and the sum of the three values is constrained to be 1. The values of each synset are calculated by a mix of semi-supervised algorithms, mostly propagation methods and classifiers. This distinguishes SentiWordNet from the previously explained sentiment tools, whose lexica are created exclusively by humans (except for simple mathematical operations, for instance averaging of values). Therefore, SentiWordNet's lexicon is not considered a human-curated gold standard. Furthermore, the lexicon is very noisy, and most synsets are neither purely positive nor negative but a mix of both \cite{hutto2014vader}. Moreover, SentiWordNet misses lexical features, for instance acronyms, initialisms, and emoticons.
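
SentiWordNet's scores can be inspected, for example, through the corpus reader shipped with the Python NLTK library; a minimal sketch, assuming the sentiwordnet and wordnet corpora have been downloaded:

\begin{verbatim}
import nltk
nltk.download("sentiwordnet")
nltk.download("wordnet")
from nltk.corpus import sentiwordnet as swn

# Each synset carries a positive, a negative, and an objective (neutral)
# score; the three values sum to 1.
for s in swn.senti_synsets("good", "a"):
    print(s.synset.name(), s.pos_score(), s.neg_score(), s.obj_score())
\end{verbatim}
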
%%%%% automated (machine learning)
%often require large training sets, compare to creating a lexicon (vader)
%training data must represent as many features as possible, otherwise feature is not learned, often not the case (vader)
%training data should be unbiased, or else wrong learning (NOT VADER)
%very cpu and memory intensive, slow, compare to lexicon-based (vader)
%derived features not comprehensible as a human (black-box) (vader)
%generalization problem (vader)
%updating (extend/modify) hard (e.g. new domain) (vader)

Because hand-crafting sentiment analysis tools requires a lot of effort, researchers turned to approaches that offload the labor-intensive part to machine learning (ML). However, this creates a new challenge: gathering a \emph{good} data set to feed the machine learning algorithm during training. Firstly, a \emph{good} data set needs to represent as many features as possible, or the algorithm will not recognize them. Secondly, the data set has to be unbiased and representative of the data from which it is drawn. It has to represent each feature in an appropriate amount, or the algorithm may discriminate against a feature in favor of other, more represented features. These requirements are hard to fulfill and often are not met \cite{hutto2014vader}. After a data set is acquired, a model has to be learned by the ML algorithm, which is, depending on the complexity of the algorithm, a very computation- and memory-intensive process. After training is completed, the algorithm can predict sentiment values for new pieces of text it has never seen before. However, due to the nature of this approach, the results cannot easily be comprehended by humans, if at all. ML approaches also suffer from a generalization problem and therefore cannot be transferred to other domains without either accepting poor performance or updating the training data set to fit the new domain. Updating (extending or modifying) the training data also requires retraining from scratch.
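
As a small illustration of the bias concern, one can at least check how evenly the labels of a candidate training set are distributed before training; the labels below are a made-up example:

\begin{verbatim}
from collections import Counter

# Made-up labels; a skew like this one would push a classifier
# toward the majority class.
labels = ["pos"] * 900 + ["neg"] * 100

counts = Counter(labels)
total = sum(counts.values())
for label, n in counts.items():
    print(label, n, "{:.0%}".format(n / total))  # pos 900 90%, neg 100 10%
\end{verbatim}
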
% naive bayes
% - simple (vader)
% - assumption: feature probabilities are independent of each other (vader)

The Naive Bayes (NB) classifier is one of the simplest ML algorithms. It uses Bayesian probability to classify samples, which requires the assumption that the probabilities of the features are independent of one another. %which often they are not because languages have certain structures of features.
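
A minimal sketch of such a classifier using the scikit-learn library; the tiny training set is made up for illustration and far too small for real use:

\begin{verbatim}
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy training data; real models need far larger corpora.
texts = ["great answer, thanks", "this is wrong and useless",
         "works perfectly", "terrible explanation"]
labels = ["pos", "neg", "pos", "neg"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["thanks, works great"]))  # likely ['pos']
\end{verbatim}
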
% Maximum Entropy
% - exponential model + logistic regression (vader)
% - feature weighting through not assuming independence as in naive bayes (vader)

Maximum Entropy (ME) is a more sophisticated algorithm that uses an exponential model and logistic regression. It distinguishes itself from NB by not assuming conditional independence of features, and it supports weighting features by their entropy.
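
Since ME classification corresponds to (multinomial) logistic regression, the pipeline from the previous sketch can be reused with only the classifier swapped; again an illustrative sketch on the same made-up toy data:

\begin{verbatim}
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Same made-up toy data as in the Naive Bayes sketch above.
texts = ["great answer, thanks", "this is wrong and useless",
         "works perfectly", "terrible explanation"]
labels = ["pos", "neg", "pos", "neg"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["thanks, works great"]))  # likely ['pos']
\end{verbatim}
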
%svm
%- mathematically demanding (vader)
%- separate datapoints using hyperplanes (vader)
%- long training period (other methods do not need training at all because lexica) (vader)

Support Vector Machines (SVMs) use a different approach. An SVM places data points in an $n$-dimensional space and separates them with hyperplanes ($(n-1)$-dimensional planes), so that data points fall into one of the two half-spaces created by the hyperplane. This approach is usually very memory- and computation-intensive, as each data point is represented by an $n$-dimensional vector, where $n$ denotes the number of trained features.
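
A minimal sketch of the separating-hyperplane idea, using scikit-learn on made-up two-dimensional feature vectors (so the hyperplane is just a line):

\begin{verbatim}
from sklearn.svm import LinearSVC

# Made-up 2-dimensional feature vectors, e.g. counts of positive and
# negative lexicon words per text, with their sentiment labels.
X = [[3, 0], [2, 1], [0, 2], [1, 3]]
y = ["pos", "pos", "neg", "neg"]

clf = LinearSVC()
clf.fit(X, y)
print(clf.coef_, clf.intercept_)  # parameters of the separating line
print(clf.predict([[4, 1]]))      # likely ['pos']
\end{verbatim}
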
%vader (Valence Aware Dictionary for sEntiment Reasoning) (rough overview) \cite{hutto2014vader}
% - 2014