wip
@@ -1,10 +1,10 @@
\chapter{Method}
StackExchange introduced a \emph{new contributor} indicator to all communities on 21 August 2018 at 9 pm UTC \cite{post2018come}. This step is one of many that StackExchange took to make the platform and its members more welcoming towards new users. The indicator is shown to potential answerers in the answer text box of a question flagged as coming from a new contributor, as shown in Figure~\ref{newcontributor}. The indicator is added to a question if the question is the user's first contribution or if the user's first contribution (question or answer) was made less than 7 days earlier \cite{sonic2018what}. The indicator is then shown for 7 days from the creation date of the question. Note that a user may have been registered for a long time before posting their first question; that question still counts as coming from a new contributor. Also, if a user deletes all their existing contributions from the site and then creates a new question, this question will have the \emph{new contributor} indicator attached. The sole deciding factor for the indicator is the date and time of the first non-deleted contribution and the 7-day window that follows it.
StackExchange introduced a \emph{new contributor} indicator to all communities on 21 August 2018 at 9 pm UTC\footnote{\label{post2018come}\url{https://meta.stackexchange.com/questions/314287/come-take-a-look-at-our-new-contributor-indicator}}. This step is one of many that StackExchange took to make the platform and its members more welcoming towards new users. The indicator is shown to potential answerers in the answer text box of a question flagged as coming from a new contributor, as shown in Figure~\ref{newcontributor}. The indicator is added to a question if the question is the user's first contribution or if the user's first contribution (question or answer) was made less than 7 days earlier\footnote{\label{sonic2018what}\url{https://meta.stackexchange.com/questions/314472/what-are-the-exact-criteria-for-the-new-contributor-indicator-to-be-shown}}. The indicator is then shown for 7 days from the creation date of the question. Note that a user may have been registered for a long time before posting their first question; that question still counts as coming from a new contributor. Also, if a user deletes all their existing contributions from the site and then creates a new question, this question will have the \emph{new contributor} indicator attached. The sole deciding factor for the indicator is the date and time of the first non-deleted contribution and the 7-day window that follows it.
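The rule described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not StackExchange's actual implementation: the function names \texttt{is\_new\_contributor\_question} and \texttt{indicator\_visible} are hypothetical, and the handling of the exact 7-day boundary (strictly less than 7 days) is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Length of the new-contributor window as described in the text.
NEW_CONTRIBUTOR_WINDOW = timedelta(days=7)


def is_new_contributor_question(
    question_created: datetime,
    first_contribution: Optional[datetime],
) -> bool:
    """Return True if the question counts as coming from a new contributor.

    first_contribution is the creation time of the user's earliest
    non-deleted contribution (question or answer) before this question,
    or None if this question is the user's first contribution.
    """
    if first_contribution is None or first_contribution >= question_created:
        return True
    # Assumption: "less than 7 days ago" is a strict inequality.
    return question_created - first_contribution < NEW_CONTRIBUTOR_WINDOW


def indicator_visible(
    question_created: datetime,
    first_contribution: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if the indicator would be shown on the question at `now`."""
    if not is_new_contributor_question(question_created, first_contribution):
        return False
    # The indicator is shown for 7 days from the question's creation date.
    return question_created <= now < question_created + NEW_CONTRIBUTOR_WINDOW


if __name__ == "__main__":
    asked = datetime(2018, 9, 1, 12, 0, tzinfo=timezone.utc)
    print(is_new_contributor_question(asked, None))
    print(is_new_contributor_question(asked, asked - timedelta(days=30)))
```

Note that the registration date of the user does not appear anywhere in the sketch; only the time of the first non-deleted contribution matters.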
\begin{figure}
\centering\includegraphics[scale=0.47]{figures/new_contributor}
\caption{The answer box a potential answerer sees when viewing a question from a new contributor. \copyright{Tim Post, 2018, \url{https://meta.stackexchange.com/users/50049/tim-post}} in \cite{post2018come}}
\caption{The answer box a potential answerer sees when viewing a question from a new contributor. \copyright{Tim Post, 2018, \url{https://meta.stackexchange.com/users/50049/tim-post}}\footref{post2018come}}
\label{newcontributor}
\end{figure}
@@ -19,7 +19,7 @@ To measure the effectiveness of the change this thesis utilizes Vader, a sentime
% sentiment calculation via vaderlib, write whole paragraph and explain, also add ref to paper \cite{hutto2014vader}
\section{Data gathering and preprocessing}
StackExchange provides anonymized data dumps of all its communities on archive.org \cite{archivestackexchange}, free of charge, for researchers to investigate. These data dumps contain users, posts (questions and answers), badges, comments, tags, votes, and a post history containing all versions of posts. Each entry contains the necessary information, for instance, id, creation date, title, and body, as well as how the data is linked together (which user posted a question, answer, or comment). However, not all data entries are valid and usable in the analysis, for instance, questions or answers whose author is unknown, although this affects only a very small number of entries. The data therefore has to be cleaned before the actual analysis. Moreover, the answer texts are in HTML format and contain tags that could skew the sentiment values, so the tags need to be stripped away beforehand. Additionally, answers may contain code sections, which would also skew the results and are therefore omitted.
StackExchange provides anonymized data dumps of all its communities on archive.org\footnote{\label{archivestackexchange}\url{https://archive.org/download/stackexchange}}, free of charge, for researchers to investigate. These data dumps contain users, posts (questions and answers), badges, comments, tags, votes, and a post history containing all versions of posts. Each entry contains the necessary information, for instance, id, creation date, title, and body, as well as how the data is linked together (which user posted a question, answer, or comment). However, not all data entries are valid and usable in the analysis, for instance, questions or answers whose author is unknown, although this affects only a very small number of entries. The data therefore has to be cleaned before the actual analysis. Moreover, the answer texts are in HTML format and contain tags that could skew the sentiment values, so the tags need to be stripped away beforehand. Additionally, answers may contain code sections, which would also skew the results and are therefore omitted.
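The HTML cleaning step could look roughly like the following sketch, which uses only Python's standard \texttt{html.parser} module. It is not necessarily the code used in this thesis: the helper name \texttt{clean\_answer\_body} is hypothetical, and treating everything inside \texttt{<pre>} and \texttt{<code>} elements (including inline code) as a code section is an assumption.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and drops everything inside code elements."""

    # Assumption: both block and inline code are treated as code sections.
    SKIPPED_TAGS = {"pre", "code"}
    # Tags after which a space is inserted so adjacent blocks do not merge.
    BLOCK_TAGS = {"p", "li", "pre", "blockquote", "div", "ul", "ol",
                  "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        if tag in self.BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def clean_answer_body(html_body):
    """Return the plain text of an answer without tags and code sections."""
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    body = ("<p>Use a <b>loop</b>:</p>"
            "<pre><code>for i in x: pass</code></pre>"
            "<p>It works &amp; is fast.</p>")
    print(clean_answer_body(body))
```

The resulting plain text is what would then be passed on to the sentiment analysis; entries with an unknown author would have to be filtered out separately.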
% data sets as xml files from archive.org \cite{archivestackexchange}
%cleaning data