This commit is contained in:
wea_ondara
2021-02-13 23:33:51 +01:00
parent 08455c69fb
commit 303794e14e
3 changed files with 145 additions and 114 deletions


@@ -58,7 +58,7 @@ These platforms allow communication over large distances and facilitate fast and
% DONE How Do Programmers Ask and Answer Questions on the Web? \cite{treude2011programmers} qa sites very effective at code review and conceptual questions
% DONE The role of knowledge in software development \cite{robillard1999role} people have different areas of knowledge and expertise
All these communities differ in their design. Wikipedia is a community-driven knowledge repository and consists of a collection of articles. Every user can create an article. Articles are edited collaboratively and continually improved an expanded. Reddit is a platform for social interaction where users create posts and comment on other posts or comments. Quora, StackExchange, and Yahoo! Answers are community questions and answer (CQA) platforms. On Quora and Yahoo! Answers users can ask any question regarding any topics whereas on StackExchange users have to post their questions in the appropriate subcommunity, for instance, StackOverflow for programming related questions or MathOverflow for math related questions. CQA sites are very effective at code review \cite{treude2011programmers}. Code may be understood in the traditional sense of source code in programming related fields but this also translates to other fields, for instance, mathematics where formulas represent code. CQA sites are also very effective at solving conceptual questions. This is due to the fact that people have different areas of knowledge and expertise \cite{robillard1999role} and due to the large user base established CQA sites have, which again increases the variety of users with experise in different fields.
All these communities differ in their design. Wikipedia is a community-driven knowledge repository and consists of a collection of articles. Every user can create an article. Articles are edited collaboratively and continually improved and expanded. Reddit is a platform for social interaction where users create posts and comment on other posts or comments. Quora, StackExchange, and Yahoo! Answers are community question and answer (CQA) platforms. On Quora and Yahoo! Answers users can ask any question regarding any topic, whereas on StackExchange users have to post their questions in the appropriate subcommunity, for instance, StackOverflow for programming-related questions or MathOverflow for math-related questions. CQA sites are very effective at code review \cite{treude2011programmers}. Code may be understood in the traditional sense of source code in programming-related fields but this also translates to other fields, for instance, mathematics where formulas represent code. CQA sites are also very effective at solving conceptual questions. This is due to the fact that people have different areas of knowledge and expertise \cite{robillard1999role} and due to the large user base established CQA sites have, which again increases the variety of users with expertise in different fields.
Despite the differences in purpose and manifestation of these communities, they are social communities and they have to follow certain laws.
In their book ``Building successful online communities: Evidence-based social design'' \cite{kraut2012building}, \citeauthor{kraut2012building} lay out five equally important criteria online platforms have to fulfill in order to thrive. 1) When starting a community, it has to have a critical mass of users who create content. StackOverflow already had a critical mass of users from the beginning due to the StackOverflow team already being experts in the domain \cite{mamykina2011design} and the private beta\footref{atwood2008stack}. Both aspects ensured a strong community core early on.
@@ -172,6 +172,7 @@ Different badges also create status classes \cite{immorlica2015social}. The hard
% DONE Steering user behavior with badges \cite{anderson2013steering} # all abount badges, steering users, motivation, user may put in non trivial amounts of work to achieve badges -> powerful incentives, badges used in multiple ways (steer users to ask/answer more questions, voting, etc.)
Quality is often a concern in online communities. Platform moderators and admins want to keep a certain level of quality or even raise it. However, higher-quality posts take more time and effort than lower-quality posts. In the case of CQA platforms, this is an even bigger problem as higher-quality answers compete with fast responses. Despite that, StackOverflow also has a problem with low-quality, low-effort questions and subsequent unwelcoming answers and comments\footref{silge2019welcome}. StackOverflow has grown into a large community and larger communities are harder to control. \citeauthor{lin2017better} investigated how growth affects a community. They looked at Reddit communities that were added to the default set of subscribed communities of every new user (defaulting), which led to a huge influx of new users to these communities. The authors found that, contrary to expectations, the quality stays largely the same. The vote score dips shortly after defaulting but quickly recovers or even rises to higher levels than before. The complaints of low-quality content did not increase, and the language used in the community stayed the same. However, the community clustered around fewer posts than before defaulting.
\citeauthor{tausczik2011predicting} found reputation is linked to the perceived quality of posts in multiple ways \cite{tausczik2011predicting}. They suggest reputation could be used as an indicator of quality.
Quality also depends on the type of platform. \cite{lin2017better} showed that expert sites that charge fees, for instance, library reference services, have higher-quality answers compared to free sites. Also, the higher the fee, the higher the quality of the answers. However, free community sites outperform expert sites in terms of answer density and responsiveness.


@@ -8,7 +8,7 @@ This section shows the results of the experiments described in section 3 on the
% maybe average data points per month
%TODO write some text to each result
\pagebreak
\section{StackOverflow.com}
StackOverflow shows a very slight decrease in the average sentiment over time before the change is introduced. When the change occurs the average sentiment jumps up. After the change, the sentiments reach higher levels and keep rising. The average vote score rises right before and stays fairly constant after the change. This indicates that the vote score is not affected by the change. However, the number of questions from new contributors increases after the change, while before the change it is fairly constant. The number of follow-up questions from new contributors declines before the change and rises after the change.
@@ -33,6 +33,132 @@ StackOverflow shows a very slight decrease in the average sentiment of time befo
% jump upward at the change
% sentiments rising after change
\section{AskUbuntu.com}
AskUbuntu sees a decrease in average sentiments prior to the change. After the introduction of the change, the regression dips but sentiments rise drastically from then on. The vote score has a huge range of values prior to and after the change; however, the graph indicates the vote score declines after the change. The number of 1st questions slightly decreases prior to the change and starts rising after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answers to questions created by new contributors on AskUbuntu.com}
\label{ubuntu_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on AskUbuntu.com}
\label{ubuntu_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on AskUbuntu.com}
\label{ubuntu_questionsits}
\end{subfigure}
\end{figure}
% sentiments have gradually fallen prior to the change
% sentiments increased after the change
% maybe: sentiments did not change drastically as seen in maths communities
\section{ServerFault.com}
ServerFault shows gradually rising average sentiments prior to the change. At the time of the change, the regression makes a jump upward and the average sentiment decreases slowly afterward. The vote score falls prior to the change, makes a huge jump upward, and quickly returns to the levels just prior to the change. The number of 1st questions, however, sees a drastic change. Prior to the change, the number of 1st questions decreases steadily, while after the change the numbers increase at the same pace as they fell prior to the change. The number of follow-up questions follows the same course, falling prior to and rising after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answers to questions created by new contributors on ServerFault.com}
\label{fault_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on ServerFault.com}
\label{fault_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on ServerFault.com}
\label{fault_questionsits}
\end{subfigure}
\end{figure}
% sentiments fairly stable before and after the change
% small jump in avg sentiments at change date
\section{stats.stackexchange.com}
On stats.stackexchange.com the average sentiment decreases steadily prior to the change. The regression dips when the change is introduced. However, the average sentiment after the change indicates a slight upward trend. The vote score also decreases prior to the change but does not recover afterward. However, the number of 1st questions and follow-up questions rise prior to the change and increase even faster after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answers to questions created by new contributors on stats.stackexchange.com}
\label{stats_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on stats.stackexchange.com}
\label{stats_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on stats.stackexchange.com}
\label{stats_questionsits}
\end{subfigure}
\end{figure}
% sentiments steadily decreasing prior to the change
% dip in avg sentiment at the change date
% slight upward trend after the change
\section{tex.stackexchange.com}
On tex.stackexchange.com the average sentiment is low compared to the other investigated data sets. Prior to the change the average sentiment only slightly decreases. When the change is introduced the regression takes a dip down and after the change, the average sentiment increases drastically. Future data will be required to see if this upward trend continues or evens out. In stark contrast, the vote score shows a downward trend, although there is a short window around the change date where vote scores are higher compared to before and after the change. The number of 1st questions has a downward trend before the change and an upward trend afterward. The downward trend of the number of follow-up questions is uninterrupted by the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answers to questions created by new contributors on tex.stackexchange.com}
\label{tex_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on tex.stackexchange.com}
\label{tex_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on tex.stackexchange.com}
\label{tex_questionsits}
\end{subfigure}
\end{figure}
% avg sentiment fairly low compared to the other investigated communities
% avg sentiment slowly decreasing prior to the change
% large dips in avg sentiment after the change
% trend after change strongly upward
\section{unix.stackexchange.com}
On unix.stackexchange.com the average sentiment decreases prior to the change. When the change is introduced the regression takes a small dip down; however, the average sentiment increases fast after the change. The vote score shows a continuous downward trend, and the numbers of 1st and follow-up questions fall slightly prior to the change and increase afterward.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answers to questions created by new contributors on unix.stackexchange.com}
\label{unix_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on unix.stackexchange.com}
\label{unix_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on unix.stackexchange.com}
\label{unix_questionsits}
\end{subfigure}
\end{figure}
% sentiments decreasing prior to the change
% sentiments rising after the change
% little jump upwards at change date
% these communities benefited from the change
% #number of 1st questions rose in every one of these communities
% #number of follow up questions are rising in most of the communities
% sentiment rose in most of the communities
% the vote score is mostly uncorrelated with the change
More than half of the communities show benefits from the change. The number of first questions increases in all of the 6 previously shown communities. Also, for most of these communities the number of follow-up questions increases too. Furthermore, the sentiment ITS shows an improvement in all except 1 community. The vote score analysis yielded no meaningful results for these communities. The vote score does not change with the introduction of StackExchange's policy, with the exception of ServerFault; however, the increase in the vote score there did not last for long.
\section{math.stackexchange.com}
The math.stackexchange.com community shows a decrease in average sentiments, vote score, and the number of questions prior to the change. The measurements make a small jump upward when the change is introduced, however, they continue their downward trend after the introduction of the change. Only the number of follow-up questions stabilizes and begins to increase after the change.
\begin{figure}[H]
@@ -77,50 +203,7 @@ MathOverflow shows a constant regression before the change, however, average sen
% senitments stable/constant prior to the change
% falling after the change
\section{AskUbuntu.com}
AskUbuntu sees a decrease in average sentiments prior to the change. After the introduction of the change, the regression dips but sentiments keep rising drastically since then. The vote score has a huge range of values prior to and after the change, however, the graph indicates the vote score declines after the change. The number of 1st questions slightly decreases prior to the change and starts rising after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answer to questions created by new contributors on AskUbuntu.com}
\label{ubuntu_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on AskUbuntu.com}
\label{ubuntu_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../askubuntu.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on AskUbuntu.com}
\label{ubuntu_questionsits}
\end{subfigure}
\end{figure}
%senitments have gradually fallen prior to the change
% sentiments increased after the change
%maybe: sentiments did not change drastically as seen in maths communities
\section{ServerFault.com}
ServerFault shows gradually rising average sentiments prior to the change. At the time of the change, the regression makes a jump upward and the average sentiment decreases slowly afterward. The vote score falls prior to the change, made a huge jump upward, and quickly returns to the levels just prior to the change. The number of 1st questions, however, sees a drastic change. Prior to the change, the number of 1st questions decreases steadily, while after the change the numbers increase at the same pace as they fall prior to the change. The number of follow-up questions also sees the same course direction, falling prior and raising after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answer to questions created by new contributors on ServerFault.com}
\label{fault_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on ServerFault.com}
\label{fault_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../serverfault.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on ServerFault.com}
\label{fault_questionsits}
\end{subfigure}
\end{figure}
%sentiments fairly stable before and after the change
% small jump in avg sentiments at change date
\section{SuperUser.com}
SuperUser shows only slightly decreasing average sentiment and vote score up to the change. At the change time the regressions take a dip down and the regression shows a downward trend after the change. Indeed the average sentiments and vote score dip considerably when the change is introduced. The average sentiment recovers about 13 months later, while the vote score does not recover as well. The number of 1st questions decreases prior to the change and then goes through the roof, indicating a huge wave of new users. This drastic influx of new users may explain the crash of the average sentiment and vote score that occurs at the same time. Data available in the future will show if the recovery is persistent.
@@ -168,75 +251,7 @@ On electronics.stackexchange.com the average sentiment and votes decrease contin
% recovery started after 12 month after the change
% more data in the future will be required to determine if upward trend in the end continues
\section{stats.stackexchange.com}
On stats.stackexchange.com the average sentiment decreases steadily prior to the change. The regression dips when the change is introduced. However, the average sentiment after the change indicates a slight upward trend. The vote score also decreases prior to the change but does not recover afterward. However, the number of 1st questions and follow-up questions rise prior to the change and increase even faster after the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answer to questions created by new contributors on stats.stackexchange.com}
\label{stats_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on stats.stackexchange.com}
\label{stats_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../stats.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on stats.stackexchange.com}
\label{stats_questionsits}
\end{subfigure}
\end{figure}
% sentiments steadily decreasing prior to the change
% dip in avg sentiment at the change date
% sight upward trend after the change
\section{tex.stackexchange.com}
On tex.stackexchange.com the average sentiment is low compared to the other investigated data sets. Prior to the change the average sentiment only slightly decreases. When the change is introduced the regression takes a dip down and after the change, the average sentiment increases drastically. Future data will be required to see if this upward trend continues or evens out. In stark contrast, the vote score shows a downward trend, although there is a short window around the change date where vote scores are higher compared to before and after the change. The number of 1st questions has a downward trend before the change and an upward trend afterward. The downward trend of the number of follow-up questions is uninterrupted by the change.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answer to questions created by new contributors on tex.stackexchange.com}
\label{tex_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on tex.stackexchange.com}
\label{tex_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../tex.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on tex.stackexchange.com}
\label{tex_questionsits}
\end{subfigure}
\end{figure}
%avg sentiment fairly low compared to the other investigated communities
% avg sentiment slowly decreasing prior to the change
% large dips in avg snetiment after the change
% trend after change strongly upward
\section{unix.stackexchange.com}
On unix.stackexchange.com the average sentiment decreases prior to the change. When the change is introduced the regression takes a small dip down, however, the average sentiment increases fast after the change. The vote score shows a continuous downward trend and the number of 1st and follow-up questions fall slightly prior to the change and increase afterward.
\begin{figure}[H]
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/its/average_sentiments-i1.png}
\caption{An interrupted time series analysis of the sentiments of answer to questions created by new contributors on unix.stackexchange.com}
\label{unix_its}
\end{subfigure}
\begin{subfigure}[t]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/votesits/average_votes-i1.png}
\caption{An interrupted time series analysis of the vote score of questions created by new contributors on unix.stackexchange.com}
\label{unix_votesits}
\end{subfigure}\\
\begin{subfigure}[c]{0.5\textwidth}
\includegraphics[scale=0.37]{../unix.stackexchange.com/output/questionits/average_questions-i1.png}
\caption{An interrupted time series analysis of the number of questions created by new contributors on unix.stackexchange.com}
\label{unix_questionsits}
\end{subfigure}
\end{figure}
%sentiments decreasing prior to the change
%snetiments rising after the change
% little jump upwards at change date

15
todo2

@@ -23,3 +23,18 @@ allg:
- DONE: links -> foot notes
- more structure
extra
5. stackoverflow vote scart last datapoint: probably questions did not have enough time to gain votes
ranking
stackoverflow good