wea_ondara
2021-03-02 21:50:59 +01:00
parent 9b0aed601a
commit 316fed8283
4 changed files with 27 additions and 5 deletions


@@ -376,6 +376,22 @@ This shortcoming was addressed by \citeauthor{hutto2014vader} who introducted a
% original ITS paper, how was this done previously (before ITS)
\subsection{Trend analysis}
When introducing a change to a system (an experiment), one often wants to know whether the intervention achieves its intended purpose. There are three possible outcomes: a) the intervention has an effect and the system changes in the desired way, b) the intervention has an effect and the system changes in an undesired way, or c) the system does not react to the change at all. To determine which of these outcomes occurred, data from before and after the intervention, as well as the nature of the intervention, have to be acquired. There are multiple ways to run such an experiment, and one has to choose the type that fits best. The approaches fall into two categories: actively designing an experiment before it is executed (for example, randomized controlled trials in medical fields), or using existing data from an experiment that was not designed beforehand or for which a designed experiment is not possible (quasi-experiment).
As this thesis investigates a change that has already been implemented by another party, it covers quasi-experiments. A tool that is often used for this purpose is \emph{Interrupted Time Series} (ITS) analysis. ITS analysis is a form of segmented regression analysis in which data from before, during, and after the intervention is regressed with separate line segments~\cite{mcdowall2019interrupted, bernal2017interrupted}. ITS requires data at (regular) intervals from before and after the intervention (a time series). The interruption signifies the intervention, and the time at which it occurred must be known. The intervention can happen at a single point in time, or it can be stretched out over a certain time span. This property must also be known so that it can be taken into account when designing the regression. Also, as the data is acquired from a quasi-experiment, it may be biased, for example by seasonality, ....%TODO
%\cite{mcdowall2019interrupted} book
%\citeauthor{bernal2017interrupted} paper tutorial
%widely used in medical fields where randomized controlled trials are not an option/observational data already exists
% -> based on segmented regression, do regression on pieces of data, then stitch together
% -> ITS is inferior to RCT but better than nothing
% -> shortcomings need to be addressed
% -> requires (before and after) data of interest at (regular) intervals (TS in ITS)
% -> interrupted by an intervention during the observations at a known point in time
% -> intervention can be a single point in time or gradual roll out
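A minimal sketch of such a segmented regression, assuming a single abrupt intervention at a known time $t_0$ and linear trends before and after it (the symbols below are illustrative notation and are not taken verbatim from the cited works), is
\begin{equation}
y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - t_0) D_t + \varepsilon_t ,
\end{equation}
where $y_t$ is the observed statistic at time $t$, $D_t$ is an indicator that is $0$ for $t < t_0$ and $1$ for $t \geq t_0$, and $\varepsilon_t$ is the error term. Under these assumptions, $\beta_0$ and $\beta_1$ describe the level and slope before the intervention, $\beta_2$ the immediate level change at $t_0$, and $\beta_3$ the change in slope after the intervention. A gradual roll-out or seasonal effects would require additional terms that are not shown here.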


@@ -251,7 +251,7 @@ SuperUser shows only sightly decreasing average sentiment and vote score up to t
The 4 previously mentioned communities do not profit from the change. Although some of them improve in one statistic, they do not improve across the board like the other 6 communities. The number of 1st questions decreases in all 4 communities. With the exception of math.stackexchange.com, none of these communities improves in the follow-up question statistic. In all of these communities, the vote score is on a (worse) downward trend after the change, and the sentiment values also decrease after the change.
When looking at the results of SuperUser, the community stands out and shows interesting results. About 6 months after the change, the number of 1st questions triples. This level of new questions continues for 7 before the number goes down towards the previous level. In the same time frame, the vote score and sentiment take a significant dive. After that, the sentiment returns almost to the previous level, while the vote score increases only mildly. However, this sudden increase in 1st questions, and therefore users, is not related to the change this thesis investigates.
When looking at the results of SuperUser, the community stands out and shows interesting results. About 6 months after the change, the number of 1st questions triples. This level of new questions continues for 7 months before the number goes down towards the previous level. In the same time frame, the vote score and sentiment take a significant dive. After that, the sentiment returns almost to the previous level, while the vote score increases only mildly. However, this sudden increase in 1st questions, and therefore users, is not related to the change this thesis investigates.
%summary not working
% number of 1st questions does not increase after the change
@@ -260,4 +260,4 @@ When looking at the results of SuperUser, the community stands out and shows int
% sentiment scores started decreasing more rapidly
% superuser oddball
Summarizing, the change introduced by StackExchange clearly improved the engagement in 6 of the 10 investigated communities. Sentiment, vote score, and the number of questions (1st and follow-up) rose as a result. The other 4 communities do not profit from the change: although many statistics jump to a higher level, the downward trends are not stopped. The statistics of SuperUser show a large influx of new users about 6 months after the change, which sends the sentiment and vote score into a deep dive; as the number of new users decreases, both rise again. However, this event is not related to the change, and the magnitude of the swing in new user numbers renders the analysis incomparable.


@@ -356,3 +356,9 @@
year={1999},
institution={Technical report C-1, the center for research in psychophysiology~…}
}
@book{mcdowall2019interrupted,
title={Interrupted time series analysis},
author={McDowall, David and McCleary, Richard and Bartos, Bradley J},
year={2019},
publisher={Oxford University Press}
}

6
todo2

@@ -15,8 +15,8 @@
- describe VADER in detail
5.
- group by categories
- summarize under each category and at the very bottom
- DONE group by categories
- DONE summarize under each category and at the very bottom
general:
- 50+ refs
@@ -25,7 +25,7 @@ general:
extra
5. stackoverflow vote scart last datapoint: probably questions did not have enough time to gain votes
5. stackoverflow vote score last datapoint: probably questions did not have enough time to gain votes
ranking