From e04da245ea4d3ce7d95aff150ddf42ac4d6f40c3 Mon Sep 17 00:00:00 2001
From: wea_ondara
Date: Fri, 3 Jan 2020 13:24:55 +0100
Subject: [PATCH] wip

---
 summary | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 summary

diff --git a/summary b/summary
new file mode 100644
index 0000000..08c3db8
--- /dev/null
+++ b/summary
@@ -0,0 +1,34 @@
+Question: Did the "new contributor" badge have an impact on social interactions on Stack Exchange sites?
+This badge was introduced around August to September 2018. The "new contributor" badge is visible until 1 week after the first contribution of a user.
+
+Definitions:
+new users: Users are considered new users if their first contribution (question or answer) was less than 7 days ago.
+
+Data: The data sets are acquired from archive.org [https://archive.org/download/stackexchange]. We analysed the following data sets:
+- electronics.stackexchange.com
+- math.stackexchange.com
+- mathoverflow.net
+- serverfault.com
+- stats.stackexchange.com
+- stackoverflow.com (not yet)
+- superuser.com
+- tex.stackexchange.com
+- unix.stackexchange.com
+
+Preprocessing: Some entries in the data sets are broken (e.g. no unique identifiers, etc.) and are filtered out. Furthermore,
+questions and answers may contain code sections. These sections should not contribute to the sentiment as they may skew results.
+Therefore, code sections are excluded from the analysis.
+
+
+Familiarizing with the data sets: We created plots for:
+- How many answers were given to questions in each time interval? (posthist.py)
+- How many users were active in each time interval? (posthist.py)
+- What is the distribution of users with exactly X answers in a given time interval? (posthist.py)
+- What are the proportions of negative, neutral, and positive answers in each time interval? (posthist.py)
+- What are the differences between new users and others regarding sentiment? (analyse_batch.py)
+- What is the distribution of sentiments in each time interval? (analyse_batch.py)
+- What are the reactions (answer sentiments) to questions of new users and users who post the most (95th percentile)?
+
+Analysis:
+ITS: We performed an interrupted time series (ITS) analysis with 3 tensors (slope before, slope at change, slope after) on the sentiments of answers to questions of new users (answers within 7 days of the first contribution). We chose not to aggregate the sentiments to an average per month but rather to use the sentiment of each answer individually (this gives better results, as the number of observations in each time frame may vary greatly).
+
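
Note on the ITS paragraph above: a minimal sketch of such a fit, not the code used in this repository. It assumes the three quantities (slope before, slope at change, slope after) can be read as the coefficients of a standard segmented regression y = b0 + b1*t + b2*D + b3*(t - t0)*D, where D is 1 from the change point t0 onwards. The function name fit_its, the month-index time axis, and the synthetic data are hypothetical. Every observation enters the fit individually, so months with many answers weigh more than months with few.

```python
import numpy as np


def fit_its(t, y, t0):
    """Least-squares fit of a segmented (interrupted time series) regression.

    Assumed model: y = b0 + b1*t + b2*D + b3*(t - t0)*D, with D = 1 for t >= t0.
    t and y hold one entry per observation (no per-month averaging).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    after = (t >= t0).astype(float)  # indicator for the period after the change
    # Design matrix: intercept, time, level change, slope change.
    X = np.column_stack([np.ones_like(t), t, after, (t - t0) * after])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b0, b1, b2, b3 = (float(c) for c in coef)
    return {
        "intercept": b0,
        "slope_before": b1,       # trend before the change point
        "level_change": b2,       # jump at the change point
        "slope_after": b1 + b3,   # trend after the change point
    }


if __name__ == "__main__":
    # Synthetic example (hypothetical data): 24 months, change at month 12,
    # with a different number of observations in each month.
    rng = np.random.default_rng(0)
    counts = rng.integers(1, 20, size=24)
    t = np.repeat(np.arange(24), counts)
    d = (t >= 12).astype(float)
    y = 0.1 + 0.01 * t + 0.05 * d - 0.02 * (t - 12) * d
    y = y + rng.normal(0.0, 0.05, size=t.size)  # observation noise
    print(fit_its(t, y, t0=12))
```

For the real data, t would be the time of each answer (for example, months since the start of the data set), y its sentiment score, and t0 the badge introduction date on the same time scale.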