\chapter{Methodology}
% sentiment calculation via vaderlib, write whole paragraph and explain, also add ref to paper \cite{hutto2014vader}
% data sets as xml files from archive.org \cite{archivestackexchange}
%cleaning data
% broken entries, missing user id
% answers in html -> strip html and remove code sections, no contribution to sentiment
% calc sentiment for answers
% about the change
% https://meta.stackexchange.com/questions/314287/come-take-a-look-at-our-new-contributor-indicator \cite{post2018come}
% https://meta.stackexchange.com/questions/314472/what-are-the-exact-criteria-for-the-new-contributor-indicator-to-be-shown \cite{sonic2018what} ; change date = 2018-08-21T21:04:49.177
% new user indicator visible for 1 week ...
% differences in avg sentiment
% look at plots and write something that fits
%interrupted time series
% ref tutorial paper \cite{bernal2017interrupted}
% often used in medical fields to see if changes have an effect
% used same tensors as described in paper, show formula and how it works, 3 tensors, describe tensors and what they capture
% explain why I chose this model: captures the change; a more complex model would capture more but also get more complicated; these 3 tensors are enough to see the impact
% fitting every value, not aggregated values; aggregated values would have different weights, weights are too far spread, contrary to paper where person years are more or less constant
% single value fitting is better, no weight issues, as weights are taken care of via more values
% if one month has more values than another then that month affects the fit more as more values are present
%
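The interrupted time series model can be sketched as follows. This is a minimal sketch, assuming the standard segmented regression form described in the tutorial \cite{bernal2017interrupted}; the symbol names are chosen here for illustration, and the intervention is taken to be the introduction of the new contributor indicator on 2018-08-21:
\begin{equation}
  Y_t = \beta_0 + \beta_1 T + \beta_2 X_t + \beta_3 T X_t
\end{equation}
Under these assumptions, $Y_t$ is the sentiment value observed at time $t$, $T$ is the time elapsed since the start of the observation period, and $X_t$ is an indicator that is $0$ before the change and $1$ afterwards. The coefficient $\beta_0$ is the baseline level at $T = 0$, $\beta_1$ is the underlying trend before the change, $\beta_2$ is the change in level at the time of the change, and $\beta_3$ is the change in slope after it.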