StackExchange\footnote{\url{https://stackexchange.com}} is a community question-and-answer (CQA) platform where users can ask and answer questions, accept answers as an appropriate solution to the question, and up-/downvote questions and answers. StackExchange uses a community-driven knowledge creation process by allowing everyone who registers to participate in the community. Invested users also get access to moderation tools to help maintain the vast community. All posts on the StackExchange platform are publicly visible, allowing non-users to benefit from the community as well. Posts are also accessible to web search engines, so users can find questions and answers easily with a simple web search. StackExchange keeps an archive of all questions and answers posted, creating a knowledge archive for future visitors to consult.
Originally, StackExchange started with StackOverflow\footnote{\url{https://stackoverflow.com}} in 2008\footnote{\label{atwood2008stack}\url{https://stackoverflow.blog/2008/08/01/stack-overflow-private-beta-begins/}}. Since then, StackExchange has grown into a platform hosting sites for 174 different topics\footnote{\label{stackexchangetour}\url{https://stackexchange.com/tour}}, for instance, programming (StackOverflow), maths (MathOverflow\footnote{\url{https://mathoverflow.net}} and Math StackExchange\footnote{\url{https://math.stackexchange.com}}), and typesetting (TeX/LaTeX\footnote{\url{https://tex.stackexchange.com}}). Questions on StackExchange are stated in natural English and consist of a title, a body containing a detailed description of the problem or information need, and tags to categorize the question. After a question is posted, the community can submit answers to it. The author of the question can then accept an answer that satisfies their question. The accepted answer is marked as such with a green checkmark and shown above all other answers. Figure \ref{soexamplepost} shows an example of a StackOverflow question. Questions and answers can be up-/downvoted by every user registered on the site. Votes typically reflect the quality and importance of the respective question or answer. Answers with a high voting score rise to the top of the answer list, as answers are sorted by vote score in descending order by default. Voting also influences a user's reputation \cite{movshovitz2013analysis}\footref{stackexchangetour}. When a post (question or answer) is voted upon, the reputation of the poster changes accordingly. Furthermore, downvoting an answer also decreases the reputation of the user who voted\footnote{\url{https://stackoverflow.com/help/privileges/vote-down}}.
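The default answer ordering and vote-driven reputation mechanics described above can be sketched as follows. This is a minimal illustration, not StackExchange's actual implementation; the reputation deltas are assumed values loosely based on StackOverflow's public help pages and may differ per site.

```python
# Minimal sketch of StackExchange-style answer ordering and vote-driven
# reputation changes. All numeric values are illustrative assumptions.

UPVOTE_DELTA = 10      # assumed: poster gains reputation on an upvote
DOWNVOTE_DELTA = -2    # assumed: poster loses reputation on a downvote
DOWNVOTER_COST = -1    # assumed: downvoting an answer also costs the voter

class Answer:
    def __init__(self, author: str, score: int = 0, accepted: bool = False):
        self.author = author
        self.score = score
        self.accepted = accepted

def sort_answers(answers: list[Answer]) -> list[Answer]:
    """Accepted answer first, then by vote score in descending order."""
    return sorted(answers, key=lambda a: (not a.accepted, -a.score))

def downvote_answer(answer: Answer, voter: str, reputation: dict[str, int]) -> None:
    """A downvote lowers the answer's score and both users' reputation."""
    answer.score -= 1
    reputation[answer.author] = reputation.get(answer.author, 1) + DOWNVOTE_DELTA
    reputation[voter] = reputation.get(voter, 1) + DOWNVOTER_COST
```

For example, given answers with scores 3 and 7 plus an accepted answer with score 1, `sort_answers` places the accepted answer first, followed by the remaining answers in descending score order.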
Reputation on StackExchange indicates how trustworthy a user is. Gaining a high reputation value requires a lot of time and effort invested in asking good questions and posting good answers. Reputation also unlocks privileges, which may differ slightly from one community to another\footnote{\url{https://mathoverflow.com/help/privileges/}}\mfs\footnote{\url{https://stackoverflow.com/help/privileges/}}.
With privileges, users can, for instance, create new tags if the need for a new tag arises, cast votes on closing or reopening questions (for example, when a question is off-topic or a duplicate of another question, or when a question has been closed for no or a wrong reason), or even get access to moderation tools.
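This reputation-gated privilege system can be pictured as a simple threshold lookup. The thresholds below are illustrative assumptions loosely modeled on StackOverflow's privilege page; as noted above, each community defines its own values.

```python
# Sketch of reputation-gated privileges. The thresholds are assumed,
# illustrative values; every StackExchange site defines its own.

PRIVILEGES = {
    15: "vote up",
    125: "vote down",
    1500: "create new tags",
    3000: "cast close and reopen votes",
    10000: "access moderation tools",
}

def unlocked_privileges(reputation: int) -> list[str]:
    """Return all privileges a user with the given reputation has unlocked."""
    return [name for threshold, name in sorted(PRIVILEGES.items())
            if reputation >= threshold]
```

Under these assumed thresholds, a user with 130 reputation could vote up and down, while only long-time contributors would reach the moderation tools.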
% DONE How Do Programmers Ask and Answer Questions on the Web? \cite{treude2011programmers} qa sites very effective at code review and conceptual questions
% DONE The role of knowledge in software development \cite{robillard1999role} people have different areas of knowledge and expertise
All these communities differ in their design. Wikipedia is a community-driven knowledge repository and consists of a collection of articles. Every user can create an article. Articles are edited collaboratively and continually improved and expanded. Reddit is a platform for social interaction where users create posts and comment on other posts or comments. Quora, StackExchange, and Yahoo! Answers are community question-and-answer (CQA) platforms. On Quora and Yahoo! Answers users can ask any question regarding any topic whereas on StackExchange users have to post their questions in the appropriate subcommunity, for instance, StackOverflow for programming-related questions or MathOverflow for math-related questions.
CQA sites are very effective at code review \cite{treude2011programmers}. Code may be understood in the traditional sense of source code in programming-related fields, but this also translates to other fields, for instance, mathematics, where formulas represent code. CQA sites are also very effective at solving conceptual questions. This is because people have different areas of knowledge and expertise \cite{robillard1999role} and because of the large user base established CQA sites have, which again increases the variety of users with expertise in different fields.
1) When starting a community, it has to have a critical mass of users who create content. StackOverflow had this critical mass from the beginning because the StackOverflow team were already experts in the domain \cite{mamykina2011design} and because of the private beta\footref{atwood2008stack}. Both aspects ensured a strong community core early on.
2) The platform must attract new users to grow as well as replace leaving users. Depending on the type of community new users should bring certain skills, for example, a programming background in open-source software development, or extended knowledge on certain domains; or qualities, for example, a certain illness in medical communities. New users also bring the challenge of onboarding with them. Most newcomers will not be familiar with all the rules and nuances of the community \cite{yazdanian2019eliciting}\footnote{\label{hanlon2018stack}\url{https://stackoverflow.blog/2018/04/26/stack-overflow-isnt-very-welcoming-its-time-for-that-to-change/}}.
3) The platform should encourage users to commit to the community. Online communities are often based on the voluntary commitment of their users \cite{ipeirotis2014quizz}, hence the platform has to ensure users are willing to stay. Most platforms do not have contracts with their users, so users should see benefits for staying with the community.
\textbf{One-day-flies}\\
\citeauthor{slag2015one} investigated why many users on StackOverflow only post once after their registration \cite{slag2015one}. They found that 47\% of all users on StackOverflow posted only once and called them one-day-flies. They suggest that the code example quality of one-day-flies is lower than that of more involved users, which often leads to answers and comments that first improve the question and code instead of answering the stated question. This likely discourages new users from using the site further. Negative instead of constructive feedback is another cause for discontinuation of usage. The StackOverflow staff also conducted their own research on negative feedback in the community\footnote{\label{silge2019welcome}\url{https://stackoverflow.blog/2018/07/10/welcome-wagon-classifying-comments-on-stack-overflow/}}. They investigated the comment sections of questions by recruiting staff members to rate a set of comments and found that more than 7\% of the reviewed comments were unwelcoming.
One-day-flies are not unique to StackOverflow. \citeauthor{steinmacher2015social} investigated the social barriers newcomers face when they submit their first contribution to an open-source software project \cite{steinmacher2015social}. They based their work on empirical data and interviews and identified several social barriers preventing newcomers from placing their first contribution to a project. Furthermore, newcomers are often on their own in open-source projects. The lack of support and peers to ask for help hinders them. \citeauthor{yazdanian2019eliciting} found that new contributors on Wikipedia face challenges when editing articles. Wikipedia hosts millions of articles\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia}} and new contributors often do not know which articles they could edit and improve. Recommender systems can solve this problem by suggesting articles to edit but they suffer from the cold start problem because they rely on past user activity which is missing for new contributors. \citeauthor{yazdanian2019eliciting} proposed a solution by establishing a framework that automatically creates questionnaires to fill this gap. This also helps match new contributors with more experienced contributors that could help newcomers when they face a problem.
\citeauthor{allen2006organizational} showed that the one-time-contributor phenomenon also translates to workplaces and organizations \cite{allen2006organizational}. They found that socialization with other members of an organization plays an important role in turnover. The better the socialization within the organization, the less likely newcomers are to leave. This socialization process has to be actively pursued by the organization.
\textbf{Lurking}\\
% DONE Non-public and public online community participation: Needs, attitudes and behavior \cite{nonnecke2006non} about lurking, many programmers do that probably, not even registering, lurking not a bad behavior but observing, lurkers are more introverted, passive behavior, less optimistic and positive than posters, previously lurking was thought of as free riding, not contributing, taking not giving to community, important for getting to know a community, better integration when joining
\textbf{Reflection}\\
The StackOverflow team acknowledged the one-time-contributors trend\footref{hanlon2018stack}\footref{silge2019welcome} and took efforts to make the site more welcoming to new users\footnote{\label{friend2018rolling}\url{https://stackoverflow.blog/2018/06/21/rolling-out-the-welcome-wagon-june-update/}}. They laid out various reasons: Firstly, they have sent mixed messages about whether the site is an expert site or a site for everyone. Secondly, they gave too little guidance to new users, which resulted in poor questions from new users and in unwelcoming behavior of more integrated users towards the new users. New users do not know all the rules and nuances of communication in the communities. An example is that ``Please'' and ``Thank you'' are not well received on the site as they are deemed unnecessary. Also, the clarity and language quality of new users' questions is lower than that of more experienced users, which leads to unwelcoming or even toxic answers and comments. Moreover, users who gained moderation tool access could close questions with predefined reasons which often are not meaningful enough for the poster of the question\footnote{\label{hanlon2013war}\url{https://stackoverflow.blog/2013/06/25/the-war-of-the-closes/}}. Thirdly, marginalized groups, for instance, women and people of color \cite{ford2016paradise}\footref{hanlon2018stack}\mfs\footnote{\label{stackoversurvey2019}\url{https://insights.stackoverflow.com/survey/2019}}, are more likely to drop out of the community due to unwelcoming behavior from other users\footref{hanlon2018stack}. They feel the site is an elitist and hostile place.
The team suggested several steps to mitigate these problems. Some of these steps include appealing to the users to be more welcoming and forgiving towards new users\footref{hanlon2018stack}\footref{silge2019welcome}\mfs\footnote{\url{https://stackoverflow.blog/2012/07/20/kicking-off-the-summer-of-love/}}, while other steps are geared towards changes to the platform itself: The \emph{Be nice policy} (code of conduct) was updated with feedback from the community\footnote{\url{https://meta.stackexchange.com/questions/240839/the-new-new-be-nice-policy-code-of-conduct-updated-with-your-feedback}}. This includes that new users should not be judged for not knowing everything. Furthermore, the closing reasons were updated to be more meaningful to the poster, and questions that are closed are shown as ``on hold'' instead of ``closed'' for the first 5 days\footref{hanlon2013war}. Moreover, the team investigates how the comment sections can be improved to lessen the unwelcomeness and hostility and keep civility up.
\textbf{Mentorship Research Project}\\
The StackOverflow team partnered with \citeauthor{ford2018we} and implemented the Mentorship Research Project \cite{ford2018we}\footnote{\url{https://meta.stackoverflow.com/questions/357198/mentorship-research-project-results-wrap-up}}. The project lasted one month and aimed to help newcomers improve their first questions before they are posted publicly. The program went as follows: When a user is about to post a question, the user is asked whether they want their question to be reviewed by a mentor. If they confirm, they are forwarded to a help room with a mentor, who is an experienced user. The question is then reviewed, and the mentor suggests changes if applicable. These changes may include narrowing the question for more precise answers, adding a code example or adjusting code, or removing \emph{Please} and \emph{Thank you} from the question. After the review and editing, the question is posted publicly by the user. The authors found that mentored questions are received significantly better by the community than non-mentored questions. The questions also received higher scores and were less likely to be off-topic and of poor quality. Furthermore, newcomers are more comfortable when their question is reviewed by a mentor.
For this project, four mentors were hand-selected, and therefore the project would not scale very well as the number of mentors is very limited, but it gave the authors an idea of how to pursue their goal of increasing the welcomingness on StackExchange. The project is followed up by an \emph{Ask a question wizard} to help new users, as well as more experienced users, improve the structure, quality, and clearness of their questions\footref{friend2018rolling}.
% DONE One-day flies on StackOverflow \cite{slag2015one}, 1 contribution during whole registration, only users with at least 6 months of registration
As StackExchange is a CQA platform, the benefits from information exchange, time and location flexibility, and permanency are more prevalent, while social support and social interaction are more in the background. Social support and social interaction are more relevant in communities where individuals communicate about topics regarding themselves, for instance, communities where health aspects are the main focus \cite{maloney2005multilevel}. Time and location flexibility is important for all online communities. Information exchange and permanency are important for StackExchange as it is a large collection of knowledge that mostly does not change over time or from one individual to another. StackExchange's content is driven by the community and therefore depends on the voluntarism of its users, making benefits even more important.
%TODO abc this seem wrong here
The backbone of a community is always the user base and its voluntarism to participate in the community. Even if the community is led by a commercial core team, the community is almost always several orders of magnitude greater than the number of the paid employees forming the core team \cite{butler2002community}. The core team often provides the infrastructure for the community and does some community work. However, most of the community work is done by volunteers of the community.
This is also true for the StackExchange platform, where the core team of paid employees is between 200 and 500\footnote{\url{https://www.linkedin.com/company/stack-overflow}} (this includes employees working on other products) and the number of voluntary community members (these users have access to moderation tools) performing community work is around 10,000\footnote{\url{https://data.stackexchange.com/stackoverflow/revision/1412005/1735651/users-with-rep-20k}}.
\subsection{Encourage contribution}
In a community, users can generally be split into two groups by their motivation to contribute voluntarily: One group acts out of altruism, where users contribute to help others and do good for the community; the second group acts out of egoism and selfish reasons, for instance, getting recognition from other people \cite{ginsburg2004framework}. Users of the second group still help the community, but their primary goal is not necessarily the health of the community but gaining reputation and making a name for themselves. In contrast, users of the first group primarily focus on helping the community and see reputation as a positive side effect, which also feeds back into their ability to help others. While these groups have different objectives, both groups need recognition of their efforts \cite{iriberri2009life}. There are several methods for recognizing the value a member provides to the community: reputation, awards, trust, identity, etc. \cite{ginsburg2004framework}. Reputation, trust, and identity are often built gradually over time by continuously working on them, while awards are reached at discrete points in time. Awards often take some time and effort to achieve. However, awards should not be easily achievable, as their value comes from the work that is required for them \cite{lawler2000rewarding}. They should also be meaningful in the community they are used in. Most importantly, awards have to be visible to the public, so other members can see them. In this way, awards become a powerful motivator for users.
%TODO maybe look at finding of https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.3093&rep=rep1&type=pdf , in discussion bullet point list: subgroups, working and less feature > not working and more features, selfmoderation
Regulation revolves around the user actions and the content a community creates. It is required to steer the community and keep it civil. Naturally, some users will not have the best intentions for the community in mind. Their actions must be accounted for, and harmful actions must be dealt with; otherwise, the community and its content will deteriorate.
\textbf{Content quality}\\
Quality is a concern in online communities. Platform moderators and admins want to keep a certain level of quality or even raise it. However, higher-quality posts take more time and effort than lower-quality posts. In the case of CQA platforms, this is an even bigger problem as higher-quality answers fight against fast responses. Despite that, StackOverflow also has a problem with low-quality and low-effort questions and the subsequent unwelcoming answers and comments\footref{silge2019welcome}.
\citeauthor{lin2017better} investigated how growth affects a community \cite{lin2017better}. They looked at Reddit communities that were added to the default set of subscribed communities of every new user (defaulting), which led to a huge influx of new users to these communities. The authors found that, contrary to expectations, the quality stays largely the same. The vote score dips shortly after defaulting but quickly recovers or even rises to higher levels than before. The complaints of low-quality content did not increase, and the language used in the community stayed the same. However, the community clustered around fewer posts than before defaulting. \citeauthor{srba2016stack} did a similar study on the StackOverflow community \cite{srba2016stack}. They found a similar pattern in the quality of posts. The quality of questions dipped momentarily due to the huge influx of new users. However, the quality recovered after 3 months.
\citeauthor{tausczik2011predicting} found that reputation is linked to the perceived quality of posts in multiple ways \cite{tausczik2011predicting}. They suggest reputation could be used as an indicator of quality. Quality also depends on the type of platform. \citeauthor{lin2017better} showed that expert sites that charge fees, for instance, library reference services, have higher-quality answers than free sites \cite{lin2017better}; the higher the fee, the higher the quality of the answers. However, free community sites outperform expert sites in terms of answer density and responsiveness.
\textbf{Content abuse}\\
\citeauthor{srba2016stack} identified 3 types of users responsible for lowering quality: \emph{Help Vampires} (who spend little to no effort researching their questions, which leads to many duplicates), \emph{Noobs} (who mostly create trivial questions), and \emph{Reputation Collectors} \cite{srba2016stack}. The latter try to gain reputation as fast as possible using methods described by \citeauthor{bosu2013building} \cite{bosu2013building}, but often with no regard for the effects their behavior has on the community, for instance, lowering overall content quality, turning other users away from the platform, and encouraging the behavior of \emph{Help Vampires} and \emph{Noobs} even more.
Questions of \emph{Help Vampires} and \emph{Noobs} draw answerers away from much more demanding questions. On the one hand, this leads to knowledgeable answerers handling questions for which they are overqualified; on the other hand, it leads to a lack of adequate-quality answers for more difficult questions. \citeauthor{srba2016stack} suggest a system that matches questions with answerers who satisfy the knowledge requirement but are not grossly overqualified to answer the question. Such a system would avoid suggesting simple questions to overqualified answerers and prevent an answer vacuum for questions on more advanced topics, ensuring better utilization of the answering capacity of the community.
\textbf{Content moderation}\\
\citeauthor{srba2016stack} proposed some solutions to improve the quality problems. One suggestion is to restrict the openness of a community. This can be accomplished in different ways, for instance, by introducing a daily posting limit for questions \cite{srba2016stack}. While this certainly limits the amount of low-quality posts, it does not eliminate the problem. Furthermore, this limitation would also hurt engaged users who would otherwise create a large volume of higher-quality content. A much more intricate solution that adapts to user behavior would be required; otherwise, the limitation would hurt the community more than it helps.
\citeauthor{ponzanelli2014improving} performed a study on post quality on StackOverflow \cite{ponzanelli2014improving}. They aimed to improve the automatic low-quality post detection system already in place and to reduce the size of the review queue that selected individuals have to go through. Their classifier improves on the existing system by including popularity metrics of the posting user and the readability of the post itself. With these additional factors, they managed to reduce the number of misclassified good-quality posts with only a minimal decrease in correctly classified low-quality posts. Their improvement to the classifier reduced the review queue size by 9\%.
% other studies which suggest changes to improve community interaction/qualtity/sustainability
% -> matching questions with answerers \cite{srba2016stack} (difficult questions -> expert users, easier questions -> answerers that know it but are not experts), dont overload experts, utilize capacities of the many nonexperts
Another solution is to find content abusers (\emph{Noobs}, \emph{Help Vampires}, etc.) directly. One approach is to add a reporting system to the community; however, such a system is also driven by user input and can therefore be manipulated as well. This would lead to excluding users flagged as false positives while missing a portion of content abusers completely. A better approach is to identify these users systematically by their behavior. \citeauthor{kayes2015social} describe a classifier which achieves an accuracy of 83\% on the \emph{Yahoo! Answers} platform \cite{kayes2015social}. The classifier is based on empirical data: they looked at historical user activity, report data, and which users were banned from the platform. From these statistics, they created a classifier which is able to distinguish between falsely and fairly banned users. \citeauthor{cheng2015antisocial} performed a similar study on antisocial behavior on various platforms. They too looked at the historical data of users and their eventual bans, as well as their deleted-post rates. Their classifier achieved an accuracy of 80\%.
% all sentiment methods + VADER
\subsection{Sentiment analysis}
Researchers have put forth many tools for sentiment analysis over the years. Each tool has its advantages and drawbacks, and there is no silver-bullet solution that fits all research questions. Researchers have to choose the tool that best fits their needs and be aware of the drawbacks of their choice. Sentiment analysis poses three important challenges:
\begin{itemize}
\item Coverage: detecting as many features as possible from a given piece of text
\item Weighting: assigning one or multiple values (value range and granularity) to detected features
% - TODO list some application examples
% ...
Linguistic Inquiry and Word Count (LIWC) \cite{pennebaker2001linguistic,pennebakerdevelopment} is one of the more popular tools. Due to its widespread usage, LIWC is well-verified, both internally and externally. Its lexicon consists of about 6,400 words, each categorized into one or more of the 76 defined categories \cite{pennebaker2015development}. 620 words carry a positive and 744 words a negative emotion. Examples of positive words are love, nice, and sweet; examples of negative words are hurt, ugly, and nasty. LIWC also has some drawbacks; for instance, it does not capture acronyms, emoticons, or slang words. Furthermore, LIWC's lexicon uses a polarity-based approach, meaning that it cannot distinguish between the sentences ``This pizza is good'' and ``This pizza is excellent'' \cite{hutto2014vader}. \emph{Good} and \emph{excellent} are both in the category of positive emotion, but LIWC does not distinguish between single words in the same category.
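The polarity-based limitation can be illustrated with a minimal sketch of lexicon scoring (the word lists below are illustrative stand-ins, not LIWC's actual lexicon):

```python
# Minimal sketch of polarity-based lexicon scoring. Every matched word
# counts the same, so "good" and "excellent" contribute identically --
# the limitation noted above. Word lists are illustrative, not LIWC's.
POSITIVE = {"love", "nice", "sweet", "good", "excellent"}
NEGATIVE = {"hurt", "ugly", "nasty", "bad"}

def polarity_counts(text):
    words = text.lower().split()
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos, neg

print(polarity_counts("This pizza is good"))       # (1, 0)
print(polarity_counts("This pizza is excellent"))  # (1, 0) -- same score
```

Both sentences receive the same score because the lexicon records only category membership, not intensity.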
%General Inquirer (GI) \cite{stone1966general} 1966 TODO ref wrong?
% - 11k words, 1900 pos, 2300 neg, all approx (vader)
% - bootstrapped from wordnet (wellknown english lexical database) (vader, hu2004mining)
%TODO refs
Hu-Liu04 \cite{hu2004mining,liu2005opinion} is an opinion mining tool. It searches for features in multiple pieces of text, for instance, product reviews, and rates the opinion about each feature using a binary classification \cite{hu2004mining}. Crucially, Hu-Liu04 does not summarize the texts themselves but summarizes the ratings of the opinions about features mentioned in the texts. Hu-Liu04 was bootstrapped from WordNet \cite{hu2004mining} and then extended further. It now uses a lexicon consisting of about 6,800 words, of which about 2,000 carry a positive and 4,800 a negative sentiment \cite{hutto2014vader}. This tool is, by design, better suited for social media texts, although it also misses emoticons, acronyms, and initialisms.
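The feature-based summarization idea can be sketched in a few lines; the feature and opinion word lists here are hypothetical placeholders, not Hu-Liu04's lexicon:

```python
# Sketch of feature-based opinion summarization in the style of Hu-Liu04:
# rather than summarizing review text, tally opinion polarity per product
# feature mentioned. Feature and opinion word lists are illustrative only.
FEATURES = {"battery", "screen"}
POS_WORDS = {"great", "sharp", "long"}
NEG_WORDS = {"poor", "dim", "short"}

def summarize(reviews):
    summary = {f: {"pos": 0, "neg": 0} for f in FEATURES}
    for review in reviews:
        words = set(review.lower().split())
        for feature in FEATURES & words:   # features mentioned in this review
            summary[feature]["pos"] += len(POS_WORDS & words)
            summary[feature]["neg"] += len(NEG_WORDS & words)
    return summary

reviews = ["The battery life is great", "The screen is dim"]
result = summarize(reviews)
# "battery" accumulates one positive opinion, "screen" one negative
```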
%SenticNet \cite{cambria2010senticnet} 2010
% - concept-level opinion and sentiment analysis tool (vader)
%updateing (extend/modify) hard (e.g. new domain) (vader)
\textbf{Machine Learning Approaches}\\
Because handcrafting sentiment analysis tools requires a lot of effort, researchers turned to approaches that offload the labor-intensive part to machine learning (ML). However, this creates a new challenge: gathering a \emph{good} data set to feed the ML algorithms for training. Firstly, a \emph{good} data set needs to represent as many features as possible; otherwise, the algorithm will not recognize them. Secondly, the data set has to be unbiased and representative of the whole data from which it is drawn. It has to represent each feature in appropriate proportion; otherwise, the algorithm may discriminate against a feature in favor of better-represented ones. These requirements are hard to fulfill and often are not \cite{hutto2014vader}. After a data set is acquired, a model has to be learned by the ML algorithm, which is, depending on the complexity of the algorithm, a very computation- and memory-intensive process. After training is completed, the algorithm can predict sentiment values for new pieces of text it has never seen before. However, due to the nature of this approach, the results cannot easily be comprehended by humans, if at all. ML approaches also suffer from a generalization problem and therefore cannot be transferred to other domains without either accepting bad performance or updating the training data set to fit the new domain. Updating (extending or modifying) the model also requires complete retraining from scratch. These drawbacks make ML algorithms useful only in narrow situations where changes are not required and the training data is static and unbiased.
% naive bayes
% - simple (vader)
%- mathematically demanding (vader)
%- separate datapoints using hyperplanes (vader)
%- long training period (other methods do not need training at all because lexica) (vader)
Support Vector Machines (SVMs) use a different approach. SVMs place data points in an $n$-dimensional space and separate them with hyperplanes ($(n-1)$-dimensional planes), so data points fall into one of the two half-spaces created by the hyperplane. This approach is usually very memory- and computation-intensive, as each data point is represented by an $n$-dimensional vector, where $n$ denotes the number of trained features.
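The decision step of a linear SVM can be sketched in a few lines; the hyperplane below is hand-picked for illustration, not learned from data:

```python
# Sketch of the SVM decision step only: once training has produced a
# hyperplane (weights w, bias b), a point x is classified by which side
# of the hyperplane it falls on. The hyperplane here is hand-picked.
def classify(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [1.0, -1.0], 0.0            # hyperplane x1 - x2 = 0 in 2-d space
print(classify(w, b, [3.0, 1.0]))  # 1  (one side of the hyperplane)
print(classify(w, b, [1.0, 3.0]))  # -1 (the other side)
```

Training an actual SVM consists of choosing `w` and `b` so that the hyperplane separates the classes with maximum margin, which is the expensive part omitted here.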
%generall blyabla, transition to vader
% original ITS paper, how was it done before (previously)
\subsection{Trend analysis}
When introducing a change to a system (an experiment), one often wants to know whether the intervention achieves its intended purpose. There are 3 possible outcomes: a) the intervention shows an effect and the system changes in the desired way, b) the intervention shows an effect and the system changes in an undesired way, or c) the system does not react to the change at all. There are multiple ways to determine which of these outcomes occurred. To analyze the behavior of the system, data from before and after the intervention, as well as the nature of the intervention, has to be acquired. There are multiple ways to run such an experiment, and one has to choose the type that fits best. There are 2 categories of approaches: actively designing an experiment before it is executed (for example, randomized controlled trials in medicine), or using existing data from an experiment that was not designed beforehand or where setting up a designed experiment is not possible (a quasi-experiment).
As this thesis investigates a change that has already been implemented by another party, this thesis covers quasi-experiments. A tool that is often used for this purpose is an \emph{Interrupted Time Series} (ITS) analysis. The ITS analysis is a form of segmented regression analysis, where data from before, after, and during the intervention is regressed with separate line segments \cite{mcdowall2019interrupted}. ITS requires data at (regular) intervals from before and after the intervention (a time series). The interrupt signifies the intervention, and the time when it occurred must be known. The intervention can happen at a single point in time or be stretched out over a certain time span; this property must also be known so it can be taken into account when designing the regression. Also, as the data is acquired from a quasi-experiment, it may be biased \cite{bernal2017interrupted}, for example, by seasonality, time-varying confounders (for example, a change in how data is measured), or variance in the number of single observations grouped together into an interval measurement. These biases need to be addressed if present. Seasonality can be accounted for by subtracting the average value of each of the months in successive years (i.e., subtracting the average value of all Januaries in the data set from the values in Januaries).
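As a minimal sketch (with synthetic data), a segmented regression can be reduced to fitting separate ordinary least-squares lines to the pre- and post-intervention segments and comparing their levels and slopes at the interruption:

```python
# Sketch of a segmented (interrupted time series) regression: fit one
# straight line to the observations before the intervention and another
# to those after, then compare level and slope at the interruption.
# The time series here is synthetic.
def fit_line(ts, ys):
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys)) / \
            sum((t - t_mean) ** 2 for t in ts)
    return y_mean - slope * t_mean, slope   # (intercept, slope)

series = [(t, 2.0 * t) for t in range(5)] + \
         [(t, 10.0 + 0.5 * t) for t in range(5, 10)]   # interrupt at t=5
before = fit_line([t for t, _ in series[:5]], [y for _, y in series[:5]])
after = fit_line([t for t, _ in series[5:]], [y for _, y in series[5:]])
# the slope drops from 2.0 before the intervention to 0.5 after it
```

A full ITS model would regress both segments jointly (level and slope change terms) and test the changes for significance; this sketch only shows the segmentation idea.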
%\begin{lstlisting}
% deseasonalized = datasample - average(dataSamplesInMonth(month(datasample)))
%\end{lstlisting}
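The commented pseudocode above can be made concrete; a minimal runnable version over synthetic (month, value) pairs might look like this:

```python
# Runnable version of the deseasonalization pseudocode: subtract from
# each observation the average of all observations that fall in the
# same month across years. The data is synthetic.
from collections import defaultdict

def deseasonalize(samples):            # samples: list of (month, value)
    by_month = defaultdict(list)
    for month, value in samples:
        by_month[month].append(value)
    avg = {m: sum(vs) / len(vs) for m, vs in by_month.items()}
    return [(m, v - avg[m]) for m, v in samples]

data = [(1, 10.0), (1, 14.0), (2, 3.0), (2, 5.0)]   # two Januaries, two Februaries
print(deseasonalize(data))   # [(1, -2.0), (1, 2.0), (2, -1.0), (2, 1.0)]
```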