\chapter{Related Work}
% py: rewrite to match how the new contributor indicator works https://meta.stackexchange.com/questions/314472/what-are-the-exact-criteria-for-the-new-contributor-indicator-to-be-shown ; change date = 2018-08-21T21:04:49.177
% read template notes again and adjust
% analyze askubuntu; look at stackexchange.com/sites to see what else to analyze
This chapter is divided into two parts. The first part explains what StackExchange is, how it has developed since its inception, and how it works. The second part reviews previous and related work.
%TODO more
% first look at how stackexchange works in background section
%
\section{Background}
StackExchange\footnote{\url{https://stackexchange.com}} is a community question answering (CQA) platform where users can ask and answer questions, accept answers as an appropriate solution to the question, and up-/downvote questions and answers. StackExchange uses a community-driven knowledge creation process by allowing everyone who registers to participate in the community. Invested users also get access to moderation tools to help maintain the vast community. All posts on the StackExchange platform are publicly visible, allowing non-users to benefit from the community as well. Posts are also accessible to web search engines, so users can easily find questions and answers with a simple web search. StackExchange keeps an archive of all questions and answers posted, creating a knowledge archive for future visitors.

Originally, StackExchange started with StackOverflow\footnote{\url{https://stackoverflow.com}} in 2008 \cite{atwood2008stack}. Since then, StackExchange has grown into a platform hosting sites for 174 different topics \cite{stackexchangetour}, for instance, programming (StackOverflow), maths (MathOverflow\footnote{\url{https://mathoverflow.net}} and Math StackExchange\footnote{\url{https://math.stackexchange.com}}), and typesetting (TeX/LaTeX\footnote{\url{https://tex.stackexchange.com}}).
Questions on StackExchange are written in natural English and consist of a title, a body containing a detailed description of the problem or information need, and tags to categorize the question. After a question is posted, the community can submit answers to it. The author of the question can then accept an appropriate answer that resolves their question. The accepted answer is marked with a green checkmark and shown on top of all the other answers. Figure~\ref{soexamplepost} shows an example of a StackOverflow question.

Questions and answers can be up-/downvoted by every user registered on the site. Votes typically reflect the quality and importance of the respective question or answer. Answers with a high voting score rise to the top of the answer list, as answers are sorted by score in descending order by default. Voting also influences a user's reputation \cite{stackexchangetour, movshovitz2013analysis}. When a post (question or answer) is voted upon, the reputation of the poster changes accordingly. Furthermore, downvoting an answer also decreases the reputation of the user who cast the vote \cite{stackoverflowvotedown}.

Reputation on StackExchange indicates how trustworthy a user is. To reach a high reputation value, a user has to invest a lot of time and effort by asking good questions and posting good answers. Reputation also unlocks privileges, which may differ slightly from one community to another \cite{stackoverflowprivileges, mathoverflowprivileges}. With privileges, users can, for instance, create new tags if the need arises, cast votes to close questions that are off-topic or duplicates of other questions, cast votes to reopen questions that were closed for no reason or a wrong reason, or even get access to moderation tools.

StackExchange also employs a badge system to steer the community \cite{stackoverflowbadges}.
Some badges can be obtained by performing one-time actions, for instance, reading the tour page, which contains necessary details for newly registered users. Other badges require performing certain actions multiple times, for instance, editing and answering the same question within 12 hours.

Furthermore, users can comment on every question and answer. Comments can be used to clarify an answer further or for a short discussion on a question or answer.

For each community on StackExchange, a \emph{Meta} page is offered where members of the respective community can discuss the associated community \cite{stackoverflowmeta, mamykina2011design}. This place is used by site admins to interact with the community. The \emph{Meta} pages are also used for proposing and voting on new features and for reporting bugs. \emph{Meta} pages run the same software as the normal CQA pages, so users vote on ideas and suggestions in the same way they would on the actual CQA sites.

\begin{figure}
\includegraphics[scale=0.47]{figures/stackoverflow_example_post}
\caption{A typical question on StackOverflow. In the top middle section of the page, the question is stated. The question has 4 tags and 3 comments attached to it. Beneath the question, all answers are listed by their score in descending order (only one answer is visible in this screenshot). The accepted answer is marked by a green checkmark. To the left of the question and answers, the score (computed via votes) is indicated.}
\label{soexamplepost}
\end{figure}
% explain SO and SE in detail and how it works (https://stackexchange.com/tour)
%- question answer platform with 174 sites for different topics, eg programming (biggest one), latex, ...
%- questions and answers in natural language
%- questions can have tags
%- questioners should post their question in the appropriate community and formulate the question precisely; question should meet standards defined by the community
%- asker can accept 1 answer
%- question, answers up/downvoting, include voting and reputation changes from tour site, reputation == trustworthiness
%- badges and privileges with higher reputation
%- suggestions can be made by others to improve the question, eg add tags or add/change content in the question for better finding, answering question
%- comments for questions and answers
%- each community has a meta page for discussion about community itself (not questions within the community)
%- each community uses the same software, although layout may differ from community to community but generally speaking same structure of the page
%- add pictures of typical stackexchange question page
%community driven knowledge creation process
%higher reputation also gives moderation tools (site management, flagging question offtopic, unspecific, ...) TODO add reference
%
% not only ``forum`` for fast q&a but also knowledge base
% public posts and therefore good search engine availability eg. google
% so success: Design Lessons from the Fastest Q&A Site in the West \cite{mamykina2011design} understanding SO success
% change introduced mid august 2018
% write about that post
% include user question on how exactly it works
\section{State of the Art}
Since the introduction of Web 2.0 and the subsequent emergence of platforms for social interaction, researchers have been investigating the resulting online communities. Research focuses strongly on the interactions of users on various platforms.
Community knowledge platforms are of special interest, for instance, StackExchange/StackOverflow \cite{slag2015one, ford2018we, bazelli2013personality, movshovitz2013analysis, bosu2013building, yanovsky2019one, kusmierczyk2018causal, anderson2013steering, immorlica2015social, tausczik2011predicting}, Quora \cite{wang2013wisdom}, Reddit \cite{lin2017better, chandrasekharan2017you}, Yahoo! Answers \cite{bian2008finding}, and Wikipedia \cite{yazdanian2019eliciting}. These platforms allow communication over large distances and facilitate fast and easy knowledge exchange and acquisition by connecting thousands or even millions of users, creating valuable repositories of knowledge in the process. Users create, edit, and consume small pieces of information and collectively build a community and knowledge repository. However, not every piece of information is factual \cite{wang2013wisdom, bian2008finding}, and platforms often employ some kind of moderation to preserve the value of the platform and to ensure a certain standard within the community.
%allow communication over large distances
%fast and easy knowledge exchange
%many answers to invaluable \cite{bian2008finding}
% DONE How Do Programmers Ask and Answer Questions on the Web? \cite{treude2011programmers} qa sites very effective at code review and conceptual questions
% DONE The role of knowledge in software development \cite{robillard1999role} people have different areas of knowledge and expertise

All these communities differ in their design. Wikipedia is a community-driven knowledge repository and consists of a collection of articles. Every user can create an article. Articles are edited collaboratively and continually improved and expanded. Reddit is a platform for social interaction where users create posts and comment on other posts or comments. Quora, StackExchange, and Yahoo! Answers are community question answering (CQA) platforms. On Quora and Yahoo!
Answers, users can ask questions on any topic, whereas on StackExchange users have to post their questions in the appropriate subcommunity, for instance, StackOverflow for programming-related questions or MathOverflow for math-related questions. CQA sites are very effective at code review \cite{treude2011programmers}. Code may be understood in the traditional sense of source code in programming-related fields, but the idea also translates to other fields, for instance, mathematics, where formulas take the role of code. CQA sites are also very effective at solving conceptual questions. This is because people have different areas of knowledge and expertise \cite{robillard1999role} and because established CQA sites have a large user base, which further increases the variety of users with expertise in different fields.

Despite the differences in purpose and manifestation of these communities, they are social communities and have to follow certain laws. In their book \emph{Building Successful Online Communities: Evidence-Based Social Design} \cite{kraut2012building}, \citeauthor{kraut2012building} lay out five equally important criteria that online platforms have to fulfill in order to thrive. 1) When starting a community, it has to have a critical mass of users who create content. StackOverflow had a critical mass of users from the beginning because the StackOverflow team were already experts in the domain \cite{mamykina2011design} and because of the private beta \cite{atwood2008stack}. Both aspects ensured a strong community core early on. 2) The platform must attract new users to grow as well as to replace leaving users. Depending on the type of community, new users should bring certain skills, for example, a programming background in open-source software development or extended knowledge of certain domains, or certain qualities, for example, a certain illness in medical communities. New users also bring the challenge of onboarding with them.
Most newcomers will not be familiar with all the rules and nuances of the community \cite{yazdanian2019eliciting, hanlon2018stack}. 3) The platform should encourage users to commit to the community. Online communities are often based on the voluntary commitment of their users \cite{ipeirotis2014quizz}, hence the platform has to ensure users are willing to stay. Most platforms do not have contracts with their users, so users should see benefits in staying with the community. 4) Contribution by users to the community should be encouraged. Content generation and engagement are the backbone of an online community. 5) The community needs regulation to sustain it. Not every user in a community is interested in the wellbeing of the community. Therefore, every community has to deal with trolls and inappropriate or even destructive behavior. Rules need to be established and enforced to limit and mitigate the damage malicious users cause.
%new structure:
% list community knowledge platforms
% platforms need certain mechanisms and features to live and thrive: kraut et al.
% - starting a community: critical mass, enough users to attract other users who also create content
% - attracting new users: attract new users to replace leaving ones, new users should be skilled and motivated to contribute (challenge, depends on community: some accept everyone, others need specific skills (eg OSS) or qualities (eg illness for medical support groups, etc)), new users less committed than old ones, newcomers may not behave according to community standard as they don't know them
% - encouraging commitment: willingness to stay in community (increases satisfaction, less likely to leave, better performance, more contribution), harder than in companies with employee contracts, contrast to OSS (no contract, voluntary), greater competition from other communities in contrast to real life where options are limited by location and distance
% - encouraging contribution: online communities need contributions
% by users (not lurking), content is foundation of community, contributions by users follow power law (usually, also confirmed in my results)
% - regulating behavior: maintain a functioning community, prevent trolls and inappropriate behavior, limit damage if it occurs, ease of entry & exit -> high turnover

All these criteria are heavily intertwined. Attracting new users often depends on the welcomingness and support of users who are already on the platform. Keeping users committed to the platform depends on their engagement with the community and how well the system design supports this. For the purpose of this thesis, the criteria laid out by \citeauthor{kraut2012building} can be grouped into two main categories: 1) onboarding of new users, 2) keeping users engaged, contributing, and well behaved.

\subsection{Onboarding of new users}
The onboarding process is a permanent challenge for online communities and differs from one platform to another.
%TODO short intro into following paragraphs
%one-day-flies, on multiple platforms, solutions on other platforms
%bad comment section
%lurking
%several projects by SE to improve site
%- mentorship program, ...
%marginalized groups

\citeauthor{slag2015one} investigated why many users on StackOverflow only post once after their registration \cite{slag2015one}. They found that 47\% of all users on StackOverflow posted only once and called them one-day-flies. They suggest that the code example quality of one-day-flies is lower than that of more involved users, which often leads to answers and comments that first try to improve the question and code instead of answering the stated question. This likely discourages new users from using the site further. Receiving negative instead of constructive feedback is another reason why users stop using the site. The StackOverflow staff also conducted their own research on negative feedback from the community \cite{silge2019welcome}.
They investigated the comment sections of questions by recruiting staff members to rate a set of comments and found that more than 7\% of the reviewed comments are unwelcoming.

One-day-flies are not unique to StackOverflow. \citeauthor{steinmacher2015social} investigated the social barriers newcomers face when they submit their first contribution to an open-source software project \cite{steinmacher2015social}. They based their work on empirical data and interviews and identified several social barriers preventing newcomers from placing their first contribution to a project. Furthermore, newcomers are often on their own in open-source projects. The lack of support and of peers to ask for help hinders them. \citeauthor{yazdanian2019eliciting} found that new contributors on Wikipedia face challenges when editing articles. Wikipedia hosts millions of articles \cite{sizeofwikipedia}, and new contributors often do not know which articles they could edit and improve. Recommender systems can solve this problem by suggesting articles to edit, but they suffer from the cold start problem because they rely on past user activity, which is missing for new contributors. \citeauthor{yazdanian2019eliciting} proposed a framework that automatically creates questionnaires to fill this gap. This also helps match new contributors with more experienced contributors who could help newcomers when they face a problem. \citeauthor{allen2006organizational} showed that the one-time-contributor phenomenon also translates to workplaces and organizations \cite{allen2006organizational}. They found that socialization with other members of an organization plays an important role in turnover. The better the socialization within the organization, the less likely newcomers are to leave. This socialization process has to be actively pursued by the organization.

One-day-flies may partially be a result of lurking.
Lurking means consuming content generated by a community without contributing content to it. \citeauthor{nonnecke2006non} investigated lurking behavior on Microsoft Network (MSN) \cite{nonnecke2006non} and found that, contrary to previous studies, lurking is not necessarily bad behavior. Lurkers show passive behavior and are more introverted and less optimistic than actively posting members of a community. Previous studies suggested that lurking is free riding, a taking-rather-than-giving process. However, the authors found that lurking is important for getting to know a community, understanding how it works, and learning the nuances of social interaction on the platform. This allows for better integration when a person decides to join the community. StackExchange, and especially the StackOverflow community, probably has a large lurking audience. Many programmers do not register on the site, and many of those who do ask only one question and then revert to lurking, as suggested by \citeauthor{slag2015one} \cite{slag2015one}.
% DONE Non-public and public online community participation: Needs, attitudes and behavior \cite{nonnecke2006non} about lurking, many programmers do that probably, not even registering, lurking not a bad behavior but observing, lurkers are more introverted, passive behavior, less optimistic and positive than posters, previously lurking was thought of as free riding, not contributing, taking not giving to community, important for getting to know a community, better integration when joining

The StackOverflow team acknowledged the one-time-contributor trend \cite{hanlon2018stack, silge2019welcome} and made efforts to make the site more welcoming to new users \cite{friend2018rolling}. They laid out various reasons: Firstly, they have sent mixed messages about whether the site is an expert site or a site for everyone. Secondly, they gave too little guidance to new users, which resulted in poor questions from new users and in unwelcoming behavior of more integrated users towards them.
New users do not know all the rules and nuances of communication in the communities. For example, ``Please'' and ``Thank you'' are not well received on the site because they are deemed unnecessary. The quality, clarity, and language of questions from new users are also lower than those of more experienced users, which leads to unwelcoming or even toxic answers and comments. Moreover, users who have gained access to moderation tools can close questions with predefined reasons, which often are not meaningful enough for the poster of the question \cite{hanlon2013war}. Thirdly, marginalized groups, for instance, women and people of color \cite{hanlon2018stack, stackoversurvey2019, ford2016paradise}, are more likely to drop out of the community due to unwelcoming behavior from other users \cite{hanlon2018stack}. They feel the site is an elitist and hostile place.

The team suggested several steps to mitigate these problems. Some of these steps appeal to the users to be more welcoming and forgiving towards new users \cite{hanlon2018stack, silge2019welcome, spolsky2012kicking}; other steps are geared towards changes to the platform itself. The \emph{Be nice policy} (code of conduct) was updated with feedback from the community \cite{jaydles2014the}. Among other things, it states that new users should not be judged for not knowing everything. Furthermore, the closing reasons were updated to be more meaningful to the poster, and questions that are closed are shown as ``on hold'' instead of ``closed'' for the first 5 days \cite{hanlon2013war}. Moreover, the team is investigating how the comment sections can be improved to reduce unwelcomeness and hostility and to keep interactions civil.

The StackOverflow team partnered with \citeauthor{ford2018we} and implemented the Mentorship Research Project \cite{ford2018we, hanlon2017mentorship}. The project lasted one month and aimed to help newcomers improve their first questions before they are posted publicly.
The program went as follows: When a user is about to post a question, the user is asked whether they want their question to be reviewed by a mentor. If they agree, they are forwarded to a help room with a mentor who is an experienced user. The question is then reviewed and the mentor suggests changes if applicable. These changes may include narrowing the question for more precise answers, adding a code example or adjusting code, or removing \emph{Please} and \emph{Thank you} from the question. After the review and editing, the question is posted publicly by the user. The authors found that mentored questions are received significantly better by the community than non-mentored questions. The questions also received higher scores and were less likely to be off-topic or poor in quality. Furthermore, newcomers are more comfortable when their question is reviewed by a mentor. Only four hand-selected mentors took part in the project, so it would not scale well in this form, but it gave the authors an idea of how to pursue their goal of increasing the welcomingness on StackExchange. The project was followed up by an \emph{Ask a question wizard} to help new users as well as more experienced users improve the structure, quality, and clarity of their questions \cite{friend2018rolling}.
% DONE One-day flies on StackOverflow \cite{slag2015one}, 1 contribution during whole registration, only users with 6 months of registration
% DONE Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start \cite{yazdanian2019eliciting}, cold start recommender system problem for recommending newcomers articles to read and get a feeling for how to write articles; similar to SO because of newcomers
% newcomers socialization, experienced users as models/mentors, positive feedback to newcomers
% DONE Do organizational socialization tactics influence newcomer embeddedness and turnover?
% \cite{allen2006organizational} # newcomers to organizations, actively embedding newcomers into organization, shows connection between socialization and turnover (leaving the organization)
% DONE We Don't Do That Here: How Collaborative Editing with Mentors Improves Engagement in Social Q\&A Communities \cite{ford2018we} # mentoring newcomers' questions (before posting), 1 month experiment, collaborative experiment with stackoverflow team, novices got a choice upon submitting a question whether or not they want feedback from a mentor regarding the question, if so redirect to help room where mentor reviews question and suggests changes to question, mentored questions significantly better than non-mentored ones, higher scores, fewer offtopic or poor questions, novices more comfortable with mentor reviewed questions
% DONE Stack Overflow Isn't Very Welcoming: It's Time for That to Change \cite{hanlon2018stack} # fits the story very well, effort to make site more welcoming, marginalized groups feel SO is a hostile and elitist place, new coders, women, people of color, etc, admitting of problems that have not been addressed (enough), mixed messages (expert site or for everyone), too little guidance for new users, pecking on new users who don't know all little things on what (not) to do (no plz and thx, low quality question -> low quality answer -> comments about support for low quality) or bad english, previous attempts to improve welcoming, Summer of Love (https://stackoverflow.blog/2012/07/20/kicking-off-the-summer-of-love/), The War of the Closes (https://stackoverflow.blog/2013/06/25/the-war-of-the-closes/), The NEW new "Be Nice" Policy ("Code of Conduct") - Updated with your feedback (https://meta.stackexchange.com/questions/240839/the-new-new-be-nice-policy-code-of-conduct-updated-with-your-feedback), Mentorship Research Project - Results + Wrap-Up (https://meta.stackoverflow.com/questions/357198/mentorship-research-project-results-wrap-up?noredirect=1&lq=1) also
% \cite{ford2018we}, removal of condescending and sarcastic comments, ideas about beginner ask page (TODO already implemented?), don't judge users for not knowing things (e.g. posting duplicates)
% DONE Welcome Wagon: Classifying Comments on Stack Overflow \cite{silge2019welcome} # all about comments, effort to make site more welcoming, staff internal rating of comments (fine, unwelcoming, abusive, 57 raters, 13742 ratings, 3992 comments)
% DONE Social Barriers Faced by Newcomers Placing Their First Contribution in Open Source Software Projects \cite{steinmacher2015social} onboarding in open source software projects, difficulties for newcomers, newcomers often on their own, barriers when 1st contributing to a project,
% Rolling out the Welcome Wagon: June Update \cite{friend2018rolling} "Ask a Question Wizard" prototype, reduce exclusion (negative feelings, expectations and experiences), improve inclusion (learn from other communities facing similar problems), classification of abusive and unwelcoming comments
%TODO Unwelcomeness is a large problem on StackExchange; not so strong; maybe other sentence

Unwelcomeness is a large problem on StackExchange \cite{hanlon2018stack, friend2018rolling, ford2016paradise}. Although unwelcomeness affects all new users, users from marginalized groups suffer significantly more \cite{hanlon2018stack, vasilescu2014gender}. \citeauthor{ford2016paradise} investigated barriers users face when contributing to StackOverflow. The authors identified 14 barriers in total that hinder newcomers from contributing, five of which were rated significantly more problematic by women than by men. On StackOverflow, only 5.8\% of active users identified as women in 2015 \cite{stackoversurvey2015}, compared to 7.9\% in 2019 \cite{stackoversurvey2019}. \citeauthor{david2008community} found a similar share of 5\% women in their work on \emph{Community-based production of open-source software} \cite{david2008community}.
These numbers are small compared to the share of degrees in Science, Technology, Engineering, and Mathematics (STEM) \cite{clark2005women}, of which 20\% are earned by women \cite{hill2010so}. Despite this gap, the percentage of women on StackOverflow has increased in recent years.
%discrimination
% DONE Paradise Unplugged: Identifying Barriers for Female Participation on Stack Overflow \cite{ford2016paradise} gender gap, females only 5\%, contribution barriers, found 5 gender specific (women) barriers among 14 barriers in total, barriers also affect groups like industry programmers
% DONE Community-based production of open-source software: What do we know about the developers who participate? \cite{david2008community} only 5\% women contribute to OSS
% DONE https://insights.stackoverflow.com/survey/2019: 7.9\% women, increase since 2015: 5.8\% \cite{stackoversurvey2019}
% Gender, Representation and Online Participation: A Quantitative Study \cite{vasilescu2014gender} investigation on minorities (eg women), under representation of minorities
% DONE Why So Few? Women in Science, Technology, Engineering, and Mathematics. \cite{hill2010so} women only 20 percent of bachelor degrees
% DONE Women and science careers: leaky pipeline or gender filter?
% \cite{clark2005women} underrepresentation in STEM

\subsection{Keeping users engaged, contributing and well behaved}
%intro .. SE employs several features to engage/keep contributing users
%reputation
%badge system
%quality
Reputation plays an important role on StackExchange. A high reputation indicates both the credibility of a user and that the user is a primary source of high-quality answers \cite{movshovitz2013analysis}. Although most questions are posted by low-reputation users, high-reputation users post more questions on average.
To earn a high reputation, a user has to invest a lot of effort and time into the community, for instance, by asking good questions or providing useful answers to the questions of others. Reputation is earned when a question or answer is upvoted by other users, or when an answer is accepted as the solution to a question by the question creator. \citeauthor{mamykina2011design} found that the reputation system of StackOverflow encourages users to compete productively \cite{mamykina2011design}. But not every user participates equally, and participation depends on the personality of the user \cite{bazelli2013personality}. \citeauthor{bazelli2013personality} showed that the top-reputation users on StackOverflow are more extroverted than users with less reputation. By analyzing the StackOverflow community network, \citeauthor{movshovitz2013analysis} found that experts can be reliably identified by their contributions within the first few months after their registration. Graph analysis also allowed the authors to find spamming users and users with other extreme behavior.

Although gaining reputation takes time and effort, users can gain reputation faster by gaming the system \cite{bosu2013building}. \citeauthor{bosu2013building} analyzed the reputation system and found five strategies to increase reputation quickly: Firstly, answering questions with tags that have a small expertise density. This reduces competition from other users and increases the chance of upvotes and answer acceptance. Secondly, questions should be answered promptly. The question asker will most likely accept the first arriving answer that solves the question. This is also supported by \cite{anderson2012discovering}. Thirdly, answering first also gives the user an advantage over other answerers. Fourthly, activity during off-peak hours reduces the competition from other users. Finally, contributing to diverse areas also helps in developing a higher reputation.
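
The reputation mechanics described in this section (upvotes and accepted answers increase reputation, downvotes decrease it, and casting a downvote on an answer costs the voter) can be illustrated with a minimal sketch. All point values and event names below are assumptions chosen for illustration; they are not taken from the cited works, and the actual values may differ between communities and over time.

```python
from collections import defaultdict

# Assumed point values, for illustration only; real values vary by site and over time.
POINTS = {
    "upvote_received": 10,
    "answer_accepted": 15,
    "downvote_received": -2,
    "downvote_cast_on_answer": -1,
}


def tally_reputation(events):
    """Sum the reputation changes per user from (user, event_type) pairs."""
    reputation = defaultdict(int)
    for user, event_type in events:
        reputation[user] += POINTS[event_type]
    return dict(reputation)


# Hypothetical event log: alice's answer is upvoted and accepted, then downvoted by bob.
events = [
    ("alice", "upvote_received"),
    ("alice", "answer_accepted"),
    ("bob", "downvote_cast_on_answer"),
    ("alice", "downvote_received"),
]
print(tally_reputation(events))
```

The sketch only accumulates reputation changes; it does not model privileges, badges, or any lower bound on a user's reputation.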
% DONE Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow \cite{anderson2012discovering} accepted answer strongly depends on when answers arrive, considered not only the question and accepted answer but the set of answers to a question
% reputation
% DONE On the personality traits of stackoverflow users \cite{bazelli2013personality} analyzing personality traits, top reputated users are more extroverted than less reputated users
% DONE Building reputation in stackoverflow: an empirical investigation. \cite{bosu2013building} gaming the reputation system of SO, answering question with tags with lower expertise density, answering promptly, first one to answer, activity during off peak hours, contributing to diverse areas
% DONE Analysis of the reputation system and user contributions on a question answering website: Stackoverflow \cite{movshovitz2013analysis} about the reputation system, high reputation indicates primary source of answers and high quality, most questions asked by low reputation users but high reputation users post most questions on avg compared to low reputation users, effective finding of spam users and other extreme behaviors via graph analysis, predicting which users become influential long-term contributors, experts can be reliably identified based on the participation in the first few months after registration
% DONE Design Lessons from the Fastest Q&A Site in the West \cite{mamykina2011design} understanding SO success, 1) productive competition (gamification reputation), 2) founders were already experts on site they created (ensured success early on, founders involved in community not external), 3) meta page for discussion and voting on features (same mechanics as on SO page)

Complementary to the reputation system, StackOverflow also employs a badge system \cite{stackoverflowbadges} to stimulate user contributions \cite{cavusoglu2015can}.
The goal of badges is to keep users engaged with the community \cite{li2012quantifying}. Badges are therefore often used in a gamification setting in which users contribute to the community and are rewarded if their behavior aligns with the requirements of a badge. Badges can be earned by performing certain actions and are visible on questions and answers as well as on the profile page of the user. Researchers often regard badges as a steering mechanism \cite{yanovsky2019one, kusmierczyk2018causal, anderson2013steering}. Users want to achieve badges and are therefore steered toward certain actions. Steering also occurs in the reputation system; however, badges allow a wider variety of goals, for instance, asking and answering questions, voting on questions and answers, or writing higher-quality answers. Badges also work as a motivator for users \cite{anderson2013steering}. Users often put non-trivial amounts of work and effort into achieving badges, which makes badges powerful incentives. However, not all users are alike, and they do not pursue badges in the same way \cite{yanovsky2019one}. Contrary to \cite{anderson2013steering}, \citeauthor{yanovsky2019one} found that users do not necessarily increase their activity before achieving a badge and decrease their contribution immediately thereafter; instead, users behave differently depending on their type of contribution \cite{yanovsky2019one}. The authors categorize users into three groups: Firstly, some users are not affected by the badge system at all and still contribute a lot to the community. Secondly, some users increase their activity before gaining a badge and keep their level of contribution afterward. Finally, some users increase their activity before achieving a badge and return to their previous level of engagement thereafter. Different badges also create status classes \cite{immorlica2015social}.
The harder a badge is to earn, the rarer it is within the community, and the badge therefore symbolizes a certain status. Rare badges are often hard to achieve and take significant effort. For some users, depending on their type, this can be a strong motivator. \citeauthor{kusmierczyk2018causal} found that first-time badges play an important role in steering users \cite{kusmierczyk2018causal}. The steering effect only takes place if the benefit to the user is greater than the effort the user has to invest to obtain the badge. If the effort is greater, the user will likely not pursue the badge, and the steering effect will not occur.
% badge
% DONE One Size Does Not Fit All: Badge Behavior in Q\&A Sites \cite{yanovsky2019one} # all about badges, steering users, motivation; previous papers say that contribution increases before obtaining a badge and decreases afterwards, but they find it depends on type of user: 1) users are not affected by badge system but still contribute much, 2) contribution increases and stays the same after badge achievement, 3) return to previous levels
% DONE Can gamification motivate voluntary contributions? The case of StackOverflow Q&A community \cite{cavusoglu2015can} stimulating users to contribute via badges
% DONE SOCIAL STATUS AND BADGE DESIGN \cite{immorlica2015social} about badges and how they create status classes, badges for every user and individual badges
% DONE Quantifying the impact of badges on user engagement in online Q&A communities \cite{li2012quantifying} maintain consistent engagement, gamification via badges
% DONE On the Causal Effect of Badges \cite{kusmierczyk2018causal} # all about badges, steering users, motivation, first-time badges, first-time badges steer user behavior if benefit greater than effort, otherwise no effect
% Quizz: Targeted Crowdsourcing with a Billion (Potential) Users \cite{ipeirotis2014quizz} many online communities based on voluntary users, not paid workers
% DONE Steering user behavior with badges \cite{anderson2013steering} # all about badges, steering users, motivation, users may put in non-trivial amounts of work to achieve badges -> powerful incentives, badges used in multiple ways (steer users to ask/answer more questions, voting, etc.)

Quality is often a concern in online communities. Platform moderators and administrators want to maintain a certain level of quality or even raise it. However, higher-quality posts take more time and effort than lower-quality posts. On CQA platforms, this is an even bigger problem because higher-quality answers compete with fast responses. In addition, StackOverflow has a problem with low-quality, low-effort questions and the unwelcoming answers and comments that follow them \cite{silge2019welcome}. StackOverflow has grown into a large community, and larger communities are harder to control. \citeauthor{lin2017better} investigated how growth affects a community \cite{lin2017better}. They looked at Reddit communities that were added to the default set of communities every new user is subscribed to (defaulting), which led to a huge influx of new users to these communities.
The authors found that, contrary to expectations, the quality stayed largely the same. The vote score dipped shortly after defaulting but quickly recovered or even rose to higher levels than before. Complaints about low-quality content did not increase, and the language used in the community stayed the same. However, the community clustered around fewer posts than before defaulting. \citeauthor{tausczik2011predicting} found that reputation is linked to the perceived quality of posts in multiple ways \cite{tausczik2011predicting}. They suggest that reputation could be used as an indicator of quality. Quality also depends on the type of platform. \citeauthor{harper2008predictors} showed that sites that charge fees have higher-quality answers than free sites, and that the higher the fee, the higher the quality of the answers \cite{harper2008predictors}. However, large community sites such as Yahoo! Answers outperform sites that depend on specific experts, for instance, library reference services, in terms of answer diversity and responsiveness.
% quality
% DONE Predicting the perceived quality of online mathematics contributions from users' reputations \cite{tausczik2011predicting} about mathoverflow and quality
% DONE Predictors of Answer Quality in Online Q&A Sites \cite{harper2008predictors} 1) shows that fee or expert sites are better than open qa sites (greater fee better answers), 2) big community sites like Yahoo! Answers outperform sites which depend on experts (e.g. library reference services) (higher answer diversity and responsiveness)
% DONE Better When It Was Smaller? Community Content and Behavior After Massive Growth \cite{lin2017better}, defaulting of subreddit, quality remains high, dip in upvotes directly after defaulting but recover quickly and get even higher than before, complaints about low-quality content do not increase, language stays the same, however community clusters among fewer posts than before defaulting
% lowering content quality (Gorbatai 2011) % TODO read and add to list of notes
% other
% DONE Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow \cite{anderson2012discovering} accepted answer strongly depends on when answers arrive, considered not only the question and accepted answer but the set of answers to a question
% DONE Quizz: Targeted Crowdsourcing with a Billion (Potential) Users \cite{ipeirotis2014quizz} many online communities based on voluntary users, not paid workers
% DONE Design Lessons from the Fastest Q&A Site in the West \cite{mamykina2011design} understanding SO success, 1) productive competition (gamification reputation), 2) founders were already experts on site they created (ensured success early on, founders involved in community not external), 3) meta page for discussion and voting on features (same mechanics as on SO page)
% DONE How Do Programmers Ask and Answer Questions on the Web? \cite{treude2011programmers} qa sites very effective at code review and conceptual questions
% DONE The role of knowledge in software development \cite{robillard1999role} people have different areas of knowledge and expertise
% Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media \cite{bian2008finding}, about Yahoo! Answers, finding factual answers by using available data on user interaction
% No Country for Old Members: User Lifecycle and Linguistic Change in Online Communities \cite{danescu2013no}
% DONE Non-public and public online community participation: Needs, attitudes and behavior \cite{nonnecke2006non} about lurking, many programmers do that probably, not even registering, lurking not a bad behavior but observing, lurkers are more introverted, passive behavior, less optimistic and positive than posters, previously lurking was thought of as free riding, not contributing, taking not giving to community, important for getting to know a community, better integration when joining
% A comprehensive survey and classification of approaches for community question answering \cite{srba2016comprehensive}, meta study on papers published between 2005 and 2014
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% get paper links:
% tutorial: Bernal et al.
% \cite{bernal2017interrupted}
% You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech \cite{chandrasekharan2017you}
% -> reddit hate community ban: change = ban
% -> todo
% literature
% Tracing Community Genealogy: How New Communities Emerge from the Old \cite{tan2018tracing}
% On the personality traits of stackoverflow users \cite{bazelli2013personality} analyzing personality traits, top reputated users are more extroverted than less reputated users
% -> good template http://softwareprocess.es/pubs/bazelli2013ICSMERA-Personality.pdf
% <- One-day flies on StackOverflow \cite{slag2015one}, 1 contribution during whole registration, only users with 6 months of registration
% -> [1] Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow \cite{anderson2012discovering} accepted answer strongly depends on when answers arrive, considered not only the question and accepted answer but the set of answers to a question
% -> [23] Predicting the perceived quality of online mathematics contributions from users' reputations \cite{tausczik2011predicting} about mathoverflow and quality
% -> [4] Predictors of Answer Quality in Online Q&A Sites \cite{harper2008predictors} 1) shows that fee or expert sites are better than open qa sites (greater fee better answers), 2) big community sites like Yahoo! Answers outperform sites which depend on experts (e.g. library reference services) (higher answer diversity and responsiveness)
% -> todo done
% -> todo done
% -> contains generic references to boost ref count
% -> [5] Building reputation in stackoverflow: an empirical investigation. \cite{bosu2013building} gaming the reputation system of SO, answering question with tags with lower expertise density, answering promptly, first one to answer, activity during off peak hours, contributing to diverse areas
% -> [8] Analysis of the reputation system and user contributions on a question answering website: Stackoverflow \cite{movshovitz2013analysis} about the reputation system, high reputation indicates primary source of answers and high quality, most questions asked by low reputation users but high reputation users post most questions on avg compared to low reputation users, effective finding of spam users and other extreme behaviors via graph analysis, predicting which users become influential long-term contributors, experts can be reliably identified based on the participation in the first few months after registration
% -> todo done
% -> [1] Design Lessons from the Fastest Q&A Site in the West \cite{mamykina2011design} understanding SO success, 1) productive competition (gamification reputation), 2) founders were already experts on site they created (ensured success early on, founders involved in community not external), 3) meta page for discussion and voting on features (same mechanics as on SO page)
% -> [2] How Do Programmers Ask and Answer Questions on the Web? \cite{treude2011programmers} qa sites very effective at code review and conceptual questions
% -> [10] The role of knowledge in software development \cite{robillard1999role} people have different areas of knowledge and expertise
% -> [3] Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media \cite{bian2008finding}, about Yahoo! Answers, finding factual answers by using available data on user interaction
% No Country for Old Members: User Lifecycle and Linguistic Change in Online Communities \cite{danescu2013no}
% Better When It Was Smaller?
% Community Content and Behavior After Massive Growth \cite{lin2017better}, defaulting of subreddit, quality remains high, dip in upvotes directly after defaulting but recover quickly and get even higher than before, complaints about low-quality content do not increase, language stays the same, however community clusters among fewer posts than before defaulting
% -> breaching community norms (kraut 2012)
% starting a community: critical mass, enough users to attract other users who also create content
% attracting new users: attract new users to replace leaving ones, new users should be skilled and motivated to contribute (challenge, depends on community: some accept everyone, others need specific skills (e.g. OSS) or qualities (e.g. illness for medical support groups, etc.)), new users less committed than old ones, newcomers may not behave according to community standards as they don't know them
% encouraging commitment: willingness to stay in community (increases satisfaction, less likely to leave, better performance, more contribution), harder than in companies with employee contracts, contrast to OSS (no contract, voluntary), greater competition from other communities in contrast to real life where options are limited by location and distance
% encouraging contribution: online communities need contributions by users (not lurking), content is foundation of community, contributions by users follow a power law (usually, also confirmed in my results)
% regulating behavior: maintain a functioning community, prevent trolls, inappropriate behavior, limit damage if it occurs, ease of entry & exit -> high turnover
% -> lowering content quality (Gorbatai 2011) % TODO read and add to list of notes
% Eliciting New Wikipedia Users’ Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start \cite{yazdanian2019eliciting}
% -> cold start recommender system problem for recommending newcomers articles to read and get a feeling for how to write articles; similar to SO because newcomers don't know the rules so well; familiarize newcomers with how things work on the site, onboarding
% Do organizational socialization tactics influence newcomer embeddedness and turnover? \cite{allen2006organizational} # newcomers to organizations, actively embedding newcomers into organization, shows connection between socialization and turnover (leaving the organization)
% -> todo
% We Don't Do That Here: How Collaborative Editing with Mentors Improves Engagement in Social Q\&A Communities \cite{ford2018we} # mentoring newcomers' questions (before posting), 1 month experiment, collaborative experiment with stackoverflow team, novices got a choice upon submitting a question whether or not they want feedback from a mentor regarding the question, if so redirect to help room where mentor reviews question and suggests changes to question, mentored questions significantly better than non-mentored ones, higher scores, fewer off-topic or poor questions, novices more comfortable with mentor-reviewed questions
% -> todo
% -> Non-public and public online community participation: Needs, attitudes and behavior \cite{nonnecke2006non} about lurking, many programmers do that probably, not even registering, lurking not a bad behavior but observing, lurkers are more introverted, passive behavior, less optimistic and positive than posters, previously lurking was thought of as free riding, not contributing, taking not giving to community, important for getting to know a community, better integration when joining
% -> Social Barriers Faced by Newcomers Placing Their First Contribution in Open Source Software Projects \cite{steinmacher2015social} onboarding in open source software projects, difficulties for newcomers, newcomers often on their own, barriers when first contributing to a project
% -> Paradise Unplugged: Identifying Barriers for Female Participation on Stack Overflow \cite{ford2016paradise} gender gap, females only 5%, contribution barriers, found 5 gender specific (women) barriers among 14 barriers in total, barriers also affect groups like industry programmers
% -> Community-based production of open-source software: What do we know about the developers who participate? \cite{david2008community} only 5% women contribute to OSS
% -> https://insights.stackoverflow.com/survey/2019: 7.9% women, increase since 2015: 5.8%
% -> Gender, Representation and Online Participation: A Quantitative Study \cite{vasilescu2014gender} investigation on minorities (e.g. women), underrepresentation of minorities
% -> Why So Few? Women in Science, Technology, Engineering, and Mathematics. \cite{hill2010so} women only 20 percent of bachelor degrees
% -> Women and science careers: leaky pipeline or gender filter? \cite{clark2005women} underrepresentation in STEM
% Stack Overflow Isn't Very Welcoming: It's Time for That to Change \cite{hanlon2018stack} # fits the story very well, effort to make site more welcoming
% -> marginalized groups feel SO is a hostile and elitist place, new coders, women, people of color, etc.
% -> admitting of problems that have not been addressed (enough), mixed messages (expert site or for everyone), too little guidance for new users, pecking on new users who don't know all the little things on what (not) to do (no plz and thx, low quality question -> low quality answer -> comments about support for low quality) or bad English, previous attempts to improve welcoming, Summer of Love (https://stackoverflow.blog/2012/07/20/kicking-off-the-summer-of-love/), The War of the Closes (https://stackoverflow.blog/2013/06/25/the-war-of-the-closes/), The NEW new “Be Nice” Policy (“Code of Conduct”) — Updated with your feedback (https://meta.stackexchange.com/questions/240839/the-new-new-be-nice-policy-code-of-conduct-updated-with-your-feedback), Mentorship Research Project - Results + Wrap-Up (https://meta.stackoverflow.com/questions/357198/mentorship-research-project-results-wrap-up?noredirect=1&lq=1) TODO also refer paper about that here, removal of condescending and sarcastic comments, ideas about beginner ask page (TODO already implemented?), don't judge users for not knowing things (e.g. posting duplicates)
% Rolling out the Welcome Wagon: June Update \cite{friend2018rolling} “Ask a Question Wizard” prototype, reduce exclusion (negative feelings, expectations and experiences), improve inclusion (learn from other communities facing similar problems), classification of abusive and unwelcoming comments
% Welcome Wagon: Classifying Comments on Stack Overflow \cite{silge2019welcome} # all about comments, effort to make site more welcoming, staff internal rating of comments (fine, unwelcoming, abusive, 57 raters, 13742 ratings, 3992 comments)
% One Size Does Not Fit All: Badge Behavior in Q\&A Sites \cite{yanovsky2019one} # all about badges, steering users, motivation; previous papers say that contribution increases before obtaining a badge and decreases afterwards, but they find it depends on type of user: 1) users are not affected by badge system but still contribute much, 2) contribution increases and stays the same after badge achievement, 3) return to previous levels
% -> todo
% -> [] Can gamification motivate voluntary contributions?
% The case of StackOverflow Q&A community \cite{cavusoglu2015can} stimulating users to contribute via badges
% -> [] SOCIAL STATUS AND BADGE DESIGN \cite{immorlica2015social} about badges and how they create status classes, badges for every user and individual badges
% -> [] Quantifying the impact of badges on user engagement in online Q&A communities \cite{li2012quantifying} maintain consistent engagement, gamification via badges
% On the Causal Effect of Badges \cite{kusmierczyk2018causal} # all about badges, steering users, motivation, first-time badges, first-time badges steer user behavior if benefit greater than effort, otherwise no effect
% -> [] Quizz: Targeted Crowdsourcing with a Billion (Potential) Users \cite{ipeirotis2014quizz} many online communities based on voluntary users, not paid workers
% -> todo
% Steering user behavior with badges \cite{anderson2013steering} # all about badges, steering users, motivation, users may put in non-trivial amounts of work to achieve badges -> powerful incentives, badges used in multiple ways (steer users to ask/answer more questions, voting, etc.)
% -> todo
% A comprehensive survey and classification of approaches for community question answering \cite{srba2016comprehensive}, meta study on papers published between 2005 and 2014
% literature analysis todo
% read papers and write things out; keywords ...
% redo structure
% write
% old
% structure
%- various research on collaborative online communities, yahoo answers, stackoverflow/exchange, quora, wikipedia, ...
% - A comprehensive survey and classification of approaches for community question answering \cite{srba2016comprehensive} # good description of SO
% - Design Lessons from the Fastest Q&A Site in the West \cite{mamykina2011design} understanding SO success
%- maintaining a community:
% - onboarding of newcomers
% - keeping users on the platform
%- onboarding problem e.g. wikipedia, stackexchange
% - getting users to stay and contribute to the site
% - One-day flies on StackOverflow \cite{slag2015one}
% - Eliciting New Wikipedia Users’ Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start \cite{yazdanian2019eliciting}
% -> cold start recommender system problem for recommending newcomers articles to read and get a feeling for how to write articles; similar to SO because newcomers
% - incentives for new users via reputation
% - gaming the system: Building reputation in stackoverflow: an empirical investigation. \cite{bosu2013building} gaming the reputation system of SO
% - prevent 1 day flies & keep new users engaged: Analysis of the reputation system and user contributions on a question answering website: Stackoverflow \cite{movshovitz2013analysis} about the reputation system
% - badges
% - One Size Does Not Fit All: Badge Behavior in Q\&A Sites \cite{yanovsky2019one} # all about badges, steering users, motivation
% -> Can gamification motivate voluntary contributions? The case of StackOverflow Q&A community \cite{cavusoglu2015can} stimulating users to contribute via badges
% -> SOCIAL STATUS AND BADGE DESIGN \cite{immorlica2015social} about badges and how they create status classes, badges for every user and individual badges
% % -> Quantifying the impact of badges on user engagement in online Q&A communities \cite{li2012quantifying} maintain consistent engagement, gamification via badges
% - On the Causal Effect of Badges \cite{kusmierczyk2018causal} # all about badges, steering users, motivation
% - Steering user behavior with badges \cite{anderson2013steering} # all about badges, steering users, motivation
% - newcomers socialization, experienced users as models/mentors, positive feedback to newcomers
% - Do organizational socialization tactics influence newcomer embeddedness and turnover? \cite{allen2006organizational} # newcomers to organizations
% - We Don't Do That Here: How Collaborative Editing with Mentors Improves Engagement in Social Q\&A Communities \cite{ford2018we} # mentoring newcomers' questions (before posting), 1 month experiment
% - Stack Overflow Isn't Very Welcoming: It's Time for That to Change \cite{hanlon2018stack} # fits the story very well, effort to make site more welcoming
% - Welcome Wagon: Classifying Comments on Stack Overflow \cite{silge2019welcome} # all about comments, effort to make site more welcoming
%
%- quality:
% - Predictors of Answer Quality in Online Q&A Sites \cite{harper2008predictors} shows that open qa sites are better than paywall or expert sites
% - Predicting the perceived quality of online mathematics contributions from users' reputations \cite{tausczik2011predicting} about mathoverflow and quality
%
%- quasi experiments
% - stackexchange change
% - You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech \cite{chandrasekharan2017you}
% -> reddit hate community ban: change = ban
% reminder
% write in aspect of new users
% -> getting users on board (community guidelines)
% -> incentives for new users via reputation (maybe badges (do research on that))
% -> gaming the system
% -> prevent 1 day flies
% -> keep new users engaged
% 2 problems: onboarding and keeping users active (e.g. badges)