
Provide data on dewiki "add link" structured task
Closed, Resolved · Public

Description

Dewiki community members decided in August 2024 to reactivate the "add link" structured task (T371597#10058927), after disabling it in 2021 due to the low quality of recommendations at the time (T294712).

Some community members feel "add link" still leads to too many low-quality edits, while others support the task, arguing that the edit quality doesn't seem lower than that of regular newbie edits (e.g. discussions in September 2024 and January 2025). We would love to see some data on add link edits instead of arguing based on our subjective perceptions.

Please provide data on:

  • revert rate of add link edits compared to the revert rate of other newbie edits
  • percentage of add link edits getting approved (-> FlaggedRevisions) compared to the percentage of other newbie edits getting approved in the same timeframe.

I think we should only look at accounts younger than 3 months and with fewer than 50 edits, as users are supposed to move on to more complex tasks once they've learned more about editing. For the revert rate / approval percentage, I think the last 2-4 weeks might be a good timeframe? We are open to other suggestions for those parameters.
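For illustration, a rough sketch of how the add link side of this could be pulled from public data, assuming the edits carry the "newcomer task add link" change tag and that reverted edits carry the "mw-reverted" tag (both tag names would need to be verified on dewiki); the baseline of "other newbie edits" would additionally require account-age and edit-count filtering, which this sketch does not attempt:

```python
import requests
from datetime import datetime, timedelta, timezone

API = "https://de.wikipedia.org/w/api.php"

def tagged_recent_edits(tag, days=14):
    """Yield recent dewiki edits carrying the given change tag.
    list=recentchanges only covers roughly the last 30 days, which fits
    the 2-4 week window suggested above."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "list": "recentchanges",
        "rctag": tag,
        "rctype": "edit",
        "rcprop": "ids|title|timestamp|user|tags",
        "rcend": cutoff,   # rcdir defaults to "older", so rcend is the older bound
        "rclimit": "500",
    }
    while True:
        data = requests.get(API, params=params).json()
        yield from data["query"]["recentchanges"]
        if "continue" not in data:
            break
        params.update(data["continue"])

# Revert rate of "add link" edits, approximated via the mw-reverted change tag.
edits = list(tagged_recent_edits("newcomer task add link"))
reverted = [e for e in edits if "mw-reverted" in e.get("tags", [])]
if edits:
    print(f"{len(edits)} add link edits, {len(reverted)} reverted "
          f"({len(reverted) / len(edits):.1%})")
```

Anything beyond the recent-changes retention window would need the wiki replicas or mediawiki_history instead.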


Acceptance Criteria:


Data on German Wikipedia "add link" structured task:

All data narrowed by:

  • Time frame: Sep 1, 2024 - Mar 1, 2025
  • Edits narrowed to: Project: de.wikipedia
  • User Tenure Bucket: <90 days
  • Edit Count Bucket: <100 edits

Event Timeline

KStoller-WMF subscribed.

@Johannnes89 - thanks for clearly defining the request in this task!
I'll see what data we can pull for this easily.

I know you are looking for dewiki specific data here, but just in case it's helpful:

We have data from two past A/B tests (one from 2021 and one from 2024).

And we are currently running an A/B test on enwiki: T382603: Add a link (Structured task): English Wikipedia A/B test & Experiment Analysis (FY24/25 WE1.2.11)

Thanks for the links! I was aware of the 2021 analysis, but not of the one from 2024. I'm interested to see what the enwiki results look like.

We are indeed still interested in dewiki-specific data. Our articles undergo a thorough quality assurance process (supported by our FlaggedRevisions system), which should mean most articles already have a good ratio of wikilinks. That's another reason why some people fear the add link task might produce lots of useless edits. From what I can see when filtering recent changes, it looks like most add link edits don't get reverted, but having actual numbers would be helpful in our community discussions.

I'll see if we can get a WMF data scientist to take a look, but if not I'll try to grab some data for you myself.

I'll see if we can get a WMF data scientist to take a look, but if not I'll try to grab some data for you myself.

Let me know how it goes. I'm happy to take a look at this as well if that'd be helpful.

percentage of add link edits getting approved (-> FlaggedRevisions) compared to the percentage of other newbie edits getting approved in the same timeframe.

@Johannnes89 I'm wondering, does the speed of the approval matter? If so, would you be interested in edits approved in...24 hours after the save? Or is 48 hours more reasonable? I'm not sure how quick edit approval generally is at dewiki. I think the time limit would need to be included, as any edit is either approved or reverted, but I am not 100% sure on what it should be. Do you have any thoughts?

I'll see if we can get a WMF data scientist to take a look, but if not I'll try to grab some data for you myself.

Thanks!

percentage of add link edits getting approved (-> FlaggedRevisions) compared to the percentage of other newbie edits getting approved in the same timeframe.

@Johannnes89 I'm wondering, does the speed of the approval matter? If so, would you be interested in edits approved in...24 hours after the save? Or is 48 hours more reasonable? I'm not sure how quick edit approval generally is at dewiki. I think the time limit would need to be included, as any edit is either approved or reverted, but I am not 100% sure on what it should be. Do you have any thoughts?

Dewiki has many active patrollers, which means bad edits usually get reverted within minutes and good edits get approved in the same timeframe. It's certainly interesting to see how many "add link" edits get approved within 24 hours, although I can imagine that some stay unapproved a bit longer if patrollers judge them as "not bad, but also not an obvious improvement" and leave them for later review. If it's not too much effort, getting numbers for both 24 hours after saving the edit and something like 2 weeks later would be great.
Both timeframes are also interesting for the revert rate – I can imagine patrollers not reverting most add link edits in the first 24 hours, because they're clearly not vandalism, but when articles with unapproved edits get reviewed in the following days/weeks, the revert rate might go up.
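Assuming save and review timestamps were available (for example from the wiki replicas), the bucketing described above is simple to compute; a minimal sketch with hypothetical timestamps:

```python
from datetime import datetime, timedelta

def latency_buckets(pairs, cutoffs=(timedelta(hours=24), timedelta(days=14))):
    """pairs: iterable of (saved_at, reviewed_at) datetimes; reviewed_at may be
    None for edits that are still unreviewed. Returns the share of all edits
    handled within each cutoff."""
    counts = {c: 0 for c in cutoffs}
    total = 0
    for saved_at, reviewed_at in pairs:
        total += 1
        if reviewed_at is None:
            continue  # still pending review
        for c in cutoffs:
            if reviewed_at - saved_at <= c:
                counts[c] += 1
    return {c: counts[c] / total for c in cutoffs} if total else {}

# Hypothetical example data, purely for illustration.
example = [
    (datetime(2025, 2, 1, 10, 0), datetime(2025, 2, 1, 10, 20)),  # handled in minutes
    (datetime(2025, 2, 1, 11, 0), datetime(2025, 2, 9, 8, 0)),    # handled after a week
    (datetime(2025, 2, 1, 12, 0), None),                          # still pending
]
print(latency_buckets(example))
```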

Thank you for the clarification, that makes sense to me.

KStoller-WMF moved this task from Needs Discussion to Backlog on the Growth-Team board.

@Johannnes89 - I had hoped WMF Product Analytics could support this request, but they’ve been busy. Fortunately, pulling the revert data should be relatively straightforward using Wikimedia’s standard filters for edit count buckets and user tenure.

Would you be okay with me sharing results narrowed down as follows?

  • Project: de.wikipedia
  • Time frame: Sep 1, 2024 - Mar 1, 2025
  • User Tenure Bucket: <90 days
  • Edit Count Bucket: <100 edits
  • Is Reverted: true / false

With data narrowed by the above filters, I would then provide a table like this:

Filter | Edit Count | Edit Count Reverted | Percentage Reverted
all newcomer edits | | |
“newcomer task add link” edits | | |

Notes:

  • Time frame: I can easily adjust this if needed—just let me know!
  • Edit count bucket: I realize this isn’t exactly what you requested, but <100 or <5 edits are the closest available filters.
  • Reverts: This reflects whether an edit was ever reverted.

Would this be useful? I'll have a Data Analyst outside the Growth team review the calculations for accuracy and ensure I'm interpreting them correctly before I share.

Additionally, I’ll follow up with Product Analytics to see if they have any guidance on your question about FlaggedRevisions and the percentage of add link edits getting approved.

Thanks a lot! Starting with the revert data is already useful for dewiki discussions.

Is it possible to easily distinguish between edits reverted within a short time frame like 24 hours and edits which are reverted later (e.g. after two weeks), per T385103#10511158? Given the usual speed of dewiki patrolling, this would help determine whether reverted edits are seen as clearly bad or whether patrollers initially decided to leave the edits and someone only reverted them on later review.

Thanks for following up with Product Analytics regarding FlaggedRevs! The approval rate should roughly correspond to the percentage of edits not getting reverted. But the speed of approval would be particularly interesting, because it might indicate if checking add link edits is a burden to reviewers or if most edits are good enough to get quickly approved.

I've had the chance to connect with a WMF data analyst to have them double-check my Turnilo queries and conclusions.
I also asked them about some of your questions that I don't know how to answer myself.

Is it possible to easily distinguish between edits reverted within a short time frame like 24 hours and edits which are reverted later

  1. An easy, but imprecise option is to look at dewiki Recent changes narrowed to "Add a Link" edits. Look at yesterday's edits and count reverts.
  2. To get a precise measurement of edit status at 24 hours, I think a data analyst with access to Superset would need to help. There is public access to Superset via Wikimedia Cloud Services. Or perhaps an analyst at WMDE could support this?

I looked at revert data, but essentially the data I have access to is only a Reverted: true/false flag; I can't specify the revert time frame.

And I also don't have access to Flagged Revisions data. My understanding is that there is some instrumentation logging data, but it's not easily available in Turnilo or Superset.
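As a rough workaround for the 24-hour question using only the public API, one could approximate when a reverted edit was undone by scanning later revisions of the same page for the first revision tagged as a revert (mw-undo, mw-rollback, or mw-manual-revert). This is only an approximation, since the first revert-tagged revision is not necessarily the one that undid this particular edit, and the parameters below are illustrative:

```python
import requests
from datetime import datetime

API = "https://de.wikipedia.org/w/api.php"
REVERT_TAGS = {"mw-undo", "mw-rollback", "mw-manual-revert"}

def hours_until_revert(pageid, revid, saved_at):
    """saved_at must be a timezone-aware UTC datetime for the edit in question.
    Returns hours until the first later revert-tagged revision of the page,
    or None if none is found among the next 50 revisions."""
    params = {
        "action": "query", "format": "json", "formatversion": "2",
        "pageids": pageid,
        "prop": "revisions",
        "rvprop": "ids|timestamp|tags",
        "rvstartid": revid,
        "rvdir": "newer",
        "rvlimit": 50,
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    for rev in page.get("revisions", []):
        if rev["revid"] != revid and REVERT_TAGS & set(rev.get("tags", [])):
            reverted_at = datetime.fromisoformat(rev["timestamp"].replace("Z", "+00:00"))
            return (reverted_at - saved_at).total_seconds() / 3600
    return None
```

The pageid, revid, and save timestamp could come from a recent-changes query like the one sketched earlier in this task.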

Data on German Wikipedia "add link" structured task:

All data narrowed by:

  • Time frame: Sep 1, 2024 - Mar 1, 2025
  • Edits narrowed to: Project: de.wikipedia
  • User Tenure Bucket: <90 days
  • Edit Count Bucket: <100 edits
Filter | Total Edit Count | Total Edit Count Reverted | Revert Percentage
ALL newcomer edits | 107,601 | 17,356 | 16.1%
Newcomer edits minus “newcomer task add link” edits | 98,477 | 16,688 | 16.9%
“newcomer task add link” edits | 9,142 | 668 | 7.3%

The "Add a link" task has a significantly lower revert rate than the average newcomer edit. While we acknowledge that these edits may have limited individual impact, multiple experiments have shown that this simple task helps more new account holders take their first editing step—leading to increased newcomer retention. Full experiment results.
Additionally, we look forward to sharing the results of the enwiki "Add a link" experiment soon: T382603: Add a link (Structured task): English Wikipedia A/B test & Experiment Analysis (FY24/25 WE1.2.11).
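As a sanity check, the revert percentages in the table follow directly from the counts:

```python
# Recompute the revert percentages from the counts in the table above.
rows = [
    ("ALL newcomer edits", 107_601, 17_356),
    ("Newcomer edits minus add link edits", 98_477, 16_688),
    ("newcomer task add link edits", 9_142, 668),
]
for name, total, reverted in rows:
    print(f"{name}: {reverted / total:.1%}")  # 16.1%, 16.9%, 7.3%
```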

Thanks! The revert data already looks quite helpful. I will check with WMDE whether their UX team can help pull more data in case our community would like to see more numbers :)

You're welcome! I wish I had unlimited data analyst time so we could answer everything, but hopefully this is a good start.

Also, FYI, we will be working on a few tasks soon to address community feedback about the "Add a Link" task, so please feel free to pass along any community feedback or frustrations with the task. One task that I'm hoping will help is:
T386034: Add a Link: Community Configuration setting to allow limiting "Add a Link" to new editors