Prepare ProofreadPage extension for IP Masking
Closed, Resolved · Public

Description

A preliminary investigation (T326759) has found that the ProofreadPage extension may be affected by IP Masking.

Event Timeline

kostajh subscribed.

As we are considering deploying to some Wikisource wikis in October, I am marking this as a blocker.

@kostajh Which parts are affected? I remember working on this for a bit in early 2023.

Not sure yet. I need to look through the code to see how it deals with isAnon, isRegistered, etc.

As far as I can tell, the only potential concern is the pagequality permission, which allows interacting with the tool. By default, GroupPermissions in extension.json grants it to the user group. For all wikis except dewikisource, this remains user. For dewikisource, the value is *, meaning that temporary accounts would be able to interact with the tool.
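
The difference between the default grant and the dewikisource override can be sketched with a small model. This is a simplified illustration in Python, not MediaWiki's actual permission code: the function names and the IMPLICIT_GROUPS mapping are hypothetical, and the sketch assumes (as this discussion does) that temporary accounts are not members of the user group.

```python
# Simplified model of group-based rights lookup (illustration only,
# not MediaWiki's real implementation).

# Default from the extension, as described above: only "user" gets pagequality.
DEFAULT_GROUP_PERMISSIONS = {"user": {"pagequality": True}}

# Site-level override described above for dewikisource: "*" (everyone) gets it.
DEWIKISOURCE_OVERRIDE = {"*": {"pagequality": True}}

# Assumption for this sketch: which implicit groups each kind of visitor has.
IMPLICIT_GROUPS = {
    "anonymous": ["*"],
    "temporary": ["*"],        # assumed: temp accounts are not in "user"
    "named": ["*", "user"],
}


def merge_permissions(base, override):
    """Return a copy of base with the override's grants applied on top."""
    merged = {group: dict(rights) for group, rights in base.items()}
    for group, rights in override.items():
        merged.setdefault(group, {}).update(rights)
    return merged


def has_right(account_kind, right, group_permissions):
    """True if any of the account's implicit groups grants the right."""
    return any(
        group_permissions.get(group, {}).get(right, False)
        for group in IMPLICIT_GROUPS[account_kind]
    )


dewikisource_permissions = merge_permissions(
    DEFAULT_GROUP_PERMISSIONS, DEWIKISOURCE_OVERRIDE
)
```

Under these assumptions, has_right("temporary", "pagequality", DEFAULT_GROUP_PERMISSIONS) is false, while the same check against dewikisource_permissions is true; named users hold the right in both cases.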

On dewikisource, it looks like all Page and Index namespace content is protected from edits by anonymous users. In that case, we could possibly update operations/mediawiki-config to remove the override ($wgGroupPermissions['*']['pagequality'] = true; # 27516), which was set in 2012 or possibly earlier. @sgrabarczuk could we ask the dewikisource community about this? If we don't change anything, temporary account users could interact with ProofreadPage tools, although not as their first edit to the project.

@kostajh Why would the pagequality permission be unavailable to a temp account on their first edit?

Sorry for the ambiguity. On dewikisource, Page and Index namespace content appears to be protected (e.g. https://de.wikisource.org/wiki/Index:Sammlung_alt-_und_mitteldeutscher_W%C3%B6rter_aus_lateinischen_Urkunden?action=info&uselang=en). That means that a not-logged-in visitor would not be able to make their first edit on such a page. With the existing config of $wgGroupPermissions['*']['pagequality'] = true;, if the visitor made an edit to e.g. the main namespace, they would get a temporary account. Even then, they would still not be able to edit Page/Index content that is protected for autoconfirmed and above, because temporary accounts cannot be in the autoconfirmed group.
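
The interaction between the pagequality right and page protection can be sketched as two independent checks that must both pass. This is a rough Python illustration, not how MediaWiki implements protection: the function name is hypothetical, and treating a protection level as plain group membership is a simplifying assumption.

```python
# Rough model: setting page quality needs BOTH the pagequality right
# AND the ability to edit the page under its protection level.

# With the dewikisource override in place, "*" grants pagequality.
RIGHTS_BY_GROUP = {
    "*": {"pagequality"},
    "user": {"pagequality"},
}


def can_set_page_quality(groups, protection_level=None):
    """groups: set of group names the account is in.
    protection_level: None for an unprotected page, else the group
    required to edit (assumption: modelled as simple membership)."""
    has_right = any(
        "pagequality" in RIGHTS_BY_GROUP.get(group, set()) for group in groups
    )
    passes_protection = protection_level is None or protection_level in groups
    return has_right and passes_protection


# Temporary accounts cannot be autoconfirmed, per the comment above.
temp_account = {"*"}
autoconfirmed_user = {"*", "user", "autoconfirmed"}
```

In this model, can_set_page_quality(temp_account, "autoconfirmed") is false even though the account holds the right, while the same call for autoconfirmed_user is true.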

OK. Looking at the DB, it seems that not all Page/Index pages are marked as protected. Here is an example: https://de.wikisource.org/w/index.php?title=Seite:Abschied_der_Casselaner_vom_Koenig_von_Westphalen.pdf/10&action=edit which would allow temp users to edit and set the page quality status. Out of 316,225 Page content items on dewikisource, 283,619 have restrictions in place. The ratio is similar for Index content items.

mysql:research@dbstore1007.eqiad.wmnet [dewikisource]> SELECT COUNT(*) FROM page WHERE page_namespace = 102;
+----------+
| COUNT(*) |
+----------+
|   316225 |
+----------+
1 row in set (0.084 sec)
mysql:research@dbstore1007.eqiad.wmnet [dewikisource]> SELECT COUNT(DISTINCT(page_restrictions.pr_page)) FROM page_restrictions INNER JOIN page ON page_restrictions.pr_page = page.page_id WHERE page.page_namespace = 102;
+--------------------------------------------+
| COUNT(DISTINCT(page_restrictions.pr_page)) |
+--------------------------------------------+
|                                     283619 |
+--------------------------------------------+
1 row in set (0.951 sec)
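
As a quick check on the two counts returned above, the share of Page-namespace items with at least one restriction can be worked out directly. The variable names are only for illustration; the numbers are the ones from the query output.

```python
# Counts copied from the two query results above (namespace 102 on dewikisource).
total_pages = 316_225
restricted_pages = 283_619

# Pages with no row in page_restrictions, i.e. editable without protection checks.
unrestricted_pages = total_pages - restricted_pages

restricted_share = restricted_pages / total_pages
unrestricted_share = unrestricted_pages / total_pages

print(f"restricted:   {restricted_pages} ({restricted_share:.1%})")
print(f"unrestricted: {unrestricted_pages} ({unrestricted_share:.1%})")
```

That works out to roughly 89.7% restricted, leaving 32,606 Page items (about 10.3%) without any restriction.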

So, I think the best option is to remove $wgGroupPermissions['*']['pagequality'] = true; so that the default grant to user (which excludes temp accounts) applies.

If someone wants to do further work to update Extension:ProofreadPage to work well with temporary account users, that is fine, but it is out of scope for what the Trust and Safety Product Team would be able to manage.

Change #1080621 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/mediawiki-config@master] ProofreadPage: Remove pagequality permission override

https://gerrit.wikimedia.org/r/1080621

This patch proposes to remove the permission override on dewikisource that gives the pagequality permission to all users. If the patch is deployed, only named users will be able to use features gated behind the pagequality permission.

Copying from the commit message of the patch:

Why:

- The pagequality permission should not be granted to
  anonymous/temporary users when temporary accounts are deployed,
  because the features in the extension that check the `pagequality`
  permission may need some work to function properly with
  temporary accounts enabled
- Approximately 2/3 of Page and Index content is manually marked as
  protected (T326940#10233208) so this change is probably not going to
  be especially controversial or impactful on editing patterns on that
  wiki

What:

- Remove the override for dewiki source that allows anonymous/temporary
  users to edit

@sgrabarczuk could you please let us know if/when it would be OK to deploy this patch?

@kostajh I think this is a fragile stop-gap approach, since we are effectively breaking/disallowing this configuration from ever existing. Can you point out what exact work ProofreadPage's pagequality extension needs (or how I can test if ProofreadPage works with temporary accounts)? I'll be happy to test and/or make the necessary changes.

I think it would involve setting $wgAutoCreateTempUser['enabled'] = true; in your local environment, enabling the extension, and then checking the feature set of ProofreadPage to validate that nothing is broken.

Hmm, I tested it, and it does not appear to have show-stopping bugs (no full-page errors).
Only one thing: I assume having the username as part of the revision output like so:

<noinclude><pagequality level="3" user="~2024-1" /></noinclude>....<noinclude></noinclude>

is not a problem?
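
To illustrate what a consumer of the revision text would read out of that tag, here is a minimal Python sketch that extracts the level and user attributes. The regular expression and function names are hypothetical and only cover the attribute order shown in the example above; the check for a leading "~" assumes temporary account names look like the one in that example.

```python
import re

# Matches the self-closing tag in the form shown above:
# <pagequality level="3" user="~2024-1" />
PAGEQUALITY_RE = re.compile(
    r'<pagequality\s+level="(?P<level>\d+)"\s+user="(?P<user>[^"]*)"\s*/>'
)


def parse_pagequality(wikitext):
    """Return (level, user) from the first pagequality tag, or None."""
    match = PAGEQUALITY_RE.search(wikitext)
    if match is None:
        return None
    return int(match.group("level")), match.group("user")


def looks_like_temp_account(username):
    """Assumption: temp account names start with '~', as in the example."""
    return username.startswith("~")


sample = (
    '<noinclude><pagequality level="3" user="~2024-1" /></noinclude>'
    "....<noinclude></noinclude>"
)
level, user = parse_pagequality(sample)
```

For the sample above this yields level 3 and the user name ~2024-1, so the temporary account name is stored in the page text in the same way a named user's would be.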

That seems fine to me. How did you test? Can you share your steps?

The steps I roughly followed are:

  • Set up ProofreadPage per the instructions in this guide I wrote a while back
  • Set $wgAutoCreateTempUser['enabled'] = true; and $wgGroupPermissions['*']['pagequality'] = true;
  • In an incognito window, as an anonymous user:
    • Create the page Index:War and Peace.djvu
    • Navigate to and create the page Page:War and Peace.djvu/438
    • Reopen the same page, set the page status/quality bar to a different color and save (see pic below)
    • Use the API sandbox to request the latest revision of the page
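
The last step can also be done outside the API sandbox. Below is a minimal Python sketch that only builds the request URL for the latest revision of the page, using the standard Action API query parameters. The base URL is a hypothetical local install and the function name is made up for this example; no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical local wiki endpoint; adjust to your own environment.
API_URL = "http://localhost:8080/w/api.php"


def build_latest_revision_url(api_url, title):
    """Build an Action API URL asking for the newest revision's content and user."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content|user",
        "rvslots": "main",
        "rvlimit": "1",
        "format": "json",
        "formatversion": "2",
    }
    return f"{api_url}?{urlencode(params)}"


url = build_latest_revision_url(API_URL, "Page:War and Peace.djvu/438")
print(url)
```

Fetching that URL from the same incognito session should return the wikitext of the newest revision, which is where the pagequality tag with the temporary account name shows up.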

[Screenshot: Screenshot From 2024-10-16 09-34-01.png]

Tchanders claimed this task.

This appears to be resolved. Please re-open if not!

Change #1080621 abandoned by Kosta Harlan:

[operations/mediawiki-config@master] ProofreadPage: Remove pagequality permission override

https://gerrit.wikimedia.org/r/1080621