
Report progress on rollout of unified mobile routing (Hadoop query)
Closed, ResolvedPublic

Assigned To
Authored By
Krinkle
Wed, Sep 24, 12:56 AM
Referenced Files
F66761724: 2025_mobile_reqall_abs_1060_log.png
Sun, Oct 19, 9:42 AM
F66761751: 2025_mobile_recovery_dewiki.png
Sun, Oct 19, 9:42 AM
F66761749: 2025_mobile_regression_dewiki.png
Sun, Oct 19, 9:42 AM
F66761753: 2025_mobile_recovery_idwiki.png
Sun, Oct 19, 9:42 AM
F66761747: 2025_mobile_regression_idwiki.png
Sun, Oct 19, 9:42 AM
F66761755: 2025_mobile_recovery_fawiki.png
Sun, Oct 19, 9:42 AM
F66761787: 2025_mobile_regression_fawiki.png
Sun, Oct 19, 9:42 AM
F66761757: 2025_mobile_recovery_world_p50.png
Sun, Oct 19, 9:42 AM

Description

This task is part of WE6.4.4 Mobile domain sunsetting (FY25-26 Q1), as tracked by the parent task T214998.


The hypothesis is as follows (emphasis mine):

If we unify our domains by serving all page views on MediaWiki sites through a canonical domain, then we will reduce platform complexity and Search Engine Optimization (SEO) risks by eliminating the mobile-subdomain redirect. Completion is measured by decreasing redirects for mobile visits on canonical domains from 100% to 0%.

Work

  • A week after the pilot rollout (T401595), review traffic on a pilot wiki and validate the basic working of the query and the change.
  • Decide on which wikis and wiki families of the main rollout (T403510) to check as milestones.
  • Write Hadoop query that measures the completion metric, and run it for July 2025 and August 2025 to set our baseline.
  • During the main rollout (T403510), run it every few weeks for the week and report progress, looking for anything unexpected that might prevent us from reaching our target.

Event Timeline


Chatting with @BBlack last week, we agreed to not include redirecting the m-dot domains to standard domains based on a fixed schedule (e.g. 1 week after rollout), but instead follow real traffic draining. This minimizes CDN and browser overhead for these redirects in the opposite direction, at the cost of keeping the UX downside of "unexpected mobile view on desktop" for a little while longer.

Given that we can't know how quickly external traffic sources will switch over, this could be many months and so we decided to separate this step from the completion metric of this project. We may very well start doing it soon for some wikis already, but we won't hold the completion metric to it.

Hypothesis as published in July 2025:

[…] Completion is measured by increasing mobile page views on canonical domains from 0% to 100%.

This would have counted mobile pageviews (on any domain) and measured what percentage take place on the mobile domain (initially ~100%) and what percentage on the standard domain (initially ~0%).

This would have meant that even when we ramp up the project to 100% on all wikis, we would not reach 100% of the metric, because the metric includes traffic from an external link (outside our control) pointing directly to a mobile URL (as opposed to from our redirect).

We agreed on the following revision.

Revised hypothesis as of 23 Sep 2025:

[…] Completion is measured by decreasing redirects for mobile visits on canonical domains from 100% to 0%.

This will count requests for pageview URLs to standard domains from mobile browsers, and measure the percentage we redirect to a mobile domain (initially ~100%) vs how many we serve directly on the standard domain (initially ~0%).
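
To make the difference between the two definitions concrete, here is a minimal sketch in Python. The function names and the request counts are hypothetical and chosen only for illustration; they are not taken from any dataset.

```python
def canonical_pageview_share(canonical_views: int, mdot_views: int) -> float:
    """July 2025 definition: share of mobile pageviews served on the canonical domain."""
    total = canonical_views + mdot_views
    return canonical_views / total if total else 0.0


def redirect_share(redirected: int, served_directly: int) -> float:
    """Revised definition: share of mobile requests to the canonical domain that get redirected."""
    total = redirected + served_directly
    return redirected / total if total else 0.0


# Hypothetical day after a full rollout: no redirects remain, but some
# external links still point straight at m-dot URLs.
served_directly = 950
redirected = 0
direct_mdot_views = 50

old_metric = canonical_pageview_share(served_directly, direct_mdot_views)
new_metric = redirect_share(redirected, served_directly)

print(f"canonical pageview share: {old_metric:.0%}")
print(f"redirect share: {new_metric:.0%}")
```

With these made-up numbers the original metric stalls below 100% even though the rollout is complete, while the revised metric reaches its 0% target.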

The prep work last month updated how the webrequest dataset in Hadoop works, so that it understands both legacy m-dot pageviews and unified mobile pageviews as access_method="mobile web" (T390924, T389696, T401576, T401665).

After the pilot rollout to test.wikipedia.org, wikitech.wikimedia.org, and mediawiki.org on 5-8 Sep 2025, I confirmed that this is indeed working.

[…] I checked pageviews for wikitech.wikimedia.org. This makes for a great test case, because it has no mobile subdomain and thus has no existing "mobile web" pageviews (details about why, at T190384#11153446). Any such pageviews counted now, must be from the unified mobile routing:

Following the 5 Sep rollout, it now has mobile pageviews! https://w.wiki/FH7E

Screenshot 2025-09-08 at 16.01.11.png (1×1 px, 161 KB)

I also checked test.wikipedia.org and mediawiki.org, but those are harder to get a strong signal from, because we're expecting "no change" and they're fairly small wikis with a lot of fluctuation. Plus, they still receive explicit traffic to m-dot URLs as well, so we can't attribute it fully to unified mobile routing.

From the task description:

Decide on which wikis and wiki families of the main rollout (T403510) to check as milestones.

The 15-18 September batch is primarily for the three group1 canary wikis (ca.wikipedia.org, he.wikipedia.org, it.wikipedia.org). The largest one, in terms of pageviews, is it.wikipedia.org. That will give us the strongest signal as well as potential for edge cases to further refine the query.

Page views with "user" agent type in July and August 2025:

  • ca.wikipedia.org: 15-20 million
  • he.wikipedia.org: 55 million
  • it.wikipedia.org: 400 million

For the rest of the rollout, I'll run the query for the subtotal of all enabled projects, alongside the overall total.

From the task description:

Write Hadoop query that measures the completion metric […]

Starting in Turnilo, looking broadly at pageviews on it.wikipedia.org and it.m.wikipedia.org.

https://w.wiki/FS3a

  • Tue 16 Sep (133.9K):
    • it.wikipedia.org: 52.4K
    • it.m.wikipedia.org: 81.5K
  • Thu 18 Sep (141.8K):
    • it.wikipedia.org: 135.0K
    • it.m.wikipedia.org: 6.8K

Screenshot 2025-09-24 at 03.17.16.png (1×2 px, 102 KB)

https://w.wiki/FS3w

  • Wed 17 Sep:
    • 14:00-15:00: 3.0K / 3.7K
    • 15:00-16:00: 2.9K / 3.5K
    • 16:00-17:00: 2.7K / 3.4K
    • 17:00-18:00: 5.3K / 0.6K
  • Thu 18 Sep:
    • 17:00-18:00: 6.3K / 0.3K

Screenshot 2025-09-24 at 03.30.15.png (1×1 px, 282 KB)

I'm using a crude pageview approximation of 200 OK and /wiki/, because is_pageview is missing per T212778. The switch date is 17 Sep (Varnish change around 18:00 UTC, which shifted incoming traffic; MediaWiki change around 23:00 UTC, which removed link promotion from HTML metadata and footer links). Note that this dataset is sampled 1/128, so the absolute numbers are much lower than actual pageviews those days.

Around 90% of traffic shifted within an hour of the rollout. This confirms what we learned during the planning phase back in February (mw:Mobile domain sunsetting § Google), which is that a majority of pageviews are referred from search engines, and that Google refers both mobile and desktop clients alike to the standard domain. Because Google does not refer to m-dot URLs in its index, the switch is instantaneous for these visitors: we are simply not redirecting them anymore.
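
The daily Turnilo figures listed earlier tell the same story. As a rough sketch (the variable and function names are mine; the numbers are the sampled counts for Tue 16 Sep and Thu 18 Sep, in thousands), the m-dot share and the drop can be worked out like this:

```python
SAMPLING_FACTOR = 128  # the Turnilo dataset is sampled 1/128

# Sampled daily counts for the crude pageview approximation, in thousands.
tue_16_sep = {"it.wikipedia.org": 52.4, "it.m.wikipedia.org": 81.5}
thu_18_sep = {"it.wikipedia.org": 135.0, "it.m.wikipedia.org": 6.8}


def mdot_share(day: dict) -> float:
    """Fraction of the day's approximate pageviews that happened on the m-dot domain."""
    return day["it.m.wikipedia.org"] / sum(day.values())


def estimated_requests(sampled_thousands: float) -> int:
    """Scale a sampled count (in thousands) up to an estimate of real requests."""
    return round(sampled_thousands * 1000 * SAMPLING_FACTOR)


# Relative drop in m-dot pageviews between the two days.
mdot_drop = 1 - thu_18_sep["it.m.wikipedia.org"] / tue_16_sep["it.m.wikipedia.org"]

print(f"m-dot share before: {mdot_share(tue_16_sep):.1%}")
print(f"m-dot share after:  {mdot_share(thu_18_sep):.1%}")
print(f"m-dot drop:         {mdot_drop:.1%}")
print(f"estimated requests on Tue 16 Sep: {estimated_requests(sum(tue_16_sep.values())):,}")
```

This compares a Tuesday with a Thursday, so part of the difference may be ordinary weekday variation.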

A small percentage of traffic on m-dot remains, which is presumably based on existing sessions continuing (given that we use relative URLs between articles), bookmarks, cached redirects, and referrals in existing content on websites and apps that used the mobile URL (e.g. news sites, blogs, video descriptions, and social media).

Looking at HTTP 302 redirects, I was surprised to learn that, while mobile-domain pageviews reduced by 90%, standard-domain redirects reduced only by 50%: https://w.wiki/FS4B

Screenshot 2025-09-24 at 03.45.17.png (1×2 px, 146 KB)

What are all these redirects? https://w.wiki/FS4N

Screenshot 2025-09-24 at 03.56.51.png (682×2 px, 135 KB)

99% of it is Special:CentralAutoLogin, which is a JavaScript endpoint that we hit in the background during your first visit in a logged-out browsing session. Because it uses a pageview-ish URL, and the same HTTP 302 status code as the mobile redirect, it shows up in this plot. This isn't an issue with the real data, because is_pageview would reject this either for carrying a redirect status code or (for the non-redirect destination) for being in the Special-namespace.

Apart from Special:CentralAutoLogin, the second most common redirect on standard domains (after turning off the mobile redirect) is Special:Random, aka Speciale:PaginaCasuale in Italian.

I notice that while is_pageview is unaffected in Hadoop, the is_redirect_to_pageview field does get fooled by these, and apparently always has been.

SELECT http_status,uri_host,uri_path,is_pageview,is_redirect_to_pageview,x_analytics_map FROM wmf.pageview_actor WHERE year=2025 AND month=9 AND day=22 AND hour=13 AND uri_host='it.wikipedia.org' AND uri_path LIKE '/wiki/Special:CentralAutoLogin%' LIMIT 10;
http_status     uri_host        uri_path        is_pageview     is_redirect_to_pageview x_analytics_map
302     it.wikipedia.org        /wiki/Special:CentralAutoLogin/start    false   true    {"WMF-Last-Access":"22-Sep-2025","WMF-Last-Access-Global":"22-Sep-2025","client_port":#,"https":"1","ismobile":"1","ja3n":#,"wmfuniq":#}
302     it.wikipedia.org        /wiki/Special:CentralAutoLogin/start    false   true    {"WMF-Last-Access":"22-Sep-2025","WMF-Last-Access-Global":"22-Sep-2025","client_port":#,"https":"1","ismobile":"1","ja3n":#,"wmfuniq":#}
302     it.wikipedia.org        /wiki/Special:CentralAutoLogin/start    false   true    {"WMF-Last-Access":"22-Sep-2025","WMF-Last-Access-Global":"22-Sep-2025","client_port":#,"https":"1","ismobile":"1","ja3n":#,"wmfuniq":#}

In the analytics/refinery, PageviewDefinition.java#isRedirectToPageview passes a mostly-empty getXAnalyticsHeader down to pageDenotesContentConsumption, which defaults to true in absence of any pageview-related attributes. I'll have to keep this in mind later when writing the real query.

Onwards to Hadoop proper, then. I tend to start by getting a small sample from a single partition (1 hour of 1 day), so that I can iterate quickly on the query.

SELECT access_method,http_status,is_pageview,is_redirect_to_pageview
FROM wmf.pageview_actor
WHERE year=2025 AND month=9 AND day=22 and hour=13
AND uri_host='it.wikipedia.org' AND (is_pageview OR is_redirect_to_pageview)
LIMIT 10;

SELECT COUNT(*),access_method,http_status,is_pageview,is_redirect_to_pageview
FROM wmf.pageview_actor WHERE year=2025 AND month=9 AND day=22 and hour=13
AND uri_host='it.wikipedia.org' AND (is_pageview OR is_redirect_to_pageview)
GROUP BY access_method,http_status,is_pageview,is_redirect_to_pageview;
_count   access_method  http_status  is_pageview  is_redirect_to_pageview
493340   desktop        200          true    false
121304   desktop        302          false   true
452149   mobile web     200          true    false
146854   mobile web     302          false   true
# other statuses […]
# mobile app […]
Time taken: 19 seconds, Fetched 11 row(s)

I'm looking only at the standard domain (it.wikipedia.org, not it.m.wikipedia.org), and only at requests where either is_pageview or is_redirect_to_pageview is true. The is_redirect_to_pageview case is meant to represent mobile redirects, and the is_pageview case (depending on access_method) represents the unified mobile page views on the standard domain.

SELECT
  CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) _date,
  is_pageview,
  is_redirect_to_pageview,
  COUNT(*) _count
FROM wmf.pageview_actor
WHERE year=2025 AND month=9
AND uri_host='it.wikipedia.org' AND access_method='mobile web'
AND (is_pageview OR is_redirect_to_pageview)
GROUP BY year, month, day, is_pageview, is_redirect_to_pageview
ORDER BY _date ASC, is_pageview, is_redirect_to_pageview;
_date   is_pageview     is_redirect_to_pageview _count
2025-09-17      false   true    78348
2025-09-17      true    false   4001615
2025-09-18      false   true    2860871
2025-09-18      true    false   9448690
2025-09-19      false   true    2884878
2025-09-19      true    false   9492527
2025-09-20      false   true    3208558
2025-09-20      true    false   10703710
2025-09-21      false   true    3551409
2025-09-21      true    false   11864476
2025-09-22      false   true    2978334
2025-09-22      true    false   9787520
2025-09-23      false   true    2997908
2025-09-23      true    false   9990996
2025-09-24      false   true    50366
2025-09-24      true    false   293099

Nothing before Sep 17. Interesting...

Mobile redirects are generated on the edge in Varnish, which means they don't set X-Subdomain:M on the origin request, because there is no origin request. That means we don't set x_analytics[ismobile]=1 and thus get access_method=desktop. This makes sense in hindsight, and is consistent with how access_method worked historically as well (prior to T390924 and T401665), because it was simply based on the domain, and this is the canonical/desktop domain. It's not worth changing, either, since these requests won't exist after next month anyway.

My goal is to count "requests to the standard domain from mobile browsers", of which a portion is served a redirect, and a portion is served a pageview. This won't work as-is because the redirects don't distinguish access_method. So... I'll just re-classify it based on a crude user_agent regex myself then. Combined with what we learned about Special:CentralAutoLogin, this brings us to:

SELECT
  CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
  CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket,
  COUNT(*) AS _count
FROM wmf.pageview_actor
WHERE year=2025 AND month=9
AND uri_host='it.wikipedia.org' AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%' 
AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web'))
GROUP BY year, month, day, is_pageview, is_redirect_to_pageview
ORDER BY _date ASC, _bucket ASC;
_date   _bucket _count
2025-09-01      mobile_redirect 6160757
2025-09-02      mobile_redirect 6096130
2025-09-03      mobile_redirect 5809544
2025-09-04      mobile_redirect 6252553
2025-09-05      mobile_redirect 6103911
2025-09-06      mobile_redirect 6800843
2025-09-07      mobile_redirect 7953640
2025-09-08      mobile_redirect 6088333
2025-09-09      mobile_redirect 5982818
2025-09-10      mobile_redirect 6027794
2025-09-11      mobile_redirect 5715456
2025-09-12      mobile_redirect 5475739
2025-09-13      mobile_redirect 6263999
2025-09-14      mobile_redirect 6671414
2025-09-15      mobile_redirect 5390873
2025-09-16      mobile_redirect 5811129
2025-09-17      mobile_pageview 4001615
2025-09-17      mobile_redirect 3302302
2025-09-18      mobile_pageview 9448690
2025-09-18      mobile_redirect 193866
2025-09-19      mobile_pageview 9492527
2025-09-19      mobile_redirect 210654
2025-09-20      mobile_pageview 10703710
2025-09-20      mobile_redirect 268878
2025-09-21      mobile_pageview 11864476
2025-09-21      mobile_redirect 279961
2025-09-22      mobile_pageview 9787520
2025-09-22      mobile_redirect 238670
2025-09-23      mobile_pageview 9222392
2025-09-23      mobile_redirect 228306
Time taken: 262 seconds, Fetched 30 row(s)

Screenshot 2025-09-24 at 04.58.12.png (728×1 px, 64 KB)
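
For readers who want to try the crude user_agent classification outside Hadoop, here is a minimal Python sketch of the same pattern. The sample user-agent strings are hypothetical examples written for illustration, not rows from the dataset, and the helper name is mine.

```python
import re

# Python equivalent of the Hive pattern '(?i)(android|mobi)'. RLIKE matches
# anywhere in the string, which corresponds to re.search() here.
MOBILE_UA = re.compile(r"(android|mobi)", re.IGNORECASE)


def looks_mobile(user_agent: str) -> bool:
    """Return True if the user agent matches the crude mobile pattern."""
    return MOBILE_UA.search(user_agent) is not None


# Hypothetical user-agent strings for illustration only.
SAMPLES = {
    "android_phone": "Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36 "
                     "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
    "iphone": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 "
              "(KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
    "windows_desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                       "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "linux_desktop": "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
}

for label, ua in SAMPLES.items():
    print(f"{label}: {looks_mobile(ua)}")
```

The pattern is deliberately loose: it only approximates what Varnish treats as a mobile browser, which is why it is applied to the redirect rows only, where access_method cannot be used.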

I'm not happy with the outcome yet, because the total is notably higher after the switch which suggests something is being counted that shouldn't.

A few possible explanations (to be validated):

  • The "mobile_redirect" could be an undercount, if the user_agent regex is significantly lacking. This seems unlikely because I confirmed at T214998#10551073 that mobile|android covers 99%.
  • The increase could be real, if there are lots of views on URLs that are unsafe to redirect. For example, a URL like https://es.wikipedia.org/wiki/Sonido?curid=7611 is unsafe to redirect and was given desktop view in-place to mobile users previously, and now activates MobileFrontend in-place. This also seems unlikely as these URLs should be rare. Unless... I'm including bot/spider views? I am. OK. I should exclude those.
  • The increase could be real, if there are lots of "mobile"-like bots/spiders that don't follow our redirect. That would mean where previously they would give up, they now succeed. This is interesting and worth confirming separately, but per the previous point, for this metric we should focus on agent_type='user'.

[…]

  • The increase could be true, if there are lots of views on URLs that are unsafe to redirect. For example, a URL like https://es.wikipedia.org/wiki/Sonido?curid=7611 […]. This also seems unlikely as these URLs should be rare. Unless... I'm counting bot/spider views? I am. OK. We should fix that.
  • The increase could be true, if there are lots of "mobile"-like bots/spiders that don't follow our redirect. That would mean where previously they would give up, they now succeed. This is interesting […], but […] for this metric we should focus on agent_type='user'.

Looks like most of the "excess" requests are in fact agent_type=user, which means that after fixing this, we actually have a bigger gap than before.

SELECT CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date, agent_type, CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket, COUNT(*) AS _count FROM wmf.pageview_actor WHERE year=2025 AND month=9 AND day=15 AND uri_host='it.wikipedia.org' AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%' AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web')) GROUP BY year,month,day,agent_type,is_pageview,is_redirect_to_pageview ORDER BY _date ASC, _bucket ASC;

_date           agent_type      _bucket          _count
2025-09-15      user            mobile_redirect 4503239
2025-09-15      spider          mobile_redirect 794178
2025-09-15      automated       mobile_redirect 93456
Time taken: 51 seconds, Fetched 3 row(s)


SELECT CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date, agent_type, CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket, COUNT(*) AS _count FROM wmf.pageview_actor WHERE year=2025 AND month=9 AND day=22 AND uri_host='it.wikipedia.org' AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%' AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web')) GROUP BY year,month,day,agent_type,is_pageview,is_redirect_to_pageview ORDER BY _date ASC, _bucket ASC;

_date           agent_type      _bucket         _count
2025-09-22      user            mobile_pageview 8874722
2025-09-22      spider          mobile_pageview 774169
2025-09-22      automated       mobile_pageview 138629
2025-09-22      user            mobile_redirect 148951
2025-09-22      spider          mobile_redirect 82338
2025-09-22      automated       mobile_redirect 7381
Time taken: 69 seconds, Fetched 6 row(s)

The spider/automated traffic stayed more or less constant. It is the user traffic that nearly doubled.

2025-09-15      mobile_redirect 5.4M     # = 4.5M user + 0.8M spider + 0.1M automated

2025-09-22      mobile_pageview 9.8M     # = 8.9M user + 0.8M spider + 0.1M automated
2025-09-22      mobile_redirect 0.2M     # = 0.1M user + 0.1M spider + 0.007M automated
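
The same comparison can be reproduced from the two result sets above. This is a small sketch using the exact counts for 15 Sep and 22 Sep; the variable names are mine.

```python
# Counts per agent_type from the two queries above.
day_15_redirect = {"user": 4503239, "spider": 794178, "automated": 93456}
day_22_pageview = {"user": 8874722, "spider": 774169, "automated": 138629}
day_22_redirect = {"user": 148951, "spider": 82338, "automated": 7381}

# Growth factor of (pageviews + redirects) on 22 Sep relative to redirects on 15 Sep.
growth = {}
for agent, before in day_15_redirect.items():
    after = day_22_pageview[agent] + day_22_redirect[agent]
    growth[agent] = after / before
    print(f"{agent:<10} before={before:>9,} after={after:>9,} growth=x{growth[agent]:.2f}")
```

In absolute terms the spider and automated buckets move by well under 0.1M each, whereas the user bucket gains about 4.5M.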

Revised plot:

SELECT CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date, CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket, COUNT(*) AS _count FROM wmf.pageview_actor WHERE year=2025 AND month=9 AND uri_host='it.wikipedia.org' AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%' AND agent_type='user' AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web')) GROUP BY year,month,day,is_pageview,is_redirect_to_pageview ORDER BY _date ASC, _bucket ASC;

_date   _bucket _count
2025-09-01      mobile_redirect 5389132
2025-09-02      mobile_redirect 5373279
2025-09-03      mobile_redirect 4967930
2025-09-04      mobile_redirect 5399847
2025-09-05      mobile_redirect 5349266
2025-09-06      mobile_redirect 5955947
2025-09-07      mobile_redirect 7131795
2025-09-08      mobile_redirect 5328855
2025-09-09      mobile_redirect 5147603
2025-09-10      mobile_redirect 5132344
2025-09-11      mobile_redirect 4834211
2025-09-12      mobile_redirect 4690828
2025-09-13      mobile_redirect 5517998
2025-09-14      mobile_redirect 5800554
2025-09-15      mobile_redirect 4503239
2025-09-16      mobile_redirect 4776361
2025-09-17      mobile_pageview 3772867
2025-09-17      mobile_redirect 2538447
2025-09-18      mobile_pageview 8562604
2025-09-18      mobile_redirect 142301
2025-09-19      mobile_pageview 8603030
2025-09-19      mobile_redirect 142190
2025-09-20      mobile_pageview 9732128
2025-09-20      mobile_redirect 165987
2025-09-21      mobile_pageview 10860276
2025-09-21      mobile_redirect 177213
2025-09-22      mobile_pageview 8874722
2025-09-22      mobile_redirect 148951
2025-09-23      mobile_pageview 9005710
2025-09-23      mobile_redirect 147370
2025-09-24      mobile_pageview 9339080
2025-09-24      mobile_redirect 151483
…
Time taken: 221 seconds, Fetched 34 row(s)

Screenshot 2025-09-25 at 19.32.52.png (728×1 px, 65 KB)
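
To turn result rows like these into the percentage that the hypothesis talks about, the two buckets can be combined per day. The following is a minimal sketch, assuming the query output is saved as tab-separated text; the function name is mine and only a few rows from the result above are included.

```python
from collections import defaultdict

# A few rows copied from the result above (tab-separated: date, bucket, count).
HIVE_ROWS = """\
2025-09-16\tmobile_redirect\t4776361
2025-09-17\tmobile_pageview\t3772867
2025-09-17\tmobile_redirect\t2538447
2025-09-18\tmobile_pageview\t8562604
2025-09-18\tmobile_redirect\t142301
"""


def redirect_share_by_day(rows: str) -> dict:
    """Return {date: redirects / (redirects + pageviews)} for each day in the rows."""
    counts = defaultdict(lambda: {"mobile_pageview": 0, "mobile_redirect": 0})
    for line in rows.splitlines():
        if not line.strip():
            continue
        date, bucket, count = line.split("\t")
        counts[date][bucket] += int(count)
    shares = {}
    for date, c in sorted(counts.items()):
        total = c["mobile_pageview"] + c["mobile_redirect"]
        shares[date] = c["mobile_redirect"] / total
    return shares


shares = redirect_share_by_day(HIVE_ROWS)
for date, share in shares.items():
    print(f"{date}: {share:.1%} redirected")
```

A day with no mobile_pageview row, such as 16 Sep, comes out as 100% redirected, which matches the starting point of the metric.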

[…] Revised plot:

Screenshot 2025-09-25 at 19.32.52.png (728×1 px, 65 KB)

Compared to: stats.wikimedia.org - Total page views it.wikipedia.org for agent=user and access=mobile web

Screenshot 2025-09-26 at 20.56.59.png (866×1 px, 79 KB)

My "after" state of mobile_pageviews corresponds almost 1:1 with the official pageview data. My numbers are ~1% lower because as previously decided, we exclude mobile pageviews directly on it.m.wikipedia.org, which the official data includes.

My "before" state is much lower, so instead of looking at what we may be overcounting, we're probably undercounting something there. And once I knew which side to look at, it became obvious: The "before" mobile_redirect effectively counts only the first request in a browsing session, because any subsequent pageviews by the same user will take place on the mobile domain and thus are not a redirect. The "after" mobile_pageview data has no such distinction, because we count first and subsequent pageviews equally within the it.wikipedia.org domain.

Excluding referer_class=internal:

SELECT
    CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
    CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket,
    COUNT(*) AS _count
FROM wmf.pageview_actor
    WHERE year=2025 AND month=9
    AND agent_type='user'
    AND referer_class!='internal'
    AND uri_host='it.wikipedia.org'
    AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%'
    AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web'))
GROUP BY year,month,day,is_pageview,is_redirect_to_pageview
ORDER BY _date ASC, _bucket ASC;
_date   _bucket _count
2025-09-01      mobile_redirect 5328850
2025-09-02      mobile_redirect 5314775
2025-09-03      mobile_redirect 4910652
2025-09-04      mobile_redirect 5339340
2025-09-05      mobile_redirect 5289513
2025-09-06      mobile_redirect 5891584
2025-09-07      mobile_redirect 7060375
2025-09-08      mobile_redirect 5270266
2025-09-09      mobile_redirect 5091284
2025-09-10      mobile_redirect 5073852
2025-09-11      mobile_redirect 4778834
2025-09-12      mobile_redirect 4637169
2025-09-13      mobile_redirect 5457036
2025-09-14      mobile_redirect 5735112
2025-09-15      mobile_redirect 4448900
2025-09-16      mobile_redirect 4720290
2025-09-17      mobile_pageview 2272714
2025-09-17      mobile_redirect 2489612
2025-09-18      mobile_pageview 5206524
2025-09-18      mobile_redirect 5583
2025-09-19      mobile_pageview 5230591
2025-09-19      mobile_redirect 5171
2025-09-20      mobile_pageview 5943701
2025-09-20      mobile_redirect 7452
2025-09-21      mobile_pageview 6696925
2025-09-21      mobile_redirect 8159
2025-09-22      mobile_pageview 5412327
2025-09-22      mobile_redirect 7484
2025-09-23      mobile_pageview 5468716
2025-09-23      mobile_redirect 7739
2025-09-24      mobile_pageview 5653914
2025-09-24      mobile_redirect 7179
2025-09-25      mobile_pageview 5170886
2025-09-25      mobile_redirect 6553
2025-09-26      mobile_pageview 3593041
2025-09-26      mobile_redirect 5129
Time taken: 279 seconds, Fetched 36 row(s)

Screenshot 2025-09-26 at 21.19.59.png (728×1 px, 62 KB)
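
As a sanity check on the "first request in a session" explanation, the results with and without the referer_class filter can be compared for 15 Sep and 22 Sep. A rough sketch with the counts copied from the two result sets for agent_type='user' (the variable names are mine):

```python
# agent_type='user' counts, without and with referer_class != 'internal'.
before_redirect_all = 4503239       # 2025-09-15, all referers
before_redirect_external = 4448900  # 2025-09-15, internal referers excluded
after_pageview_all = 8874722        # 2025-09-22, all referers
after_pageview_external = 5412327   # 2025-09-22, internal referers excluded
after_redirect_external = 7484      # 2025-09-22, internal referers excluded

# Share of each bucket that came from internal referers.
internal_share_before = 1 - before_redirect_external / before_redirect_all
internal_share_after = 1 - after_pageview_external / after_pageview_all

# How much larger the "after" total is than the "before" total once filtered.
filtered_ratio = (after_pageview_external + after_redirect_external) / before_redirect_external

print(f"internal share of 'before' redirects: {internal_share_before:.1%}")
print(f"internal share of 'after' pageviews:  {internal_share_after:.1%}")
print(f"after/before ratio with the filter:   x{filtered_ratio:.2f}")
```

Only about 1% of the "before" redirects had an internal referer, against roughly 39% of the "after" pageviews, which fits the explanation. The filtered totals still differ by about a fifth between these two days, so the match is approximate.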

I'll roll with this query for now to set our baseline and initial report. I'm open to tweaking and re-doing those queries if we learn something to improve the accuracy of this.

Report 26 Sep 2025 - Traffic ramp up

As of Wednesday, the unified mobile router is live on 29% of wikis.

Screenshot 2025-09-27 at 02.24.46.png (740×1 px, 72 KB)

As of Thursday, 98.9% of incoming mobile visits on canonical domains for these wikis are served directly, without a redirect. Globally this makes up about 6.8%.

2025-09-27 rolled-abs.png (896×1 px, 63 KB) 2025-09-27 all-abs.png (908×1 px, 59 KB)
2025-09-27 rolled-pc.png (904×1 px, 27 KB) 2025-09-27 all-pc.png (902×1 px, 28 KB)
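
The headline figures can be recomputed from the data listed under References. This is a sketch using the Hadoop results for Thursday 25 Sep and the config counts; it assumes that the 311 wikis with wmgUseMdotRouting set to false are the ones on the unified router, which is my reading of the 29% figure rather than something stated in the config output.

```python
# 2025-09-25 rows from the two Hadoop results under References.
rolled_out = {"mobile_pageview": 9181829, "mobile_redirect": 104708}
all_wikis = {"mobile_pageview": 9186344, "mobile_redirect": 125437159}

# Config counts; assumption: "wmgUseMdotRouting": false means unified routing is live.
wikis_total = 1060
wikis_unified = 311


def served_directly_share(counts: dict) -> float:
    """Share of mobile requests to canonical domains served without a redirect."""
    return counts["mobile_pageview"] / (counts["mobile_pageview"] + counts["mobile_redirect"])


rolled_share = served_directly_share(rolled_out)
global_share = served_directly_share(all_wikis)
wiki_share = wikis_unified / wikis_total

print(f"rolled-out wikis served directly: {rolled_share:.1%}")
print(f"all wikis served directly:        {global_share:.1%}")
print(f"wikis on the unified router:      {wiki_share:.0%}")
```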

References:

wmf-config repo
operations-mediawiki-config$ ./multiversion/bin/expanddblist 'all' | wc -l
1060

operations-mediawiki-config$ composer buildConfigCache; ack --no-filename '"wmgUseMdotRouting":' tests/data/config-cache/conf-production-* | sort | uniq -c
    311     "wmgUseMdotRouting": false,
    749     "wmgUseMdotRouting": true,
Hadoop query - Rolled out wikis
SELECT
    CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
    CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket,
    COUNT(*) AS _count
FROM wmf.pageview_actor
    WHERE year=2025 AND month=9
    AND agent_type='user'
    AND referer_class!='internal'
    AND uri_host RLIKE '(^www\.mediawiki\.org|^(ca|he|it|fa)\.wikipedia\.org|^meta\.wikimedia\.org|\.(wikinews|wikibooks|wikiquote|wikivoyage|wikiversity)\.org)$'
    -- From puppet/varnish/text-frontend.vcl@cluster_fe_recv_pre_purge
    AND uri_host NOT RLIKE '(?i)^([a-z0-9-]+\\.)?m\\.'
    AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%'
    AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web'))
GROUP BY year,month,day,is_pageview,is_redirect_to_pageview
ORDER BY _date ASC, _bucket ASC;
Hadoop result - Rolled out wikis
_date   _bucket _count
2025-09-01      mobile_redirect 9173124
2025-09-02      mobile_redirect 9139723
2025-09-03      mobile_redirect 8591564
2025-09-04      mobile_redirect 9007455
2025-09-05      mobile_pageview 20249
2025-09-05      mobile_redirect 9236422
2025-09-06      mobile_pageview 58671
2025-09-06      mobile_redirect 9956883
2025-09-07      mobile_pageview 60916
2025-09-07      mobile_redirect 10808248
2025-09-08      mobile_pageview 52414
2025-09-08      mobile_redirect 8637792
2025-09-09      mobile_pageview 39944
2025-09-09      mobile_redirect 9013440
2025-09-10      mobile_pageview 42360
2025-09-10      mobile_redirect 9264013
2025-09-11      mobile_pageview 44976
2025-09-11      mobile_redirect 8463800
2025-09-12      mobile_pageview 51543
2025-09-12      mobile_redirect 8813351
2025-09-13      mobile_pageview 36480
2025-09-13      mobile_redirect 9018222
2025-09-14      mobile_pageview 41452
2025-09-14      mobile_redirect 9851504
2025-09-15      mobile_pageview 35700
2025-09-15      mobile_redirect 8299137
2025-09-16      mobile_pageview 29582
2025-09-16      mobile_redirect 8415992
2025-09-17      mobile_pageview 2594172
2025-09-17      mobile_redirect 5755458
2025-09-18      mobile_pageview 6950199
2025-09-18      mobile_redirect 1973708
2025-09-19      mobile_pageview 8992180
2025-09-19      mobile_redirect 302325
2025-09-20      mobile_pageview 9654453
2025-09-20      mobile_redirect 517064
2025-09-21      mobile_pageview 10353964
2025-09-21      mobile_redirect 498543
2025-09-22      mobile_pageview 8782507
2025-09-22      mobile_redirect 436490
2025-09-23      mobile_pageview 8892142
2025-09-23      mobile_redirect 454979
2025-09-24      mobile_pageview 9405849
2025-09-24      mobile_redirect 199306
2025-09-25      mobile_pageview 9181829
2025-09-25      mobile_redirect 104708
2025-09-26      mobile_pageview 9384746
2025-09-26      mobile_redirect 126200
Hadoop query - All wikis
SELECT
    CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
    CASE WHEN is_redirect_to_pageview THEN 'mobile_redirect' ELSE 'mobile_pageview' END AS _bucket,
    COUNT(*) AS _count
FROM wmf.pageview_actor
    WHERE year=2025 AND month=9
    AND agent_type='user'
    AND referer_class!='internal'
    -- From puppet/varnish/text-frontend.vcl@cluster_fe_recv_pre_purge
    AND uri_host NOT RLIKE '(?i)^([a-z0-9-]+\\.)?m\\.'
    AND uri_path NOT LIKE '/wiki/Special:CentralAutoLogin%'
    AND ((is_redirect_to_pageview AND user_agent RLIKE '(?i)(android|mobi)') OR (is_pageview AND access_method='mobile web'))
GROUP BY year,month,day,is_pageview,is_redirect_to_pageview
ORDER BY _date ASC, _bucket ASC;
Hadoop result - All wikis
_date   _bucket _count
2025-09-01      mobile_redirect 142201266
2025-09-02      mobile_redirect 138331858
2025-09-03      mobile_pageview 592
2025-09-03      mobile_redirect 135349366
2025-09-04      mobile_pageview 2099
2025-09-04      mobile_redirect 136241581
2025-09-05      mobile_pageview 28675
2025-09-05      mobile_redirect 142230435
2025-09-06      mobile_pageview 74057
2025-09-06      mobile_redirect 156415641
2025-09-07      mobile_pageview 63517
2025-09-07      mobile_redirect 166823456
2025-09-08      mobile_pageview 55108
2025-09-08      mobile_redirect 140957942
2025-09-09      mobile_pageview 48609
2025-09-09      mobile_redirect 140381094
2025-09-10      mobile_pageview 53205
2025-09-10      mobile_redirect 149529379
2025-09-11      mobile_pageview 48522
2025-09-11      mobile_redirect 140852499
2025-09-12      mobile_pageview 62666
2025-09-12      mobile_redirect 140701802
2025-09-13      mobile_pageview 38824
2025-09-13      mobile_redirect 152986136
2025-09-14      mobile_pageview 53612
2025-09-14      mobile_redirect 168662255
2025-09-15      mobile_pageview 45451
2025-09-15      mobile_redirect 142692572
2025-09-16      mobile_pageview 35286
2025-09-16      mobile_redirect 141959762
2025-09-17      mobile_pageview 2598732
2025-09-17      mobile_redirect 135176660
2025-09-18      mobile_pageview 6953730
2025-09-18      mobile_redirect 128125946
2025-09-19      mobile_pageview 8996338
2025-09-19      mobile_redirect 127717003
2025-09-20      mobile_pageview 9659733
2025-09-20      mobile_redirect 143640675
2025-09-21      mobile_pageview 10358343
2025-09-21      mobile_redirect 153239971
2025-09-22      mobile_pageview 8786518
2025-09-22      mobile_redirect 131176161
2025-09-23      mobile_pageview 8898484
2025-09-23      mobile_redirect 130905838
2025-09-24      mobile_pageview 9412157
2025-09-24      mobile_redirect 127066627
2025-09-25      mobile_pageview 9186344
2025-09-25      mobile_redirect 125437159
2025-09-26      mobile_pageview 9388601
2025-09-26      mobile_redirect 127979199
Time taken: 305.849 seconds, Fetched 50 row(s)
Krinkle renamed this task from Write Hadoop query for progres metric of unified mobile routing metric to Report progress on rollout of unified mobile routing (Hadoop query). Sat, Sep 27, 1:51 AM
Krinkle updated the task description.

Report 30 Sep 2025 - Performance preview

I'd like to share an early performance report from the rollout so far.

Back in February, I gathered the following data for RFC: Mobile domain sunsetting § Site speed on mediawiki.org:

Screenshot 2025-09-30 at 18.28.48.png (680×1 px, 44 KB) Screenshot 2025-09-30 at 18.29.01.png (680×1 px, 47 KB) Screenshot 2025-09-30 at 18.28.55.png (680×1 px, 46 KB)

This demonstrates the ~200ms regression in time to first byte as incurred through the mobile redirect. The redirect has been there since 2010, but from 2010 to 2024 Google stored both a mobile URL and a desktop URL for every search result, and linked directly to the mobile URL for mobile visitors of Google.com. This special-casing stopped mid-2024. Thus, with about 60% of incoming pageviews referred from Google, that change is directly visible in our data. Of course, for the other 40% this redirect has always been there.

We could only guess that this was the cause, because we only noticed the difference six months later in Dec 2024. We don't know when Google changed this. In theory, the regression could be caused by anything. Until now!

Following the rollout on the Italian and Persian Wikipedias last week, we see a sharp cut of 200ms off the fetchStart metric for Italian Wikipedia and over half a second (500ms) for Persian Wikipedia (p75, mobile pageviews). This improvement is both a recovery that reverses the regression and a net improvement, because it applies to everyone, not just those given special treatment by Google (as pre-2024).

Screenshot 2025-09-30 at 18.36.54.png (680×1 px, 55 KB) Screenshot 2025-09-30 at 18.36.58.png (680×1 px, 57 KB)

Usually when we improve a frontend metric, the earlier the metric, the bigger the impact. For example, an improvement in TTFB (time to first byte) often results in an even bigger improvement on Page Load Time and Visual Completion metrics, because the request cost is incurred many times during a page load for images and other assets. In this case, however, part of the saving is absorbed. Of the 500ms spent in the mobile redirect on Persian Wikipedia, about 250ms is for DNS and HTTPS. That connection setup was done during the redirect and reused afterwards, so it did not happen again for the page itself. With the redirect gone, the DNS and HTTPS setup is now measured by responseStart instead of fetchStart. This is how it already worked on desktop.

References:

  • Raw data and Hadoop queries for 2024: P73601.
  • Raw data and Hadoop queries for 2025: (below).
  • Chart: Google Sheet
Hadoop query - p75 itwiki
SELECT
    CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
    COUNT(*) AS _count,
    ROUND(PERCENTILE(event['fetchStart'],0.75)) _fetchStart,
    ROUND(PERCENTILE(event['responseStart'],0.75)) _responseStart
FROM event_sanitized.NavigationTiming
WHERE
year=2025 AND month=9
AND event['action']='view' AND event['isAnon']=true AND event['isOversample']=false
AND event['mobileMode']='stable'
AND wiki='itwiki'
GROUP BY year, month, day
ORDER BY _date ASC;
_date   _count  _fetchStart     _responseStart
2025-09-01      9174    231.0   464.0
2025-09-02      9300    224.0   458.0
2025-09-03      8591    228.0   477.0
2025-09-04      9069    239.0   481.0
2025-09-05      8969    235.0   476.0
2025-09-06      10023   227.0   466.0
2025-09-07      12271   216.0   442.0
2025-09-08      9330    223.0   452.0
2025-09-09      8702    228.0   469.0
2025-09-10      8759    235.0   502.0
2025-09-11      8177    231.0   477.0
2025-09-12      7886    229.0   469.0
2025-09-13      9007    251.0   525.0
2025-09-14      9611    231.0   483.0
2025-09-15      7575    241.0   508.0
2025-09-16      8062    248.0   500.0
2025-09-17      7729    119.0   514.0
2025-09-18      7602    27.0    424.0
2025-09-19      7684    27.0    404.0
2025-09-20      8795    26.0    397.0
2025-09-21      9676    26.0    430.0
2025-09-22      7915    26.0    412.0
2025-09-23      8012    27.0    371.0
2025-09-24      8286    27.0    369.0
2025-09-25      7495    26.0    415.0
2025-09-26      7457    27.0    421.0
2025-09-27      8975    25.0    390.0
2025-09-28      10139   25.0    371.0
2025-09-29      7415    25.0    372.0
2025-09-30      7233    25.0    367.0
Hadoop query - p75 fawiki
SELECT
    CONCAT(year,'-',LPAD(month, 2, '0'),'-',LPAD(day, 2, '0')) AS _date,
    COUNT(*) AS _count,
    ROUND(PERCENTILE(event['fetchStart'],0.75)) _fetchStart,
    ROUND(PERCENTILE(event['responseStart'],0.75)) _responseStart
FROM event_sanitized.NavigationTiming
WHERE
year=2025 AND month=9
AND event['action']='view' AND event['isAnon']=true AND event['isOversample']=false
AND event['mobileMode']='stable'
AND wiki='fawiki'
GROUP BY year, month, day
ORDER BY _date ASC;
_date   _count  _fetchStart     _responseStart
2025-09-01      4303    530.0   941.0
2025-09-02      3932    530.0   950.0
2025-09-03      3844    515.0   928.0
2025-09-04      4064    574.0   1050.0
2025-09-05      4569    508.0   925.0
2025-09-06      4035    513.0   947.0
2025-09-07      4158    557.0   1003.0
2025-09-08      3910    510.0   962.0
2025-09-09      3832    522.0   976.0
2025-09-10      4338    526.0   975.0
2025-09-11      4222    526.0   955.0
2025-09-12      4528    528.0   959.0
2025-09-13      4125    523.0   966.0
2025-09-14      4169    526.0   957.0
2025-09-15      4106    531.0   959.0
2025-09-16      4061    540.0   1016.0
2025-09-17      3982    541.0   981.0
2025-09-18      4069    374.0   920.0
2025-09-19      4368    31.0    770.0
2025-09-20      4247    31.0    777.0
2025-09-21      4288    30.0    769.0
2025-09-22      3952    31.0    744.0
2025-09-23      3718    30.0    776.0
2025-09-24      3620    31.0    810.0
2025-09-25      4051    29.0    785.0
2025-09-26      4210    31.0    800.0
2025-09-27      3671    31.0    800.0
2025-09-28      3802    30.0    774.0
2025-09-29      3746    31.0    806.0
2025-09-30      3906    31.0    809.0
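To make the "part of the saving is absorbed" point concrete, here is a minimal sketch that compares the fawiki numbers above before and after the rollout. It rests on two assumptions: the windows (1-16 Sep as "before", 19-30 Sep as "after", skipping the transition days) are my own choice, and the mean of daily p75 values is used as a rough proxy, which is not the same as the p75 of the whole period.

```python
from statistics import mean

# Daily p75 values in milliseconds, copied from the fawiki output above.
# Window choice is an assumption: 2025-09-17 and 2025-09-18 are left out.
before = {  # 2025-09-01 .. 2025-09-16
    "fetchStart": [530, 530, 515, 574, 508, 513, 557, 510,
                   522, 526, 526, 528, 523, 526, 531, 540],
    "responseStart": [941, 950, 928, 1050, 925, 947, 1003, 962,
                      976, 975, 955, 959, 966, 957, 959, 1016],
}
after = {  # 2025-09-19 .. 2025-09-30
    "fetchStart": [31, 31, 30, 31, 30, 31, 29, 31, 31, 30, 31, 31],
    "responseStart": [770, 777, 769, 744, 776, 810,
                      785, 800, 800, 774, 806, 809],
}

# Saving per metric: average before minus average after.
saving = {m: mean(before[m]) - mean(after[m]) for m in before}

# The part of the fetchStart saving that does not show up in responseStart,
# because connection setup now happens after fetchStart instead of before it.
absorbed = saving["fetchStart"] - saving["responseStart"]

for metric, ms in saving.items():
    print(f"{metric}: saved about {ms:.0f} ms")
print(f"absorbed between fetchStart and responseStart: about {absorbed:.0f} ms")
```

With these windows the fetchStart saving comes out near 500ms while the responseStart saving is markedly smaller, which matches the direction described above. The exact split depends on the windows chosen.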

Report 4 Oct 2025 - Traffic update and performance improvement

Unified mobile routing is now live on 67% of wikis (711 of 1060).

2025_mobile_wikicount_711.png (710×1 px, 43 KB)

99.3% of incoming mobile pageviews on canonical domains for these wikis are served directly, without a redirect. Globally this brings us to 40.1%.

2025_mobile_reqrolled_abs_711.png (896×1 px, 65 KB) 2025_mobile_reqall_abs_711.png (906×1 px, 64 KB)
2025_mobile_reqrolled_pc_711.png (904×1 px, 29 KB) 2025_mobile_reqall_pc_711.png (902×1 px, 28 KB)
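The three percentages above relate to each other as follows. This is a back-of-the-envelope sketch, not taken from the Hadoop query: it assumes wikis not yet rolled out serve roughly 0% of mobile pageviews directly, and the derived `implied_traffic_share` is my own inference rather than a reported figure.

```python
# Figures quoted in this report.
rolled_out_wikis = 711
total_wikis = 1060
direct_on_rolled_out = 0.993   # direct share on wikis with unified routing
direct_globally = 0.401        # direct share across all wikis

# Share of wikis (by count) that have unified mobile routing.
wiki_share = rolled_out_wikis / total_wikis

# Assumption: remaining wikis contribute ~0% direct pageviews, so the global
# figure is the rolled-out figure scaled by their share of mobile traffic.
implied_traffic_share = direct_globally / direct_on_rolled_out

print(f"wikis rolled out: {wiki_share:.1%}")
print(f"implied share of mobile traffic on those wikis: {implied_traffic_share:.1%}")
```

Under that assumption, two thirds of wikis by count carry well under half of the mobile traffic, which is why the global figure lags behind the wiki count.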

Performance

Back in February, I gathered the following data for RFC: Mobile domain sunsetting § Site speed on mediawiki.org. It demonstrates a 200ms regression in page load time on the Indonesian Wikipedia starting in May 2024 (p75, mobile), that we believed was caused by the mobile redirect.

2025_mobile_regression_worldwide_p50.png (678×1 px, 41 KB) Screenshot 2025-09-30 at 18.28.48.png (680×1 px, 44 KB)
Screenshot 2025-09-30 at 18.29.01.png (680×1 px, 47 KB) Screenshot 2025-09-30 at 18.28.55.png (680×1 px, 46 KB)

Wikipedia has had a mobile domain with redirect since 2010. From 2010 to 2024, Google stored both a mobile and a desktop URL for each search result, and linked directly to the mobile URL when using Google Search from a mobile device (ref, ref). Google transitioned most sites to a "mobile-first" crawler between 2016 and 2023 (ref, ref, ref), which removes this separation and displays the same link to everyone. It seems wikipedia.org was among the last sites Google transitioned (ref), with May 2024 as likely change date. About 60% of Wikipedia incoming pageviews are referred from Google, so we should see the effect in our data. Of course, for the other 40% this redirect has always been there!

Here's what happened when we enabled unified mobile routing on several Wikipedias:

2025_mobile_recovery_fawiki.png (680×1 px, 57 KB) 2025_mobile_recovery_itwiki.png (680×1 px, 55 KB) 2025_mobile_recovery_idwiki.png (678×1 px, 57 KB) 2025_mobile_recovery_dewiki.png (680×1 px, 54 KB)

Across the board we see a 20-25% improvement in response times from mobile devices.

Persian Wikipedia saw a quarter-second cut in the responseStart metric, from 1.0s down to 0.75s (p75, mobile pageviews). Indonesian Wikipedia went from 0.8s to 0.6s. Italian and German Wikipedia both went from 0.47s to 0.37s. The worse your connection, the more you paid for the redirect (akin to a regressive tax). Fortunately, we now ride that curve in the opposite direction, with the biggest beneficiaries of this change being those who need it the most, by proxy of being the furthest from our data centers.

This improvement is not merely a recovery reversing the regression. It is also a net improvement, because it now applies to everyone, not just those given special treatment by Google (as pre-2024). And indeed the metric confirms we gained back more than we lost on the same wikis.

When we improve a frontend metric, there is often an amplified impact on later metrics. For example, an improvement in TTFB (time to first byte) may result in an even bigger improvement on Page Load Time and Visual Completion metrics, because there are multiple requests during a page load for images and other assets. In this case, some of the savings are absorbed. Of the 500ms spent in the mobile redirect (fetchStart) on Persian Wikipedia, about 250ms was for DNS and HTTPS to establish the connection. Once established, it doesn't need to happen again. With the redirect gone, the DNS and HTTPS steps are now measured by responseStart instead of fetchStart. This is how these browser metrics are meant to work, and how it already was for desktop.

References
Krinkle updated the task description. (Show Details)

Report 17 Oct 2025 - Final traffic and perf update

Unified mobile routing is now enabled on 100% of the 1060 wikis. The final rollout days were Tuesday 7 and Wednesday 8 October.

2025_mobile_wikicount_1060.png (710×1 px, 51 KB)

Between 9 Oct and 17 Oct, every day we have served at or above 99.5% of incoming mobile pageviews on canonical domains directly, instead of via a redirect. In previous reports I projected we would reach 99.3% (based on then-partial rollout).

2025_mobile_reqall_abs_1060_stack.png (896×1 px, 63 KB) 2025_mobile_reqall_pc_1060.png (900×1 px, 23 KB)

Performance

Back in February, I gathered the data for RFC: Mobile domain sunsetting § Site speed on mediawiki.org which demonstrated a 10-20% regression in response times worldwide starting in May 2024 (p75, mobile). For example, the Indonesian Wikipedia experienced a 200ms delay that we believe was caused by the mobile redirect.

Now that the rollout is more than a week behind us, we can assess the impact on the "Worldwide" metrics. And as projected, we indeed see a 20% improvement in response times from mobile devices across the board.

Below I plot the 2025 improvements side-by-side with the 2024 regressions of the same percentile and wiki scope, along with concrete numbers and percentage differences for easier interpretation. I've also included fresh data for the Persian, Indonesian, and German Wikipedias, first previewed a few weeks ago. The preview was based on a few days of data, whereas we now have several weeks' worth, for a fairer and higher-confidence comparison.

Scope | 2023-2024 plot | 2025 Sep-Oct plot | 2024 avg-baseline/regressed | 2025 avg-baseline/improved
Worldwide p80 | 2025_mobile_regression_worldwide_p80.png (680×1 px, 44 KB) | 2025_mobile_recovery_world_p80.png (676×1 px, 49 KB) | 0.63s to 0.70s +11% | 0.73s to 0.60s -18%
Worldwide p75 | 2025_mobile_regression_worldwide_p75.png (680×1 px, 44 KB) | 2025_mobile_recovery_world_p75.png (676×1 px, 47 KB) | 0.54s to 0.61s +13% | 0.64s to 0.52s -19%
Worldwide p50 | 2025_mobile_regression_worldwide_p50.png (678×1 px, 41 KB) | 2025_mobile_recovery_world_p50.png (676×1 px, 47 KB) | 0.27s to 0.33s +22% | 0.34s to 0.27s -21%
fawiki-p75 | 2025_mobile_regression_fawiki.png (680×1 px, 47 KB) | 2025_mobile_recovery_fawiki.png (680×1 px, 57 KB) | 0.84s to 0.99s +18% | 0.97s to 0.78s -20%
idwiki-p75 | 2025_mobile_regression_idwiki.png (680×1 px, 47 KB) | 2025_mobile_recovery_idwiki.png (680×1 px, 54 KB) | 0.68s to 0.76s +12% | 0.85s to 0.69s -19%
dewiki-p75 | 2025_mobile_regression_dewiki.png (680×1 px, 46 KB) | 2025_mobile_recovery_dewiki.png (670×1 px, 55 KB) | 0.37s to 0.44s +19% | 0.49s to 0.39s -20%
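For readers who want to check the percentage columns, the sketch below recomputes them from the rounded second values in the table. The helper name `pct_change` is hypothetical. Because the inputs are already rounded to two decimals, small differences from percentages computed on raw data are possible, though none appear here.

```python
def pct_change(before_s: float, after_s: float) -> int:
    """Relative change from before to after, in whole percent."""
    return round((after_s - before_s) / before_s * 100)


# (scope, before in seconds, after in seconds, percentage stated in the table)
table = [
    ("Worldwide p80 2024", 0.63, 0.70, 11),
    ("Worldwide p80 2025", 0.73, 0.60, -18),
    ("Worldwide p75 2024", 0.54, 0.61, 13),
    ("Worldwide p75 2025", 0.64, 0.52, -19),
    ("Worldwide p50 2024", 0.27, 0.33, 22),
    ("Worldwide p50 2025", 0.34, 0.27, -21),
    ("fawiki p75 2024", 0.84, 0.99, 18),
    ("fawiki p75 2025", 0.97, 0.78, -20),
    ("idwiki p75 2024", 0.68, 0.76, 12),
    ("idwiki p75 2025", 0.85, 0.69, -19),
    ("dewiki p75 2024", 0.37, 0.44, 19),
    ("dewiki p75 2025", 0.49, 0.39, -20),
]

# Collect any rows where the recomputed value differs from the stated one.
mismatches = [
    (scope, stated, pct_change(before, after))
    for scope, before, after, stated in table
    if pct_change(before, after) != stated
]

for scope, before, after, stated in table:
    print(f"{scope}: {before:.2f}s to {after:.2f}s = {pct_change(before, after):+d}%")
```

Note that a regression of +N% followed by an improvement of -N% does not return to the same value, since each percentage is relative to its own baseline.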
Remnant

2025_mobile_reqall_abs_1060_log.png (906×1 px, 59 KB)

The rollout is at 100%. The apparent remnant of 0.5% measured by mobile_redirect (0.7M of 150M/day) represents false positives from unrelated MediaWiki redirects. These are counted under "HTTP redirects on pageview-like URLs from a mobile device", but they are not redirects to the mobile domain. This is because the wmf.pageview_actor dataset only tells us where we redirect from, not where we redirect to. The features behind these redirects are not specific to mobile devices and not related to the mobile domain or MobileFrontend. They include, for example:

Again, these do not redirect "to" the mobile domain, however they are redirects "from" the standard domain for a mobile device. The same redirects happen for desktop devices too. The long tail shows most URLs have a few hits a day, with none worth excluding from the Hadoop query (unlike the CentralAuth case in earlier reports).
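As a quick sanity check on the remnant figure, the sketch below works out the share from the rounded daily volumes quoted above. It assumes the 150M/day figure is the total of mobile pageviews and redirects combined.

```python
# Rounded daily volumes quoted in the text above.
remnant_redirects_per_day = 0.7e6
mobile_requests_per_day = 150e6  # assumed: pageviews plus redirects

# Share of requests still counted as redirects, and its complement.
remnant_share = remnant_redirects_per_day / mobile_requests_per_day
direct_share = 1 - remnant_share

print(f"remnant: {remnant_share:.2%}, served directly: {direct_share:.2%}")
```

This lines up with the "at or above 99.5%" daily figure reported for 9-17 Oct.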

References
Krinkle claimed this task.
Krinkle triaged this task as Medium priority.
Krinkle updated the task description. (Show Details)