
bd808 (Bryan Davis)
Principal Software Engineer, Administrator

User Details

User Since
Oct 3 2014, 2:36 PM (576 w, 5 d)
Roles
Administrator
Availability
Available
IRC Nick
bd808
LDAP User
BryanDavis
MediaWiki User
BDavis (WMF)

I'm BDavis (WMF) on wiki, bd808 on irc & GitLab, and BryanDavis on Gerrit & Wikitech.

I've got a thing for 🦄s. Don't judge.

I work for or provide services to the Wikimedia Foundation, but this is my only Phabricator account. Edits, statements, or other contributions made from this account are my own, and may not reflect the views of the Foundation.

Recent Activity

Yesterday

bd808 renamed T408032: Consider replacing gitiles with a separate gitea or forgejo deployment from Consider replacing gitiles with a separate gitea deployment to Consider replacing gitiles with a separate gitea or forgejo deployment.
Wed, Oct 22, 10:16 PM · Gerrit
bd808 added a comment to T408032: Consider replacing gitiles with a separate gitea or forgejo deployment.

Primary git hosting is Gerrit (https://review.opendev.org/) with gitea used as a repo viewer.

Wed, Oct 22, 9:58 PM · Gerrit
bd808 updated the task description for T408032: Consider replacing gitiles with a separate gitea or forgejo deployment.
Wed, Oct 22, 9:57 PM · Gerrit
bd808 added a comment to T408032: Consider replacing gitiles with a separate gitea or forgejo deployment.

Can we build a POC for this in Cloud VPS somewhere? My naive assumption is that we would need a fair amount of disk storage (150G?) to keep a clone of every repo in gerrit, but that seems to be the biggest expense.

Wed, Oct 22, 9:50 PM · Gerrit
bd808 created T408032: Consider replacing gitiles with a separate gitea or forgejo deployment.
Wed, Oct 22, 9:41 PM · Gerrit
bd808 added a comment to T399485: Find a way for pywikibot GitHub Actions to avoid IP range blocks of Microsoft Azure hosted runners.

@Xqt I see passing tests upstream marked as using "wpbeta" (e.g. https://github.com/wikimedia/pywikibot/actions/runs/18721896853/job/53396228584). Does this mean that things are working now? If so, can this task be updated with info about what y'all had to change and then resolved?

Wed, Oct 22, 8:00 PM · Pywikibot, Pywikibot-tests, Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Wed, Oct 22, 7:54 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Wed, Oct 22, 7:51 PM · Epic, Beta-Cluster-Infrastructure
bd808 added a watcher for Tool-sal: bd808.
Wed, Oct 22, 6:00 PM
bd808 created T408007: Migrate sal project from github to gitlab.wikimedia.org + phabricator.wikimedia.org.
Wed, Oct 22, 5:57 PM · Tool-sal
bd808 added a comment to T407993: Gitlab logo in light theme is not very defined.

The background is what has changed, not the logo. But yes it looks bad now.

Wed, Oct 22, 3:51 PM · Patch-For-Review, GitLab

Tue, Oct 21

bd808 changed Due Date from Fri, Oct 10, 12:00 AM to Tue, Nov 4, 12:00 AM on T379844: Define a process for keeping the committee membership "fresh".
Tue, Oct 21, 9:34 PM · User-bd808, Toolforge-standards-committee
bd808 added a comment to T379844: Define a process for keeping the committee membership "fresh".

@JJMC89 has called for a vote on the proposed wording on the mailing list. The vote will end 31 October 2025 AoE.

Tue, Oct 21, 9:34 PM · User-bd808, Toolforge-standards-committee
bd808 created T407916: Add a floating `v1` tag that tracks the latest stable Blubber release.
Tue, Oct 21, 9:19 PM · Release Pipeline (Blubber)

Mon, Oct 20

bd808 added a comment to T407750: toolviews: distutils Version classes are deprecated.

I suspect this check is no longer relevant and could be removed entirely:

Mon, Oct 20, 4:26 PM · Toolforge, cloud-services-team
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Mon, Oct 20, 4:00 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Mon, Oct 20, 3:56 PM · Epic, Beta-Cluster-Infrastructure

Fri, Oct 17

bd808 closed T407205: Puppet agent failure detected on instance deployment-webperf21 in project deployment-prep as Invalid.

Already resolved by the time I manually checked the report:

bd808@deployment-webperf21:~$ sudo -i puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-webperf21.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(7f48520a8f) gitpuppet - [BETA HACK] Changes to profile::puppetserver::volatile'
Notice: /Stage[main]/Webperf::Statsv/Systemd::Service[statsv]/Service[statsv]/ensure: ensure changed 'stopped' to 'running' (corrective)
Info: /Stage[main]/Webperf::Statsv/Systemd::Service[statsv]/Service[statsv]: Unscheduling refresh on Service[statsv]
Notice: Applied catalog in 8.29 seconds
Fri, Oct 17, 8:09 PM · Beta-Cluster-Infrastructure
bd808 closed T407596: No Puppet resources found on instance deployment-mwmaint03 on project deployment-prep as Invalid.

Already resolved by the time I manually checked the report:

bd808@deployment-mwmaint03:~$ sudo -i puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-mwmaint03.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(7f48520a8f) gitpuppet - [BETA HACK] Changes to profile::puppetserver::volatile'
Notice: Applied catalog in 25.95 seconds
Fri, Oct 17, 8:08 PM · Beta-Cluster-Infrastructure
bd808 updated the task description for T407108: LibUp fails for all repositories with Internal Server Error.
Fri, Oct 17, 3:10 PM · LibUp

Thu, Oct 16

bd808 updated subscribers of T396924: kokkuri cannot publish "public" images from WMCS runners due to a lack of a local registry.

At the moment a job will still need to set a KOKKURI_REGISTRY_PUBLIC: registry.cloud.releng.team envvar in the job config to point kokkuri at the registry. That is a relatively simple addition and also something we can push up into the shared config at some point.

Thu, Oct 16, 11:10 PM · Patch-For-Review, GitLab (CI & Job Runners)
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Invalid.
Thu, Oct 16, 10:21 PM · Epic, Beta-Cluster-Infrastructure
bd808 added a comment to T405596: Disable IO for diffusion repositories.
  • For Diffusion repos which observe a gerrit or gitlab URI and also mirror to github [1], change that setup not to have Diffusion "in the middle", per T405596#11242761.

I can't immediately cite a source for this but I feel like GitLab repos don't automatically mirror to GitHub (in the way that Gerrit repos do). Wouldn't disabling GitHub mirroring for these GitLab Diffusion-repo-mirrors stop the GitHub-repo-mirrors from being updated (or am I misunderstanding something)?

Thu, Oct 16, 9:50 PM · Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, collaboration-services, Diffusion, Phabricator
bd808 created T407577: Add wikimedia as co-maintainer of wikimedia/zest-css at Packagist.
Thu, Oct 16, 8:41 PM · Parsoid, Composer
bd808 created T407576: Add wikimedia as co-maintainer of wikimedia/webidl at Packagist.
Thu, Oct 16, 8:40 PM · Parsoid, Composer
bd808 created T407575: Add wikimedia as co-maintainer of wikimedia/update-history at Packagist.
Thu, Oct 16, 8:39 PM · Composer
bd808 created T407574: Add wikimedia as co-maintainer of wikimedia/toolforge-skeleton at Packagist.
Thu, Oct 16, 8:34 PM · ToolforgeBundle, Community-Tech, Composer
bd808 created T407573: Add wikimedia as co-maintainer of wikimedia/langconv at Packagist.
Thu, Oct 16, 8:30 PM · MediaWiki-Language-converter, Composer
bd808 created T407572: Add wikimedia as co-maintainer of wikimedia/idle-dom at Packagist.
Thu, Oct 16, 8:30 PM · Composer
bd808 created T407571: Add wikimedia as co-maintainer of wikimedia/dodo at Packagist.
Thu, Oct 16, 8:28 PM · Parsoid (Dodo), Composer
bd808 created T407569: Add wikimedia as co-maintainer of wikimedia/bcp-47-code at Packagist.
Thu, Oct 16, 8:21 PM · Bcp47Code, Composer
bd808 created T407568: Add wikimedia as co-maintainer of wikimedia/alea at Packagist.
Thu, Oct 16, 8:19 PM · Composer
bd808 created T407531: Add wikimedia as co-maintainer of wikimedia/json-codec at Packagist.
Thu, Oct 16, 5:43 PM · JsonCodec, Composer
bd808 added a comment to T407502: Identify candidate tools to test migrating to the Toolforge on bare metal POC.

I maintain/co-maintain a number of tools that could be offered up for tribute here if desired:

Thu, Oct 16, 4:55 PM · Toolforge, cloud-services-team
bd808 added a comment to T407403: Error: Invalid serialization data for DatePeriod object.

When the offending code is tracked down, https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#Don't_use_built_in_serialization and T161647: RFC: Deprecate using php serialization inside MediaWiki should be justification for rewriting the blob serialization for the cached data.

Thu, Oct 16, 12:05 AM · MW-1.45-notes (1.45.0-wmf.24; 2025-10-21), Growth-Team (FY2025-26 Q2 Sprint 2), User-Michael, GrowthExperiments, WMF-General-or-Unknown, PHP 8.3 support, Wikimedia-production-error, MediaWiki-Engineering

Wed, Oct 15

bd808 added a comment to T407403: Error: Invalid serialization data for DatePeriod object.

Changing the 3v4l.org test that @hashar ran in T407403#11279288 to one that includes legacy versions of PHP, so that 8.1 output shows up, reproduces the serialization diff between 8.1 and 8.3.

Wed, Oct 15, 11:09 PM · MW-1.45-notes (1.45.0-wmf.24; 2025-10-21), Growth-Team (FY2025-26 Q2 Sprint 2), User-Michael, GrowthExperiments, WMF-General-or-Unknown, PHP 8.3 support, Wikimedia-production-error, MediaWiki-Engineering
bd808 added a comment to T407403: Error: Invalid serialization data for DatePeriod object.
$ docker pull docker-registry.wikimedia.org/php8.1-fpm-multiversion-base:latest
$ docker run -it --entrypoint 'php' docker-registry.wikimedia.org/php8.1-fpm-multiversion-base:latest -i | grep timelib
timelib version => 2021.19
$ docker pull docker-registry.wikimedia.org/php8.3-fpm-multiversion-base:latest
$ docker run -it --entrypoint 'php' docker-registry.wikimedia.org/php8.3-fpm-multiversion-base:latest -i | grep timelib
timelib version => 2022.12
Wed, Oct 15, 10:36 PM · MW-1.45-notes (1.45.0-wmf.24; 2025-10-21), Growth-Team (FY2025-26 Q2 Sprint 2), User-Michael, GrowthExperiments, WMF-General-or-Unknown, PHP 8.3 support, Wikimedia-production-error, MediaWiki-Engineering
bd808 updated subscribers of T407430: Upgrade MediaWiki-Docker PHP images to next target production version.

We have PHP 8.4 jobs in CI already, so I think that could be the next dev target without too much hassle. @Krinkle may have an idea if prod is likely to update to 8.4 or if we will be more likely to skip to 8.5 (RC2 today, likely to release in November 2025).

Wed, Oct 15, 10:00 PM · dev-images, MediaWiki-Docker
bd808 added a comment to T407403: Error: Invalid serialization data for DatePeriod object.

When the offending code is tracked down, https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#Don't_use_built_in_serialization and T161647: RFC: Deprecate using php serialization inside MediaWiki should be justification for rewriting the blob serialization for the cached data.

Wed, Oct 15, 8:29 PM · MW-1.45-notes (1.45.0-wmf.24; 2025-10-21), Growth-Team (FY2025-26 Q2 Sprint 2), User-Michael, GrowthExperiments, WMF-General-or-Unknown, PHP 8.3 support, Wikimedia-production-error, MediaWiki-Engineering
bd808 added a comment to T359211: Add a "reason" field in phab-ban tool.

Implementing the data entry and storage as text on https://wikitech.wikimedia.org/wiki/Tool:Phab-ban/Log is relatively straightforward. As an actual audit tool, however, it feels pretty weak. The folks complaining on JJMC89's talk page would still be complaining if there was a summary line that said "Faster_than_Thunder was disabled by JJMC89 for doing whatever". Blocks are trivially set and trivially reversed. This is by design.

Wed, Oct 15, 8:22 PM · Tool-phab-ban
bd808 renamed T254190: Allow developers to disable their own OAuth clients from Allow developers to disable their own OAuth 2.0 clients to Allow developers to disable their own OAuth clients.
Wed, Oct 15, 4:51 PM · MediaWiki-Platform-Team (Roadmap), serviceops, MediaWiki-extensions-OAuth

Tue, Oct 14

bd808 closed T171417: Request rename of "d3r1ck01" to "xSavitar" on wikitech/LDAP/Gerrit as Declined.

We do not rename developer accounts. If you are very unhappy with the name of your account, you can create a new one with the name you prefer, request that it be granted equivalent permissions to your existing account, and request that your existing account be deactivated.

We used to rename accounts, but it frequently led to various errors throughout the many separate systems which consume developer accounts, due to their local databases and authentication methods getting out of sync. In the long term, this may change if we develop better tooling for identity management.

Tue, Oct 14, 11:20 PM · Release-Engineering-Team (Priority Backlog 📥), LDAP
bd808 added a comment to T407299: What are the parts of toolforge?.

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Redis and https://wikitech.wikimedia.org/wiki/Help:Toolforge/Elasticsearch are more shared services.

Tue, Oct 14, 10:26 PM · Toolforge, cloud-services-team
bd808 added a comment to T407299: What are the parts of toolforge?.

https://wikitech.wikimedia.org/wiki/Portal:Toolforge/About_Toolforge says that multi-maintainer tool accounts with access to shared storage, $TOOLNAME.toolforge.org hosting, cronjobs, daemons, per-tool databases, and access to Dumps and Wiki Replicas are part of Toolforge.

Tue, Oct 14, 10:18 PM · Toolforge, cloud-services-team
bd808 closed T406650: Copy the Traffic team on alerts for deployment-cache* hosts as Resolved.

With https://gitlab.wikimedia.org/ladsgroup/Phabricator-maintenance-bot/-/merge_requests/31 merged and deployed, @Maintenance_bot will now check for tasks that are tagged as Beta-Cluster-Infrastructure, were authored by @wmcs-alerts, and have titles containing "deployment-cache". It will add Traffic when matching tasks are found. The job that does this work currently runs once per hour, so there will be some lag before changes are applied. If this lag starts to seem problematic we can probably work with Amir to tune the bot so that it runs more frequently.
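
The matching rule described in this comment can be sketched roughly as follows. This is an illustration only, not the bot's actual code: the `Task` class and the function names are hypothetical stand-ins, and the real bot talks to Phabricator's API rather than to in-memory objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical, simplified stand-in for a Phabricator task."""
    title: str
    author: str
    tags: set[str] = field(default_factory=set)


def needs_traffic_tag(task: Task) -> bool:
    """Return True when the task meets the three criteria from the comment
    and does not already carry the Traffic tag."""
    return (
        "Beta-Cluster-Infrastructure" in task.tags
        and task.author == "wmcs-alerts"
        and "deployment-cache" in task.title
        and "Traffic" not in task.tags
    )


def apply_rule(tasks: list[Task]) -> list[Task]:
    """Add the Traffic tag to every matching task; return the tasks changed."""
    changed = []
    for task in tasks:
        if needs_traffic_tag(task):
            task.tags.add("Traffic")
            changed.append(task)
    return changed
```

Because already-tagged tasks are skipped, running such a rule once per hour would be idempotent: a second pass over the same tasks changes nothing.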

Tue, Oct 14, 5:48 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Tue, Oct 14, 5:38 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Tue, Oct 14, 5:35 PM · Epic, Beta-Cluster-Infrastructure

Thu, Oct 9

bd808 changed the status of T406650: Copy the Traffic team on alerts for deployment-cache* hosts from Open to In Progress.
Thu, Oct 9, 4:43 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure

Wed, Oct 8

bd808 created E1905: Offsite.
Wed, Oct 8, 10:56 PM · events
bd808 closed T405888: Puppet agent failure detected on instance deployment-maps-master02 in project deployment-prep as Resolved.

Fixed by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1194156. Thanks @Muehlenhoff!

Wed, Oct 8, 8:01 PM · Beta-Cluster-Infrastructure

Tue, Oct 7

bd808 added a comment to T406632: SyntaxHighlight fails in MediaWiki 1.43.4+: ModuleNotFoundError: No module named 'importlib.metadata' (with old Python version).

Since T364249: New upstream release for Pygments (2.18.0) a Python 3.8+ runtime is required.
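
A quick way to confirm whether a given interpreter meets that requirement might look like the sketch below. The function names are hypothetical; the only facts relied on are the 3.8+ minimum stated above and that `importlib.metadata` joined the standard library in Python 3.8.

```python
import sys

# Minimum runtime stated in the comment above.
MIN_PYTHON = (3, 8)


def runtime_is_new_enough(version_info=sys.version_info) -> bool:
    """Compare a (major, minor, ...) version tuple against the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def has_importlib_metadata() -> bool:
    """Return True if the module named in the error message can be imported."""
    try:
        import importlib.metadata  # noqa: F401
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


if __name__ == "__main__":
    print("Python version OK:", runtime_is_new_enough())
    print("importlib.metadata available:", has_importlib_metadata())
```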

Tue, Oct 7, 11:59 PM · SyntaxHighlight
bd808 closed T406504: sshd-session killed by Wheel of Misfortune on Toolforge bastion as Resolved.
Tue, Oct 7, 11:20 PM · User-bd808, cloud-services-team, Toolforge
bd808 updated subscribers of T406650: Copy the Traffic team on alerts for deployment-cache* hosts.

I don't know the skillset of @Maintenance_bot which could be another option, in theory.

Yeah, that might be the right place to look if trying to go the bot route. It looks like the project_grouper.py task there would need a bit of feature enhancement to be able to add the "title contains" and "author is" criteria. Hacking python bots is in my wheelhouse though. :)

Tue, Oct 7, 11:15 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 added a comment to T406650: Copy the Traffic team on alerts for deployment-cache* hosts.

@bd808: I do not know how/where alertmanager/@wmcs-alerts currently sets the Beta-Cluster-Infrastructure tag and task title for the Phab task to be created. Maybe the code on that side could be expanded to pass additional tags to Phab based on substrings in the task title to be created? If not feasible, then Herald makes most sense.

Tue, Oct 7, 10:04 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 added a comment to T406650: Copy the Traffic team on alerts for deployment-cache* hosts.

@Aklapper what do you think about using Herald for this sort of thing? It would need to be a global rule something like:

[Image: Screenshot 2025-10-07 at 14.47.58.png]

Tue, Oct 7, 8:52 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 updated the task description for T406650: Copy the Traffic team on alerts for deployment-cache* hosts.
Tue, Oct 7, 8:43 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 created T406650: Copy the Traffic team on alerts for deployment-cache* hosts.
Tue, Oct 7, 8:42 PM · User-bd808, Traffic, Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Tue, Oct 7, 8:00 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Tue, Oct 7, 7:56 PM · Epic, Beta-Cluster-Infrastructure
bd808 added a comment to T390213: Create a runbook for troubleshooting the CDN in deployment-prep.

https://cheatsheet.krishnaneupane.com/posts/haproxy has been helpful as more blocking has moved to haproxy. In our deployment the socket for talking to the running instance is at /var/run/haproxy/haproxy.sock.
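
For scripted checks, a command can be sent to that socket with a few lines of Python. This is a minimal sketch, assuming the socket accepts HAProxy's standard one-shot runtime commands (such as `show info`) and that the caller has permission to open it; the `haproxy_command` helper is hypothetical, not part of any existing tooling.

```python
import socket

# Socket path taken from the comment above.
DEFAULT_SOCKET = "/var/run/haproxy/haproxy.sock"


def haproxy_command(command: str, sock_path: str = DEFAULT_SOCKET) -> str:
    """Send one runtime command and return the full text of the reply.

    In non-interactive mode the server closes the connection after
    answering, so reading until EOF collects the whole response.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(haproxy_command("show info"))
```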

Tue, Oct 7, 7:47 PM · Traffic, Documentation, Beta-Cluster-Infrastructure
bd808 added a comment to T403105: Remove need for manually applied MaxMind data hacks on Beta Cluster cache servers.

Hack achieved. :/

Tue, Oct 7, 7:41 PM · Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Tue, Oct 7, 6:24 PM · Epic, Beta-Cluster-Infrastructure
bd808 added a comment to T403105: Remove need for manually applied MaxMind data hacks on Beta Cluster cache servers.
/etc/haproxy/lua/maxmind-lookup.lua
local isp_dbpath = "/usr/share/GeoIP/GeoIP2-ISP.mmdb"
local datacenter_dbpath = "/usr/share/GeoIP/datacenter.mmdb"
$ ls -alh /usr/share/GeoIP/
total 62M
drwxr-xr-x   2 root root 4.0K Aug 26 23:38 ./
drwxr-xr-x 116 root root 4.0K Aug 26 22:49 ../
lrwxrwxrwx   1 root root   18 Jun 22  2023 GeoIP2-City.mmdb -> GeoLite2-City.mmdb
lrwxrwxrwx   1 root root   21 Jun 22  2023 GeoIP2-Country.mmdb -> GeoLite2-Country.mmdb
lrwxrwxrwx   1 root root   36 Aug 26 23:38 GeoIP2-ISP.mmdb -> /usr/share/GeoIP/GeoIP2-Country.mmdb
lrwxrwxrwx   1 root root   15 Jun 22  2023 GeoIPCity.dat -> GeoLiteCity.dat
lrwxrwxrwx   1 root root   18 Jun 22  2023 GeoIP.dat -> GeoLiteCountry.dat
-rw-r--r--   1 root root    0 Jun 22  2023 .geoipupdate.lock
-rw-r--r--   1 root root  37M Jun 22  2023 GeoLite2-City.mmdb
-rw-r--r--   1 root root 2.0M Jun 22  2023 GeoLite2-Country.mmdb
-rw-r--r--   1 root root 3.9M Jun 22  2023 GeoLiteASNum.dat
-rw-r--r--   1 root root  19M Jun 22  2023 GeoLiteCity.dat
-rw-r--r--   1 root root 792K Jun 22  2023 GeoLiteCountry.dat

/usr/share/GeoIP/datacenter.mmdb appears to be the missing file this time.
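
The manual comparison above (paths named in the Lua file versus files present on disk) could be automated along these lines. This is a sketch under the assumption that the Lua script declares its databases as `local <name>_dbpath = "<path>"`, as the two quoted lines do; the helper names are hypothetical.

```python
import os
import re

# Matches lines such as: local isp_dbpath = "/usr/share/GeoIP/GeoIP2-ISP.mmdb"
DBPATH_RE = re.compile(r'^\s*local\s+(\w+_dbpath)\s*=\s*"([^"]+)"', re.MULTILINE)


def find_dbpaths(lua_source: str) -> dict:
    """Map each *_dbpath variable in the Lua source to its file path."""
    return dict(DBPATH_RE.findall(lua_source))


def missing_dbpaths(lua_source: str, exists=os.path.exists) -> list:
    """Return the declared database paths that do not exist.

    os.path.exists follows symlinks, so a dangling symlink is also
    reported as missing.
    """
    return [p for p in find_dbpaths(lua_source).values() if not exists(p)]


if __name__ == "__main__":
    with open("/etc/haproxy/lua/maxmind-lookup.lua") as fh:
        for path in missing_dbpaths(fh.read()):
            print("missing:", path)
```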

Tue, Oct 7, 6:20 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T403105: Remove need for manually applied MaxMind data hacks on Beta Cluster cache servers.

Something has changed making this an active problem again. While debugging an unblocking failure (T406091) I found:

bd808@deployment-cache-text08.deployment-prep.eqiad1:~$ haproxy -f /etc/haproxy/haproxy.cfg -c
[NOTICE]   (2862995) : haproxy version is 2.8.16-1~bpo11+1
[NOTICE]   (2862995) : path to executable is /sbin/haproxy
[ALERT]    (2862995) : config : parsing [/etc/haproxy/haproxy.cfg:13] : Lua runtime error: Error opening the specified MaxMind DB file
[ALERT]    (2862995) : config : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT]    (2862995) : config : Fatal errors found in configuration.
Tue, Oct 7, 6:13 PM · Beta-Cluster-Infrastructure
bd808 closed T394802: Write-access to Video2Commons GitHub repo as Resolved.

I wonder if @Don-vip actually tried to add you to the repo or not. I can see that he has the "role: admin" at https://github.com/toolforge/video2commons/settings/access. I would have thought that should have let him add you, but GitHub org-related permissions can be weird.

Tue, Oct 7, 12:10 AM · User-bd808, Toolforge-standards-committee, video2commons

Mon, Oct 6

bd808 moved T405888: Puppet agent failure detected on instance deployment-maps-master02 in project deployment-prep from To Triage to Puppet errors on the Beta-Cluster-Infrastructure board.
Mon, Oct 6, 11:54 PM · Beta-Cluster-Infrastructure
bd808 updated subscribers of T405888: Puppet agent failure detected on instance deployment-maps-master02 in project deployment-prep.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1191680

Mon, Oct 6, 11:53 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T405888: Puppet agent failure detected on instance deployment-maps-master02 in project deployment-prep.
bd808@deployment-maps-master02:~$ sudo -i puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-maps-master02.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(f95088e9ff) gitpuppet - [BETA HACK] Changes to profile::puppetserver::volatile'
Error: Cannot create /etc/wikimedia/maps; parent directory /etc/wikimedia does not exist
Error: /Stage[main]/Profile::Maps::Osm_master/File[/etc/wikimedia/maps]/ensure: change from 'absent' to 'directory' failed: Cannot create /etc/wikimedia/maps; parent directory /etc/wikimedia does not exist
Notice: /Stage[main]/Profile::Maps::Osm_master/File[/etc/wikimedia/maps/kartotherian]: Dependency File[/etc/wikimedia/maps] has failures: true
Warning: /Stage[main]/Profile::Maps::Osm_master/File[/etc/wikimedia/maps/kartotherian]: Skipping because of failed dependencies
Warning: /Stage[main]/Profile::Maps::Osm_master/File[/etc/wikimedia/maps/tegola]: Skipping because of failed dependencies
Notice: /Stage[main]/Osm::Imposm3/Systemd::Service[imposm]/Service[imposm]/ensure: ensure changed 'stopped' to 'running' (corrective)
Info: /Stage[main]/Osm::Imposm3/Systemd::Service[imposm]/Service[imposm]: Unscheduling refresh on Service[imposm]
Info: Stage[main]: Unscheduling all events on Stage[main]
Notice: Applied catalog in 12.59 seconds
Mon, Oct 6, 11:52 PM · Beta-Cluster-Infrastructure
bd808 closed T405788: No Puppet resources found on instance deployment-puppetserver-1 on project deployment-prep as Invalid.
bd808@deployment-puppetserver-1.deployment-prep.eqiad1:~$ sudo -i puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-puppetserver-1.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(b88963634a) gitpuppet - [BETA HACK] Changes to profile::puppetserver::volatile'
Notice: /Stage[main]/Profile::Puppetserver::Volatile/File[/srv/puppet_fileserver/volatile/datacenter_vendors]: Not removing directory; use 'force' to override
Notice: /Stage[main]/Profile::Puppetserver::Volatile/File[/srv/puppet_fileserver/volatile/datacenter_vendors]/ensure: removed (corrective)
Notice: Applied catalog in 12.96 seconds

Whatever it was, it's fixed now.

Mon, Oct 6, 11:46 PM · Beta-Cluster-Infrastructure
bd808 closed T405547: Failed to update Puppet repository /srv/git/operations/puppet on instance deployment-puppetserver-1 in project deployment-prep as Invalid.

This was probably a stale cherry-pick that someone cleaned up without seeing the ticket.

Mon, Oct 6, 11:43 PM · Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Mon, Oct 6, 11:13 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Mon, Oct 6, 11:09 PM · Epic, Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Mon, Oct 6, 11:02 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Mon, Oct 6, 10:57 PM · Epic, Beta-Cluster-Infrastructure
bd808 closed Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), as Resolved.
Mon, Oct 6, 10:48 PM · Epic, Beta-Cluster-Infrastructure
bd808 changed the status of Restricted Task, a subtask of T393487: 2025 tracking task for Beta Cluster (deployment-prep) traffic overload protection (blocking unwanted crawlers), from Open to In Progress.
Mon, Oct 6, 10:43 PM · Epic, Beta-Cluster-Infrastructure
bd808 added a comment to T395401: Migrate CopyPatrol repo from GitHub to GitLab.

If it were up to me, the repo wouldn't be migrated to GitLab since GitLab is an inferior experience to GitHub and Gerrit. (I won't move the backend repo off GitHub.)

Mon, Oct 6, 10:26 PM · GitLab (Project Migration), Community-Tech, CopyPatrol
bd808 added a comment to T395401: Migrate CopyPatrol repo from GitHub to GitLab.

@brennen @bd808 Is there a preference for VPS tools to have their repos under /cloudvps-repos rather than /toolforge-repos?

Mon, Oct 6, 10:23 PM · GitLab (Project Migration), Community-Tech, CopyPatrol
bd808 changed the status of T406504: sshd-session killed by Wheel of Misfortune on Toolforge bastion from Open to In Progress.
Mon, Oct 6, 4:11 PM · User-bd808, cloud-services-team, Toolforge
bd808 created T406504: sshd-session killed by Wheel of Misfortune on Toolforge bastion.
Mon, Oct 6, 3:51 PM · User-bd808, cloud-services-team, Toolforge

Sep 19 2025

bd808 set Due Date to Fri, Oct 10, 12:00 AM on T379844: Define a process for keeping the committee membership "fresh".
Sep 19 2025, 10:03 PM · User-bd808, Toolforge-standards-committee
bd808 removed a member for acl*Project-Admins: JWheeler-WMF.
Sep 19 2025, 9:56 PM
bd808 removed a watcher for acl*Project-Admins: ALFAN_SOFARI.
Sep 19 2025, 9:56 PM
bd808 closed T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports, a subtask of T369112: Pretrain (née Group -1) QTE validation environment, as Resolved.
Sep 19 2025, 9:39 PM · Release-Engineering-Team (Priority Backlog 📥), Quality-and-Test-Engineering-Team (Test Infrastructure), Epic
bd808 closed T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports as Resolved.

See https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Pretrain/Progress_reports/2025-09-19 for the final report on this hypothesis. We are declaring the experiment a success and planning to follow up with work to put the new container to use in 2025.

Sep 19 2025, 9:38 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 changed the subtype of T369115: [FY24-25 WE6.2.1] Publish pre-train single version containers from "Feature Request" to "Goal".
Sep 19 2025, 9:37 PM · Release-Engineering-Team (Doing 😎), OKR-Work, Epic
bd808 moved T380881: Re-create poolcounter instance in Beta Cluster (deployment-prep) from Extensions & services config to Future on the Beta-Cluster-Infrastructure board.
Sep 19 2025, 8:08 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports.

@bd808 The work in the title of this ticket was completed a while ago. Shall we mark this resolved?

Sep 19 2025, 4:04 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 added a subtask for T369112: Pretrain (née Group -1) QTE validation environment: T404399: wmf/next branch cut job on releases-jenkins and systemd timer on deployment server times overlap.
Sep 19 2025, 3:49 PM · Release-Engineering-Team (Priority Backlog 📥), Quality-and-Test-Engineering-Team (Test Infrastructure), Epic
bd808 removed a subtask for T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports: T404399: wmf/next branch cut job on releases-jenkins and systemd timer on deployment server times overlap.
Sep 19 2025, 3:49 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 edited parent tasks for T404399: wmf/next branch cut job on releases-jenkins and systemd timer on deployment server times overlap, added: T369112: Pretrain (née Group -1) QTE validation environment; removed: T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports.
Sep 19 2025, 3:49 PM · OKR-Work, Release-Engineering-Team (Priority Backlog 📥)
bd808 added a subtask for T369112: Pretrain (née Group -1) QTE validation environment: T401749: Modify wmf/next branching process to avoid leaving an incomplete branch on failure.
Sep 19 2025, 3:48 PM · Release-Engineering-Team (Priority Backlog 📥), Quality-and-Test-Engineering-Team (Test Infrastructure), Epic
bd808 removed a subtask for T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports: T401749: Modify wmf/next branching process to avoid leaving an incomplete branch on failure.
Sep 19 2025, 3:48 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 edited parent tasks for T401749: Modify wmf/next branching process to avoid leaving an incomplete branch on failure, added: T369112: Pretrain (née Group -1) QTE validation environment; removed: T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports.
Sep 19 2025, 3:48 PM · OKR-Work, Release-Engineering-Team (Priority Backlog 📥)

Sep 18 2025

bd808 edited projects for T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports, added: Release-Engineering-Team (Doing 😎); removed Release-Engineering-Team (Priority Backlog 📥).
Sep 18 2025, 11:05 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 edited projects for T402350: Add a staging kubernetes cluster to train-dev, added: Release-Engineering-Team (Doing 😎); removed Release-Engineering-Team (Priority Backlog 📥).
Sep 18 2025, 11:05 PM · OKR-Work, Release-Engineering-Team (Doing 😎), Patch-For-Review
bd808 triaged T402350: Add a staging kubernetes cluster to train-dev as Medium priority.
Sep 18 2025, 11:04 PM · OKR-Work, Release-Engineering-Team (Doing 😎), Patch-For-Review
bd808 changed the subtype of T401749: Modify wmf/next branching process to avoid leaving an incomplete branch on failure from "Task" to "Bug Report".
Sep 18 2025, 11:03 PM · OKR-Work, Release-Engineering-Team (Priority Backlog 📥)
bd808 closed T401533: Why does build-and-push-container-images take 11 minutes?, a subtask of T398868: [FY25-26 WE6.1.1] Move image build to deployment server and update for backports, as Resolved.
Sep 18 2025, 11:02 PM · Release-Engineering-Team (Doing 😎), Epic, OKR-Work
bd808 closed T401533: Why does build-and-push-container-images take 11 minutes? as Resolved.

Half of the time is apparently a workaround for T390251: docker-registry.wikimedia.org keeps serving bad blobs:
01:06:59 [mediawiki-publish-83] Waiting 300 seconds for swift after full mediawiki image build (T390251)

Sep 18 2025, 11:02 PM · User-bd808, OKR-Work, Release-Engineering-Team (Priority Backlog 📥)