Data-Platform-SRE (2025.09.05 - 2025.09.26) · Milestone
Archived · Public

Members (6)

Details

Description

Milestone for DPE SRE

Recent Activity

Fri, Oct 10

Gehel closed T405978: Re-image remaining full graph hosts to post-graph-split roles, a subtask of T395772: Teardown lvs for wdqs public pool, as Resolved.
Fri, Oct 10, 8:33 AM · Essential-Work, Data-Platform-SRE (2025.09.05 - 2025.09.26), Epic, Wikidata-Query-Service, Wikidata

Tue, Oct 7

Stevemunene reopened T404576: Enable the Container Storage Interface (CSI) and the Ceph CSI plugin on dse-k8s-codfw cluster as "Open".

Reopening this task because we have had some issues using Ceph on dse-k8s-codfw.
To test the integration, we tried a simple PVC definition as a raw block device
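
For readers unfamiliar with raw block claims, the following is a minimal sketch of what such a PVC definition can look like, built as a Python dictionary and printed as JSON. It is not the manifest used in the task: the claim name `test-block-pvc`, the storage class `example-ceph-rbd`, and the 1Gi size are hypothetical placeholders. The only part taken from the comment above is the idea of requesting a raw block device, which Kubernetes expresses with `volumeMode: Block`.

```python
import json


def raw_block_pvc(name: str, storage_class: str, size: str = "1Gi") -> dict:
    """Build a PersistentVolumeClaim manifest that requests a raw block volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # "Block" asks for an unformatted device rather than a mounted filesystem.
            "volumeMode": "Block",
            # Assumption: the storage class name is a placeholder, not the cluster's real one.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(json.dumps(raw_block_pvc("test-block-pvc", "example-ceph-rbd"), indent=2))
```

A pod consuming a claim like this would reference it under `volumeDevices` rather than `volumeMounts`, since there is no filesystem to mount.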

Tue, Oct 7, 7:19 AM · Patch-For-Review, Data-Platform-SRE (2025.09.26 - 2025.10.17)

Wed, Oct 1

ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin2002 for host wdqs2017.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2017 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2017.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Wed, Oct 1, 11:16 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage started by ryankemper@cumin2002 for host wdqs1018.eqiad.wmnet with OS bullseye executed with errors:

  • wdqs1018 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1018.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Wed, Oct 1, 11:14 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin2002 for host wdqs2017.codfw.wmnet with OS bullseye

Wed, Oct 1, 9:56 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage was started by ryankemper@cumin2002 for host wdqs1018.eqiad.wmnet with OS bullseye

Wed, Oct 1, 9:56 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking updated the task description for T405978: Re-image remaining full graph hosts to post-graph-split roles.
Wed, Oct 1, 1:58 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Change #1192890 merged by Bking:

[operations/puppet@production] wdqs-scholarly: Add wdqs2016 to load balancer pool

https://gerrit.wikimedia.org/r/1192890

Wed, Oct 1, 1:49 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Change #1192890 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] wdqs-scholarly: Add wdqs2016 to load balancer pool

https://gerrit.wikimedia.org/r/1192890

Wed, Oct 1, 1:35 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata

Tue, Sep 30

bking changed the status of T405978: Re-image remaining full graph hosts to post-graph-split roles from Open to In Progress.
Tue, Sep 30, 8:46 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking changed the status of T405978: Re-image remaining full graph hosts to post-graph-split roles, a subtask of T395772: Teardown lvs for wdqs public pool, from Open to In Progress.
Tue, Sep 30, 8:46 PM · Essential-Work, Data-Platform-SRE (2025.09.05 - 2025.09.26), Epic, Wikidata-Query-Service, Wikidata
bking updated the task description for T405978: Re-image remaining full graph hosts to post-graph-split roles.
Tue, Sep 30, 8:45 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:35:45Z] <bking@deploy2002> Finished deploy [wdqs/wdqs@fea7794]: T405978 (duration: 00m 10s)

Tue, Sep 30, 8:36 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:35:40Z] <bking@deploy2002> Started deploy [wdqs/wdqs@fea7794]: T405978

Tue, Sep 30, 8:35 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:33:58Z] <bking@deploy2002> Finished deploy [wdqs/wdqs@fea7794]: T405978 (duration: 00m 20s)

Tue, Sep 30, 8:34 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:33:44Z] <bking@deploy2002> Started deploy [wdqs/wdqs@fea7794]: T405978

Tue, Sep 30, 8:34 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Change #1192626 merged by Bking:

[operations/puppet@production] wdqs: add newly-reimaged hosts as scap targets

https://gerrit.wikimedia.org/r/1192626

Tue, Sep 30, 8:22 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking updated the task description for T405978: Re-image remaining full graph hosts to post-graph-split roles.
Tue, Sep 30, 8:14 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a project to T405978: Re-image remaining full graph hosts to post-graph-split roles: Patch-For-Review.
Tue, Sep 30, 8:12 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Change #1192626 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] wdqs: add newly-reimaged hosts as scap targets

https://gerrit.wikimedia.org/r/1192626

Tue, Sep 30, 8:12 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking updated the task description for T405978: Re-image remaining full graph hosts to post-graph-split roles.
Tue, Sep 30, 8:09 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T18:51:11Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 6:51 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:50Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 5:59 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:26Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 5:58 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:21Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 5:58 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:57:47Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 5:57 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Stashbot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:57:39Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards

Tue, Sep 30, 5:57 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin2002 for host wdqs2016.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2016 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509301633_bking_3986025_wdqs2016.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2016.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Sep 30, 5:56 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin2002 for host wdqs2016.codfw.wmnet with OS bullseye

Tue, Sep 30, 4:12 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

sudo cookbook sre.hardware.upgrade-firmware -n -c nic wdqs2017.codfw.wmnet is failing with the error:

  File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 1072, in run
    failures += self._run_host(hostname)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 1120, in _run_host
    if not self.update_driver(
           ^^^^^^^^^^^^^^^^^^^
  File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 943, in update_driver
    member = self._get_hw_member(redfish_host, driver_category)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 912, in _get_hw_member
    return self._filter_network(redfish_host, members)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 885, in _filter_network
    if port_data['LinkStatus'].lower() == 'up':
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'

I'm going to try updating its other firmware first and see what happens.
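
The traceback shows `port_data['LinkStatus']` evaluating to `None`, so calling `.lower()` on it raises the `AttributeError`. A minimal sketch of a None-tolerant check follows. It assumes `port_data` is a plain dictionary; the helper names `link_is_up` and `filter_up_ports` are hypothetical and are not part of the cookbook, and this is not necessarily how the cookbook was or should be fixed.

```python
from typing import Any, Iterable, List, Mapping


def link_is_up(port_data: Mapping[str, Any]) -> bool:
    """Return True only when LinkStatus is a string equal to 'up', ignoring case."""
    status = port_data.get("LinkStatus")
    # A missing or null LinkStatus is treated as "not up" instead of raising.
    if not isinstance(status, str):
        return False
    return status.lower() == "up"


def filter_up_ports(ports: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Keep only the ports whose link is reported as up."""
    return [port for port in ports if link_is_up(port)]


if __name__ == "__main__":
    sample = [
        {"Id": "1", "LinkStatus": "Up"},
        {"Id": "2", "LinkStatus": None},  # the case that triggers the error above
        {"Id": "3"},
    ]
    print([port["Id"] for port in filter_up_ports(sample)])
```

Whether a null link status should be skipped silently or reported as a failure is a design choice; the sketch only shows the skipping variant.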

Tue, Sep 30, 2:37 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

wdqs201[6-7] have failed their reimages multiple times. I'm applying all outstanding firmware updates to both hosts and will try the reimages again after that.

Tue, Sep 30, 2:31 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Gehel removed a project from T405978: Re-image remaining full graph hosts to post-graph-split roles: Epic.
Tue, Sep 30, 1:38 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1002 for host wdqs2016.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2016 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509292148_bking_159615_wdqs2016.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2016.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Sep 30, 12:51 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata

Mon, Sep 29

ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1002 for host wdqs2017.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2017 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2017.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Mon, Sep 29, 10:54 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1002 for host wdqs2017.codfw.wmnet with OS bullseye

Mon, Sep 29, 9:34 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
Maintenance_bot removed a project from T395772: Teardown lvs for wdqs public pool: Patch-For-Review.
Mon, Sep 29, 9:33 PM · Essential-Work, Data-Platform-SRE (2025.09.05 - 2025.09.26), Epic, Wikidata-Query-Service, Wikidata
ops-monitoring-bot added a comment to T405978: Re-image remaining full graph hosts to post-graph-split roles.

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1002 for host wdqs2016.codfw.wmnet with OS bullseye

Mon, Sep 29, 9:28 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
bking created T405978: Re-image remaining full graph hosts to post-graph-split roles.
Mon, Sep 29, 9:26 PM · Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Wikidata-Query-Service, Wikidata
gerritbot added a comment to T395772: Teardown lvs for wdqs public pool.

Change #1191525 merged by Ryan Kemper:

[operations/puppet@production] wdqs: shift old full graph hosts to new roles

https://gerrit.wikimedia.org/r/1191525

Mon, Sep 29, 9:17 PM · Essential-Work, Data-Platform-SRE (2025.09.05 - 2025.09.26), Epic, Wikidata-Query-Service, Wikidata

Fri, Sep 26

Gehel archived Data-Platform-SRE (2025.09.05 - 2025.09.26).
Fri, Sep 26, 1:49 PM
gerritbot added a project to T405557: Request for airflow-wikidata-ops primary group: Patch-For-Review.
Fri, Sep 26, 1:40 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Infrastructure-Foundations, Wikidata, Wikidata-Query-Service
gerritbot added a comment to T405557: Request for airflow-wikidata-ops primary group.

Change #1191695 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Track airflow-wikidata-ops for offboarding

https://gerrit.wikimedia.org/r/1191695

Fri, Sep 26, 1:40 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Infrastructure-Foundations, Wikidata, Wikidata-Query-Service
BTullis closed T405508: Build a new container images for spark version 3.5.7 and spark-operator version 2.2.1 as Resolved.
(base) btullis@barracuda:~$ docker run -it docker-registry.wikimedia.org/repos/data-engineering/spark:3.5.7-2025-09-26-113540-22a5ded19e72b0705bce176096b9becec779cf4e@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce bash
Unable to find image 'docker-registry.wikimedia.org/repos/data-engineering/spark:3.5.7-2025-09-26-113540-22a5ded19e72b0705bce176096b9becec779cf4e@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce' locally
docker-registry.wikimedia.org/repos/data-engineering/spark@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce: Pulling from repos/data-engineering/spark
d62f11b5abe0: Already exists 
2b76c2925be8: Already exists 
9b108dfdd561: Pull complete 
226e4fecf9f8: Pull complete 
d46a2035a8fc: Pull complete 
309f0b5071dc: Pull complete 
1f7d4d323250: Pull complete 
1d886fccdf97: Pull complete 
Digest: sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce
Status: Downloaded newer image for docker-registry.wikimedia.org/repos/data-engineering/spark@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce
++ id -u
+ myuid=926
++ id -g
+ mygid=926
+ set +e
++ getent passwd 926
+ uidentry=spark:x:926:926::/home/spark:/bin/sh
+ set -e
+ '[' -z spark:x:926:926::/home/spark:/bin/sh ']'
+ '[' -z /usr/lib/jvm/java-8-openjdk-amd64 ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
++ command -v readarray
+ '[' readarray ']'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n /opt/hadoop ']'
+ '[' -z '' ']'
++ /opt/hadoop/bin/hadoop classpath
+ export 'SPARK_DIST_CLASSPATH=/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/contrib/capacity-scheduler/*.jar'
+ SPARK_DIST_CLASSPATH='/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/contrib/capacity-scheduler/*.jar'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*:/var/tmp'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
Non-spark-on-k8s command provided, proceeding in pass-through mode...
+ CMD=("$@")
+ exec /usr/bin/tini -s -- bash
spark@fc8f61613ef5:/var/tmp$
Fri, Sep 26, 1:16 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Patch-For-Review
BTullis closed T405508: Build a new container images for spark version 3.5.7 and spark-operator version 2.2.1, a subtask of T405490: Upgrade spark-operator to v2.2.1, as Resolved.
Fri, Sep 26, 1:16 PM · Essential-Work, Patch-For-Review, Data-Platform-SRE (2025.09.26 - 2025.10.17)
gerritbot added a comment to T405490: Upgrade spark-operator to v2.2.1.

Change #1191137 merged by jenkins-bot:

[operations/deployment-charts@master] Remove our custom spark-operator helm chart

https://gerrit.wikimedia.org/r/1191137

Fri, Sep 26, 1:05 PM · Essential-Work, Patch-For-Review, Data-Platform-SRE (2025.09.26 - 2025.10.17)
gerritbot added a comment to T405492: Review upstream helm chart for spark-operator v2.2.1.

Change #1191137 merged by jenkins-bot:

[operations/deployment-charts@master] Remove our custom spark-operator helm chart

https://gerrit.wikimedia.org/r/1191137

Fri, Sep 26, 1:05 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17)
MoritzMuehlenhoff added a comment to T405557: Request for airflow-wikidata-ops primary group.

I've created the group with @gmodena as the initial member:

Fri, Sep 26, 12:49 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Infrastructure-Foundations, Wikidata, Wikidata-Query-Service
Stashbot added a comment to T405557: Request for airflow-wikidata-ops primary group.

Mentioned in SAL (#wikimedia-operations) [2025-09-26T12:35:43Z] <moritzm> created cn=airflow-wikidata-ops group T405557

Fri, Sep 26, 12:35 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Infrastructure-Foundations, Wikidata, Wikidata-Query-Service
MoritzMuehlenhoff triaged T405557: Request for airflow-wikidata-ops primary group as Medium priority.
Fri, Sep 26, 12:30 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Infrastructure-Foundations, Wikidata, Wikidata-Query-Service
gerritbot added a comment to T405490: Upgrade spark-operator to v2.2.1.

Change #1191136 merged by jenkins-bot:

[operations/deployment-charts@master] Remove the existing spark-operator release

https://gerrit.wikimedia.org/r/1191136

Fri, Sep 26, 11:43 AM · Essential-Work, Patch-For-Review, Data-Platform-SRE (2025.09.26 - 2025.10.17)