Milestone for DPE SRE
Fri, Oct 10
Tue, Oct 7
Re-opening this task since we have had some issues using Ceph on dse-k8s-codfw.
To test the integration, we tried a simple PVC definition as a raw block device.
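For context, a raw-block PVC test of that kind is usually a claim with volumeMode: Block. This is only a sketch; the claim name, size, and the storage class name ceph-rbd are assumptions, not the values actually used on dse-k8s-codfw:

```yaml
# Hypothetical raw block PVC sketch; name, size, and storageClassName are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-block-test
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block            # request a raw block device instead of a filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-rbd   # assumed name for the Ceph RBD provisioner's class
```

A pod would then attach the claim under spec.volumeDevices (with a devicePath) rather than volumeMounts, since no filesystem is provisioned.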
Wed, Oct 1
Cookbook cookbooks.sre.hosts.reimage started by bking@cumin2002 for host wdqs2017.codfw.wmnet with OS bullseye executed with errors:
- wdqs2017 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2017.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by ryankemper@cumin2002 for host wdqs1018.eqiad.wmnet with OS bullseye executed with errors:
- wdqs1018 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1018.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin2002 for host wdqs2017.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by ryankemper@cumin2002 for host wdqs1018.eqiad.wmnet with OS bullseye
Change #1192890 merged by Bking:
[operations/puppet@production] wdqs-scholarly: Add wdqs2016 to load balancer pool
Change #1192890 had a related patch set uploaded (by Bking; author: Bking):
[operations/puppet@production] wdqs-scholarly: Add wdqs2016 to load balancer pool
Tue, Sep 30
Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:35:45Z] <bking@deploy2002> Finished deploy [wdqs/wdqs@fea7794]: T405978 (duration: 00m 10s)
Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:35:40Z] <bking@deploy2002> Started deploy [wdqs/wdqs@fea7794]: T405978
Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:33:58Z] <bking@deploy2002> Finished deploy [wdqs/wdqs@fea7794]: T405978 (duration: 00m 20s)
Mentioned in SAL (#wikimedia-operations) [2025-09-30T20:33:44Z] <bking@deploy2002> Started deploy [wdqs/wdqs@fea7794]: T405978
Change #1192626 merged by Bking:
[operations/puppet@production] wdqs: add newly-reimaged hosts as scap targets
Change #1192626 had a related patch set uploaded (by Bking; author: Bking):
[operations/puppet@production] wdqs: add newly-reimaged hosts as scap targets
Mentioned in SAL (#wikimedia-operations) [2025-09-30T18:51:11Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:50Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:26Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:58:21Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:57:47Z] <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Mentioned in SAL (#wikimedia-operations) [2025-09-30T17:57:39Z] <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T405978, transfer scholarly graph to newly-reimaged host) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2016.codfw.wmnet w/ force delete existing files, repooling both afterwards
Cookbook cookbooks.sre.hosts.reimage started by bking@cumin2002 for host wdqs2016.codfw.wmnet with OS bullseye executed with errors:
- wdqs2016 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509301633_bking_3986025_wdqs2016.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2016.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin2002 for host wdqs2016.codfw.wmnet with OS bullseye
sudo cookbook sre.hardware.upgrade-firmware -n -c nic wdqs2017.codfw.wmnet is failing with the following error:
File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 1072, in run
    failures += self._run_host(hostname)
                ^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 1120, in _run_host
    if not self.update_driver(
           ^^^^^^^^^^^^^^^^^^^
File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 943, in update_driver
    member = self._get_hw_member(redfish_host, driver_category)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 912, in _get_hw_member
    return self._filter_network(redfish_host, members)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/deployment/spicerack/cookbooks/sre/hardware/upgrade-firmware.py", line 885, in _filter_network
    if port_data['LinkStatus'].lower() == 'up':
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'
I'm going to try updating its other firmware first and see what happens.
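The traceback suggests some Redfish NIC ports report a null LinkStatus, so the cookbook's .lower() call blows up. A minimal defensive guard could look like this; this is a standalone sketch of the failing check, not the actual cookbook code, and link_is_up is a hypothetical helper name:

```python
# Sketch of a None-safe version of the check that crashed in _filter_network.
# Some Redfish NIC port payloads report "LinkStatus": null, so indexing the
# dict yields None and calling .lower() on it raises AttributeError.

def link_is_up(port_data: dict) -> bool:
    """Return True only when the port explicitly reports an 'up' link."""
    link_status = port_data.get('LinkStatus')  # may be None or absent
    return link_status is not None and link_status.lower() == 'up'

ports = [
    {'LinkStatus': 'Up'},
    {'LinkStatus': None},   # the case that crashed the cookbook
    {},                     # key missing entirely
]
print([link_is_up(p) for p in ports])  # [True, False, False]
```

Treating a missing or null LinkStatus as "not up" lets the firmware update skip such ports instead of aborting the whole run.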
wdqs201[6-7] have failed their reimages multiple times. I'm applying all outstanding firmware updates to both hosts and will try the reimages again after that.
Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1002 for host wdqs2016.codfw.wmnet with OS bullseye executed with errors:
- wdqs2016 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509292148_bking_159615_wdqs2016.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2016.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Mon, Sep 29
Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1002 for host wdqs2017.codfw.wmnet with OS bullseye executed with errors:
- wdqs2017 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs2017.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1002 for host wdqs2017.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1002 for host wdqs2016.codfw.wmnet with OS bullseye
Change #1191525 merged by Ryan Kemper:
[operations/puppet@production] wdqs: shift old full graph hosts to new roles
Fri, Sep 26
Change #1191695 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):
[operations/puppet@production] Track airflow-wikidata-ops for offboarding
(base) btullis@barracuda:~$ docker run -it docker-registry.wikimedia.org/repos/data-engineering/spark:3.5.7-2025-09-26-113540-22a5ded19e72b0705bce176096b9becec779cf4e@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce bash
Unable to find image 'docker-registry.wikimedia.org/repos/data-engineering/spark:3.5.7-2025-09-26-113540-22a5ded19e72b0705bce176096b9becec779cf4e@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce' locally
docker-registry.wikimedia.org/repos/data-engineering/spark@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce: Pulling from repos/data-engineering/spark
d62f11b5abe0: Already exists
2b76c2925be8: Already exists
9b108dfdd561: Pull complete
226e4fecf9f8: Pull complete
d46a2035a8fc: Pull complete
309f0b5071dc: Pull complete
1f7d4d323250: Pull complete
1d886fccdf97: Pull complete
Digest: sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce
Status: Downloaded newer image for docker-registry.wikimedia.org/repos/data-engineering/spark@sha256:d887963947332977d36fee6888aedf101068abce4afac6ba7af0a29a2ead03ce
++ id -u
+ myuid=926
++ id -g
+ mygid=926
+ set +e
++ getent passwd 926
+ uidentry=spark:x:926:926::/home/spark:/bin/sh
+ set -e
+ '[' -z spark:x:926:926::/home/spark:/bin/sh ']'
+ '[' -z /usr/lib/jvm/java-8-openjdk-amd64 ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
++ command -v readarray
+ '[' readarray ']'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n /opt/hadoop ']'
+ '[' -z '' ']'
++ /opt/hadoop/bin/hadoop classpath
+ export 'SPARK_DIST_CLASSPATH=/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/contrib/capacity-scheduler/*.jar'
+ SPARK_DIST_CLASSPATH='/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/contrib/capacity-scheduler/*.jar'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*:/var/tmp'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
Non-spark-on-k8s command provided, proceeding in pass-through mode...
+ CMD=("$@")
+ exec /usr/bin/tini -s -- bash
spark@fc8f61613ef5:/var/tmp$
Change #1191137 merged by jenkins-bot:
[operations/deployment-charts@master] Remove our custom spark-operator helm chart
I've created the group with @gmodena as the initial member:
Mentioned in SAL (#wikimedia-operations) [2025-09-26T12:35:43Z] <moritzm> created cn=airflow-wikidata-ops group T405557
Change #1191136 merged by jenkins-bot:
[operations/deployment-charts@master] Remove the existing spark-operator release