production SAL

2024-07-23
15:22	<claime>	Uncordoning dse-k8s-worker1008.eqiad.wmnet after T365998	[production]
15:20	<andrewbogott>	find /srv/mediawiki/images/wikitech/archive -type f | xargs delete on wikitech-static, drive is full of nonsense	[production]
15:07	<brennen@deploy1002>	Finished deploy [phabricator/deployment@3902e30]: deploy phab1004 for T370776 (duration: 00m 33s)	[production]
15:06	<brennen@deploy1002>	Started deploy [phabricator/deployment@3902e30]: deploy phab1004 for T370776	[production]
15:06	<brennen@deploy1002>	Finished deploy [phabricator/deployment@3902e30]: deploy phab2002 for T370776 (redux, first deploy a mistaken no-op) (duration: 00m 34s)	[production]
15:05	<brennen@deploy1002>	Started deploy [phabricator/deployment@3902e30]: deploy phab2002 for T370776 (redux, first deploy a mistaken no-op)	[production]
15:05	<brennen@deploy1002>	Finished deploy [phabricator/deployment@7335128]: deploy phab2002 for T370776 (duration: 01m 17s)	[production]
15:03	<brennen@deploy1002>	Started deploy [phabricator/deployment@7335128]: deploy phab2002 for T370776	[production]
15:03	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update	[production]
15:03	<jelto@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update	[production]
15:03	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update	[production]
15:02	<jelto@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update	[production]
15:02	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update	[production]
15:02	<jelto@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update	[production]
15:01	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: JunOS upgrade lsw1-f3-eqiad	[production]
15:01	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: JunOS upgrade lsw1-f3-eqiad	[production]
15:01	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-f3-eqiad,lsw1-f3-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt with reason: JunOS upgrade lsw1-f3-eqiad	[production]
15:00	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-f3-eqiad,lsw1-f3-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt with reason: JunOS upgrade lsw1-f3-eqiad	[production]
15:00	<topranks>	rebooting lsw1-f3-eqiad to complete JunOS upgrade (T365998)	[production]
14:59	<XioNoX>	deploy CR1055546 border-in: remove authdns filter	[production]
14:59	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply	[production]
14:58	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/shellbox-video: apply	[production]
14:54	<Emperor>	moss-be1003 into maintenance mode for network downtime T365998	[production]
14:48	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:50:00 on lsw1-f3-eqiad.mgmt with reason: prep JunOS upgrade lsw1-f3-eqiad	[production]
14:48	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 0:50:00 on lsw1-f3-eqiad.mgmt with reason: prep JunOS upgrade lsw1-f3-eqiad	[production]
14:10	<klausman@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1001.eqiad.wmnet	[production]
14:09	<ChrisDobbins901_>	cdobbins@cumin1002:~$ sudo cumin 'A:cp' 'run-puppet-agent "merging CR #1041705"'	[production]
14:06	<akosiaris@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on deploy1003.eqiad.wmnet with reason: host reimage	[production]
14:03	<klausman@cumin1002>	START - Cookbook sre.hosts.reboot-single for host ml-serve1001.eqiad.wmnet	[production]
14:03	<akosiaris@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on deploy1003.eqiad.wmnet with reason: host reimage	[production]
13:58	<Lucas_WMDE>	UTC afternoon backport+config window done	[production]
13:57	<logmsgbot>	lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for [[gerrit:1056155|MoveLogFormatter::getPreloadTitles: Handle bad titles (T370396)]] (duration: 09m 24s)	[production]
13:52	<logmsgbot>	lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Continuing with sync	[production]
13:51	<XioNoX>	deploy CR1055544 border-in: remove squid and nrpe filters, expand LVS filter	[production]
13:51	<logmsgbot>	lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for [[gerrit:1056155|MoveLogFormatter::getPreloadTitles: Handle bad titles (T370396)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
13:50	<sukhe>	running authdns-update after dns6001 depool	[production]
13:50	<XioNoX>	deploy CR1055543: border-in: remove git-ssh term	[production]
13:49	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
13:49	<akosiaris@cumin1002>	START - Cookbook sre.hosts.reimage for host deploy1003.eqiad.wmnet with OS bullseye	[production]
13:48	<cgoubert@cumin1002>	START - Cookbook sre.dns.netbox	[production]
13:47	<logmsgbot>	lucaswerkmeister-wmde@deploy1002 Started scap sync-world: Backport for [[gerrit:1056155|MoveLogFormatter::getPreloadTitles: Handle bad titles (T370396)]]	[production]
13:44	<ChrisDobbins901_>	cdobbins@cumin1002:~$ sudo cumin 'A:cp' 'disable-puppet "merging CR #1041705"'	[production]
13:43	<brouberol@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply	[production]
13:40	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org [reason: finished upgrading anycast-hc: T370068]	[production]
13:38	<cmooney@cumin1002>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow7001.magru.wmnet	[production]
13:37	<sukhe@puppetmaster1001>	conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org [reason: upgrading anycast-hc: T370068]	[production]
13:34	<cmooney@cumin1002>	START - Cookbook sre.ganeti.reboot-vm for VM netflow7001.magru.wmnet	[production]
13:34	<cmooney@cumin1002>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow6001.drmrs.wmnet	[production]
13:31	<cmooney@cumin1002>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow5002.eqsin.wmnet	[production]
13:30	<cmooney@cumin1002>	START - Cookbook sre.ganeti.reboot-vm for VM netflow6001.drmrs.wmnet	[production]