Archive for the ‘Downtime’ Category

Matrix (chat) server maintenance

Wednesday, August 30th, 2023

All Matrix services will be offline for maintenance starting on Thursday 31st Aug 2023 in the morning around 06h00. The minimum downtime is 1h, but some bots/bridges may take longer.

The host system will be upgraded from Debian 10 to 12, followed by an upgrade of the database (PostgreSQL 11 to 15) and ~20 application servers.

Upgrade schedule:

phys.ethz.ch homeserver (people accounts)

First priority is the phys.ethz.ch homeserver hosting our accounts and rooms. Estimated (best case) downtime ~1h.

Your Matrix clients (Element) will show connectivity errors during the downtime:

Matrix homeserver offline

mbot.ethz.ch homeserver (bots, bridges)

Second priority is the mbot.ethz.ch homeserver, including all bots and bridges present in the #mbot:phys.ethz.ch room, and anything else. I expect most to be back after another few hours, and by Thursday evening at the latest.

The deprecated webhook bridge will be taken out of service.

Not affected

The ETH homeservers staffchat.ethz.ch and studentchat.ethz.ch are not affected by this downtime.

Alternative: jitsi.phys.ethz.ch video conferencing with integrated chat.

home server maintenance

Wednesday, July 5th, 2023

Scheduled maintenance will be taking place on our home.phys.ethz.ch file server on Wednesday, July 12, starting at 16:00. The service will be down for approximately 4 hours. We will be replacing the hardware with all-flash storage and upgrading the base system.

Update 18:15: the new home server is open for business. Most SMB + NFS clients will not have survived the 2h downtime and will have to be rebooted. We'll go through the most obvious ones, but if yours doesn't work, try restarting it.

All home directories (Linux, Windows and Mac, SMB and NFS) will be unavailable during this time.

For emergency cases, you'll have read-only access to the backups as described here.

This migration will mark the end of the huge storage migration project of 2023. Thanks for your patience.

group-data server maintenance

Wednesday, May 31st, 2023

Scheduled maintenance will be taking place on our group-data.phys.ethz.ch server on Wednesday, June 7, starting at 16:00. The service will be down for approximately 4 hours. We will be replacing some hardware and upgrading the base system.

All group shares will be affected except IPA, IGP and Galaxy.

For emergency cases, you'll have read-only access to the backups as described here.

Web server upgrade

Tuesday, February 8th, 2022

This Thursday 2022-02-10 starting at 07:00 we will upgrade the server hosting most of our websites.

Affected websites

The following websites are unavailable during the downtime:

Important changes for website owners

All website owners: If you are a website owner/admin, please join our new Matrix room #web:phys.ethz.ch to get support and news. After the upgrade, please check your websites for problems.

Python WSGI app owners: All WSGI apps have been switched to use a virtual environment to pin the currently used Python package versions. We encourage you to review and upgrade your dependency versions (via requirements.txt) after the server upgrade. Please read our new WSGI documentation for details.
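As a starting point for reviewing pinned dependencies, the following minimal sketch lists the packages installed in the active Python environment in requirements.txt style. It assumes you run it with the interpreter of your app's virtual environment; the function name pinned_requirements is hypothetical and not part of our WSGI setup. It only uses importlib.metadata from the standard library (available since Python 3.8).

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for all installed distributions.

    Hypothetical helper for illustration: it inspects whichever
    environment the running interpreter belongs to.
    """
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            lines.add(f"{name}=={dist.version}")
    # Case-insensitive sort, similar to the ordering of `pip freeze`
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```

Comparing this output with your requirements.txt before and after the upgrade shows which package versions are actually in use and which pins may be worth updating.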

Versions

  • OS: Debian 10 -> 11
  • Python: 3.7 -> 3.9
  • PHP: 7.3 -> 7.4

Partial Network Downtime on Mon 6th Dec after 19h00

Monday, November 29th, 2021

The central Informatikdienste will have a scheduled downtime of all networking (cable and wireless) in the buildings HPK, HEZ, HPM, HPL and HPW on Monday 6th Dec 2021 in the evening between 19h00 and 23h00.

This is the second of three downtimes for the ongoing project to split the current networks into smaller chunks. This major undertaking will also induce a short downtime for some computers in the dynamic DHCP pool in other buildings (as some of our IP ranges are being moved to the listed buildings).

Users don’t need to do anything and their computers should come back online automatically. Otherwise try to reboot or get in touch with us.

In order to prepare for the migration, Informatikdienste will forbid all changes to their DHCP servers between Friday 3rd Dec 13:00 and Tuesday morning. As a consequence we will not be able to register new devices or hostnames during this period.

Partial Network Downtime on Mon 8th Nov after 19h00

Monday, November 1st, 2021

The central Informatikdienste will have a scheduled downtime of all networking (cable and wireless) in the buildings HPH, HPP, HPR, HPS, HPV and HPZ on Monday 8th Nov 2021 in the evening between 19h00 and 23h00.

This is the first of three downtimes for the ongoing project to split the current networks into smaller chunks. This major undertaking will also induce a short downtime for some computers in the dynamic DHCP pool in other buildings (as some of our IP ranges are being moved to the listed buildings).

Users don't need to do anything and their computers should come back online automatically. Otherwise try to reboot or get in touch with us.

In order to prepare for the migration, Informatikdienste will forbid all changes to their DHCP servers between Friday 5th Nov 13:00 and Tuesday morning. As a consequence we will not be able to register new devices or hostnames during this period.

Web services downtime

Tuesday, January 19th, 2021

Update 07:00: All web services are back online.

Tomorrow Wednesday 2021-01-20 starting at 06:00 we will upgrade the server hardware hosting most of our web services. We expect them to be back by 08:00 at the latest.

Affected web services

The following services are unavailable during the downtime:

Our Debian, Ubuntu and Raspbian mirror as well as Grafana, InfluxDB and Webmail will not be affected.

We will not be able to send any status updates via our news blog or via our Matrix news and status rooms.

Hardware maintenance of storage front-end server igp-data.

Monday, November 30th, 2020

Update 00:30: All shares are coming back online now. My sincerest apologies for the delay. And good night to all.

Update 22:40: We have run into hardware problems with one of the backends. The shares remain offline as we continue to diagnose it. A further update will be posted as soon as possible.

On Wednesday, 2 December 2020, between 18:00 and 20:00, access to igp-data shares will be interrupted for scheduled maintenance. The shares need to be taken entirely offline for a network upgrade. At the same time we will be adding more space to the underlying SAN. This affects all shares on the igp-data storage gateway (ggl, pf and gsg).

Thank you for your patience, and kindest regards.

Hardware maintenance of storage front-end servers.

Thursday, July 30th, 2020

Update 23:50: we ran into severe problems and the migration took longer than expected. Everything is back online now. Sorry we're late.


Planned maintenance will be taking place on all shared-storage front-end servers on Thursday, August 6th, starting at 17:00. The service will be down for approximately 2-3 hours. If we finish earlier than expected, this post will be updated as soon as the work is completed. We will be upgrading the network switch and replacing hardware in several machines.

All group shares will be affected, i.e. group-data, IPA, IGP and Galaxy. Only the home and backup servers will be accessible during this time.

For emergency cases, there will be read-only access to last night’s backup as described here.

Group-data server hardware maintenance.

Tuesday, June 16th, 2020

Update 18:15: group-data is back!

Planned maintenance will be taking place on our group-data.phys.ethz.ch server on Friday, June 19, starting at 17:00. The service will be down for approximately 2 hours. We will be replacing the network interface card to improve service stability.

All group shares will be affected except IPA, IGP and Galaxy.

For emergency cases, there will be read-only access to last night’s backup as described here.