Archive for the ‘Downtime’ Category

Migration of Dynamic DNS

Tuesday, January 21st, 2020

Some of you use our DynDNS infrastructure, which automatically assigns hostnames to computers with a dynamic IP address. This feature lets you connect to your computer using the hostname it sends, followed by the dhcp.phys.ethz.ch domain (e.g. example.dhcp.phys.ethz.ch), instead of the ever-changing dynamic IP address.

Thursday morning

Jan 30 2020 between 9:00 and 11:00

we will be migrating our DynDNS service to the servers of the central Informatikdienste. As a consequence, the resolution of example.dhcp.phys.ethz.ch to its dynamic IP address may fail intermittently during that time. The global phys.ethz.ch and ethz.ch domains are not affected, so the bulk of our users will not even notice the migration.
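If a script depends on a DynDNS name, it can tolerate short resolution gaps like the one announced here by retrying the lookup. The following is only a minimal sketch: the function names, the retry count and the delay are assumptions for illustration, not part of our service.

```python
import socket
import time

DYNDNS_DOMAIN = "dhcp.phys.ethz.ch"  # domain named in the announcement


def dyndns_fqdn(short_name, domain=DYNDNS_DOMAIN):
    """Build the full DynDNS hostname from the name a computer sends."""
    return f"{short_name.strip().rstrip('.')}.{domain}"


def resolve_with_retry(hostname, attempts=5, delay=2.0,
                       resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, retrying on lookup failures.

    Returns None if every attempt fails. The resolver argument is a
    hypothetical hook so the function can be tried without network access.
    """
    for attempt in range(attempts):
        try:
            return resolver(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt < attempts - 1:
                time.sleep(delay)  # wait before the next attempt
    return None
```

Calling `resolve_with_retry(dyndns_fqdn("example"))` would then look up example.dhcp.phys.ethz.ch and give up after five failed attempts.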

Update: Informatikdienste have postponed the migration from 23rd to 30th January.

Groupware upgrade

Wednesday, September 25th, 2019

Update 08:00: Migration completed. Please note that a legacy CalDAV URL has changed - if you're using a CalDAV client (for example Thunderbird or Apple Calendar), make sure you have the correct URL according to the documentation.

For our calendar solution groupware.phys, we have scheduled a migration on Friday, September 27, starting at 07:30. The service will be down for approximately 1 hour. We will move the service to a new virtual machine and upgrade it to a new version.

FileMaker Upgrade

Friday, August 30th, 2019

We will upgrade our FileMaker server next Tuesday, 3rd September 2019, between 20:00 and 22:00. This will lead to a downtime of the services that depend on a FileMaker database, for instance experimente.phys.ethz.ch and lager.phys.ethz.ch.

The new FileMaker server will only work with FileMaker clients version 16 or newer. If you need to access a FileMaker database from your computer, we recommend you install the latest FileMaker 18 from the IT Shop. If you have an ISG-managed computer, we will take care of upgrading the FileMaker client.

Home server maintenance on Tue, July 9, 17:00

Wednesday, July 3rd, 2019

Update 20:10 Migration finished! Everything should work as normal.

In order to guarantee sustained performance and availability of our storage system, we schedule a maintenance downtime of our home directory server on

Tuesday, July 09, starting at 17:00

This only affects the home shares (technically: smb://home.phys.ethz.ch & /home/USERNAME). Email and group shares will not be interrupted.

Since the server also needs a file system check, the downtime will take several hours.

For emergency cases, there will be read-only access to last night’s backup as described here.

We will update this posting once the home server is back online.

Storage migration

Monday, December 3rd, 2018

Update 21:00 - IGP shares are back. Welcome to igp-data!
Update 19:30 - the D-PHYS shares are back. IGP will take a little more time.

In order to guarantee sustained performance and availability of our storage system, we need to schedule a few storage maintenance windows. The first one will take place on Wednesday, 12.12.2018 at 16:00 and affect all D-PHYS and IGP group shares, but not IPA or galaxy (technically: windata/macdata, but not astrogate or ipa-data). The relevant shares will be offline for at least 3 hours.

For emergency cases, there will be read-only access to last night's backup as described here.

Please note that these migrations will bring some overall changes to the D-PHYS storage setup:

  • the SMBv1 protocol will be disabled on all file servers. It has a long history of security issues and we've migrated all clients to newer versions, so this should not affect anyone. However, there's a small chance that we didn't catch all connections, so please contact us if you experience any issues after the migration.
  • all SMB protocol versions will be restricted to ETH-internal access. This step has been long overdue and since most ISPs block the necessary ports anyway, it shouldn't affect too many users. What it does mean, however, is that file server access from outside ETH will require VPN in the future.
  • IGP/D-BAUG will get their own front-end server igp-data. If you're with IGP and have already switched your file server mounts from windata to igp-data, you're good and don't have to do anything. If you haven't, you should do so before Dec 12 in order to get a seamless migration experience.
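
To find out whether a machine can still reach a file server over SMB, for example from outside ETH without VPN, one can simply try to open a TCP connection to the SMB port. This is a rough sketch under assumptions: port 445 is the standard SMB port, and the function name and the `connect` hook are made up for illustration.

```python
import socket

SMB_PORT = 445  # standard TCP port for SMB


def smb_reachable(host, port=SMB_PORT, timeout=3.0,
                  connect=socket.create_connection):
    """Return True if a TCP connection to the SMB port can be opened.

    The connect argument is a hypothetical hook that defaults to the real
    socket.create_connection and allows testing without network access.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:  # covers refusals, timeouts and DNS failures
        return False
    conn.close()
    return True
```

If `smb_reachable(...)` returns False for your file server from outside ETH but True once VPN is connected, the restriction described above is the likely reason.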

We'll update this post as the migration progresses and as soon as the systems are back.

Groupware migration

Thursday, September 27th, 2018

On Tuesday, October 2, starting at 07:00, we will migrate our groupware instance to another server. For about 1 hour you won't have access to your calendar. If you're one of the few people who also sync their email via groupware, mail will be offline too (you can always use webmail). After the migration your clients should just reconnect and resume syncing. If you notice any issues after we're done, please get in touch.

Update Wed 07:45: migration completed, please let us know if you experience any problems.

Mail server maintenance on Tue, March 27

Friday, March 23rd, 2018

Update 07:25 The migration is complete and our mail server is back online. Please let us know if you notice anything peculiar. This concludes our multi-step migration to the new mail server hardware.

---

In order to finalize the upgrade of the D-PHYS mail server, we schedule a maintenance downtime on

Tuesday, March 27, between 06:30 and 08:00 in the morning

During that time it will not be possible to send or receive emails. Incoming external emails will not be lost: they will be held on the sender’s side and delivered after the migration. Outgoing mail will be kept in your mail client until the connection is restored.

We will update this posting once the mail server is back online.

New location for mail filtering rules, forwarding and vacation auto-replies

After the migration, all mail-related settings will be consolidated into the Roundcube Webmail interface:

  • spam filtering rules (whitelist, blacklist)
  • forwarding of your emails to a different account
  • setting a vacation or out-of-office auto-reply message
  • defining rules to automatically file incoming mails into specific folders

This will make configuring your email settings easier and also give you more options than before (for example, the out-of-office auto-reply can now be configured to automatically terminate at the end of your absence).

Please refer to our readme for details on how to customize these settings in the future. Feel free to contact us if you have any questions.

The current settings of all active users have been converted and imported.

In technical terms we are migrating from procmail to sieve. In particular the hidden text file ~/.procmailrc in the user's home folder will be ignored after the migration.
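
For readers curious what a sieve rule looks like, here is a small sketch that renders a vacation auto-reply which stops by itself after the last day of absence. It relies on the standard Sieve extensions "vacation", "date" and "relational"; the script that the Webmail interface actually generates may look different, and the Python helper names are assumptions for illustration only.

```python
from datetime import date


def _quote(text):
    """Quote a Python string as a Sieve quoted string."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def vacation_script(last_day, subject, message, days=7):
    """Render a Sieve script that auto-replies up to and including last_day.

    days is the minimum number of days between two replies to one sender.
    """
    return "\n".join([
        'require ["vacation", "date", "relational"];',
        "",
        # only reply while the current date is on or before the last day
        f'if currentdate :value "le" "date" {_quote(last_day.isoformat())} {{',
        f"    vacation :days {days} :subject {_quote(subject)} {_quote(message)};",
        "}",
        "",
    ])
```

For example, `print(vacation_script(date(2018, 4, 6), "Out of office", "I am away."))` prints a script that replies until 6 April 2018 and stays silent afterwards.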

Mail server maintenance on Wed, Jan 24

Friday, January 19th, 2018

Update 07:25 Migration finished, welcome to the new mail server!

We schedule a maintenance downtime for the D-PHYS mail server on

Wednesday, January 24, between 07:00 and 08:00 in the morning

During this period, sending and receiving new emails will be interrupted, which will delay incoming and outgoing mail. Incoming external emails will not be lost: they will be held on the sender's side and delivered after the migration. Outgoing mail will be kept in your mail client until the connection is restored. The IMAP server will not be affected, so all email clients should have continuous access to the existing mailboxes.

This maintenance window will be used to migrate the first part of our mail server infrastructure to the latest version of the operating system and new hardware with fast SSD storage.

New location for SpamAssassin user preferences

We have redesigned how our mail server parses each user's spam filtering configuration. Currently one has to edit the hidden text file ~/.spamassassin/user_prefs in the home folder. Starting next Wednesday, the spam filtering rules can be edited more conveniently through the settings in the Webmail interface. This will allow users to easily

  • accept mail from a given sender and never mark it as spam (whitelist)
  • reject mail from a given sender and always mark it as spam (blacklist)
  • set the threshold score required for any message to be considered as spam

The existing user preferences have been parsed and all of the above settings have been imported into the new setup. The contents of ~/.spamassassin/ will be ignored after the migration. Please contact us if you have questions regarding your advanced SpamAssassin rules.
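
To check what your old file contained, the three settings listed above can be read out of a user_prefs file with a few lines of Python. This is only an illustrative sketch, not the import script we used: it looks at the SpamAssassin directives whitelist_from, blacklist_from and required_score (plus its older alias required_hits) and ignores everything else.

```python
def parse_user_prefs(text):
    """Extract whitelist, blacklist and threshold from user_prefs content."""
    prefs = {"whitelist": [], "blacklist": [], "threshold": None}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, *values = line.split()
        if key == "whitelist_from":
            prefs["whitelist"].extend(values)  # one or more addresses per line
        elif key == "blacklist_from":
            prefs["blacklist"].extend(values)
        elif key in ("required_score", "required_hits") and values:
            prefs["threshold"] = float(values[0])
    return prefs
```

Any other directives, such as custom rules, are skipped by this sketch and are the "advanced" cases mentioned above.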

Group share woes

Friday, December 8th, 2017

Update 20.12.: the strange intermittent permission problems some of you experienced could be traced back to a kernel regression. We're now back to using an older kernel.

Update 13.12.: we're cautiously optimistic that the problems have been fixed. Since Monday the file server has survived everything we threw at it. The culprit seems to be an Infiniband switch that sporadically disconnected under heavy load. We're now also turning on some performance improvements again, so you should see a speed increase when browsing files.

Update 06:45: group shares are back. Please let us know if you encounter any problems.

As some of you might have noticed, we've had some service quality issues with our group share server in the last few months. While not all interruptions are under our control (Informatikdienste lately have been very busy upgrading the ETH network, causing various network disruptions), we do have a problem with the group share server: it runs fine for weeks on end until it suddenly doesn't. To this day we have not been able to pinpoint the underlying problem, despite having changed a lot of parameters, both software and hardware. Our next step will be replacing the kernel on the disk backends and switching some hardware - for that we need a scheduled downtime on

Monday, December 11, starting at 06:00

during which the group shares will be unavailable for about 90 minutes. This affects all D-PHYS and IGP shares except the Astro and newly migrated IPA ones. We will post an update when the system is back.

We do apologize for the inconvenience these service issues might have caused you. Please bear with us while we're trying to locate and eliminate the root cause. We're monitoring the situation 24/7 and trying to react as quickly as possible whenever a problem occurs. But wait! You can help! There seems to be a correlation between crash probability and large-scale small-file I/O. This means you should, whenever possible, avoid reading or writing a lot of small files and bundle your data into fewer and larger files. This also increases performance!
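
One simple way to bundle many small files is to pack a whole directory into a single tar archive before copying it to the group share. The sketch below uses Python's standard tarfile module; the function names and the choice of gzip compression are assumptions for illustration, so adapt them to your own workflow.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path):
    """Pack src_dir and everything below it into one gzip-compressed tar file."""
    src = Path(src_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # store paths relative to the directory name, not the full path
        tar.add(str(src), arcname=src.name)
    return Path(archive_path)


def list_bundle(archive_path):
    """Return the sorted names of the regular files stored in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

Writing one archive to the share results in a single large sequential write instead of thousands of small ones, and `list_bundle` lets you inspect the content without unpacking it.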

Server room migration on Wed, Aug 23

Tuesday, July 25th, 2017

Update Thursday 01:45: we hit some unexpected problems with the non-Astro group shares. Everything is back now, please let us know if you experience any problems.

Some months ago, we were informed by Informatikdienste that we would have to migrate our two water cooled racks in the HIT server room due to upcoming remodeling. This move will take place on

Wednesday, August 23, starting at 16:00

and last for several hours. During this time, all our IT services will be unavailable, including login, e-mail, storage and ISG-hosted websites. Incoming e-mail will be held back and delivered afterwards. We will do our best to have login and e-mail back up within the first two hours, but group drives will take a bit longer due to the sheer amount of hardware we have to move.
We apologize for any inconvenience. Unfortunately, this migration cannot be performed on a weekend as we might have to interact with our colleagues at Informatikdienste, but it will ensure secure and enduring operation of our servers in the future.

some impressions from the migration - thanks to the whole team!