UPDATE Thu 09:30 – all systems should be back to normal. Please let us know if you still encounter problems. Thanks to Axel and Paddy for their commitment and the incredible Dalco service for fixing it within 6h (at 8am, mind you).
sometimes it just has to work, and fast!
UPDATE Thu 00:50 – a broken valve blocked the cooling water in the HIT D 13 server room and all 14 water-cooled racks severely overheated (not just D-PHYS). We managed to revive almost all services with the exception of the GGL file shares (this server is dead). We’ll post updates later today when we have more information.
complete loss of cooling in the server room. We have yet to assess the damage.
Update Thursday 07:15 – all systems back to normal. whew.
Update Wednesday 11:00 – first rack back to normal. The picture shows how it looked during the remodeling.
Update Wednesday 07:00 – the most difficult part is over and the first rack is being retrofitted as we speak.
All of D-PHYS’s important servers (and services: mail, homes, SAN, web) reside in two water-cooled racks in HIT D 13. On Wednesday, August 20, those racks will have to be retrofitted by our colleagues at Informatikdienste since certain spare parts are no longer available. We have an elaborate plan for powering the servers externally while the racks are offline, with a scheduled 5-minute downtime that most of you won’t even notice. However, there is a small chance that this external power supply will not work as expected, which would lead to a longer interruption. Unfortunately we have no influence on the date, time and procedure of this modification and can only try our best to minimize potential consequences. So if something should go wrong next Wednesday, please don’t panic, we’ll be hard at work to fix it ASAP.
Thank you for your cooperation.
We have scheduled a software maintenance of the D-PHYS mail server for tomorrow, Wednesday, the 18th of June 2014, starting in the late afternoon around 5pm. A downtime of all D-PHYS mail services during the evening will be part of the maintenance. The downtime is expected to take approximately 15 to 30 minutes.
During the downtime, sending and receiving e-mails will not be possible and the web mail service will not be available. Mails arriving during the downtime will be delayed.
Additionally there will be a downtime of our “BackupPC” backup service for laptops and lab PCs due to server relocation on Thursday (19th of June 2014) starting around 9am.
Microsoft provided a final batch of patches for Windows XP in April 2014. No further security or stability fixes will be released. This means that machines still running Windows XP violate the ETH Bot (Acceptable Use Policy for Telematics), which requires that every computer connected to the ETH network be fully updated and secured.
The central IT security group of ETHZ continuously inspects the network streams for signatures of XP computers. In the D-PHYS public networks they still detect around 15 Windows XP based computers. If you have a running XP machine connected to the public network, please migrate the operating system to a newer version, e.g. Windows 7.
In case you are forced to keep Windows XP up and running, you can migrate the machine to our eXile network. Simply send the required information to firstname.lastname@example.org after you’ve read and understood the eXile Terms-of-Use, so we can prepare the machine for the eXile network.
If you have any questions or need help, please do not hesitate to contact the ISG D-PHYS Helpdesk.
On Monday, May 5, there will be a scheduled power outage in the HPT building, between 13:00 and 22:00. This will also affect ISG’s offices, but none of the servers. Our services will run as usual, but we’ll have to move the helpdesk to a temporary location during the outage. So please be patient when calling and wait for your call to be redirected to our pager or write an email instead. We hope to be back to normal by Tuesday morning.
On Monday the public was made aware of a severe bug in OpenSSL, a cryptography library which is used as the core of many cryptographically secured IT services. Since the bug was in the Heartbeat extension it has been named “Heartbleed”.
This bug allowed attackers to stealthily read parts of the memory used for cryptographic operations, which may include digital keys in use on servers or passwords transferred over encrypted connections.
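As a rough illustration of this kind of bug, here is a toy Python model of a heartbeat handler that trusts the length claimed by the client. It is not OpenSSL code: the buffer contents and the function names are made up for this sketch, and the "fixed" variant simply drops requests whose claimed length exceeds the real payload.

```python
# Toy model only: names and buffer contents are hypothetical, not OpenSSL code.
# The 4-byte request payload is followed in memory by unrelated sensitive data.
MEMORY = b"PING" + b"secret-key-material"
ACTUAL_PAYLOAD_LEN = 4


def heartbeat_vulnerable(memory, claimed_len):
    """Echo `claimed_len` bytes, trusting the length sent by the client."""
    return memory[:claimed_len]


def heartbeat_fixed(memory, actual_len, claimed_len):
    """Silently discard requests that claim more bytes than were sent."""
    if claimed_len > actual_len:
        return b""
    return memory[:claimed_len]


if __name__ == "__main__":
    # An honest request gets its own payload back.
    print(heartbeat_vulnerable(MEMORY, 4))
    # A request that overstates its length also gets neighbouring memory.
    print(heartbeat_vulnerable(MEMORY, 10))
    # With the bounds check the same request is dropped.
    print(heartbeat_fixed(MEMORY, ACTUAL_PAYLOAD_LEN, 10))
```

The point of the sketch is only that the reply length must be checked against the data actually received, which is why the leak left no trace in normal logs.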
If you used any password-protected D-PHYS web service or the D-PHYS mail server between the 12th of December 2013 and Tuesday, the 8th of April 2014 (or the BackupPC web interface between the end of 2012 and that Tuesday), there is a very small chance that your D-PHYS password and possibly other transmitted data may have been leaked to an attacker. We currently have no indication that this has actually happened on our servers.
To be safe, you might want to change the password of your D-PHYS account and any other account where the same password is used. See this Heise article for a discussion (in German) about whether you should change your password or not.
The central network group informed us about a planned network interruption between 6:30 and 7:30 a.m. on the 10th of April 2014 due to maintenance work.
The following rooms are affected by this interruption:
HPT D1 – HPT D20 and HPT E1 – HPT E17.
During this interruption it may not be possible to access the D-PHYS services and the internet from these rooms.
As announced in an earlier post last year, Microsoft is going to end the support for Windows XP in April 2014.
After this date the central network security group of the ETH will frequently scan our public networks to identify any remaining Windows XP machines. Every Windows XP machine detected by such a scan will be disabled on the network level since it is strictly prohibited to keep this operating system up and running on the public network of ETH.
Since we are aware that some Windows XP machines will live on after the end-of-life date, we worked out a solution to support these situations and to help you avoid conflicts with the network usage regulations.
We started a project called eXile, which provides tightly locked-down network environments that are monitored with advanced security techniques and protected by extensive firewall setups. Furthermore, eXile provides easy interfaces for you to manage your computers and to review the security state of and network access to your machines in eXile.
You can send your machines to the eXile when they match one of the following scenarios:
- Lab computers (controlling, collecting measurement data, or monitoring other systems)
- Industrial computers
- Embedded systems
The following applications are not suitable for eXile and need to be migrated to a supported operating system:
- Office Computers
- Computers on which internet access needs to be available
- Computers on which emails are received and sent
- Computers that provide any services to public computers on the internet
Please note that eXile should not be seen as an excuse not to migrate your Windows XP to a supported operating system as soon as possible. The purpose of eXile is really only to address those few machines that are somehow locked to their operating system.
Although we created eXile to address the Windows XP end-of-life problem, it can also take in any other computer for which you want an extra level of security or on which you run any other outdated or insecure operating system.
If you think your remaining Windows XP computers are candidates to send to eXile, we would be happy if you could send a message to email@example.com telling us the number of computers and what you are using them for. Later this month a web interface will be made available at https://exile.phys.ethz.ch/ where you can directly register every machine you want to send to eXile.
After eXile is fully online, another post will be submitted here.
On Monday morning we found out that large incoming mails (1 MB or larger) were dropped without leaving any error messages in our log files. These mails were lost between Thursday (Jan 9) evening 18:27 and Monday (Jan 13) morning 11:06. Some indicators (e.g. spam filter rules for this case) lead us to estimate about 560 failed local deliveries to about 300 unique recipients.
If you expected e-mails with attachments close to 1 MB or larger within this time frame, there is a high likelihood that they got lost. The only information we still have about these mails is the sender, the recipient, and the arrival date and time. If you were one of these recipients, please ask the sender to send the mail again.
You can check on this web page whether mails you should have received were lost. You’ll have to log in with your D-PHYS account and will then see the sender (or mailing list) of each lost mail and the time it arrived. We’ll also inform all affected recipients individually.
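For the curious, a lookup like this can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code behind the web page: the record layout (sender, recipient, arrival time) follows the information listed above, while the function name and the example addresses are hypothetical.

```python
from datetime import datetime

# Loss window as stated in this post (January 2014).
WINDOW_START = datetime(2014, 1, 9, 18, 27)
WINDOW_END = datetime(2014, 1, 13, 11, 6)


def lost_mails_for(recipient, records):
    """Return (sender, arrival) pairs for one recipient inside the loss window.

    `records` is a hypothetical list of (sender, recipient, arrival) tuples
    describing dropped deliveries.
    """
    return [
        (sender, arrival)
        for sender, rcpt, arrival in records
        if rcpt == recipient and WINDOW_START <= arrival <= WINDOW_END
    ]


if __name__ == "__main__":
    # Made-up example records, not real log data.
    records = [
        ("alice@example.org", "user@example.org", datetime(2014, 1, 10, 12, 0)),
        ("bob@example.org", "user@example.org", datetime(2014, 1, 14, 9, 0)),
        ("carol@example.org", "other@example.org", datetime(2014, 1, 11, 8, 0)),
    ]
    for sender, arrival in lost_mails_for("user@example.org", records):
        print(sender, arrival.isoformat())
```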
The problem occurred after one of the software updates on Thursday, which brought stricter code checking, and has been resolved since Monday morning 11:06.
The issue was caused by a long-standing and subtle programming error in the check which, for performance reasons, prevents bigger mails from being inspected closely by the main spam filter. It was only triggered upon local mail delivery, so mails sent from D-PHYS to outside D-PHYS were not affected. E-mails to D-PHYS mailing lists (or other mailing lists) with an archive should be available in the corresponding mailing list archives.
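To make the role of such a size check concrete, here is a minimal Python sketch of the intended behaviour: large mails skip the expensive spam scan but are still delivered. It does not reproduce the actual error or our mail server configuration; the threshold value, the step names and the function name are assumptions for illustration.

```python
# Assumed threshold in bytes; the post only says "1 MB or larger".
SCAN_SIZE_LIMIT = 1_000_000


def route_mail(size_bytes):
    """Return the processing steps for a mail of the given size.

    Step names are hypothetical. Large mails skip the close spam
    inspection, but every mail must still end with delivery.
    """
    steps = []
    if size_bytes < SCAN_SIZE_LIMIT:
        steps.append("spam_scan")
    steps.append("deliver")
    return steps


if __name__ == "__main__":
    print(route_mail(500_000))    # small mail: scanned, then delivered
    print(route_mail(2_000_000))  # large mail: scan skipped, still delivered
```

The failure described above corresponds to the large-mail branch ending without a delivery step and without an error being logged.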
We’re truly sorry for any inconvenience this may have caused and have already taken measures so that similar issues won’t result in mail loss from now on.
Update: it happens to the best of us: Gmail for iOS bug might cause data loss
On Thursday, the 9th of January 2014, starting in the late afternoon, we will run multiple software updates on the D-PHYS mail server. We do expect multiple downtimes throughout the evening, partially of single mail services, partially of the whole mail server.
This will likely also delay the delivery of incoming mails up to several hours.
Update, 22:30: Everything back to normal.