On Monday morning (7), a rupture in a channel that supplies water to the cooling towers of the data center hosting IBM Cloud (SAO 01) caused the temperature of servers and data storage equipment to rise. The overheating led to connection instability and also affected the Brazilian internet, since many companies depend on this cloud's services.
Amazon Web Services, Caixa, Itaú, Vivo, Correios, Banco Inter, TIM, NuBank, Mercado Livre, WhatsApp, FreeFire and even the Brazilian Federal Police are feeling the effects of the incident, which, for now, has no estimated time for resolution. IBM has already released several bulletins, at intervals of 30 to 60 minutes, informing customers about the progress of the situation.
“The fracture has already been repaired and the replacement water is arriving at the site to refill the tanks. Based on the current and sustained high temperatures, the Sao01 Server Room 01 suite will be de-energized to reduce the load and mitigate the increase in temperature,” says one of them.
“All devices in Sao01 Server Room 01 will lose power as a result of this action and will remain offline until the root problem with the cooling incident is resolved and temperatures are stabilized. There are currently reports of impacts on storage offerings (file, block and VSI-SAN), Bare Metal and VSI. As a result of the de-energization of Sao01 Sr01, Sao01 Sr02 will experience a drop in network connectivity and services,” it adds.
Instability in several services is expected to persist until the problem is solved. Source: reproduction
Other bulletins give details on the triggering of various system overheating alarms and on ongoing work to “mitigate any effects the problem may have on IBM Cloud customer services”. At 7:30 am (Brasília time), there were reports of impacts on storage offerings (file, block and VSI-SAN), Bare Metal and VSI.
For now, the most recent update reads: “Based on current and sustained elevated temperatures, the IBM Cloud is deciding to shut down bare metal hosts as an additional mitigation effort. This decision is being made based on the provider’s current ETA, location and observed behavior of equipment.”
The information is from CISO Advisor.