Today around 7:40am we noticed intermittent Internet connectivity in the area where we are located.
At 9:00am all Internet connectivity was lost, and we alerted our upstream provider.
At 10:30am the upstream provider notified us that an emergency repair crew would have to be dispatched to our location.
At 11:30am the upstream provider's technicians arrived on site and began troubleshooting the connectivity problem at our location.
At 11:50am the upstream provider's technicians notified us that a major Internet connectivity backbone had been severed (cut), affecting the whole area where we are located (Lakatamia), and that they were working on fixing the cable. The affected area includes a few hundred houses and offices. Their estimated time for repair is 24 hours.
That's the bad news.
The good news is that we will be honoring our uptime guarantee, and actually going beyond it. Here is the related paragraph taken from our TOS (Terms of Service):
60. In the event that the user's website and/or cloud and/or dedicated server and/or virtual private server and/or colocated system is unavailable for less than 100%, deZillium will credit the following month's service fee as follows. The user's credit shall be retroactive and measured in 24 hours a day of a calendar month, with the maximum credit not exceeding 50% of the monthly service charge for the affected month.
- 95% to 99.9% - YOUR account will be credited 10% of your monthly fee
- 90% to 94.9% - YOUR account will be credited 20% of your monthly fee
- 89.9% or below - YOUR account will be credited 50% of your monthly fee
Ignoring the maximum set in our TOS (50% of the monthly fee), we will offer a month's worth of hosting (100% of the monthly fee) for all customers affected.
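For anyone who wants to check what the standard TOS tiers would have given, here is a minimal sketch of the credit calculation. The function names, the 30-day month, and the 24-hour downtime figure are assumptions for illustration only (24 hours is the repair estimate, not a measured outage). The quoted tiers leave small gaps (for example between 94.9% and 95%), so the sketch assumes simple cut-offs at 95% and 90%, and treats anything below 100% as qualifying for at least the 10% tier.

```python
# Illustrative sketch of the TOS credit tiers; not our billing code.

def uptime_percent(downtime_hours: float, days_in_month: int = 30) -> float:
    """Uptime for the month, measured over 24 hours a day (30 days assumed)."""
    total_hours = 24 * days_in_month
    return 100.0 * (total_hours - downtime_hours) / total_hours


def credit_percent(uptime: float) -> int:
    """Map monthly uptime (%) to the credit tier quoted from the TOS.

    Assumption: tier boundaries are treated as >= 95 and >= 90, since the
    quoted ranges do not cover every value.
    """
    if uptime >= 100.0:
        return 0    # no downtime, no credit
    if uptime >= 95.0:
        return 10   # "95% to 99.9%"
    if uptime >= 90.0:
        return 20   # "90% to 94.9%"
    return 50       # "89.9% or below", also the TOS maximum


# Hypothetical example: 24 hours of downtime in a 30-day month.
uptime = uptime_percent(24)
print(round(uptime, 2), credit_percent(uptime))
```

Under those assumptions a 24-hour outage works out to roughly 96.67% uptime, which falls in the 10% tier. The 100% credit we are offering is well above both that tier and the 50% maximum.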
Both connections going down at the same time meant that the failover systems did not function as expected, and could not fail over to a separate location. Just to be clear, we did account for cables getting cut in our immediate vicinity, and installed 50 separate lines going into our neighbourhood. If a single line were cut in that cable, we would immediately fail over to a separate cable. 10 of those lines are wired and ready to go at a moment's notice.
The problem was upstream of (before) that cable segment, and that cable is entirely controlled by our ISP. Even multihoming (using different ISPs) would not account for that cable segment, since it is shared between ISPs. Learning from *OUR* mistakes, we will be working on getting this fixed. We are considering the possibility of adding a high-capacity wireless connection to eliminate any future problems with any cables.
We could go into apologetic damage control mode, but we won't. It was our mistake not to take into consideration the possibility of a cable section beyond our control getting damaged.
We have set up an emergency connection at a different location. Unfortunately, this connection cannot be used for hosting services. We just wanted to let you know what's going on, in case you thought we had disappeared overnight. That's all folks!