Downtime is a word that strikes dread into any network manager. In today's always-on, hyper-connected environment, the consequences of any loss of availability can be severe.
But while unplanned downtime caused by issues such as power outages, hardware failure or cyber attacks can lead to major financial and reputational harm, there is another type of downtime that you need to prepare for: planned downtime for maintenance or updates.
While you have more control over planned downtime than over unplanned outages, that doesn't mean it won't be disruptive to both employees and customers. And if something goes wrong, it can quickly turn into a lengthy unplanned outage. It's therefore vital to plan these activities in advance and minimize any disruption.
The costs of network downtime
Whatever the reason behind it, downtime can be highly costly for your business, both financially and in terms of your reputation. For instance, figures from Veeam suggest that in 2021, the average cost of downtime stood at almost $85,000, with the average outage lasting 79 minutes.
However, the true cost of these issues goes beyond direct financial loss. Other problems can include reduced productivity and a loss of confidence among both employees and customers. Indeed, frequent periods of planned downtime, even short ones, can quickly add up. This not only results in more direct lost revenue, but can also give a poor impression of your company as a whole.
Creating an effective downtime schedule
An essential step in reducing these costs is to have a clear idea of exactly how long you'll need to be offline, and when the best time to do this will be. For global enterprises with operations around the world, there may be no 'best time' to initiate scheduled downtime: even if it is outside working hours in one location, you could still be causing disruption for users elsewhere.
To minimize the impact, look closely at your usage patterns, and also take into account external factors. For instance, performing maintenance right before major holidays may be the best solution for an application used by employees, but could have more impact on a consumer-facing service.
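As a rough illustration of how usage patterns can inform the choice of window, the sketch below scans a cycle of hourly load figures and returns the contiguous stretch with the least total activity. The function name and the input format (one number per hour, for example average requests per hour in UTC summed across all regions) are assumptions made for this example, and it deliberately ignores external factors such as holidays.

```python
from typing import Sequence, Tuple


def quietest_window(hourly_load: Sequence[float], duration_hours: int) -> Tuple[int, float]:
    """Return (start_hour, total_load) for the least busy contiguous window.

    hourly_load holds one value per hour of a repeating cycle (for example
    24 values for a day), so a window is allowed to wrap past the end.
    If several windows tie, the earliest one is returned.
    """
    n = len(hourly_load)
    if not 0 < duration_hours <= n:
        raise ValueError("duration_hours must be between 1 and len(hourly_load)")

    best_start, best_total = 0, float("inf")
    for start in range(n):
        # Sum the load over the candidate window, wrapping around the cycle.
        total = sum(hourly_load[(start + i) % n] for i in range(duration_hours))
        if total < best_total:
            best_start, best_total = start, total
    return best_start, best_total
```

For a global service, the input would ideally combine usage from every region, so that a quiet night in one location does not hide a busy morning in another.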
Having a schedule also prevents you from slipping into a mindset that leaves maintenance until the last minute - or worse, until something has actually broken. Even if you have no updates planned, you should have clear guidelines for what you want to achieve during each period, such as testing your systems and looking for minor defects.
This is also helpful in preparing end-users. If employees and customers know that maintenance will always be carried out at a specific time each month, for example, they can adjust their own schedules to accommodate it. This doesn't necessarily mean there won't be any inconvenience, but it reduces many frustrations.
Keeping planned downtime to a minimum
Once a planned maintenance period begins, keeping it as short as possible is vital. To do this, there are a few things you should put in place and prepare for beforehand.
Firstly, you should prioritize your operations and identify which issues should be tackled first. This ensures you have as much time as possible to take on the most pressing issues without needing to extend any downtime.
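One simple way to picture this prioritization is a greedy plan: sort the outstanding jobs by urgency and add each one to the window only if it still fits in the time remaining. The `Task` fields and the priority scale below are hypothetical, and a greedy pass is only one possible approach; it does not guarantee the best possible use of the window.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    priority: int  # 1 = most urgent (an assumed scale for this sketch)
    minutes: int   # estimated time the task needs


def plan_window(tasks: Iterable[Task], window_minutes: int) -> Tuple[List[Task], List[Task]]:
    """Split tasks into (scheduled, deferred) for a fixed-length window.

    Tasks are considered most urgent first, with shorter tasks first among
    equals. A task that no longer fits is deferred to a later window rather
    than extending the downtime.
    """
    scheduled: List[Task] = []
    deferred: List[Task] = []
    remaining = window_minutes
    for task in sorted(tasks, key=lambda t: (t.priority, t.minutes)):
        if task.minutes <= remaining:
            scheduled.append(task)
            remaining -= task.minutes
        else:
            deferred.append(task)
    return scheduled, deferred
```

With a 60-minute window and three jobs of 30, 45 and 20 minutes in descending urgency, this plan would run the first and third and defer the second, keeping the outage inside its limit.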
It also pays to have a good overall picture of your network, including where any bottlenecks or other constraints lie. These may be caused by issues such as older machines that require more frequent inspections. Similarly, if you regularly see delays at a particular stage when developing a service or application, you can plan ahead and expedite that stage in order to stay within your deadline.
Good planning also means you should have a backup plan in case anything goes wrong. For instance, Sony's PlayStation division recently got into difficulty when a planned update to its latest flagship Gran Turismo 7 title, which should have been a relatively brief outage, went wrong and introduced new errors.
The result was extended downtime while the problem was fixed, leaving customers unable to access the game for over 30 hours and leading to many negative headlines. To avoid this kind of incident, it's important to have a fallback option that you can revert to quickly.
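The idea of a fallback can be sketched as a small wrapper: apply the update, run a health check, and revert to the previous state if either step fails. The three callables are placeholders supplied by the caller and do not refer to any particular deployment tool; what counts as "healthy" is an assumption left to each team.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def update_with_fallback(
    apply_update: Callable[[], None],
    health_check: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Apply an update and keep it only if the health check passes.

    Returns True if the update was kept, False if it was rolled back.
    An exception raised while updating or checking is treated as a failure.
    """
    try:
        apply_update()
        if health_check():
            return True
        logger.warning("Health check failed after update; rolling back")
    except Exception:
        logger.exception("Update raised an error; rolling back")
    rollback()
    return False
```

The point of the design is that the decision to revert is made automatically and early, rather than after an extended attempt to fix the new version while users wait.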
Developing a predictive maintenance solution
The best way to minimize the impact of planned downtime is not to have it at all. While some updates or fixes will be inevitable, you can reduce the need for this by taking a more proactive approach to monitoring your system, so you can spot potential problems before they occur.
By addressing issues early, you can ensure that any necessary downtime is kept as short as possible. Do this by embracing a predictive maintenance approach and installing monitoring sensors and automation tools that can detect any changes in status or performance and send alerts when attention is required.
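As a minimal sketch of what "detecting changes in status or performance" might look like, the class below keeps a rolling baseline of recent readings and flags any new reading that strays too far from it. The class name, the window size and the three-standard-deviation threshold are all assumptions for illustration; real monitoring tools use more sophisticated methods.

```python
from collections import deque
from statistics import mean, pstdev


class DriftMonitor:
    """Flag readings that deviate sharply from a rolling baseline."""

    def __init__(self, window: int = 20, threshold: float = 3.0, min_samples: int = 5):
        self.readings = deque(maxlen=window)  # most recent readings only
        self.threshold = threshold            # allowed deviation, in standard deviations
        self.min_samples = min_samples        # readings needed before alerting starts

    def observe(self, value: float) -> bool:
        """Record a reading and return True if it should raise an alert."""
        alert = False
        if len(self.readings) >= self.min_samples:
            baseline = mean(self.readings)
            spread = pstdev(self.readings)
            # With a perfectly flat baseline the spread is zero, so any
            # change at all is flagged.
            alert = abs(value - baseline) > self.threshold * spread
        # The reading joins the baseline whether or not it triggered an alert.
        self.readings.append(value)
        return alert
```

In practice, a True result would be wired to whatever alerting channel the team already uses, so that a drifting temperature or latency figure prompts an inspection before it becomes an outage.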