Hot on the heels of talking about platform updates, let's talk some business time! Now, business time can mean a whole lot of different things to different people in different situations, so let's get specific and talk context...
As much as I love Flight of the Conchords, we're not talking their type of business time; we're talking about upgrades, patching, expansions, replacements, and all that good stuff that happens behind the scenes!
Typically, this type of work happens outside of "business time", or business hours if you prefer. We do this because we want to minimise any risk of disruption to workloads running on the platform, and that seems like a perfectly good approach... or does it?
Now don't get me wrong, we absolutely want to, and actively DO, manage risk in our environment, and that's not about to change. As a service provider we're absolutely risk averse when it comes to managing our customer workloads, given they are our lifeblood! Currently we schedule maintenance windows (broadly speaking) at two specific times a week: Wednesday and Sunday evenings. From time to time we'll schedule emergency windows for more pressing issues as the need arises. Again, it all comes back to availability, performance expectations and managing risk for our customers. ⚖️
What I'm wondering here is whether this type of thinking is perhaps a little antiquated. Not the availability, performance and risk parts... but the scheduling piece. "Back in the day", working hours were typically an 8am to 5pm type of thing, where this all makes perfect sense. But businesses of today are a more complicated beast, with many demanding access to resources for considerably wider portions of the day, and some even 24/7! If we consider these types of changes, when is a good time to do the work and schedule that maintenance window?
So how about we run through a few scenarios and how they might be handled:
- a failed memory module in a host
- a failed component in a storage array
- a software upgrade of a storage array
A failed memory module in a host
Occasionally (fortunately not that frequently!) hardware failures like this can occur. Depending on the severity of the failure, it could result in an HA event or possibly just an alert. An HA event results in all the virtual machines on that host being restarted on another host in the cluster