Defeating Downtime: High Availability IT Strategy for a Manufacturing Business

Written by Marc Schwartz | Jun 1, 2020 1:45:00 PM

Downtime is highly disruptive—and thus expensive—for manufacturing firms. Though exact costs are difficult to calculate, there’s little doubt that unplanned IT system outages throw inventory control procedures and production processes into disarray. The results include lost productivity, reduced product quality, unhappy customers, and damage to your business’s reputation. When operators must switch to manual control of production, product quality becomes inconsistent, meaning you’ll waste more materials. At the same time, you’ll see more inventory errors (and losses of products and raw materials) and an increase in customer returns.  

Even though the costs are high and the impact on customer satisfaction can be devastating, downtime remains common in manufacturing. The average manufacturing firm experiences 800 hours of production downtime every year, and many experience IT system failures even more often than that.

It doesn’t have to be this way. By carefully designing IT systems that are resilient enough to operate continuously for long periods without interruption, you can avoid the costs and losses associated with downtime. This approach, known as high availability computing, involves architecting systems with redundant components standing ready to take over in case of failure or emergency. Such systems must be thoroughly and regularly tested, and must be built not only to keep operating no matter what, but also to maintain adequate Internet connectivity and performance.

Designing IT infrastructures for high availability

To reduce the number of service interruptions and downtime events that your IT systems undergo, you must prepare for the unexpected. Generally speaking, this means maintaining duplicate systems, configured identically to those you use on a daily basis, that will automatically take over from faulty equipment at a moment’s notice. Such failover capabilities can be partial (keeping ready-to-go duplicates of only your most critical applications, Active Directory (AD), your Exchange server, and essential database contents) or complete. Naturally, the more extensive your failover capabilities, the more it will cost to implement and maintain them.
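To make the failover idea concrete, here is a minimal Python sketch of the pattern that a load balancer or cluster manager implements for you: poll the primary system’s health check, and after repeated failures redirect traffic to the standby. The hostnames and health endpoint are hypothetical, and a real deployment would rely on purpose-built failover tooling rather than a script like this.

    import time
    import urllib.request

    PRIMARY_HEALTH_URL = "http://erp-primary.example.local/health"  # hypothetical endpoint
    STANDBY_HOST = "erp-standby.example.local"                      # hypothetical standby
    FAILURES_BEFORE_FAILOVER = 3

    def is_healthy(url, timeout=2.0):
        """Return True if the primary answers its health check with HTTP 200."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.status == 200
        except OSError:
            return False

    consecutive_failures = 0
    while True:
        consecutive_failures = 0 if is_healthy(PRIMARY_HEALTH_URL) else consecutive_failures + 1
        if consecutive_failures >= FAILURES_BEFORE_FAILOVER:
            print(f"Primary unresponsive; directing traffic to {STANDBY_HOST}")
            break
        time.sleep(5)  # wait before the next health check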

Individual hardware devices can also be designed for high availability or fault tolerance by applying the same basic principle, redundancy, within the individual device. In one commonly used data storage architecture, the redundant array of independent disks (RAID), data is distributed across multiple hard drives, typically with mirroring or parity, in a way that improves read/write speed and system performance while ensuring that the failure of an individual drive does not cause data loss.
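As a quick illustration of the redundancy-versus-capacity trade-off, here is a small Python sketch, using hypothetical drive counts and sizes, that shows the usable capacity and drive-failure tolerance of a few common RAID levels. It is a back-of-the-envelope aid, not a sizing tool.

    def raid_summary(level, drives, drive_tb):
        """Usable capacity (TB) and number of drive failures survived, per RAID level."""
        if level == 0:      # striping only: fast, but no redundancy
            usable, tolerated = drives * drive_tb, 0
        elif level == 1:    # mirroring: every block stored on two drives
            usable, tolerated = (drives // 2) * drive_tb, 1
        elif level == 5:    # striping plus one parity block per stripe
            usable, tolerated = (drives - 1) * drive_tb, 1
        elif level == 6:    # striping plus two parity blocks per stripe
            usable, tolerated = (drives - 2) * drive_tb, 2
        else:
            raise ValueError("RAID level not covered by this sketch")
        return {"usable_tb": usable, "drive_failures_tolerated": tolerated}

    # Example: four 4 TB drives in RAID 5 yield 12 TB usable and survive one drive failure.
    print(raid_summary(level=5, drives=4, drive_tb=4.0))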

Complexity and cost are the enemies—and they go hand-in-hand

The problem with maintaining redundant hardware that sits idle most of the time, merely standing by so it is ready for use in case of emergency, is simple: costs add up quickly and can easily become prohibitive. Few small to midsized organizations can afford to purchase even a single enterprise-class redundant storage array, let alone multiples that can be situated in different geographic locations.

In reality, adding redundant physical systems usually more than doubles your initial hardware investment, and the added redundancy can cost as much as 150 percent of the original outlay. This is because the presence of duplicate components makes your entire infrastructure far more complex. Additional testing must be performed each time you update your software or add hardware components to confirm that your failover systems still work. And maintaining idle systems at an offsite location can be more expensive than maintaining on-premises systems that are in daily use.
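To put rough numbers on that claim, the short calculation below uses purely hypothetical figures to show how the redundancy overhead compounds an initial outlay.

    # Hypothetical example: a $100,000 initial hardware build, with full physical
    # redundancy adding 100-150% of that figure on top.
    initial_hardware_cost = 100_000
    overhead_range = (1.00, 1.50)  # redundancy overhead as a fraction of the initial cost

    low, high = (initial_hardware_cost * (1 + overhead) for overhead in overhead_range)
    print(f"Total with redundancy: ${low:,.0f} to ${high:,.0f}")
    # -> Total with redundancy: $200,000 to $250,000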

The miracle of cloud computing

The advent of cloud computing has been a game changer when it comes to making high availability systems affordable for the majority of businesses. Build the right cloud-based failover capabilities, and you can enjoy the benefits of redundancy at a tiny fraction of the cost of building a physical disaster recovery site. Cloud-based backup, disaster recovery, and business continuity solutions are available in a variety of service tiers, and usually feature pay-as-you-go pricing, so that you’ll be charged only for the capabilities that you actually use.

This doesn’t mean you can always switch on cloud services at a moment’s notice to take over for your systems in case of failure. Depending on the age, design, and configuration of your environment, you may need to replicate your entire on-premises infrastructure in the cloud if you want to maintain full failover capabilities. This process is more complex, and thus more expensive, than simply storing backups of your data in the cloud. 

Depending on your business goals, your budget, and how long you can afford to be without access to your data or critical business applications, you can choose one of several different availability models.

From simplest (and cheapest) to most resilient (and most expensive), these are the availability models (a rough selection sketch follows the list):

  1. Storing backups in the cloud
  2. Storing backups in the cloud, and maintaining access to high-availability compute
  3. Maintaining access to high-availability compute, and storing geo-redundant backups in the cloud
  4. Maintaining access to high-availability compute in geographically redundant locations, and storing backups in geographically redundant locations
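Here is the rough selection sketch referenced above: a minimal Python illustration of how the choice among these four models might be framed in terms of how many hours offline you can tolerate and whether a regional outage must also be survivable. The four-hour threshold and the decision rules are hypothetical examples, not recommendations.

    def pick_availability_model(max_hours_offline, must_survive_regional_outage):
        """Map tolerable downtime and regional-outage requirements to a model (1-4)."""
        if must_survive_regional_outage:
            if max_hours_offline <= 4:
                return "4: geo-redundant compute and geo-redundant backups"
            return "3: high-availability compute with geo-redundant cloud backups"
        if max_hours_offline <= 4:
            return "2: cloud backups plus access to high-availability compute"
        return "1: cloud backups only"

    # Example: a shop that can tolerate a day offline and has no regional-outage requirement.
    print(pick_availability_model(max_hours_offline=24, must_survive_regional_outage=False))
    # -> 1: cloud backups only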

Geo-redundancy, in which copies of your data are stored in data centers located in multiple regions, can protect your information from outages caused by natural disasters or widespread power failures affecting certain parts of the country. Geo-redundancy provides a valuable additional safeguard, but it doesn’t come cheap. Geo-redundant storage and compute are usually more affordable for larger organizations, because per-employee costs decrease with scale; companies with 50 employees or fewer may find the cost/benefit ratio less than favorable.
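As a concrete (and heavily simplified) picture of what geo-redundancy means in practice, the following Python sketch uses boto3, the AWS SDK for Python, to copy the same backup file to storage buckets assumed to live in two different regions. The bucket names, regions, and file name are hypothetical, and in a real deployment a managed replication feature (such as S3 Cross-Region Replication) would normally handle cross-region copies rather than a script.

    import boto3

    BACKUP_FILE = "nightly-erp-backup.tar.gz"  # hypothetical backup artifact
    TARGETS = [
        ("us-east-1", "example-backups-east"),  # assumed primary region and bucket
        ("us-west-2", "example-backups-west"),  # assumed secondary region and bucket
    ]

    for region, bucket in TARGETS:
        s3 = boto3.client("s3", region_name=region)
        s3.upload_file(BACKUP_FILE, bucket, BACKUP_FILE)  # same object stored in both regions
        print(f"Copied {BACKUP_FILE} to {bucket} ({region})")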

Proper system design and engineering can make all the difference

Numerous cloud-based backup and disaster recovery solutions are available for manufacturers looking to upgrade their IT systems to increase reliability and resilience. Today’s cloud offerings put high availability within reach for even the smallest of businesses.

Whether you’re relying on public cloud services, have implemented a private cloud solution, or are working with a hybrid of the two, it’s crucial to partner with a service provider who understands your business model, goals and needs. Your entire computing environment is only as resilient as the least reliable component within it. For some organizations, this might be wiring and switching; for others, maintaining Internet connectivity in rural or remote locations is a challenge. These problems must be solved in conjunction with your selection of the right cloud services or on-premises solutions—those that offer the features and benefits you need at a price you can afford. 

Want to learn more about how to put struggles with too much downtime behind you forever? Check out “The Definitive Guide to Recovering from IT System Outages” to take a deeper dive into the concept of resilience.