Making Good Use of the DR Site
The best way to set up disaster recovery is to have a dedicated site with servers available and application software running, so that an immediate failover can be performed when called for. This approach is also very expensive and not always popular. There are ways to implement disaster recovery sites that save money and remain practical at the same time.
An excellent dual use for such a facility is the testing of upgrades. All operating systems, applications and databases require regular maintenance patches, fixes and upgrades. Disaster recovery environments that are exact duplicates of production systems are prime locations to test these maintenance releases.
Patches and fixes can be applied to a disaster recovery system on a regular schedule. An approved test plan can be run against the environment to check for issues with the maintenance release. If no issues are found, the patches can be left in place and migrated to the test environment on a regular schedule as well. If no problems are found there either, the patches can then be migrated into production, again on a regular schedule.
If any issues are found at the disaster recovery site or in the test system, the patch can be rolled back or, if the problems are minor, tickets can be opened with the vendors. This eliminates the need for a separate laboratory environment, which can also be very costly. No additional hardware, software, licenses, maintenance, administration or space would be needed for a lab to test maintenance releases.
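The promotion sequence described above (disaster recovery site, then test, then production, with a rollback when the test plan fails) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage names, the `PatchResult` record and the `run_test_plan` callback are hypothetical and do not come from any particular patching tool. The sketch also assumes that a failure rolls the patch back only from the stage where it failed.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Promotion order taken from the article: DR site first, then test, then production.
# The stage names themselves are hypothetical labels.
STAGES = ["disaster_recovery", "test", "production"]


@dataclass
class PatchResult:
    """Hypothetical record of where a patch ended up."""
    patch_id: str
    applied_to: List[str] = field(default_factory=list)
    rolled_back_from: List[str] = field(default_factory=list)
    status: str = "pending"


def promote_patch(patch_id: str,
                  run_test_plan: Callable[[str, str], bool]) -> PatchResult:
    """Apply a patch stage by stage, gating each promotion on the test plan.

    run_test_plan(patch_id, stage) is an assumed callback that returns True
    when the approved test plan finds no issues in that environment.
    """
    result = PatchResult(patch_id)
    for stage in STAGES:
        result.applied_to.append(stage)
        if stage == STAGES[-1]:
            # Production is the final destination; nothing further to promote to.
            result.status = "in_production"
            return result
        if not run_test_plan(patch_id, stage):
            # Assumption: roll back only the environment where the failure occurred.
            result.applied_to.remove(stage)
            result.rolled_back_from.append(stage)
            result.status = "rolled_back"
            return result
    return result


if __name__ == "__main__":
    # Example: a patch that passes at the DR site but fails in the test environment.
    outcome = promote_patch("PATCH-001", lambda patch, stage: stage != "test")
    print(outcome)
```

In practice the callback would wrap whatever approved test plan the organization already uses, and a minor failure might lead to a vendor ticket rather than a rollback.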
If you do not currently have a lab for testing software patches and fixes, this approach can be of substantial benefit in three areas. First, the money has already been spent on the disaster recovery site, which was a necessity in itself. Second, a duplicate of your production environment now exists for testing software patches, negating the need for a laboratory. Third, less administrative effort is spent on systems once they are patched. Keeping software patched to current levels reduces downtime and the amount of time administrators spend on system repairs.
This approach can be especially helpful for database administrators. Often a server is available for database installations, patching and upgrades, but complete environments for these tasks are rare. Application developers and users need to test the application against the database after the patches have been installed. The database administrator can perform some limited testing, but the true tests come when users put the system through its paces.
Stocking the disaster recovery site with test servers is another great way to get the site up and running quickly and to maximize the value of those servers. In most if not all cases, test servers are purchased for every new project that will be migrated into production. Test servers should be purchased with specifications equal to or better than those of production. Most test servers will need higher capacity, because more databases, application servers, Web servers and the like will be running on them than on the production hardware. With test servers in the disaster recovery facility, much of the work of software installation is already done. Disaster recovery instances can be created on test servers and left idle. Application servers, Web servers and databases simply wait for the day that a failover is called for.
Using virtualized servers can help lower the cost of a disaster recovery site, particularly as the technology becomes less expensive and less complex. It is now much easier to implement virtual servers than it was in the past. Today, many applications, operating systems and databases support server virtualization software. This has changed because many of the virtualization vendors have tried to work closely and cooperate fully with the other software vendors.
Pressure from customers has also driven software companies to work with virtualization companies to certify and support their products. Through virtualization, a physical server can be imaged and reproduced in a virtual environment. The Web server, application server and database server that make up a production system can all be imaged and virtualized on a single physical server. This effectively consolidates three physical servers down to one without losing any functionality. Capacity may not be equal, but it may suffice perfectly in a disaster recovery scenario. This does not mean that all applications will work together on virtual servers; they must be able to coexist.
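Whether a consolidated host will suffice is partly simple arithmetic. The sketch below sums the resources each virtualized server would need in a disaster recovery scenario and compares the totals with what one physical host offers. The `Workload` type, the function name and every figure shown are hypothetical, chosen only for illustration. The check covers CPU and memory alone and says nothing about whether the applications can coexist.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Workload:
    """Hypothetical resource needs of one virtualized server during a failover."""
    name: str
    cpu_cores: int
    memory_gb: int


def fits_on_host(workloads: Iterable[Workload],
                 host_cpu_cores: int,
                 host_memory_gb: int) -> bool:
    """Return True if the combined CPU and memory needs fit on a single host."""
    workloads = list(workloads)
    total_cpu = sum(w.cpu_cores for w in workloads)
    total_memory = sum(w.memory_gb for w in workloads)
    return total_cpu <= host_cpu_cores and total_memory <= host_memory_gb


if __name__ == "__main__":
    # Illustrative figures only; real sizing would come from measured production load.
    production_stack = [
        Workload("web", cpu_cores=2, memory_gb=4),
        Workload("app", cpu_cores=4, memory_gb=8),
        Workload("db", cpu_cores=8, memory_gb=32),
    ]
    print(fits_on_host(production_stack, host_cpu_cores=16, host_memory_gb=64))
```

Because reduced capacity may be acceptable during a disaster, the per-workload figures could reasonably be set below normal production sizing.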
A step beyond cross training is mentoring. A mentoring program allows subject matter experts to work directly with management-identified employees who are interested in becoming experts in a different field than the one they are currently in. This can become a large financial gain for employers while increasing employee morale as well. Mentoring can also work well for employees who wish to cross train to qualify for positions on other technology teams that have unfilled vacancies.
By identifying and opening career opportunities across teams, individuals feel a sense of empowerment and are not stuck in their current roles. For example, a database administrator position may be difficult to fill externally. A current developer with talent, ability and desire to become a database administrator could miss an opportunity to make a lateral move due to lack of experience. Through mentoring, the developer could continue in her current role while cross training in a potentially new career path. In this way, mentoring programs can help manage expected retirements and workflow fluctuations while providing alternative career paths for qualified candidates.
A mentoring program spreads knowledge within and across teams, providing support when subject matter experts are inaccessible or incapacitated, a critical consideration for disaster recovery. By documenting processes and procedures through a mentoring program, the ability to respond quickly to outages or disasters is dramatically enhanced.
Kevin Medlin has been administering, supporting and developing in a variety of industries including energy, retail, insurance and government, since 1997. He is currently a DBA supporting Oracle and SQL Server, and is Oracle certified in versions 8 through 10g. He received his graduate certificate in Storage Area Networks from Regis University and he will be completing his MS in Technology Systems from East Carolina University in 2008.
Article courtesy of Enterprise IT Planet