Intel may power much of the world's PCs and servers, but when it comes to storage issues, the chip giant faces the same problems that any other enterprise must confront.
Except they're bigger. Intel, after all, is a massive concern, where exploding storage demands magnify the usual challenges. Its 80,000 employees, for example, handle as many as 4.2 million email messages per day, a 247 percent increase in four years. Recently, the company counted 99 million messages in a single month.
From 10 TB of storage in 1997, the company has grown to 3 PB today, and that's just what is available online. While average annual storage growth has been moving at a rapid clip in recent years, at 74 percent annually, it is expected to soar to 125 percent next year.
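Compound growth at these rates escalates quickly. A short sketch below (using only the figures quoted in the article; the function name is illustrative, not anything Intel uses) shows what 125 percent annual growth means for a 3 PB base:

```python
def project_capacity(current_tb, annual_growth_pct, years):
    """Project storage capacity under compound annual growth."""
    capacity = float(current_tb)
    for _ in range(years):
        capacity *= 1 + annual_growth_pct / 100
    return capacity

# 3 PB (3,000 TB) growing at the projected 125 percent per year
print(project_capacity(3000, 125, 1))  # 6750.0 TB after one year
print(project_capacity(3000, 125, 3))  # 34171.875 TB within three years
```

At 125 percent growth, capacity more than doubles every year, which is why the article's emphasis falls on managing demand rather than simply buying more disk.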
"Our e-business applications needed 20 TB of space only two years ago," said Doug Busch, Intel's CIO. "Now, those applications take up over 160 TB."
Networked Storage Takes Center Stage
Not surprisingly, Intel has spent the last five years bringing order to its storage environment. In that time, it has moved from a mostly direct attached storage (DAS) architecture to 40 percent storage area network (SAN), 40 percent network attached storage (NAS), and 20 percent DAS.
The process of upgrading the entire storage environment began with a fairly simple change in equipment purchasing policy. Intel addressed the area of server purchasing and consolidation to handle a recurring issue – running out of disk capacity.
"We regularly bought new servers because we were out of disk capacity," said Busch. "Continually adding and upgrading servers was a big cost factor, so we had to attack that to make Intel more cost-effective."
In order to address the rapid growth of storage demand, Busch realized he had to phase out DAS and what he calls a "bottoms-up" data mart architecture that had become too inefficient.
This translated into several critical issues. Departmental data stores were overabundant, small, and widely distributed, rather than consolidated into a few larger, centralized enterprise data stores. Storage management was largely manual, bogging administrators down in time-consuming day-to-day tasks and leaving them little time to plan ahead. And data quality went largely unmonitored, because there wasn't enough time to pay much attention to how clean the data was.
"By first introducing NAS and then SAN, we achieved a two-to-three times performance increase, better availability, decreased TCO, reduced data center space, and reduced the number of support personnel required," said Busch.
Problems remain, however. Transactional data dominates and continues to grow at 60 percent annually, but Busch notes that reference data is growing even more rapidly, at 92 percent. Further, mobile computing is causing headaches in terms of storage capacity and backup: the company now has 52,000 users who remotely access its systems, up 555 percent over the last five years.
The 'System Around The Storage Technology' Determines Success
"Obviously, capacity will continue to increase, yet that is happening while the willingness to pay for storage is continuing to decline," said Busch. "What we have realized, therefore, is that it is the system around the storage technology which determines success."
At Intel, this success is built on several key principles that underlie the storage architecture. To gain control of storage, the company tries to manage demand at the initial stages of collection, replication (views, versions, scratch copies, etc.), and retention. This significantly cuts down on complexity, but can only be achieved if the system is built to accommodate this method of operation.
Another guiding principle is to manage utilization by pooling storage. By adopting an enterprise data mart design, storage managers can gradually move away from hard allocation of storage capacity. This is also assisted by introducing hierarchical storage.
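The article doesn't describe Intel's implementation of pooling, but the principle is the difference between carving out fixed slices per consumer and drawing from shared capacity. A minimal sketch of the pooled approach, with all names and numbers illustrative, might look like:

```python
class StoragePool:
    """Shared capacity pool: consumers draw what they actually use,
    rather than each being hard-allocated a fixed slice up front.
    Illustrative sketch only, not Intel's actual design."""

    def __init__(self, capacity_tb):
        self.capacity_tb = float(capacity_tb)
        self.used_tb = 0.0

    def allocate(self, request_tb):
        """Grant space from the shared pool if available."""
        if self.used_tb + request_tb > self.capacity_tb:
            raise RuntimeError("pool exhausted")
        self.used_tb += request_tb
        return request_tb

    def release(self, tb):
        """Return space to the pool for other consumers to reuse."""
        self.used_tb = max(0.0, self.used_tb - tb)

    @property
    def utilization(self):
        return self.used_tb / self.capacity_tb

# Two hypothetical data marts sharing one 100 TB pool
pool = StoragePool(capacity_tb=100)
pool.allocate(30)  # data mart A
pool.allocate(25)  # data mart B
print(f"{pool.utilization:.0%}")  # 55%
```

The payoff is that unused headroom stays available to any consumer, instead of sitting stranded inside per-department hard allocations.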
Intel's hierarchy consists of:
Tier 1 is reserved for mission-critical and high-availability data only. This is the most expensive category, and no data loss of any kind is acceptable, so disaster recovery technology is included in this tier.
Tier 2 includes most other production and pre-production SCSI drives. This is the mid-range category and may include some DR, but without the "no data loss" condition.
Tier 3 covers archiving and backup. Tape and ATA drives are used as the lowest-cost option, with cost taking on more importance than performance.
"Overall, we have gained ground by having storage growth decoupled from servers, and by introducing hierarchical storage," said Busch.
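The three-tier hierarchy can be restated as a simple decision rule. The sketch below is an illustration of the tiering logic as described in the article, not Intel's actual placement code:

```python
def select_tier(mission_critical: bool, archival: bool) -> int:
    """Map data to a storage tier (illustrative only).

    Tier 1: mission-critical/high-availability data; DR, no data loss.
    Tier 2: mid-range production and pre-production (SCSI).
    Tier 3: archive and backup on tape or ATA; cost over performance.
    """
    if mission_critical:
        return 1  # most expensive tier, no data loss acceptable
    if archival:
        return 3  # lowest-cost media, performance secondary
    return 2      # everything else lands mid-range

print(select_tier(mission_critical=True, archival=False))   # 1
print(select_tier(mission_critical=False, archival=True))   # 3
print(select_tier(mission_critical=False, archival=False))  # 2
```

The point of encoding placement as policy is that data lands on the cheapest media that meets its availability requirement, rather than defaulting to the most expensive tier.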
And that is a lesson that any organization can learn from.
Article courtesy of Enterprise IT Planet