A Trip Down the Data Path: RAID and Data Layout Page 3 - EnterpriseStorageForum.com

Data Files

The actual data files need the same set of steps, although the way the data is accessed often differs considerably from the way the index files are accessed. The process itself is the same. Here are the steps:

  1. Determine how to lay out the LUNs based on:
    1. The file system and volume manager that you plan to use. Think about allocation sizes, round-robin, and striping
    2. The size of the files. Think about how the files will be allocated on the physical disk devices and how the RAID cache will be used

  2. Determine the I/O request size that will be used to read and write the data
    1. If it is large, RAID-5 might be a good choice, because with RAID-1 you:
      1. Have to write out more data, using more cache-to-disk bandwidth than with RAID-5
      2. Use more disks for the same amount of data space
    2. Determine how much new data will be created (the read/write ratio, which drives how the cache is set up)

If you understand the above information, you will be able to create the LUNs and set up the file system and/or volume manager to match the way the LUNs were created. Of course, you will still have to tune the database internals, but in terms of I/O on the storage, it will be as efficient as it can possibly be. I define efficiency as the highest possible cache utilization and the lowest possible data latency.
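
To put rough numbers on the RAID-1 versus RAID-5 comparison in step 2, here is a minimal sketch. It assumes large sequential writes that cover whole stripes, so RAID-5 pays only for one parity strip per stripe and never does a read-modify-write; small writes would change the picture. The function names and the 4-data-disk example are illustrative assumptions, not values from any particular controller.

```python
def raid1_cost(data_disks: int, write_bytes: int) -> dict:
    """RAID-1: every data disk has a mirror, so each byte is written twice."""
    return {
        "total_disks": 2 * data_disks,
        "bytes_to_disk": 2 * write_bytes,
    }


def raid5_cost(data_disks: int, write_bytes: int) -> dict:
    """RAID-5, full-stripe writes only (assumption): one parity strip is
    written for every `data_disks` data strips."""
    return {
        "total_disks": data_disks + 1,
        "bytes_to_disk": write_bytes * (data_disks + 1) // data_disks,
    }


if __name__ == "__main__":
    # Hypothetical example: 4 disks' worth of data space, one 4 MiB write.
    data_disks = 4
    write_bytes = 4 * 1024 * 1024
    for name, cost in (("RAID-1", raid1_cost(data_disks, write_bytes)),
                       ("RAID-5", raid5_cost(data_disks, write_bytes))):
        print(name, cost["total_disks"], "disks,",
              cost["bytes_to_disk"], "bytes from cache to disk")
```

Under these assumptions, the 4 MiB write costs 8 MiB of cache-to-disk traffic on 8 disks with RAID-1, and 5 MiB on 5 disks with RAID-5 (4+1).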

This process can be used for any other application type and is not restricted to databases.


When I started this column late last year, I stated that the key to I/O performance is understanding the data path from end to end. Through the series of articles culminating with this column, I believe that we have now covered that end-to-end path. RAID controllers have no knowledge of how the data will be accessed or how the files are mapped to the physical devices, yet they have built-in algorithms to cache the data, improve the I/O latency, and reduce the amount of I/O from disk to cache.

It is up to the architects, storage team, and/or administrators with knowledge of the volume manager and file system to assist the RAID controller with its caching algorithms. The RAID controller, the volume manager, and the file system do not play well together, because they have no real way of communicating with one another. Maybe that will change in the future, but not for some time.
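
One simple way an administrator can help the controller is to check that the allocation or request size chosen in the file system or volume manager lines up with the RAID stripe geometry. The sketch below is only an illustration: the function names, the 64 KiB strip size, and the 4 data disks are assumed values, and a real check would use the geometry reported by your own controller.

```python
def full_stripe_bytes(strip_bytes: int, data_disks: int) -> int:
    """Bytes of user data in one full stripe (parity excluded)."""
    return strip_bytes * data_disks


def is_stripe_aligned(offset: int, length: int,
                      strip_bytes: int, data_disks: int) -> bool:
    """True when a request starts on a stripe boundary and covers a whole
    number of full stripes."""
    stripe = full_stripe_bytes(strip_bytes, data_disks)
    return length > 0 and offset % stripe == 0 and length % stripe == 0


if __name__ == "__main__":
    # Hypothetical geometry: 64 KiB strips across 4 data disks.
    strip, disks = 64 * 1024, 4
    print(full_stripe_bytes(strip, disks))                   # stripe width
    print(is_stripe_aligned(0, 1024 * 1024, strip, disks))   # 1 MiB at offset 0
    print(is_stripe_aligned(4096, 256 * 1024, strip, disks)) # shifted by 4 KiB
```

A request that fails this check spans partial stripes, which is the case where the controller has to do extra work that the file system could have avoided.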

Now that we have completed the data path, next month we will start reviewing the "hows and whys" of benchmarking. If anyone has any suggestions, please let me know.

» See All Articles by Columnist Henry Newman
