How Does It All Work?
The most important thing to remember is the basic rule for storage hardware: currently, all physical requests must start on a 512-byte boundary and end on a 512-byte boundary. The picture below illustrates this:
The first request begins on a 512-byte boundary and ends on a 512-byte boundary, while in the second example the I/O request does not begin and end on 512-byte boundaries. So what happens in the system when you do not make I/O requests on 512-byte boundaries? In this case, the system must convert the requests to read and write on 512-byte boundaries for you, as requests can only be made on physical hardware boundaries. The overhead of this conversion is extremely high.
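The boundary rule can be expressed as a few lines of arithmetic. The following is a minimal sketch in C, not how any particular operating system implements it; the names `SECTOR_SIZE`, `byte_range`, `is_sector_aligned`, and `physical_range` are hypothetical and chosen only for illustration. It checks whether a request is aligned and, if not, computes the larger range the system would have to touch on the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* hardware boundary discussed in the text */

/* Half-open byte range [start, end). Hypothetical type for this sketch. */
typedef struct {
    uint64_t start;
    uint64_t end;
} byte_range;

/* True when a request of len bytes at byte offset starts and ends
 * on 512-byte boundaries. */
bool is_sector_aligned(uint64_t offset, uint64_t len)
{
    return offset % SECTOR_SIZE == 0 && (offset + len) % SECTOR_SIZE == 0;
}

/* Round the start down and the end up to 512-byte boundaries, giving
 * the range that must actually be transferred. */
byte_range physical_range(uint64_t offset, uint64_t len)
{
    assert(len > 0);
    uint64_t end = offset + len;
    uint64_t rem = end % SECTOR_SIZE;
    byte_range r;
    r.start = offset - offset % SECTOR_SIZE;
    r.end = rem ? end + (SECTOR_SIZE - rem) : end;
    return r;
}
```

For example, a 1024-byte request at offset 0 is already aligned, while the same request at offset 5 expands to three full 512-byte blocks.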
What the System Does
There are a large number of cases that I will try to cover, but let's start with the simplest case and work toward the most complex. Much of the information here will be covered in more detail in future columns as it relates to file system implementations and issues with direct I/O. So here is what happens in a few cases:
|I/O type|What Happens|
|---|---|
|Reading|As the data does not begin and end on 512-byte boundaries, the system reads the data into a system buffer cache and transfers only the data that the user asked for, so more data is read in than is required. For requests that use C library I/O (fread(3)), the library manages requests on 512-byte boundaries as long as the buffer size is a multiple of 512 bytes plus the 8 bytes needed for the pointers into the buffer.|
|Writing|As the data does not begin and end on 512-byte boundaries, on many if not most implementations the operating system and file system must first read the data from the disk/RAID device into a system buffer cache. The data read in is the size of the request rounded out to 512-byte boundaries. So, for example, if you started writing at byte 5 and wrote through byte 131072 (128 KB), the system would read bytes 0 to 131584 (131072 + 512) into cache, as that is the next multiple of 512 bytes. The system then copies the user's data into the buffer and writes the whole buffer back to the disk/RAID device.|
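The write case can be sketched with a small simulation. This is an illustration under stated assumptions, not a real driver or file system: `device`, `dev_read`, `dev_write`, and `rmw_write` are hypothetical names, and the "device" is just an in-memory array that refuses transfers that are not on 512-byte boundaries. The counters show how much data moves in each direction for an unaligned write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Hypothetical in-memory stand-in for a disk/RAID device. */
typedef struct {
    unsigned char *data;
    size_t size;            /* must be a multiple of SECTOR_SIZE */
    size_t bytes_read;      /* total bytes transferred from the device */
    size_t bytes_written;   /* total bytes transferred to the device */
} device;

/* The device only accepts transfers on 512-byte boundaries. */
static void dev_read(device *d, size_t off, unsigned char *dst, size_t len)
{
    assert(off % SECTOR_SIZE == 0 && len % SECTOR_SIZE == 0);
    assert(off + len <= d->size);
    memcpy(dst, d->data + off, len);
    d->bytes_read += len;
}

static void dev_write(device *d, size_t off, const unsigned char *src, size_t len)
{
    assert(off % SECTOR_SIZE == 0 && len % SECTOR_SIZE == 0);
    assert(off + len <= d->size);
    memcpy(d->data + off, src, len);
    d->bytes_written += len;
}

/* Write len bytes at an arbitrary byte offset by reading the enclosing
 * aligned range, modifying it in a cache buffer, and writing it back.
 * Returns 0 on success, -1 if the cache buffer cannot be allocated. */
int rmw_write(device *d, size_t off, const unsigned char *src, size_t len)
{
    size_t start = off - off % SECTOR_SIZE;
    size_t end = off + len;
    if (end % SECTOR_SIZE != 0)
        end += SECTOR_SIZE - end % SECTOR_SIZE;
    size_t span = end - start;

    unsigned char *cache = malloc(span);
    if (cache == NULL)
        return -1;
    dev_read(d, start, cache, span);          /* read   */
    memcpy(cache + (off - start), src, len);  /* modify */
    dev_write(d, start, cache, span);         /* write  */
    free(cache);
    return 0;
}
```

With a 128 KB write that starts at byte 5, `rmw_write` reads 131584 bytes and writes 131584 bytes in order to store 131072 bytes of user data, roughly doubling the traffic to the device compared with an aligned write of the same size.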
This process is often called read-modify-write, and it is extremely inefficient: the data is read in, the record is modified in memory, and the result is written back out. The system overhead, the amount of data transferred, and the I/O wait time are all far greater than if you made requests on 512-byte boundaries. For requests that use C library I/O (fwrite(3)), the library manages requests on 512-byte boundaries as long as the buffer size is a multiple of 512 bytes plus the 8 bytes needed for the pointers into the buffer.
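One way to control the C library buffer size is the standard `setvbuf` call. The sketch below is an assumption about how an application might apply the advice above, not a description of any specific C library's internals; `use_sector_sized_buffer` is a hypothetical helper name. Whether the extra 8 bytes are needed depends on the C library in use, so the helper takes them as a parameter rather than hard-coding them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SECTOR_SIZE 512u

/* Hypothetical helper: give fp a fully buffered stdio buffer of
 * nsectors * 512 + extra_bytes bytes. Must be called before any other
 * I/O on fp. The caller owns the returned buffer and may free it only
 * after fclose(fp). Returns NULL on failure. */
char *use_sector_sized_buffer(FILE *fp, size_t nsectors, size_t extra_bytes)
{
    assert(fp != NULL && nsectors > 0);
    size_t size = nsectors * SECTOR_SIZE + extra_bytes;
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;
    if (setvbuf(fp, buf, _IOFBF, size) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller following the text's rule would pass `extra_bytes` as 8, for example `use_sector_sized_buffer(fp, 256, 8)` for a 128 KB buffer, and then issue small `fwrite` calls and let the library batch them.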