High-density storage units that fit 60 or more drives into a 4U space are now common. With 4TB drives, a 60-drive unit gives you 240TB of raw capacity. Alternatively, you can use a mix of SSDs and hard drives, which may give you better performance.
There are other tricks you can use to improve performance, depending on your requirements and your tolerance for data loss. These storage units can be coupled with a 1U server that has multiple 10GigE links or a single 40GigE link, creating a 5U NAS storage solution that provides a great deal of storage capacity and performance.
Using a NITRO for every 4U set of microservers does dilute the density: you go from 288 servers in 4U (72 per U) to 288 servers plus storage in 9U (32 per U). But a 42U rack still holds four of the microserver/NITRO combinations, for 1,152 servers and 960TB of network storage in a single rack. That is still an amazingly dense solution with a great deal of IO to the servers.
When the servers have local storage, you still get four microserver/NITRO combinations in a 42U rack. That results in 384 servers and 960TB of network storage, which is still very good and gives you a great deal of flexibility in terms of IO.
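If you want to check the rack arithmetic for yourself, here is a small Python sketch. It assumes each NITRO is 5U (a 4U, 60-drive enclosure of 4TB drives plus a 1U server) and that each microserver chassis is 4U. The figure of 96 servers per chassis for the local-storage case is my inference from the 384-servers-per-rack number, and the names `Combo` and `rack_totals` are purely illustrative.

```python
from dataclasses import dataclass

RACK_U = 42              # usable rack height in U
NITRO_U = 5              # 4U drive enclosure + 1U head server
NITRO_RAW_TB = 60 * 4    # 60 drives x 4TB each = 240TB raw


@dataclass
class Combo:
    """One 4U microserver chassis paired with one NITRO unit."""
    servers: int         # microservers in the 4U chassis
    chassis_u: int = 4

    @property
    def height_u(self) -> int:
        return self.chassis_u + NITRO_U

    @property
    def servers_per_u(self) -> float:
        return self.servers / self.height_u


def rack_totals(combo: Combo, combos_per_rack: int = 4) -> dict:
    """Total servers, raw NAS capacity, and rack space for one rack."""
    used_u = combo.height_u * combos_per_rack
    if used_u > RACK_U:
        raise ValueError(f"{used_u}U does not fit in a {RACK_U}U rack")
    return {
        "servers": combo.servers * combos_per_rack,
        "storage_tb": NITRO_RAW_TB * combos_per_rack,
        "used_u": used_u,
        "servers_per_u": combo.servers_per_u,
    }


diskless = Combo(servers=288)    # microservers without local disks
local_disk = Combo(servers=96)   # inferred: 384 servers / 4 chassis

if __name__ == "__main__":
    print("diskless:  ", rack_totals(diskless))
    print("local disk:", rack_totals(local_disk))
```

Four combinations use 36U, which leaves 6U in the rack for switches and other rack-level gear.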
At this point there are a number of people reading this article who are positively seething because they now have to worry about five NAS units per rack: four NITRO units and one BBC-NAS, instead of one NAS server for a number of racks. I understand your agitation and agree that this can be a pain for administration. Been there, done that.
However, let me tell you about a customer who has over 20,000 total cores and 46 individual NAS units, all managed by no more than six admins. Those admins also monitor and administer the local network and an array of Web servers, and they handle over a thousand users. While they may seem superhuman, and I like to think that they are, they actually get to sleep at night. The reason is planning and automation.
Using tools such as Puppet, or HPC tools such as Bright Cluster Manager or Warewulf, one could develop a standard image for each NITRO. In this image, you would need to be sure to include monitoring tools such as Ganglia, Nagios, and others. You also need to think about how the data from the NAS rack units will be rsynced to the higher-level NAS (BBC-NAS) and how a NITRO can be restored from the BBC-NAS. It's not a difficult thing to do, but you do need to test it.
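As a rough illustration of the sync step, here is a minimal Python sketch that builds an rsync command for pushing a NITRO export to the BBC-NAS. The host name `bbc-nas`, the paths, and the function names are hypothetical. The rsync flags themselves are standard: `-a` (archive), `-H` (hard links), `-A` (ACLs), `-X` (extended attributes), `--numeric-ids`, `--delete`, and `--dry-run`. A restore is the same command with source and destination swapped.

```python
import subprocess


def build_rsync_cmd(src_dir, dest, delete=False, dry_run=True):
    """Return an rsync argument list that copies src_dir's contents to dest.

    dry_run defaults to True so a first test run changes nothing.
    """
    cmd = ["rsync", "-aHAX", "--numeric-ids"]
    if delete:
        # Remove files on dest that no longer exist on the source.
        cmd.append("--delete")
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source means "the contents of this directory".
    cmd += [src_dir.rstrip("/") + "/", dest]
    return cmd


def run_sync(cmd):
    """Run the command and raise CalledProcessError if rsync fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical NITRO export and BBC-NAS target.
    push = build_rsync_cmd("/export/home", "bbc-nas:/backup/nitro01/home/")
    print(" ".join(push))
```

Keeping the dry run as the default is a deliberate choice: you can review what rsync would transfer or delete before trusting the job to cron or your configuration management tool.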
Beyond images, you would also need to plan for user management. Which users will be assigned to which set of nodes and, consequently, to which first-level NAS (NITRO)? What happens when a user leaves, and how do you migrate that user's data to a different user? You also need to consider, plan, and test how the data is moved from NITRO to BBC-NAS from a user's perspective, particularly permissions, timestamps, and extended attributes (xattr). All of this is doable and has been done. But it takes careful planning.
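One way to test the user's-perspective part is to compare a file's metadata before and after the copy. The Python sketch below is a minimal example under my own assumptions: it checks mode, owner, group, modification time, and, where the platform exposes them (`os.listxattr` is Linux-only), extended attributes. The function names are illustrative, and both paths must be visible from the machine running the check, for example through NFS mounts of the NITRO and the BBC-NAS.

```python
import os


def file_metadata(path):
    """Collect the metadata that should survive a NITRO to BBC-NAS copy."""
    st = os.stat(path, follow_symlinks=False)
    meta = {
        "mode": st.st_mode,          # file type and permission bits
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime_ns": st.st_mtime_ns,  # modification time in nanoseconds
    }
    xattrs = {}
    if hasattr(os, "listxattr"):     # only available on Linux
        try:
            for name in os.listxattr(path, follow_symlinks=False):
                xattrs[name] = os.getxattr(path, name, follow_symlinks=False)
        except OSError:
            # Filesystem without xattr support, or an unreadable attribute:
            # treat the file as having no comparable xattrs.
            xattrs = {}
    meta["xattrs"] = xattrs
    return meta


def metadata_diff(src, dst):
    """Return the sorted names of metadata fields that differ."""
    a, b = file_metadata(src), file_metadata(dst)
    return sorted(key for key in a if a[key] != b[key])
```

An empty list from `metadata_diff` means the fields checked here match. Walking a sample of each user's files with a check like this after every test sync is a cheap way to catch a missing rsync flag before users do.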
As with all things in life, there is a cost/benefit trade-off. The current storage model for microservers, in which each node has a single disk and uses network storage, has some fairly significant issues that impact cost and scalability.
I put forward the idea of something I labeled NITRO (NAS In The Rack Option). The idea is to actually have more NAS boxes, but to integrate them with a number of microservers to provide semi-permanent storage. These units are then synced to a larger centralized network storage solution, resulting in two levels of NAS storage.
NITRO introduces some additional complexity because you have more NAS units to administer. But there are data centers that routinely administer exactly this architecture: 40 or more NAS servers and one large centralized NAS server. It has proven to be very effective, and there is no reason why this approach can't work with microservers.