In many ways, it is similar to the aircraft engine field. When a new aero engine is under development, and for a few years after release, it is a huge deal. After that, the engine continues to do its job, but all the attention goes onto the plane using the engine – or the latest and greatest new type of engine on the radar. The CFM56 jet engine, for example, is the most popular in the world yet generates almost no press. Similarly, Boeing 737s using that engine dominate the skies yet receive little attention.
It’s the same with the storage area network (SAN). It attracts little attention these days, yet SANs remain the backbone of on-premises storage and are in heavy use among all the major cloud providers. Their performance is critical to the rapid availability of content around the world.
Good SAN performance, then, continues to be an essential element of modern computing. A number of tools are available that monitor SAN metrics to guard against lost efficiency, rising latency, over-subscription of SAN resources, incorrect RAID allocation, and more.
SAN Performance Monitoring Benefits
The use cases for SAN monitoring are straightforward. The tools available cover a wide range of functions. But the general benefits of SAN performance monitoring include:
- Achieving the highest possible level of application or data uptime
- Error tracking to ensure errors do not lead to downtime or slow performance
- Planning for the best utilization of server, storage, and network resources. This encompasses areas such as capacity, bandwidth limitations, and more.
- Identification of potential areas of congestion and latency before they slow the delivery of storage.
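As a rough illustration of the capacity-planning benefit listed above, the sketch below fits a straight line through daily usage samples and estimates how many days remain before a storage pool fills. The function name `days_until_full`, the sample data, and the assumption of linear growth are all hypothetical; commercial tools typically use more sophisticated models.

```python
def days_until_full(samples, capacity_tb):
    """Estimate days until a pool is full.

    samples: list of (day_index, used_tb) pairs (hypothetical input format).
    Fits a least-squares line and extrapolates to capacity_tb.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must cover more than one day")
    # Slope is the estimated growth in TB per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity_tb - intercept) / slope
    # Days remaining, counted from the most recent sample.
    return max(0.0, full_day - max(xs))


# Illustrative data: a 100 TB pool growing by about 2 TB per day.
usage = [(0, 50.0), (1, 52.0), (2, 54.0), (3, 56.0)]
print(days_until_full(usage, capacity_tb=100.0))
```

An alerting rule could then compare the estimate against a lead time, such as the number of days needed to procure and install new capacity.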
How to Select a SAN Performance Monitoring Tool
There are many factors to consider when choosing a SAN performance monitoring tool. It all depends on the situations or challenges being experienced and the platforms in play.
Key areas to consider include:
- Overall storage hardware and software environment: Many shops are all Dell EMC, all NetApp, or all IBM, for example. It makes sense in such cases to gravitate toward the performance software offered by those vendors. But many others operate mixed environments composed of a hodgepodge of hardware and software elements. In those cases, the goal is to find the solution that plays well with the underlying environment.
- Addressing overburdened components: There are workload limitations at each step along the path from the host bus adapter (HBA) through the SAN switch fabric to the storage system front-end ports, through the CPU and cache to the back-end media. An overburdened component anywhere along that path creates a bottleneck that can affect not only the host in question, but all hosts consuming media from the storage system.
- Detecting misconfigured components: A misconfigured component at any step in the path from the host to the storage system can create risks to both the performance and the resiliency of the solution.
- Staying on top of workload changes: Unexpected changes in workload can impact performance and be harbingers of future performance issues.
- Capacity planning: It is vital to ensure that there is sufficient SAN capacity to support both current and future initiatives. SAN monitoring tools often venture into capacity planning but vary in sophistication. They must at the very least be able to detect impending capacity bottlenecks and alert IT to such issues.
- Spotting resource contention: Many factors contribute to the perceived speed or sluggishness of SANs. But resource contention under peak loads ranks high on the list of key issues impacting SAN performance, especially when concurrent requests compete for single-threaded paths and end up being serialized.
- RAID configuration: The level of RAID protection should match the workload so that storage is managed efficiently. The wrong RAID allocation can tie up too much capacity, slow performance, or fail to properly safeguard data.
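To make the RAID capacity trade-off concrete, here is a minimal sketch of the standard usable-capacity arithmetic for common RAID levels. It assumes a single group of equal-size disks and ignores hot spares, formatting overhead, and vendor-specific layouts; the function name `usable_capacity_tb` is illustrative only.

```python
def usable_capacity_tb(level, disks, disk_tb):
    """Usable capacity of one RAID group built from equal-size disks."""
    rules = {
        # level: (minimum disks, function giving data-disk equivalents)
        "RAID0": (2, lambda n: n),        # striping, no redundancy
        "RAID1": (2, lambda n: 1),        # every disk holds the same copy
        "RAID5": (3, lambda n: n - 1),    # one disk's worth of parity
        "RAID6": (4, lambda n: n - 2),    # two disks' worth of parity
        "RAID10": (4, lambda n: n // 2),  # striped mirror pairs
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, data_disks = rules[level]
    if disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    if level == "RAID10" and disks % 2:
        raise ValueError("RAID10 needs an even number of disks")
    return data_disks(disks) * disk_tb


# Compare how much of an 8 x 4 TB group each level leaves usable.
for level in ("RAID0", "RAID5", "RAID6", "RAID10"):
    print(level, usable_capacity_tb(level, disks=8, disk_tb=4.0), "TB")
```

The same eight disks yield noticeably different usable capacity depending on the level chosen, which is why a mismatch between RAID level and workload can quietly tie up capacity.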
Top SAN Performance Monitoring Tools
Enterprise Storage Forum has evaluated the market and considers these solutions among the top SAN performance monitoring tools available, in no particular order.
- DataCore SANsymphony
- IntelliMagic Vision for SAN
- SolarWinds SRM
- Broadcom Brocade SAN Health
- Nagios XI
- ManageEngine OpManager
- eG Enterprise
- Dell EMC SRM
DataCore SANsymphony
DataCore SANsymphony aims to maximize the collective value from storage infrastructure by delivering fast, uninterrupted data access. At the same time, it reduces storage costs and increases flexibility and scalability.
Users can centrally automate and manage capacity provisioning and data placement across a diverse storage environment (any SAN, DAS, HCI, or JBOD). Powered by block-level storage virtualization technology and a set of data services, SANsymphony gives users flexibility to control how data is stored and protected.
- Promises a several-fold improvement in the response and throughput from current storage assets without needing to replace them
- SANsymphony works transparently between applications and storage
- Parallel I/O technology is a software-defined storage approach to boosting storage response and lowering price/performance
- Instead of relying on media speed via expensive NVMe or flash, it makes clever use of RAM and CPU, as well as caching algorithms to sense and adapt to workload variations.
IntelliMagic Vision for SAN
IntelliMagic Vision for SAN provides automated health insights. It uses a hardware-specific AIOps approach to identify and help prevent storage and fabric performance and capacity issues in a multi-vendor distributed systems environment. The metrics for each component are rated to warn of an impending performance risk or to flag one that is already present.
Statistical analysis provides performance and capacity anomaly detection to see the change, as well as the magnitude and significance of that change. AIOps functions help understand future storage capacity requirements and prevent capacity constraints. The connectivity for each host is analyzed to detect deviation from best practices, such as orphaned ports, asymmetrical connectivity, or single paths for ESX hosts or masking views.
- The software is delivered via secure SaaS, requiring very little configuration or maintenance, and no need to install agents
- IntelliMagic Vision provides a complete view of the storage environment from ESX to SAN fabric to storage arrays in a single pane of glass
- AIOps-driven approach based on deep knowledge of the capabilities of each device and key components, as well as the capabilities of the device as configured
- Users with little SAN expertise can troubleshoot SAN issues
- Thresholds take into account capability, configuration, and other factors that help determine what is good, a warning, or a performance exception
- Configuration analytics allows the user to see where configuration deviates from best practice
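The idea of rating a metric as good, a warning, or an exception based on how far it departs from normal behavior can be sketched with a simple z-score. This is a generic statistical illustration and not IntelliMagic's actual method; the function name `rate_sample` and the cutoffs `warn_z` and `exception_z` are assumptions chosen for the example.

```python
from statistics import mean, stdev


def rate_sample(history, value, warn_z=2.0, exception_z=3.0):
    """Rate a new reading against its own history.

    Returns (rating, z), where z is the number of standard deviations
    the value sits above the historical mean. Only upward deviations
    are flagged, which suits metrics such as response time.
    """
    mu = mean(history)
    sigma = stdev(history)  # raises StatisticsError for < 2 samples
    if sigma == 0:
        z = 0.0 if value <= mu else float("inf")
    else:
        z = (value - mu) / sigma
    if z >= exception_z:
        rating = "exception"
    elif z >= warn_z:
        rating = "warning"
    else:
        rating = "good"
    return rating, z


# Illustrative response times in milliseconds for one storage port.
latency_ms = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0]
for reading in (2.1, 2.3, 3.0):
    print(reading, rate_sample(latency_ms, reading)[0])
```

A fixed statistical cutoff like this knows nothing about what the hardware can actually sustain, which is the gap that capability-aware thresholds are meant to close.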
SolarWinds SRM
SolarWinds’ Storage Resource Monitor (SRM) reports on health, performance, and capacity of multi-vendor storage environments. It provides a view into the performance and capacity of storage with agentless NAS and SAN attached storage monitoring and reporting. It visualizes storage including volumes, RAID groups, storage pools, disks, and more. In addition, it integrates with other SolarWinds tools, enabling users to resolve issues by pulling information from the datastore details in SolarWinds Virtualization Manager (VMAN).
SolarWinds SRM investigates each layer (array, pool, and LUN) and SolarWinds Server & Application Monitor (SAM) identifies the dependencies between VMs, hosts, and storage. When SRM is used with SAM, AppStack also shows the relationship between the LUNs and the critical apps and servers they support, which helps pinpoint the root cause of problems.
- SRM provides end-to-end visibility into application and infrastructure performance, including storage resources
- Integration with AppStack brings to light application performance issues and identifies their causes across on-prem, virtual and storage infrastructures
- Hotspot detection and storage performance monitoring alert on potential issues as they arise
- Monitor and control storage devices and systems from multiple vendors
- Customizable reports with insight into the storage environment, from performance metrics to key capacity trends
- Easy integration with additional SolarWinds products for a holistic view of the IT infrastructure
- The SolarWinds THWACK community has over 130,000 members, many of whom contribute templates, scripts, and reports for ongoing work
Broadcom Brocade SAN Health
Broadcom inherited its SAN monitoring tool from Brocade. As an early pioneer of the SAN field and the leading company for SAN switches for decades, Brocade accumulated an immense amount of technical know-how on storage networking.
All of that hard-won know-how went into Broadcom Brocade SAN Health. Its purpose is to avoid SAN downtime, reduce troubleshooting and resolution time, and improve capacity planning and productivity.
- Automation makes Brocade SAN Health easy to run
- Installation can be done in one minute and audits in three minutes
- Minimizes the impact on network performance
- Runs in the background during backups, or at any other scheduled time
- It rapidly identifies congestion risks, oversubscribed ports, zoning issues, configuration anomalies, and high port errors
- Can run in multi-vendor storage and networking environments including Cisco devices
- Can see down to the device and port level
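Oversubscription at the port level is often expressed as a fan-in ratio: the combined bandwidth of the host ports that share a storage or inter-switch port, divided by that shared port's bandwidth. The sketch below shows the arithmetic only. It is not how Brocade SAN Health works internally, and the port names, speeds, and `limit` value are hypothetical.

```python
def fan_in_ratio(host_port_gbps, shared_port_gbps):
    """Aggregate host bandwidth divided by the shared port's bandwidth."""
    if shared_port_gbps <= 0:
        raise ValueError("shared port speed must be positive")
    return sum(host_port_gbps) / shared_port_gbps


def oversubscribed_ports(fabric, limit):
    """Return {port_name: ratio} for shared ports whose fan-in exceeds limit.

    fabric maps a shared port name to (its speed, list of host port speeds).
    """
    flagged = {}
    for name, (shared_gbps, hosts_gbps) in fabric.items():
        ratio = fan_in_ratio(hosts_gbps, shared_gbps)
        if ratio > limit:
            flagged[name] = ratio
    return flagged


# Illustrative fabric: two 32 Gbps array ports serving 16 Gbps hosts.
fabric = {
    "array-fe-0": (32, [16] * 12),  # 192 / 32 = 6.0
    "array-fe-1": (32, [16] * 4),   # 64 / 32 = 2.0
}
print(oversubscribed_ports(fabric, limit=4.0))
```

The acceptable limit depends on how bursty the attached workloads are, so it is left as a parameter rather than fixed in the code.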
Nagios XI
Nagios XI monitoring for SAN setups and infrastructure can check capacity of disks and directories, as well as RAID status, storage capacity planning, and network and infrastructure monitoring for applications.
Its strength is the breadth of areas it monitors – storage, servers, network. Thus, it is a good general monitoring tool, but lacks the depth of SAN monitoring provided by some other solutions.
- Monitoring of applications, services, operating systems, network protocols, systems metrics, SAN, and network infrastructure
- Hundreds of third-party add-ons provide for the monitoring of applications, services, and systems
- The Nagios Core 4 monitoring engine manages server performance
- Centralized view of the entire IT operations network and business processes
- Automated, integrated trending and capacity planning graphs
- Integrated web-based configuration interface
- Multi-user access to web interface
- User-specific views ensure clients only see the infrastructure components they’re authorized for.
ManageEngine OpManager
ManageEngine OpManager is another solution that monitors storage devices but has a broader focus. The company positions it as more of a network monitoring product that can span geographically separated data centers and public and private clouds.
ManageEngine OpManager monitors network devices such as routers, switches, firewalls, load balancers, wireless LAN controllers, servers, VMs, printers, storage devices, and anything that has an IP and is connected to the network. It continuously monitors the network and provides visibility and control. Admins can drill down to the root cause and eliminate it before operations are affected.
- Manages host servers, bus adapter cards, storage arrays, and fabric switches
- Provides general network monitoring
- RAID monitoring to determine capacity, configuration, and performance metrics
- Monitors status of controllers, LUNs, volumes, and virtual disk groups, providing statistics and reports for each
- Automatic discovery of tape libraries and reporting on their status, health, and any faults
eG Enterprise
eG Innovations provides eG Enterprise. Again, it can be described more as a comprehensive network monitoring tool that also has SAN storage monitoring capabilities.
Performance monitoring is done across the entire infrastructure, and admins can view where storage, network, or performance problems originate, such as issues with physical components, storage bottlenecks, bandwidth hogs, or trouble coming from applications, servers, or virtual machines.
- eG Enterprise is an end-to-end application performance monitoring solution used by enterprises, service providers, and government agencies
- Automated root cause diagnosis technology to pinpoint the reason for service slowdowns in heterogeneous IT environments
- Connects performance insights across applications, user experiences, physical and virtual servers, network, and storage
- Real user monitoring shows slow transactions, as well as when, where, and why
- Synthetic monitoring tests applications 24×7 from different locations
- Traces web transactions as they are processed by each application tier, providing transaction flow graphs to detect bottlenecks in transaction handling
- Out-of-the-box monitoring for over 180 enterprise applications including Citrix, SAP, Microsoft Dynamics, SharePoint, Siebel, PeopleSoft, Java, and Oracle
Dell EMC SRM
Dell EMC Storage Resource Manager (SRM) is included as EMC is responsible for the bulk of the SANs out there. But it wasn’t easy to include the company. Parts of the old EMC catalog appear to have been lost within the vast Dell portfolio.
But SRM is still in there if you search hard enough. SRM is a comprehensive monitoring and reporting solution that helps IT visualize, analyze, and optimize storage infrastructure. It provides a management framework that supports investments in software-defined storage.
- EMC was the biggest player in the SAN space for more than 20 years. Its SRM platform contains that heritage and would be hard to beat when it comes to EMC SAN hardware
- SRM combines storage capacity, performance data, storage configuration, and compliance data of heterogeneous storage into reports through a single pane of glass
- End-to-end SAN visibility of storage arrays like PowerStore, PowerMax, Dell EMC Unity XT, XtremIO and other Dell EMC products