Everyone knows of itinerant databases that sit on departmental servers or, worse, on desktops. These databases grow up to meet the needs of an individual project or team, then gradually become part of the data the company depends upon.
So, does it make sense to leave these data stores scattered about your network in different locations? Most organizations are trying to bring this trend under more central control and, at least, get this valuable data into the data center environment. This strategy makes for better disaster recovery and business continuity. Consolidating these data stores helps your systems to scale better.
Consider a storage area network (SAN) if your data doesn’t need to be physically close to the servers that need it and you have the budget to put in a SAN. For example, a financial organization might consider a SAN across its secured, private-line wide area network facilities to increase server performance. However, this setup wouldn’t save the organization any money. Most other organizations will find running SANs over the wide area network to be expensive. If they use a virtual private network or a managed virtual private network service instead, they’ll find that the virtual private network’s packet loss and unreliability make it a poor transport technology for SANs.
On the other hand, network-attached storage (NAS) devices can benefit from virtual private networks because the traffic is usually less interactive and can tolerate latency. Remote small office/home office workers, for example, can take advantage of accessing their files from a NAS device.
What do you do if you want to consolidate the Web logs from 20 different Web servers? Depending on the application architecture, you could do a batch-type or a real-time transfer. You finally decide you want to get the Web logs from the Web servers to the analytical application at the same physical location. Some of these analytical customer relationship management (CRM) applications have code that resides on the Web server, while other applications rely on FTP transfers of the logs prior to processing (either through scripts or manual intervention). A SAN might work in the case of the applications with the code living on the Web servers, which is more interactive, while a NAS would likely be better for applications that rely on FTP transfers. If the analytical CRM application needs the data both interactively and through FTP transfers, then go with a SAN. Always plan for the highest level of interaction for a CRM application. To this end, you’d be better off going with a SAN to support highly interactive CRM applications.
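The "plan for the highest level of interaction" rule can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a sizing tool: the `Interaction` enum, the `storage_for` function, and the two-level ranking are hypothetical names and simplifications introduced here.

```python
from enum import IntEnum


class Interaction(IntEnum):
    """Hypothetical ranking of how an application reaches its data."""
    BATCH_FTP = 1      # logs arrive by scripted or manual FTP transfer
    INTERACTIVE = 2    # application code lives on the Web servers


def storage_for(access_modes):
    """Suggest storage based on the most interactive access mode in use.

    Raises ValueError if access_modes is empty.
    """
    highest = max(access_modes)
    return "SAN" if highest == Interaction.INTERACTIVE else "NAS"


# An application that uses both FTP and interactive access is treated
# as interactive, so the suggestion is a SAN.
print(storage_for([Interaction.BATCH_FTP, Interaction.INTERACTIVE]))
```

An application that relies only on FTP transfers would get "NAS" from the same function, matching the distinction drawn above.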
Most organizations have many reasons for wanting to move data. Databases that are shared among many applications and users might fit quite nicely into a SAN. Other databases that aren’t as heavily used, such as major tape backups sent to off-site locations, could function with a NAS or direct-attached storage system. For example, to speed up a full Oracle backup, you might go right to disk via a SAN and then move the backup to tape.
You need to understand how everyone uses the data on the system that you would like to move to a SAN. Say you want to move the interactive Web log data, but share copies of it. You might have two application database servers sharing the same SAN and pointing to the same storage appliance. In this setup, only the application and the Web log data travel over the SAN. Likewise, the content management system, which serves content to the Web site, could also share the SAN as a way to access information from the analytical application to personalize the Web site for visitors. Having more applications access the same SAN can provide a good return on investment.
The sheer volume and frequency of data transferred could help you decide if you need a SAN. For example, if you transmit several hundred megabytes per second constantly around the clock, a SAN might be a good idea. On the other hand, if you’re moving about 1,000 megabytes over the course of the day, or in one or two transfers per day, then a NAS or direct-attached storage could suffice.
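To make the comparison concrete, the sketch below converts a daily total into an average rate and checks it against a cutoff. The function names and the 100-megabytes-per-second threshold are assumptions chosen for illustration; a real threshold depends on your hardware and applications.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mb_per_second(total_mb_per_day):
    """Average transfer rate implied by a daily total, in MB per second."""
    return total_mb_per_day / SECONDS_PER_DAY


def suggest_storage(sustained_mb_per_s, san_threshold_mb_per_s=100.0):
    """Suggest storage from a sustained rate.

    The default threshold is an illustrative assumption, not a standard.
    """
    if sustained_mb_per_s >= san_threshold_mb_per_s:
        return "SAN"
    return "NAS or direct-attached storage"


# Several hundred MB per second around the clock points to a SAN.
print(suggest_storage(300.0))

# A hypothetical 1,000 MB spread over a whole day averages far less
# than 1 MB per second, so lighter storage could suffice.
print(suggest_storage(average_mb_per_second(1000.0)))
```

An average hides bursts, so a workload that moves its whole daily volume in one short window should be judged on the rate during that window rather than on the daily average.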
When considering volume and frequency, you must also consider the growth rate and pattern for this data. Try to determine if seasonal issues cause predictable increases in data flows for days or weeks and then stop. If you have seasonal high-volume periods, such as a retailer’s back-to-school campaign, you might want to have your SAN as part of your online customer relationship management system linked to your Web site.
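One rough way to account for growth and seasonal peaks together is to project the busiest-day volume rather than the everyday average. The formula below (compound annual growth multiplied by a seasonal factor) and all of its parameter names are assumptions made for this sketch; your own data may follow a different pattern.

```python
def projected_peak_mb_per_day(baseline_mb_per_day,
                              annual_growth_rate,
                              seasonal_multiplier,
                              years=1):
    """Estimate peak daily volume after `years` of compound growth.

    annual_growth_rate: 0.5 means volume grows 50 percent per year.
    seasonal_multiplier: 3 means peak days carry three times normal volume.
    """
    grown = baseline_mb_per_day * (1 + annual_growth_rate) ** years
    return grown * seasonal_multiplier


# Hypothetical retailer: 1,000 MB per day today, 50 percent yearly
# growth, and a back-to-school peak at three times normal volume.
print(projected_peak_mb_per_day(1000, 0.5, 3))
```

Sizing against the projected peak, rather than today's average, shows whether a seasonal surge would push a workload from NAS territory into SAN territory.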
Going back to the Web analytical application, you find that the data volumes could be great depending on the size of the site. If the Web analytics application is also intended to track and calculate the results of an ongoing sales and marketing application, you’ll need to move a lot of data to be analyzed. Once again, a SAN makes a lot of sense since you’ll be able to share the Web data among the applications and reduce bottlenecks.
The lesson here is this: judging the volume of data and its movement depends on understanding the applications involved and the way people use them.
Elizabeth M. Ferrarini is a freelance writer from Boston, Massachusetts.