Six Tips for Media Storage in a Big Data World


Big Data is this year’s blockbuster, but for the broadcast industry the topic’s a rerun. It has had to deal with the storage of large quantities of media files for many years.

“Broadcast data is really the original big data,” said Moosa Matariyeh, enterprise storage specialist for CDW. “It’s a vast amount of data that is often accessed in large chunks.”

Now, that doesn’t mean that everything in IT can stay the same. In fact, broadcasters need to change their storage systems about as often as the BBC changes the actor playing Doctor Who. In this case, the main driver is on-demand video.

“The fact that you can go online and stream the replay of the afternoon news is putting greater performance demands on infrastructure and providing more points of access for that data,” said Matariyeh. “This just reinforces the reasoning behind scale-out architecture.”

Here are some tips for selecting and architecting media storage.

1. Define the Usage Scenario

The term Big Data applies to both storage and analytics.

“Big Data in terms of capacity is what broadcasters have been swimming in for many years,” said Jason Danielson, media and entertainment solutions, product and solution marketing for NetApp. “Big Data in terms of the analytics of large datasets of consumer information is still a ways off for most broadcasters.”

Nevertheless, some broadcasters are moving into Big Data analytics, for example by analyzing Twitter to get feedback on programming impact faster than customer surveys or sweeps can provide it. From a storage viewpoint, the primary difference between Big Data analytics and media storage is that analytics needs low-latency access to massive numbers of small files, while media requires uninterrupted access to smaller numbers of massive files.

2. Use Flash for Streaming

When setting up the file system, put metadata and thumbnails on first-tier storage, preferably SSD or flash, and have them point to the larger files (both low-res and high-res, including 4K video) on lower-tier storage. The same principle applies not just to editing, but also to buffering on-demand streams.

“For example video thumbnails or head (beginning) portions of videos can be kept online in SSD for fast access leveraging back-end disks or tapes to support long tail access (e.g. reduce the wait time for a video to load or buffer),” said Greg Schulz, senior advisor, StorageIO Group.
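The placement rule described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's policy engine: the tier names, the `Asset` fields, and the 90-day archive threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tier names; a real system would map these to actual storage pools.
FLASH_TIER = "flash"
DISK_TIER = "disk"
TAPE_TIER = "tape"


@dataclass
class Asset:
    name: str
    kind: str  # "metadata", "thumbnail", "head" (opening portion), or "video"
    days_since_access: int = 0


def choose_tier(asset: Asset, archive_after_days: int = 90) -> str:
    """Pick a storage tier for an asset (threshold is an assumed example value)."""
    # Small, latency-sensitive items stay on the fastest tier.
    if asset.kind in {"metadata", "thumbnail", "head"}:
        return FLASH_TIER
    # Full-length video that has gone cold moves to tape for long-tail access.
    if asset.days_since_access > archive_after_days:
        return TAPE_TIER
    # Everything else sits on capacity disk.
    return DISK_TIER
```

Calling `choose_tier(Asset("evening-news", "thumbnail"))` returns the flash tier, while the full-length video for the same program lands on disk or tape depending on how recently it was accessed.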

3. Select the Right Storage for the Usage Scenario

Not all applications require the same type of storage system.

“The best solutions for broadcast media depend on the usage scenario. If we are talking about storage for some of the applications we typically see in media, many of those applications historically like to mount storage that is presented locally or via a fibre channel interface in order to accommodate the large amount of data moving across the medium,” said Matariyeh. “For scenarios like this, many of the applications have pre-approved manufacturer lists of products they support.”

He says that file system architectures work well for streaming media, whether on-air programming or online streaming, since they allow multiple mounts of a single set of data and can increase throughput and capacity as more people try to download or access a single song or video.

“The other advantage these systems have is the collaborative work that they can do, since multiple applications or users can access these systems through something as simple as an NFS mount,” Matariyeh said. “This enables a media file to be uploaded, edited, and streamed from a single platform.”

4. Keep the Tape

Disk is replacing tape in many businesses, but tape still has a big role in broadcast, given that a single video file can run several terabytes. Rather than moving rarely used videos offline, use a tape library to keep them in an active archive, restoring them to disk when needed.

“The most popular implementation or strategy in big data environments is launching an active archive storage model,” said Hossein ZiaShakeri, senior vice president of business development and alliances at Spectra Logic. “Active archive provides an affordable, online solution to access and store all created data by extending the file system to tape. An active archive contains production data, no matter how old or infrequently accessed, that can still be retrieved online.”
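As a rough illustration of the idea, the toy catalog below keeps every file visible in one listing whether it currently lives on disk or on tape, and recalls a file to disk the first time it is opened. The class and method names are hypothetical and do not reflect any product's API.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveArchive:
    """Toy catalog: every file stays listed, whether it lives on disk or tape."""
    on_disk: set = field(default_factory=set)
    on_tape: set = field(default_factory=set)
    recalls: int = 0

    def listing(self):
        # Users see one namespace regardless of where the bytes are stored.
        return sorted(self.on_disk | self.on_tape)

    def archive(self, name: str) -> None:
        # Move a rarely used file to tape; it remains in the listing.
        if name in self.on_disk:
            self.on_disk.remove(name)
            self.on_tape.add(name)

    def open(self, name: str) -> str:
        # Return where the file was served from, recalling from tape if needed.
        if name in self.on_disk:
            return "disk"
        if name in self.on_tape:
            self.on_disk.add(name)  # restore a copy to disk; tape copy is kept
            self.recalls += 1
            return "tape"
        raise FileNotFoundError(name)
```

The first `open()` of an archived file pays the tape recall cost; later opens are served from disk.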

5. Don’t Look for Dedupe

While data deduplication and compression are major advantages for some types of files, they won’t make much of a difference with media storage. To begin with, many media files are already in a compressed format such as MP3 or MP4. Any further lossy compression can lower quality.
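A quick way to see why already-compressed media gains little is to compare compression ratios on two kinds of input. The sketch below uses Python's standard `zlib` module; the pseudo-random bytes are only a stand-in for compressed media, and the repeated log line is a stand-in for ordinary repetitive data. Both inputs are assumptions for illustration, not real broadcast files.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means more savings)."""
    return len(zlib.compress(data, 9)) / len(data)


def sample_inputs(size: int = 200_000, seed: int = 0):
    """Build two illustrative inputs of equal length."""
    rng = random.Random(seed)
    # Stand-in for already-compressed media: high-entropy pseudo-random bytes.
    high_entropy = bytes(rng.getrandbits(8) for _ in range(size))
    # Stand-in for repetitive non-media data such as logs.
    line = b"INFO playback started channel=5\n"
    repetitive = (line * (size // len(line) + 1))[:size]
    return high_entropy, repetitive
```

Running `compression_ratio` on each input shows the high-entropy data staying at roughly its original size while the repetitive data shrinks dramatically, which is the same pattern storage-level compression sees with media versus non-media files.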

“Instead of a focus around dedupe-type features, look for storage that can scale in terms of performance (bandwidth), space capacity, reliability, durability and ease of management in an economical manner,” said Schulz. “While metadata or editing or popular heavily accessed files are a good candidate for SSD, high-capacity lower-cost SAS and SATA drives, along with LTO tape combined with LTFS are good candidates to meet the bandwidth and space needs.”

Non-media files, however, may benefit from the higher IOPS available with SSD, rather than large-capacity SAS or SATA drives.

6. Include Analytics

“Big Data is a very applicable term when it comes to media storage,” said Sam Grocott, VP product management and marketing, EMC Isilon. “With some single files being up to terabytes in size and with each movie needing to be transcoded into roughly 16 different formats for worldwide and multiplatform distribution, the growth of media data is higher than most other industries.”
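To get a feel for how transcoding multiplies capacity needs, a back-of-the-envelope estimate can help. The function below is a minimal sketch: it assumes each rendition's size can be expressed as a fraction of the master file, and the fractions themselves are hypothetical inputs a planner would have to supply.

```python
from typing import Sequence


def title_footprint_tb(master_tb: float, rendition_fractions: Sequence[float]) -> float:
    """Total storage for one title: the master plus every transcoded rendition.

    Each entry in rendition_fractions is that rendition's size relative to the
    master (an assumed planning input, not a measured value).
    """
    return master_tb * (1 + sum(rendition_fractions))
```

For example, a hypothetical 2 TB master with 16 renditions that each take a quarter of the master's size gives `title_footprint_tb(2.0, [0.25] * 16)`, or 10 TB for a single title before any backup or archive copies.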

But broadcasters also need to use Big Data analytics in order to make sure they get a good return on the billions they spend generating content. In a world where viewers have access to millions of entertainment choices at the click of a mouse, Big Data analytics is needed to shape what to offer to whom.

“The most important metric for broadcasters is how well they can monetize their assets,” Grocott continued. “To do this, broadcasters have to maximize not only the number of eyeballs, but also from the right demographic tuning in to watch their assets.”

Drew Robb
Drew Robb is a contributing writer for Datamation, Enterprise Storage Forum, eSecurity Planet, Channel Insider, and eWeek. He has been reporting on all areas of IT for more than 25 years. He has a degree from the University of Strathclyde UK (USUK), and lives in the Tampa Bay area of Florida.
