Finding Needles of Data in an Unstructured Haystack


At least once a week, another news story exposes a security breach at some company or organization: the loss, theft or inadvertent exposure of sensitive material such as Social Security numbers, credit card numbers and other financial and personal data. In fact, according to the Privacy Rights Clearinghouse, more than 104 million records containing such data have been compromised in breaches since January 2005.

To avoid bad press, comply with a growing number of government and industry regulations aimed at protecting consumer privacy (such as HIPAA, SEC 17a-4, Sarbanes-Oxley and the Leahy-Specter Personal Data Privacy and Security Act of 2007) and, more importantly, to keep their customers happy, companies are now taking greater measures to ensure that sensitive customer data is properly protected.

The Unstructured Data Nightmare

One of the greatest challenges that large and small companies face in the race to protect their customers’ privacy is identifying sensitive data that is stored in non-database files and e-mails, what’s referred to as unstructured information. That’s where Kazeon comes in. Its mission: to help organizations identify and manage that unstructured information, no matter where it’s stored.

“Today in most organizations, unstructured data represents 70 to 80 percent of their online data,” explains Michael Marchi, vice president of solution marketing at Kazeon. “Yet organizations have little visibility into this information.”

To gain that all-important visibility and help organizations proactively identify sensitive and confidential information sitting out, exposed, on corporate networks, Kazeon created the Information Server IS1200-ECS. Billed as the first appliance to integrate content-aware indexing, classification, search, reporting and migration together in one package to address compliance, data privacy and security challenges, the Kazeon Information Server is being deployed by companies like Omnium Worldwide, a leading accounts receivable company, that collect and need to protect sensitive data.

Standards Compliance

For Steven Cartwright, Omnium’s director of information security, compliance with industry standards — such as the Payment Card Industry Data Security Standard (PCI DSS), the Statement on Auditing Standards (SAS) No. 70 and HIPAA — is a huge issue. As he explains, “the industries that we play in [mainly financial services, telecommunications and healthcare] each have their own unique regulations that they have to comply with, which they then push on us.”

Of particular concern to Cartwright and Omnium was SAS 70 certification, which would entail an auditor specifically looking at Omnium’s data security and how it handled client data. “It’s a client requirement — a way for our clients to have a third party validate that we’re doing things correctly,” explains Cartwright. So it was not something that Omnium could avoid — or afford to fail.

However, until recently, Omnium, like many companies its size, didn’t have an automated system for discovering whether sensitive customer information was stored properly or not. “We’d stumble across areas of non-compliance rather than having something that would tell us where our areas of non-compliance were,” says Cartwright. “We’d hear something through the grapevine: this isn’t stored properly or this group is using this share incorrectly. And we would do a lot of manual investigation: Where is it? What is it? Where does it need to go?”

So late last summer, when a vendor stopped by to chat with Cartwright about Omnium’s storage and security needs and mentioned Kazeon, Cartwright was all ears. If the Kazeon Information Server truly delivered on what it promised, it could be the solution Omnium needed to help it get those critical SAS 70 and PCI certifications.

It took many weeks, several internal discussions, looming standards reviews, product comparisons and ultimately 30 days of testing a Kazeon Information Server demo unit to convince management, but Omnium eventually gave Kazeon a purchase order.

Protecting Sensitive Data

Installation was easy. “It almost took longer to unbox [the appliance] and put it into the rack,” says Cartwright. And it played very nicely with Omnium’s existing systems, including its new IBM SAN and IBM Tivoli Storage Manager. “The nature of the appliance is it just doesn’t require a lot of integration,” he says.

Although deployed for less than six months, the Kazeon IS1200-ECS is “doing what we wanted it to do, which is going out [every night and every weekend] and discovering areas of our network — department drives, personal drives and things like that — where we have data stored that shouldn’t be stored,” says Cartwright. “We’re using it primarily to search for credit card numbers, social security numbers, personal health information, things that we want to make sure are secured.”
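To make the kind of search Cartwright describes more concrete, here is a minimal Python sketch of pattern-based detection. It is an illustration only, not Kazeon's method: the function names (`luhn_valid`, `find_sensitive`), the regular expressions and the use of a Luhn checksum to weed out random digit strings are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; a real scanner would need many more.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13-19 digits, optional spaces or dashes
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")      # U.S. SSN written as NNN-NN-NNNN


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for candidate SSNs and card numbers."""
    hits = []
    for m in SSN_RE.finditer(text):
        hits.append(("ssn", m.group()))
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        # Keep only digit runs of plausible length that pass the checksum.
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(("card", m.group()))
    return hits
```

Calling `find_sensitive` on the text of each file in a share and logging any non-empty result would give a crude version of the nightly report. The checksum step matters because long digit runs (order numbers, phone lists) are common in business files and would otherwise flood the report with false positives.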

And when the IS1200-ECS discovers sensitive data among Omnium’s 1.5 TB of unstructured information (it has, although Cartwright is understandably reluctant to say how much), it automatically generates a report, so Cartwright and his team can immediately move and protect the exposed data. Now Cartwright is looking forward to the day when he can sit across the table from each of the bank auditors, who visit once a week between March and September, pull up the Kazeon Information Server, “and show him this report that ran last night that shows him that none of that data is stored inappropriately.”

Gaining Visibility

Like Omnium, a lot of companies are using the Kazeon Information Server to gain visibility into their stored data. “Quite simply, without visibility into information stored on the network, organizations have risk liabilities, inefficient access to information for knowledge workers, higher storage costs and lower chances to win court cases,” states Kazeon’s Marchi, who notes that many companies are also using the IS1200-ECS to help with legal discovery.

“Organizations need to understand and find what information is sensitive and confidential and get it off the network, because it is a liability,” he explains. Additionally, “IT needs to understand the data in their environment, so they can optimize their tiered storage infrastructure to help control costs — i.e., why store files that have not been accessed in one year on primary storage? Why not find those files, set up policies to move those files, and automate this process?” The Kazeon Information Server, he says, lets IT departments do all of this.
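Marchi's tiered-storage question (why keep files untouched for a year on primary storage?) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how the Kazeon appliance works: the function name `stale_files` and its parameters are hypothetical, and it relies on file access times, which some systems update lazily or not at all.

```python
import os
import time
from pathlib import Path


def stale_files(root, max_age_days=365, now=None):
    """List files under root whose last access is older than max_age_days.

    Assumes the filesystem records access times; volumes mounted with
    options such as noatime will report misleading values.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            try:
                atime = path.stat().st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

The returned list is the input a migration policy would need: each path is a candidate to move to cheaper storage. A production tool would also have to handle permissions, open files and links left behind for users, none of which this sketch attempts.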

“We are the only company that can scale to meet the new [data privacy] requirements,” says Marchi. “We can cluster many nodes together (all managed as a single entity) to index and search across the enterprise. Our optimized crawlers and a pipelined indexing process enable us to index millions of files per day per node. And we’re the only product that integrates network discovery, classification, reporting, search and automation in a single product.”

Additionally, the Kazeon Information Server supports over 380 standard file types and can support hundreds of non-standard file types. It even supports the Digital Imaging and Communications in Medicine (DICOM) format, which is used in the medical imaging and healthcare industry.

States Marchi: “Simply install us on your network, give us read access to the shares you want us to index, and we do the rest.”


Jennifer Schiff
Jennifer Schiff is a business and technology writer and a contributor to Enterprise Storage Forum. She also runs Schiff & Schiff Communications, a marketing firm focused on helping organizations better interact with their customers, employees, and partners.
