As the volume of data mushrooms, archiving becomes more and more of a hot topic that organizations might ignore at their peril. The consequences of failing to archive content are many. Compliance mandates require certain information be retained for 5, 10, or sometimes 50 years or more. And CFOs get tired of monthly requests for yet more disk arrays or cloud storage. They need a way to reduce costs.
The basic idea of an archive is to store rarely accessed or inactive data in a secure place, typically a lower and cheaper tier of storage. In some cases, the archive has to be immutable; data placed there cannot be changed.
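As a rough illustration of "rarely accessed or inactive data," the sketch below walks a directory tree and lists files whose last-access time is older than a chosen threshold. The function name `find_archive_candidates` and the `max_idle_days` parameter are hypothetical, and the approach assumes the file system records access times at all (volumes mounted with `noatime` do not), so treat it as a starting point rather than how any product listed here selects data.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root, max_idle_days, now=None):
    """Return files under root not accessed within max_idle_days (hypothetical helper)."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    candidates = []
    for path in Path(root).rglob("*"):
        # st_atime is the last-access timestamp reported by the file system
        if path.is_file() and path.stat().st_atime < cutoff:
            candidates.append(path)
    return sorted(candidates)
```

Calling `find_archive_candidates("/data/projects", 365)` would return files untouched for about a year; the path here is only an example.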
Access is a concern. Users want to be able to find and access that content rapidly in the rare case that they need something. That’s why you can’t call a shelf full of dusty backup tapes an archive. They are unsearchable and it might take a while to retrieve them and find what you need.
This factor causes many to archive on cheaper disks. That solves the accessibility problem and works for some in terms of cost. But for others, the price tag of running more and more disk arrays is high once you take into account the cost of equipment and software acquisition, maintenance, cooling, and power. Hence the growing popularity of tape-based archiving systems that offer relatively fast access.
Key Archiving Features
Archive software and solutions should provide all or most of the following capabilities:
- Fulfill legal and regulatory compliance requirements.
- Provide low-cost storage for data retention.
- Centralized storage for structured and unstructured data.
- Automated data retrieval and retention.
- Chain of custody to ensure tamper-proof records.
- Metadata for search, categorization, and discovery.
- Retention policy setting.
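One common way to make a chain-of-custody record tamper-evident is to hash-chain its entries, so that altering any earlier entry invalidates every later hash. The sketch below is a generic illustration of that idea, not any vendor's implementation; the function names and entry fields (`actor`, `action`, `record_id`) are assumptions made for the example. A hash chain detects changes but does not by itself prevent them, so real systems typically pair it with immutable storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, record_id):
    """Append a custody entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"actor": actor, "action": action,
            "record_id": record_id, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_log(log):
    """Return True only if every entry's hash and back-link are intact."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

After a few `append_entry` calls, `verify_log(log)` returns `True`; editing any field of an earlier entry makes it return `False`.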
Archiving Use Cases
The primary use cases for archives are:
- A repository of organizational data for a long period.
- A way to offload infrequently accessed or inactive data to free up storage space in more expensive platforms such as disk arrays and solid state drives.
- Compliance with regulations about data retention and file immutability.
- A way to retain unstructured data for long periods that may be useful for analysis, trending, and marketing.
- E-discovery features to improve litigation readiness.
Tips on Selecting Archiving Software
- Don’t confuse a backup with an archive. A backup is a way of keeping a copy of data in case of an emergency, data loss, security breach, etc. An archive, on the other hand, is a repository for inactive data and a way to retain it for long periods for regulatory and other purposes.
- Check with your backup vendor if they offer archiving tools. If so, the integration benefits should make them a top candidate.
- Balance the costs of archiving against the price tag for retaining everything on an expensive disk or SSD.
- Consider security. Disk is best for speed, but tape offers an air gap that prevents the bad guys from infecting the data with malware.
Top Archiving Software Vendors
Enterprise Storage Forum considered the various vendors in this category. Here are our top picks, in no particular order:
The Veritas Digital Compliance portfolio includes Veritas Merge1, which captures more than 80 different native content sources into Veritas Enterprise Vault or EV.cloud (the SaaS version of Enterprise Vault). Enterprise Vault allows users to capture, archive, and find data quickly. The Veritas Digital Compliance portfolio helps users close the data governance gap to achieve compliance.
- Integrated support for over 80 content sources, collected natively for archiving, supervision and discovery.
- A partner ecosystem that includes voice, video, and telephony, bringing the total to over 110 content sources.
- A fully compliant, enterprise-class on-premise archiving solution with the flexibility to migrate data from on-premise archive to public cloud services (Azure, AWS) from the same console.
- A SaaS version with automatic upgrades, and a common user interface for classification and supervision.
- Classification includes over 1000 policies and patterns to ensure detection of personal/sensitive data, stale and orphaned data, business-relevant data, and threats such as ransomware.
- Utilizes predictive analytics and machine learning to identify anomalous behavior and surface relevant information based on previous choices from reviewers.
- eDiscovery that provides end-to-end collection.
- Integration with Microsoft 365 allows information governance capabilities such as tagging based on classification policies for Microsoft 365 administrators or true journaling capabilities, allowing an immutable copy of all the organization’s emails and other relevant content in one secure archive.
FujiFilm Object Archive is designed around data archiving best practices: redundant off-site copies, data security through a tape air gap, scalability, reliability, low cost of ownership, a standardized interface to data, and a seamless data migration process. Object Archive makes hybrid cloud storage and cold data archiving simple by integrating an S3-compatible API with modern tape technology. It offers an object-based storage target that works like an on-premise Amazon Glacier. FujiFilm is part of the Active Archive Alliance, a vendor-neutral group that helps develop and implement archiving technologies and strategies.
Active archiving uses intelligent data management software to automatically classify and move data across multiple storage tiers and locations to optimize storage utilization and cost. This native intelligence keeps aging data online all the time, easily searchable and accessible without the need for IT intervention, saving time and money. The FujiFilm solution incorporates this approach.
- Designed to archive massive amounts of cold data.
- Object Archive operates like Amazon Glacier.
- Protects data with air-gap security.
- Seamlessly integrates with its S3-compatible API.
- Reduces object storage costs by up to 80%.
- Scalable with enterprise tape libraries.
- Software subscription model.
- Tape media included in subscription.
- Vendor neutral: supports major tape library manufacturers.
- Supports LTO and IBM Enterprise tapes.
- Seamless integration with disk-based object storage vendors.
Proofpoint Enterprise Archive is a cloud-native archiving solution that helps customers meet long-term business and regulatory information retention requirements. With support for email and other digital communication platforms, such as instant messaging and collaboration and social media, it provides high performance, built-in search, litigation hold and export to address basic e-discovery needs. It also offers advanced features, such as case management and machine learning-based technology assisted review, to streamline processes and reduce related costs.
- The integrated Proofpoint Compliance Gateway ensures all email and other content captured is stored in the archive.
- Intelligent Supervision helps monitor and review email and other digital communications to enable regulatory compliance.
- NexusAI for Compliance uses machine learning to reduce false positives from supervisory review queues.
- No need to manage supporting hardware and software. Proofpoint also manages updating versions.
- Out-of-box machine learning models to reduce low-risk supervision content from review queues.
- The Compliance Risk Dashboard provides an overview of a compliance profile, highlighting compliance risks and violation trends over time.
- DoubleBlind security technology encrypts archived content and enables searching it without first having to decrypt it.
Micro Focus Compliance Archiving is a cloud-based offering that includes compliance archiving, supervision, surveillance, analytical reporting, end-user email archive search, and e-discovery capabilities. Digital Safe Archiving includes scalable Precise Search compliance, delivering cross-channel granular results because of object store and attachment enrichment. This empowers teams to refine single searches, reducing the noise to find terms and potential violations within all e-communication, attachments, and embedded voice and video. It is particularly suited to highly regulated, complex, global, multi-jurisdictional implementations. It is built on the Compliance Data Lake (CDL) and includes AI technology that is said to save data scientists time.
- Reporting and Insights offers interactive analytic workflows.
- All solutions are delivered with premium Managed Services, including managed operations, monitoring, customer success, support and expert consulting plus optional outsource staffing in key areas such as audit services.
- Micro Focus Information Management & Governance solutions focus on highly regulated industries such as Financial Services, Government, and Healthcare and provide organizations with unified endpoint management, supervision, surveillance, eDiscovery, and governance, with AI and investigative analytics.
- Precise Search results from interactive Terms Hits and Faceted Search are possible because of message-level enrichment. Personalized, easy-to-use search supports interactive Booleans and nested parenthetical expressions across messages and attachments, at PB scale and beyond.
- Compliance Data Lake ingests and enriches social collaboration, voice, video (business communications), and files, using best practices for structure and object-level enrichment to deliver cross-channel and custodian insights, plus compliant search.
- Cloud native architecture for agility, upgradability, combined with Premier Support and Services.
Commvault is best known for backup, data protection, and disaster recovery. It also offers archiving tools for files and emails. However, these seem to get a little lost in the plethora of other products available. The Commvault file archiving solution enables you to move data to secondary storage, which then functions as an archive copy. You can move data to disk, tape, or cloud libraries. The archived data remains available for quick and easy retrieval. You can recall the archived data by using stubs or placeholders in the original locations. This solution is probably best for existing Commvault users who can add archiving to their backup and DR offerings.
- Archiving reports can be used to view vital information, such as the compression rate and the space saved by archiving.
- Pre-defined archiving rules make it easy to manage the backup data in your environment.
- Files are indexed when you perform archive operations, so you can search and retrieve archived files easily.
- You can archive data on the following file system clients: Macintosh, Windows, and UNIX.
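The stub-and-recall idea described above can be sketched generically: move the file to secondary storage, leave a small placeholder at the original location that records where the data went, and use that placeholder to bring the file back on demand. The code below is a simplified illustration under stated assumptions, not Commvault's mechanism; the `.stub.json` suffix, the function names, and the JSON layout are all hypothetical, and it ignores name collisions in the archive directory.

```python
import json
import shutil
from pathlib import Path

STUB_SUFFIX = ".stub.json"  # hypothetical naming convention for placeholders


def archive_with_stub(source, archive_dir):
    """Move source into archive_dir and leave a stub at the original location."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    shutil.move(str(source), str(target))
    stub = source.with_name(source.name + STUB_SUFFIX)
    # The stub only records where the archived copy now lives
    stub.write_text(json.dumps({"archived_to": str(target)}))
    return stub


def recall_from_stub(stub):
    """Restore the archived file to its original location and remove the stub."""
    stub = Path(stub)
    info = json.loads(stub.read_text())
    original = stub.with_name(stub.name[: -len(STUB_SUFFIX)])
    shutil.move(info["archived_to"], str(original))
    stub.unlink()
    return original
```

Commercial products usually integrate the placeholder with the file system so that opening it triggers the recall transparently; this sketch makes the recall an explicit call to keep the moving parts visible.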
Miria for Archiving (formerly Ada) is file archiving and data moving software that ensures constant availability and security for very large data volumes. It integrates into workflows that need performance and open, scalable formats, while helping organizations to avoid storage lock-in.
- A drag-and-drop interface, plus integration into workflows, allows end-users to archive and retrieve data without assistance.
- Supports automated archiving tasks according to IT policy.
- Customizable to fit your workflow and your applications.
- Search the archive by basic file properties, custom keywords, or full-text file content.
- Scalable from tens to hundreds of terabytes with no performance bottlenecks.
- Archive based on specific policies, including retention rules based on owner or data type.
- Archive data using standard formats on a broad range of storage media, including disk, tape, object, and cloud.
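Retention rules keyed on owner or data type, as mentioned in the list above, can be modeled as a small lookup. The sketch below is a generic illustration and not Miria's policy engine; `RetentionRule`, `retention_days`, and `is_expired` are hypothetical names, and the choice that the longest matching retention period wins is an assumption made to err on the side of keeping data.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    owner: Optional[str]      # None matches any owner
    data_type: Optional[str]  # None matches any data type
    retain_days: int


def retention_days(rules, owner, data_type, default_days):
    """Return the longest retention among matching rules, else the default."""
    matches = [
        rule.retain_days
        for rule in rules
        if rule.owner in (None, owner) and rule.data_type in (None, data_type)
    ]
    return max(matches) if matches else default_days


def is_expired(archived_on, retain_days, today):
    """True once the retention period counted from archived_on has elapsed."""
    return today >= archived_on + timedelta(days=retain_days)
```

For example, with one rule covering everything owned by finance and another covering all email, an email owned by finance is held for whichever period is longer.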
Smarsh Connected Suite
Smarsh boasts the industry’s largest capture coverage across email, social media, mobile/text messaging, instant messaging and collaboration tools, websites, and voice. Its portfolio includes security, compliance, archiving, and e-discovery. It works with small businesses as well as the world’s largest banks and government entities. Its Connected Archive comes in three separate editions that retain communications in their native format with conversational context. The Smarsh Enterprise Archive uses web-scale technologies to ingest, search, and export content faster than legacy archives. It is built to scale as data volume grows.
- Fast, consistent access to data.
- Performance at scale.
- No single point of failure.
- Professional Archive is a single, secure, search-ready compliance and e-discovery archive for small and mid-size organizations. It includes communications capture and supervision, and e-discovery apps.
- Enterprise Archive is a cloud-native, context-aware, extensible archive for global enterprises with complex security, data privacy and regulatory requirements.
- Federal Archive is a text message archive designed specifically for U.S. federal agencies. Includes text message capture.
StrongBox is another vendor in the growing list of archive providers favoring the Active Archive Alliance approach. StrongBox StrongLink delivers a seamless active archive that works with any storage: disk, cloud, and tape. Its Autonomous Engines, with Lifecycle Management Services and Dynamic Data Mover, enable intelligent caching for hot data, with data placement optimization across any storage resource, location, vendor platform, or file system.
- Metadata-driven storage optimization intelligently offloads cool/cold data.
- Reduce storage costs by over 80% by cutting storage, DR and backup.
- Meet SLA & QoS with direct file access, no data rehydration.
- Integrate and optimize cloud and tape with zero recovery time objective.
- Verification, audit reports, and insights ensure visibility and data provenance.
- Dynamic Data Mover engine offers cross-platform migrations, tiering and archiving.
- Workflow engine provides automated workflows based on schedules and triggers.
- Visualization engine for alerting and trending.
- Metadata engine can extract, auto-classify, tag, organize, and manage data.
- Analytics engine provides intelligence and derived insights.
- Query engine can query globally across all storage.