Index Engines Collaborates with EMC, Launches Workshop for Legacy Tape Data Management

EMC and Index Engines launch a new workshop that delivers a customized analysis of existing legacy backup data environments to determine the optimal tape to disk mitigation plan that reduces risk and lowers costs associated with maintaining access to legacy backup data.

LAS VEGAS and HOLMDEL, N.J.– Information management company Index Engines has teamed up with EMC to launch a new workshop that delivers organizations an intelligent analysis of their legacy backup tape data and assists in the development of an information governance strategy to migrate sensitive data required by legal and compliance to disk for improved management and access. Index Engines is a Select partner in the EMC Business Partner Program for Technology Connect.

The Workshop for Legacy Tape Data Access Service, delivered through EMC® professional services, leverages Index Engines’ Catalyst software to ingest legacy tape catalogs, analyze and report on the metadata of their content, and define a disposition strategy that cost-effectively restores data required by legal and compliance to disk, allowing tapes to be remediated.

“Through this workshop, EMC can deliver knowledge of legacy tape data along with an intelligent disposition strategy that supports clients’ information governance needs,” said Jim Clancy, Senior Vice President, Global Sales, EMC Data Protection Solutions.

This new workshop provides the data necessary to make an informed decision on migration options given a customer’s actual circumstances and provides solutions for two key pain points including:

– Offering a simplified access point and disposition options to clients who have significant pain around legacy tape data, including costs associated with managing, archiving and restoring data in support of legal, compliance and regulatory requirements.

– Providing clients who are managing multiple non-production backup environments a more effective means for validating the optimal method for maintaining access to legacy tape data.

During the workshop, Index Engines’ Catalog Engine will directly ingest TSM, NetBackup or CommVault backup catalogs. The Index Engines solution delivers full reporting and analysis of the content along with direct restoration of files and email without the need for the original backup software.

This EMC-run workshop will provide details back to the customer to enable them to make an informed decision regarding how to mitigate challenges associated with maintaining legacy data access.

Then, EMC will develop metadata level reports on the tape contents along with a disposition strategy recommendation, focusing on migration of valuable data to disk-based archive and retirement of the legacy platform and remediation of tapes.

“Legacy tape content has become a legal and security risk, especially for highly regulated organizations including financial services, healthcare, government, and energy firms,” said Jim McGann, Vice President, Index Engines.

The workshop is available this week through EMC. Please contact your EMC or EMC Partner Sales representative for more details.

About Index Engines

Index Engines provides unprecedented file-level knowledge to manage the growing costs and risks associated with unstructured user data.

EMC is a registered trademark or trademark of EMC Corporation in the United States and/or other countries.

All products mentioned are trademarked by their respective organizations.

Index Engines Announces Support of CommVault Catalog Ingestion and Management

The latest release of Index Engines’ Catalyst software enables companies to search, report and maintain access to data on tape without the need for the original CommVault software

HOLMDEL, N.J., April 23, 2015 /PRNewswire/ — Index Engines, the leader in enterprise information management and archiving solutions, announced streamlined access to and management of CommVault backup catalogs and tapes Thursday with its latest version of its Catalyst software.

The added support of CommVault backup data helps control costs and risks associated with legacy data on tape. Catalyst manages the legacy catalog and delivers a search view into the tape content allowing individual files and email to be found and extracted without the original backup software, enabling the retirement of legacy applications and providing transition to a new backup platform.

“The latest Catalyst release gives organizations more freedom to manage their backup software environment and make migration decisions without feeling locked into their current provider,” Index Engines Vice President Jim McGann said.

The software, which also supports IBM’s Tivoli Storage Manager and Symantec’s NetBackup, manages the legacy catalog and provides search and reporting of the backup catalog metadata and policies. Restoration of content is also supported, including mailboxes or individual email, without the need for the original backup software.

Index Engines also provides direct tape indexing that captures deeper metadata from tape content or, optionally, full text of unstructured data and email including Microsoft Exchange and IBM Notes databases.

Since its release, Catalyst has been enabling data centers to:

– Ingest catalogs allowing for metadata level reporting and classification of data in order to understand what exists on tape archives.
– Retire non-production instances of backup software while maintaining access to legacy tape data.
– Support corporate data governance policies with respect to legacy tape data, developing policies and strategies for defensible disposition of sensitive files and email.
– Migrate data of value from tape to archives for long-term preservation to support eDiscovery and information governance policies, and remediate tapes from offsite storage.
– Simplify backup infrastructure inherited through upgrades, mergers or consolidations, and still maintain access to the tapes.

“Our goal is to allow clients to control costs associated with managing legacy tape archives, and allow clients to easily make a change in backup strategies without losing access to data on legacy tape archives,” McGann added.

Catalyst starts at $10,000, which includes the engine and ability to process up to 2,500 backup tapes.

An Intelligent Approach to Managing Risk and Liability of Legacy Backup Data

How eliminating tape as a long-term archive mitigates legacy data risks

The most significant expense and pain associated with stockpiling legacy data on backup tape or disk archives is the risk of the unknown. Backup images are archives of all user data, including email, text documents and PDFs from the CEO, to contracts from legal and manufacturing, to documents from research and discovery.

When these highly-sensitive records are not managed properly – archived, encrypted or even purged – they could be requested to support litigation and regulatory compliance. These potential “smoking guns” could cost your organization millions in fines along with even more in embarrassment and loss of public trust.

Managing legacy business records properly will allow mitigation of risk and control of future potential expense.

In the past, data on old backup tapes has been difficult to access. As tapes age they become more and more inaccessible, and if you need to know what is on these tapes, or restore files and emails in support of legal and compliance requirements, it can be extremely complex and expensive.

Index Engines provides an intelligent method of managing and restoring content from legacy backup tapes. By eliminating the need for the backup software that created the tape, Index Engines can ingest legacy backup catalogs and provide reports and analysis of the contents in order to determine a disposition strategy.

Additionally, Index Engines can scan tapes and provide deep intelligence, including content metadata, so individual files and mailboxes can be extracted without the original backup software. Leveraging this new approach towards backup data access, Index Engines eliminates traditional tape restoration and intelligently manages legacy data using a more cost-effective approach.

With intelligent knowledge and access to legacy backup data there is no need to maintain non-production backup software. Additionally, tape and disk archives can be analyzed and a disposition strategy can be defined that secures sensitive data and eliminates what no longer has value to the business.

As a result, the number of data center inefficiencies can be reduced and wasted costs can be recovered and reallocated to other critical initiatives.

The process of determining what legacy backup content has value and what is redundant, outdated and trivial is not as complex as one may think. If you can develop a policy of what should be preserved, which requires input from the legal and records management team, this would result in restoring less than 1% of the legacy backup content.

Even if you haven’t determined the policy of what to keep, a single instance of legacy tape data can be restored to a disk based archive and then retention policies can be defined. The right scenario is dependent on how sound your data retention policy is.

Some organizations can be very specific as to what should be preserved (based on content type, owner, and date range); others may have no detailed policy, or a “save everything” policy. Either way, a solution exists to migrate and secure data of value online and eliminate the use of tape as a long-term archive.
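To make the idea of a specific preservation policy concrete, here is a minimal Python sketch of filtering catalog entries by content type, owner and date range. The `CatalogEntry` and `RetentionPolicy` names and their fields are hypothetical, chosen only for illustration; they are not part of Catalyst or of any backup catalog format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """One file record from a backup catalog (hypothetical layout)."""
    path: str
    owner: str
    file_type: str
    modified: date


@dataclass(frozen=True)
class RetentionPolicy:
    """What legal and records management want preserved (illustrative)."""
    file_types: frozenset
    owners: frozenset
    start: date
    end: date

    def should_preserve(self, entry: CatalogEntry) -> bool:
        # An entry is kept only if it matches all three criteria.
        return (
            entry.file_type in self.file_types
            and entry.owner in self.owners
            and self.start <= entry.modified <= self.end
        )


def select_for_restore(entries, policy):
    """Return the subset of catalog entries the policy says to restore."""
    return [e for e in entries if policy.should_preserve(e)]
```

With a policy this narrow, only the matching entries would be restored to disk, and everything else stays on tape until a disposition decision is made.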

Developing a Data Disposition Strategy: A Case Study in User Shares

Join us for a 30-minute Tech Break
Jan. 29 at 1 pm ET/10 am PT

One international financial services company discovered that 34% of their data was duplicate, 68% of the data had not been touched in over 3 years and they had nearly 17,200 individual files containing PII on their main network.

While most organizations share similar network profiles, what this company did next reduced their costs and mitigated untold millions in risk.

Leveraging metadata and the latest technology, discover how they developed a data disposition strategy and reclaimed 54% of their capacity and implemented a value-based archive.

Join us for a 30-minute tech break Thursday, Jan. 29 at 1 pm ET/10 am PT to see how they did it and what you can learn from their user data mistakes.

Index Engines Backup Migration Solutions Now Available through EMC

As a Select Partner in the EMC Business Partner Program for Technology Connect Partners, Index Engines today announced its Catalyst platform is now available through EMC and its channel partners.

Catalyst enables EMC clients to seamlessly transition to EMC®’s Data Protection Suite featuring NetWorker® and Avamar® solutions while still maintaining access to their legacy backup data without the need for the original software.

Catalyst offers a range of solutions including catalog management, data restoration without the need of original backup software, and single instance migration of tape data to disk to support retention requirements. Intelligent management and access to legacy backup data supports information governance and compliance requirements, while controlling data center costs.

“We’re thrilled with our enhanced partnership with EMC and the value we’ll jointly be able to deliver to customers,” Index Engines Vice President Jim McGann said. “Companies are no longer forced to maintain non-production backup environments out of fear of compliance or eDiscovery requests. They can now maintain easy, searchable access to backup data while transitioning to state-of-the-art EMC technology.”

Catalyst enables clients to deploy EMC’s Data Protection Suite, retire their existing backup vendor and still maintain access to legacy data without the need for the original backup software. The Catalog Engine searches, manages and generates reports on legacy data with the Catalyst Tape Indexes utilized to deeply index the content so it can be searched and restored.

Catalyst also delivers comprehensive metadata search and migration tools in order to restore a single instance of tape data to disk without the need for the original backup software. This allows clients facing the risks and costs of using tape as a long-term archive to efficiently migrate this sensitive data to EMC disk for long-term preservation and management. Once this migration is complete, the legacy tapes can be remediated, recouping offsite tape storage expenses.

“EMC is pleased that Index Engines has joined EMC Technology Connect as a Select Partner, demonstrating its commitment to excellence in technology innovation for data protection and migration,” said Don Lamburn, Director, EMC Technology Connect. “We look forward to working with Index Engines to ensure that our mutual customers have the highest level of support possible for their information infrastructure initiatives.”

10 Mission-Critical Unstructured Data Projects To Control Costs and Streamline Operations in 2015

Everyone’s talking about unstructured data lately – the cost, the risk, the massive growth – but little is being done to manage it.
Analyst group IDC estimates unstructured data growth at 40-60 percent per year, a statistic that is not only startling, but puts a great deal of emphasis on the need to start managing it today or at least have it on the schedule for 2015.

With budgets tightening – often to pay for storage costs – data center managers are struggling to find the highest impact projects that will see an immediate ROI. While there’s no one project that will reclaim all of the unstructured data rotting away in the data center, there are 10 crucial projects that will help streamline and control costs in the data center.

1. Clean up abandoned data and reclaim capacity: When employees leave the organization, their files and email languish on networks and servers. With the owner no longer available to manage and maintain the content it remains abandoned and clogs up corporate servers. Data centers must manage this abandoned data to avoid losing any valuable content and to reclaim capacity.

2. Migrate aged data to cheaper storage tiers: As data ages on the network it can become less valuable. Storing data that has not been accessed in three years or longer is a waste of budget. Migrate this data to less expensive storage platforms. Aged data can represent between 40% of current server capacity.

3. Defensively remediate legacy backup tapes and recoup offsite storage expenses: Old backup tapes that have piled up in offsite storage are a big line item on your annual budget. Using unstructured data profiling technology these tapes can be scanned, without the need of the original backup software, and a metadata index of the contents generated. Using this metadata profile relevant content can be extracted and archived and the tapes can be defensibly remediated, reclaiming offsite storage expenses.

4. Purge redundant and outdated files and free up storage: Network servers can easily contain 35 – 45% duplicate content. This content builds over time and results in wasted storage capacity. Once duplicates are identified, a policy can be implemented to purge what is no longer required, such as redundant files that have not been accessed in over three years, or those owned by ex-employees.

5. Profile and move data to the cloud: Many data centers have cloud initiatives where aged and less useful business data is migrated to more cost-effective hosted storage. Finding the data and on-ramping it to the cloud, however, is a challenge if you lack understanding of your data: who owns it, when it was last accessed, types of files, etc.

6. Audit and remove personal multimedia content (e.g., music, video) from user shares: User shares become a repository not only of aged and abandoned files, but also of personal music, photo and video content that has no value to the business and in fact may be a liability. Once this data is classified, reports can be generated showing the top 50 owners of this content, total capacity and location. This information can be used to set and enforce quotas and work with the data owners to clean up the content and reclaim capacity.

7. Archive sensitive content and support eDiscovery more cost effectively: Legal and compliance requests for user files and email can be disruptive and time consuming. Finding the relevant content and extracting it in a defensible manner is the key challenge. Streamlining access to critical data so you can respond to legal requests more quickly not only lessens their time burden but also saves you time and money during location efforts.

8. Audit and secure PII to control risk: Users don’t always abide by corporate data policies. Sharing sensitive information containing client Social Security and credit card numbers, such as tax forms, credit reports and applications, can easily happen. Find this information, audit email and servers, and take the appropriate action to ensure client data is secure. Some content may need to be relocated and moved to an archive, encrypted or even purged from the network. Managing PII ensures compliance with corporate policies and controls liability associated with sensitive data.

9. Manage and control liability hidden in PSTs: Email contains sensitive corporate data including communications of agreements, contracts, private business discussions and more. Many firms have email archives in place to monitor and protect this data; however, users can easily create their own mini-archive, or PST, of content that is not managed by corporate. PSTs have caused great pain in litigation when email that was thought to no longer exist suddenly appears in a hidden PST.

10. Implement accurate charge-backs based on metadata profiles and Active Directory ownership: Chargebacks allow the data center to accurately recoup storage expenses and work with the departments to develop a more meaningful data policy, including purging what they no longer require.

There are a number of ways companies can approach these projects, but to maximize impact they can use file-level metadata tools, an approach sometimes referred to as unstructured data profiling.

From this file-level information, the date, owner, location, file type, number of copies and last-accessed time of each file can be determined, which helps data center managers classify data and put disposition policies in place.
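As a rough illustration of how such file-level metadata can drive a disposition policy, the Python sketch below summarizes duplicate and aged capacity from a list of file records. The record fields (`content_hash`, `size`, `last_accessed`) and the three-year default threshold are assumptions made for this example, not a description of any specific profiling product.

```python
from collections import defaultdict
from datetime import date


def profile(records, today, aged_after_days=3 * 365):
    """Summarize duplicate and aged capacity from file-level metadata.

    Each record is a dict with hypothetical keys:
    'content_hash' (str), 'size' (bytes), 'last_accessed' (date).
    """
    # Group files by content hash; every copy after the first is redundant.
    by_hash = defaultdict(list)
    for rec in records:
        by_hash[rec["content_hash"]].append(rec)
    duplicate_bytes = sum(
        rec["size"] for group in by_hash.values() for rec in group[1:]
    )

    # Files not accessed within the threshold count as aged.
    aged_bytes = sum(
        rec["size"]
        for rec in records
        if (today - rec["last_accessed"]).days > aged_after_days
    )

    return {
        "total_bytes": sum(rec["size"] for rec in records),
        "duplicate_bytes": duplicate_bytes,
        "aged_bytes": aged_bytes,
    }
```

A file can be both duplicate and aged, so the two totals overlap and should not simply be added together when estimating reclaimable capacity.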

The benefits of managing unstructured data include reduced risk, reclaimed capacity and reclaimed data center budget. With finances already tight and data growing rapidly, don’t leave these projects off the schedule in 2015.

Legacy Backup Catalog and Data Management

Catalyst from Index Engines unlocks content in IBM and Symantec backup formats (with Commvault support coming in 1H2015) allowing access and management of the data without the need for the original software.

Clients that currently use IBM’s TSM or Symantec’s NBU now have a cost effective and intelligent migration strategy to best of breed backup platforms. If a non-production TSM/NBU instance, resulting from a merger or acquisition is maintained in order to provide access to legacy tape data, these environments can be retired and replaced with a single solution that provides simplified access to the data going forward.

Additionally, clients who have large volumes of legacy backup tapes in offsite storage vaults (Iron Mountain, Recall, etc.), containing data required to support legal and compliance, can now intelligently restore this data and extract the records that are relevant to disk for simplified access and management. This is typically a small portion of what is contained on tape, which differentiates Catalyst from a migration tool where all data is moved from tape.


Index Engines Offers Complimentary 1 TB Software Licenses to Shed Light on User Data Growth

Index Engines has announced it now offers complimentary 1 TB licenses of its Catalyst product so organizations can get a better grasp on their unstructured user data before they start budgeting for 2015.

This VMware-based plug and play download gives organizations 30 days to perform a metadata and full-text profile on the LAN file data of their choice, giving them information on last accessed time, owner, created date, number of duplicates, file type and more while performing PII pattern searches for credit card and Social Security numbers.
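A PII pattern search of this kind can be approximated with regular expressions. The Python sketch below is a generic illustration, not Catalyst's implementation: the patterns and the `find_pii` helper are assumptions made for this example, and the Luhn checksum is a common way to cut false positives on card-like digit runs.

```python
import re

# U.S. Social Security number written as 123-45-6789 (format check only).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Find SSN-like strings and Luhn-valid card-like numbers in text."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            cards.append(digits)
    return {"ssn": SSN_RE.findall(text), "card": cards}
```

Pattern matches are candidates rather than confirmed PII; a real audit would review hits in context before relocating, encrypting or purging anything.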

“Most storage executives don’t know what they have, if it has value, if it poses a risk or liability, if it is a security violation or if it’s employee vacation photos and music libraries,” Index Engines VP Jim McGann said. “Catalyst will give them a peek into their data, understand growth sources, and develop intelligent disposition strategies in order to control costs and liability hidden in user content.”

Index Engines’ Catalyst software is designed to deliver a file-centric view of the data center. Catalyst processes all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and, optionally, what key terms are in it.

Leveraging the rich metadata or full text index in conjunction with Active Directory integration, content can be profiled and analyzed with a single click. High-level summary reports allow instant insight into enterprise storage providing unprecedented knowledge of data assets.

The limited-use license features the data profiling features of Catalyst, but does not include the automated disposition and archiving capabilities of the full version of the enterprise-ready software.

The no-cost Catalyst licenses are currently available to enterprise data centers on Index Engines’ website at

The Cost of Abandoned Data

The average corporate turnover rate for employees is 15.1 percent across all industries, with some specific verticals experiencing as high as 30%. For an organization with 10,000 employees this can account for 1,500 to 3,000 people annually (Compensation Force: 2013 Turnover Rates by Industry).

When an employee leaves an organization, the IT department will typically wipe or recycle the hard drive containing their digital files and email. However, IT often neglects to clean up and manage former employees’ data on corporate networks and servers.

In this scenario, a company of 10,000 employees with the conservative annual turnover of 1,500 could easily see 60 TB of data abandoned in the data center each year. Over 10 years this explodes to beyond half a petabyte.

Abandoned data is unstructured files, email and other data owned by ex-employees that languishes on networks and servers. Gartner estimates that the 2013 average Annual Storage Cost per Raw TB of capacity is $3,212 (Gartner: IT Key Metrics Data 2014: Key Infrastructure Measures: Storage Analysis: Current Year, Dec. 2013). This can account for millions of wasted expenses each year.
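The figures above can be combined into a back-of-the-envelope estimate. The Python sketch below assumes that nothing is cleaned up, that 60 TB is abandoned every year, and that Gartner's 2013 per-raw-TB cost stays flat and applies directly to abandoned data; those are simplifying assumptions for illustration, not claims from the cited reports.

```python
ABANDONED_TB_PER_YEAR = 60   # from the 10,000-employee example
DEPARTURES_PER_YEAR = 1_500  # conservative turnover in the same example
COST_PER_TB = 3_212          # Gartner 2013 annual storage cost per raw TB (USD)


def abandoned_tb(years: int) -> int:
    """Abandoned data held after the given number of years, with no cleanup."""
    return ABANDONED_TB_PER_YEAR * years


def annual_cost(years: int) -> int:
    """Storage cost in that year for all abandoned data accumulated so far."""
    return abandoned_tb(years) * COST_PER_TB


def cumulative_cost(years: int) -> int:
    """Total spent on storing abandoned data across all years so far."""
    return sum(annual_cost(y) for y in range(1, years + 1))


if __name__ == "__main__":
    per_departure_gb = ABANDONED_TB_PER_YEAR * 1_000 / DEPARTURES_PER_YEAR
    print(f"Implied data per departing employee: {per_departure_gb:.0f} GB")
    for year in (1, 5, 10):
        print(year, abandoned_tb(year), annual_cost(year), cumulative_cost(year))
```

Under these assumptions the first year's 60 TB costs $192,720 to store; by year 10 the accumulated 600 TB costs $1,927,200 per year, and the cumulative spend over the decade is $10,599,600.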

Abandoned data consists of old working documents that have long outlived their business value: revisions of letters, old spreadsheets, presentations and aged email. However, a small percentage of this content can easily contain sensitive files and email. It is this small percentage of contracts, confidential email exchanges, client records and other similar documents, which adds a level of risk and liability for the corporation.

The bulk of the data is typically what is known as redundant, outdated and trivial content – or ROT – that is simply taking up space and resulting in unnecessary management and data center costs.

The following are factors you will need to take into account in order to understand the cost impact of abandoned data:

Risk and Liability

The number one expense associated with abandoned data is the legal exposure created by not managing abandoned user data. The risk and liability inherent in sensitive data including client records, personally identifiable information (PII), or records required for eDiscovery or compliance can cost a company millions along with unwanted negative press and exposure.

Managing sensitive records is always a challenge; however, managing this content when the owner of the data is no longer an employee and no one knows it exists is an even more complex challenge. Think of the CEO’s former admin creating a PST archive of their email and storing it on some obscure server. It is difficult to put a value on this exposure, but it is something that should be keeping your legal and compliance teams up at night.

Storage Costs

In the example above, 60 TB of abandoned data can accumulate on corporate servers each year for a company of 10,000 employees. While this data clutters the data center, organizations are increasing their storage capacity at a rate of 40-60 percent annually. Reclaiming this capacity by cleaning up abandoned data (most of which could disappear tomorrow and no one would miss it) is equivalent to getting free storage capacity. Since most IT budgets are decreasing, this is an easy way to make every dollar count.

Backup and Disaster Recovery

One of the hidden costs of not managing and controlling abandoned data is in corporate disaster recovery costs. The cost and resources required to ensure all data is backed up and protected is one of the more expensive line items on an IT budget.

Compressed backup windows, offsite storage costs and management of backup content all contribute to ever-growing data center resources. With abandoned data accounting for tens, even hundreds, of terabytes, it has become a significant component of the expenses associated with disaster recovery. Assuming a conservative 15 percent of backed-up data loses its business value each year and should be moved offline or even remediated, managing it can easily reduce disaster recovery costs by up to 50 percent on a server over five years old.

Management Costs

Data is constantly migrated to new platforms or consolidated in order to streamline operations. Migrating and consolidating data is a constant and painful operation. It becomes even more painful when you know that much of the data no longer has value. If 30-50 percent of the data migrated from a five-year-old storage platform to a new storage platform, or even the cloud, is owned by ex-employees, much of this effort is wasted.

Beyond a migration of data, day-to-day management of servers is a key task in any corporate data center. Reducing the volume of data under management will have a lasting impact on budgets and resources required to support the explosive growth of unstructured user data.

Untapped Knowledge

When a knowledge worker leaves the organization and their content converts from active to abandoned data it instantly “disappears” into the network. Since no one owns this content, even content that has long-term value to the organization, it can no longer be exposed and leveraged by existing employees. Research data, competitive analysis and historical reports all get lost and can no longer provide value to the organization.

The cost of not leveraging existing corporate knowledge can be significant. In today’s competitive market staying one step ahead is critical to maintaining and gaining market share. Arming your knowledge workers with all the data they need, including value added content generated by ex-employees, will help maintain leadership in the market.

Data Profiling

Data profiling, also known as file analysis, uncovers abandoned data so it can be managed. Understanding what abandoned data exists is the first step in defining a data policy that can reclaim wasted expense and control long-term risk and liability of this unknown and unmanaged content.

In “Market Guide for File Analysis Software”, published September 23, 2014, Gartner recommends profiling data to gain a better understanding of the unstructured data environment and ROT including abandoned data, stating:

“Data visualization maps created by file analysis can be presented to other parts of the organization and be used to better identify the value and risk of the data, enabling IT, line of business, compliance, etc., to make more-informed decisions regarding classification, information governance, storage management and content migration. Once known, redundant, outdated and trivial data can be defensibly deleted, and retention policies can be applied to other data.”

Data profiling works by processing all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and, optionally, what key terms are in it.
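As a simplified picture of what building such an index involves, the Python sketch below walks a directory tree and records basic metadata for each file using only the standard library. It is an illustration under stated assumptions, not how Catalyst works: the `build_index` name and record layout are hypothetical, the owner is reported as a numeric POSIX user ID rather than an Active Directory identity, and last-accessed times depend on how the file system is mounted.

```python
import os
from datetime import datetime, timezone


def build_index(root):
    """Walk a directory tree and collect basic metadata for every file."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            index.append({
                "path": path,                                      # where it is located
                "owner_uid": st.st_uid,                            # who owns it (0 on Windows)
                "size": st.st_size,                                # bytes on disk
                "file_type": os.path.splitext(name)[1].lower(),   # what exists
                "last_accessed": datetime.fromtimestamp(
                    st.st_atime, tz=timezone.utc
                ),
            })
    return index
```

An index like this can then be queried for aged files, large owners or specific file types; full-text indexing of key terms would be a separate and much heavier step.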

High-level summary reports allow instant insight into enterprise storage, providing previously unavailable knowledge of data assets. Through this process, mystery data can be managed and classified, including content that has outlived its business value or that is owned by ex-employees and is now abandoned on the network.

This simple and analyst-recommended process helps organizations reclaim up to 40% of active data capacity and mitigates legal and compliance risks associated with unmanaged data.

Becoming Litigation Ready with Index Engines

Organizations have hoarded massive volumes of unstructured user files and email over decades. This unmanaged data is currently clustered away in enterprise servers, network computers, user shares, SharePoint, email databases, and even legacy backup tapes, with much of it consisting of mystery content. Within this content there is a significant volume of data with no business value that can be purged, as well as sensitive content that contains intellectual property or liabilities in future lawsuits.

Without detailed knowledge of this user content, organizations will continue to spend significant money and time managing and hoarding unknown data, stockpiled on massive servers, which only increases eDiscovery costs as well as future risk and liability.

Litigation Readiness® means actively managing data as well as the growing costs of storage and eDiscovery according to policy. Legal professionals are discovering the best way to battle this data growth is to leverage technology.

Using Index Engines’ Catalyst platform, organizations gain a high-level view into corporate user data including files and email. Only with this knowledge can they easily manage corporate assets and identify and take action on responsive or sensitive data when necessary.

Discover more about litigation readiness here: