Legacy Backup Catalog and Data Management

Catalyst from Index Engines unlocks content in IBM and Symantec backup formats (with Commvault support coming in 1H2015), allowing access to and management of the data without the need for the original backup software.

Clients that currently use IBM’s TSM or Symantec’s NBU now have a cost-effective and intelligent migration strategy to best-of-breed backup platforms. If a non-production TSM/NBU instance, resulting from a merger or acquisition, is maintained only to provide access to legacy tape data, these environments can be retired and replaced with a single solution that provides simplified access to the data going forward.

Additionally, clients with large volumes of legacy backup tapes in offsite storage vaults (Iron Mountain, Recall, etc.), containing data required for legal and compliance purposes, can now intelligently restore this data and extract the relevant records to disk for simplified access and management. The relevant records are typically a small portion of what is contained on tape, which differentiates Catalyst from a migration tool that moves all data off tape.


Index Engines Offers Complimentary 1 TB Software Licenses to Shed Light on User Data Growth

Index Engines has announced it now offers a complimentary 1 TB license of its Catalyst product so organizations can get a better grasp on their unstructured user data before they start budgeting for 2015.

This VMware-based, plug-and-play download gives organizations 30 days to perform a metadata and full-text profile of the LAN file data of their choice, reporting last accessed time, owner, created date, number of duplicates, file type and more, while performing PII pattern searches for credit card and Social Security numbers.
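
The PII pattern search described above can be sketched with simple regular expressions. The patterns below are illustrative assumptions, not Catalyst's actual detection logic; a production scanner would also need validation such as Luhn checks and support for more number formats:

```python
import re

# Illustrative patterns only (assumptions, not the product's real rules).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # e.g. 123-45-6789
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")          # 16 digits, optional separators

def scan_for_pii(text):
    """Return Social Security and credit-card-like numbers found in text."""
    return {
        "ssn": SSN_RE.findall(text),
        "card": CARD_RE.findall(text),
    }

hits = scan_for_pii("Contact: SSN 123-45-6789, card 4111 1111 1111 1111")
```

Run against each file's extracted text, a scan like this flags documents that warrant closer review before any disposition decision is made.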

“Most storage executives don’t know what they have, if it has value, if it poses a risk or liability, if it is a security violation or if it’s employee vacation photos and music libraries,” Index Engines VP Jim McGann said. “Catalyst will give them a peek into their data, understand growth sources, and develop intelligent disposition strategies in order to control costs and liability hidden in user content.”

Index Engines’ Catalyst software is designed to deliver a file-centric view of the data center. Catalyst processes all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and, optionally, what key terms are in it.

Leveraging the rich metadata or full text index in conjunction with Active Directory integration, content can be profiled and analyzed with a single click. High-level summary reports allow instant insight into enterprise storage providing unprecedented knowledge of data assets.

The limited-use license includes the data profiling features of Catalyst, but not the automated disposition and archiving capabilities of the full enterprise version of the software.

The no-cost Catalyst licenses are currently available to enterprise data centers on Index Engines’ website at http://www.indexengines.com/catalyst-trial.

The Cost of Abandoned Data

The average corporate turnover rate for employees is 15.1 percent across all industries, with some verticals experiencing rates as high as 30 percent. For an organization with 10,000 employees this can mean 1,500 to 3,000 departures annually (Compensation Force: 2013 Turnover Rates by Industry).

When an employee leaves an organization, the IT department will typically wipe or recycle the hard drive containing their digital files and email. However, IT often neglects to clean up and manage former employees’ data on corporate networks and servers.

In this scenario, a company of 10,000 with a conservative annual turnover of 1,500 employees could easily be left with 60 TB of abandoned data in the data center each year (roughly 40 GB per departing employee). Over 10 years this explodes to more than half a petabyte.

Abandoned data is unstructured files, email and other data owned by ex-employees that languishes on networks and servers. Gartner estimates that the 2013 average annual storage cost per raw TB of capacity is $3,212 (Gartner: IT Key Metrics Data 2014: Key Infrastructure Measures: Storage Analysis: Current Year, Dec. 2013). This can add up to millions of dollars in wasted expense over time.
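
The arithmetic behind those figures can be worked through in a few lines. The 40 GB of network data per departing employee is an assumption implied by the 1,500-leavers-to-60-TB example, not a number from Gartner:

```python
# Back-of-the-envelope model of abandoned-data storage cost.
LEAVERS_PER_YEAR = 1500
GB_PER_LEAVER = 40      # assumption: network data left behind per ex-employee
COST_PER_TB = 3212      # Gartner 2013 annual storage cost per raw TB ($)

abandoned_tb_per_year = LEAVERS_PER_YEAR * GB_PER_LEAVER / 1000  # 60 TB

# Cost of carrying one year's worth of abandoned data for one year.
year_one_cost = abandoned_tb_per_year * COST_PER_TB  # $192,720

# If nothing is ever cleaned up, each year's 60 TB keeps accruing cost,
# so year N carries N years' worth of abandoned data.
ten_year_cost = sum(abandoned_tb_per_year * y * COST_PER_TB for y in range(1, 11))
```

Under these assumptions the ten-year carrying cost exceeds $10 million, which is why even a rough cleanup policy pays for itself quickly.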

Abandoned data consists of old working documents that have long outlived their business value: revisions of letters, old spreadsheets, presentations and aged email. However, a small percentage of this content can easily contain sensitive files and email. It is this small percentage of contracts, confidential email exchanges, client records and similar documents that adds a level of risk and liability for the corporation.

The bulk of the data is typically what is known as redundant, outdated and trivial content – or ROT – that is simply taking up space and resulting in unnecessary management and data center costs.

The following are factors you will need to take into account in order to understand the cost impact of abandoned data:

Risk and Liability

The number one expense associated with abandoned data is the legal exposure created by not managing it. The risk and liability inherent in sensitive data, including client records, personally identifiable information (PII), or records required for eDiscovery or compliance, can cost a company millions, along with unwanted negative press and exposure.

Managing sensitive records is always a challenge; managing this content when the owner of the data is no longer an employee and no one knows it exists is an even more complex challenge. Think of the CEO’s former admin creating a PST archive of their email and storing it on some obscure server. It is difficult to put a value on this exposure, but it is something that should be keeping your legal and compliance teams up at night.

Storage Costs

In the example above, 60 TB of abandoned data can accumulate on corporate servers each year for a company of 10,000 employees. At the same time this data is cluttering the data center, organizations are increasing their storage capacity at a rate of 40-60 percent annually. Reclaiming this capacity by cleaning up abandoned data (most of it could disappear tomorrow and no one would miss it) is equivalent to getting free storage capacity. Since most IT budgets are shrinking, this is an easy way to make every dollar count.

Backup and Disaster Recovery

One of the hidden costs of not managing and controlling abandoned data is in corporate disaster recovery costs. The cost and resources required to ensure all data is backed up and protected is one of the more expensive line items on an IT budget.

Compressed backup windows, offsite storage costs and management of backup content all contribute to ever-growing data center resources. With abandoned data accounting for tens, even hundreds, of terabytes, it has become a significant component of the expenses associated with disaster recovery. Assuming a conservative 15 percent of the data being backed up annually no longer has any business value and should be moved offline or even remediated, cleaning it up can easily reduce disaster recovery costs by up to 50 percent on a server more than five years old.

Management Costs

Data is constantly migrated to new platforms or consolidated in order to streamline operations. Migrating and consolidating data is a constant and painful operation, and it becomes even more painful when you know that much of the data no longer has value. If 30-50 percent of the data migrated from a five-year-old storage platform to a new platform, or even the cloud, is owned by ex-employees, much of that effort is wasted.

Beyond a migration of data, day-to-day management of servers is a key task in any corporate data center. Reducing the volume of data under management will have a lasting impact on budgets and resources required to support the explosive growth of unstructured user data.

Untapped Knowledge

When a knowledge worker leaves the organization and their content converts from active to abandoned data, it instantly “disappears” into the network. Since no one owns this content, even content with long-term value to the organization can no longer be surfaced and leveraged by existing employees. Research data, competitive analysis and historical reports all get lost and can no longer provide value to the organization.

The cost of not leveraging existing corporate knowledge can be significant. In today’s competitive market, staying one step ahead is critical to maintaining and gaining market share. Arming your knowledge workers with all the data they need, including value-added content generated by ex-employees, will help maintain leadership in the market.

Data Profiling

Data profiling, also known as file analysis, uncovers abandoned data so it can be managed. Understanding what abandoned data exists is the first step in defining a data policy that can reclaim wasted expense and control long-term risk and liability of this unknown and unmanaged content.

In “Market Guide for File Analysis Software”, published September 23, 2014, Gartner recommends profiling data to gain a better understanding of the unstructured data environment and ROT including abandoned data, stating:

“Data visualization maps created by file analysis can be presented to other parts of the organization and be used to better identify the value and risk of the data, enabling IT, line of business, compliance, etc., to make more-informed decisions regarding classification, information governance, storage management and content migration. Once known, redundant, outdated and trivial data can be defensibly deleted, and retention policies can be applied to other data.”

Data profiling works by processing all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and, optionally, what key terms are in it.
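
As a rough illustration of the metadata side of this process, a hypothetical `profile_share` function (an assumption for this sketch, not the product's implementation) might walk a file share and build such an index; real products also resolve owners via Active Directory and add full-text indexing:

```python
import os
import time
from collections import defaultdict

def profile_share(root):
    """Walk a share and record basic metadata for every file found."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            index.append({
                "path": path,
                "ext": os.path.splitext(name)[1].lower(),
                "size": st.st_size,
                "owner_uid": st.st_uid,  # mapped to a user via AD/LDAP in practice
                "last_accessed": time.ctime(st.st_atime),
            })
    return index

def summarize_by_ext(index):
    """High-level summary report: total bytes stored per file type."""
    totals = defaultdict(int)
    for entry in index:
        totals[entry["ext"]] += entry["size"]
    return dict(totals)
```

Grouping the same index by owner instead of extension is what surfaces abandoned data: any sizable bucket belonging to an account that no longer exists in the directory is a candidate for disposition.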

High-level summary reports allow instant insight into enterprise storage, providing unprecedented knowledge of data assets. Through this process, mystery data can be managed and classified, including content that has outlived its business value or that is owned by ex-employees and is now abandoned on the network.

This simple, analyst-recommended process helps organizations reclaim up to 40 percent of active data capacity and mitigate the legal and compliance risks associated with unmanaged data.

Becoming Litigation Ready with Index Engines

Organizations have hoarded massive volumes of unstructured user files and email over decades. This unmanaged data is currently scattered across enterprise servers, network computers, user shares, SharePoint, email databases, and even legacy backup tapes, with much of it consisting of mystery content. Within this content there is a significant volume of data with no business value that can be purged, as well as sensitive content that contains intellectual property or could become a liability in future lawsuits.

Without detailed knowledge of this user content, organizations will continue to spend significant money and time managing and hoarding unknown data, stockpiled on massive servers, which only increases eDiscovery costs as well as future risk and liability.

Litigation Readiness® means actively managing data as well as the growing costs of storage and eDiscovery according to policy. Legal professionals are discovering the best way to battle this data growth is to leverage technology.

Using Index Engines’ Catalyst platform, organizations gain a high-level view into corporate user data, including files and email. Only with this knowledge can they easily manage corporate assets and identify and take action on responsive or sensitive data when necessary.

Discover more about litigation readiness here:

Meet us at ILTA

Are you attending ILTA’s 37th Annual Educational Conference?

We are, and we’d love to see you there! Index Engines has some exciting new product enhancements and a pricing model built for partners to meet and exceed their clients’ evolving needs, including:

– Producing data from tapes in response to a legal event or court orders.
– Creating a repository of legal hold data that is easily accessible, forensically defensible and cost effective.
– Producing information about their data for assessments in a manageable platform.
– Determining what, if any, legal liability may reside in the data contained in their infrastructure.

If you’re attending ILTA and would like to learn more about our technology or discuss opportunities to work together, please contact Michelle.King@indexengines.com.

Also, if you’re not attending, but would still like to talk, contact:

Michelle King
Business Development Manager
Index Engines

Email Lives Forever, Except When it’s Gone: An information technology take on Lois Lerner’s and all other lost email

When headlines hit that the IRS was missing data, most information technology professionals jumped to the same question: how could data ever really be gone?

Data loss is not common in this day and age, as millions of dollars have gone into most organizations’ data centers just to make sure data doesn’t get lost. In fact, the opposite can actually be the issue: there are too many copies of the same data.

For example, a sent email is immediately stored on the sender’s and receiver’s PCs as well as the mail server. Nightly, each mailbox is backed up on the company’s server for disaster recovery, in case of a computer crash or something more disruptive. After a week goes by, there are seven or more copies of that email stored somewhere. In addition, there’s a good chance that a copy of that email has been moved to an archive for long-term retention, given Lerner’s senior status.

If that email is lost from the desktop or email server, many copies should still exist in other locations. There are few feasible ways data could ever really be gone, even if a company attempted to destroy it (Enron taught us that), but breakdowns in policy and a lack of information management and data center search solutions can make it effectively lost.

Where does data go

Most mid- to large-sized organizations store copies of their legacy data on a troublesome format called backup tape. Resembling a VHS tape cut in half, sets of backup tapes receive the data backed up nightly from all the servers.

Unlike a VHS that may be recorded over many times, these tapes are permanent and quickly fill up. Some are stored within a company’s data center, but the bulk of this data is sent to offsite vaults meant to house and protect these tape archives.

Retrieving that data isn’t as simple as putting a VHS in a VCR. Systems advanced and organizations changed storage vendors over the last 20 years, making many of these tapes inaccessible, as they were originally recorded with proprietary technology that wasn’t compatible with other vendors’ products.

The original software to access some tapes hasn’t been available for over 10 years, requiring either specialty direct tape indexing technology or expensive restoration of the original software environment. In addition, knowing which data is where at a company with thousands of employees, years later, is no easy task and could be claimed as burdensome in a less high-profile situation.

The less certainty there is about where the data lives, the longer and more costly the search becomes, but the data still exists – somewhere.

Why can’t we find data

The backup environment at these organizations is massive and finding needed data is traditionally a long, expensive process that is only compounded by the breakdown in corporate policy.

Managing corporate data should be a unified effort between the IT department, legal team and records management, but in actuality each assumes only its own part, creating large policy gaps.

Without this proactive communication and a partnership between legal and IT, IT will continue to store information that no longer has business value but can turn into a liability. eDiscovery costs (finding and collecting data) will also remain high, as every request triggers a new, time-consuming search through thousands of legacy tapes.

In the past, if legal asked IT what data exists and where, there would be a blank response. If IT asked legal about data policies, what to keep and what to dispose of, the answer would not come easily, and each department did “their” job. IT stores the data. Legal requests data. Records management recommends policy. Legal and IT can’t decide who implements policy, so no one does.

The data stays on legacy tape, but no one knows exactly where.

Perception versus reality

The lost Lerner emails should serve as a wakeup call for enterprises to understand the lifecycle of data. In order for data to truly “no longer exist,” an organization would need to access all environments (all those backup tapes) and apply a defensible deletion policy. Otherwise claiming that data is “gone” is a weak excuse.

However, permanently removing email can be done, and is actually a beneficial way to control the long-term risk of aged data once it outlives its business, compliance and legal value. This isn’t a back-door ad-hoc job of users hitting a delete key or dumping tape in shredders, but firm policy dictated by those who are charged with protecting the company from any liability.

Deleting corporate data must be done under the guidance of legal and records management professionals, with the key challenge of ensuring the enterprise keeps what is required for regulatory, compliance and legal purposes while disposing of data that could be misinterpreted or cause a security breach.

The only accurate and defensible way to get rid of data in a corporation is to define a solid policy, and apply it to not only your current production data, but the legacy data as well. By ignoring the legacy data, all an organization does is lose some of the copies. No data should ever be lost. It should be archived, managed or purged.

***for more information on backup tapes, or for quotes contact info@indexengines.com***

Manage Risk and Costs Through Proactive Management and Archiving of User Data

Organizations have accumulated significant volumes of user data over the years. Documents, spreadsheets and email communications all contain critical intellectual property and sensitive content that could become a liability over time. Some of this data is properly managed; however, a significant volume of unmanaged user files and email exists that contains highly sensitive information that can result in massive legal fines and sanctions.

In order to manage the expense of storing large volumes of unnecessary data, along with the long-term risk of this content, organizations are proactively implementing new information management strategies. These strategies include purging irrelevant and outdated data and securing important business records in archives, according to policy.

Read about our Information Governance Platform here

Are you Litigation Ready?

Developing a sound corporate policy that satisfies legal, compliance and regulatory requirements is critical in today’s climate for managing risks and eDiscovery costs. Index Engines prepares companies by helping them create a Litigation Ready environment.

A Litigation Ready® environment is one where the data required to support a lawsuit is readily accessible and in a legal hold archive. This approach eliminates the pain and expense around identifying and collecting data under extremely tight timeframes when litigation arises.

Any solution implemented for litigation readiness must support eDiscovery and litigation hold requests on demand, support ongoing preservation policies for individual users and provide a view into user data for early-data assessment and risk analysis.

Find out how to achieve Litigation Readiness here

Bridging the Gap Between Data and Policy

Join Doculabs and Index Engines for this live webinar Thursday, May 15 at 2:00 pm ET/11:00 am PT.

Legal, IT and RIM professionals are different cogs in a compliance machine that never seem to fit together.

Legal needs data for eDiscovery, RIM wants to set data policy and IT is storing massive amounts of data.

No one knows what really exists and where.

With the help of leading technology and pro-active policy setting, you can.

Organizations can achieve not only increased security, but also industry compliance and litigation readiness.

During this 45-minute webinar, you’ll discover how to:

- Define data that is abandoned, lost or sensitive
- Set policy for compliance and legal data
- Enforce disposition to assist legal, RIM and storage

Register here

Webinar: SharePoint indexing & more ways to accelerate time to data

Traditional SharePoint eDiscovery requires a copy to be made to disk before processing can occur – wasting valuable time and money your clients don’t have.

But now you can directly index SharePoint. Discover how to get no-copy indexing of SharePoint and other ways to accelerate time to data in 2014 during this 30-minute web event.

During this 30-minute webinar you’ll discover insider knowledge on:
– Cost Effective ESI Collection
– Direct ESI Extraction from Backup Tape
– Intelligent Culling and Preservation
- Early Data Assessment
– Selective SharePoint ESI Collection
