Go Deep for eDiscovery Solutions – Commentary

In a recent issue of Legal Tech Newsletter, Browning Marean, a partner at DLA Piper in San Diego, Calif., published an article outlining the impact of the FRCP amendments on ESI production and the risks of doing manual eDiscovery. In the article, Marean outlines some key areas to consider when selecting an eDiscovery solution. Index Engines' market-leading eDiscovery tools address these major points of concern.

eDiscovery performed with Index Engines is the most cost-effective approach on the market. Accessing tape data through service providers that restore entire tapes, rather than only the responsive data, can easily cost millions of dollars, as the article points out. The Tape Engine performs a fully automatic catalog, index and search of data on backup tapes for a fraction of the cost of standard eDiscovery services.

Marean also outlines the reasons to consider proactive eDiscovery. The ability to review data assets upfront and intelligently pursue a course of action when faced with a litigation event is undeniably the way to go. With Index Engines, you can proactively index your data on backup tape, and also your stored network data. With this index you will have full insight into your enterprise data, allowing for proactive, on-going risk assessment.

Index Engines' cost-effective eDiscovery platforms for offline tape data and online stored data have been developed to address the full spectrum of data discovery. Our clients are ready to produce ESI upon court request in a timely manner, and also to review their data on an ongoing basis to avoid future litigation whenever possible.

Changing Litigation Hold

Index Engines recently spoke with an attorney at a well-respected West Coast law firm, and he brought up an interesting point. When a case begins, the law firm sends a memo to the client initiating the lock down of email and files that could be relevant to the case. This lock down applies to the online production data, which the client can easily modify or tamper with. Thus, even though the data is locked down, the law firm has no assurance that it has not been manipulated.

However, our legal partner had a different idea about how to preserve this locked down data. The most current data, the data that the law firm needs locked down, has in most cases been backed up onto tape the night before the litigation hold was issued. Tape data has not typically been easily accessible. Now that Index Engines makes it accessible, why not lock down the tape data, which is far more difficult to tamper with? Locking down the tape data provides an accurate view into the data in question and ensures it has not been modified – consider it an accurate time capsule.

Destroy or Save Indexed Backup Tapes?

If IT were able to tell legal everything about the data residing on offline tape, could legal tell IT what needs to be kept and what can be destroyed?

“Using this (Index Engines) technology you can quickly determine if a tape or set of tapes is relevant or otherwise to any pending investigations or other business requirements (prior art for patents, lost accounting records etc). If the tapes are not relevant and if they are outside of the record retention schedule, they can be recycled. This is legally defensible and makes good business sense. There is no need to store tapes if they are outside of their retention window and do not contain information of current value or of potential value to impending investigations. Keeping such tapes may result in future liability which any corporation would be smart to avoid.” – excerpt from Tape Discovery Comes of Age, by Quintin Gregor.

As our clients retroactively process tapes and make the tape data discoverable, they are going back to their legal teams and asking for a comprehensive list of custodians and content that should be placed on litigation hold. In some cases legal can provide that list but is not confident that it is comprehensive. In other cases legal can provide a full set of queries and custodians, allowing IT to pull this data off tape, lock it down, and shred the rest.

Shredding irrelevant backup tape data is a controversial issue. We see some companies eagerly heading in this direction, while others will not pursue this effort for fear of destroying responsive data. Ultimately this data cannot stay around forever; tapes corrode and the volume would be unmanageable. On average, less than 1% of the data on tape is relevant – unique email and unstructured files. The rest is made up of useless system files or duplicates, making 99% of the cost of storing tape a waste. To avoid this unnecessary expense, Legal must make some decisions so IT can move forward and eliminate the bulk of the unresponsive data, freeing up significant resources.

Upcoming Tape Discovery Webinar

Title: New Tape Discovery Methods for IT

Description: This webinar will provide a detailed technology overview of the new tape discovery process, how it’s done and how it integrates into existing infrastructure and archives.

Schedule: Thursday, May 8th, 2pm EST
Register Here

Upcoming eDiscovery Webinars

Title: Affordable End to End Automation of Tape Discovery

Description: This webinar will provide a presentation and demonstration of the Index Engines automated tape discovery solution.

Schedule: Thursday, May 1st, 2pm EST
Register Here

Title: New Tape Discovery Methods for IT

Description: This webinar will provide a detailed technology overview of the new tape discovery process, how it’s done and how it integrates into existing infrastructure and archives.

Schedule: Thursday, May 1st, 3pm EST
Register Here

Flexible Data Access

Index Engines and CommVault announced a partnership today. What this means is more options for folks concerned with archiving electronic information and eDiscovery of stored data. Index Engines allows people quick and easy access to data archived on backup tape without needing to keep in place the legacy infrastructure used to create these backups. Once this data is accessible, the option to switch storage methods becomes viable. This partnership is about alleviating the cost and burden associated with accessing data from tapes, giving customers options and the confidence to make the switch to more efficient technology. Read more about this partnership in the official press release.

ALSP Reviews Index Engines

At the ABA Tech Show in Chicago earlier this month, Index Engines caught the attention of Joe Howie, who writes for the Association of Litigation Support Professionals monthly newsletter. Joe spoke with Jim McGann, VP of Marketing for Index Engines, and Jeff Fehrman, President of Electronic Evidence Labs, a division of Onsite3. In his ALSP article, Joe summarizes how the FRCP has changed the requirements for producing hard-to-access electronic evidence found on backup tape. He chose to review the Index Engines technology as a solution for ALSP readers. The Index Engines product offers an alternative way to access data on backup tape that is significantly faster, easier and more cost-effective than traditional, burdensome restore methods. The resulting article can be found here.

Retroactive Data Discovery

Think about three states of being: 1. Reactive, 2. Retroactive, 3. Proactive.

Reactive might be buying new pants when your waistband gets tight. Retroactive is reducing your pizza and beer to a lite lager and one slice when the button starts to strain. Proactive is running 5 miles every morning so you stay in shape.

An enterprise’s approach to data management can also be categorized into these three states.

Reactive eDiscovery occurs when a lawsuit is engaged and relevant digital evidence must be uncovered in a hurry. Proactive data management takes the form of smart information management procedures that include indexing data as archives are created for quick and easy future access.

The process of retroactive data discovery is a relatively new concept. Before Index Engines technology became available, electing to catalog and index information stored on backup tapes, without the threat of legal action, was just not done – it was far too time consuming and expensive. Index Engines allows companies to search and access data stored on tape without restoring the full content and without the need for any additional software, such as the original backup software or the version of the email application that was originally used.

More and more often enterprise customers are using Index Engines technology to do retroactive data discovery. They are cataloging, indexing and searching their legacy information stored on tape, to better understand and manage their data. With this knowledge, enterprises are better able to mitigate risk, control data storage costs, and control ever-growing IT infrastructure costs. This movement toward controlling past tape data archives is not possible without Index Engines.

What type of data manager are you? Or maybe the better question is: how tight is your waistband?

Efficiency Pays Off

How much data is processed in a typical eDiscovery event? Hundreds of thousands of files? Millions? Billions? Many of the cases our clients are dealing with creep up into the millions very quickly. In fact one government agency is processing over one billion emails to support a specific case.

When we architected the Index Engines discovery platform we knew it had to scale to support large volumes of data. Therefore we made sure the index footprint was as small as possible, about 5 to 8% of the original data size. Others were not as cautious. Their index may require a cache copy of the file and as a result is bloated – resulting in an index size of 60 to 120% of the original data.
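To make the footprint comparison concrete, here is a minimal Python sketch that computes index storage for a data set. The 5 to 8% and 60 to 120% ranges come from the paragraph above; the 10,000GB data set size and the function name are assumptions chosen only for illustration.

```python
def index_size_range_gb(data_gb, low_pct, high_pct):
    """Return (low, high) index size in GB for a footprint given in percent."""
    return (data_gb * low_pct / 100, data_gb * high_pct / 100)


# Hypothetical data set size, not a figure from the article.
DATA_GB = 10_000

# Footprint range quoted above for the Index Engines index.
compact = index_size_range_gb(DATA_GB, 5, 8)
# Footprint range quoted above for indexes that keep a cache copy of each file.
cached = index_size_range_gb(DATA_GB, 60, 120)

print(f"Compact index:    {compact[0]:.0f} to {compact[1]:.0f} GB")
print(f"Cache-copy index: {cached[0]:.0f} to {cached[1]:.0f} GB")
```

For the assumed 10,000GB of source data, the compact index needs 500 to 800GB, while a cache-copy index needs 6,000 to 12,000GB.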

It became very clear at a recent trade show just how much a bloated index adds to the overall cost. At this event, Index Engines was located next to Google. Google has an indexing appliance, as do we. They make a cache copy of each file for indexing, so their index is at least 100% of the original data size. As a result, for a project that includes 30 million files they need 6 to 8 servers to process the job. Our technology, with a 5 to 8% index footprint, only requires one server (actually half a server) and 1/10th the cost.

It pays to be efficient!

Removing the Burden

Larry Wescott summarizes the burden ruling in Thomas v. IEM, decided on March 12, 2008 as follows:

In total, to avoid disclosure of confidential and sensitive information, IEM estimates that it would be forced to review more than 67,000 emails, requiring approximately 700 hours of staff time at a cost exceeding $ 120,000.00.

Nine business days was clearly an insufficient amount of time to review the overbroad request, which would have required over four (4) weeks of staff time, working twenty-four hours a day for seven days a week in order to respond to the request.

If only Thomas’ counsel had known to present Index Engines technology as a counterargument to this burden!

Reviewing this volume of email using the Index Engines solution would take about 25 minutes. Yes – less than ½ an hour. Here’s how:

Volume of Data:

Average size of an email = 120K (.11MB)

.11 x 67000 = 7,370MB or 7.2GB (1024MB = 1GB)

Index Engines Ingestion Speed:

Dependent on Tape Format – Let’s assume DLT

DLT (5MB/s) = 17.6GB per hour; 7.2GB / 17.6GB per hour = about 25 minutes


Ingest data = 25 minutes

Search = 0.001 sec per 10,000 emails; for 67,000 emails = 0.007 sec.

Cost = Less than half!

Granted, there are some unknown variables, such as tape format, time frame (i.e., number of tapes and retention period), number of custodians for the search, etc. But you get the picture.
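The estimate above can be reproduced with a short Python sketch. The email count, average email size and DLT speed are the figures used in this post; the function names are hypothetical, and the search rate of 0.001 seconds per 10,000 emails is an assumed reading of the search line above. Note that the post rounds 120K down to .11MB; with the unrounded size the ingest time comes out closer to 26 minutes.

```python
KB_PER_MB = 1024
SECONDS_PER_MINUTE = 60


def ingest_minutes(num_emails, avg_email_mb, tape_mb_per_sec):
    """Estimate minutes needed to read the given volume of email from tape."""
    total_mb = num_emails * avg_email_mb
    return total_mb / tape_mb_per_sec / SECONDS_PER_MINUTE


def search_seconds(num_emails, sec_per_10k=0.001):
    """Estimate search time, assuming a fixed time per 10,000 indexed emails."""
    return num_emails / 10_000 * sec_per_10k


EMAILS = 67_000      # emails at issue in Thomas v. IEM
DLT_MB_PER_SEC = 5   # DLT speed assumed in the post

# Using the post's rounded average email size of .11MB.
rounded = ingest_minutes(EMAILS, 0.11, DLT_MB_PER_SEC)
# Using the unrounded size of 120K converted to MB.
unrounded = ingest_minutes(EMAILS, 120 / KB_PER_MB, DLT_MB_PER_SEC)

print(f"Ingest (post's rounding): {rounded:.1f} minutes")
print(f"Ingest (unrounded size):  {unrounded:.1f} minutes")
print(f"Search:                   {search_seconds(EMAILS):.4f} seconds")
```

Either way the ingest time stays under half an hour, and changing the tape speed or email count shows how sensitive the estimate is to the unknown variables listed above.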

A base-priced Index Engines platform can handle up to 100,000 objects. We would have made short work of discovering these 67,000 emails. This burden could have been easily lifted.
