An article published in the July 19th issue of the New Jersey Law Journal explores the idea of using data locked away on backup tape as the source for eDiscovery. The article, Proven Methods to Prevent Spoliation of Online Data, points to the impenetrable nature of backup tapes and, conversely, the dynamic environment that houses online data. Online data is subject to editing or deletion even after a litigation hold notice is issued: emails can be deleted and documents can be overwritten. However, the smoking gun still exists on backup tape, which has preserved an unspoiled version of all corporate data, including data relevant to a litigation event. In years past, tape discovery was cumbersome and expensive. But the dawn of technology that indexes data directly from tape, without restoration, allows fast and easy access to ESI to support litigation. By indexing the data on tape, either from last night’s routine backup or from a forensic snapshot stored to tape, eDiscovery teams can be assured that the tape data has not been tampered with.
The volume of corporate data contained in network attached storage (NAS) grows daily. The amounts are many orders of magnitude greater than the data contained on the Internet. How can any enterprise attempt to manage it? With today’s announcement by Index Engines, the discovery of corporate data has become quite a bit easier. Online data stored on all major commercially available NAS platforms can be indexed and made searchable at 1 TB per hour per indexing node. Index Engines’ press release today announced the validation of these unmatched indexing speeds on data residing on HP and NetApp storage systems. HP and NetApp join EMC, Data Domain and BlueArc, which have previously validated Index Engines’ 1 TB/Hr indexing speeds. By demonstrating this powerful indexing capability across the platforms that hold the majority of the storage market share, Index Engines has made it feasible for enterprise IT teams to regain control of their stored data.
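To put the 1 TB per hour per node figure in perspective, the short sketch below estimates how long a given volume would take to index. It is an idealized back-of-the-envelope calculation, not a vendor sizing tool: the function name and parameters are hypothetical, and it assumes throughput scales linearly with the number of nodes, which real networks and storage systems may not allow.

```python
def indexing_hours(volume_tb: float, nodes: int = 1,
                   tb_per_hour_per_node: float = 1.0) -> float:
    """Idealized hours needed to index `volume_tb` terabytes.

    Assumes (hypothetically) that every node sustains the quoted rate
    and that adding nodes scales throughput linearly.
    """
    if volume_tb < 0:
        raise ValueError("volume_tb must be non-negative")
    if nodes < 1 or tb_per_hour_per_node <= 0:
        raise ValueError("nodes and rate must be positive")
    return volume_tb / (nodes * tb_per_hour_per_node)


# Example: 120 TB of NAS data spread across four indexing nodes.
hours = indexing_hours(120, nodes=4)
```

Under these assumptions, 120 TB on four nodes works out to 30 hours; a single node would need 120 hours for the same volume.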
The unit cost to identify and collect online data on hard drives, CDs/DVDs and network storage has dropped significantly over the past few years. However, since the volume of data being examined during a discovery project has grown dramatically, the total project cost can still be quite high. Many tools used for the collection and discovery process contain advanced review capabilities. These are the tools the market is familiar with, and ESI collected during eDiscovery is often delivered in formats compatible with these “household names”. However, these tools can be expensive, often hundreds of dollars per GB. When dealing with a small collection, this may not be a significant added expense. But as the volume of ESI grows from gigabytes into terabytes, the total cost can quickly inflate.
The concept of two-pass culling is gaining popularity. During the first pass, or bulk culling, the data is indexed, deduped and deNISTed, and then the content is searched. This pass culls the data down to a relevant subset, which is then delivered into the review platform for refined search and presentation to the client. When Index Engines is used for first-pass culling, the per-GB fees are a mere fraction of what popular review platforms charge for culling. Breaking the culling process into two phases allows for tremendous cost savings. ESI will still be delivered in the client’s platform of choice; however, the first pass of the data can be completed faster and more cost effectively.
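The first pass described above can be illustrated with a small sketch. This is not Index Engines’ implementation; it is a minimal in-memory example that assumes documents are already available as bytes. All names (Doc, first_pass_cull, known_system_hashes) are hypothetical. DeNISTing normally means removing files whose hashes appear in the NIST reference list of known software files; here a plain set of hashes stands in for that list.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    path: str
    content: bytes


def sha1_hex(data: bytes) -> str:
    """Content hash used for both deNISTing and deduplication."""
    return hashlib.sha1(data).hexdigest()


def first_pass_cull(docs, known_system_hashes, keywords):
    """Return the subset of docs to hand to the review platform.

    Steps: drop known system files (deNIST), drop exact duplicates
    (dedupe), then keep only documents matching a keyword.
    """
    seen = set()
    kept = []
    lowered = [k.lower() for k in keywords]
    for doc in docs:
        digest = sha1_hex(doc.content)
        if digest in known_system_hashes:  # deNIST: known software file
            continue
        if digest in seen:                 # dedupe: identical content seen
            continue
        seen.add(digest)
        text = doc.content.decode("utf-8", errors="ignore").lower()
        if any(k in text for k in lowered):  # simple keyword search
            kept.append(doc)
    return kept
```

Only the documents returned by first_pass_cull would move on to the second pass in the review platform, which is where the per-GB savings come from: the expensive tool sees a much smaller data set.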
Legal Tech Newsletter recently published “Top Five Backup Headaches Solved”. Authored by Jim McGann, VP at Index Engines, this article compiles common frustrations enterprise IT teams are encountering regarding data contained on backup tapes. Corporations are dealing with eDiscovery projects and electronic compliance regulations that more and more often require the disclosure of data from tape. McGann lists the struggles IT is having and then offers the balm of direct tape indexing. Whether it is a reactive search through tape data to support litigation or a proactive effort to get historical data under control, your head doesn’t need to pound. Index Engines’ platform automates the direct indexing of data on tape and removes the pain of restoration, making quick and easy work of tape discovery.
In a recent Law.com article, Jason Krause discusses a paradigm shift in eDiscovery. Technology is simplifying the discovery process, and the per tape/GB price has been reduced. But the volume of data being considered for discovery is growing, so the overall cost of eDiscovery projects and the size of the eDiscovery market are expanding. Back when Zubulake was first ruled on, social media, audio and video files, and other such data types were not in the eDiscovery mix. But they are today. The eDiscovery market still grapples with what is considered burdensome. Index Engines technology has made tape discovery quick and affordable. According to Krause’s article, When is an E-Discovery Burden an Undue Burden?, the terms of the burden argument are changing. Index Engines and our partners certainly know that tape is no longer a dead stop. With 1 TB/Hr indexing, large volumes are also easier to handle than ever before. However, the scope of enterprise data is boundless, making the burden a changing but still relevant argument.
Organizations often live for years with their backup software vendor, even if over time the vendor is no longer the best choice for the organization. As time progresses, the needs of the organization may change, the cost and maintenance fees for the backup software may become unacceptable, or the functionality may fall behind when compared to new offerings. Companies may want to change their provider and upgrade to a new backup solution that better meets their needs. However, many companies feel locked into their current provider: “locked” because they have significant volumes of historical data contained on tapes created using their current provider’s solution. Migrating to a new backup solution makes this data inaccessible, a risk no company wants to face.
What if there were a way to break free of this risk and choose the backup strategy that meets the organization’s needs today, rather than living with the choice from yesterday? Index Engines recently authored an article on this subject for Computer Technology Review entitled Backup Data Migration Made Easy. In this article, Jim McGann outlines how organizations can avoid the risk of switching to a new backup strategy. By leveraging indexing technology that allows search of and access to backup data regardless of format, enterprise IT teams can now get at historical data without having to maintain the legacy backup infrastructure. This access can be used to migrate valuable data into an archive of their choice, or simply to retain access after the organization switches to a new backup strategy. Either way, the choice is there, enabling freedom to choose a backup strategy that matches current needs.
Index Engines recently announced the enhancement of its 3.2 platform to include support for simultaneous indexing of multiple streams of data from backup tapes. This new functionality dramatically increases the ability to process large amounts of tape data within tight time constraints. Processing up to six simultaneous streams of tape data and reaching speeds up to 1 TB/hour is now possible. Combining Index Engines’ automated approach to tape discovery with these high speeds makes enterprise tape remediation and large scale eDiscovery projects cost effective and achievable.
With the direct indexing speed for tape data increasing six-fold, enterprise tape management and remediation projects have become even easier. Index Engines’ platform enables enterprise IT and litigation support teams to deduplicate, deNIST, search and extract tape data at speeds approaching the unprecedented 1 TB/Hr/Node rate that has been validated for LAN data indexing. Large scale tape discovery enables the practical remediation of historical backup tapes. Index Engines’ complimentary tape assessment program shows how organizations can save millions in annual offsite storage costs and recoup their investment in Index Engines in less than one year. Read the full press release here.
The traditional method of getting data off tape has been to restore the content. Restoration uses the original backup software to remove data from tape and bring it back online in order to begin the discovery process. Direct indexing and extraction is a more intelligent process. This process was invented by Index Engines, and it significantly streamlines the collection of ESI from tape. Extraction does not require the backup software to access tape content. Additionally, extraction leverages the index to understand data at a file and email level. Using direct indexing and extraction, you can review the contents of a tape, find relevant content and extract only what is of interest. Direct indexing is a non-invasive scan of the tape that allows intelligence to be obtained about the contents: file types, dates, custodians, etc. Extraction allows specific content to be selected and gathered. Restoration, by contrast, requires you to restore all of the data before you can find the relevant content. It’s a radically different process. The benefits of extraction over restoration are clear: savings of both time and money.
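The difference between restoring everything and extracting selectively can be sketched in a few lines. The example below is a toy model, not Index Engines’ actual index format or API: IndexEntry, select_for_extraction and extraction_plan are hypothetical names, and the index is assumed to already hold one metadata record per file or email found during the tape scan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexEntry:
    tape_id: str      # which tape holds the item
    offset: int       # hypothetical position of the item on that tape
    custodian: str
    file_type: str
    modified: date


def select_for_extraction(index, custodian, file_type, start, end):
    """Query the index only; no tape content is read at this stage."""
    return [
        e for e in index
        if e.custodian == custodian
        and e.file_type == file_type
        and start <= e.modified <= end
    ]


def extraction_plan(entries):
    """Group selected items by tape, ordered by offset, so each tape
    is read once and sequentially for just the items that matter."""
    plan = {}
    for e in sorted(entries, key=lambda e: (e.tape_id, e.offset)):
        plan.setdefault(e.tape_id, []).append(e.offset)
    return plan
```

With restoration, every item on every tape would be brought back online before any of these filters could be applied. In this sketch the filters run against metadata first, and the resulting plan touches only the tapes and positions that contain responsive items.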
Jason Krause, writing for Law.com, recently published an article encouraging a closer look at how the more than 600 eDiscovery solutions on the market are built. While some argue that fancy bells and whistles make the solutions slick, Krause dives into the technology supporting these tools. Index Engines’ Jim McGann contributes to this article with details about the limitations of Lucene, the open source code many vendors are using. Index Engines, using patented, purpose-built technology, far exceeds the capabilities of the open source platforms. After all, bells and whistles are useless if the underpinnings of the platform aren’t up to snuff. Read the full Law.com article here.
The courts are starting to understand the benefits of using Index Engines in the collection of ESI, especially from backup tapes. In the recent case Goshawk Dedicated and Kite Dedicated v American Viatical (Case No. 1:05-CV-2343-RWS Document 506 filed 12/08/2009), the judge issued a report on the production of ESI in support of the case. The original collection was to be focused on approximately 1 TB of online data; however, the decision was made to perform this collection on backup tape. The reasons given were, “backup tapes (a) would provide a more direct and economical route to responsive ESI vs. searching the online data. (b) would entail considerably less business disruption.” The court also went on to direct the use of Index Engines for the collection of ESI for this case.
The fact that courts are now mandating the use of Index Engines for collection of ESI from backup tapes rather than online data shows how far we have come with respect to the inaccessibility argument. In the past, many parties have argued that backup tapes are burdensome. Now we see cases where tapes are not cited as burdensome, but are in fact determined to be less burdensome and more economical when compared to online ESI collection.