The ABCs of eDiscovery

May 24, 2019

Strategies for achieving better outcomes and saving costs in high-stakes legal disputes

Regardless of the industry, electronic information is the medium of life in the modern world. While paper still matters, business is largely conducted and documented through exchanging electronically stored information, or ESI.

So, when disputes arise, the side that can best wrangle all that electronically stored data to present a compelling story will likely prevail.

In a previous article, I wrote about the role that Big Data increasingly plays in today’s construction and real estate disputes. Now, we will take a deep dive into the world of eDiscovery, which represents the bulk of the Big Data involved in high-stakes litigation and claims for the built environment.

The Electronic Discovery Reference Model

The Electronic Discovery Reference Model (EDRM), created in 2005 and housed at Duke Law School since 2016, is a model set of standards, guidelines, and resources for legal professionals working with ESI.

The electronic discovery process is iterative: certain steps may be repeated, with the outcome of one pass serving as the starting point for the next. In summary, here are the stages that an eDiscovery effort may entail, depending on the matter:

  • Information Governance—Getting your electronic house in order to mitigate risk and expenses should eDiscovery become an issue, from initial creation of ESI through its final disposition 
  • Identification—Locating potential sources of ESI and determining their scope, breadth and depth
  • Preservation—Ensuring that ESI is protected against inappropriate alteration or destruction
  • Collection—Gathering ESI for further use in the eDiscovery process (processing, review, etc.)
  • Processing—Reducing the volume of ESI and converting it, if necessary, to forms more suitable for review and analysis
  • Review—Evaluating ESI for relevance and privilege
  • Analysis—Evaluating ESI for content and context, including key patterns, topics, people and discussion
  • Production—Delivering ESI to others in an appropriate form and using appropriate delivery mechanisms
  • Presentation—Displaying ESI before audiences (at depositions, hearings, trials, etc.), especially in native and near-native forms, to elicit further information, validate existing facts or positions, or persuade an audience

Taking Control of eDiscovery

In a perfect world, one in which the client or the party you represent has an unlimited budget and there are no deadlines, you might employ an army of trained researchers to review every single piece of data and then catalog various attributes of that data in a database for further analysis. The cost of such an effort would be astronomical, never mind the time required.

In the real world, attorneys need to collect data from their clients, redact privileged information, produce the information required to comply with discovery requests, share that information with their own experts, analyze and distill the information as they build their case, and then ultimately present the most relevant information to the triers of fact for the final decision. And that all needs to be done on time and under budget.

Fortunately, a number of software tools have stood the test of time in dealing with the challenges of eDiscovery. At its core, eDiscovery software is simply a relational database that associates electronically stored files with the information that describes them, known as metadata. If you are willing and able to pay for it, more sophisticated eDiscovery tools will actually assist with processing the information, and some even employ artificial intelligence (AI) to detect and categorize the content.
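To make the "relational database" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, field names, and sample files are hypothetical and chosen only for illustration. Commercial eDiscovery platforms use their own, far richer schemas.

```python
import sqlite3

# Hypothetical schema: one row per stored file, one row per metadata field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        id   INTEGER PRIMARY KEY,
        path TEXT NOT NULL
    );
    CREATE TABLE metadata (
        file_id INTEGER NOT NULL REFERENCES files(id),
        field   TEXT NOT NULL,
        value   TEXT NOT NULL
    );
""")

def add_file(path, **fields):
    """Store a file record together with the metadata that describes it."""
    cur = conn.execute("INSERT INTO files (path) VALUES (?)", (path,))
    file_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO metadata (file_id, field, value) VALUES (?, ?, ?)",
        [(file_id, name, value) for name, value in fields.items()],
    )
    return file_id

def find_by(field, value):
    """Return the paths of all files whose metadata field equals value."""
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN metadata m ON m.file_id = f.id "
        "WHERE m.field = ? AND m.value = ? ORDER BY f.path",
        (field, value),
    )
    return [row[0] for row in rows]

# Made-up sample records.
add_file("emails/0001.msg", sender="pm@example.com", sent="2018-03-02")
add_file("schedules/baseline.xer", author="scheduler", created="2017-11-15")

print(find_by("sender", "pm@example.com"))  # ['emails/0001.msg']
```

The point of the design is that the file itself is never searched directly. Every question is answered from the metadata table, which is why the quality of that metadata matters so much.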

Perhaps one of the most important functions that quality eDiscovery software performs is optical character recognition, or OCR. Quite simply, OCR takes an image (such as what you end up with when you scan a paper document), recognizes any alphanumeric characters in it, and produces a text file containing the recognized words, numbers, or letters. That text file is then linked to the original file and indexed by the software.

The real power of eDiscovery software comes from the metadata that can be extracted from native files. “Native” simply means that the file is in the original format used by the software that created it. Some examples of native files include .doc or .docx Word files, .xls or .xlsx Excel files, and .dwg CAD files. Emails stored in native file formats such as .msg files are a boon for legal professionals because, unlike a printout of an email, a native file lets eDiscovery software extract the actual date the email was sent, the subject, any attachments, and the name, email address, and company associated with the sender and recipients.
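As a rough illustration of what that extraction looks like, the sketch below pulls the same fields out of an email using Python's standard email package. Two assumptions apply: the message is in plain RFC 5322 (.eml-style) text, because the standard library does not read Outlook .msg files, and the sender's email domain is used as a stand-in for "company." The sample message and the extract_metadata helper are invented for this example.

```python
from email import message_from_string, policy

# Made-up message in .eml-style text (not an Outlook .msg file).
RAW = """\
From: Pat Manager <pm@example.com>
To: Sam Super <sam@example.org>, Lee Owner <lee@example.net>
Subject: RFI No. 132 response
Date: Fri, 02 Mar 2018 14:05:00 -0800
MIME-Version: 1.0
Content-Type: text/plain

See attached.
"""

def extract_metadata(raw):
    """Return the searchable fields an eDiscovery tool would index."""
    msg = message_from_string(raw, policy=policy.default)
    sender = msg["From"].addresses[0]
    return {
        "sent": msg["Date"].datetime,
        "subject": str(msg["Subject"]),
        "sender_name": sender.display_name,
        "sender_address": sender.addr_spec,
        # Assumption: the email domain is only a proxy for the company.
        "sender_domain": sender.domain,
        "recipients": [a.addr_spec for a in msg["To"].addresses],
        "attachments": [p.get_filename() for p in msg.iter_attachments()],
    }

meta = extract_metadata(RAW)
print(meta["sent"].date(), meta["sender_address"], meta["recipients"])
```

None of these fields can be recovered reliably from a scanned printout, where the date and addresses are just pixels until OCR guesses at them.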

This allows someone to easily perform laser-focused searches and filtering across many hundreds of gigabytes of stored data in a short amount of time. Need to find every email sent by a certain sender during a specific time period? No problem. Need to pull together all the original CAD files for your architectural expert’s review, or the .xer P6 schedule files for delay analysis? It’s so easy a caveman could do it. Need to find every piece of supporting information associated with RFI No. 132? It likely takes only a few clicks of the mouse, and the results are nearly instantaneous.

Truly we live in interesting times.

Garbage In, Garbage Out

The quality of the results to be gained from eDiscovery tools is dependent on two factors:

  1. The quality of the data you are working with
  2. The skill and training of your research team

Since most eDiscovery providers charge based on the quantity of the data you’ll be uploading into their system, each matter will require a bit of a cost-benefit analysis. Here’s a real-world example from a recent case Xpera Group was engaged in.

Our client was a trade contractor accused of poor workmanship resulting in a defective installation of a major building system, as well as significant delays to the overall project’s completion. The owner assessed liquidated damages against the general contractor, who then passed through those costs and other damages to the trade contractor. The claimed damages were several times the value of our client’s original contract. Because our client was a smaller business owned by a first-generation immigrant family, a finding of liability would have ended the company.

With limited budget and time available for the assignment, Xpera Group (now part of VERTEX) could not afford to manually review and analyze all of the nearly 100 gigabytes of data received from counsel.

Instead, we uploaded everything to eDiscovery software to perform our review. In addition, we were able to grant access to our co-experts as well as multiple law firms working on behalf of the client.

Unfortunately, we learned that a majority of the data uploaded into the eDiscovery software consisted of the numerous CAD revisions issued by the project’s design team throughout the design evolution. In this matter, neither the design nor the changes to it were a major factor. Therefore, the contents of the CAD files were completely irrelevant not only to our analysis but also to the work of the other people on our team.

Nevertheless, the CAD files took significant processing by the eDiscovery software, not to mention taking up tremendous space, resulting in significant cost. Fortunately, this scenario is avoidable.

Once we excluded the CAD files and re-uploaded the data, the costs were much more manageable. Plus, this allowed us to improve the quality of the search results within the eDiscovery software. Images, movies, and other media files are similar culprits that should be avoided when considering what to upload to your eDiscovery software.
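A quick inventory before uploading can catch this kind of problem early. The sketch below, offered as one possible approach rather than a description of any vendor's tooling, totals file sizes by extension and shows how much volume an exclusion list would hold back. The exclusion list and the byte counts are hypothetical. Which file types are irrelevant depends entirely on the matter.

```python
import os
from collections import defaultdict

# Hypothetical exclusion list: CAD and media files, as in the example above.
EXCLUDED_EXTENSIONS = {".dwg", ".jpg", ".png", ".mp4", ".mov"}

def size_by_extension(root):
    """Walk a folder tree and total the bytes stored under each extension."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            totals[ext] += os.path.getsize(os.path.join(dirpath, name))
    return dict(totals)

def split_upload(totals, excluded=EXCLUDED_EXTENSIONS):
    """Return (bytes to upload, bytes held back) for a given exclusion list."""
    keep = sum(size for ext, size in totals.items() if ext not in excluded)
    skip = sum(size for ext, size in totals.items() if ext in excluded)
    return keep, skip

# Made-up totals standing in for size_by_extension("/path/to/production").
totals = {".dwg": 60_000, ".pdf": 30_000, ".msg": 10_000}
keep, skip = split_upload(totals)
print(f"upload {keep} bytes, hold back {skip} bytes")
```

Because most providers charge by volume, the "hold back" figure translates more or less directly into avoided cost.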

The Bottom Line

We frequently find ourselves in the position of defending the costs associated with eDiscovery software to our clients.

Although ESI has played a critical role in many high-profile and high-stakes cases, most of them were in the technology industry, such as in the case of Apple v. Qualcomm. Outside of intellectual property disputes involving the elites of bleeding-edge technology, many lawyers and their clients are unaware of the tremendous benefits of eDiscovery.

Internally, we have a unit of measure we call the “truckload,” a reference to the fact that one gigabyte of data is roughly the equivalent of a pickup truck bed filled with paper. On one case several years ago, our attorney clients sent us about 45 gigabytes of data to process and analyze. Just imagine the time and resources involved with sorting through 45 pickup trucks’ worth of evidence!

As is typical, the data consisted of PDF files that were produced by scanning entire bankers’ boxes of paper documents. Not a single file included any descriptive information in its filename. It took nearly 30 hours for one of our researchers to just loosely describe the 66 PDF files, each of which contained over 2,000 pages. We then tasked a team of experienced researchers and analysts with manually reviewing the documents, first extracting each individual document as a separate file, before cataloging basic information about each document into a database.

Based on the time it took to review, analyze and process the first 5% of the entire document repository, I calculated it would end up costing close to $100,000 for our team to go through all of the documents. When we sought competitive quotes from multiple eDiscovery providers, the cost was reduced to about $15,000, for a cost savings of nearly 85%.
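The arithmetic behind that comparison is simple enough to sketch. The article gives only the extrapolated total and the vendor quote, so the $5,000 sample cost below is an assumption, back-calculated from "the first 5%" and "close to $100,000." The function names are likewise illustrative.

```python
def extrapolate_total(sample_cost, sample_percent):
    """Scale the cost of a reviewed sample up to the whole collection."""
    return sample_cost * 100 / sample_percent

def savings_percent(baseline, alternative):
    """Percentage saved by choosing the alternative over the baseline."""
    return 100 * (baseline - alternative) / baseline

# Assumed: reviewing the first 5% cost about $5,000 (not stated in the article).
manual_estimate = extrapolate_total(5_000, 5)
vendor_quote = 15_000

print(manual_estimate)                                 # 100000.0
print(savings_percent(manual_estimate, vendor_quote))  # 85.0
```

A linear extrapolation like this assumes the remaining 95% of the documents are about as time-consuming as the first 5%, which is worth checking before relying on the estimate.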

The case settled before going to trial, but it served as a real eye-opener for us. Now, when new matters come in with large quantities of documents, our recommendation on whether to use eDiscovery software depends on how voluminous the collection of data is. If it is less than 20 gigabytes (or 20 truckloads), we might skip the eDiscovery process. But at 90 gigabytes (or 90 truckloads), the cost and time to process that quantity of data manually are too great to justify, and eDiscovery software is simply the right choice.

Lessons to Learn From

I’ve been working with eDiscovery tools of various shapes and sizes for the past couple of decades. If you have a data-rich case that might be a good candidate for processing through eDiscovery software, here are some considerations:

  1. Assess what you have, both in terms of quantity and quality. Consider not just how much data you are dealing with, but also what types of files are included.
  2. Do your privilege check first. While the software is a great tool for finding, identifying and purging privileged content, if you upload a bunch of potentially privileged ESI to eDiscovery software, you still have to pay for the data to be processed. By excluding the majority of the privileged data from the upload, you could save your client some serious money.
  3. Learn the software. Every eDiscovery provider has its own unique user interface with its own quirks. While they generally work somewhat similarly, it is worth spending some time familiarizing yourself with your particular tool.
  4. Work backwards. By understanding what your end use of the eDiscovery tools will be, you will save a lot of time and effort during the research and analysis phase. For example, if you know that you need to tell the story behind a specific change order, tagging documents that relate to that change order with a relevant keyword will enable you and your team to quickly assemble the critical evidence needed.
  5. Filters are your friend. Sure, most eDiscovery software allows you to type in a search term and find all documents containing that term, similar to how Google provides you with a list of websites that contain the term you are searching for. But, when you use filters to combine multiple search parameters—such as date range, type of file, source, etc.—it can dramatically reduce the time required to separate the virtual wheat from the digital chaff.
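The last two items, tagging toward a known end use and combining filters, can be illustrated with a short sketch. The Document fields, the sample records, and the "CO-07" tag are all hypothetical. Real eDiscovery platforms expose this through their own interfaces rather than code, so treat this only as a picture of the logic.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Document:
    name: str
    file_type: str   # e.g. "msg", "pdf", "xer"
    source: str      # party that produced the file
    dated: date
    text: str
    tags: set = field(default_factory=set)

def search(docs, term=None, file_type=None, source=None,
           start=None, end=None, tag=None):
    """Return names of documents that satisfy every filter that was supplied."""
    hits = []
    for doc in docs:
        if term is not None and term.lower() not in doc.text.lower():
            continue
        if file_type is not None and doc.file_type != file_type:
            continue
        if source is not None and doc.source != source:
            continue
        if start is not None and doc.dated < start:
            continue
        if end is not None and doc.dated > end:
            continue
        if tag is not None and tag not in doc.tags:
            continue
        hits.append(doc.name)
    return hits

# Made-up sample records.
docs = [
    Document("email_017.msg", "msg", "general contractor", date(2018, 3, 2),
             "Pricing for change order 7 attached."),
    Document("letter_004.pdf", "pdf", "owner", date(2018, 5, 20),
             "Owner rejects change order 7 pricing."),
    Document("email_233.msg", "msg", "general contractor", date(2019, 1, 8),
             "Schedule update for level 3."),
]

# Work backwards: tag what relates to the change order whose story you must tell.
for doc in docs:
    if "change order 7" in doc.text.lower():
        doc.tags.add("CO-07")

# A bare search term casts a wide net.
print(search(docs, term="change order"))
# ['email_017.msg', 'letter_004.pdf']

# Combining tag, file type, and date range narrows it.
print(search(docs, tag="CO-07", file_type="msg",
             start=date(2018, 1, 1), end=date(2018, 12, 31)))
# ['email_017.msg']
```

Each added filter can only shrink the result set, which is what makes stacking them such an efficient way to get from a large collection to the handful of documents that matter.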

Admittedly, eDiscovery tools are not exactly cheap, but neither are trained professionals. Compared with the tremendous cost of human review and analysis, using eDiscovery software strategically can give your client a much better outcome and save a ton of money.

