Failing to Understand Your Data Before You Begin Review is Perhaps the Costliest “Gotcha” of All

As we discussed last time, not understanding what “done” looks like can be the most significant “gotcha” associated with document review. Without defining “done,” you may never get there. But there is another “gotcha” that can sometimes be even costlier – not understanding your data before you begin document review. With business data doubling every 1.2 years, understanding the data within the collection (especially the portion that doesn’t need reviewing) is more important than ever to keep review costs manageable.


While understanding your “Big Data” may seem daunting, the good news is that the right combination of expertise and technology can help you understand your data as early as possible in the investigation and minimize the number of documents that need reviewing. But, of course, the key word in that sentence is “early.”


The Importance of Early Case Assessment

Early case assessment (ECA) is about estimating the risk (i.e., the cost in time and money) of prosecuting or defending a legal case. A big part of that risk assessment today concerns the ESI associated with the case. ECA has evolved to not only evaluate that risk but also reduce it, by helping the review team understand the collection and make critical decisions about which documents are: 1) likely to be important to the case, 2) potentially important, and 3) not important.


The ability to stratify your collection into these categories early enables the review team to prioritize the documents most likely to be responsive. It also helps eliminate other potentially large groups of records from review altogether, ultimately saving on review costs while improving the ability to meet review deadlines.


Three Components of ECA Analysis

Just as there are three categories of documents to be classified, there are at least three components of ECA analysis to categorize those documents. They are 1) Culling Unwanted Documents, 2) Identifying Key People, and 3) Identifying Important Topics.


Culling Unwanted Documents

During ECA, many documents can be culled from the review population without any review required based on the metadata associated with the documents. Here are examples of metadata fields that we can use to cull unwanted documents:

Date Range: Documents outside the relevant date range should be culled from review.

De-Duplication: Generating a hash value (a digital fingerprint) for each file and then automatically excluding additional documents with the same hash value is one of the best methods to cull many unwanted documents.

Sender Domain: Domain categorization to identify emails from non-responsive domains is another way to cull unwanted documents quickly and effectively.

File Type: Certain file types can also be excluded depending on the issues of the case. 

Key Terms: The presence or absence of key search terms can be the final method of culling unwanted documents. It’s also the least predictable, so it’s essential to test both what the terms retrieve and what they leave behind to ensure that the terms fit the scope of the case.
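To make the metadata-based steps above concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular ECA tool works: the document records, the field names (sent, sender, ext, content), the excluded domain and file type, and the cull function are all hypothetical. It applies the date range, de-duplication, sender domain, and file type checks in turn and records why each document was culled.

```python
import hashlib
from datetime import date

# Hypothetical document records; the field names are illustrative assumptions.
docs = [
    {"id": 1, "sent": date(2021, 3, 1), "sender": "ceo@acme.com", "ext": "msg", "content": b"Q1 pricing plan"},
    {"id": 2, "sent": date(2021, 3, 1), "sender": "ceo@acme.com", "ext": "msg", "content": b"Q1 pricing plan"},
    {"id": 3, "sent": date(2018, 1, 5), "sender": "cfo@acme.com", "ext": "msg", "content": b"Old budget"},
    {"id": 4, "sent": date(2021, 4, 2), "sender": "news@dailydeals.com", "ext": "msg", "content": b"50% off"},
    {"id": 5, "sent": date(2021, 5, 9), "sender": "it@acme.com", "ext": "dll", "content": b"\x00\x01"},
    {"id": 6, "sent": date(2021, 6, 1), "sender": "cfo@acme.com", "ext": "xlsx", "content": b"Pricing model"},
]


def cull(docs, start, end, excluded_domains, excluded_exts):
    """Return (kept documents, {culled id: reason}) using metadata only."""
    kept, culled, seen_hashes = [], {}, set()
    for d in docs:
        # SHA-256 of the file content serves as the digital fingerprint.
        digest = hashlib.sha256(d["content"]).hexdigest()
        domain = d["sender"].rsplit("@", 1)[-1].lower()
        if not (start <= d["sent"] <= end):
            culled[d["id"]] = "date_range"
        elif digest in seen_hashes:
            culled[d["id"]] = "duplicate"
        elif domain in excluded_domains:
            culled[d["id"]] = "sender_domain"
        elif d["ext"].lower() in excluded_exts:
            culled[d["id"]] = "file_type"
        else:
            seen_hashes.add(digest)
            kept.append(d)
    return kept, culled


kept, culled = cull(
    docs,
    start=date(2020, 1, 1),
    end=date(2021, 12, 31),
    excluded_domains={"dailydeals.com"},
    excluded_exts={"dll", "exe"},
)
print([d["id"] for d in kept])
print(culled)
```

Recording a reason for every culled document keeps the process auditable and repeatable. Key terms are left out of this sketch on purpose, since they are the least predictable criterion and need testing before they are used to cull.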


Identifying Key People

While you may know some of the important people involved in the case, others may become apparent only once you look at the communication patterns of the key custodians you already know. A communications analysis widget within an ECA tool can reveal patterns that lead to the identification of additional key custodians. In addition, drilling into the communications between specific people lets you analyze them quickly and classify groups of documents rapidly.
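The idea behind communications analysis can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any ECA tool’s widget: the email list, the addresses, and the candidate_custodians function are hypothetical. It counts how often people outside the known custodian list exchange email with a known custodian and ranks them by volume.

```python
from collections import Counter

# Hypothetical email metadata: (sender, list of recipients).
emails = [
    ("alice@acme.com", ["bob@acme.com"]),
    ("alice@acme.com", ["bob@acme.com", "carol@acme.com"]),
    ("bob@acme.com", ["alice@acme.com"]),
    ("dave@acme.com", ["carol@acme.com"]),
    ("bob@acme.com", ["erin@vendor.com"]),
]
known_custodians = {"alice@acme.com"}


def candidate_custodians(emails, known):
    """Rank people outside `known` by messages exchanged with a known custodian."""
    counts = Counter()
    for sender, recipients in emails:
        for recipient in recipients:
            if sender in known and recipient not in known:
                counts[recipient] += 1
            elif recipient in known and sender not in known:
                counts[sender] += 1
    return counts.most_common()


candidates = candidate_custodians(emails, known_custodians)
print(candidates)
```

In this toy data set, bob@acme.com exchanges three messages with the known custodian and carol@acme.com one, so Bob would be the first person to consider adding to the custodian list. Message volume alone is only a starting point; the content of those communications still has to be examined.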


Identifying Important Topics

Conceptual clustering is a way of identifying essential topics that lead to documents likely to be critical to the case. A cluster wheel within an ECA tool can quickly surface additional concepts that may be important to your case and may need to be included when determining key search terms. In addition, drilling into the cluster wheel enables you to quickly mark groups of documents as responsive or non-responsive, saving them from a linear review process.



ECA technology today provides terrific tools to cull unwanted documents and quickly prioritize what’s left. Still, the tools only go so far without the expertise to use them effectively. Leveraging that expertise to build repeatable templates that streamline the ECA-to-review workflow is the key to understanding your data.


Early case assessment involves leveraging technology and expertise to minimize the risk and cost associated with document review by understanding your data before you start the review – but only if you assess your data early! Data may double every 1.2 years, but budgets don’t!

