EDRM on its head - Why collect what you don't need?

June 3, 2019


The Electronic Discovery Reference Model (EDRM) has been a great guide for creating a common language for what legal teams, working with inside or outside counsel, are required to do as they prepare a case. The EDRM, and the organization behind it, have been such a good reference that the terminology and structure have been adopted by internal investigations, information governance, and information security. This has also attracted new technologies that question the current model by bringing exciting innovations to the market. If you look at the accepted business model for eDiscovery, you see a structure that maps nicely to the model. First, there is identification and collection. A number of organizations focus on collecting as much data as possible that may be relevant to the case. This information is then passed on to another set of technologies to process, review, analyze, produce, and finally present. A number of new cloud-based startups have appeared in these later phases, offering new levels of usability and flashy UIs.


What about Identification and Collection?

However, the identification and collection part of the process has kept the same basic assumptions. These assumptions are based on a criminal model, where police officers are sent onsite to an organization with a search warrant to collect all relevant information. In IT criminal investigations, that has often meant either copying all of the data or walking away with the computers.

In the corporate world, we still operate somewhat under the same premise. When we get sued, we call our outside counsel, who will request that we copy (collect) a whole bunch of data, related and unrelated, and send it to them for further processing or analysis. The problem here is twofold. First, the organization that owns this information has the best knowledge of where relevant information resides and what constitutes relevant information, yet it hands the data over largely unfiltered. Second, the businesses receiving the information (often subcontractors to the outside counsel) charge fees proportional to the amount of data that needs to be reviewed.

The suppliers of software to these companies also charge fees proportional to the amount of data. These suppliers have been trying to increase their piece of the pie by delivering smarter and smarter software, with features such as machine learning and clustering that reduce the time, the number of people, and the computing resources required to perform reviews. Shorter time to results is important, so as an industry we have made good progress on review. Not much progress, however, has been made on the identification and collection process, and that may be due to the business model of the eDiscovery software vendors.


The left side of the EDRM is relevant too

What if there were a way for enterprises to reduce the amount of data that is collected? That would move the relevance filtering that happens on the right side of the eDiscovery model a bit more to the left. Healthcare organizations, corporations, and governments have this opportunity. They can take control of eDiscovery costs and timelines by being smarter during the collection process. They own the data, so why not reduce the amount of data that is sent for review? This not only reduces costs, but also shortens the time it takes to get reviewed data to their corporate counsel.

The same thing applies to outside legal counsel. They have every reason to reduce the cost of eDiscovery and make their efforts with the customer more effective. Furthermore, they can get directly involved in helping customers reduce the cost by participating in this enhanced collection process.


How does this work? 

We could call it preemptive identification and collection. The idea is to have software that connects to all the data sources that are relevant to an organization and keeps a permanent collection and an up-to-date index of the information. When an eDiscovery request is made, inside or outside counsel can quickly identify a complete but minimal set of data by processing directly in the source location. Then, in one fell swoop, the information is collected, put on legal hold, and exported, ready for the review, production, and presentation stages.
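
To make the workflow above a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `Document` and `PreemptiveIndex` names, the in-memory index, and the simple custodian, keyword, and date filter are assumptions, not the API of any real eDiscovery product. It only shows the shape of the idea: index everything up front, identify a small set when a request arrives, then hold and export just that set.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Document:
    """Hypothetical index entry for one item in a connected data source."""
    doc_id: str
    source: str       # e.g. "fileshare", "cloud-drive", "email"
    custodian: str
    modified: date
    text: str
    size_bytes: int


@dataclass
class PreemptiveIndex:
    """Toy in-memory index kept up to date ahead of any eDiscovery request."""
    docs: dict = field(default_factory=dict)    # doc_id -> Document
    holds: dict = field(default_factory=dict)   # matter -> set of doc_ids

    def ingest(self, documents):
        # Add or refresh entries as the connected sources change.
        for doc in documents:
            self.docs[doc.doc_id] = doc

    def identify(self, custodians, keywords, start, end):
        # Return ids of documents matching custodian, date range, and any keyword.
        wanted = {c.lower() for c in custodians}
        terms = [k.lower() for k in keywords]
        return sorted(
            doc.doc_id
            for doc in self.docs.values()
            if doc.custodian.lower() in wanted
            and start <= doc.modified <= end
            and any(term in doc.text.lower() for term in terms)
        )

    def place_hold(self, matter, doc_ids):
        # Record which documents are under legal hold for a matter.
        self.holds.setdefault(matter, set()).update(doc_ids)

    def export(self, matter):
        # Export only the held documents for the matter, in a stable order.
        return [self.docs[i] for i in sorted(self.holds.get(matter, set()))]


index = PreemptiveIndex()
index.ingest([
    Document("d1", "email", "alice", date(2019, 1, 10), "Pricing contract draft", 2000),
    Document("d2", "cloud-drive", "alice", date(2019, 2, 1), "Holiday photos", 9000),
    Document("d3", "fileshare", "bob", date(2019, 3, 5), "Signed contract", 4000),
    Document("d4", "email", "carol", date(2019, 3, 9), "Contract renewal", 3000),
])

hits = index.identify(["alice", "bob"], ["contract"], date(2019, 1, 1), date(2019, 12, 31))
index.place_hold("matter-001", hits)
export_set = index.export("matter-001")

total_bytes = sum(d.size_bytes for d in index.docs.values())
export_bytes = sum(d.size_bytes for d in export_set)
print(f"Exporting {len(export_set)} of {len(index.docs)} documents "
      f"({export_bytes} of {total_bytes} bytes)")
```

In this toy data set only the documents that match the request leave the organization, so the volume sent for review, and with it any fee that scales with data volume, shrinks accordingly. A real system would need far richer search, audit trails, and defensible hold procedures than this sketch suggests.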

What is great is that this process applies not only to on-premises data, for which many organizations have collection tools, but also to the increasingly popular cloud services such as box.com, OneDrive, O365 email, and more. Within one system and one UI, documents can be collated into a minimal data set that will significantly reduce the cost of the eDiscovery process. Furthermore, the overall process is much shorter, which reduces timelines and allows organizations to prepare better for litigation.