Predicting the Future of Predictive Coding

Posted on July 9, 2012 by Hayes Hunt

By Hayes Hunt and Jillian R. Thornton

A decade ago, document review meant a small militia of lawyers sitting in a windowless warehouse surrounded by bankers’ boxes full of paper documents. Now, thanks to extreme information inflation, the bulk of document review takes place electronically. To keep up with the enormous volume of electronically stored information (ESI), lawyers have relied on a combination of keyword searches and manual review. The stakes are high: e-discovery can be responsible for 70 to 90 percent of the client’s cost of litigation. Recently, however, the universe of ESI has expanded exponentially. Exabytes have devoured the smaller gigabytes in the ESI pond. What’s next? Predictive coding.

Predictive coding uses algorithms to let a computer categorize a massive set of electronic data for a fraction of the cost of more traditional methods (i.e., a cadre of lawyers). Case law is now catching up to the technology, and various judges are giving the green light for lawyers to employ predictive coding in e-discovery without running afoul of the rules. The proper use of predictive coding, especially in large-data-volume cases, provides huge benefits for lawyers and clients: Predictive coding of ESI takes much less time, saves a lot of money and is often as accurate as, or more accurate than, manual review. Of course, predictive coding also can be problematic if, for example, privileged documents are disclosed.

A recent study by Rand Corp., which includes 57 case studies from eight large corporations, shows that the cost of e-discovery can be grouped into three main categories: collection, processing and review. Amazingly, the review phase accounted for 73 percent of the costs incurred during e-discovery. Predictive coding works to drastically reduce the number of documents that are manually reviewed by lawyers.

Here’s how it works: First, lawyers review a small sample of documents and code those documents for relevance, privilege or subject matter. The software then studies the sample set and applies the coding principles that it has learned to a larger set of documents. Then, the lawyers review the computer-coded documents to further teach the program how to code. This process continues until the software identifies only relevant documents. After coding is finished, the software can be used to select a small, random population of documents for lawyers to perform quality-control checks. If errors are found, the lawyers code more sample documents until the accuracy of the coding reaches an acceptable level. Then the review is complete.

The software can reduce the documents that need to be manually reviewed from a set of 2 million, for example, to only 3,000 to 5,000 documents. Assume it takes a lawyer 60 seconds to review a one-page document and you can easily do the math on the cost-effectiveness of predictive coding.
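For readers who want to see that math spelled out, here is a minimal sketch in Python. It uses only the figures given above (2 million documents, 3,000 to 5,000 documents, 60 seconds per one-page document). The function name `review_hours` is our own, and the sketch counts only manual review time; it leaves out the time lawyers spend coding sample sets and running quality-control checks, which the example above does not quantify.

```python
# Assumption taken from the example above: one minute per one-page document.
SECONDS_PER_DOCUMENT = 60


def review_hours(document_count, seconds_per_document=SECONDS_PER_DOCUMENT):
    """Return the lawyer-hours needed to manually review document_count documents."""
    return document_count * seconds_per_document / 3600


# Full manual review of the 2 million document set.
full_set_hours = review_hours(2_000_000)

# Manual review of the 3,000 to 5,000 documents left after predictive coding.
low_hours = review_hours(3_000)
high_hours = review_hours(5_000)

print(f"Full manual review: {full_set_hours:,.0f} hours")
print(f"After predictive coding: {low_hours:,.0f} to {high_hours:,.0f} hours")
```

On these assumptions, the full set works out to roughly 33,333 lawyer-hours, against 50 to 83 hours for the reduced set. Multiply either figure by whatever hourly rate applies to your matter to turn hours into dollars.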

Published in The Legal Intelligencer on June 27, 2012
