The classification functions used in ediscovery’s Technology Assisted Review (TAR) tools, which rank documents by how closely they match the relevant documents used to train those tools, are usually judged successful based on their recall and precision. Dr. Herbert L. Roitblat has a great discussion on this topic here, which I agree with wholeheartedly. One key point that he skips over, though, is how to actually verify whether the recall percentage is correct. This in turn brings up the perennial question: What can we use to validate the results of a TAR review? Which leads us to: What can we use to compare the results between different TAR tools and workflows?
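
To make those two numbers concrete: recall is the fraction of the truly relevant documents that the tool found, and precision is the fraction of the documents the tool called relevant that really are relevant. Here's a minimal sketch in Python (the document IDs are made up for illustration) that also shows why you can only verify the recall figure if you already know which documents are truly relevant:

```python
def recall_and_precision(predicted_relevant, actually_relevant):
    """Compute recall and precision for one review.

    predicted_relevant: set of document IDs the TAR tool marked relevant.
    actually_relevant:  set of document IDs tagged relevant in a gold standard.
    """
    true_positives = len(predicted_relevant & actually_relevant)
    recall = true_positives / len(actually_relevant) if actually_relevant else 0.0
    precision = true_positives / len(predicted_relevant) if predicted_relevant else 0.0
    return recall, precision


# The tool found 3 of the 4 truly relevant documents (recall 0.75),
# and 3 of its 5 "relevant" calls were correct (precision 0.60).
recall, precision = recall_and_precision(
    predicted_relevant={"DOC-001", "DOC-002", "DOC-003", "DOC-007", "DOC-009"},
    actually_relevant={"DOC-001", "DOC-002", "DOC-003", "DOC-004"},
)
print(f"recall={recall:.2f}, precision={precision:.2f}")
```

Without that second set, the "actually relevant" documents, the recall number a vendor reports is something you have to take on faith.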

Ideally, we would have available at least one document set from a civil case, containing the complete productions of both parties, in which each document carries a tag marking it as relevant or not relevant to the matter at hand. This tagged document set could then be used to compare the TAR software’s predictions to the actual tags. With this information, ediscovery professionals could build the best workflow for their needs, choosing the TAR software best suited to that workflow. So why don’t we have this?
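
If we had that tagged set, the comparison itself would be simple. Here is a rough sketch of what it might look like, assuming a hypothetical CSV file (tagged_enron.csv) that pairs each document ID with its gold-standard tag and the relevance call produced by each TAR tool or workflow under test; the file name, column names, and tool names are all invented for illustration:

```python
import csv

def score_tools(path, tool_columns):
    """Score each TAR tool's relevance calls against the gold-standard tags.

    Assumes a CSV with a doc_id column, a gold column holding the
    gold-standard tag, and one column per tool, e.g.:
        doc_id,gold,tool_a,tool_b
        DOC-001,relevant,relevant,not_relevant
    """
    counts = {tool: {"tp": 0, "fp": 0, "fn": 0} for tool in tool_columns}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            gold_relevant = row["gold"] == "relevant"
            for tool in tool_columns:
                predicted_relevant = row[tool] == "relevant"
                if predicted_relevant and gold_relevant:
                    counts[tool]["tp"] += 1   # found a truly relevant document
                elif predicted_relevant:
                    counts[tool]["fp"] += 1   # called a non-relevant document relevant
                elif gold_relevant:
                    counts[tool]["fn"] += 1   # missed a truly relevant document
    for tool, c in counts.items():
        recall = c["tp"] / (c["tp"] + c["fn"]) if (c["tp"] + c["fn"]) else 0.0
        precision = c["tp"] / (c["tp"] + c["fp"]) if (c["tp"] + c["fp"]) else 0.0
        print(f"{tool}: recall={recall:.2f}, precision={precision:.2f}")

# Hypothetical file and tool names; nothing like this ships with any public set.
score_tools("tagged_enron.csv", ["tool_a", "tool_b"])
```

The hard part, of course, isn't the arithmetic; it's producing that gold column in the first place.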

The nature of ediscovery document sets is that they are private. Each document set contains the relevant documents in the case, which neither party is likely to want made public. Each set also contains documents which aren’t relevant, but which likely contain business or personal secrets. When Enron executives Jeffrey Skilling and Kenneth Lay were found guilty of conspiracy, fraud, and insider trading in 2006, the documents of the case were made public. The EDRM project made these documents available online, and Nuix provided a cleansed version of the set. So why isn’t a tagged version of Enron ready for use?

When I tested TAR software, I spent time tagging documents. If you’ve done the same, you know that reading each document and determining its relevance can be slow, painstaking work. Even if you’re quickly tagging obviously not relevant documents — Amazon receipts, blank documents, random characters — the process is dull. Very, very dull. To tag all 18 gigabytes of Enron email, you would need to find enough qualified reviewers who are willing to do the work. You will also likely need to pay them. Possibly a lot.

Since, unfortunately, the tags from the Enron case didn’t come along with the documents themselves, any new tagging initiative will need to take all of this into consideration. The issues are considerable, but not insurmountable. Even knowing all of this, I still wanted to find a set of tagged documents that could be used as a gold standard for comparing the results of TAR review. In my next post, I’ll tell you what my quest turned up.