As we’ve discussed previously, society is moving faster and faster into a world of complete digital existence, and the amount of ESI (Electronically Stored Information) is growing exponentially, making Predictive Coding the most discussed topic in E-Discovery. Legal events such as ILTA2013 (iltanet.com) and ACEDS 2013 (ACEDS.com) each host a number of panels discussing Predictive Coding. Judges are also increasingly seeing the need to weed through terabytes of data more efficiently, resulting in multiple case decisions that support the use of Computer Assisted Review (Global Aerospace, Inc., et al. v. Landow Aviation, L.P., et al., Loudoun County Circuit Court Case #CL61040 (VA), and Da Silva Moore, et al. v. Publicis Groupe and MSL Group in the Southern District of New York). As an E-Discovery consultant, I can also say that attorneys now ask me about Predictive Coding with much more regularity when we discuss case data.

So if this is the next big game changer, what is it and how does it work? To understand Predictive Coding from a 30,000 foot view, we will break it down into its three main components: Case Expert(s), an Analytics Engine, and Data Validation.

First is the Case Expert. It is imperative that the reviewers understand the case and can identify documents based on case criteria and case discussions. Computer assisted review is only as good as its teacher. If the reviewers don’t have a grasp of the goals of the case, their efforts will be ineffective; “garbage in, garbage out,” as the saying goes. Without the case expert(s), the computer has no way of identifying the importance of particular documents within the universe of ESI it is working with. Experts code for several factors in the review stage, thereby teaching the system which items are important. They may also decide whether a particular document in the sample review set should be used as a representative example.
For example, a document with little or no text may not reveal enough about its content to be a good example of responsiveness.

Next is the Analytics Engine. As with all software, there are countless tools out there, and as a service provider we utilize several. Among these engines are kCura’s Relativity Assisted Review and Relativity Analytics. Relativity has grown in popularity with its Assisted Review platform. In 2012, in an article published by Bruce A. Olson on www.technolawyer.com, Relativity Assisted Review was given a LitigationWorld® Technoscore of A+ “for its ease of use and ability to triage large volumes of ESI in a statistically meaningful way.” In September 2013 kCura announced “Relativity Voted Best Predictive Coding Solution and Best Online Review Platform in New York Law Journal Reader Rankings”. To accomplish its Analytics, Relativity uses a specific type of text analytics called latent semantic indexing (LSI), which identifies concepts within the text of the documents. The Experts’ coded sample set supplies the examples, and those concepts later help categorize the remaining data as responsive or non-responsive.

Lastly is Validation, the final component in Predictive Coding. Using the coding decisions from the review sample, the computer determines which documents it deems responsive. The system then generates a small set of documents for the Case Experts to review, to confirm that the initial coding sample gave it enough data to correctly identify the responsive documents within the case. These quality control checks (or rounds) can be repeated as changes are made, continuing to teach the computer until you are satisfied with the results and your review set falls within your acceptable margin of error.
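To make the "acceptable margin of error" idea a little more concrete, here is a rough Python sketch of the arithmetic behind a validation round. It is not how Relativity Assisted Review computes its statistics; it is a generic textbook approach (a normal approximation with a finite population correction), and every function name, the 95% confidence level (z = 1.96), and the sample figures are my own assumptions for illustration.

```python
import math

# Illustrative only: generic sampling statistics, not any vendor's actual method.
# All names and default values here are hypothetical.

def sample_size(population, margin=0.025, z=1.96, p=0.5):
    """Documents to sample for a given margin of error.

    Uses Cochran's formula with a finite population correction.
    p=0.5 is the most conservative guess for the unknown rate.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

def overturn_rate(sample):
    """Share of sampled documents where the expert disagreed with the computer.

    sample is a list of (computer_call, expert_call) pairs.
    """
    overturned = sum(1 for computer, expert in sample if computer != expert)
    return overturned / len(sample)

def margin_of_error(rate, n, population, z=1.96):
    """Margin of error around an observed rate from a sample of n documents."""
    fpc = math.sqrt((population - n) / (population - 1))
    return z * math.sqrt(rate * (1 - rate) / n) * fpc

if __name__ == "__main__":
    universe = 100_000  # assumed size of the document universe
    n = sample_size(universe, margin=0.025)
    print("Documents to sample:", n)

    # A made-up QC round: the expert disagrees on 1 of 4 documents.
    qc_round = [("R", "R"), ("R", "NR"), ("NR", "NR"), ("NR", "NR")]
    rate = overturn_rate(qc_round)
    print("Overturn rate:", rate)
    print("Margin of error:", margin_of_error(rate, len(qc_round), universe))
```

With only four documents the margin of error is very wide, which is exactly the point: a validation round needs a sample large enough that the disagreement rate you observe can be trusted before you decide the computer has learned enough.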
So although this process cannot predict the final outcome of your case, it will assist in culling non-responsive/non-relevant documents, help prioritize the responsive/relevant docs, and can reduce the overall number of reviewers needed per case. Don’t be afraid of your “Big Data”; take control of it.