Dec 01, 2005

"The simple act of searching does not guarantee that the item being sought will be found."

While this statement runs a serious risk of being overly simplified, it also has a significant and often overlooked meaning to attorneys engaged in the discovery process of digital files.

The Problem: Quite often, an attorney or a litigation support staffer will conduct a computer search for electronic files and have zero returns as the search result. Finding no search hits, the assumption is made that the file being sought does not exist. If the success of the case rests on finding discoverable electronic documents or the wrath of a judge is to be avoided for failing to produce electronic documents matching certain criteria, then this initial assumption can be a dangerous one.

In this article, I will not bore you with the typical comparison of how searches were done in the "old days", over the course of many sleepless nights and bad takeout food, poring over boxes and boxes of endless paper documents, as opposed to today's modern digital fortress. Nor will I provide meaningless statistics describing how the documents on a single computer, if printed and stacked on top of each other, would reach the height of the Empire State Building, etc. Any attorney who pays attention to their craft is well aware of these facts. This article will, however, address some of the more common reasons why a simple search within a computer, or collection of digital files, does not always find documents that do in fact exist and are within the scope of your search. These are documents that could make or break a case, the proverbial "Smoking Gun".

Identifying the Search: Most mistakes in digital discovery are made before the search even begins. The proper selection of search terms and search methods is crucial to success. Research is the best ally in this undertaking.

Attorneys who take on a case are often burdened with the daily demands that are present in every legal matter: appearances in and out of court, prep time before and debriefing after, the authoring of motions, witness interviews, depositions and the endless phone calls. Unless you have a staff of a dozen assistants to review each of the thousands of documents that can be produced with poorly crafted search terms, a careful and meaningful selection of keywords and search criteria will serve you and your client best.

The Human Aspect: The files you intend to discover or the files that you are required to produce were once created and used by a person or group of persons. These files were most likely named, created and sometimes maintained on a regular basis by this person(s). If you are seeking information to support your case, first you must identify this person and get to know the subject(s) of your search on a human level, not just as a legal target. Their common everyday language and even slang words will aid you in creating accurate and meaningful searches.

Chances are good that your client will have knowledge pertaining to the nature of this person's everyday personal life or work, social and/or religious environment. This knowledge may include known words or phrases that are not common everyday language for the matter at hand. Attorneys are often labeled as having "Legal Speak"; similar labels are attached to many other occupations and social groups. Get to know these words and phrases as they relate to the case, and incorporate them into your search where they are relevant.

Abbreviations: Most daily events in social life and the workplace involve repetition and, as such, produce a great number of abbreviations. Workplaces will often turn a document named "Over Time Request Form" into "OTRF" or "OT Form". The list is endless. Searching only for the formal name of a document might end with dismal results.

Some companies use a numbering system for documents. The above mentioned "Over Time" form may be referred to as a "1018 form" or something similar. I have worked on cases where company management did not call their documents by an official numbering system, but employees took it upon themselves to give certain documents a numbered abbreviation. Sometimes this numbering system was carried over from a subject's experience with previous employers who did use a numbering system, and the usage gradually caught on with other employees. Only persons intimate with the subject of your search would know details like this. Locate and interview them; it could make the difference in the end.
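The abbreviation problem can be made concrete with a small sketch. The Python below is only an illustration, not a feature of any litigation product: the function names `initialism` and `expand_terms` are hypothetical, and the aliases passed in are assumed to come from interviews with people who know the custodian, since no program can guess an informal name like "1018 form".

```python
def initialism(title):
    """Build an initialism from the first letter of each word."""
    return "".join(word[0].upper() for word in title.split())


def expand_terms(formal_name, known_aliases=()):
    """Return the formal name plus its variants, without duplicates."""
    candidates = [formal_name, initialism(formal_name), *known_aliases]
    seen = set()
    variants = []
    for candidate in candidates:
        key = candidate.lower()
        if key not in seen:  # compare case-insensitively, keep first spelling
            seen.add(key)
            variants.append(candidate)
    return variants


# Aliases such as "OT Form" or "1018 form" come from interviews, not from code.
terms = expand_terms("Over Time Request Form", ["OT Form", "1018 form"])
print(terms)
```

Each entry in the resulting list would then be run as its own search term, so a file named only "OTRF" is not missed.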

Foreign Languages: Matters of law use Latin heavily on both the civil and criminal ends of the spectrum. Who else but a lawyer would know what pro bono publico or in camera means in a legal setting? Even the Police and Military use many words of French origin to denote their ranking system, such as Corporal, Sergeant and Lieutenant.

Sometimes you will find that a subject in the matter may not have full command of the English language and will occasionally revert to their native tongue within certain computer files. It's not so far-fetched because, after all, they never really expected you or anyone else to one day be looking at their plan to embezzle money from the company. I have also investigated several cases where subjects possessed foreign language skills and deliberately used these skills to create documents that their employers or co-workers could not read. There are many other examples within our melting pot society. Find them if they exist in your case, and utilize a staffer or an interpreter if necessary.

What's in a Name? Most often, your searches will be limited to the files and emails created and/or used by certain persons or custodians. Custodians are usually identified by name and in some cases by sections or locations (Miami Office or Sales Team). This can seem to be a rather simple undertaking, until the human aspect kicks in again. As we know, it's common for people who enter into the bonds of marriage to change their last name. A custodian named Jane Doe could have once been a person known as Jane Smith, her maiden name. While family and close friends or co-workers will know this detail, computers are not often privy to such information; they only know what they are told. In a digital world, a person cannot have two names; computers generally do not work that way. To complicate things even more, it is also common for the subject to keep their previous name and use a modified version such as Jane Doe-Smith. Some cultures will dictate that it is the male who will change his name when he becomes married.

So it's easy to see why a current day search for Jane Doe will not produce files under the maiden name of Jane Smith, even though she was employed as Jane Smith at one time and files do exist under that name. If the discovery period covers the time before your subject changed their name, you have a serious problem. If you're on the side seeking files, you will have a potentially large hole in your discovery period. If the roles are reversed, this situation could make for some tense moments as you try to explain to the judge that it was just a simple clerical oversight (even though it was clearly printed in the subject's personnel file) and that you were not actually trying to hide data. I'm sure the other side's counsel will be most sympathetic to your position.

Another aspect to consider is religious naming conventions. For example, it is common for persons of the Muslim faith to take on an Islamic name upon entering into the faith. This will most likely result in a new first and last name. Again, if this name change happens within your discovery period, you will want to know this.
When possible, check the personnel files, talk to your client and look for these details.
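To show how a name change multiplies the search terms for a single custodian, here is a minimal Python sketch. The function `name_variants` is a hypothetical helper, and it assumes only the last name changed; a change of both first and last name, as in the religious example above, would simply need the full former name added by hand.

```python
def name_variants(first, current_last, former_lasts=()):
    """List the name forms a custodian's files might appear under."""
    variants = [f"{first} {current_last}"]
    for former in former_lasts:
        variants.append(f"{first} {former}")                 # former name alone
        variants.append(f"{first} {current_last}-{former}")  # hyphenated, current first
        variants.append(f"{first} {former}-{current_last}")  # hyphenated, former first
    return variants


# The custodian from the example: now Jane Doe, formerly Jane Smith.
for name in name_variants("Jane", "Doe", ["Smith"]):
    print(name)
```

Running every variant as a custodian search, rather than the current name alone, is what closes the hole in the discovery period described above.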

A Bit About Orders: If you're seeking a discovery order of the opposing side's computer(s), it's almost a foregone conclusion that they will object entirely or in part to your attempts to discover their client's data. You're likely to only get one shot at achieving a meaningful discovery order for electronic files, so be certain to do the research well beforehand and choose the search terms/keywords and criteria that allow for the best possible discovery return. Be cautious of using search terms and criteria that box you into a corner.

All the research in the world will not be able to predict search terms that will find each and every relevant file you would wish to discover. If the other side has their act together, they will want to limit your search to as narrow a date range as possible and spell out specific search keywords to be used. Avoid the date range issue entirely, if possible, and push for language that names a keyword together with a description of related keywords, rather than only the actual keywords themselves, such as: files containing YOUR KEYWORD HERE, and files which may be similar in literal or figurative meaning to the search term.

If the opposing side is going to push for a date range to be put into effect, chances are good that they will get it. You have no right to rummage through personal or company files and emails that are not within the affected time period of your litigation. However, human nature being what it is, chances are also good that the target of your litigation did not just wake up one morning and decide to violate your client's rights. They probably thought about it beforehand; maybe they discussed it with someone, possibly in an email. As well, after the act had been committed, people, in their human nature, like to talk about things they have done, good and bad. There stands a chance that they might have talked about what they had done after the fact. Maybe they bragged about it, or had remorse and decided to communicate this to someone.

There are numerous cases that have been litigated successfully on such communications or recordings of electronic files prior to or after the fact. And there are also many orders that were given to allow for expanded date range searches. I have investigated cases where there was little if any actual evidence that decisively proved the mechanics of the charges in the case, but they were successfully litigated because custodians had discussed the act, via email, before or afterward for some time.
Crafting your order properly can pay off big in the end. Failure to do so might cause you to miss files that are entirely relevant, but do not match the criteria as spelled out in the order, and as such, you would have no right to discover them.

If you're searching your own client's computer system, there will most likely be no order. However, you will stand a good chance of missing crucial files if you are not flexible enough to pursue similar files and adapt your search terms on the fly.

Collection and Processing of Electronic Data: Assuming that the data you're going to search has already been processed by your litigation support staff and is in good order, you are ready to begin searching the electronic evidence for relevant information. It is important to note that the daunting task of collection and initial processing of digital evidence is best left to an outside digital forensics provider, particularly when it pertains to the opposing side's data. Some very well structured Litigation Support Departments are capable of performing this task; however, I will not address that topic in this article, as its primary goal is to assist in effective searching techniques.

When your litigation support staff receives the data, they will need to compile it and take care of issues such as compressed files, databases, duplicate files, password protected files, metadata and the like, not to mention encrypted files, steganography, viruses and the issue of unallocated space (the area of the digital media that is no longer in use, but was once used to store files and may still contain data).

Research in Motion: Now that you've done the research, your data is collected and processed and you have a good set of search terms, it's time to get to work on the files.

Simply entering search terms into the search box of your favorite litigation discovery program will not always be the best method for finding the files that you seek. The files that would be found with just a plain text search word are likely to be limited at best. Additionally, basic search term usage can return hundreds of thousands of files that you will never have time to review, which can be overwhelming.

The Techniques: Depending on which software program your firm uses, you will have certain abilities to search beyond using just a plain text word search. I am not going to describe the inner workings of Concordance, Doculex, Summation and the like, but will only state that most of these programs have the ability to search with some or all of the techniques mentioned below. Consult your litigation support staff for more information about your specific software product.
All of the search methods listed below allow for searching within files as well as file names.

Basic Search Elements

Simple Word Search: By simply typing the word model, you will get returns for documents that contain the word model anywhere in the document. This is true even if the word appears inside another word, such as remodeling. Many search programs allow for a basic way around this type of result. You could enter a search for:

(space key)model(space key)

Note: (space key) denotes pressing the space key; do not type the words as they appear.
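The difference between matching a string of letters and matching a whole word can be sketched in a few lines of Python. This is only an illustration using the standard `re` module; the function names are hypothetical, and your litigation software may handle word boundaries differently. One design point worth noting: a word boundary also catches the word when it is followed by punctuation, which a literal space-padded string would not.

```python
import re


def contains_substring(text, term):
    """Plain search: true if the letters appear anywhere, even inside a word."""
    return term.lower() in text.lower()


def contains_whole_word(text, term):
    """Whole-word search: the term must stand alone as its own word."""
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


memo = "We are remodeling the Miami office."
print(contains_substring(memo, "model"))   # matches inside "remodeling"
print(contains_whole_word(memo, "model"))  # no standalone word "model"
print(contains_whole_word("See the revised model, attached.", "model"))
```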

Wild Cards: Most search programs will allow you to substitute wild cards for letters that are unknown. This is a very common search method and can be of great aid in a situation where you either do not know the full spelling of the word you're searching for or you have reason to believe that the word might be misspelled. A wild card search would look like this:

*ouse

This search would produce results for "House" or "Mouse", etc. Note: Various litigation programs will also allow, if not require, that you use ! or ? in place of the * symbol.
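As a rough sketch of how a wild card behaves, the Python below uses the standard `fnmatch` module with a hypothetical pattern, `*ouse`, chosen to match the "House" and "Mouse" results mentioned above. In `fnmatch`, `*` stands for any run of characters (including none) and `?` stands for exactly one character; the symbols and their meanings in your own software are an assumption to verify with your litigation support staff.

```python
import fnmatch
import re


def wildcard_hits(text, pattern):
    """Return the words in text that match a wild card pattern, ignoring case."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    return [w for w in words if fnmatch.fnmatchcase(w.lower(), pattern.lower())]


sample = "The House had a mouse near the warehouse."
print(wildcard_hits(sample, "*ouse"))  # any number of leading characters
print(wildcard_hits(sample, "?ouse"))  # exactly one leading character
```

Note how the broader `*` also pulls in "warehouse"; choosing the narrower symbol, where the tool offers one, keeps false returns down.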

Inclusion: If you need to find files that have all the words you enter, not just some of them, the + symbol lets you do this. For example, if you want to find files that have references to both President Clinton and Kenneth Starr in the same file, you could search this way:

clinton +starr

Note: Various litigation programs will also allow, if not require, that you use AND in place of the "+" symbol.

Exclusion: If you want to search for files that have one word in them but not another word, the "-" symbol lets you do this.
For example, if you want to find files with President Clinton but don't want to see files relating to the Monica Lewinsky scandal, you could search this way:

clinton -lewinsky

Note: Various litigation programs will also allow, if not require, that you use NOT in place of the "-" symbol.

Phrases: If you want to find files that have words in the exact order that you type them, using the phrase method lets you do this. The above methods will produce documents no matter where the keywords might appear in the document.

For example, if you want to find documents that have the words after hours trade, in that exact order, you could search this way:

"after hours trade"

Multiple methods at once: You can also use more than one of the above methods. If your case involves an SEC investigation, and after hours trading is suspected, the following might produce good, or at least more selective, results:

after hours +trading  -night club -party -bar
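The inclusion, exclusion and phrase methods above can be combined in a short Python sketch. This is an illustration under stated assumptions, not how any particular product parses a query: the function names are hypothetical, each term is treated as an exact word or phrase, and the combined query is read as requiring "after hours" and "trading" while excluding "night club", "party" and "bar".

```python
import re


def normalize(text):
    """Lowercase the text and pad it with spaces so whole words can be matched."""
    return " " + " ".join(re.findall(r"[a-z0-9']+", text.lower())) + " "


def has_term(normalized_text, term):
    """True if the word or exact phrase appears in the normalized text."""
    return normalize(term) in normalized_text


def matches(text, all_of=(), none_of=()):
    """Inclusion (all_of) and exclusion (none_of) applied to one document."""
    body = normalize(text)
    included = all(has_term(body, term) for term in all_of)
    excluded = any(has_term(body, term) for term in none_of)
    return included and not excluded


query = {"all_of": ["after hours", "trading"],
         "none_of": ["night club", "party", "bar"]}
print(matches("The after hours trading desk placed the order.", **query))
print(matches("After hours trading stories at the bar.", **query))
```

The second document is dropped by the exclusion even though it satisfies every required term, which is exactly the trade-off to weigh before excluding a common word.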

Advanced Search Methods

Proximity: A proximity search will find words within a designated proximity of other words, either before or after the main keyword.  This method is useful for finding documents that contain sentences or paragraphs of interest and can greatly cull down the mountain of documents in your database.  If you were looking for an email or document that might pertain to a custodian named "James" and a company product named "wigit", you could enter a search like this:

 James w/10 wigit

This search will find the word James within ten words, before or after, of the word wigit. You could increase or decrease the number of words to suit your needs. If you know that the document will contain a certain number of consistent words, or the communication or document that you're looking for will contain a paragraph that has the two words in your proximity search, then this method can make searching your documents very easy.
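A proximity search can be sketched as a comparison of word positions. The Python below is a minimal illustration with a hypothetical function name; it assumes "within ten words" means the positions of the two words differ by no more than ten, and individual products may count the distance slightly differently.

```python
import re


def within_proximity(text, term_a, term_b, max_words):
    """True if term_a occurs within max_words of term_b, before or after."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(a - b) <= max_words
               for a in positions_a for b in positions_b)


near = "James said the new wigit shipment is late."
far = "James " + "filler " * 20 + "wigit"
print(within_proximity(near, "James", "wigit", 10))
print(within_proximity(far, "James", "wigit", 10))
```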

Fuzzy Logic: Spelling and punctuation mistakes happen often. If you were searching for the term wigit and the main custodian in your case spelled the word wiggit, a plain search would never find the documents. With the use of fuzzy logic search terms, you can address known or foreseen spelling issues. Most software tools that offer this function will have a variable that can be set, such as 1 through 10. If a fuzzy logic of 1 is set, it will allow for one character within the search term to be wrong, missing or additional. Obviously, the higher the variable you set, the more false returns you will get, so adjust accordingly.

Note: This is not as similar as it might appear to the above Wild Card search method. A wild card search requires you to place the wild card somewhere in the search word, and it will only allow the variation in that specific place. Fuzzy logic will allow the variation to exist anywhere within the word, the only constraint being the number set for the logic variable.
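One common way to model this kind of fuzzy matching is edit distance: the number of single-character insertions, deletions or substitutions needed to turn one word into another. The Python sketch below assumes that model; it is not taken from any specific product, whose fuzzy setting may be computed differently, and the function names are hypothetical.

```python
import re


def edit_distance(a, b):
    """Count the single-character edits needed to turn word a into word b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def fuzzy_hits(text, term, max_edits):
    """Return words in text that are within max_edits of the search term."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    return [w for w in words
            if edit_distance(w.lower(), term.lower()) <= max_edits]


sample = "Ship the wiggit and the widget today"
print(fuzzy_hits(sample, "wigit", 1))
print(fuzzy_hits(sample, "wigit", 2))
```

Raising the setting from 1 to 2 brings in "widget" as well as "wiggit", which shows in miniature how a higher variable widens the net.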

Synonym: This is a very interesting and exotic function, and not all tools possess it. Basically, the search tool will return search results for synonyms of the search term that is entered. If you search for the word raise, the program will also return search results for files containing the word lift, etc.

Phonic: This function can be a useful tool as it allows you to search for keywords and also words that sound similar. A search for the term raise will produce search results for documents which contain the term raze.

Stemming: This function will return search hits for documents that contain words that stem from the original search term. For example, if you search for the term "raise", you would receive results for documents containing the word "raising".
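The stemming behavior just described can be illustrated with a deliberately crude Python sketch. It strips a handful of common English endings and compares what is left; real tools use far more careful rules, so treat the suffix list and the function names here as assumptions made purely for illustration.

```python
import re

# A tiny, hand-picked suffix list; real stemmers are much more thorough.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def crude_stem(word):
    """Strip one common ending, keeping at least three letters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_hits(text, term):
    """Return words in text that reduce to the same stem as the search term."""
    target = crude_stem(term)
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if crude_stem(w) == target]


sample = "They discussed raising prices and razed the old plan after the raise."
print(stem_hits(sample, "raise"))
```

The sound-alike word "razed" is not returned here, because it shares a sound with "raise" rather than a stem.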

Date and time stamps:
The creation, modification and access dates of electronic files can be dizzying for even the most experienced, computer literate person. This topic alone could warrant the authoring of an entire book. Instead of going into the finer nuances of what factors create the three main date and time stamps of most digital files, I'll stick to the topic of the article, searching the dates.

Sooner or later, you are going to want or need to apply your search terms to files within a date range. During most of your searches this will not be a problem, as the vast majority of digital files can be easily searched or filtered by date in the current litigation software on the market today. Occasionally an attorney might be faced with a file that displays a Modified date that is listed as being some time BEFORE the Creation date. In this case, the obvious question will arise: How can a file be modified before it was created? The simplest explanation is that the file was most likely copied from another digital storage device, such as a USB thumb drive or a floppy disk, after it was originally created. The copy receives a new Creation date, while the Modified date carries over from the original. The best practice for this situation is to apply the date range to the earliest or latest of the three dates (Modified, Accessed, Created, or "MAC"), depending on which end of the range you are testing, as there will be little if any initial reason to question the authenticity of the dates. Erring on the side of inclusion is a safe bet in this circumstance.
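One way to read the practice described above is sketched in Python below: keep a file if the span between its earliest and latest time stamps touches the date range at all. The `FileDates` record, the function name and the sample dates are hypothetical, and this is only one interpretation of erring on the side of inclusion.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileDates:
    name: str
    created: datetime
    modified: datetime
    accessed: datetime


def in_range_inclusive(file, start, end):
    """Keep the file if any part of its earliest-to-latest span overlaps the range."""
    stamps = (file.created, file.modified, file.accessed)
    return min(stamps) <= end and max(stamps) >= start


# Modified before created: typical of a file copied in from another device.
copied = FileDates("budget.xls",
                   created=datetime(2005, 3, 10),
                   modified=datetime(2004, 11, 2),
                   accessed=datetime(2005, 3, 12))

print(in_range_inclusive(copied, datetime(2004, 10, 1), datetime(2004, 12, 31)))
print(in_range_inclusive(copied, datetime(2005, 6, 1), datetime(2005, 12, 31)))
```

A filter on the Creation date alone would have excluded this file from the late 2004 range, even though its contents were last changed inside that period.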

Searching Within Results:
Another useful feature to take advantage of, if your software supports it, is to search within search results. This method will allow for better search results as well as an audit trail. Additionally, this method will allow for you to expand your results back to the original search results, and reapply your secondary searches with variations.

For instance, this can have benefits if your first search is for a custodian's name. This search will obviously hit on just about every email the person ever sent or received, as well as many documents that may contain his or her name. From there, you can apply a secondary search term which may be specific to the matter of your case.
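Searching within results, with a record of each step, can be sketched as follows. The Python is an illustration only: the document set, the custodian name "John Roe" and the function names are all made up for the example, and a real product would keep its own audit log.

```python
def search(documents, term):
    """Return the subset of documents whose text contains the term."""
    term = term.lower()
    return {doc_id: text for doc_id, text in documents.items()
            if term in text.lower()}


def refine(results, term, audit_trail):
    """Search within earlier results and record the step for the audit trail."""
    narrowed = search(results, term)
    audit_trail.append({"term": term,
                        "searched": len(results),
                        "hits": len(narrowed)})
    return narrowed


documents = {
    "email-001": "From John Roe: the wigit pricing memo is attached.",
    "email-002": "From John Roe: lunch on Friday?",
    "memo-003": "Wigit pricing review, no custodian named.",
}

audit_trail = []
first_pass = refine(documents, "john roe", audit_trail)  # custodian name
second_pass = refine(first_pass, "wigit", audit_trail)   # matter-specific term
alternate = refine(first_pass, "lunch", audit_trail)     # variation on first pass
print(sorted(second_pass), sorted(alternate))
print(audit_trail)
```

Because the first pass is kept intact, a different secondary term can be tried against it at any time without rerunning the custodian search.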

Reining in the Results:
The overall goal when searching for digital files, if possible, is to cast your search net wide and then cull it down to files and information that best serve your case. If the research is done thoroughly and the search methods above are well chosen and executed, you should have some nice results, or at least the best that could be obtained from the data you have.

Expert Assistance:
You should never be hesitant to engage expert assistance. As alluded to above, there are many situations where the assistance of a digital forensics expert will be helpful, if not required. The benefits of such an engagement can be both technical and legal. From a technical perspective, a properly trained and certified forensic investigator will be more likely to find data than your average IT/IS staffer. Their investigative instincts are also likely to be honed for this specific type of activity. From a legal perspective, having a third party conduct the examination and searches also helps fend off any challenges to the credibility of the evidence or findings. There is just no replacement for an experienced investigator who has spent the last fifteen years creating and conducting thousands of digital searches. The advantages of including an expert, who understands how operating systems and digital file formats work together, can be more beneficial than could ever be explained in this article. Happy Hunting!