Karl Schieneman, Founder and President of the predictive coding consultancy Review Less, talks with Patrick Oot from the E-Discovery Institute about their recent study with Oracle. The study involved ranking different predictive coding tools on some ESI that Oracle had lying around from a previous case. This study has been getting a lot of traction in the “blogosphere.” I thought it would make for a nice show to discuss the results and where this fascinating study is going. What started as a “bake off” between technology assisted review (“TAR”) tools is now turning into a wonderful source of data on TAR.
Since TREC went dark last year, the industry has needed some more studies. This study fills that void, and with a number of follow-up studies planned, it looks like it will continue to do so for a while. Enjoy listening to this fascinating show.
This curated compendium of glossaries includes a profusion of valuable resources related to eDiscovery, big data, information governance, digital forensics, privacy, and security:
The Sedona Conference
The Sedona Conference Glossary is published as a tool to assist in the understanding and discussion of electronic discovery and electronic information management issues. For this excellent seminal work I'd like to personally thank, in addition to the various authors, Richard Braman (in memoriam), founder and Executive Director of The Sedona Conference, for his numerous contributions to our profession. Download the PDF of The Sedona Conference Glossary: E-Discovery and Digital Information Management.
eDiscovery People's Glossary: File Types of Electronically Stored Information (ESI) provides details about particular file types and extensions in all known ESI format categories. The objective of this glossary is to provide in one place all file types with known extensions for every category of ESI. To keep pace with inventions of new information types and formats, the eDiscovery People continuously update this glossary. Access eDiscovery People's ESI File Types Glossary at https://ediscoverypeople.com/glossary/esi/file-types. If you learn about a new type of ESI, you may add a new entry to the eDiscovery People's Glossary of ESI File Types. You'll receive full attribution, including a link to your organization's website.
EDRM Glossary: The EDRM Glossary is a rather comprehensive listing of electronic discovery terms.
The Grossman-Cormack Glossary of Technology-Assisted Review (with Foreword by John M. Facciola, U.S. Magistrate Judge), Federal Courts Law Review, Volume 7, Issue 1, 2013. Download the PDF of The Grossman-Cormack Glossary of Technology-Assisted Review.
InterNational Committee for Information Technology Standards (INCITS)
The InterNational Committee for Information Technology Standards (INCITS) is the central U.S. forum dedicated to creating technology standards for the next generation of innovation. INCITS members combine their expertise to create the building blocks for globally transformative technologies. From cloud computing to communications, from transportation to health care technologies, INCITS is the place where innovation begins. Download the PDF of the INCITS glossary.
The National Institute of Standards and Technology (NIST)
This glossary of common security terms has been extracted from NIST Federal Information Processing Standards (FIPS), the Special Publication (SP) 800 series, NIST Interagency Reports (NISTIRs), and from the Committee for National Security Systems Instruction 4009 (CNSSI-4009). The glossary includes most of the terms in the NIST publications. Download the PDF of the Glossary of Key Information Security Terms (NIST.IR.7298r2, Revision 2), Richard Kissel, Editor.
U.S. National Archives & Records Administration
The U.S. National Archives & Records Administration has published archival terminology that includes a flexible group of common words that have acquired specialized meanings for archivists. Frequently used archival terms are those that describe documentary materials and archival institutions. Visit the site to view an early release, free version.
Possibly the most significant impact on archival language and professional boundaries resulted from the challenges of electronic records. E-records forced archivists into collaborations with different disciplines. In response, archivists adopted terms from information technology, publishing, and knowledge management. They began to grapple with born-digital documents and to become familiar with arcane aspects of technology used to record and authenticate electronic documents, such as ciphers, encryption keys, and encoding schemes. At the same time, other professions adopted—sometimes appropriated—archival terms. The very word that identifies the profession, archives, took on the meaning of offline storage and backup.
Society of American Archivists
Browse the terms (or download the PDF) of A Glossary of Archival & Records Terminology, by Richard Pearce-Moses, published by the Society of American Archivists.
ARMA International published (at a nominal cost) a Glossary of Records and Information Management Terms, 4th Ed. (ARMA TR 22-2012), which includes more than 800 terms from various disciplines related to records and information management (RIM), including information technology, legal services, archives, and business management. PDF available here.
NOTE: This compendium of glossaries will be regularly updated. To suggest an additional glossary for inclusion, please contact me and provide pertinent details.
Karl Schieneman, Founder and President of Review Less, a predictive coding consultancy, and Adjunct Analyst with the E-Discovery Journal, moderates a special ESIBytes show discussing a recent publication by Dr. Maura Grossman and Dr. Gordon Cormack entitled The Grossman-Cormack Glossary of Technology-Assisted Review, published in Volume 7, Issue 1 of the Federal Courts Law Review in 2013. Joining this show are the Honorable John M. Facciola from the District of Columbia, who wrote the foreword to the Glossary; Maura R. Grossman, Counsel at Wachtell, Lipton, Rosen & Katz; and Dr. Gordon V. Cormack, Professor at the University of Waterloo in Waterloo, Ontario.
The Glossary of Technology-Assisted Review defines technical terms and ground-breaking cases that relate to the field of predictive coding. The terms one can encounter in this field range from the expected, like Recall and Precision, to “hair on the back of your neck” terms like “Gaussian Distribution,” “Harmonic Mean,” “Jaccard Index” and many other technical terms. We will discuss why it is so important to have a glossary, move into how to use such a resource, and finally discuss future plans to put this information online for the legal community to update.
Karl Schieneman, Founder and President of Review Less, interviews Nicholas Pace, a social scientist with the RAND Institute and author of Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery. This podcast focuses on this study of electronic discovery costs, released in April 2012 by the not-for-profit RAND Institute for Civil Justice.
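For readers meeting these terms for the first time, here is a minimal sketch of how four of them are commonly computed. It is an illustration only and is not taken from the Glossary or the show: the function name `review_metrics` and the toy document IDs are hypothetical, and the formulas are the standard textbook ones (Recall, Precision, their Harmonic Mean, often called F1, and the Jaccard Index as intersection over union).

```python
def review_metrics(relevant, retrieved):
    """Compare the truly relevant documents with those a review retrieved.

    Both arguments are collections of document IDs (hypothetical toy data).
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = relevant & retrieved  # relevant documents the review found

    # Recall: share of the relevant documents that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    # Precision: share of the retrieved documents that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Harmonic mean of recall and precision (the F1 score).
    total = recall + precision
    f1 = 2 * recall * precision / total if total else 0.0
    # Jaccard index: overlap of the two sets divided by their union.
    union = relevant | retrieved
    jaccard = len(hits) / len(union) if union else 0.0

    return {"recall": recall, "precision": precision, "f1": f1, "jaccard": jaccard}


# Toy example: 4 relevant documents; the review retrieved 6, of which 2 are relevant.
metrics = review_metrics(relevant={1, 2, 3, 4}, retrieved={3, 4, 5, 6, 7, 8})
print(metrics)
```

In this toy example the review finds half of the relevant documents (recall 0.5), while only a third of what it retrieves is relevant (precision of about 0.33). The same Jaccard calculation can also be applied to two reviewers' coding decisions to measure how much they overlap.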
The Symantec eDiscovery Platform's Review & Production Module allows users to reduce review costs by up to 98% with Transparent Predictive Coding and provides flexible options to produce data.
Listen to Karl Schieneman, Founder and President of Review Less, talk with Herb Roitblat, Chief Scientist at OrcaTec, a predictive coding software company, and the Chairman of the E-Discovery Institute. Together we will discuss the validation of predictive coding, a topic which has become the central issue in the Da Silva Moore v. Publicis Groupe case in the Southern District of New York. We will also talk about statistics and other items which impact this discussion. For extra measure, we will discuss studies which are out in the field and what they mean when defending predictive coding. The Kleen case in the 7th Circuit will be mentioned quite a bit as well. This is an exciting show as we tackle some of the hottest issues in predictive coding that the legal community is facing right now.