AN INFORMATION EXTRACTION FRAMEWORK FOR DIGITAL FORENSIC INVESTIGATIONS
Abstract
The pervasiveness of information technology has led to an explosion of evidence. Attempting to discover valuable information from massive collections of documents is challenging. This chapter proposes a two-phase information extraction framework for digital forensic investigations. In the first phase, a named entity recognition approach is applied to the collected documents to extract names, locations and organizations; the named entities are displayed using a visualization system to assist investigators in finding coherent evidence rapidly and accurately. In the second phase, association rule mining is performed to identify the relations existing between the extracted named entities, which are then displayed. Examples include person-affiliation relations and organization-location relations. The effectiveness of the framework is demonstrated using the well-known Enron email dataset.
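As a rough illustration of the second phase, the sketch below mines pairwise association rules from sets of named entities that co-occur in the same document. It is not the chapter's implementation: the sample entities, the function name `mine_pair_rules`, the thresholds, and the restriction to two-entity rules are all assumptions made for the example, and the first phase (named entity recognition) is taken as already done.

```python
from collections import Counter
from itertools import combinations

# Hypothetical phase-one output: the set of named entities found in each document.
docs = [
    {"Alice Smith", "Acme Corp", "Houston"},
    {"Alice Smith", "Acme Corp"},
    {"Bob Jones", "Acme Corp", "Houston"},
    {"Alice Smith", "Houston"},
]

def mine_pair_rules(docs, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) between entity pairs.

    support    = fraction of documents containing both entities
    confidence = fraction of documents containing lhs that also contain rhs
    """
    n = len(docs)
    item_counts = Counter()
    pair_counts = Counter()
    for entities in docs:
        item_counts.update(entities)
        # Sort so that each unordered pair is always counted under the same key.
        pair_counts.update(combinations(sorted(entities), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

for lhs, rhs, support, confidence in mine_pair_rules(docs):
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

A rule such as a person name implying an organization name would correspond to the person-affiliation relations mentioned above; in practice the thresholds would need tuning to the size of the collection.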
Origin | Files produced by the author(s) |
---|