Data Corpora for Digital Forensics Education and Research
Abstract
Data corpora are essential to digital forensics education and research. Several corpora are available to academia; these range from small, manually created data sets of a few megabytes to many terabytes of real-world data. However, different corpora are suited to different forensic tasks. For example, real data corpora are often desirable for testing forensic tool properties such as effectiveness and efficiency, but these corpora typically lack the ground truth that is vital to performing proper evaluations. Synthetic data corpora can support tool development and testing, but only if the methodologies for generating the corpora guarantee data with realistic properties. This paper presents an overview of the available digital forensic corpora and discusses the problems that may arise when working with specific corpora. The paper also describes a framework for generating synthetic corpora for education and research when suitable real-world data is not available.
Origin: Files produced by the author(s)