Data Fingerprinting with Similarity Digests

Vassil Roussev

doi:10.1007/978-3-642-15506-2_15

Conference Papers, Year : 2010

University of New Orleans

Abstract

State-of-the-art techniques for data fingerprinting have been based on randomized feature selection pioneered by Rabin in 1981. This paper proposes a new, statistical approach for selecting fingerprinting features. The approach relies on entropy estimates and a sizeable empirical study to pick out the features that are most likely to be unique to a data object and, therefore, least likely to trigger false positives. The paper also describes the implementation of a tool (sdhash) and the results of an evaluation study. The results demonstrate that the approach works consistently across different types of data, and its compact footprint allows for the digests of targets in excess of 1 TB to be queried in memory.
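To make the idea of entropy-based feature scoring concrete, the following is a minimal sketch and not the sdhash implementation. It computes the empirical Shannon entropy of every fixed-size byte window in a buffer and keeps the windows whose score clears a cut-off. The 64-byte window size, the function names, and the simple minimum-entropy rule are all assumptions made for illustration; the abstract only states that the approach relies on entropy estimates combined with an empirical study.

```python
import math
from collections import Counter


def shannon_entropy(window: bytes) -> float:
    """Empirical Shannon entropy of the byte values in `window`, in bits per byte."""
    if not window:
        return 0.0
    total = len(window)
    counts = Counter(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def score_windows(data: bytes, size: int = 64):
    """Yield (offset, entropy) for each `size`-byte sliding window of `data`.

    The window size is an illustrative assumption, not a value taken
    from the abstract.
    """
    for offset in range(len(data) - size + 1):
        yield offset, shannon_entropy(data[offset:offset + size])


def candidate_features(data: bytes, size: int = 64, min_entropy: float = 1.0):
    """Keep windows whose entropy is at least `min_entropy`.

    Hypothetical selection rule: it only discards highly repetitive
    windows (for example, runs of zero bytes), which are unlikely to be
    unique to one object. The paper's actual selection criteria are not
    given in the abstract and are not reproduced here.
    """
    return [(off, h) for off, h in score_windows(data, size) if h >= min_entropy]


if __name__ == "__main__":
    sample = b"\x00" * 64 + bytes(range(64))
    for off, h in candidate_features(sample, min_entropy=5.99):
        print(off, round(h, 3))
```

A window of identical bytes scores 0 bits and a 64-byte window with no repeated byte scores 6 bits, so any cut-off in between trades off how many candidate features survive against how repetitive they are allowed to be.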

Domains

Digital Libraries [cs.DL]

Main file

Roussev10.pdf (1.61 MB)

Origin : Files produced by the author(s)
Licence : CC BY 4.0 - Attribution

https://inria.hal.science/hal-01060620

Submitted on : Tuesday, November 28, 2017, 12:26:02 PM

Last modification on : Friday, August 5, 2022, 2:56:50 PM

Dates and versions

hal-01060620 , version 1 (28-11-2017)

Identifiers

HAL Id : hal-01060620 , version 1
DOI : 10.1007/978-3-642-15506-2_15

Cite

Vassil Roussev. Data Fingerprinting with Similarity Digests. 6th IFIP WG 11.9 International Conference on Digital Forensics (DF), Jan 2010, Hong Kong, China. pp.207-226, ⟨10.1007/978-3-642-15506-2_15⟩. ⟨hal-01060620⟩
