%0 Conference Proceedings
%T Hashing Incomplete and Unordered Network Streams
%+ Institute of Information Engineering [Beijing] (IIE)
%+ Bank of China
%+ Chinese Academy of Engineering (CAE)
%A Zheng, Chao
%A Li, Xiang
%A Liu, Qingyun
%A Sun, Yong
%A Fang, Binxing
%Z Part 3: Network Forensics
%< peer-reviewed
%( IFIP Advances in Information and Communication Technology
%B 14th IFIP International Conference on Digital Forensics (DigitalForensics)
%C New Delhi, India
%Y Gilbert Peterson
%Y Sujeet Shenoi
%I Springer International Publishing
%3 Advances in Digital Forensics XIV
%V AICT-532
%P 199-224
%8 2018-01-03
%D 2018
%R 10.1007/978-3-319-99277-8_12
%K Fuzzy hashing
%K network traffic
%K approximate matching
%K file tracking
%Z Computer Science [cs]
%Z Conference papers
%X Deep packet inspection typically uses MD5 whitelists/blacklists or regular expressions to identify viruses, malware and certain internal files in network traffic. Fuzzy hashing, also referred to as context-triggered piecewise hashing, can be used to compare two files and determine their level of similarity. This chapter presents the stream fuzzy hash algorithm that can hash files on the fly regardless of whether the input is unordered, incomplete or has an initially-undetermined length. The algorithm, which can generate a signature of appropriate length using a one-way process, reduces the computational complexity from $O(n \log n)$ to $O(n)$. In a typical deep packet inspection scenario, the algorithm hashes files at the rate of 68 MB/s per CPU core and consumes no more than 5 KB of memory per file. The effectiveness of the stream fuzzy hash algorithm is evaluated using a publicly-available dataset. The results demonstrate that, unlike other fuzzy hash algorithms, the precision and recall of the stream fuzzy hash algorithm are not compromised when processing unordered and incomplete inputs.
%G English
%Z TC 11
%Z WG 11.9
%2 https://inria.hal.science/hal-01988840/document
%2 https://inria.hal.science/hal-01988840/file/472401_1_En_12_Chapter.pdf
%L hal-01988840
%U https://inria.hal.science/hal-01988840
%~ IFIP-LNCS
%~ IFIP
%~ IFIP-AICT
%~ IFIP-TC
%~ IFIP-WG
%~ IFIP-TC11
%~ IFIP-DF
%~ IFIP-WG11-9
%~ IFIP-AICT-532