%0 Conference Proceedings
%T RouAlign: Cross-Version Function Alignment and Routine Recovery with Graphlet Edge Embedding
%+ Institute of Information Engineering [Beijing] (IIE)
%+ School of Cyber Security
%A Yang, Can
%A Liu, Jian
%A Luo, Mengxia
%A Gong, Xiaorui
%A Liu, Baoxu
%Z Part 4: Detecting Malware and Software Weaknesses
%< peer-reviewed
%( IFIP Advances in Information and Communication Technology
%B 35th IFIP International Conference on ICT Systems Security and Privacy Protection (SEC)
%C Maribor, Slovenia
%Y Marko Hölbl
%Y Kai Rannenberg
%Y Tatjana Welzer
%I Springer International Publishing
%3 ICT Systems Security and Privacy Protection
%V AICT-580
%P 155-170
%8 2020-09-21
%D 2020
%R 10.1007/978-3-030-58201-2_11
%K Edge embedding
%K Calling routine recovery
%Z Computer Science [cs]
%Z Conference papers
%X Reverse engineering is the labor-intensive work of understanding the inner implementation of a program, and it is necessary for malware analysis, vulnerability hunting, etc. Cross-version function identification and subroutine matching would save considerable manpower by indicating the known parts coming from different binary programs. Existing approaches mainly focus on function recognition and ignore the recovery of the relationships between functions, which makes it hard for researchers to locate the calling routine they are interested in. In this paper, we propose a method using graphlet edge embedding to abstract high-level topology features of function call graphs and recover the relationships between functions. With the recovery of function relationships, we reconstruct the calling routine of the program and then infer the specific functions in it. We implement a prototype model called RouAlign, which can automatically align the trunk routine of assembly code. We evaluate RouAlign on 65 groups of real-world programs, with over two million functions. RouAlign outperforms state-of-the-art binary comparison solutions by over 35% with a high precision of 92% on average in pairwise function recognition.
%G English
%Z TC 11
%2 https://inria.hal.science/hal-03440839/document
%2 https://inria.hal.science/hal-03440839/file/497034_1_En_11_Chapter.pdf
%L hal-03440839
%U https://inria.hal.science/hal-03440839
%~ IFIP
%~ IFIP-AICT
%~ IFIP-TC
%~ IFIP-TC11
%~ IFIP-SEC
%~ IFIP-AICT-580