Malware Family Classification Model Using User Defined Features and Representation Learning
Abstract
Malware is very dangerous for system and network user. Malware identification is essential tasks in effective detecting and preventing the computer system from being infected, protecting it from potential information loss and system compromise. Commonly, there are 25 malware families exists. Traditional malware detection and anti-virus systems fail to classify the new variants of unknown malware into their corresponding families. With development of malicious code engineering, it is possible to understand the malware variants and their features for new malware samples which carry variability and polymorphism. The detection methods can hardly detect such variants but it is significant in the cyber security field to analyze and detect large-scale malware samples more efficiently. Hence it is proposed to develop an accurate malware family classification model contemporary deep learning technique. In this paper, malware family recognition is formulated as multi classification task and appropriate solution is obtained using representation learning based on binary array of malware executable files. Six families of malware have been considered here for building the models. The feature dataset with 690 instances is applied to deep neural network to build the classifier. The experimental results, based on a dataset of 6 classes of malware families and 690 malware files trained model provides an accuracy of over 86.8% in discriminating from malware families. The techniques provide better results for classifying malware into families.
Origin | Files produced by the author(s) |
---|