New approach to combatting malware
Building on discoveries he published in a 2018 paper, a researcher at The University of Texas at Arlington is working to combat malware by building anti-malware code into computer hardware instead of relying on software.
Jiang Ming, assistant professor of computer science and engineering, is using funding from a $174,998 National Science Foundation grant for the research.
Malware is malicious code inserted into a computer’s operating system through software files. It can cause minor inconveniences or massive problems, such as the hijacking of corporate computer networks that could lead to millions of dollars in lost time and revenue.
Hackers camouflage their malware so it won’t be detected by anti-virus software. The most common obfuscation technique is binary packing, in which code is placed into a “package” that is then hidden inside a number of other “written-then-executed” layers. Once the code starts running on a victim machine, the malware initiates an unpacking process that strips away the outer layers until the malicious code is revealed. Eighty percent of malware samples are packed because packing is a very efficient, low-cost approach and packing tools are easily purchased on the black market.
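To make the “written-then-executed” idea concrete, here is a minimal Python sketch of the general heuristic: a new layer is assumed to appear whenever a program executes memory that it wrote earlier. The `MemEvent` type, the `count_unpacking_layers` function, and the toy trace are hypothetical names invented for this illustration; it is not the method used in Ming’s research, and a real analysis would work on traces from an emulator or instrumented hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemEvent:
    """One step of a (hypothetical) memory trace."""
    kind: str      # "write" or "exec"
    address: int   # memory address touched by this step


def count_unpacking_layers(trace):
    """Count written-then-executed transitions in a memory trace.

    Each time execution reaches an address written since the previous
    transition, we assume a new unpacked layer has started running.
    """
    written = set()  # addresses written since the last detected layer
    layers = 0
    for event in trace:
        if event.kind == "write":
            written.add(event.address)
        elif event.kind == "exec" and event.address in written:
            layers += 1
            written.clear()  # start tracking writes made by the new layer
    return layers


# Toy trace: an unpacking stub writes code, jumps into it, and that code
# in turn writes and runs one more layer.
trace = [
    MemEvent("exec", 0x1000),   # stub runs (never written, so not a layer)
    MemEvent("write", 0x2000),
    MemEvent("write", 0x2001),
    MemEvent("exec", 0x2000),   # first written-then-executed transition
    MemEvent("exec", 0x2001),   # same layer, not counted again
    MemEvent("write", 0x3000),
    MemEvent("exec", 0x3000),   # second transition
]

layers_found = count_unpacking_layers(trace)
print(layers_found)
```

In this toy trace the function reports two layers. The sketch ignores details such as self-modifying code that is not packing, which a real detector would have to handle.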
Anti-virus software checks files against a database of known code patterns that indicate the presence of malware, but the big challenge in malware analysis is getting past the encryption and compression that packing applies. When security companies collect malware from the internet or a victim machine, the sample is packed rather than in its original form, which means it can’t be analyzed in order to stop the malicious code.
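The following Python sketch illustrates why a packed sample slips past a signature check. It is a toy under stated assumptions: the `SIGNATURE` bytes, the single-byte XOR “packer,” and the `scan` function are all invented for this example and stand in for far more elaborate real-world packers and anti-virus engines.

```python
# Hypothetical byte pattern that a signature database might flag.
SIGNATURE = b"EVIL_PAYLOAD_MARKER"
KEY = 0x5A  # assumed single-byte key for the toy packer


def xor_bytes(data, key=KEY):
    """Toy 'packer': XOR every byte with a key. Applying it twice unpacks."""
    return bytes(b ^ key for b in data)


def scan(sample, signatures=(SIGNATURE,)):
    """Naive signature scan: flag the sample if any known pattern appears."""
    return any(sig in sample for sig in signatures)


original = b"header " + SIGNATURE + b" trailer"
packed = xor_bytes(original)      # what an analyst typically collects
unpacked = xor_bytes(packed)      # what runs in memory after unpacking

print(scan(original))   # True: the pattern is visible in the original code
print(scan(packed))     # False: the pattern is hidden by packing
print(scan(unpacked))   # True: visible again once unpacked
```

The scanner only succeeds on the original or unpacked bytes, which is why recovering the unpacked code is the prerequisite for further analysis.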
In his new research, Ming hopes to develop a novel machine-learning model, recorded on a chip at the hardware level, that will allow recovery of a fully functional version of the original application programming interface, or API, of the malware’s binary code. The API is a set of clearly defined methods of communication among various software components.
Being able to recover the unpacking code at the hardware level will be more effective in preventing malware infection because once the binary code has been detected by anti-virus software, it is often too late to prevent it from unpacking.
“The big impact of our work is to apply lots of artificial intelligence and deep-learning techniques to analyze a large amount of malicious code and find variants,” Ming said. “The challenge is to apply the deep-learning approach because most memory is packed. This will pave the way for high-performance, larger-scale malware analysis with a deep-learning algorithm.”
Ming’s paper, “Towards Paving the Way for Large-Scale Windows Malware Analysis: Generic Binary Unpacking with Orders-of-Magnitude Performance Boost,” was published in the Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS ’18). He and his co-author, Binlin Cheng of Wuhan University and Hubei Normal University, presented their findings at the conference in October 2018.
Other collaborators were Jianming Fu and Guojun Peng of Wuhan University, Ting Chen and Xiaosong Zhang of the University of Electronic Science and Technology of China, and Jean-Yves Marion of L’Université de Lorraine, France.