Abstract
For years security machine learning research has promised to obviate the need
for signature based detection by automatically learning to detect indicators of
attack. Unfortunately, this vision hasn't come to fruition: in fact, developing
and maintaining today's security machine learning systems can require
engineering resources that are comparable to that of signature-based detection
systems, due in part to the need to develop and continuously tune the
"features" these machine learning systems look at as attacks evolve. Deep
learning, a subfield of machine learning, promises to change this by operating
on raw input signals and automating the process of feature design and
extraction. In this paper we propose the eXpose neural network, which uses a
deep learning approach we have developed to take generic, raw short character
strings as input (a common case for security inputs, which include artifacts
like potentially malicious URLs, file paths, named pipes, named mutexes, and
registry keys), and learns to simultaneously extract features and classify
using character-level embeddings and convolutional neural network. In addition
to completely automating the feature design and extraction process, eXpose
outperforms manual feature extraction based baselines on all of the intrusion
detection problems we tested it on, yielding a 5%-10% detection rate gain at
0.1% false positive rate compared to these baselines.
Users
Please
log in to take part in the discussion (add own reviews or comments).