Not all bytes are equal: Neural byte sieve for fuzzing

Rajpal, Mohit; Blum, William; Singh, Rishabh

arXiv:1711.04596 (cs)

[Submitted on 10 Nov 2017]

Abstract: Fuzzing is a popular dynamic program analysis technique for finding vulnerabilities in complex software. It presents a target program with crafted malicious inputs designed to cause crashes, buffer overflows, memory errors, and exceptions. Crafting such inputs efficiently is a difficult open problem, and often the best approach is to apply uniform random mutations to pre-existing valid inputs (seed files). We present a learning technique that uses neural networks to learn patterns in the input files from past fuzzing explorations and uses them to guide future explorations. In particular, the neural models learn a function that predicts good (and bad) locations in input files for fuzzing mutations, based on past mutations and the corresponding code coverage information. We implement several neural models, including LSTMs and sequence-to-sequence models, that can encode variable-length input files. We incorporate our models into the state-of-the-art AFL (American Fuzzy Lop) fuzzer and show significant improvements in code coverage, unique code paths, and crashes for various input formats, including ELF, PNG, PDF, and XML.
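
To make the idea of predicting good and bad mutation locations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a per-byte score in [0, 1] is already available and restricts random bit flips to offsets whose score passes a threshold. The names `toy_scores` and `sieve_mutate`, the threshold, and the bit-flip mutation are all hypothetical; `toy_scores` is a hand-written stand-in for a trained neural model, and nothing here uses AFL's actual interface.

```python
import random
from typing import List


def toy_scores(data: bytes, header_len: int = 8) -> List[float]:
    """Hypothetical stand-in for a learned model: rate the first
    `header_len` bytes as useful to mutate (1.0) and the rest as not (0.0)."""
    return [1.0 if i < header_len else 0.0 for i in range(len(data))]


def sieve_mutate(
    seed: bytes,
    scores: List[float],
    threshold: float,
    rng: random.Random,
    n_flips: int = 4,
) -> bytes:
    """Flip `n_flips` random bits, choosing only among byte offsets
    whose score is at least `threshold`."""
    if len(scores) != len(seed):
        raise ValueError("scores must have one entry per input byte")

    # Offsets the (assumed) model considers worth mutating.
    candidates = [i for i, s in enumerate(scores) if s >= threshold]
    if not candidates:
        return seed  # no offset passes the sieve; leave the input unchanged

    out = bytearray(seed)
    for _ in range(n_flips):
        pos = rng.choice(candidates)
        out[pos] ^= 1 << rng.randrange(8)  # flip one random bit in that byte
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(0)
    seed = bytes(range(32))
    mutated = sieve_mutate(seed, toy_scores(seed), threshold=0.5, rng=rng)
    # Bytes outside the high-scoring region are never touched.
    print(mutated[8:] == seed[8:])
```

In a real setting the scores would come from a model trained on past mutations and their coverage outcomes, as the abstract describes. The fixed header rule here only illustrates how such scores could gate where mutations land.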

Subjects:	Software Engineering (cs.SE); Machine Learning (cs.LG)
Cite as:	arXiv:1711.04596 [cs.SE]
	(or arXiv:1711.04596v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.1711.04596

Submission history

From: William Blum
[v1] Fri, 10 Nov 2017 01:29:47 UTC (323 KB)
