简介

这篇文章提出了 Driller,这是一种混合漏洞挖掘工具,它以互补的方式将模糊测试和选择性混合执行结合起来,以发现隐藏更深的漏洞。模糊测试用于探索程序空间的不同区间,并使用混合执行来生成满足不同区间的输入。

Driller 概述

A core intuition behind the design of Driller is that applications process two different classes of user input: general input, representing a wide range of values that can be valid, and specific input, representing input that must take on one of a select few possible values. Conceptually, an application’s checks on the latter type of input split the application into compartments. Execution flow moves between compartments through checks against specific input, while, within a compartment, the application processes general input.

Driller is composed of multiple components:

In this example, the application parses a configuration file, containing a magic number, received over an input stream. If the received data contains syntax errors or an incorrect magic number, the program exits. Otherwise, control flow switches based on input between a number of new compartments, some of which contain memory corruption flaws.

模糊测试

To implement Driller, we leveraged a popular off-the-shelf fuzzer, American Fuzzy Lop (AFL). Our improvements mostly deal with integrating the fuzzer with our concolic execution engine. Since instrumentation that AFL relies on can be either introduced at compile-time or via a modified QEMU, we opted for a QEMU-backend to remove reliance on source code availability.

Fuzzer Features

Fuzzer Limitations

Because fuzzers randomly mutate input, and genetic fuzzers, in turn, mutate input that has, in the past, generated unique paths through a binary, they are able to quickly discover different paths that process “general” input. However, the generation of “specific” input to pass complex checks in the application is very challenging for fuzzers.

Transition to Concolic Execution

Driller aims to complement the fundamental weakness of fuzzing, determining specific user input required to pass complex checks, by leveraging the strength of concolic execution. When the fuzzing component has gone through a predetermined amount (proportional to the input length) of mutations without identifying new state transitions, we consider it “stuck”. Driller then retrieves the inputs that the fuzzer has deemed “interesting” in the current compartment and invokes the concolic execution engine on them.

The fuzzer identifies inputs as interesting if one of two conditions holds:

选择性混合执行

When Driller determines that the fuzzer is unable to find additional state transitions, the concolic execution engine is invoked. The concolic execution engine is used to leverage a symbolic solver to mutate existing inputs that reach but fail to satisfy complex checks into new inputs that reach and satisfy such checks.

When Driller invokes the concolic execution engine, it passes all of the “interesting” inputs that were identified by the fuzzing engine. Each input is traced, symbolically, to identify state transitions that the fuzzing engine was unable to satisfy. When such a transition is identified, the concolic execution engine produces input that would drive execution through this state transition.

After the concolic execution engine finishes processing the provided inputs, its results are fed back into the fuzzing engine’s queue and control is passed back to the fuzzing engine.

Concolic Execution

We leveraged angr for Driller’s concolic execution engine. The engine is based on the model popularized and refined by Mayhem and S2E.

Driller’s symbolic memory model can store both concrete and symbolic values. It uses an index-based memory model in which read addresses may be symbolic, but write address are always concretized. This approach, popularized by Mayhem, is an important optimization to keep the analysis feasible.

Limitations

The traditional approach to concolic execution involves beginning concolic execution from the beginning of a program and exploring the path state with the symbolic execution engine to find as many bugs as possible. However, this approach suffers from two major limitations.

Concolic Execution in Driller

In most cases, most of the work is offloaded from the concolic execution engine to the fuzzer, which will find many paths quickly, letting the concolic engine just work on solving the harder constraints.