The insertion of expressions mixing arithmetic operators and bitwise boolean operators is a widespread protection of sensitive data in source programs. This recent advanced obfuscation technique is one of the least studied among program obfuscations, even though it is commonly found in binary code. In this paper, we formally verify this data obfuscation in Coq. It operates over a generic notion of mixed boolean-arithmetic expressions and relies on properties of bitwise operators over machine integers. Our obfuscation performs two kinds of program transformations: rewriting of expressions and insertion of modular inverses. To facilitate its proof of correctness, we define boolean semantic tables, a data structure inspired by truth tables.
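To give a feel for the kind of rewriting involved, here is a minimal C sketch of two classic mixed boolean-arithmetic identities on 32-bit unsigned integers. The function names are hypothetical, and the identities are textbook examples chosen for illustration; they are not claimed to be the rewrite rules shipped with the development.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain addition on 32-bit machine integers (wraps modulo 2^32). */
uint32_t add_plain(uint32_t x, uint32_t y) {
    return x + y;
}

/* Same value through an MBA identity:
 *   x + y == (x ^ y) + 2 * (x & y)
 * XOR is the sum without carries, AND marks the carried bits. */
uint32_t add_mba(uint32_t x, uint32_t y) {
    return (x ^ y) + 2u * (x & y);
}

/* Subtraction through a similar identity:
 *   x - y == (x ^ y) - 2 * (~x & y) */
uint32_t sub_mba(uint32_t x, uint32_t y) {
    return (x ^ y) - 2u * (~x & y);
}

/* Spot-check both identities on a few values, including edge cases. */
void check_mba_identities(void) {
    const uint32_t samples[] = {
        0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0xdeadbeefu, 0xffffffffu
    };
    const size_t n = sizeof samples / sizeof samples[0];
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            uint32_t x = samples[i], y = samples[j];
            assert(add_mba(x, y) == add_plain(x, y));
            assert(sub_mba(x, y) == (uint32_t)(x - y));
        }
    }
}
```

Calling `check_mba_identities()` from a `main` function exercises both identities. Such spot checks only give confidence on the sampled values, which is precisely why a machine-checked proof over all machine integers is valuable.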

Our obfuscation is integrated into the CompCert formally verified compiler where it operates over Clight programs. The automatic extraction of our program obfuscator into OCaml yields a program with competitive results.

This work has been submitted to CPP 2019.

The development has been made with Coq 8.8.2 and OCaml 4.06.1. The sources are available here.

Note: the sources contain a modified version of CompCert 3.4. CompCert also requires the Menhir parser generator. Refer to the CompCert manual for detailed information.

- Download and extract the archive: `tar -xvf CompCert-3.4-obf.tar.xz`.
- Inside the folder, run `./configure` with appropriate options (refer to the manual), then `make all` to:
  - verify the proofs (this step is CPU-intensive and may take several minutes to complete);
  - extract the executable OCaml code;
  - compile the modified version of CompCert with the obfuscation based on mixed boolean-arithmetic expressions.
- Then, run `make install` to install the `ccomp` executable.

`ccomp file.c -o file -mbaobf cfg.mba` compiles an obfuscated version of `file.c` to `file`, using the obfuscation instructions in the file `cfg.mba`.

- Adding the option `-dclight` outputs an obfuscated C version of `file.c` in `file.light.c`.
- The archive contains two example files: `test.c` and `cfg.mba`. You can run `ccomp test.c -o test -mbaobf cfg.mba -dclight` at the root of the folder, and observe the obfuscation in the file `test.light.c`.
- In the configuration file (`cfg.mba`), each line corresponds to an obfuscation pass and contains 4 integers: `"a;b;c;d"`. This instruction obfuscates the `a`-th expression (in breadth-first order over the AST of the program) using the `b`-th rewrite rule, then introduces the affine function `(2*c+1)x + d` and its inverse around the freshly obfuscated expression. The multiplier `2*c+1` is always odd, hence invertible modulo a power of two.
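The affine step can be sketched in C as follows, assuming 32-bit unsigned arithmetic. The `struct affine` type and all function names are hypothetical, and the inverse is computed here by Newton iteration purely for brevity; this is not necessarily how the development computes it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the last two fields of an "a;b;c;d" line. */
struct affine {
    uint32_t mul;      /* 2*c + 1, always odd */
    uint32_t add;      /* d */
    uint32_t mul_inv;  /* inverse of mul modulo 2^32 */
};

/* Inverse of an odd integer modulo 2^32 by Newton iteration.
 * Each step doubles the number of correct low-order bits. */
uint32_t inverse_mod_2_32(uint32_t a) {
    assert(a & 1u);               /* only odd values are invertible */
    uint32_t inv = a;             /* a * a == 1 (mod 8): 3 correct bits */
    for (int i = 0; i < 4; i++) {
        inv *= 2u - a * inv;      /* 3 -> 6 -> 12 -> 24 -> 48 correct bits */
    }
    return inv;
}

/* Build f(x) = (2*c + 1) * x + d from the configuration fields c and d. */
struct affine affine_from_config(uint32_t c, uint32_t d) {
    struct affine f;
    f.mul = 2u * c + 1u;
    f.add = d;
    f.mul_inv = inverse_mod_2_32(f.mul);
    return f;
}

/* f(x), computed modulo 2^32. */
uint32_t affine_apply(struct affine f, uint32_t x) {
    return f.mul * x + f.add;
}

/* f^{-1}(y) = mul_inv * (y - d), computed modulo 2^32. */
uint32_t affine_unapply(struct affine f, uint32_t y) {
    return f.mul_inv * (y - f.add);
}
```

With `c = 1` and `d = 7`, the function is `3x + 7`, and `affine_unapply(f, affine_apply(f, x))` returns `x` for every 32-bit `x`. Wrapping an expression this way leaves its value unchanged while hiding it behind two extra arithmetic layers.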

The documentation gives a commented listing of the Coq specifications and proofs for the main development files. Proof scripts are folded by default, but can be viewed by clicking on "Proof".

- BooleanDecomposition: useful lemmas for simplification of bitwise operations.
- BinRec: definition of a binary induction principle.
- BooleanSemanticTable: definition of the notion of boolean semantic table.
- MBAexpr: definition of a Mixed Boolean-Arithmetic expression type.
- ModularInverse: computation of a modular inverse, using the extended Euclidean algorithm.
- FastModularInverse: computation of a modular inverse, using a fast method.
- EvalDeterminism: additional lemmas over the Clight semantics.
- MBAtoClight: link between MBAexpr and Clight expressions.

- MBAobf: a CompCert compilation pass that obfuscates the program with Mixed Boolean-Arithmetic expressions.
- MBAobfproof: semantic preservation for the MBAobf pass.
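As an illustration of what a module like ModularInverse computes, here is a minimal C sketch of the extended Euclidean algorithm for modular inverses. The function name and the choice of 64-bit intermediate values are assumptions made for this example; the Coq development works on its own integer representation and may be structured quite differently.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse of a modulo m via the extended Euclidean algorithm.
 * Requires 1 < m <= 2^32 and gcd(a, m) == 1.
 * Signed 64-bit values hold the (possibly negative) Bezout coefficients. */
uint32_t inverse_ext_euclid(uint32_t a, uint64_t m) {
    int64_t r0 = (int64_t)m, r1 = (int64_t)(a % m);
    int64_t t0 = 0, t1 = 1;   /* invariant: t_i * a == r_i (mod m) */

    while (r1 != 0) {
        int64_t q = r0 / r1;
        int64_t r2 = r0 - q * r1;
        int64_t t2 = t0 - q * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }

    assert(r0 == 1);          /* a must be coprime with m */
    if (t0 < 0) {
        t0 += (int64_t)m;     /* normalize into [0, m) */
    }
    return (uint32_t)t0;
}
```

For example, `inverse_ext_euclid(3u, 1ull << 32)` yields `0xAAAAAAAB`, since `3 * 0xAAAAAAAB` equals `0x200000001`, which is 1 modulo 2^32.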