The doop repository is a Gradle project, and `./doop` is just a bash script that invokes Gradle:

#!/usr/bin/env bash

ARGS=""
while [ "$1" != "" ]; do
    case "$1" in
        # Quote arguments with spaces.
        *\ * )
            ARGS="${ARGS} '$1'"
            ;;
        *)
            ARGS="${ARGS} $1"
            ;;
    esac
    shift
done

# Export number of terminal columns for help display.
if command -v 'tput' &> /dev/null
then
    export COLUMNS=`tput cols`
fi
eval "./gradlew :run -Pargs=\"$ARGS\""
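
To see what the loop does to the arguments, the quoting logic can be pulled out into a standalone function. This is only a minimal sketch: `join_args` is a hypothetical name and is not part of the doop script.

```shell
# join_args is a hypothetical helper that mirrors the loop in ./doop:
# any argument containing a space is wrapped in single quotes.
join_args() {
    local args=""
    while [ "$1" != "" ]; do
        case "$1" in
            *\ *) args="${args} '$1'" ;;
            *)    args="${args} $1" ;;
        esac
        shift
    done
    printf '%s\n' "$args"
}

# Prints (with a leading space): -a context-insensitive -i 'my app.jar'
join_args -a context-insensitive -i "my app.jar"
```

Like the original loop, this stops at the first empty argument, so an empty string in the middle of the command line would silently drop everything after it.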

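There is also ./doopOffline, the offline mode of doop. Normally, every time Doop is invoked, the underlying build system checks all dependency libraries for newer versions. Sometimes you need to run doop offline, and this alternative script exists for that purpose.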

#!/bin/bash
eval './gradlew :run -Pargs="'$@'" --offline'

It is the same Gradle call with the --offline flag added.

The Doop workflow can be roughly divided into these steps:

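Use soot to generate jimple files
With the --generate-jimple flag the jimple files are written out, under the output/$(uuid)/database/jimple folder
Convert the jimple files into input facts (.facts) for the datalog engine
Use the souffle engine to run the selected analysis and output the relations as .csv files, which are the analysis results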
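
A quick way to check how far a run got is to count the files each stage leaves behind. The sketch below is an assumption-laden helper, not part of doop: `summarize_run` is a hypothetical name, and it assumes the run directory holds facts/ with the .facts files and database/ with the .csv files, matching the out/$id/facts and out/$id/database paths used later in this post.

```shell
# summarize_run is a hypothetical helper. It assumes <run_dir>/facts holds
# the *.facts files and <run_dir>/database holds the *.csv relations.
summarize_run() {
    local run_dir="$1" facts csvs
    # tr strips the padding that some wc implementations add.
    facts=$(find "$run_dir/facts" -name '*.facts' 2>/dev/null | wc -l | tr -d ' ')
    csvs=$(find "$run_dir/database" -name '*.csv' 2>/dev/null | wc -l | tr -d ' ')
    echo "facts=$facts csv=$csvs"
}

# Usage (replace $id with your analysis ID):
#   summarize_run "out/$id"
```

A run with facts but no .csv files most likely stopped before or during the souffle stage.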

Take the b4bycoffee challenge from 长城杯 (Great Wall Cup) as an example: unpack the Spring Boot project and package the class files into a jar.

unzip b4bycoffee.jar
cd BOOT-INF/classes
jar -cvf classes.jar *
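
The same repackaging can be wrapped in a function that works in a scratch directory and writes the jar to a path you choose, which avoids leaving extracted files around. This is a sketch under assumptions: `repack_classes` is a hypothetical name, and it needs `unzip` and the JDK `jar` tool on PATH.

```shell
# repack_classes is a hypothetical helper: it extracts a Spring Boot fat jar
# into a scratch directory and packs BOOT-INF/classes into a plain jar.
# Requires unzip and the JDK's jar tool on PATH.
repack_classes() {
    local fat_jar="$1" out_jar="$2" work
    [ -f "$fat_jar" ] || { echo "no such file: $fat_jar" >&2; return 1; }
    work=$(mktemp -d) || return 1
    unzip -q "$fat_jar" -d "$work" || { rm -rf "$work"; return 1; }
    # -C switches into the directory so entries are stored relative to it.
    jar -cf "$out_jar" -C "$work/BOOT-INF/classes" . || { rm -rf "$work"; return 1; }
    rm -rf "$work"
}

# Usage:
#   repack_classes b4bycoffee.jar /tmp/classes.jar
```

Whatever output path you pick is the one to pass to doop with -i.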

Then run:

./doop -a context-insensitive --information-flow spring --fact-gen-cores 16 --souffle-jobs 16  -i /tmp/BOOT-INF/lib/classes.jar --stats none

The parameters are:

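-a context-insensitive: the analysis to run, here a context-insensitive one
--information-flow spring: enable the information-flow (taint) logic for the spring platform
--fact-gen-cores 16 --souffle-jobs 16: number of cores for fact generation and number of parallel souffle jobs
-i /tmp/BOOT-INF/lib/classes.jar: the input jar to analyze
--stats none: skip the statistics logic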
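
If you run this often, it can help to build the flag list in one place and derive the job count from the machine. The following is a minimal sketch: `doop_args` is a hypothetical helper, and using the same number for both --fact-gen-cores and --souffle-jobs simply copies the command above.

```shell
# doop_args is a hypothetical helper that prints the flag list used above
# for a given input jar and job count.
doop_args() {
    local jar="$1" jobs="$2"
    echo "-a context-insensitive --information-flow spring" \
         "--fact-gen-cores $jobs --souffle-jobs $jobs -i $jar --stats none"
}

# Default the job count to the number of online CPUs, falling back to 1.
jobs_default=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
doop_args /tmp/BOOT-INF/lib/classes.jar "$jobs_default"
```

The printed line can be passed on as `./doop $(doop_args classes.jar 16)`, as long as the jar path contains no spaces.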

The first run is slow because it pulls some jars from http://centauri.di.uoa.gr:8081/. Just wait it out; the second run is much faster.

Once it finishes, you can see that the whole build went through several stages.

out/uuid/database

The analysis output looks like this:

ubuntu@ubuntu:~/doop$ cat  last-analysis/MockObject.csv
javax.servlet.http.HttpServletRequest::MockObject       javax.servlet.http.HttpServletRequest
javax.servlet.http.HttpServletResponse::MockObject      javax.servlet.http.HttpServletResponse
com.example.b4bycoffee.controller.indexController::MockObject   com.example.b4bycoffee.controller.indexController
com.example.b4bycoffee.controller.coffeeController::MockObject  com.example.b4bycoffee.controller.coffeeController
com.example.b4bycoffee.model.CoffeeRequest::MockObject  com.example.b4bycoffee.model.CoffeeRequest

As you can see, mock objects are created for the Spring controllers, so they are automatically covered by the taint analysis.
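
To get just the list of mocked types out of that file, a one-line filter is enough. This is a sketch that assumes MockObject.csv is tab-separated with the type in the second column, as the listing above suggests; `mock_types` is a hypothetical name.

```shell
# mock_types is a hypothetical helper. It assumes MockObject.csv is
# tab-separated with the mocked type in the second column.
mock_types() {
    cut -f2 "$1" | LC_ALL=C sort -u
}

# Usage:
#   mock_types last-analysis/MockObject.csv
```

Piping the result through `grep controller` narrows it down to the controller classes.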

Besides spring, --information-flow also supports other platforms, such as webapps for Java EE projects. I won't demo them here.

Adding your own rules

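The rules that ship with doop are base rules that only give you scaffolding. For real use we have to write some custom rules. For example, to get the call graph, save the following rule as my.dl: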

.decl CG(?caller:Method, ?callee:Method)

CG(?caller, ?callee) :-
  mainAnalysis.AnyCallGraphEdge(?invocation, ?callee),
  Instruction_Method(?invocation, ?caller).

.output CG

Then add the flag --extra-logic my.dl, rebuild, and look at CG.csv under last-analysis.
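
Once CG.csv exists, it can be queried with ordinary text tools. The sketch below lists every caller of a given method. It assumes the file is tab-separated with the caller in column 1 and the callee in column 2, following the order in the CG declaration; `callers_of` is a hypothetical name.

```shell
# callers_of is a hypothetical helper. It assumes CG.csv is tab-separated:
# column 1 = caller, column 2 = callee (the order of the CG declaration).
callers_of() {
    local csv="$1" callee="$2"
    awk -F'\t' -v callee="$callee" '$2 == callee { print $1 }' "$csv"
}

# Usage, with the method given in doop's <Class: ret name(args)> form:
#   callers_of last-analysis/CG.csv '<Example: void test(int)>'
```

The match is on the exact method signature string, so copy it from the CSV rather than typing it by hand.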

Another way to add custom rules

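Running with --extra-logic is still slow, because it repeats the fixed steps of updating dependencies, recompiling, generating facts and so on. Is there a faster way?

While doop runs, you can see it compile a binary with gcc and use that binary for the analysis.

This binary is the executable that souffle compiled.

Running the executable directly re-outputs the results under the previous rules. The rule file is gen_xxx.dl.

Our my.dl was appended to the end of it.

So to change our custom rules we can edit gen_xxx.dl directly and append to the end, then run it with souffle, since the facts files already exist.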

souffle -F database/ gen_1755044251944027223.dl -j32
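
Rather than editing the generated file in place, it is safer to append the custom rules to a copy so the original stays intact. A minimal sketch, with `append_rules` as a hypothetical helper and the file names as placeholders:

```shell
# append_rules is a hypothetical helper: it writes the generated rule file
# followed by the custom rules into a new file, leaving both inputs untouched.
append_rules() {
    local gen_dl="$1" extra_dl="$2" out_dl="$3"
    # The blank line guards against a generated file with no trailing newline.
    { cat "$gen_dl"; printf '\n'; cat "$extra_dl"; } > "$out_dl"
}

# Usage (file names are placeholders), followed by the souffle call above:
#   append_rules gen_xxx.dl my.dl combined.dl
#   souffle -F database/ combined.dl -j32
```

If the generated file already contains an earlier copy of my.dl, the appended declarations will clash, so start from a run without --extra-logic or remove the old copy first.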

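Worth mentioning here: bytecodedl takes these parts of doop and runs them separately. It uses soot-facts-generator to generate the facts, and after writing rules you query directly with souffle.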

The advantage of doop is that it has more built-in rules than bytecodedl.

The obvious drawback is that it is slow: every query downloads dependencies and regenerates the facts.

The official docs also mention that, once the facts are generated, you can run custom rules with souffle. I'll copy that part over directly.

Put the following code in the file temp.dl:

.decl Var_DeclaringMethod(v: symbol, m: symbol)
.input Var_DeclaringMethod(IO="file", filename="Var-DeclaringMethod.facts", delimiter="\t")

.decl VarPointsTo(c1: symbol, h: symbol, c2: symbol, v: symbol)
.input VarPointsTo(IO="file", filename="VarPointsTo.csv", delimiter="\t")

.decl Temp(v: symbol, h: symbol)
Temp(v, h) :-
  VarPointsTo(_, h, _, v),
  Var_DeclaringMethod(v, "<Example: void test(int)>").

.output Temp

Copy Var-DeclaringMethod.facts so that it sits in the same directory as the output relation VarPointsTo (replace $id with your analysis ID):

$ cp out/$id/facts/Var-DeclaringMethod.facts out/$id/database/

Run the query and look at the results:

$ souffle -F out/$id/database/ temp.dl
$ cat Temp.csv
<Example: void test(int)>/@this <Example: void main(java.lang.String[])>/new Example/0
<Example: void test(int)>/l0#_0 <Example: void main(java.lang.String[])>/new Example/0
<Example: void test(int)>/l3#_32        <Example: void test(int)>/new Cat/1
<Example: void test(int)>/l4#_33        <Example: void test(int)>/new Cat/2
<Example: void test(int)>/$stack5       <Example: void test(int)>/new Dog/0
<Example: void test(int)>/$stack6       <Example: void test(int)>/new Cat/1
<Example: void test(int)>/$stack7       <Example: void test(int)>/new Cat/2
<Example: void test(int)>/$stack8       <Example: void test(int)>/new Cat/0
<Example: void test(int)>/l2#_26        <Example: void test(int)>/new Cat/0
<Example: void test(int)>/l2_$$A_1#_28  <Example: void test(int)>/new Dog/0
<Example: void test(int)>/l2_$$A_2#_29  <Example: void test(int)>/new Cat/0
<Example: void test(int)>/l2_$$A_2#_29  <Example: void test(int)>/new Dog/0

This approach requires you to write the rules that import the facts yourself, which is tedious. bytecodedl works the same way.
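
Part of that tedium can be scripted. The sketch below prints a `.decl` and `.input` pair for a facts file by counting the tab-separated columns of its first line. It is only a starting point under stated assumptions: `gen_input_decl` is a hypothetical helper, the column names c0, c1, ... are placeholders, and every column is typed as symbol, which will be wrong for numeric columns.

```shell
# gen_input_decl is a hypothetical helper. It derives the relation name from
# the file name (dashes become underscores) and types every column as symbol.
gen_input_decl() {
    local file="$1" base rel cols params i
    base=$(basename "$file")
    rel=$(printf '%s' "${base%.*}" | sed 's/-/_/g')
    # Count tab-separated fields on the first line; an empty file gives 0.
    cols=$(head -n 1 "$file" | awk -F'\t' '{ print NF }')
    cols=${cols:-0}
    params=""
    i=0
    while [ "$i" -lt "$cols" ]; do
        params="${params:+$params, }c$i: symbol"
        i=$((i + 1))
    done
    printf '.decl %s(%s)\n' "$rel" "$params"
    printf '.input %s(IO="file", filename="%s", delimiter="\\t")\n' "$rel" "$base"
}

# Usage:
#   gen_input_decl out/$id/database/Var-DeclaringMethod.facts >> temp.dl
```

For Var-DeclaringMethod.facts this produces the same relation name and `.input` line as the hand-written declaration above, with placeholder column names instead of v and m.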

Errors you may run into

When soot generates the facts it may throw an OOM exception, because too little memory was allocated. Change these values in build.gradle:

def factGenXmx='32G'
def factGenStack='16G'

Just give them larger values.
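
If you switch between machines with different amounts of RAM, the edit can be scripted. This sketch assumes the setting sits on its own line starting with `def factGenXmx=`, exactly as quoted above; `set_factgen_xmx` is a hypothetical name.

```shell
# set_factgen_xmx is a hypothetical helper. It assumes build.gradle has a
# line that starts with "def factGenXmx=" and rewrites only that line.
set_factgen_xmx() {
    local gradle_file="$1" size="$2" tmp
    tmp=$(mktemp) || return 1
    sed "s/^def factGenXmx=.*/def factGenXmx='$size'/" "$gradle_file" > "$tmp" || return 1
    # Write back through the original path to keep its permissions.
    cat "$tmp" > "$gradle_file"
    rm -f "$tmp"
}

# Usage:
#   set_factgen_xmx build.gradle 32G
```

Pick a value that fits in physical memory; an oversized heap just trades the OOM for swapping.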

Closing thoughts

I think the real essence of doop is its dl rules. I'll write another post once I understand them better.

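It is said to be the best pointer analysis framework around, but when you actually try to learn it, incomplete docs, scarce material, and baffling errors make the barrier to entry too high.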

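As for real-world vulnerability hunting, I need to study more before writing about how to use it. Personally I lean toward splitting it up and adapting it the way bytecodedl does: add sink rules for dangerous functions, combine them with doop's existing spring and webapp sources, and output an accurate path graph from the taint analysis.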
