Skip to content

Assignment 1

MingxiuWang edited this page May 12, 2025 · 69 revisions

Assignment-1 folder layout

Assignment-1
|-- CPP
|   |-- Assignment-1.cpp
|   |-- Assignment-1.h
|   |-- CMakeLists.txt
|   `-- test.cpp
|-- Python
|   |-- AndersenPTA.py
|   |-- Assignment1.py
|   |-- Main.py
|   |-- Test.py
`-- Tests
    |-- SrcSnk.txt
    `-- testcases
        |-- icfg
        |   |-- test1.c
        |   |-- test1.ll
        |   |-- test2.c
        |   `-- test2.ll
        |-- pta
        |   |-- test1.c
        |   |-- test1.ll
        |   |-- test2.c
        |   |-- test2.ll
        |   |-- test3.c
        |   |-- test3.ll
        |   |-- test4.c
        |   `-- test4.ll
        `-- taint
            |-- test1.c
            |-- test1.ll
            |-- test2.c
            `-- test2.ll


Before You Start

Each assignment (Assignment 1, 2, and 3) is provided in two versions: one in C++ and one in Python.
You only need to complete one version — feel free to choose the language you’re more comfortable with(though we recommend using C++).

For C++ examples, Visual Studio Code (VSCode) is used as the example IDE. For configuration details, please refer to this section.
For Python examples, we use PyCharm as the example IDE. For configuration details, please refer to this section.

1. Get the latest Assignment-1 code template

* Before coding, please type cd $HOME/Software-Security-Analysis and git pull in your terminal to make sure you always have the latest version of the code template before each assignment.

If git pull fails due to the conflict with your local changes, type git stash to store your current code in a temporal branch and type git pull again. If you want to retrieve your code back, type git stash pop.

2. Assignment 1 coding task

- Implement the following methods of class ICFGTraversal and AndersenPTA in Assignment-1.cpp (or Assignment1.py) by using some SVF APIsFor C++ (or For Python)

Function Description Marks
readSrcSnkFromFile Identify sources and sinks by parsing APIs in SrcSnk.txt for reachability analysis 20%
reachability Context-sensitive reachability analysis on the ICFG 30%
solveWorklist Field-sensitive inclusion-based points-to analysis (Andersen's analysis) 30%
aliasCheck Check aliases of the two variables at source and sink. Two variables
are aliases if their points-to sets have at least one overlapping element.
20%

- Tainted Information Flow:

Given a tainted source v1@src (variable v1 at program point src), we say that a sink v2@snk is also tainted if both the following conditions satisfy: (1) src reaches snk on the ICFG via context-sensitive reachability analysis, and (2) pts(v1) ∩ pts(v2) ≠ ∅ inferred by Andersen's field-sensitive analysis. Note that in this assignment, v1 is the return value(you can see API for C++ here or Python here) when calling a source function, and v2 is any argument(C++ API here or Python API here) of the sink function.

- Tips for implementing reachability and solve_worklist.

The implementation of reachability differs from the one in Lab-Exercise-1 in that the paths collected need to be feasible in terms of context sensitivity (calls and returns ICFGNodes must match on each program path). The implementation of solve_worklist also differs from the one in Lab-Exercise-1 by following an additional field-sensitive rule, which distinguishes fields of each struct but is array-insensitive (treating all elements of an array as one object). Please refer to C++ API here or Python API here to obtain a field object (getGepObjVar) given a struct object and a field index. The constraint solving stops until a fixed point is reached (i.e., no new COPY edges are added and the points-to sets are unchanged). No particular order when resolving edges is needed when performing the constraint solving.

C-like form Constraint form Solving rule Explaination
p = &o p <--ADDR-- o pts(p) = pts(p) ∪ {o} add o into p's points-to set
q = p q <--COPY-- p pts(q) = pts(q) ∪ pts(p) union p's points-to set into q's one
q = *p q <--LOAD-- p for each o ∈ pts(p) : add q <--COPY-- o for each o in p's points-to set, add a COPY edge from o to q (if it is not on the graph)
*p = q p <--STORE-- q for each o ∈ pts(p) : add o <--COPY-- q for each o in p's points-to set, add a COPY edge from q to o (if it is not on the graph)
q = &p->fld q <--GEP, fld-- p for each o ∈ pts(p) : pts(q) = pts(q) ∪ {o.fld} for each o in p's points-to set, add o's field object o.fld into q's points-to set

- How to Test Your Implementation (Sanity Checks)

For C++ testing methods, refer to this section.
For Python testing methods, refer to this section.

- Debugging Tips for ICFG (How to Visualize Your .ll File)

For C++ visualization methods, refer to this section.
For Python visualization methods, refer to this section.

- Upload Instructions

For C++, see the upload instructions here.
For Python, see the upload instructions here.


3. Configuration & Debugging

For C++ configuration and debugging, refer to this section.
For Python configuration and debugging, refer to this section.

Clone this wiki locally