Skip to content

Implementing an intra procedural data flow analysis in Soot

johspaeth edited this page Aug 27, 2014 · 15 revisions

What is an intra-procedural data-flow analysis anyway?

The Wikipedia page describes it quite alright: an intra-procedural data-flow analysis operates on a control-flow graph – called UnitGraph in Soot – for a single method. A unit graph contains statements as nodes, and there is an edge between two nodes if control can flow from the statement represented by the source node to the statement represented by the target node.

A data-flow analysis associates two elements with each node in the unit graph, usually two so-called flow sets: one in-set and one out-set. These sets are (1) initialized, then (2) propagated through the unit graph along statement nodes until (3) a fixed point is reached.

In the end, all you do is inspect the flow set before and after each statement. By the design of the analysis, your flow sets should tell you exactly the information you need.

Forward, backward or branched?

There exist three different kind of FlowAnalysis implementations in Soot:

  • ForwardFlowAnalysis – this analysis starts at the entry statement of a UnitGraph and propagates forward from there,
  • BackwardsFlowAnalysis – this analysis starts at the exit node(s) of a Unit Graph and propagates back from there (as an alternative you can produce the [InverseGraph](http://www.sable.mcgill.ca/soot/doc/soot/toolkits/graph/InverseGraph.html( of your UnitGraph and use a ForwardFlowAnalysis; it does not matter); and
  • ForwardBranchedFlowAnalysis – this is essentially a forward analysis but it allows you to propagate different flow sets into different branches. For instance, if you process a statement like if(p!=null) then you may propagate the into “p is not null” into the “then” branch and “p is null” into the ”else” branch.

What direction you want to use depends entirely on your analysis problem. In the following we assume that we want to do a forward analysis.

Implementing the analysis interface

To implement a forward flow analysis, we subclass ForwardFlowAnalysis. A good example of what this could look like is GuaranteedDefsAnalysis. Type parameters

ForwardFlowAnalysis is a generic class with two type parameters:

  • N: The node type. Usually this will be Unit but in general you are also allowed to have flow analyses over graphs that hold other kind of nodes.
  • A: The abstraction type. Oftentimes people use sets or maps as their abstraction but in general you can use anything you like. Beware though, that your abstraction type must implement equals(..) and hashCode() correctly, so that Soot can determine when a fixed point has been reached!

Constructor

You have to implement a constructor that at least takes a DirectedGraph as an an argument (where N is your node type; see above) and passes this on to the super constructor. Also, you should call doAnalysis() at the end of your constructor, which will actually execute the flow analysis. Between the super constructor call and the call to doAnalysis() you can set up your own analysis data structures. newInitialFlow() and entryInitialFlow()

The method newInitialFlow() must return an object of your abstraction type A. This object is assigned as the initial in-set and out-set for every statement, except the in-set of the first statement (respectively an exit statement, e.g. return or throw, if you are implementing a backwards analysis) of your unit graph. The in-set of the first statement is initialized via entryInitialFlow()

Clone this wiki locally