See also
These notes incorporate several other resources, listed in rough order of how heavily they're used:
- Abstract Interpretation in a Nutshell
- Isil Dillig’s lecture notes on Abstract Interpretation
- Patrick Cousot’s paper on abstract interpretation
- The Wikipedia page for abstract interpretation, specifically beginning with the § on formalization
- This page on abstract interpretation
- These notes
Additionally, for practice talking about abstract interpretation, see 2024-07-31 • Justin, prelim targeted - abstract interpretation
Abstract interpretation is a general framework
A high-level overview
Before we dive into the textbook, let’s provide a very high-level overview of what abstract interpretation is.
Oftentimes, we have some correctness property we want to prove about our program; for instance, "this program never divides by zero."
However, when proving properties about our programs, we often don't need to know the exact execution traces of our programs; instead, we can approximate the behavior of our program by recording more general properties about it, and then having our semantics act on those general properties. For instance, if we want to weed out "divides by 0" bugs, we could record the sign or zero-ness of every variable, and propagate this analysis by defining what analysis values should be returned from expressions and function calls. Or, if we want to track the range of each variable, we could make interval analysis our analysis instead.
In other words, we can use an abstract semantics—a conservative over-approximation of our program’s possible execution behavior. If we can prove program correctness holds for this over-approximation, then it must hold for the actual programs!
This is abstract interpretation, and it provides a general framework around program analyses, just as dataflow analysis provides a framework for program analyses. As we’ve alluded to in the example above, an abstract interpretation consists of two pieces:
- the abstract domain: the general property that we record in place of the concrete values themselves.
- the abstract semantics, which defines how programs transform these properties.
The upshot
We have two questions: why does this matter, and how is this related to other static analysis approaches: dataflow analysis and control-flow analysis?
As alluded to in the above section, abstract interpretation is another framework for static analysis, similar to dataflow analysis. In particular, they both use lattices, though in different ways, as you’ll see below!
However, unlike dataflow analysis, abstract interpretation guarantees that our program analysis results are consistent with a program’s semantics! In abstract interpretation, we establish key relationships between the static analysis and the formal semantics. Additionally, we can iteratively build up more complex or more approximate analyses using simpler or more precise analyses, and still maintain guarantees that the analysis agrees with the semantics.
In dataflow analysis, we never establish this, and so we don’t have this guarantee.
A control flow analysis is a form of abstract interpretation!
Danger
Note that denotational semantics and domain theory also use partial orders/posets to help with their formalizations, but in a completely different way than we use them here. Don't mix them up!
Abstract interpretations in depth
In the concrete world, a program transforms concrete values (for instance, program states) into other concrete values.
Example: imperative statements
For example, in IMP, a statement
transforms from state to state. In operational semantics land, we might say that a statement and a state eventually evaluate to a new state: $\langle S, \sigma \rangle \Rightarrow^* \sigma'$. Similarly, for a single statement $S$, we can state the following relation: $S \vdash \sigma \rightsquigarrow \sigma'$. Additionally, for a sequence of
statements $S_1; \dots; S_n$, we could say: $S_1; \dots; S_n \vdash \sigma_1 \rightsquigarrow \sigma_{n+1}$. We'll say that these concrete values in our language are part of the set
$V$. Since evaluation may be non-deterministic, and since whether evaluation even terminates is undecidable, there's no total function we can use to denote this. Our concrete semantics instead defines which programs and states are in this relation.
Note that we never enforce that this relation be deterministic or total.
Additionally, we'll have a program analysis on expressions in our program. For instance, this might be an interval analysis that says "this expression must be within these two ints." Maybe it's a constant propagation analysis that says "this expression must have constant value $c$."
In our abstract semantics, programs act on a property $l$ drawn from a lattice $L$ of analysis values.
We can define a deterministic function $f_p : L \to L$ describing how a program $p$ transforms analysis values.
In the context of abstract interpretation, we insist that this function soundly over-approximates the concrete semantics; we formalize what that means in the sections below.
In short, this means that we're creating an abstract interpreter for our language. Instead of defining semantics for how values get transformed to values, we define semantics for how abstract values get transformed into other abstract values. An abstract value could be a range if we're doing interval analysis, or information about the sign of an expression, or things like that.
To be honest, that's all you really need to understand the technique. The next sections go through a ton of math, but most of it is used to formalize what we mean by "abstract interpreter," and thus establish the important correctness properties of abstract interpretation:
- If we can define a representation function that represents concrete values in terms of analysis values, and this function satisfies a certain correctness property, we know that our analysis will preserve the semantics of our programs!
- If we have a precise analysis and a coarser analysis, and they satisfy certain properties with respect to each other, we can establish a correspondence between them (a Galois connection), which gives us the ability to move between them and combine analyses while preserving that correspondence.
- For sound analyses that need to run iteratively to a fixpoint, we can use widening and narrowing to ensure that we get reasonably good results in a reasonable amount of time.
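To make those two pieces concrete, here's a minimal sketch in Python of a sign-domain abstract interpreter for addition. The names (`sign_of`, `abs_add`, `ADD_TABLE`) and the encoding of abstract values as frozensets of sign characters are our own assumptions, not anything from the textbook:

```python
# Sketch: the abstract domain is a set of possible signs, and the abstract
# semantics says how "+" transforms sign sets. Bigger sets = less precise.

def sign_of(n):
    """Map a concrete integer to its sign: the representation function."""
    return "-" if n < 0 else ("0" if n == 0 else "+")

# How adding two signed numbers can turn out, for each pair of signs.
ADD_TABLE = {
    ("-", "-"): {"-"}, ("-", "0"): {"-"}, ("0", "-"): {"-"},
    ("+", "+"): {"+"}, ("+", "0"): {"+"}, ("0", "+"): {"+"},
    ("0", "0"): {"0"},
    # negative + positive could be anything:
    ("-", "+"): {"-", "0", "+"}, ("+", "-"): {"-", "0", "+"},
}

def abs_add(a, b):
    """Abstract semantics of +: every sign a sum could possibly have."""
    return frozenset(s for sa in a for sb in b for s in ADD_TABLE[(sa, sb)])

# "y is positive and z is non-negative" implies y + z is positive:
y, z = frozenset({"+"}), frozenset({"0", "+"})
print(abs_add(y, z))  # frozenset({'+'})
```

Note that the result over-approximates: `abs_add` never misses a possible sign, but it may include impossible ones. That's the "conservative" in conservative over-approximation.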
More on the intuition behind dataflow analysis vs. abstract interpretation
I'm not sure any of the answers here really address the intent of the original question, which seems to be asking for an intuitive, not technical, explanation. Dataflow analysis is concerned with getting the value of some piece of information at a given location. Examples of "information" are which definitions reach a given location, which variables are live at a given location, which expressions are constant at a given location, etc. Dataflow frameworks will typically require that the domain of values forms a finite lattice and that the transfer functions be monotone (the transfer function determines how that information is propagated from the entry to the exit of a block), all this with the aim of being able to compute a fixed point of dataflow values. It is used in compilers.
Abstract Interpretation (AI) OTOH aims to construct an abstract interpreter of the language. The goal is to determine "What does this piece of code compute? Let's try and answer that question in an abstract sense." For example, if the computation returns the value of some index variable i, AI might compute a range for i so you can answer if there will be a bounds violation or something. So the domain of abstract values is slightly different; it might be a range domain, a polyhedral domain, etc. For this reason AI places different constraints than dataflow: the concrete and abstract domains are typically required to be related by something called a Galois connection, which relates sets of concrete values to abstract ones. Because the domains used aren't required to be finite, AI won't always converge without intervention, in the form of widening/narrowing operations. AI is used in formal verification tools. They both share in common a desire to have the function iteration converge, but that's about it. So use dataflow analysis if you want to know the value of something at a location; use AI if you want to know what a program abstractly computes.
Both dataflow and AI can be used together. For example, the disassembler tool Jakstab combines both: the dataflow is used to determine values for indirect jump targets (i.e., what is the newly computed value of the PC that will be loaded), and the AI is used to abstractly evaluate the piece of binary code.
This was copied from this StackOverflow answer
Soundness and correctness
Note
For the next few sections, we’re gonna get pretty abstract. Hold on! We’ll bring back concrete examples very soon.
So how do we know that an abstract semantics “corresponds” to a concrete semantics? That is to say, how do we know that if a property holds for the abstract semantics, it also holds for the concrete semantics? This is formalized as soundness (also called correctness in the textbook).
To formalize this, we first define a representation function $\beta : V \to L$, which maps each concrete value to the best (most precise) property describing it.
We can formulate "correctness" or "soundness" of an analysis (writing $p \vdash l_1 \rhd l_2$ for "the analysis of $p$ transforms property $l_1$ into $l_2$") like so:
$$p \vdash v_1 \rightsquigarrow v_2 \;\wedge\; \beta(v_1) \sqsubseteq l_1 \;\wedge\; p \vdash l_1 \rhd l_2 \;\Longrightarrow\; \beta(v_2) \sqsubseteq l_2$$
The most important part of this equation: if the input value is described by the input property, then the output value must be described by the output property the analysis computes.
Alternative framing around relations
Alternatively, we can frame correctness in terms of a correctness relation $R \subseteq V \times L$, which tells us whether a value $v$ is described by the property $l$ (written $v \mathrel{R} l$). This relation must have the following properties:
1. $v \mathrel{R} l_1 \wedge l_1 \sqsubseteq l_2 \implies v \mathrel{R} l_2$
2. $(\forall l \in L' : v \mathrel{R} l) \implies v \mathrel{R} (\sqcap L')$

Due to the first condition, if $l_1$ approximates $v$ and the analysis computes an upper approximation $l_2$ of $l_1$, then $l_2$ is also an approximation of $v$. Therefore, in the property lattice, if $l_1 \sqsubseteq l_2$, then $l_1$ is more precise than $l_2$ ($l_2$ approximates all the values approximated by $l_1$ and possibly some other ones). If a value is approximated by more than one abstract value, the second condition allows us to deterministically select a single abstract value to use in our analysis: we take the smallest (i.e., most precise) of them all, $\sqcap \{ l \mid v \mathrel{R} l \}$, which is a valid approximation of $v$. This condition excludes correctness relations where a value is approximated by two incomparable abstract values, and allows us to work with a single analysis instead of considering one analysis for each candidate abstract value.
On precision and the meaning of partial orderings
Notice that when discussing lattices here, we say that lesser elements are more precise: $\bot$ is the most precise property and $\top$ the least. This is flipped relative to how dataflow analyses are often presented.
Note
Another way of phrasing this: $l_1 \sqsubseteq l_2$ means that $l_1$ is at least as precise as $l_2$.
For more information, see the note on lattices, which covers some of these differences and links to different references of lattices.
Galois connections
Sometimes, if we have a complex or precise analysis $L$, we want to relate it to a simpler, more approximate analysis $M$. To do this, we define two functions:
- An abstraction function $\alpha : L \to M$ provides a representation of $L$'s elements as $M$'s elements.
- A concretization function $\gamma : M \to L$ expresses the meaning of elements of $M$ as elements of $L$.
There are two equivalent ways of formalizing the notion that these are comparable analyses.
- In the first approach, we say $(L, \alpha, \gamma, M)$ is a Galois connection iff $\alpha$ and $\gamma$ are monotone, $\gamma \circ \alpha \sqsupseteq \mathrm{id}_L$, and $\alpha \circ \gamma \sqsubseteq \mathrm{id}_M$.
    - Applying and then undoing abstraction ($\gamma(\alpha(l)) \sqsupseteq l$) yields something at most as strong as the original. It might be weaker, which is okay, since we don't want abstraction to introduce spurious strength.
    - Applying and then undoing concretization ($\alpha(\gamma(m)) \sqsubseteq m$) yields something at least as strong as the original.
    - In both cases, the operations may lose precision: we might not get back exactly what we started with, but we do get back a safe approximation.
- Alternatively, we say $(L, \alpha, \gamma, M)$ is an adjunction iff $\alpha(l) \sqsubseteq m \iff l \sqsubseteq \gamma(m)$. That is, $\alpha$ and $\gamma$ should respect the ordering between $L$ and $M$.
    - If $m$ safely describes the abstraction of $l$, then the concretization of $m$ should over-approximate $l$, and vice versa.
Example
We can describe the power set of integers $\mathcal{P}(\mathbb{Z})$ with intervals $[a, b]$. The abstraction function returns the smallest interval that contains all the integers in the set, and the concretization function returns the set of all integers within the interval. Note that this relies on a partial ordering of intervals, where $[a_1, b_1] \sqsubseteq [a_2, b_2]$ iff $[a_1, b_1]$ is entirely inside $[a_2, b_2]$. Remember that in abstract interpretation, lesser analyses are more precise!
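Here's a quick Python sketch of this pair of functions (the names are our own; concretization is only computable here because intervals over integers can be enumerated):

```python
# Galois connection between sets of integers and intervals (lo, hi).

def alpha(s):
    """Abstraction: the smallest interval covering a nonempty set of ints."""
    return (min(s), max(s))

def gamma(interval):
    """Concretization: every integer inside the interval."""
    lo, hi = interval
    return set(range(lo, hi + 1))

s = {1, 3, 7}
print(alpha(s))              # (1, 7)
print(s <= gamma(alpha(s)))  # True: gamma(alpha(s)) over-approximates s
```

The interval $[1, 7]$ describes $\{1, \dots, 7\}$, a strictly bigger set than $\{1, 3, 7\}$: abstraction lost precision but stayed safe, exactly as the Galois connection laws promise.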
To relate it back to our discussion of program analyses above, we can define Galois connections using extraction functions! Given an extraction function $\eta : V \to L$ that maps each concrete value to the property representing it, we get a Galois connection between $\mathcal{P}(V)$ and $L$…

…where abstraction finds the least upper bound of the representations of all the values in a set:
$$\alpha(V') = \bigsqcup \{ \eta(v) \mid v \in V' \}$$

…and concretization correspondingly finds all the concrete values whose representation is below the abstract value:
$$\gamma(l) = \{ v \in V \mid \eta(v) \sqsubseteq l \}$$
Soundness in Galois connections
A worked example: sign analysis
As an example, for integer expressions, we might want to figure out if an integer expression could be positive, negative, or zero. For instance, if we see `x = y + z`, and we know `y` is positive and `z` is non-negative, what can we say about the sign of `x`? This is the analysis discussed in the Isil slides (though with less precision than we're doing here)!
In this case, let's say we have two analyses. The first, more precise analysis tracks the set of possible integer values of an expression. The second, coarser analysis tracks a set of possible signs, letting us state properties like "`x` is non-negative."

So how might we define the Galois connection between these two analyses? We can use the extraction-function construction from above, with $\eta = \mathrm{sign} : \mathbb{Z} \to \{-, 0, +\}$.

What this means is that the abstraction function just gets the set of all representations of the values in the input set, and the concretization function gets all the concrete values whose representations are contained in the abstract value. In the sign example, we'd read this as: abstraction gets the signs of all of the values in the input set, and concretization gets every integer with a sign contained in our sign set.
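A sketch of this induced Galois connection in Python (hypothetical names; the concretization restricts itself to a small finite universe so we can enumerate it):

```python
# Galois connection induced by the extraction function eta = sign_of.

def sign_of(n):
    return "-" if n < 0 else ("0" if n == 0 else "+")

def alpha(values):
    """Abstraction: the join (set union) of the signs of the input values."""
    return frozenset(sign_of(v) for v in values)

def gamma(signs, universe=range(-5, 6)):
    """Concretization: every value whose sign lies in the sign set
    (restricted to a finite universe for demonstration)."""
    return {v for v in universe if sign_of(v) in signs}

print(sorted(alpha({-3, 4})))                # ['+', '-']
print(sorted(gamma(frozenset({"0", "+"}))))  # [0, 1, 2, 3, 4, 5]
```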
Extra properties of Galois connections
There are some properties we can establish about Galois connections:
- $\alpha$ is completely additive: $\alpha(\bigsqcup L') = \bigsqcup \{ \alpha(l) \mid l \in L' \}$. It preserves least upper bounds; lots of joins means getting less precise (remember that we're backwards vs. dataflow here).
- $\gamma$ is completely multiplicative: $\gamma(\bigsqcap M') = \bigsqcap \{ \gamma(m) \mid m \in M' \}$. It preserves greatest lower bounds; lots of meets means getting more precise.
- We can define $\alpha$ and $\gamma$ in terms of each other!
    - Let's say we've already defined $\gamma$ somewhere. Then we can define $\alpha(l) = \bigsqcap \{ m \mid l \sqsubseteq \gamma(m) \}$: the most precise abstract value whose concretization still covers $l$.
        - Substituting by adjunction, $l \sqsubseteq \gamma(m)$ becomes $\alpha(l) \sqsubseteq m$, so this really is defining $\alpha$ using $\gamma$.
        - If we had originally written the definition in a form that doesn't match the adjunction, we wouldn't have been able to use it!
    - Same in the other direction: $\gamma(m) = \bigsqcup \{ l \mid \alpha(l) \sqsubseteq m \}$.
- Let’s say we’ve already defined
Strengthening Galois connections with Galois insertions
In a Galois connection, $\alpha \circ \gamma \sqsubseteq \mathrm{id}_M$: the coarse domain $M$ may contain superfluous elements, with several abstract values concretizing to the same property.

In a Galois insertion, we now require:
$$\alpha \circ \gamma = \mathrm{id}_M$$
thus guaranteeing that $\gamma$ is injective; no two distinct abstract values describe the same concrete property.

Note that if we're using an extraction function $\eta$, the induced Galois connection is an insertion exactly when $\eta$ generates all of $M$ (every abstract value is a join of representations).

Alternatively, if we have a reduction operator $\varsigma = \alpha \circ \gamma : M \to M$, which maps each abstract value to the most precise abstract value with the same concretization, we can cut $M$ down to $\varsigma(M)$ to obtain a Galois insertion.

If we have an extraction function, we can get the reduction operator as $\varsigma(m) = \bigsqcup \{ \eta(v) \mid \eta(v) \sqsubseteq m \}$.
Composing Galois connections
The power of defining all this structure is that we can define simpler analyses, prove the core properties of Soundness and correctness on those analyses, then compose them together to get more complex analyses. Additionally, if we can show equivalences between more precise and more approximate analyses, we can move between precise and approximate analyses too, all while maintaining our guarantees about soundness and correctness!
Functionally (i.e., sequentially)
If $(L, \alpha_1, \gamma_1, M)$ and $(M, \alpha_2, \gamma_2, N)$ are Galois connections, then $(L, \alpha_2 \circ \alpha_1, \gamma_1 \circ \gamma_2, N)$ is also a Galois connection: we can stack abstractions on top of each other.
Independent attributes
A product! Given Galois connections $(L_1, \alpha_1, \gamma_1, M_1)$ and $(L_2, \alpha_2, \gamma_2, M_2)$, we get a Galois connection $(L_1 \times L_2, \alpha, \gamma, M_1 \times M_2)$, where $\alpha(l_1, l_2) = (\alpha_1(l_1), \alpha_2(l_2))$ and $\gamma(m_1, m_2) = (\gamma_1(m_1), \gamma_2(m_2))$: each component is abstracted independently.

This can lose us precision. For instance, let's say we have an expression `(x, -x)`. This has possible values like $\{(1, -1), (-1, 1)\}$; abstracting each component independently with a sign analysis gives us $(\{-, +\}, \{-, +\})$, which also describes impossible pairs like $(1, 1)$. The correlation between the components is lost.
Relational method
If we have extraction-function-based analyses over pairs, we can instead abstract the set of pairs directly: a Galois connection between $\mathcal{P}(V_1 \times V_2)$ and $\mathcal{P}(L_1 \times L_2)$, where $\alpha(S) = \{ (\eta_1(v_1), \eta_2(v_2)) \mid (v_1, v_2) \in S \}$. We track a set of pairs of abstract values, rather than a pair of sets.

This is precise enough to have a sign analysis of `(x, -x)` capture the correlation between the components: we get $\{(-, +), (0, 0), (+, -)\}$, which excludes pairs like $(+, +)$.
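The difference between the two combinations is easy to see in a sketch (Python, hypothetical names): abstracting the components of `(x, -x)` independently forgets that they're correlated, while the relational abstraction keeps the correlation.

```python
def sign_of(n):
    return "-" if n < 0 else ("0" if n == 0 else "+")

# The concrete set of possible values of the pair (x, -x):
pairs = {(x, -x) for x in range(-3, 4)}

# Independent attributes: abstract each component separately.
indep = (frozenset(sign_of(a) for a, _ in pairs),
         frozenset(sign_of(b) for _, b in pairs))

# Relational method: abstract the set of pairs into a set of sign pairs.
rel = frozenset((sign_of(a), sign_of(b)) for a, b in pairs)

print(sorted(rel))        # [('+', '-'), ('-', '+'), ('0', '0')]
print(("+", "+") in rel)  # False: the correlation survives
```

Under independent attributes, both components abstract to all three signs, so the pair $(+, +)$ is (spuriously) admitted; the relational abstraction rules it out.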
Total function space
If $(L, \alpha, \gamma, M)$ is a Galois connection and $S$ is a set, then $(S \to L, \alpha', \gamma', S \to M)$ is a Galois connection, where $\alpha'(f) = \alpha \circ f$ and $\gamma'(g) = \gamma \circ g$: we abstract a function by abstracting its outputs pointwise.
Monotone function space
Note
Remember when we said we’d deal with different lattice value types? Here we are :)
If we have Galois connections $(L_1, \alpha_1, \gamma_1, M_1)$ and $(L_2, \alpha_2, \gamma_2, M_2)$, we get a Galois connection on the monotone function space $(L_1 \to L_2, \alpha, \gamma, M_1 \to M_2)$, where $\alpha(f) = \alpha_2 \circ f \circ \gamma_1$ and $\gamma(g) = \gamma_2 \circ g \circ \alpha_1$.

Basically, to turn a concrete function into an abstract one: concretize the input, apply the concrete function, then abstract the output (and dually in the other direction).
Other combinations
- Direct product: both connections share the same concrete domain; abstraction is element-wise ($\alpha(l) = (\alpha_1(l), \alpha_2(l))$), and concretization is the meet of the items' concretizations.
- Direct tensor product: is to the direct product what the relational method is to independent attributes. Requires two powerset domains, with a construction similar to the relational method.
- Reduced product and reduced tensor product: apply a reduction operator to the above to get a Galois insertion.
Taking a step back
Okay. That was. A ton of theory. Let's take a step back and come up for some air. What does all of this theory get us? How can we express the intuition of abstract interpretation using the precise mathematical tools we have now?
Squaring with Isil Dillig’s lecture slides
As we alluded to at the start of this chapter, in an imperative language like IMP, we can interpret a statement as transforming concrete states to concrete states.
In the Isil Dillig lecture notes, our abstract semantics maps abstract states to abstract states, where abstract states map variables to abstract values: $\hat{\sigma} : \mathrm{Var} \to \hat{V}$.

We might express this relationship in our notation as a Galois connection between sets of concrete states $\mathcal{P}(\mathrm{Var} \to \mathbb{Z})$ and abstract states $\mathrm{Var} \to \hat{V}$, where $\hat{V}$ is our lattice of abstract values.
Now, let's say we want to do sign analysis, as the lecture slides do. Thus, our analysis values $\hat{V}$ are sets of signs. Let's consider the statement `x = y + z`. The concrete semantics of that statement are as follows: the new state binds `x` to the sum of `y` and `z`, i.e., $\sigma' = \sigma[x \mapsto \sigma(y) + \sigma(z)]$.
Now we want to go about defining an analysis such that we can determine the sign of `x` from the signs of `y` and `z`.
Let's first start by looking at plus expressions themselves. The semantics of a plus expression is to add together a pair of numbers: $+ : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$.
So how can we get an analysis of the sign of a plus expression?
Well, we can start by lifting numbers $\mathbb{Z}$ to sets of numbers $\mathcal{P}(\mathbb{Z})$.

Notice how the lecture slides refer to the concrete meaning of our abstract domain as sets of integers, not just integers! We do this lifting with our representation function, which sends $n$ to the singleton set $\{n\}$.
Now, since we're operating on sets of numbers, we can define a precise analysis of plus on sets: apply $+$ to every pair of numbers drawn from the two input sets.
Now, the Galois connection in the lecture slides corresponds to what we worked on in A worked example: sign analysis. We defined a function extracting the sign of each integer, giving us a Galois connection between sets of integers and sets of signs.
Additionally, using the relational method, we can get the corresponding Galois connection on pairs: between $\mathcal{P}(\mathbb{Z} \times \mathbb{Z})$ and $\mathcal{P}(\mathrm{Sign} \times \mathrm{Sign})$.
With these different pieces, we can compose together everything we need to get a sign-approximate analysis of plus: abstract the inputs to sign sets, and produce the possible signs of the sum.
The benefit of abstract interpretation is that at every step of the way, we know that this analysis is valid with respect to our concrete semantics!
Further up, our analysis for the statement can call into this analysis for plus expressions.
Finally, remember that our actual analysis is a mapping from abstract states to abstract states: the expression analysis tells us which abstract value to bind to `x` in the new abstract state.
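Putting these pieces together, here's a sketch of the statement-level transformer in Python (the names and the sign table are our own assumptions): abstract states map variable names to sign sets, and the transformer for `x = y + z` binds `x` to the abstract sum.

```python
# Abstract states: dicts from variable names to frozensets of signs.

ADD_TABLE = {
    ("-", "-"): {"-"}, ("-", "0"): {"-"}, ("0", "-"): {"-"},
    ("+", "+"): {"+"}, ("+", "0"): {"+"}, ("0", "+"): {"+"},
    ("0", "0"): {"0"},
    ("-", "+"): {"-", "0", "+"}, ("+", "-"): {"-", "0", "+"},
}

def abs_add(a, b):
    """Abstract semantics of +: every sign a sum could possibly have."""
    return frozenset(s for sa in a for sb in b for s in ADD_TABLE[(sa, sb)])

def transform_assign_add(state, x, y, z):
    """Abstract semantics of `x = y + z` on an abstract state."""
    new_state = dict(state)  # abstract analogue of sigma[x -> ...]
    new_state[x] = abs_add(state[y], state[z])
    return new_state

state = {"y": frozenset({"+"}), "z": frozenset({"0", "+"})}
print(transform_assign_add(state, "x", "y", "z")["x"])  # frozenset({'+'})
```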
Types of analyses
So what can we do with abstract interpretation? A ton.
- Relational domains
    - Karr's domain: linear equalities between variables (e.g., $x = 2y + z$)
    - Octagon domain: constraints of the form $\pm x \pm y \leq c$
    - Polyhedra domain: constraints of the form $a_1 x_1 + \dots + a_n x_n \leq c$
- Rounding error analysis (egglog, Herbie)
- Interval analysis (also egglog, Herbie)
- Symbolic analysis
- Strictness analysis: will a parameter in a function call be evaluated? Does this call terminate?
- Aliasing and pointer analysis
- Escape analysis: does data lifetime exceed scope?
- Heap analysis: memory leak detection, etc.
- Control flow analysis
With respect to MOP in dataflow
The dataflow world has something called a meet-over-paths solution, where we get a fixpoint solution via transfer functions. In the abstract interpretation world, we specify static analysis algorithms as approximations of program semantics, and use our soundness/correctness rules to ensure that these are valid approximations.
The abstract interpretation fixpoint solution is the same as the MOP solution for distributive abstract interpretations—when abstract transformers (i.e., abstract interpreters) preserve union.
Danger
I don't know what this means, but it's on this page about abstract interpretation:
Otherwise, the abstract domain used for the fixpoint solution can be minimally enriched into its disjunctive completion to get exactly the “MOP solution” in fixpoint form (on the completed abstract domain)
Fixed points
As we've elaborated above, our abstract interpreter transforms a property $l_1$ into a property $l_2$; we can express this as a monotone function $f : L \to L$ on our property lattice.
However, for recursive or iterative program constructs, we want to obtain the least fixed point of $f$, which we might hope to compute as the limit of the ascending chain $\bot \sqsubseteq f(\bot) \sqsubseteq f^2(\bot) \sqsubseteq \dots$
However, this relies on $L$ satisfying the ascending chain condition (or on $f$ being continuous), which we haven't required.

Thus, we don't know if this infinite sequence stabilizes, and even if it does, whether its limit is truly the least fixed point of the function.
Some notation
Before we move on, let’s establish some notation.
Since a function can have many different fixed points—values where applying the function returns the same thing—we denote the set of all these fixed points as $\mathrm{Fix}(f) = \{ l \mid f(l) = l \}$.
We can also define two other sets: the reductive elements $\mathrm{Red}(f) = \{ l \mid f(l) \sqsubseteq l \}$…

…and the extensive elements $\mathrm{Ext}(f) = \{ l \mid l \sqsubseteq f(l) \}$.
The different possible fixed points for $f$ are exactly $\mathrm{Fix}(f) = \mathrm{Red}(f) \cap \mathrm{Ext}(f)$.
Note that the greatest lower bound of $\mathrm{Red}(f)$ is itself a fixed point: it is the least fixed point, $\mathrm{lfp}(f) = \sqcap \mathrm{Red}(f) \in \mathrm{Fix}(f)$.

If we started with $\top$ instead, we would dually get the greatest fixed point: $\mathrm{gfp}(f) = \bigsqcup \mathrm{Ext}(f) \in \mathrm{Fix}(f)$.
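For a small finite lattice, we can just enumerate these sets. Here's a toy Python sketch over the powerset of $\{1, 2\}$ ordered by inclusion, with the monotone function $f(x) = x \cup \{1\}$:

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets, smallest first."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def f(x):
    """A monotone function on the powerset lattice: always include 1."""
    return x | {1}

lattice = powerset({1, 2})
fix = [x for x in lattice if f(x) == x]  # Fix(f): fixed points
red = [x for x in lattice if f(x) <= x]  # Red(f): f(x) below x
ext = [x for x in lattice if x <= f(x)]  # Ext(f): x below f(x)

# Iterating from bottom climbs to the least fixed point:
x = frozenset()
while f(x) != x:
    x = f(x)
print(sorted(x))               # [1]
print(x == min(fix, key=len))  # True: the iteration found lfp(f)
```

Here $\mathrm{Fix}(f) = \{\{1\}, \{1, 2\}\}$ and the iteration from $\bot = \emptyset$ stops at the least of them. On an infinite lattice, the same loop may never terminate, which is exactly the problem widening addresses.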
Widening
Okay. So we can't say that iterating $f$ from $\bot$ eventually stabilizes to $\mathrm{lfp}(f)$. What can we do instead?
At a broad level, we first calculate an over-approximation of a fixed point in a way that we know will converge. We can then later iteratively improve this approximation (this is narrowing, which tightens the over-approximation back down). To calculate this over-approximation, we introduce a widening operator $\nabla : L \times L \to L$, which must satisfy two properties:
- It must provide an "upper bound" on $l_1$ and $l_2$: $l_1 \sqcup l_2 \sqsubseteq l_1 \nabla l_2$.
    - We don't require $\nabla$ to be monotone, commutative, associative, or absorptive.
    - It literally just has to return something bigger than $l_1$ and $l_2$.
    - We don't guarantee that $l_1 \nabla l_2 = l_1 \sqcup l_2$!
- Given any chain $(l_n)_n$, the widened chain $(l_n^\nabla)_n$ eventually stabilizes.
    - Essentially, build the next item of the chain by $\nabla$-ing with the previous item: $l_0^\nabla = l_0$ and $l_{n+1}^\nabla = l_n^\nabla \,\nabla\, l_{n+1}$.
    - Basically, we widen! At each point in the chain, the next item has to be at least as big as that point in the original chain, as well as every previous item in the new chain.
    - Think: $[0, 1], [0, 2], [0, 3], \dots$ becomes $[0, 1], [0, \infty], [0, \infty], \dots$
    - Forcing items to become bigger!
Normally, to show that such an iteration stabilizes, we'd need the lattice to satisfy the ascending chain condition; the second property of $\nabla$ forces stabilization even when the lattice has infinite ascending chains.
Example
We can define a simple widening operator for intervals:
$$[a, b] \,\nabla\, [c, d] = [\, \text{if } c < a \text{ then } {-\infty} \text{ else } a,\;\; \text{if } d > b \text{ then } \infty \text{ else } b \,]$$
Wherever a bound of the second interval escapes the first, extend that bound to infinity. Otherwise, keep the first interval's bound.
With this operator, we can define the widened iteration sequence: $l_0 = \bot$, $l_{n+1} = l_n \,\nabla\, f(l_n)$ (stopping once $f(l_n) \sqsubseteq l_n$).

Essentially, if applying $f$ grows the result, we widen to leapfrog ahead; by the stabilization property of $\nabla$, this sequence always becomes stationary after finitely many steps.

We know that the value it stabilizes at is in $\mathrm{Red}(f)$, and therefore sits above $\mathrm{lfp}(f)$: a sound over-approximation of the least fixed point.
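A sketch of this in Python (hypothetical names), analyzing a loop that starts `i` at 0 and increments it each iteration. Plain iteration would climb $[0,0], [0,1], [0,2], \dots$ forever; widening jumps to $[0, \infty]$ and stabilizes:

```python
INF = float("inf")

def widen(a, b):
    """Interval widening: any bound that grows jumps straight to infinity."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (-INF if b_lo < a_lo else a_lo,
            INF if b_hi > a_hi else a_hi)

def f(interval):
    """Abstract effect of one step of: i = 0; while ...: i = i + 1
    (join the initial interval [0, 0] with the incremented interval)."""
    lo, hi = interval
    return (min(0, lo + 1), max(0, hi + 1))

# Widened iteration: x_{n+1} = x_n widen f(x_n), until it stabilizes.
x = (0, 0)
while widen(x, f(x)) != x:
    x = widen(x, f(x))
print(x)  # (0, inf)
```

The result $[0, \infty]$ is coarser than any single finite iterate, but it's reached in finitely many steps and is a sound over-approximation of every value `i` can take.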