Summary
SSA is the same as CPS. The algorithm for finding SSA is the same as finding optimal nesting in CPS. In this algorithm, wherever a variable is defined, we need
s for that variable in the dominator frontier of that node. The dominator frontier consists of all nodes who are successors of dominated nodes, but are not themselves dominated.
The core thesis: SSA and lambda calculus are doing the same thing. Imperative languages compile down to basic blocks and flow edges, functional languages compile to lexically nested functions. They’re the same in different notation!
Motivation for SSA
Sometimes we want to find the use-sites of each defined variable, or conversely, the definition-sites of each variable in an expression. The def-use chain is a data structure that holds, for each statement, all use sites of variables defined there, and all definition sites of the variables used there. But if a variable has
Hence, SSA:
§ 6.2.4. Single static assignment (SSA)
Single static assignment (SSA) form is a form of a program with two key attributes:
- Every variable is only defined once. No variable is redefined or shadowed or mutated, ever.
- Where two diverging control paths rejoin (e.g., after a branch), we use the
function to denote “pick the value based on which branch was taken.” Example:
becomes
Link to original
Where to place s?
So how do we convert into SSA? Specifically, how do we rename variables and where do we put
One dumb approach: split every variable at every basic-block boundary, and use
Example of crude placement
Lot of useless copies… what can we do about that?
Taking inspo from functional programming
We can instead express the program as a set of mutually recursive functions, where each function represents a basic block, and its arguments represents the locals in the program.
In this representation, a basic block might correspond to a function:
If, in that function,
When actually writing programs in this style though, normally we don’t define a ton of mutually recursive functions; instead we take advantage of nested scope to reduce repetition:
In this case, since we only ever use one value of
Here’s the key: the best way to nest definitions is the same as the optimal SSA form!
The algorithm for optimal s
We only need a
Well, let’s look at blocks 1 and 2, for instance. Only one definition of
Additionally, we say that if there’s an edge
Let’s assume that at the start node, we initializing all variables with some definition. If a node
Thus, the rule is: whenever node x
, all nodes in the dominance frontier of
Because of the equivalence above, we can say that this algorithm converts imperative procedures into well-structured functional programs with nested scope.
What we can learn
For SSA users: one important property of SSA is that a variable definition dominates all uses (or for a use within a
For functional programmers: learn how to use flow charts and boxes lmao:
Functional programmers often get lost in the notation of functional programming, which is a shame.