Symbolic Abstraction
Symbolic abstraction is a technique that computes the best abstract domain element that overapproximates a given formula in first-order logic. It serves as a bridge between concrete program analysis and efficient abstract reasoning, enabling scalable verification and analysis of complex systems.
The key idea is to transform a concrete logical formula into an abstract representation that captures the essential behavior while being more amenable to automated analysis. This abstraction preserves important properties (like satisfiability) while discarding unnecessary details that would make analysis computationally expensive.
Overview
Arlib’s symbolic abstraction module (arlib/symabs
) provides a comprehensive suite
of abstraction techniques for program analysis and verification. The module is organized
around several key approaches:
Abstract Interpretation-based Symbolic Abstraction (
ai_symabs
)Optimization Modulo Theory-based Symbolic Abstraction (
omt_symabs
)Predicate Abstraction (
predicate_abstraction
)Model Counting-based Abstract Interpretation (
mcai
)Congruence Closure-based Abstraction (
congruence
)Range Set SAT-based Abstraction (
rangeset_sat
)
Each approach offers different trade-offs between precision, scalability, and applicability to different problem domains.
Core Concepts
Abstract Domain
An abstract domain defines the lattice structure and operations for abstract elements:
class AbstractDomain:
def join(self, elements): # Least upper bound
def meet(self, elements): # Greatest lower bound
def gamma_hat(self, alpha): # Concretization function
def model(self, formula): # Find concrete model
The gamma_hat
function translates an abstract element back to a concrete formula,
while the join
operation computes the least upper bound in the abstract lattice.
Concretization and Abstraction
Concretization (γ̂): Maps abstract elements to concrete formulas
Abstraction: Maps concrete formulas to abstract elements
Galois Connection: The pair (α, γ̂) forms a Galois connection if α(γ̂(α̂)) = α̂
Precision Metrics
The quality of an abstraction is measured by:
Precision: How closely the abstract formula approximates the concrete one
Scalability: Computational cost of abstraction computation
Expressiveness: Ability to capture important program properties
Abstract Interpretation (ai_symabs)
The ai_symabs
submodule implements classic abstract interpretation domains based on
Thakur’s PhD thesis. It provides a framework for computing symbolic abstractions using
various abstract domains.
Supported Abstract Domains
Interval Domain
The interval domain represents variable ranges using lower and upper bounds:
from arlib.symabs.ai_symabs.domains.interval.domain import IntervalDomain
# Create interval domain for variables ['x', 'y']
domain = IntervalDomain(['x', 'y'])
# Abstract state represents: x ∈ [0, 10], y ∈ [-5, 5]
# This concretizes to: 0 ≤ x ∧ x ≤ 10 ∧ -5 ≤ y ∧ y ≤ 5
Sign Domain
The sign domain tracks the sign of variables (negative, zero, positive):
from arlib.symabs.ai_symabs.domains.sign.domain import SignDomain
# Sign domain for variable 'x'
# Possible abstract values: -, 0, +, T (top)
Reduced Product Domain
Combines multiple domains for better precision:
from arlib.symabs.ai_symabs.domains.reduced_product.domain import ReducedProductDomain
# Combine interval and sign domains
combined = ReducedProductDomain([interval_domain, sign_domain])
Abstraction Algorithms
Bilateral Algorithm
The bilateral algorithm efficiently computes abstractions by working with both the positive and negative forms of the formula:
from arlib.symabs.ai_symabs.domains.algorithms.bilateral import bilateral
# Compute abstraction using bilateral algorithm
abstract_state = bilateral(concrete_formula, domain)
RSY Algorithm
Recursive abstraction refinement algorithm that iteratively improves precision:
from arlib.symabs.ai_symabs.domains.algorithms.rsy import rsy
# Compute abstraction with RSY algorithm
abstract_state = rsy(concrete_formula, domain)
OMT-based Symbolic Abstraction (omt_symabs)
The omt_symabs
submodule uses Optimization Modulo Theory (OMT) solvers to compute
optimal abstractions. This approach leverages modern OMT engines like Z3’s optimizer
to find the best abstract representation.
Key Features
Optimal Abstractions: Computes the most precise abstraction within the domain
Multiple Domains: Supports intervals, zones, and octagons
Theory Support: Handles both linear arithmetic (LIA/LRA) and bit-vectors
Linear Arithmetic Abstraction
from arlib.symabs.omt_symabs.lira_symbolic_abstraction import LIRASymbolicAbstraction
# Create LIRA symbolic abstraction
symabs = LIRASymbolicAbstraction()
# Initialize from SMT formula
symabs.init_from_formula(formula)
# Compute interval abstraction
interval_abs = symabs.compute_interval_abstraction()
# Compute zone abstraction
zone_abs = symabs.compute_zone_abstraction()
Bit-Vector Abstraction
from arlib.symabs.omt_symabs.bv_symbolic_abstraction import BVSymbolicAbstraction
# Create bit-vector symbolic abstraction
bv_symabs = BVSymbolicAbstraction()
# Initialize with bit-vector formula
bv_symabs.init_from_formula(bv_formula)
# Compute abstraction
abstract_state = bv_symabs.compute_abstraction()
Predicate Abstraction (predicate_abstraction)
Predicate abstraction computes the strongest Boolean combination of given predicates that is entailed by the concrete formula. This approach is particularly effective for software verification tasks.
Core Algorithm
The predicate abstraction algorithm works by:
Finding satisfying models of the concrete formula
Evaluating predicates in each model
Constructing the strongest Boolean formula consistent with the evaluations
from arlib.symabs.predicate_abstraction.predicate_abstraction import compute_predicate_abstraction
# Define predicates
predicates = [x > 0, y > 0, x + y < 10]
# Concrete formula: x > 5 ∧ y > 3 ∧ x + y < 8
concrete_formula = z3.And(x > 5, y > 3, x + y < 8)
# Compute predicate abstraction
abstract_formula = compute_predicate_abstraction(concrete_formula, predicates)
# Result: x > 0 ∧ y > 0 ∧ x + y < 10 (strongest consequence)
Additional Components
Model Counting-based Abstract Interpretation (mcai)
The mcai
submodule combines model counting with abstract interpretation to provide
precision metrics and analysis for different abstract domains. This is particularly
useful for:
Evaluating abstraction quality
Comparing different abstract domains
Bit-vector formula analysis
Congruence Closure-based Abstraction (congruence)
The congruence
submodule implements congruence closure techniques for abstraction,
particularly useful for equality reasoning and uninterpreted functions.
Range Set SAT-based Abstraction (rangeset_sat)
The rangeset_sat
submodule provides SAT-based abstraction techniques for range
and set operations over bit-vectors, offering efficient analysis for hardware verification.
Usage Examples
Basic Symbolic Abstraction
import z3
from arlib.symabs.ai_symabs.domains.interval.domain import IntervalDomain
from arlib.symabs.ai_symabs.domains.algorithms.bilateral import bilateral
# Define concrete formula: 0 ≤ x ∧ x ≤ 10 ∧ x + y = 15
x, y = z3.Ints('x y')
formula = z3.And(x >= 0, x <= 10, x + y == 15)
# Create interval domain
domain = IntervalDomain(['x', 'y'])
# Compute abstraction
abstract_state = bilateral(formula, domain)
# Get concrete representation
concrete_repr = domain.gamma_hat(abstract_state)
print(f"Abstract state concretizes to: {concrete_repr}")
OMT-based Abstraction
from arlib.symabs.omt_symabs.lira_symbolic_abstraction import LIRASymbolicAbstraction
# Create symbolic abstraction engine
symabs = LIRASymbolicAbstraction()
# Load formula from file
symabs.init_from_file('example.smt2')
# Set OMT engine type
symabs.set_omt_engine_type('z3opt')
# Compute optimal interval abstraction
interval_abs = symabs.compute_interval_abstraction()
print(f"Interval abstraction: {interval_abs}")
Predicate Abstraction Example
import z3
from arlib.symabs.predicate_abstraction.predicate_abstraction import compute_predicate_abstraction
# Define variables and predicates
x, y = z3.Ints('x y')
predicates = [x >= 0, y >= 0, x + y <= 10]
# Concrete formula representing a program path
concrete = z3.And(x >= 5, y >= 3, x + y <= 8)
# Compute abstraction
abstract = compute_predicate_abstraction(concrete, predicates)
print(f"Predicate abstraction: {abstract}")
Applications
Symbolic abstraction techniques find applications in various domains:
Program Verification
Software Model Checking: Abstract program states to enable scalable verification
Shape Analysis: Abstract heap structures and pointer relationships
Numerical Program Analysis: Abstract numerical computations for overflow/underflow detection
Security Analysis
Information Flow Analysis: Abstract security levels and taint propagation
Cryptographic Protocol Verification: Abstract cryptographic operations
Buffer Overflow Detection: Abstract memory operations and bounds checking
Hardware Verification
Bit-Vector Abstraction: Abstract bit-level operations for hardware verification
Range Analysis: Abstract value ranges for timing and power analysis
Equivalence Checking: Abstract circuit behaviors for equivalence verification
Optimization and Synthesis
Compiler Optimizations: Abstract program semantics for optimization
Program Synthesis: Abstract specifications for synthesis tasks
Test Generation: Abstract program paths for test case generation
Performance Considerations
Choosing the Right Approach
When to use ai_symabs:
Fast analysis with reasonable precision
Classic abstract domains (intervals, signs, octagons)
Large-scale program analysis
Resource-constrained environments
When to use omt_symabs:
Need for optimal precision
Linear arithmetic formulas
Small to medium-sized problems
When exact bounds are critical
When to use predicate_abstraction:
Software verification tasks
User-defined predicates available
Boolean program analysis
Counterexample-guided abstraction refinement (CEGAR)
Scalability Trade-offs
Precision vs. Scalability: OMT provides highest precision but scales poorly
Domain Selection: Interval domains offer good precision/scalability balance
Algorithm Choice: Bilateral algorithm often provides best performance
Parallelization: Many abstraction algorithms can be parallelized
Future Directions
Research Areas
Theory Development: - New abstract domains for emerging computing paradigms - Combination methods for heterogeneous domains - Completeness and decidability results for abstraction algorithms
Algorithm Improvement: - Machine learning-guided abstraction refinement - Parallel and distributed abstraction algorithms - Incremental abstraction techniques for dynamic analysis
Integration with Modern Verification: - Deep learning integration for abstraction learning - Quantum computing abstraction techniques - Blockchain and smart contract verification
Industrial Applications: - Large-scale software verification - Autonomous systems analysis - Cyber-physical systems verification
References
Core Theory
Thakur, A. V. (2014). Symbolic Abstraction: Algorithms and Applications. Ph.D. dissertation, University of Wisconsin-Madison. https://www.cs.wisc.edu/~aws/pubs/thesis.pdf
Cousot, P., & Cousot, R. (1977). Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. POPL’77.
Graf, S., & Saïdi, H. (1997). Construction of abstract state graphs with PVS. CAV’97.
Abstract Interpretation
Cousot, P., Cousot, R., & Logozzo, F. (2013). A parametric segmentation functor for fully automatic and scalable array content analysis. POPL’13.
Miné, A. (2006). The octagon abstract domain. Higher-Order and Symbolic Computation.
Singh, G., Püschel, M., & Vechev, M. (2015). Fast numerical program analysis with reinforcement learning. CAV’15.
OMT-based Abstraction
Sebastiani, R., & Trentin, P. (2015). OPTIMathSAT: a tool for optimization modulo theories. CAV’15.
Borralleras, C., Larraz, D., Oliveras, A., Rodríguez-Carbonell, E., & Rubio, A. (2019). The iSAT solver. TACAS’19.
Predicate Abstraction
Graf, S., & Saïdi, H. (1997). Construction of abstract state graphs with PVS. CAV’97.
Ball, T., Majumdar, R., Millstein, T., & Rajamani, S. K. (2001). Automatic predicate abstraction of C programs. PLDI’01.
Henzinger, T. A., Jhala, R., Majumdar, R., & McMillan, K. L. (2004). Abstractions from proofs. POPL’04.
Applications and Case Studies
Chen, Y., Wang, Y., Zhu, Z., Song, L., & Zhang, L. (2021). Program Analysis via Efficient Symbolic Abstraction. OOPSLA’21.
Albarghouthi, A., Li, Y., & Zhang, L. (2019). Automating Abstract Interpretation. VMCAI’16.
Gulwani, S., & Tiwari, A. (2007). Combining abstract interpretation with model checking. ESEC/FSE’07.