SASA Validation

Accuracy comparison of zsasa against reference implementations across the E. coli K-12 proteome (4,370 structures).

Note: Validation uses batch processing across the entire proteome. MD trajectory validation is covered in md.md.

TL;DR

SR quicklook: zsasa f64 vs FreeSASA (4,370 structures)
| Algorithm | Tool | R² | Mean Error % | Max Error % |
|---|---|---|---|---|
| SR (100 pts) | zsasa f64 | 1.000000 | 0.0000 | 0.0000 |
| SR (100 pts) | zsasa f32 | 1.000000 | 0.0001 | 0.0150 |
| SR (100 pts) | zsasa bitmask | 0.999721 | 0.81 | 2.51 |
| SR (100 pts) | RustSASA | 0.999963 | 0.32 | 2.49 |
| SR (128 pts) | Lahuta bitmask | 0.999768 | 0.73 | 2.47 |
| LR (20 slices) | zsasa f64 | 0.999980 | 0.22 | 0.31 |

  • zsasa f64 is bit-identical to FreeSASA at all point counts
  • zsasa f32 has negligible rounding error (max 0.015%)
  • Bitmask mean error plateaus at ~0.7–0.8% (LUT approximation; adding points does not reduce it)
  • RustSASA converges with point count: 0.32% at 100 → 0.06% at 1000
  • Lahuta bitmask (128 pts): comparable accuracy to zsasa bitmask

Test Environment

| Item | Value |
|---|---|
| Machine | MacBook Pro |
| Chip | Apple M4 (10 cores: 4P + 6E) |
| Memory | 32 GB |
| OS | macOS 15.3.2 (Darwin 24.6.0) |

Shrake-Rupley (SR)

Dataset: AlphaFold E. coli K-12 proteome, 4,370 structures. Reference: FreeSASA.
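
The tables in this section report R², mean error, and max error per tool. As a minimal sketch of how such columns could be computed from paired per-structure totals, the snippet below assumes that error means absolute difference relative to the reference value and that R² is the squared Pearson correlation. Both definitions are assumptions, as are the function name `validation_metrics` and the sample numbers; the actual definitions live in the validation script.

```python
def validation_metrics(reference, candidate):
    """Return (r_squared, mean_error_pct, max_error_pct) for paired totals."""
    if len(reference) != len(candidate) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(reference)

    # Assumed definition: per-structure error in percent of the reference value.
    errors = [abs(c - r) / r * 100.0 for r, c in zip(reference, candidate)]

    # Assumed definition: R² as the squared Pearson correlation coefficient.
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    r_squared = cov * cov / (var_r * var_c)

    return r_squared, sum(errors) / n, max(errors)


# Hypothetical total SASA values for three structures (not real results).
ref = [10000.0, 25000.0, 40000.0]
cand = [10050.0, 24900.0, 40200.0]
r2, mean_err, max_err = validation_metrics(ref, cand)
print(f"R2={r2:.6f}  mean={mean_err:.4f}%  max={max_err:.4f}%")
```

Under these definitions a tool can score a very high R² while still showing a percent-level max error on individual structures, which is the pattern visible in the bitmask rows.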

n_points Convergence

| Tool | n_points | N | R² | Mean Error % | Max Error % |
|---|---|---|---|---|---|
| zsasa f64 | 100 | 4,370 | 1.000000 | 0.0000 | 0.0000 |
| zsasa f32 | 100 | 4,370 | 1.000000 | 0.0001 | 0.0150 |
| zsasa bitmask f64 | 100 | 4,370 | 0.999721 | 0.8115 | 2.5060 |
| zsasa bitmask f32 | 100 | 4,370 | 0.999721 | 0.8113 | 2.5060 |
| RustSASA | 100 | 4,370 | 0.999963 | 0.3181 | 2.4859 |
| zsasa f64 | 200 | 4,370 | 1.000000 | 0.0000 | 0.0000 |
| zsasa f32 | 200 | 4,370 | 1.000000 | 0.0002 | 0.0110 |
| zsasa bitmask f64 | 200 | 4,370 | 0.999781 | 0.7335 | 1.6818 |
| zsasa bitmask f32 | 200 | 4,370 | 0.999781 | 0.7334 | 1.6818 |
| RustSASA | 200 | 4,370 | 0.999988 | 0.1841 | 1.2255 |
| zsasa f64 | 500 | 4,370 | 1.000000 | 0.0000 | 0.0000 |
| zsasa f32 | 500 | 4,370 | 1.000000 | 0.0002 | 0.0043 |
| zsasa bitmask f64 | 500 | 4,370 | 0.999778 | 0.7472 | 1.3270 |
| zsasa bitmask f32 | 500 | 4,370 | 0.999778 | 0.7471 | 1.3270 |
| RustSASA | 500 | 4,370 | 0.999997 | 0.0923 | 0.5562 |
| zsasa f64 | 1000 | 4,370 | 1.000000 | 0.0000 | 0.0000 |
| zsasa f32 | 1000 | 4,370 | 1.000000 | 0.0001 | 0.0028 |
| zsasa bitmask f64 | 1000 | 4,370 | 0.999784 | 0.7389 | 1.0528 |
| zsasa bitmask f32 | 1000 | 4,370 | 0.999785 | 0.7387 | 1.0527 |
| RustSASA | 1000 | 4,370 | 0.999999 | 0.0558 | 0.3401 |

Key findings:

  • zsasa f64: Bit-identical to FreeSASA at all point counts (same algorithm parameters)
  • zsasa f32: Max 0.015% error from floating-point rounding — negligible for practical use
  • Bitmask variants: Mean error ~0.7–0.8%, plateaus regardless of point count (LUT approximation error)
  • RustSASA: Converges toward FreeSASA with increasing points (0.32% → 0.06% mean error)
  • f32 vs f64 bitmask: Virtually identical — bitmask error dominates floating-point error
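
To make the role of n_points concrete, here is a small self-contained sketch of Shrake-Rupley point counting on two overlapping spheres, compared against the closed-form spherical-cap answer. It is not the zsasa, FreeSASA, or RustSASA implementation: the golden-spiral point layout, the function names, and the test geometry (two probe-expanded spheres of radius 3.0 placed 4.0 apart) are all assumptions chosen for illustration.

```python
import math


def sphere_points(n_points):
    """Quasi-uniform points on the unit sphere (golden-spiral layout, an assumption)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n_points):
        z = 1.0 - (2.0 * i + 1.0) / n_points
        ring = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((ring * math.cos(phi), ring * math.sin(phi), z))
    return points


def sr_exposed_area(center, radius, neighbors, n_points):
    """Exposed area of one sphere; neighbors is a list of (center, radius) pairs.

    All radii are assumed to be probe-expanded already.
    """
    exposed = 0
    for ux, uy, uz in sphere_points(n_points):
        p = (center[0] + radius * ux, center[1] + radius * uy, center[2] + radius * uz)
        # A test point counts as exposed if it lies outside every neighbor sphere.
        if all(math.dist(p, c) >= r for c, r in neighbors):
            exposed += 1
    return 4.0 * math.pi * radius ** 2 * exposed / n_points


def two_sphere_exact(radius, nb_radius, distance):
    """Full sphere minus the cap inside the neighbor (partial overlap only)."""
    plane = (distance ** 2 + radius ** 2 - nb_radius ** 2) / (2.0 * distance)
    return 4.0 * math.pi * radius ** 2 - 2.0 * math.pi * radius * (radius - plane)


exact = two_sphere_exact(3.0, 3.0, 4.0)
for n in (100, 200, 500, 1000):
    est = sr_exposed_area((0.0, 0.0, 0.0), 3.0, [((4.0, 0.0, 0.0), 3.0)], n)
    print(f"n_points={n:5d}  error={abs(est - exact) / exact * 100:.3f}%")
```

Two tools that place their test points differently will disagree at a given n_points and approach each other as the count grows, whereas two tools that share point placement and parameters can agree exactly.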

Lahuta Bitmask (128 pts)

Lahuta requires n_points=128 for bitmask support, so every tool in this comparison is run at 128 points.

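| Tool | N | R² | Mean Error % | Max Error % |
|---|---|---|---|---|
| zsasa f64 | 4,370 | 1.000000 | 0.0000 | 0.0000 |
| zsasa f32 | 4,370 | 1.000000 | 0.0001 | 0.0153 |
| zsasa bitmask f64 | 4,370 | 0.999811 | 0.6618 | 2.0245 |
| zsasa bitmask f32 | 4,370 | 0.999811 | 0.6616 | 2.0245 |
| RustSASA | 4,370 | 0.999973 | 0.2752 | 2.1487 |
| Lahuta bitmask | 4,370 | 0.999768 | 0.7316 | 2.4709 |
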
  • Lahuta bitmask accuracy is comparable to zsasa bitmask (~0.73% vs ~0.66% mean error)
  • Both use LUT bitmask neighbor lists — the error pattern is similar

Validation Plots

[Figure: SR validation grid. Panels: zsasa f64, zsasa f32, zsasa bitmask f64, zsasa bitmask f32, Lahuta bitmask.]

Lee-Richards (LR)

Dataset: AlphaFold E. coli K-12 proteome, 4,370 structures, n_slices=20. Reference: FreeSASA.

| Tool | N | R² | Mean Error % | Max Error % |
|---|---|---|---|---|
| zsasa f64 | 4,370 | 0.999980 | 0.2214 | 0.3102 |
| zsasa f32 | 4,370 | 0.999980 | 0.2214 | 0.3097 |

Key findings:

  • f32 and f64 are virtually identical (same R² and mean error; max error 0.3097% vs 0.3102%): LR error comes from slice discretization differences, not floating-point precision
  • Mean error ~0.22% is higher than SR (<0.001%) due to different slicing implementations between zsasa and FreeSASA
  • R² = 0.999980 confirms strong linear agreement
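
The effect of slice discretization can be illustrated with a small sketch of a slice-based area estimate for a sphere with a single neighbor, checked against the closed-form cap formula. This is a simplified stand-in, not the zsasa or FreeSASA Lee-Richards code: it handles only one neighbor whose center lies in the same z plane, places each slice at its mid-plane, and uses hypothetical function names and geometry.

```python
import math


def lr_exposed_area(radius, nb_radius, distance, n_slices):
    """Slice-based exposed area of a sphere with one neighbor.

    Radii are assumed probe-expanded. The neighbor center sits `distance`
    away along x, and slices are cut perpendicular to z.
    """
    thickness = 2.0 * radius / n_slices
    area = 0.0
    for k in range(n_slices):
        z = -radius + (k + 0.5) * thickness       # slice mid-plane
        r_i = math.sqrt(radius * radius - z * z)  # circle cut from this sphere
        exposed_angle = 2.0 * math.pi
        if abs(z) < nb_radius:
            r_j = math.sqrt(nb_radius * nb_radius - z * z)  # neighbor's circle
            if distance <= r_j - r_i:
                exposed_angle = 0.0               # circle fully inside neighbor
            elif abs(r_i - r_j) < distance < r_i + r_j:
                # Partial overlap: subtract the arc covered by the neighbor circle.
                cos_half = (r_i * r_i + distance * distance - r_j * r_j) / (2.0 * r_i * distance)
                exposed_angle -= 2.0 * math.acos(max(-1.0, min(1.0, cos_half)))
        # A band of height dz spanning angle theta on a sphere has area R * theta * dz.
        area += radius * exposed_angle * thickness
    return area


def cap_exact(radius, nb_radius, distance):
    """Full sphere minus the cap inside the neighbor (partial overlap only)."""
    plane = (distance ** 2 + radius ** 2 - nb_radius ** 2) / (2.0 * distance)
    return 4.0 * math.pi * radius ** 2 - 2.0 * math.pi * radius * (radius - plane)


exact = cap_exact(3.0, 3.0, 4.0)
for n in (20, 50, 200):
    est = lr_exposed_area(3.0, 3.0, 4.0, n)
    print(f"n_slices={n:4d}  error={abs(est - exact) / exact * 100:.4f}%")
```

In this sketch the residual error depends on where the slice planes fall relative to the overlap boundary, so two implementations that position or weight slices differently would not be expected to match exactly at the same n_slices.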

Validation Plots

[Figure: LR validation grid. Panels: zsasa f64, zsasa f32.]

Running Validation

# Shrake-Rupley (all tools, multi-point convergence)
./benchmarks/scripts/validation.py run \
    -i benchmarks/UP000000625_83333_ECOLI_v6/pdb \
    -n ecoli --algorithm sr
# -> benchmarks/results/validation/ecoli/sr/

# Lee-Richards
./benchmarks/scripts/validation.py run \
    -i benchmarks/UP000000625_83333_ECOLI_v6/pdb \
    -n ecoli --algorithm lr
# -> benchmarks/results/validation/ecoli/lr/

# Re-analyze existing results
./benchmarks/scripts/validation.py compare \
    -d benchmarks/results/validation/ecoli/sr