SASA Validation

Validation is a consistency check rather than a comparison to an external ground truth. As in the paper, SASA is defined operationally by the chosen algorithm, radii, probe radius, sampling convention, hydrogen policy, and parser decisions, so there is no single implementation-independent reference value for these inputs. Close agreement under matched settings shows that zsasa stays in line with established tools, not that any one tool is uniquely correct.

Static structure validation against FreeSASA

The exact Shrake-Rupley path closely reproduces FreeSASA total SASA on the E. coli AlphaFold Database validation set. At 100 sphere points, f64 has a mean relative difference of 2.06×10⁻⁵%; f32 remains very close, with a mean relative difference of 1.40×10⁻⁴%. This near-identity is expected because exact zsasa follows FreeSASA's golden-spiral sphere-point convention.
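To make the sphere-point convention concrete, here is a minimal sketch of a generic golden-spiral (Fibonacci) point set in Python. It is an illustration only: the exact offsets and ordering used by FreeSASA and zsasa are not stated on this page, and the function names `golden_spiral_points` and `area_from_exposed` are hypothetical.

```python
import math


def golden_spiral_points(n):
    """Return n roughly evenly spaced points on the unit sphere.

    Generic golden-spiral construction; the precise convention used by
    FreeSASA or zsasa may differ in offset or ordering (assumption).
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Sample z at the midpoints of n equal-height bands in (-1, 1).
        z = 1.0 - (2.0 * i + 1.0) / n
        # Radius of the horizontal circle at height z.
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def area_from_exposed(radius, probe_radius, n_exposed, n_total):
    """Shrake-Rupley area estimate for one atom.

    Each test point stands for an equal share of the expanded sphere
    of radius (radius + probe_radius).
    """
    expanded = radius + probe_radius
    return 4.0 * math.pi * expanded * expanded * n_exposed / n_total
```

Two implementations that generate the same point set and apply the same occlusion test should differ only by floating-point rounding, which is consistent with the very small differences reported above.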

The bitmask path is intentionally different: it trades exact numerical identity for throughput and bounded approximation error. At 128 points, bitmask f64 and f32 both show mean relative differences of about 0.662% versus FreeSASA. These rows should be read as measurements of a lookup-table approximation, not as a simple sampling-density convergence curve: fixed direction and angle quantization introduces quantization error and a systematic offset, while finite sphere-point sampling error can either cancel or reinforce that offset. Increasing the requested point count can reduce worst-case deviations, but it does not guarantee a monotonic decrease in mean relative difference.

The two visible scatter plots below match the two validation panels used in paper Figure 2: an exact f64 row that demonstrates near numerical identity with FreeSASA, and the bitmask f32 row used for the throughput-oriented approximation claim. The full scatter grid is still available below as supporting detail.

Figures 1-2. Representative static-validation scatter plots. Left: exact zsasa f64 at 100 sphere points (R² = 1.000000, mean relative difference 2.1×10⁻⁵%). Right: zsasa bitmask f32 at 128 sphere points (R² = 0.999811, mean relative difference 0.66%).

Static validation mean relative error

Figure 3. Static validation error across point counts. This summary view shows how agreement changes across the point-count sweep. Exact zsasa modes remain nearly identical to FreeSASA because they share the same golden-spiral sphere-point convention. zsasa bitmask mode shows a visible but quantified approximation envelope caused by lookup-table quantization error. RustSASA and Lahuta use the same point-placement convention in this sweep, so their curves nearly overlap. Lahuta bitmask is reported only at the 128-point, Lahuta-compatible setting used for the paper/batch comparison, not as part of the main point-count trend.
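The mean and maximum relative differences reported on this page can be summarized with a few lines of Python. This sketch assumes the relative difference is |test - reference| / reference per structure, with the comparison tool as the reference; the page does not state the formula, so treat that choice, the function names, and the sample numbers as hypothetical.

```python
def relative_differences_pct(test_values, reference_values):
    """Per-item relative difference in percent, using the reference as denominator."""
    if len(test_values) != len(reference_values):
        raise ValueError("inputs must have the same length")
    diffs = []
    for test, ref in zip(test_values, reference_values):
        if ref <= 0.0:
            raise ValueError("reference SASA must be positive")
        diffs.append(abs(test - ref) / ref * 100.0)
    return diffs


def summarize(test_values, reference_values):
    """Return the mean and maximum relative difference in percent."""
    diffs = relative_differences_pct(test_values, reference_values)
    return {"mean_pct": sum(diffs) / len(diffs), "max_pct": max(diffs)}


# Hypothetical total-SASA values (in Å²) for two structures.
summary = summarize([101.0, 196.0], [100.0, 200.0])
```

Because the mean averages per-structure differences, a systematic offset and sampling noise can partly cancel in it, which is why the bitmask rows need not decrease monotonically with point count.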

Static validation scatter grid

The scatter grid is retained as supporting evidence.

Trajectory validation against MDTraj

Trajectory validation uses the 5wvo_C ATLAS trajectory with 1,001 frames. Agreement improves as the sphere-point count increases because MDTraj and zsasa place sphere points differently: the resulting sampling difference is largest at low point counts and shrinks as both implementations converge toward the same surface area.

MD validation R²

Figure 4. MD validation R² across point counts. Higher point counts reduce the implementation-specific sampling difference against MDTraj. This summary is easier to scan than the full scatter-grid output.
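For readers who want to recompute an R² value from per-frame totals, a minimal Python sketch follows. It assumes R² means the squared Pearson correlation between the two tools' per-frame SASA values; the page does not say which definition was used, and the function name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and the cross-deviation term.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("R^2 is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)
```

Under this definition, R² depends on how much the per-frame totals vary relative to the sampling noise, so it can be modest even when the relative differences are small.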

MD validation scatter grid

The scatter grid is retained as supporting evidence.

Trajectory validation values:

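| Path | Points | R² | Mean relative difference | Max relative difference |
| --- | --- | --- | --- | --- |
| zsasa + MDTraj f64 | 100 | 0.872 | 0.946% | 1.90% |
| zsasa + MDTraj f64 | 500 | 0.9938 | 0.198% | 0.531% |
| zsasa + MDTraj f64 | 1,000 | 0.9983 | 0.0998% | 0.288% |
| CLI f64 | 100 | 0.7725 | 1.27% | 2.66% |
| CLI f64 | 1,000 | 0.9722 | 0.416% | 1.03% |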

Summary table

| Comparison | Workload | Mode | Points | R² | Mean relative difference | Max relative difference |
| --- | --- | --- | --- | --- | --- | --- |
| FreeSASA | 4,370 E. coli AFDB structures | f64 | 100 | 1.000000 | 0.0000206% | 0.000205% |
| FreeSASA | 4,370 E. coli AFDB structures | f32 | 100 | 1.000000 | 0.000140% | 0.0150% |
| FreeSASA | 4,370 E. coli AFDB structures | bitmask f64 | 128 | 0.999811 | 0.662% | 2.02% |
| MDTraj | 5wvo_C, 1,001 frames | zsasa + MDTraj f64 | 500 | 0.9938 | 0.198% | 0.531% |
| MDTraj | 5wvo_C, 1,001 frames | zsasa + MDTraj f64 | 1,000 | 0.9983 | 0.0998% | 0.288% |

Reproducibility notes

  • Static validation used 4,370 E. coli K-12 AFDB structures.
  • Static validation points: 100, 128, 200, 500, and 1,000.
  • MD validation points: 100, 200, 500, and 1,000.
  • Validation records were imported into DuckDB and exported through zsasa-benchmarks/results/tables/validation_pairwise_summary.csv.