Batch Processing Benchmarks

Batch benchmarks measure complete directory processing: parsing, SASA calculation, output writing, and worker scheduling. The current benchmark results use zsasa v0.6.0, pinned comparator builds, 128 sphere points, and 10 threads unless noted.

TL;DR

FreeSASA has no native directory-batch command, so the FreeSASA batch rows use a thin freesasa_batch wrapper around the pinned FreeSASA C API.

The map view is the most compact summary: points higher and farther left are better. Both datasets show the same overall pattern: zsasa bitmask modes occupy the high-throughput, low-RSS corner, while comparator runs use more memory and deliver lower throughput.

Figures 1-2. Batch throughput versus peak RSS. These figures are generated by the benchmark repository plotting script and use its label-placement rules.

Headline values:

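| Dataset | Structures | Best zsasa mode | Runtime | Throughput | RSS | Speedup vs FreeSASA batch |
| --- | --- | --- | --- | --- | --- | --- |
| E. coli AFDB | 4,370 | bitmask f32 | 1.481 s | 2,951 str/s | 45.1 MiB | 8.77× |
| Human AFDB | 23,586 | bitmask f32 | 13.814 s | 1,707 str/s | 79.5 MiB | 9.70× |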

E. coli AFDB batch

The E. coli AFDB collection contains 4,370 structures and is used for both comparator benchmarking and thread scaling. It is the smaller proteome-scale workload, so it is useful for seeing scheduler overhead, thread scaling, and the cost of different output modes without being dominated by disk I/O.

At 10 threads, exact zsasa f64 is already faster than both the FreeSASA batch wrapper and RustSASA, while the bitmask modes move to a different throughput class. Speed is not the only practical result: peak RSS stays around 44-49 MiB for zsasa, compared with about 170-213 MiB for the three comparators in this workload.

Figure 3. E. coli throughput versus peak RSS. The bitmask modes are the fastest points and remain far left of Lahuta bitmask and FreeSASA batch on memory.

Figures 4-5. E. coli absolute throughput and peak RSS. Script-generated bar views for structures per second and memory.

Figures 6-7. E. coli runtime speedup and RSS reduction ratios. The n× views remain useful for direct tool-to-tool comparisons.

Detailed 10-thread values:

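| Tool | Runtime | Structures/s | RSS | Speedup vs FreeSASA | Speedup vs RustSASA | Speedup vs Lahuta bitmask |
| --- | --- | --- | --- | --- | --- | --- |
| zsasa f64 | 4.411 s | 991 | 45.5 MiB | 2.94× | 1.31× | 0.46× |
| zsasa f32 | 4.246 s | 1,029 | 43.5 MiB | 3.06× | 1.36× | 0.48× |
| zsasa bitmask f64 | 1.504 s | 2,905 | 48.5 MiB | 8.63× | 3.84× | 1.35× |
| zsasa bitmask f32 | 1.481 s | 2,951 | 45.1 MiB | 8.77× | 3.90× | 1.37× |
| FreeSASA batch | 12.982 s | 337 | 212.6 MiB | baseline | 0.44× | 0.16× |
| RustSASA | 5.769 s | 757 | 170.5 MiB | 2.25× | baseline | 0.35× |
| Lahuta bitmask | 2.034 s | 2,148 | 180.7 MiB | 6.38× | 2.84× | baseline |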
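
The throughput and speedup columns follow directly from the runtimes. The sketch below recomputes them, assuming throughput is the structure count divided by runtime and speedup is the baseline runtime divided by the tool's runtime. The function names are illustrative and are not part of the benchmark harness.

```python
# Runtimes in seconds, copied from the E. coli 10-thread table above.
N_STRUCTURES = 4370
runtimes = {
    "zsasa f64": 4.411,
    "zsasa f32": 4.246,
    "zsasa bitmask f64": 1.504,
    "zsasa bitmask f32": 1.481,
    "FreeSASA batch": 12.982,
    "RustSASA": 5.769,
    "Lahuta bitmask": 2.034,
}


def throughput(tool: str) -> float:
    """Structures per second (assumed definition: count / runtime)."""
    return N_STRUCTURES / runtimes[tool]


def speedup(tool: str, baseline: str) -> float:
    """How many times faster `tool` is than `baseline` (baseline / tool runtime)."""
    return runtimes[baseline] / runtimes[tool]


for tool in runtimes:
    print(
        f"{tool:18s} {throughput(tool):7.0f} str/s  "
        f"{speedup(tool, 'FreeSASA batch'):5.2f}x vs FreeSASA batch"
    )
```

A value below 1× in a speedup column means the tool in that row is slower than the baseline in that column.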

Thread scaling

Figure 8. E. coli throughput by thread count. zsasa exact and bitmask modes scale strongly up to 10 threads on the M4 benchmark machine.

The thread-scaling plot is included because the E. coli dataset was measured at 1, 4, 8, and 10 threads. zsasa f64 reaches 6.72× speedup at 10 threads and bitmask f32 reaches 6.78×. FreeSASA batch peaks at 4 threads (366 str/s) and declines slightly beyond that in this run, while RustSASA and Lahuta bitmask continue to improve but remain below the zsasa bitmask throughput at 10 threads.

Thread-scaling values:

| Mode | 1 thread | 4 threads | 8 threads | 10 threads | 10-thread speedup |
| --- | --- | --- | --- | --- | --- |
| zsasa f64 | 147 str/s | 563 str/s | 871 str/s | 991 str/s | 6.72× |
| zsasa bitmask f32 | 435 str/s | 1,688 str/s | 2,623 str/s | 2,951 str/s | 6.78× |
| FreeSASA batch | 105 str/s | 366 str/s | 352 str/s | 337 str/s | 3.21× |
| RustSASA | 142 str/s | 538 str/s | 697 str/s | 757 str/s | 5.34× |
| Lahuta bitmask | 325 str/s | 1,300 str/s | 1,917 str/s | 2,148 str/s | 6.61× |
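
As a rough way to read these numbers, the sketch below derives speedup and parallel efficiency (speedup divided by thread count) from the rounded str/s values. The `scaling` helper is illustrative only. Because the inputs are rounded, the last digit can differ slightly from the speedup column, which was presumably computed from unrounded measurements.

```python
# Throughput in structures/s, copied from the thread-scaling table above.
THREADS = (1, 4, 8, 10)
rates = {
    "zsasa f64": (147, 563, 871, 991),
    "zsasa bitmask f32": (435, 1688, 2623, 2951),
    "FreeSASA batch": (105, 366, 352, 337),
    "RustSASA": (142, 538, 697, 757),
    "Lahuta bitmask": (325, 1300, 1917, 2148),
}


def scaling(mode: str):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread rate."""
    base = rates[mode][0]
    return [(t, r / base, r / base / t) for t, r in zip(THREADS, rates[mode])]


for mode in rates:
    for threads, speedup, efficiency in scaling(mode):
        print(f"{mode:18s} {threads:2d} threads  {speedup:5.2f}x  {efficiency:4.0%}")
```

Efficiency makes the FreeSASA batch plateau easy to see: its speedup stops growing after 4 threads, so its efficiency falls fastest as threads are added.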

Human AFDB batch

The Human AFDB collection contains 23,586 structures, about 5.4× as many as the E. coli AFDB collection. It is the main larger proteome-scale workload in the current pinned benchmark set. Human proteins are larger on average, so absolute structures/s is lower than E. coli, but the relative shape of the comparison stays the same.

zsasa retains low peak memory because batch mode streams structures instead of holding the full collection in memory. The bitmask f32 mode processes the full Human AFDB set in 13.814 s at 1,707 structures/s, while staying below 80 MiB RSS. Comparator RSS ranges from about 327 MiB for Lahuta bitmask to 628 MiB for the FreeSASA batch wrapper.
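
The streaming idea can be illustrated with a generic bounded worker pool. This is a minimal Python sketch of the pattern and not zsasa's actual scheduler: `process_streaming`, `fake_sasa`, and the `max_in_flight` parameter are hypothetical names introduced here. Only a fixed number of structures is in flight at any time, so peak memory depends on that bound rather than on the size of the collection.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice


def process_streaming(paths, work, threads=4, max_in_flight=None):
    """Yield work(path) results while keeping at most max_in_flight tasks alive."""
    max_in_flight = max_in_flight or 2 * threads
    paths = iter(paths)  # consume lazily; never materialize the whole list
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Prime the pool with the first few tasks only.
        pending = {pool.submit(work, p) for p in islice(paths, max_in_flight)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                yield future.result()  # the result can be written out and dropped
            # Refill one new task per finished task to keep the bound.
            for p in islice(paths, len(done)):
                pending.add(pool.submit(work, p))


def fake_sasa(path):
    """Hypothetical stand-in for parse + SASA: returns (path, a dummy value)."""
    return path, len(path)


names = (f"model_{i}.pdb" for i in range(100))  # lazy input, like a directory walk
results = dict(process_streaming(names, fake_sasa, threads=4))
print(len(results), "structures processed")
```

Results arrive in completion order rather than input order, so a caller that needs stable output order has to sort or key the results. In Python the threads mainly illustrate the memory shape; they would not speed up CPU-bound work.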

Figure 9. Human throughput versus peak RSS. The Human workload shows the same high-throughput, low-memory separation for `zsasa` bitmask modes.

Figures 10-11. Human absolute throughput and peak RSS. Script-generated bar views for structures per second and memory.

Figures 12-13. Human runtime speedup and RSS reduction ratios. Ratio views are retained alongside the absolute throughput and RSS bars.

Detailed 10-thread values:

| Tool | Runtime | Structures/s | RSS | Speedup vs FreeSASA | Speedup vs RustSASA | Speedup vs Lahuta bitmask |
| --- | --- | --- | --- | --- | --- | --- |
| zsasa f64 | 45.508 s | 518 | 82.3 MiB | 2.94× | 1.48× | 0.43× |
| zsasa f32 | 44.352 s | 532 | 76.4 MiB | 3.02× | 1.52× | 0.44× |
| zsasa bitmask f64 | 14.150 s | 1,667 | 83.7 MiB | 9.47× | 4.77× | 1.38× |
| zsasa bitmask f32 | 13.814 s | 1,707 | 79.5 MiB | 9.70× | 4.89× | 1.42× |
| FreeSASA batch | 133.960 s | 176 | 627.5 MiB | baseline | 0.50× | 0.15× |
| RustSASA | 67.531 s | 349 | 330.0 MiB | 1.98× | baseline | 0.29× |
| Lahuta bitmask | 19.566 s | 1,205 | 326.9 MiB | 6.85× | 3.45× | baseline |
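
The RSS column can be turned into reduction ratios in the same way as the runtime speedups. The sketch below assumes the ratio is defined as comparator peak RSS divided by zsasa peak RSS; that definition and the `rss_reduction` helper are assumptions made here and may not match the plotting script exactly.

```python
# Peak RSS in MiB, copied from the Human 10-thread table above.
rss_mib = {
    "zsasa f64": 82.3,
    "zsasa f32": 76.4,
    "zsasa bitmask f64": 83.7,
    "zsasa bitmask f32": 79.5,
    "FreeSASA batch": 627.5,
    "RustSASA": 330.0,
    "Lahuta bitmask": 326.9,
}


def rss_reduction(tool: str, comparator: str) -> float:
    """How many times less peak memory `tool` uses than `comparator`."""
    return rss_mib[comparator] / rss_mib[tool]


for comparator in ("FreeSASA batch", "RustSASA", "Lahuta bitmask"):
    ratio = rss_reduction("zsasa bitmask f32", comparator)
    print(f"zsasa bitmask f32 uses {ratio:.2f}x less peak RSS than {comparator}")
```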

Legacy SwissProt benchmark (pre-pinned)

The SwissProt result below is retained only as historical context from the earlier website benchmark set. It was collected before the current zsasa-benchmarks pinned v0.6.0 harness and should not be mixed with the current pinned headline claims above.

It is still useful as a stress note for very large directory walks: when the dataset fits in the OS page cache, compute and per-file overhead dominate; when it exceeds memory, mmap page faults and storage behavior dominate. That is why the M2 Max and M4 results should be read separately rather than merged into the current E. coli/Human benchmark claims.

Dataset: SwissProt PDB v6, 550,122 structures, PDB format. Benchmark settings: warmup=3, runs=3, threads=10.

M2 Max, 96 GB

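| Tool | Time | files/s | RSS |
| --- | --- | --- | --- |
| zsasa bitmask f32 | 4m 02s | 2,269 | 157 MB |
| zsasa bitmask f64 | 4m 07s | 2,229 | 162 MB |
| Lahuta bitmask | 5m 12s | 1,761 | 2,187 MB |
| RustSASA | 10m 58s | 835 | 1,131 MB |
| FreeSASA | 32m 21s | 283 | 2,875 MB |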

M4, 32 GB

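| Tool | Time | files/s | RSS |
| --- | --- | --- | --- |
| zsasa bitmask f32 | 11m 05s | 828 | 157 MB |
| zsasa bitmask f64 | 11m 07s | 824 | 161 MB |
| Lahuta bitmask | 11m 08s | 823 | 2,152 MB |
| RustSASA | 26m 16s | 349 | 1,131 MB |
| FreeSASA | 31m 42s | 289 | 2,440 MB |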

On the 32 GB M4 system, the dataset exceeded available RAM and the run became I/O-bound; zsasa and Lahuta bitmask converged in wall-clock time, while zsasa retained much lower peak RSS.

Reproducing the current batch results

The current pinned benchmark data is exported from zsasa-benchmarks/results/tables/batch_t10_summary.csv and batch_thread_scaling.csv. The corresponding full harness is in N283T/zsasa-benchmarks.