
Batch Processing

Use batch mode when you need to calculate SASA for many structure files with the same settings.

Basic Directory Batch

zsasa batch structures/ results/

This scans structures/ for supported input files and writes per-file outputs under results/.

Common options:

zsasa batch structures/ results/ --threads=8 --format=json
zsasa batch structures/ results/ --format=jsonl --output=results.jsonl
zsasa batch structures/ results/ --classifier=ccd --ccd=components.zsdc

JSONL for Large Runs

For large datasets, prefer JSONL: each structure's result is written as a single line, so downstream tools can stream the output instead of loading it all at once:

zsasa batch structures/ results/ --format=jsonl --output=results.jsonl

JSONL is especially useful when you want to concatenate, filter, or process results incrementally.
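
As a minimal sketch of incremental processing, the Python snippet below parses JSONL one line at a time. The helper name iter_results and the keys in the sample rows are hypothetical and chosen only for the demo; they are not the zsasa output schema.

```python
import json
from typing import Iterable, Iterator


def iter_results(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty JSONL line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. after concatenating files
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc


# Stand-in rows: these keys are invented for the demo, not the real schema.
sample = ['{"id": "a", "value": 1.5}', "", '{"id": "b", "value": 2.5}']
rows = list(iter_results(sample))
print(len(rows))
```

For a real run, pass an open file object for results.jsonl instead of the sample list. The function only needs an iterable of lines, so it never holds more than one row in memory unless the caller collects them.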

Experimental Adaptive Bitmask SR

For large Shrake-Rupley (SR) batch jobs that already use bitmask mode, --adaptive-sr first runs a coarse bitmask pass over every atom, then recomputes only the atoms with intermediate exposure using the fine point count:

zsasa batch structures/ results/ \
--use-bitmask \
--adaptive-sr \
--coarse-points=64 \
--fine-points=256 \
--adaptive-low=0.10 \
--adaptive-high=0.90
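
The selection step can be pictured with the rough Python sketch below. It is an illustration, not the zsasa implementation: it assumes the low and high thresholds apply to the fraction of coarse test points found exposed for each atom, and that atoms exactly on a threshold keep their coarse result. Neither assumption is confirmed by this page, and the function name split_by_exposure is hypothetical.

```python
def split_by_exposure(coarse_fractions, low=0.10, high=0.90):
    """Split atom indices into (settled, refine) lists.

    Assumption: each value is the exposed fraction from the coarse pass.
    Atoms strictly between the thresholds are marked for a fine recompute;
    the real boundary handling in zsasa may differ.
    """
    settled, refine = [], []
    for index, fraction in enumerate(coarse_fractions):
        if low < fraction < high:
            refine.append(index)   # intermediate exposure: recompute finely
        else:
            settled.append(index)  # nearly buried or nearly exposed: keep coarse
    return settled, refine


# Invented exposure fractions for six atoms.
fractions = [0.0, 0.05, 0.5, 0.95, 1.0, 0.10]
settled, refine = split_by_exposure(fractions)
print(settled, refine)
```

Under these assumptions, the saving comes from atoms that are almost fully buried or almost fully exposed, since they never pay for the fine point count.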

Adaptive mode is currently available for zsasa batch only. It requires --use-bitmask and --algorithm=sr. The output schema is unchanged; compare against fixed fine-point bitmask runs when validating a new dataset.
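
One way to run that comparison is sketched below in Python. It assumes you have already extracted one total SASA value per structure from each run into a dictionary keyed by a structure identifier. The extraction is left out because it depends on the output schema, and the function name compare_runs and the sample numbers are made up for the demo.

```python
def compare_runs(reference, candidate):
    """Compare per-structure totals from two runs.

    Returns (max_abs_diff, max_rel_diff, unmatched_ids), where unmatched_ids
    lists structures present in only one of the two runs.
    """
    shared = reference.keys() & candidate.keys()
    unmatched = sorted(reference.keys() ^ candidate.keys())
    max_abs = 0.0
    max_rel = 0.0
    for key in shared:
        diff = abs(reference[key] - candidate[key])
        max_abs = max(max_abs, diff)
        if reference[key] != 0:
            max_rel = max(max_rel, diff / abs(reference[key]))
    return max_abs, max_rel, unmatched


# Invented totals: "fixed" stands for the fixed fine-point run.
fixed = {"1abc": 1000.0, "2xyz": 500.0}
adaptive = {"1abc": 1002.0, "2xyz": 499.0, "3new": 10.0}
max_abs, max_rel, unmatched = compare_runs(fixed, adaptive)
print(max_abs, max_rel, unmatched)
```

What counts as an acceptable difference depends on the dataset and the downstream analysis, so the sketch reports the raw maxima rather than applying a tolerance.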

Residue Maps in JSONL

Add --residue-map to include compact residue-level arrays in each JSONL row:

zsasa batch structures/ results/ \
--format=jsonl \
--output=results.jsonl \
--residue-map

This adds these arrays to each JSONL result:

  • residue_chain
  • residue_name
  • residue_number
  • residue_insertion_code
  • residue_atom_start
  • residue_atom_count
  • residue_sasa

The default JSONL schema is unchanged unless --residue-map is enabled.
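
As a sketch of how these arrays might be consumed, the Python snippet below zips them into one record per residue. It assumes the seven arrays are parallel (same length, one entry per residue), which the field names suggest but this page does not state. The sample values and the helper name residues are invented for the demo.

```python
RESIDUE_KEYS = [
    "residue_chain",
    "residue_name",
    "residue_number",
    "residue_insertion_code",
    "residue_atom_start",
    "residue_atom_count",
    "residue_sasa",
]


def residues(row):
    """Turn the residue_* arrays of one JSONL row into per-residue dicts."""
    columns = [row[key] for key in RESIDUE_KEYS]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("residue arrays have different lengths")
    # Drop the shared "residue_" prefix for shorter record keys.
    short_keys = [key[len("residue_"):] for key in RESIDUE_KEYS]
    return [dict(zip(short_keys, values)) for values in zip(*columns)]


# Invented two-residue row; a real row also carries its usual JSONL fields.
row = {
    "residue_chain": ["A", "A"],
    "residue_name": ["ALA", "GLY"],
    "residue_number": [1, 2],
    "residue_insertion_code": ["", ""],
    "residue_atom_start": [0, 5],
    "residue_atom_count": [5, 4],
    "residue_sasa": [101.5, 80.25],
}
records = residues(row)
total = sum(record["sasa"] for record in records)
print(records[0]["name"], total)
```

Keeping the data as parallel arrays in the file and expanding them only when needed keeps each JSONL row compact, which matters for large runs.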

Chain Filters

For a single non-workflow batch job, filter by chain with --chain, optionally combined with --auth-chain:

zsasa batch structures/ results/ --chain=A
zsasa batch structures/ results/ --auth-chain --chain=A

Use --chain for label/asym chain IDs. Add --auth-chain as a boolean modifier when the --chain value should match author-provided chain IDs in mmCIF inputs.

For named multi-job runs such as chain A, chain B, and AB complex calculations, use Workflow Files instead of repeating shell commands.

When to Use Workflow Files

Use workflow files when you need:

  • Reproducible settings checked into a project.
  • Multiple named jobs in one batch run.
  • Per-job chain filters.
  • Shared classifier, output, and calculation settings.
  • Custom classifier TOML configs for batch jobs.

See Workflow Files for TOML examples.

Reference