CLI Reference
Quick reference for all pmb commands. See individual pages for detailed usage.
Data Pipeline Commands
pmb sync
Download data files from PDBj/wwPDB mirrors via rsync. Targets are defined in config.yml.
pmb sync # Sync all configured targets
pmb sync pdbj cc # Sync specific targets
pmb sync --dry-run # Preview without downloading
See Syncing Data for details.
pmb update
Run incremental database updates. Tracks file modification times to skip unchanged entries.
pmb update # Update all pipelines
pmb update pdbj cc # Update specific pipelines
pmb update pdbj --limit 100 # Limit entries processed
pmb update pdbj --force # Ignore mtime cache
See Updating the Database for details.
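The mtime-based skipping described above can be illustrated with a small shell sketch. This is not pmb's implementation: `changed_since` and the stamp-file approach are hypothetical stand-ins that only show the general idea of processing files modified since the last run.

```shell
# Hypothetical sketch of mtime-based change detection (not pmb's own code).
# Prints the files under a data directory that were modified after a stamp file.
# If the stamp file is missing (for example on a first run), every file is
# listed, which is roughly what ignoring the cache would amount to.
changed_since() (
    data_dir=$1
    stamp_file=$2
    if [ -f "$stamp_file" ]; then
        find "$data_dir" -type f -newer "$stamp_file"
    else
        find "$data_dir" -type f
    fi
)
```

A caller would process the listed files and then `touch` the stamp file, so the next run only sees files changed after that point.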
pmb load
Bulk load data using the PostgreSQL COPY protocol. Tables are truncated before loading, so existing rows are removed.
pmb load cc --force # Load single pipeline
pmb load pdbj --limit 1000 --force # Load with entry limit
See Updating the Database - Initial Load for details.
pmb all
Run a full sync and update cycle in a single step.
pmb all
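
Assuming `pmb all` behaves like running the two stages back to back (its error handling is not described here), the manual equivalent can be sketched as a small wrapper. `sync_then_update` is a hypothetical helper name, and passing the same names to both commands assumes that sync targets and update pipelines share names, as in the `pdbj cc` examples above.

```shell
# Hypothetical wrapper: run sync, then run update only if sync succeeded.
# Any arguments (e.g. pdbj cc) are passed to both commands unchanged.
sync_then_update() {
    pmb sync "$@" && pmb update "$@"
}
```

Calling `sync_then_update pdbj cc` would run `pmb sync pdbj cc` followed by `pmb update pdbj cc`.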
Query Commands
pmb schema
Inspect the database schema as derived from the model definitions. No database connection is required.
pmb schema # List all schemas
pmb schema pdbj # List tables in schema
pmb schema pdbj.brief_summary # Show columns
pmb schema pdbj.brief_summary.pdbid # Single column detail
pmb schema -s resolution # Search columns
pmb schema pdbj --json # JSON output (for AI/scripts)
See Inspecting Schemas for details.
pmb query
Execute SQL queries with multi-format output.
pmb query "SELECT * FROM cc.brief_summary LIMIT 5"
pmb query -f query.sql -F csv -o out.csv
pmb query "SELECT * FROM pdbj.brief_summary" -F parquet -o out.parquet
See Querying the Database for details.
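For scripted use, the `-f`, `-F`, and `-o` flags shown above can be wrapped in a small helper that keeps the SQL in a file next to its output. `export_csv` is a hypothetical name, and the example query relies on the `pdbid` column of `pdbj.brief_summary` mentioned under `pmb schema`.

```shell
# Hypothetical helper: write a query to <name>.sql and export it to <name>.csv.
# The body runs in a subshell, so its variables do not leak into the caller.
export_csv() (
    name=$1
    sql=$2
    printf '%s\n' "$sql" > "$name.sql" || exit 1
    pmb query -f "$name.sql" -F csv -o "$name.csv"
)

# Example call (requires a loaded database):
# export_csv sample "SELECT pdbid FROM pdbj.brief_summary LIMIT 10"
```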
pmb stats
Show database statistics (table counts, row counts, sizes).
pmb stats
pmb config
Display active configuration and resolved settings. Useful for verifying which config file is being used.
pmb config # Show config summary
pmb config --json # JSON output
pmb config -c /path/to/config.yml # Show specific config
pmb config --init # Generate config at ~/.config/pmb/config.yml
pmb config --init -c ./ # Generate config.yml in current directory
The config file is discovered automatically, in this order: ./config.yml, then ~/.config/pmb/config.yml.
Use --init to generate a config file from the bundled template.
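The lookup order above can be sketched as a shell function. This is an illustration rather than pmb's actual resolution code: `find_config` is a hypothetical name, and treating an explicit path (as passed to `-c`) as taking priority over auto-discovery is an assumption.

```shell
# Hypothetical sketch of config discovery (not pmb's own code).
# Prints the path of the config file to use, or returns 1 if none is found.
# The body runs in a subshell, so `exit` only leaves the function.
find_config() (
    explicit=${1:-}
    if [ -n "$explicit" ]; then
        # An explicit path is used as given; a missing file is an error.
        if [ -f "$explicit" ]; then
            printf '%s\n' "$explicit"
            exit 0
        fi
        exit 1
    fi
    # Otherwise check the current directory first, then the user config dir.
    for candidate in ./config.yml "$HOME/.config/pmb/config.yml"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            exit 0
        fi
    done
    exit 1
)
```

To see which file pmb itself resolved, run `pmb config`.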
pmb compounds
Refresh the unified chem.compounds table from cc.brief_summary and prd.brief_summary. The table includes an RDKit mol column, a GiST index, and descriptor triggers.
pmb compounds # Refresh compounds table
pmb compounds --force # Skip confirmation
This runs automatically after pmb update when the cc or prd pipelines are included. Use this command to refresh the compounds table manually.
Administration Commands
pmb reset
Drop and recreate database schemas. This is destructive and cannot be undone.
pmb reset cc # Reset single schema
pmb reset all --force # Reset all schemas
See Updating the Database - Reset Schemas for details.
pmb setup-rdkit
Set up the RDKit PostgreSQL extension, create mol columns on cc.brief_summary and prd.brief_summary, and load the chemical search SQL functions.
pmb setup-rdkit
This runs automatically as part of the cc and prd pipelines. Use this command only when you need to add RDKit functions to an existing database without re-running the full pipeline.
See cc Schema - RDKit Integration for details.
pmb test
Create a test database and validate pipelines. Uses config.test.yml by default.
pmb test # Test all pipelines
pmb test pdbj cc # Test specific pipelines
pmb test --drop --limit 20 # Drop existing test DB, process 20 entries
| Option | Short | Description |
|---|---|---|
| --drop | -d | Drop existing test database before testing |
| --limit | -l | Limit number of files to process (default: 10) |
| --workers | -w | Number of worker processes |
| --config | -c | Config file path (default: config.test.yml) |
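
The options in the table can be combined in a single invocation, for example in a CI job. The sketch below assumes that positional pipeline names and options can be mixed freely, which the examples above suggest but do not state. `smoke_test` and the worker count of 2 are illustrative choices.

```shell
# Hypothetical CI helper: rebuild the test database and validate the given
# pipelines on a small sample, using only options listed in the table above.
smoke_test() {
    pmb test "$@" --drop --limit 20 --workers 2 --config config.test.yml
}
```

Calling `smoke_test pdbj cc` limits the run to those two pipelines. With no arguments, all pipelines are tested.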
Global Options
These options are available for most commands:
| Option | Short | Description |
|---|---|---|
| --config | -c | Config file path (auto-discovered if not set) |
| --version | -v | Show version |
| --help | | Show help |