Skip to main content

Querying the Database

The pmb query command executes SQL queries against the configured database and outputs results in various formats.

Basic Usage

# Display results as a Rich table (default)
pmb query "SELECT * FROM cc.brief_summary LIMIT 5"

# Read SQL from a file
pmb query -f query.sql

Output Formats

Table (default)

Rich-formatted table for terminal viewing:

pmb query "SELECT comp_id, name, formula FROM cc.brief_summary LIMIT 5"

Use --limit to cap the number of displayed rows:

pmb query "SELECT * FROM cc.brief_summary" --limit 20

CSV

# Output to stdout (pipe or redirect)
pmb query "SELECT * FROM cc.brief_summary" -F csv > compounds.csv

# Output to file
pmb query "SELECT * FROM cc.brief_summary" -F csv -o compounds.csv

JSON

# Output to stdout
pmb query "SELECT * FROM cc.brief_summary LIMIT 10" -F json

# Output to file
pmb query "SELECT * FROM cc.brief_summary" -F json -o compounds.json

Parquet

Parquet output requires --output (-o):

pmb query "SELECT * FROM cc.brief_summary" -F parquet -o compounds.parquet
tip

Parquet files can be loaded directly into Polars or Pandas for further analysis:

import polars as pl
df = pl.read_parquet("compounds.parquet")

Options

OptionShortDescription
--file-fRead SQL from file instead of argument
--format-FOutput format: table, csv, json, parquet
--output-oOutput file path (required for parquet)
--limit-lMax rows to display (table format only)
--config-cConfig file path (default: config.yml)

Examples

# Chemical substructure search → CSV
pmb query "SELECT comp_id, name FROM cc.brief_summary WHERE mol @> 'c1ccccc1'::mol" -F csv

# Export all entities for a structure
pmb query "SELECT * FROM pdbj.entity WHERE pdbid = '1A00'" -F json

# Large export to Parquet
pmb query "SELECT pdbid, title FROM pdbj.brief_summary" -F parquet -o structures.parquet