

Get it from PyPI:

pip install rcsbsearch

Or download it from GitHub.


Here is a quick example of how the package is used. Two syntaxes are available for constructing queries: an “operator” API using Python’s comparison operators, and a “fluent” syntax where terms are chained together. Both build the same queries, so which to use is a matter of preference.

A runnable Jupyter notebook with this example is available in notebooks/quickstart.ipynb, or can be run online using Binder.

An additional Covid-19 related example is in notebooks/covid.ipynb, which can also be run on Binder.

Operator example

Here is an example from the RCSB Search API page, using the operator syntax. This query finds symmetric dimers having a twofold rotation with the DNA-binding domain of a heat-shock transcription factor.

from rcsbsearch import TextQuery
from rcsbsearch import rcsb_attributes as attrs

# Create terminals for each query
q1 = TextQuery('"heat-shock transcription factor"')
q2 = attrs.rcsb_struct_symmetry.symbol == "C2"
q3 = attrs.rcsb_struct_symmetry.kind == "Global Symmetry"
q4 = attrs.rcsb_entry_info.polymer_entity_count_DNA >= 1

# Combine the terminals using bitwise operators (&, |, ~, etc.)
query = q1 & q2 & q3 & q4  # AND of all queries

# Call the query to execute it
for assemblyid in query("assembly"):
    print(assemblyid)

For a full list of attributes, please refer to the RCSB schema.
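
The operator syntax relies on Python's operator overloading: comparing an attribute yields a query term rather than a boolean, and `&` or `|` joins terms into a group. The sketch below is a toy illustration of that idea only, not rcsbsearch's actual implementation. The names `ToyAttr`, `ToyTerminal`, and `ToyGroup` are hypothetical, and nothing here contacts the RCSB service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTerminal:
    """One condition, e.g. symbol == "C2" (hypothetical class)."""
    attribute: str
    operator: str
    value: object

    def __and__(self, other):
        return ToyGroup("and", (self, other))

    def __or__(self, other):
        return ToyGroup("or", (self, other))


@dataclass(frozen=True)
class ToyGroup:
    """Several terms joined by "and" or "or" (hypothetical class)."""
    operator: str
    nodes: tuple

    def __and__(self, other):
        return ToyGroup("and", (self, other))

    def __or__(self, other):
        return ToyGroup("or", (self, other))


class ToyAttr:
    """Named attribute whose comparisons return terms, not booleans."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return ToyTerminal(self.name, "exact_match", value)

    def __ge__(self, value):
        return ToyTerminal(self.name, "greater_or_equal", value)


symbol = ToyAttr("rcsb_struct_symmetry.symbol") == "C2"
dna = ToyAttr("rcsb_entry_info.polymer_entity_count_DNA") >= 1
toy_query = symbol & dna  # a ToyGroup holding both terminals
print(toy_query)
```

Because each comparison returns an object, a query can be assembled piece by piece and stored in variables before anything is executed, which is what the `q1` to `q4` terminals above do.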

Fluent example

Here is the same example using the fluent syntax:

from rcsbsearch import Attr, TextQuery

# Start with an Attr or TextQuery, then add terms
results = TextQuery('"heat-shock transcription factor"') \
    .and_("rcsb_struct_symmetry.symbol").exact_match("C2") \
    .and_("rcsb_struct_symmetry.kind").exact_match("Global Symmetry") \
    .and_("rcsb_entry_info.polymer_entity_count_DNA").greater_or_equal(1) \
    .exec("assembly")

# Exec produces an iterator of IDs
for assemblyid in results:
    print(assemblyid)
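
Fluent chaining works because every method returns a query object, so the next call can be attached directly to the previous one. The following is a minimal sketch of that pattern under stated assumptions: `ToyFluentQuery` is a hypothetical class written for illustration, it only records terms in a tuple, and it does not reflect how rcsbsearch is implemented internally.

```python
class ToyFluentQuery:
    """Hypothetical builder that records (attribute, operator, value) terms."""

    def __init__(self, terms=()):
        self.terms = tuple(terms)
        self._pending = None  # attribute named by and_(), awaiting a comparison

    def and_(self, attribute):
        # Return a new object so earlier stages of the chain stay unchanged
        nxt = ToyFluentQuery(self.terms)
        nxt._pending = attribute
        return nxt

    def _finish(self, operator, value):
        if self._pending is None:
            raise ValueError("call and_() before a comparison method")
        return ToyFluentQuery(self.terms + ((self._pending, operator, value),))

    def exact_match(self, value):
        return self._finish("exact_match", value)

    def greater_or_equal(self, value):
        return self._finish("greater_or_equal", value)


# Parentheses allow the chain to span lines without trailing backslashes
toy = (
    ToyFluentQuery()
    .and_("rcsb_struct_symmetry.symbol").exact_match("C2")
    .and_("rcsb_entry_info.polymer_entity_count_DNA").greater_or_equal(1)
)
print(toy.terms)
```

Wrapping the chain in parentheses, as in the sketch, is a general Python alternative to backslash continuations and avoids errors from a stray trailing backslash.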