kinactive.db module

A DB class for creating the PK data collection and handling its I/O.

class kinactive.db.DB(cfg: DBConfig = DBConfig(verbose=True, target_dir=PosixPath('db'), pdb_dir=PosixPath('pdb/structures'), pdb_dir_info=PosixPath('pdb/info'), seq_dir=PosixPath('uniprot/fasta'), max_fetch_trials=2, io_cpus=1, init_cpus=1, init_map_numbering_cpus=1, profile=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/kinactive/checkouts/stable/kinactive/resources/Pkinase.hmm'), pk_map_name='PK', pk_min_score=50, pk_min_seq_domain_size=150, pk_min_str_domain_size=100, pk_min_cov_hmm=0.7, pk_min_cov_seq=0.7, pk_min_str_seq_match=0.9, min_seq_size=150, max_seq_size=3000, pdb_fmt='cif', pdb_num_fetch_threads=10, pdb_str_min_size=100, uniprot_chunk_size=100, uniprot_num_fetch_threads=10))

Bases: object

An object encapsulating methods for building, saving, and loading an lXtractor “database”, i.e., a collection of Chain objects.

__init__(cfg: DBConfig = DBConfig(verbose=True, target_dir=PosixPath('db'), pdb_dir=PosixPath('pdb/structures'), pdb_dir_info=PosixPath('pdb/info'), seq_dir=PosixPath('uniprot/fasta'), max_fetch_trials=2, io_cpus=1, init_cpus=1, init_map_numbering_cpus=1, profile=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/kinactive/checkouts/stable/kinactive/resources/Pkinase.hmm'), pk_map_name='PK', pk_min_score=50, pk_min_seq_domain_size=150, pk_min_str_domain_size=100, pk_min_cov_hmm=0.7, pk_min_cov_seq=0.7, pk_min_str_seq_match=0.9, min_seq_size=150, max_seq_size=3000, pdb_fmt='cif', pdb_num_fetch_threads=10, pdb_str_min_size=100, uniprot_chunk_size=100, uniprot_num_fetch_threads=10))
build(uniprot_ids: Collection[str] | None = None, pdb_chain_ids: Collection[str] | None = None, n_domains: int = 0) → ChainList[Chain]

Build a new lXt-PK data collection.

Parameters:
  • uniprot_ids – An optional collection of UniProt IDs to restrict the db to.

  • pdb_chain_ids – An optional collection of PDB chains to restrict the db to. Format: “{PDB_ID}:{ChainID}”.

  • n_domains – Use n randomly selected sequence domains. This is helpful for testing the pipeline.

Returns:

A ChainList of Chain objects having at least one child PK domain with at least one PK domain structure passing the filtering thresholds.
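The “{PDB_ID}:{ChainID}” format expected by pdb_chain_ids can be produced from (PDB ID, chain ID) pairs. The following is a minimal sketch: format_pdb_chain_ids is a hypothetical helper that is not part of kinactive, the rule of rejecting empty parts and embedded colons is an assumption, and the IDs shown are only illustrative.

```python
from typing import Iterable, List, Tuple


def format_pdb_chain_ids(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Format (PDB ID, chain ID) pairs as "{PDB_ID}:{ChainID}" strings.

    Hypothetical helper, not part of kinactive. Case is left untouched,
    since the expected casing is not stated in the documentation.
    """
    ids = []
    for pdb_id, chain_id in pairs:
        # Assumption: neither part may be empty or contain the separator.
        if not pdb_id or not chain_id or ":" in pdb_id or ":" in chain_id:
            raise ValueError(f"Invalid pair: {(pdb_id, chain_id)!r}")
        ids.append(f"{pdb_id}:{chain_id}")
    return ids


# Possible usage, assuming kinactive is installed (not executed here):
#   from kinactive.db import DB
#   ids = format_pdb_chain_ids([("2SRC", "A")])
#   chains = DB().build(pdb_chain_ids=ids)
```

Validating the identifiers up front keeps malformed entries from reaching build, which may fetch data before any problem becomes visible.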

fetch()

Fetch an already prepared data collection.

Returns:

load(dump: Path | Iterable[Path]) → ChainList[Chain]

Load a prepared db.

Parameters:

dump – A path, or an iterable of paths, with dumped Chain objects.

Returns:

A ChainList with initialized Chain objects.
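Because dump may be an iterable of paths, a caller might want to filter out missing paths before passing them on. The following is a minimal sketch: existing_dumps is a hypothetical helper that is not part of kinactive, and how load itself treats a missing path is not documented here.

```python
from pathlib import Path
from typing import Iterable, List


def existing_dumps(candidates: Iterable[Path]) -> List[Path]:
    """Keep only the candidate dump paths that exist on disk, in order.

    Hypothetical helper, not part of kinactive.
    """
    return [p for p in map(Path, candidates) if p.exists()]


# Possible usage, assuming kinactive is installed and the paths below
# (placeholders) point at previously saved chains (not executed here):
#   from kinactive.db import DB
#   chains = DB().load(existing_dumps([Path("db/chain_1"), Path("db/chain_2")]))
```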

save(dest: Path | None = None, chains: Iterable[Chain] | None = None, *, overwrite: bool = False, summary: bool = True) → None

Save DB sequences to the file system.

Parameters:
  • dest – Destination path to write seqs into.

  • chains – Chains to save, supplied manually. If None, the chains property is used.

  • overwrite – Overwrite existing data in dest.

  • summary – Compose and save summaries to dest.

Returns:

An iterator over paths of successfully saved chains. Consume to trigger saving.

property chains: ChainList[Chain]
Returns:

Currently stored chains.