Credo Application Programming Interface Tutorial

Fetching objects from the database

# Fetch all structures by UniProt accession
In [1]: structures = StructureAdaptor().fetchAllByUniProt('P53779')
 
In [2]: structures
Out[2]:
[<Structure('1JNK')>,
 <Structure('1PMN')>,
 <Structure('1PMQ')>,
 <Structure('1PMU')>,
 <Structure('1PMV')>,
 <Structure('2B1P')>,
 <Structure('2EXC')>,
 <Structure('2O0U')>,
 <Structure('2O2U')>,
 <Structure('2OK1')>,
 <Structure('2P33')>,
 <Structure('2R9S')>,
 <Structure('2WAJ')>,
 <Structure('2ZDT')>,
 <Structure('2ZDU')>,
 <Structure('3CGF')>,
 <Structure('3CGO')>,
 <Structure('3DA6')>,
 <Structure('3FI2')>,
 <Structure('3FI3')>,
 <Structure('3FV8')>,
 <Structure('3G90')>,
 <Structure('3G9L')>,
 <Structure('3G9N')>]
 
# Fetch all chemical components by Tanimoto similarity (requires MyChem)
# Here s is a previously fetched Structure whose first ligand is J07
In [1]: j07 = s.Ligands[0].LigandComponents[0].ChemComp
In [2]: chemcomps = ChemCompAdaptor().fetchAllBySimilarity(j07.can, tanimoto=0.5)
 
In [3]: chemcomps
Out[3]:
[('J07', 1.0),
 ('JNO', 0.56983240223463705),
 ('JNF', 0.53216374269005895),
 ('5BP', 0.50289017341040498),
 ('AA2', 0.502857142857143)]
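
The tanimoto=0.5 cutoff above keeps only components whose fingerprint overlap with the query is at least 0.5. As a rough illustration of what that score measures, here is a minimal sketch in plain Python. It assumes fingerprints are represented as sets of "on" bit positions; the function name tanimoto, the labels X1 to X3 and the bit sets are made up for the example and are not part of credoscript or MyChem.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints share nothing; report 0.0 by convention here
        return 0.0
    return len(a & b) / float(union)


# Hypothetical fingerprints, for illustration only
query = {1, 4, 7, 9}
candidates = {"X1": {1, 4, 7, 9}, "X2": {1, 4, 7}, "X3": {1, 2, 3}}

# Score every candidate, keep those at or above the cutoff, best first
scored = [(name, tanimoto(query, fp)) for name, fp in candidates.items()]
hits = sorted((h for h in scored if h[1] >= 0.5), key=lambda h: h[1], reverse=True)
```

With these toy bit sets, X3 falls below the cutoff and is dropped, which mirrors how the hit list above only contains components scoring 0.5 or higher.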

Chemical components in CREDO

In [1]: from credoscript import *
In [1]: s = StructureAdaptor().fetchByPDB('3CS9')
 
In [2]: s
Out[2]: <Structure('3CS9')>
 
In [3]: s.Ligands
Out[3]:          
[Ligand(600, NIL, A),
 Ligand(600, NIL, B),
 Ligand(600, NIL, C),
 Ligand(600, NIL, D)]
 
# First ligand in list
In [4]: s.Ligands[0]
Out[4]: Ligand(600, NIL, A)                        
 
In [5]: s.Ligands[0].LigandComponents              
Out[5]: [<LigandComponent(600, NIL, A)>]           
 
# First component of first ligand (heteropeptides can have more than one component) 
In [6]: s.Ligands[0].LigandComponents[0]           
Out[6]: <LigandComponent(600, NIL, A)>             
 
# Chemical component object / conformer-independent information  
In [7]: s.Ligands[0].LigandComponents[0].ChemComp  
Out[7]: <ChemComp(NIL)>                            
 
In [8]: c = s.Ligands[0].LigandComponents[0].ChemComp
 
# Isomeric SMILES (generated with OEChem)
In [9]: c.ism
Out[9]: 'Cc1ccc(cc1Nc2nccc(n2)c3cccnc3)C(=O)Nc4cc(cc(c4)n5cc(nc5)C)C(F)(F)F'
 
# MOL block (as used in SDF files)
In [10]: print c.getMolBlock()
NIL                            
  -OEChem-07250903003D         
 
 61 65  0     0  0  0  0  0  0999 V2000
    1.4221    0.1035   -0.0988 O   0  0  0  0  0  0  0  0  0  0  0  0
[...]
 
# Cross references to other databases
In [11]: c.XRefs
Out[11]:
[<XRef(ChemComp,DrugBank Compound,DB04868)>,
 <XRef(ChemComp,KEGG DRUG,D08953)>,
 <XRef(ChemComp,StARlite compound,426660)>]

Shape similarity search using Ultrafast Shape Recognition (USR)

USR moments are calculated for every ligand in CREDO and for up to 200 unbound, low-energy conformers of each.
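
To make the term "USR moments" concrete, the following is a minimal sketch of the descriptor as it is commonly described: distances from every atom to four reference points (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one) are each summarised by three moments, giving 12 numbers per molecule. The helper names, the toy coordinates and the exact moment definitions (mean, variance, skewness) are assumptions made for illustration; implementations differ in how the second and third moments are normalised, and this is not necessarily how CREDO computes them.

```python
import math


def _dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def _moments(distances):
    """Mean, variance and skewness of a list of distances."""
    n = len(distances)
    mean = sum(distances) / float(n)
    variance = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    skewness = third / variance ** 1.5 if variance > 0 else 0.0
    return [mean, variance, skewness]


def usr_descriptor(coords):
    """Return 12 shape moments for a list of (x, y, z) atom coordinates."""
    n = float(len(coords))
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))  # centroid
    d_ctd = [_dist(c, ctd) for c in coords]
    cst = coords[d_ctd.index(min(d_ctd))]  # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]  # atom farthest from the centroid
    d_fct = [_dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]  # atom farthest from fct
    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor.extend(_moments([_dist(c, ref) for c in coords]))
    return descriptor


def usr_similarity(desc_a, desc_b):
    """Score in (0, 1]: inverse of 1 plus the mean absolute moment difference."""
    diff = sum(abs(a - b) for a, b in zip(desc_a, desc_b))
    return 1.0 / (1.0 + diff / len(desc_a))


# Toy coordinates, for illustration only
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 1.5, 0.5)]
shifted = [(x + 5.0, y - 3.0, z + 2.0) for x, y, z in mol]
stretched = [(2 * x, 2 * y, 2 * z) for x, y, z in mol]

same_shape = usr_similarity(usr_descriptor(mol), usr_descriptor(shifted))
other_shape = usr_similarity(usr_descriptor(mol), usr_descriptor(stretched))
```

Because only interatomic distances enter the descriptor, a translated copy scores essentially 1.0 without any alignment step, while the stretched copy scores lower. This is what makes comparing precomputed moments fast enough to screen many conformers.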

Active ligand against active ligands

In this approach, a bound ligand is used as a template against all other bound ligands in CREDO.
The function returns a list of tuples in the form (LIGAND ID, HET ID, SIMILARITY).

s = StructureAdaptor().fetchByPDB('1OPJ')
s.Ligands
 
[Ligand(1, MYR, A),
 Ligand(3, STI, A),
 Ligand(5, CL, A),
 Ligand(2, MYR, B),
 Ligand(4, STI, B),
 Ligand(6, CL, B)]
 
sti = s.Ligands[1]
 
# Default: active against active, top 10 hits
sti.USR()
 
[(251127L, 'STI', 0.96517336947999999),
 (147408L, 'STI', 0.96494049005000004),
 (147409L, 'STI', 0.95693780307999998),
 (147410L, 'STI', 0.94801710280999996),
 (147411L, 'STI', 0.94794215309999996),
 (60177L, 'STI', 0.93764648486000002),
 (168962L, 'STI', 0.93720707308999995),
 (30467L, 'STI', 0.92843326611999999),
 (30470L, 'STI', 0.91157703434000004),
 (163155L, 'STI', 0.90518220483)]
 
# Ignore all hits with the same chemical compound (STI)
sti.USR(exclude_self=True)
 
[(49806L, 'NAD', 0.88915234453000003),
 (184633L, 'TH3', 0.88836243273000004),
 (49807L, 'NAD', 0.88645934419000005),
 (49808L, 'NAD', 0.88170462764000002),
 (49809L, 'NAD', 0.88118661818999999),
 (62424L, 'LI1', 0.87931414069000002),
 (110258L, 'NAD', 0.87931408678),
 (105227L, 'NAD', 0.87822015420999999),
 (110256L, 'NAD', 0.87476306328999998),
 (48822L, 'NAD', 0.87457187697000005)]
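
Since the result is an ordinary list of tuples, it can be summarised with the standard library. The sketch below uses the hits from the exclude_self=True call above, pasted in as literals (without the Python 2 long suffix) so that it runs without a database connection, and counts how often each HET ID occurs. The variable names are illustrative only.

```python
from collections import Counter

# Hits copied from the sti.USR(exclude_self=True) output above
hits = [
    (49806, 'NAD', 0.88915234453000003),
    (184633, 'TH3', 0.88836243273000004),
    (49807, 'NAD', 0.88645934419000005),
    (49808, 'NAD', 0.88170462764000002),
    (49809, 'NAD', 0.88118661818999999),
    (62424, 'LI1', 0.87931414069000002),
    (110258, 'NAD', 0.87931408678),
    (105227, 'NAD', 0.87822015420999999),
    (110256, 'NAD', 0.87476306328999998),
    (48822, 'NAD', 0.87457187697000005),
]

# Count hits per HET ID (second element of each tuple)
counts = Counter(het_id for _ligand_id, het_id, _similarity in hits)

for het_id, n in counts.most_common():
    print('%s %d' % (het_id, n))
```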

Active ligand against unbound chemical component conformers

In this approach, an active (bound) ligand is used as a template against all unbound conformers of all chemical components in the PDB chemical component dictionary. The function returns a list of tuples in the form (HET ID, CONF_ID, SIMILARITY).

sti.USR(conformers=True)
 
[('OKA', 36L, 0.91414645592999999),
 ('337', 200L, 0.91136936419000003),
 ('OKA', 126L, 0.91074688378000002),
 ('337', 152L, 0.91060855332000001),
 ('337', 144L, 0.90860909284000002),
 ('337', 36L, 0.90723517072000004),
 ('337', 68L, 0.90675530338999999),
 ('337', 107L, 0.90422721311999998),
 ('OKA', 32L, 0.90375055872999999),
 ('DRY', 123L, 0.90361441895000005)]
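
Several conformers of the same chemical component can appear in one result list, so it is often useful to keep only the best-scoring conformer per HET ID. The following sketch does this for the output above, again with the tuples pasted in as literals so that it runs on its own. The variable names are illustrative.

```python
from collections import defaultdict

# Hits copied from the sti.USR(conformers=True) output above
hits = [
    ('OKA', 36, 0.91414645592999999),
    ('337', 200, 0.91136936419000003),
    ('OKA', 126, 0.91074688378000002),
    ('337', 152, 0.91060855332000001),
    ('337', 144, 0.90860909284000002),
    ('337', 36, 0.90723517072000004),
    ('337', 68, 0.90675530338999999),
    ('337', 107, 0.90422721311999998),
    ('OKA', 32, 0.90375055872999999),
    ('DRY', 123, 0.90361441895000005),
]

# Group (similarity, conformer id) pairs by HET ID
by_het_id = defaultdict(list)
for het_id, conf_id, similarity in hits:
    by_het_id[het_id].append((similarity, conf_id))

# max() compares the similarity first, so it returns the best conformer
best = dict((het_id, max(pairs)) for het_id, pairs in by_het_id.items())
```

The grouping does not rely on the list already being sorted by similarity, so it also works if hits from several queries are concatenated first.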

Fragments in CREDO

Chemical components from the PDB are fragmented using the RECAP algorithm. A fragment in CREDO is either the product of a fragmentation reaction or the unfragmented input molecule, which is referred to as the root fragment. Unlike fragments as defined in fragment-based drug design, fragments in CREDO can be of arbitrary size, depending on the size of the starting molecule. All fragments in CREDO nevertheless have physicochemical properties associated with them for easy filtering.

In [1]: from credoscript import *
In [2]: c = ChemCompAdaptor().fetchByHetID('STI') # Imatinib
 
# Fragments of Imatinib / many-to-many relation: ChemComp<->ChemFragments<->Fragment
In [3]: c.Fragments
Out[3]: [<Fragment(65)>, <Fragment(218)>, <Fragment(222)>, <Fragment(820)>, <Fragment(1349)>, <Fragment(2826)>, 
<Fragment(2924)>, <Fragment(4146)>, <Fragment(8836)>, <Fragment(11326)>, <Fragment(12626)>, <Fragment(17355)>, 
<Fragment(22836)>, <Fragment(24545)>, <Fragment(28823)>, <Fragment(33826)>]
 
# SMILES of terminal fragments
In [4]: for fragment in c.Fragments:
   ...:     if fragment.is_terminal:
   ...:         fragment.ism
   ...:
   ...:
Out[4]: '[NH4]'
Out[4]: 'c1ccncc1'
Out[4]: 'c1cncnc1'
Out[4]: 'Cc1ccc(cc1)N'
Out[4]: 'Cc1ccc(cc1)C=O'
Out[4]: 'C[NH+]1CC[NH2+]CC1'
 
# Further example that was used to create a figure in the CREDO paper:
In [5]: c.Fragments[5].ism
Out[5]: 'c1cc(cnc1)c2ccncn2'
 
In [6]: c.Fragments[5].ChemFragments
Out[6]:
[<ChemFragment(MPZ, 2826)>,
 <ChemFragment(MUH, 2826)>,
 <ChemFragment(NIL, 2826)>,
 <ChemFragment(PRC, 2826)>,
 <ChemFragment(RAJ, 2826)>,
 <ChemFragment(STI, 2826)>]
 
# Get chemical components directly
In [7]: for cf in c.Fragments[5].ChemFragments:
   ....:     cf.ChemComp
   ....:
   ....:
Out[7]: <ChemComp(MPZ)>
Out[7]: <ChemComp(MUH)>
Out[7]: <ChemComp(NIL)>
Out[7]: <ChemComp(PRC)>
Out[7]: <ChemComp(RAJ)>
Out[7]: <ChemComp(STI)>
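
The many-to-many relation ChemComp<->ChemFragments<->Fragment used above can be pictured as a table of (HET ID, fragment ID) pairs that is read in either direction. The sketch below rebuilds that idea in plain Python from the identifiers shown in this section. It only covers the pairs visible in the outputs above, and the variable names are illustrative rather than part of credoscript.

```python
from collections import defaultdict

# Fragment IDs listed for Imatinib (STI) in the output above
STI_FRAGMENT_IDS = [65, 218, 222, 820, 1349, 2826, 2924, 4146, 8836,
                    11326, 12626, 17355, 22836, 24545, 28823, 33826]

# Chemical components listed for fragment 2826 in the output above
FRAGMENT_2826_HET_IDS = ['MPZ', 'MUH', 'NIL', 'PRC', 'RAJ', 'STI']

# Association table: one (HET ID, fragment ID) pair per ChemFragment
chem_fragments = set(('STI', fid) for fid in STI_FRAGMENT_IDS)
chem_fragments |= set((het_id, 2826) for het_id in FRAGMENT_2826_HET_IDS)

# Index the association table in both directions
fragments_of = defaultdict(set)     # HET ID -> fragment IDs
chemcomps_with = defaultdict(set)   # fragment ID -> HET IDs
for het_id, fragment_id in chem_fragments:
    fragments_of[het_id].add(fragment_id)
    chemcomps_with[fragment_id].add(het_id)

# Fragments shared by Imatinib and Nilotinib, limited to the pairs known here
shared = fragments_of['STI'] & fragments_of['NIL']
```

Reading the table from the fragment side corresponds to c.Fragments[5].ChemFragments above, while reading it from the component side corresponds to c.Fragments.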