    # Fetch all structures by UniProt accession
    In [1]: structures = StructureAdaptor().fetchAllByUniProt('P53779')

    In [2]: structures
    Out[2]:
    [<Structure('1JNK')>, <Structure('1PMN')>, <Structure('1PMQ')>, <Structure('1PMU')>,
     <Structure('1PMV')>, <Structure('2B1P')>, <Structure('2EXC')>, <Structure('2O0U')>,
     <Structure('2O2U')>, <Structure('2OK1')>, <Structure('2P33')>, <Structure('2R9S')>,
     <Structure('2WAJ')>, <Structure('2ZDT')>, <Structure('2ZDU')>, <Structure('3CGF')>,
     <Structure('3CGO')>, <Structure('3DA6')>, <Structure('3FI2')>, <Structure('3FI3')>,
     <Structure('3FV8')>, <Structure('3G90')>, <Structure('3G9L')>, <Structure('3G9N')>]

    # Fetch all chemical components by tanimoto similarity (requires MyChem)
    In [1]: j07 = s.Ligands[0].LigandComponents[0].ChemComp

    In [2]: chemcomps = ChemCompAdaptor().fetchAllBySimilarity(j07.can, tanimoto=0.5)

    In [3]: chemcomps
    Out[3]:
    [('J07', 1.0),
     ('JNO', 0.56983240223463705),
     ('JNF', 0.53216374269005895),
     ('5BP', 0.50289017341040498),
     ('AA2', 0.502857142857143)]
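The similarity search above ranks chemical components by their Tanimoto coefficient against a query structure. The following is a minimal sketch of what that score means, assuming fingerprints are represented as sets of on-bit indices. The fingerprint type used by MyChem is not shown in this document, and `tanimoto`, `filter_by_similarity` and the toy fingerprints are illustrative names rather than part of credoscript.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(fp_a & fp_b) / union


def filter_by_similarity(query_fp, library, threshold=0.5):
    """Return (name, score) pairs with score >= threshold, best hit first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints (hypothetical bit indices, not real CREDO data).
library = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 9},
    "C": {7, 8, 9},
}
hits = filter_by_similarity({1, 2, 3, 4}, library, threshold=0.5)
```

As in the session above, the query itself scores 1.0 and anything below the threshold is dropped from the result list.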
    In [1]: from credoscript import *

    In [1]: s = StructureAdaptor().fetchByPDB('3CS9')

    In [2]: s
    Out[2]: <Structure('3CS9')>

    In [3]: s.Ligands
    Out[3]: [Ligand(600, NIL, A), Ligand(600, NIL, B), Ligand(600, NIL, C), Ligand(600, NIL, D)]

    # First ligand in list
    In [4]: s.Ligands[0]
    Out[4]: Ligand(600, NIL, A)

    In [5]: s.Ligands[0].LigandComponents
    Out[5]: [<LigandComponent(600, NIL, A)>]

    # First component of first ligand (heteropeptides can have more than one component)
    In [6]: s.Ligands[0].LigandComponents[0]
    Out[6]: <LigandComponent(600, NIL, A)>

    # Chemical component object / conformer-independent information
    In [7]: s.Ligands[0].LigandComponents[0].ChemComp
    Out[7]: <ChemComp(NIL)>

    In [8]: c = s.Ligands[0].LigandComponents[0].ChemComp

    # Isomeric smiles (Generated with OEChem)
    In [9]: c.ism
    Out[9]: 'Cc1ccc(cc1Nc2nccc(n2)c3cccnc3)C(=O)Nc4cc(cc(c4)n5cc(nc5)C)C(F)(F)F'

    # SDF file
    In [10]: print c.getMolBlock()
    NIL
    -OEChem-07250903003D

    61 65 0 0 0 0 0 0 0999 V2000
    1.4221 0.1035 -0.0988 O 0 0 0 0 0 0 0 0 0 0 0 0
    [...]

    # Cross references to other databases
    In [11]: c.XRefs
    Out[11]:
    [<XRef(ChemComp,DrugBank Compound,DB04868)>,
     <XRef(ChemComp,KEGG DRUG,D08953)>,
     <XRef(ChemComp,StARlite compound,426660)>]
USR (Ultrafast Shape Recognition) moments are calculated for all ligands in CREDO and for up to 200 unbound, low-energy conformers of each.
In this approach, a bound ligand is used as a template against all other bound ligands in CREDO.
The function returns a list of tuples in the form (LIGAND ID, HET ID, SIMILARITY).
    s = StructureAdaptor().fetchByPDB('1OPJ')
    s.Ligands
    [Ligand(1, MYR, A), Ligand(3, STI, A), Ligand(5, CL, A),
     Ligand(2, MYR, B), Ligand(4, STI, B), Ligand(6, CL, B)]

    sti = s.Ligands[1]

    # Default is active against active, Top 10 hits
    sti.USR()
    [(251127L, 'STI', 0.96517336947999999),
     (147408L, 'STI', 0.96494049005000004),
     (147409L, 'STI', 0.95693780307999998),
     (147410L, 'STI', 0.94801710280999996),
     (147411L, 'STI', 0.94794215309999996),
     (60177L, 'STI', 0.93764648486000002),
     (168962L, 'STI', 0.93720707308999995),
     (30467L, 'STI', 0.92843326611999999),
     (30470L, 'STI', 0.91157703434000004),
     (163155L, 'STI', 0.90518220483)]

    # Ignore all hits with the same chemical compound (STI)
    sti.USR(exclude_self=True)
    [(49806L, 'NAD', 0.88915234453000003),
     (184633L, 'TH3', 0.88836243273000004),
     (49807L, 'NAD', 0.88645934419000005),
     (49808L, 'NAD', 0.88170462764000002),
     (49809L, 'NAD', 0.88118661818999999),
     (62424L, 'LI1', 0.87931414069000002),
     (110258L, 'NAD', 0.87931408678),
     (105227L, 'NAD', 0.87822015420999999),
     (110256L, 'NAD', 0.87476306328999998),
     (48822L, 'NAD', 0.87457187697000005)]
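To make the similarity values above easier to interpret, here is a minimal sketch of how a USR descriptor and score can be computed from 3D atom coordinates. It follows one common formulation of the method (12 values: mean, standard deviation and cube root of the third central moment of the atom distances to four reference points) and is not taken from the CREDO code base. The exact moment definitions used in CREDO may differ, and `usr_descriptor`, `usr_similarity` and the toy coordinates are illustrative names only.

```python
import math


def _moments(dists):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(dists)
    mean = sum(dists) / n
    var = sum((d - mean) ** 2 for d in dists) / n
    third = sum((d - mean) ** 3 for d in dists) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), skew]


def usr_descriptor(coords):
    """12-value shape descriptor from a list of (x, y, z) atom coordinates."""
    n = len(coords)
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))  # centroid
    d_ctd = [math.dist(c, ctd) for c in coords]
    cst = coords[d_ctd.index(min(d_ctd))]  # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]  # atom farthest from the centroid
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]  # atom farthest from fct
    d_cst = [math.dist(c, cst) for c in coords]
    d_ftf = [math.dist(c, ftf) for c in coords]
    descriptor = []
    for dists in (d_ctd, d_cst, d_fct, d_ftf):
        descriptor.extend(_moments(dists))
    return descriptor


def usr_similarity(desc_a, desc_b):
    """Score in (0, 1]; 1.0 means identical descriptors."""
    diff = sum(abs(a - b) for a, b in zip(desc_a, desc_b))
    return 1.0 / (1.0 + diff / len(desc_a))


# Toy coordinates (hypothetical, not taken from a PDB entry).
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0), (3.0, 1.2, 0.8), (4.1, 0.3, 1.1)]
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in mol]
other = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]

same = usr_similarity(usr_descriptor(mol), usr_descriptor(shifted))
different = usr_similarity(usr_descriptor(mol), usr_descriptor(other))
```

Because the descriptor is built only from interatomic distances, it does not depend on how the molecule is positioned or oriented, which is why no superposition step is needed before comparing ligands.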
In this approach, an active ligand is used as a template against all unbound conformers of all ligands in the PDB chemical component dictionary. The function returns a list of tuples in the form (HET ID, CONF_ID, SIMILARITY).
    sti.USR(conformers=True)
    [('OKA', 36L, 0.91414645592999999),
     ('337', 200L, 0.91136936419000003),
     ('OKA', 126L, 0.91074688378000002),
     ('337', 152L, 0.91060855332000001),
     ('337', 144L, 0.90860909284000002),
     ('337', 36L, 0.90723517072000004),
     ('337', 68L, 0.90675530338999999),
     ('337', 107L, 0.90422721311999998),
     ('OKA', 32L, 0.90375055872999999),
     ('DRY', 123L, 0.90361441895000005)]
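Several conformers of the same chemical component often appear in one hit list, as with OKA and 337 above. A small sketch of how the returned (HET ID, CONF_ID, SIMILARITY) tuples could be reduced to the best-scoring conformer per component with plain Python follows. The helper name `best_per_het_id` is hypothetical, and the hit list is copied from the output above with the Python 2 `L` suffixes dropped.

```python
hits = [
    ("OKA", 36, 0.91414645592999999),
    ("337", 200, 0.91136936419000003),
    ("OKA", 126, 0.91074688378000002),
    ("337", 152, 0.91060855332000001),
    ("337", 144, 0.90860909284000002),
    ("337", 36, 0.90723517072000004),
    ("337", 68, 0.90675530338999999),
    ("337", 107, 0.90422721311999998),
    ("OKA", 32, 0.90375055872999999),
    ("DRY", 123, 0.90361441895000005),
]


def best_per_het_id(hits):
    """Keep only the highest-scoring conformer for each HET ID, best hit first."""
    best = {}
    for het_id, conf_id, score in hits:
        if het_id not in best or score > best[het_id][1]:
            best[het_id] = (conf_id, score)
    rows = [(het_id, conf_id, score) for het_id, (conf_id, score) in best.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


top = best_per_het_id(hits)
```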
Chemical components from the PDB are fragmented with the RECAP algorithm. A fragment in CREDO is simply a product of a fragmentation reaction; the unfragmented input molecule is also stored as a fragment and is referred to as the root fragment. Unlike fragments as defined in fragment-based drug design, fragments in CREDO can be of arbitrary size (depending on the size of the starting molecule). However, all fragments in CREDO have physicochemical properties associated with them for easy filtering.
    In [1]: from credoscript import *

    In [2]: c = ChemCompAdaptor().fetchByHetID('STI') # Imatinib

    # Fragments of Imatinib / many-to-many relation: ChemComp<->ChemFragments<->Fragment
    In [3]: c.Fragments
    Out[3]:
    [<Fragment(65)>, <Fragment(218)>, <Fragment(222)>, <Fragment(820)>,
     <Fragment(1349)>, <Fragment(2826)>, <Fragment(2924)>, <Fragment(4146)>,
     <Fragment(8836)>, <Fragment(11326)>, <Fragment(12626)>, <Fragment(17355)>,
     <Fragment(22836)>, <Fragment(24545)>, <Fragment(28823)>, <Fragment(33826)>]

    # SMILES of terminal fragments
    In [4]: for fragment in c.Fragments:
       ...:     if fragment.is_terminal:
       ...:         fragment.ism
       ...:
       ...:
    Out[4]: '[NH4]'
    Out[4]: 'c1ccncc1'
    Out[4]: 'c1cncnc1'
    Out[4]: 'Cc1ccc(cc1)N'
    Out[4]: 'Cc1ccc(cc1)C=O'
    Out[4]: 'C[NH+]1CC[NH2+]CC1'

    # Further example that was used to create a figure in the CREDO paper:
    In [5]: c.Fragments[5].ism
    Out[5]: 'c1cc(cnc1)c2ccncn2'

    In [6]: c.Fragments[5].ChemFragments
    Out[6]:
    [<ChemFragment(MPZ, 2826)>,
     <ChemFragment(MUH, 2826)>,
     <ChemFragment(NIL, 2826)>,
     <ChemFragment(PRC, 2826)>,
     <ChemFragment(RAJ, 2826)>,
     <ChemFragment(STI, 2826)>]

    # Get Chemical components directly
    In [7]: for cf in c.Fragments[5].ChemFragments:
       ....:     cf.ChemComp
       ....:
       ....:
    Out[7]: <ChemComp(MPZ)>
    Out[7]: <ChemComp(MUH)>
    Out[7]: <ChemComp(NIL)>
    Out[7]: <ChemComp(PRC)>
    Out[7]: <ChemComp(RAJ)>
    Out[7]: <ChemComp(STI)>
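The many-to-many relation ChemComp<->ChemFragments<->Fragment used in this session can be pictured as a table of (HET ID, fragment ID) association rows that is traversed in either direction. The sketch below models it with plain Python dictionaries. It is only an illustration of the relation, not how credoscript stores or queries the data, and the rows are a subset of the identifiers shown in the session.

```python
from collections import defaultdict

# (HET ID, fragment ID) association rows; a subset of the output shown above.
chem_fragments = [
    ("STI", 65), ("STI", 218), ("STI", 2826),
    ("MPZ", 2826), ("MUH", 2826), ("NIL", 2826), ("PRC", 2826), ("RAJ", 2826),
]

fragments_of = defaultdict(list)    # HET ID -> fragment IDs
chemcomps_with = defaultdict(list)  # fragment ID -> HET IDs

for het_id, fragment_id in chem_fragments:
    fragments_of[het_id].append(fragment_id)
    chemcomps_with[fragment_id].append(het_id)

# Chemical components that share fragment 2826 with Imatinib (STI).
sharing = sorted(chemcomps_with[2826])
```

Looking up `chemcomps_with[2826]` corresponds to walking from a fragment through its ChemFragments to the chemical components, as the last two steps of the session do.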