Modeller - quick start, filling loops in the DUDE protein receptor structures
The aim here was getting simulations of the DUDE receptor proteins up and running. The main challenge is a large number of missing sidechains and loops in these structures - this is not a problem for docking codes perhaps, but certainly is for MD simulation codes.
Modeller from the Sali lab seemed like the tool for the job, so let’s grab that
wget https://salilab.org/modeller/9.16/modeller_9.16-1_amd64.deb
sudo env KEY_MODELLER=MODELIRANJE dpkg -i modeller_9.16-1_amd64.deb
The only useful reference I was able to find on filling loops and missing sidechains was:
https://salilab.org/modeller/wiki/Missing%20residues
However, it assumes the aligment.ali file is constructed by hand. Maybe that’s pleasurable to some but when I have ~20 receptor structures to complete, there is no time for this. Let’s automate how alignment.ali file is generated:
# alignment.py file
from MDAnalysis import Universe
from MDAnalysis.lib.util import convert_aa_code
import textwrap, glob
u = Universe("receptor.pdb")
ca = u.select_atoms("protein and name CA")
residues = [(at.resid, at.residue.name) for at in ca]
sequence = []
for i, (resid, resname) in enumerate(residues):
if i:
prev_resid = residues[i-1][0]
gaps = ["-"]*(resid - prev_resid - 1)
sequence.append("".join(gaps))
sequence.append(convert_aa_code(resname))
seq_gap = "".join(sequence)
f = glob.glob("*.fasta.txt")[0]
fasta = "".join([l.strip() for l in open(f).readlines() if not l.startswith(">")])
seq_filled = fasta[residues[0][0]+2: residues[-1][0]+residues[0][0]+1]
print """
>P1;receptor
structureX:receptor: {0} : :+{1} : :::-1.00:-1.00
{2}*
>P1;receptor_fill
sequence:::::::::
{3}*
""".format(residues[0][0], len(residues), "\n".join(textwrap.wrap(seq_gap, 73)), "\n".join(textwrap.wrap(seq_filled, 73)))
# modeller.py
from modeller import *
from modeller.automodel import * # Load the automodel class
log.verbose()
env = environ()
# directories for input atom files
env.io.atom_files_directory = ['.', '../atom_files']
a = loopmodel(env, alnfile = 'alignment.ali',
knowns = 'receptor', sequence = 'receptor_fill')
a.starting_model= 1
a.ending_model = 1
a.loop.starting_model = 1
a.loop.ending_model = 2
a.loop.md_level = refine.fast
a.make()
Lastly, that’s how it all fits together
python alignment.py > alignment.ali
python modeller.py