Multiple Mapping Method with Multiple Templates (M4T) is a fully automated comparative protein structure modeling server. The novelty of M4T resides in two of its major modules, Multiple Templates (MT) and Multiple Mapping Method (MMM). The MT module selects and optimally combines the sequences of multiple template structures through an iterative clustering approach that takes into account the ‘unique’ contribution of each template, its sequence similarity to other template sequences and to the target sequence, and the quality of its experimental resolution. The MMM module is a sequence-to-structure alignment method aimed at improving alignment accuracy, especially at lower sequence identity levels. The current implementation of MMM takes inputs from three profile-to-profile-based alignment methods and iteratively compares and ranks alternatively aligned regions according to their fit in the structural environment of the template structure. M4T was benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set, and its performance compared favorably with current state-of-the-art methods.
The target sequence is used as a query to search for homologous protein structures that could serve as templates by running three iterations of PSI-BLAST against the PDB, with an E-value cutoff of 0.0001. A hit is selected only if its sequence overlap with the target sequence covers more than 60% of the actual SCOP domain length or, when a SCOP classification is missing, more than 75% of the PDB chain length. After the PDB search, an iterative clustering procedure identifies the most suitable templates to combine, i.e. the smallest number of templates that can contribute the most to the model. Templates are selected or discarded according to a hierarchical selection procedure that accounts for sequence identity between templates and target sequence, sequence identity among templates, crystal resolution of the templates and the contribution of templates to the target sequence (i.e. whether a region is covered by several templates or by a single template only).
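The hit-selection thresholds described above can be illustrated with a minimal Python sketch. It is not the M4T implementation: the `Hit` record, its field names and the `passes_filter` function are hypothetical, and treating the E-value cutoff as inclusive is an assumption. Only the numerical thresholds (0.0001, 60% and 75%) are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text.
E_VALUE_CUTOFF = 1e-4
SCOP_COVERAGE = 0.60    # fraction of the SCOP domain length
CHAIN_COVERAGE = 0.75   # fraction of the PDB chain length (no SCOP entry)


@dataclass
class Hit:
    """Hypothetical record for one PSI-BLAST hit against the PDB."""
    template_id: str
    e_value: float
    overlap_length: int                      # residues overlapping the target
    chain_length: int                        # length of the PDB chain
    scop_domain_length: Optional[int] = None  # None if SCOP has no entry


def passes_filter(hit: Hit) -> bool:
    """Return True if the hit satisfies the E-value and coverage criteria."""
    # Assumption: hits exactly at the cutoff are accepted.
    if hit.e_value > E_VALUE_CUTOFF:
        return False
    if hit.scop_domain_length is not None:
        # SCOP classification available: compare with the domain length.
        return hit.overlap_length / hit.scop_domain_length > SCOP_COVERAGE
    # No SCOP classification: fall back to the full chain length.
    return hit.overlap_length / hit.chain_length > CHAIN_COVERAGE
```

Hits that pass such a filter would then enter the iterative clustering step, which this sketch does not attempt to reproduce.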
The result of the iterative clustering of templates is one or more groups of templates, each containing one or more template structures. Within each cluster, all templates are aligned to the corresponding target sequence using the iterative-MMM approach. In the final consolidation step, the sequence-to-structure alignments of overlapping clusters are combined. The overlapping parts are identified by first structurally superposing the templates and then calculating an LGA_S score (17) on that superposition. If this score is greater than 70%, the overlapping clusters are combined using their alignment to the (same) target sequence as reference. If clusters of templates do not overlap, or the overlap between them is not sufficient for a structurally accurate superposition (i.e. fewer than six residues), then the target sequence is split into independent, separate parts and individual models are built for each ‘modelable’ part of the target sequence.
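The consolidation criterion can be summarized as a small decision function. This is a sketch under stated assumptions rather than the server's code: the function name `can_combine` and its parameters are hypothetical, the LGA_S score is assumed to be given on a 0 to 100 scale, and the superposition and scoring themselves are assumed to have been computed elsewhere. The text does not say what happens to clusters with a sufficient overlap but an LGA_S score of 70% or lower, so the sketch only reports whether the stated condition for combining is met.

```python
# Thresholds quoted in the text.
MIN_OVERLAP_RESIDUES = 6   # fewer residues: no reliable superposition
LGA_S_THRESHOLD = 70.0     # LGA_S score in percent


def can_combine(overlap_residues: int, lga_s: float) -> bool:
    """Return True if two template clusters meet the stated criterion.

    overlap_residues: number of residues shared by the two clusters.
    lga_s: LGA_S score (0 to 100) of the superposed overlapping region.
    """
    if overlap_residues < MIN_OVERLAP_RESIDUES:
        # Overlap too short for a structurally accurate superposition;
        # the target would be split into separately modeled parts.
        return False
    return lga_s > LGA_S_THRESHOLD
```

For example, `can_combine(40, 85.0)` is true, whereas `can_combine(5, 95.0)` is false because the overlap is too short regardless of the score.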
Submitting a query
The M4T server has a straightforward interface. To use the server, the user must provide a target sequence, which can be entered in a text box or uploaded as a text file. The target sequence must be raw text containing one-letter amino acid codes (without any headers). Users may add a description of the sequence in the ‘Job Description’ field. If an e-mail address is provided, the user is notified by e-mail when the prediction is finished; the message includes a hyperlink to the results. M4T assigns a unique job identifier to each submitted query (e.g. DIR_cA8r0n). This job identifier can be used to check the status of the submission (i.e. in queue, running, finished) and to retrieve the results by typing it in the ‘Job ID’ field on the submission page.
M4T returns one or more full-atom models in PDB format, together with the alignment(s) used to build them. When the prediction process is finished, the server sends a notification by e-mail to the user (if an e-mail address was provided). Otherwise, users have to visit the submission page and access the results page using the job identifier. Results are kept on the server for 5 days only.
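Before submission, a sequence can be checked locally against the input format described above. The following Python sketch is a client-side convenience, not part of M4T: the function name is hypothetical, and restricting the input to the 20 standard amino acid codes, removing whitespace and converting to upper case are assumptions, since the text only requires raw one-letter codes without headers.

```python
# Assumption: only the 20 standard one-letter amino acid codes are accepted.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_target_sequence(text: str) -> str:
    """Return the sequence as a single upper-case string, or raise ValueError."""
    if text.lstrip().startswith(">"):
        # A FASTA header line is not raw sequence text.
        raise ValueError("header line found; provide the raw sequence only")
    # Remove spaces and line breaks, then normalize the case.
    sequence = "".join(text.split()).upper()
    if not sequence:
        raise ValueError("empty sequence")
    invalid = sorted(set(sequence) - STANDARD_AA)
    if invalid:
        raise ValueError("unexpected characters: " + ", ".join(invalid))
    return sequence
```

The returned string can then be pasted into the text box or saved to a text file for upload.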