Outer membrane proteins (OMPs) are the transmembrane proteins found in the outer membranes of Gram-negative bacteria, mitochondria and plastids. Most prediction methods have focused on analogous features, such as alternating hydrophobicity patterns. In contrast, we identify proteins as OMPs by detecting their homologous relationships to known OMPs through sequence similarity. Given an input sequence, HHomp builds a profile hidden Markov model (HMM) and compares it with an OMP database by pairwise HMM comparison, integrating OMP predictions by PROFtmb. A crucial ingredient is the OMP database, which contains profile HMMs for over 20,000 putative OMP sequences. These were collected with the exhaustive, transitive homology detection method HHsenser, starting from 23 representative OMPs in the Protein Data Bank (PDB). In a benchmark on TransportDB, HHomp detects 63.5% of the true positives before including the first false positive. This is 70% more than PROFtmb, four times more than BOMP and ten times more than TMB-Hunt.
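The benchmark figure quoted above, the fraction of true positives found before the first false positive, can be illustrated with a minimal sketch. The function name `sensitivity_before_first_fp` and the `(score, is_true_omp)` input format are assumptions made for illustration; they are not part of HHomp or of the benchmark scripts used in this work.

```python
from typing import Sequence, Tuple


def sensitivity_before_first_fp(ranked: Sequence[Tuple[float, bool]]) -> float:
    """Fraction of all true positives ranked above the first false positive.

    `ranked` holds hypothetical (score, is_true_omp) pairs, where a higher
    score means a more confident OMP prediction. Ties between a positive and
    a negative are resolved by input order, because sorted() is stable.
    """
    ordered = sorted(ranked, key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, is_positive in ordered if is_positive)
    if total_positives == 0:
        return 0.0

    found = 0
    for _, is_positive in ordered:
        if not is_positive:
            break  # stop at the first false positive
        found += 1
    return found / total_positives


if __name__ == "__main__":
    # Toy data: two true OMPs rank above the first non-OMP, two rank below it.
    hits = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, True)]
    print(sensitivity_before_first_fp(hits))
```

A method that ranks all true OMPs above every non-OMP would score 1.0 under this measure, so the measure rewards clean separation at the top of the ranked list.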
HHomp is based on a database of precomputed profile HMMs of putative OMPs. The underlying sequences were identified by the exhaustive, transitive sequence search method HHsenser. To obtain representative bona fide OMPs as starting points for the transitive searches, the sequences of all bacterial OMPs in the Protein Data Bank were filtered to a maximum of 25% pairwise sequence identity. For each of the resulting 23 proteins, we performed an HHsenser run, searching through all bacterial sequences in the NCBI non-redundant (nr) database and all sequences in the NCBI environmental database (both filtered to 90% maximum pairwise sequence identity). We pooled the largely overlapping results from these 23 searches into a non-redundant database of over 20,000 proteins. We did not use any annotation, such as that contained in SwissProt or TransportDB, for filtering or complementing this sequence set.
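The pooling and identity-filtering steps described above can be sketched as follows. This is a simplified illustration, not the procedure used to build the HHomp database: the helper names `pool_hits`, `pairwise_identity` and `filter_by_identity` are hypothetical, sequences are assumed to be pre-aligned to equal length with `-` as the gap character, and the greedy keep-first strategy is only one common way to enforce a maximum pairwise identity.

```python
from typing import Dict, Iterable, List


def pool_hits(searches: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Merge overlapping search results, keeping each sequence ID once."""
    pooled: Dict[str, str] = {}
    for hits in searches:
        for seq_id, sequence in hits.items():
            pooled.setdefault(seq_id, sequence)
    return pooled


def pairwise_identity(a: str, b: str) -> float:
    """Identity over columns where both pre-aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def filter_by_identity(seqs: Dict[str, str], max_identity: float) -> List[str]:
    """Greedily keep a sequence only if it is at most `max_identity`
    identical to every sequence kept so far (assumed strategy)."""
    kept: List[str] = []
    for seq_id, sequence in seqs.items():
        if all(pairwise_identity(sequence, seqs[k]) <= max_identity for k in kept):
            kept.append(seq_id)
    return kept


if __name__ == "__main__":
    # Toy aligned sequences; "s2" appears in both searches and is pooled once.
    search_a = {"s1": "ACDEFGHIKL", "s2": "ACDEFGHIKV"}
    search_b = {"s2": "ACDEFGHIKV", "s3": "WYWYWYWYWY"}
    pooled = pool_hits([search_a, search_b])
    print(filter_by_identity(pooled, max_identity=0.25))
```

In practice, sequence identity would be computed from alignments produced by a dedicated tool, and the outcome of a greedy filter depends on the order in which sequences are visited.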
Given a sequence as input, HHomp builds a profile HMM by searching for homologous sequences with buildali.pl from the HHsearch package, using default parameters. We predict the secondary structure and the β-barrel strands using PSIPRED and PROFtmb in the same way as for the database HMMs.
The HHomp webserver consists of an OMP prediction and searching interface and a browsing interface for the underlying OMP database. With the prediction interface, the user can search the database with a query protein sequence. Various parameters for alignment building and searching can be modified. After a few minutes, the server returns a graphical overview of the matched regions, a detailed and annotated list of matched OMP database HMMs and the corresponding alignments. An alternative view of the alignments can be selected, in which the columns of the aligned profile HMMs are represented as coloured histograms. The detected OMP HMMs are linked to the browsing interface with detailed description pages for the OMP clusters, containing alignments and 3D models for the TM domain if available. HHomp is integrated into the MPI Bioinformatics Toolkit, which provides a user-friendly framework for job control and offers many other tools for sequence analysis. In particular, a 3D model can be built with the MODELLER software based on the HHomp results.