Let $m$ be a BPA on a frame of discernment $\Theta$. The pignistic probability transformation (PPT) function $\mathrm{BetP}_m : \Theta \to [0,1]$ corresponding to $m$ is

$$\mathrm{BetP}_m(x) = \sum_{A \subseteq \Theta,\; x \in A} \frac{1}{|A|}\,\frac{m(A)}{1 - m(\emptyset)}, \qquad m(\emptyset) \neq 1, \tag{13}$$

where $|A|$ is the cardinality of proposition $A$.

By using the PPT function, the BPA $m_r$ can be translated into a probability distribution $p_r$. The class of the residue $r$ is then the one with the maximum value in the probability distribution $p_r$. Finally, the topology of a transmembrane protein can be determined once the classes of all residues in the protein sequence have been determined. For each protein, the transmembrane orientation is determined by the location of the first residue, and each transmembrane region whose length exceeds a threshold consists of the residues labelled as class “M.”
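The PPT step and the subsequent residue labelling can be illustrated with a minimal sketch. It assumes a BPA is stored as a dictionary from focal sets to masses; the class names `I` and `O`, the example masses, and the helper names `pignistic`, `classify`, and `membrane_regions` are hypothetical and chosen only for illustration, and the length threshold is left as a parameter because its value is not stated here.

```python
from typing import Dict, FrozenSet, List, Tuple


def pignistic(m: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Pignistic transformation of a BPA given as {focal set: mass}, following Eq. (13)."""
    empty_mass = m.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("BetP is undefined when m(empty set) = 1")
    bet: Dict[str, float] = {}
    for focal, mass in m.items():
        if not focal:
            continue  # the empty set only enters through the normalisation term
        # Each element of a focal set A receives an equal share of m(A).
        share = mass / (len(focal) * (1.0 - empty_mass))
        for x in focal:
            bet[x] = bet.get(x, 0.0) + share
    return bet


def classify(m: Dict[FrozenSet[str], float]) -> str:
    """Return the class with the largest pignistic probability."""
    p = pignistic(m)
    return max(p, key=p.get)


def membrane_regions(labels: str, threshold: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) runs of 'M' whose length exceeds threshold."""
    regions: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels + "#"):  # sentinel closes a trailing run
        if label == "M":
            if start is None:
                start = i
        elif start is not None:
            if i - start > threshold:
                regions.append((start, i))
            start = None
    return regions


# Hypothetical BPA for one residue over the assumed classes I, M, O.
m_r = {frozenset("M"): 0.5, frozenset("IM"): 0.25, frozenset("IMO"): 0.25}
p_r = pignistic(m_r)
residue_class = classify(m_r)
```

In this sketch a mass assigned to a multi-element proposition such as {I, M} is split evenly between its elements, which is what the factor $1/|A|$ in Eq. (13) expresses.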
According to the topology, all transmembrane helices and the orientation of each transmembrane helix can be derived.

4. Experimental Verification

In this paper, a data set of 125 transmembrane protein sequences with known topology is collected from the MPtopo data set [40] to verify the effectiveness of the proposed method, TOPPER.

To reflect the performance of the combination predictor faithfully and to avoid overfitting, the experiment is performed using tenfold cross-validation. Each fold contains roughly 12-13 transmembrane proteins, and their homology has been reduced to below 30% by using the cd-hit program [41]. To assess how well different algorithms predict transmembrane regions (i.e., transmembrane helices without considering orientation), an evaluation method developed by Tusnády and Simon [11] is adopted in this paper.
For a transmembrane region, the prediction is considered successful when the overlap between the predicted and observed transmembrane regions contains at least 9 amino acids. The total numbers of predicted and observed transmembrane regions are denoted by $N_{\mathrm{prd}}$ and $N_{\mathrm{obs}}$, respectively, and the number of overlapping (correctly predicted) transmembrane regions is denoted by $N_{\mathrm{cor}}$. The efficiency of transmembrane region prediction is measured by $M = N_{\mathrm{cor}}/N_{\mathrm{obs}}$ and $C = N_{\mathrm{cor}}/N_{\mathrm{prd}}$. The overall prediction power is defined by

$$Q = \sqrt{M \cdot C} \times 100\%. \tag{14}$$

Besides, if all transmembrane regions and the orientation of a transmembrane protein sequence have been predicted correctly, the topology of the transmembrane protein is said to be predicted correctly.

In the rest of this section, various prediction algorithms are compared at three levels: the residue level, the transmembrane region level, and the topology level.

At the residue level, the confusion matrix of residue prediction for each algorithm is shown in Table 1.
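As an illustration of the region-level measures M, C, and Q introduced earlier in this section, the following minimal sketch counts correctly predicted regions and derives the three scores. It rests on several assumptions that are not stated in the text: regions are half-open `(start, end)` index pairs, each predicted region may be matched to at most one observed region (a greedy one-to-one matching), and Q is taken as the geometric mean of M and C expressed as a percentage. The function names `overlap` and `region_scores` are hypothetical.

```python
import math
from typing import List, Tuple

Region = Tuple[int, int]  # half-open (start, end) residue indices


def overlap(a: Region, b: Region) -> int:
    """Number of residues shared by two regions."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def region_scores(predicted: List[Region], observed: List[Region],
                  min_overlap: int = 9) -> Tuple[float, float, float]:
    """Return (M, C, Q) for one set of predicted and observed regions."""
    used = set()  # indices of predicted regions already matched
    n_cor = 0
    for obs in observed:
        for j, prd in enumerate(predicted):
            if j not in used and overlap(obs, prd) >= min_overlap:
                used.add(j)
                n_cor += 1
                break
    m = n_cor / len(observed) if observed else 0.0   # M = Ncor / Nobs
    c = n_cor / len(predicted) if predicted else 0.0  # C = Ncor / Nprd
    q = math.sqrt(m * c) * 100.0  # assumed form of Q, in percent
    return m, c, q


# Hypothetical regions: only the first pair overlaps by at least 9 residues.
predicted = [(10, 30), (50, 70), (100, 120)]
observed = [(12, 32), (55, 60), (200, 220)]
M, C, Q = region_scores(predicted, observed)
```

In practice the counts would be accumulated over all proteins in the test folds before M, C, and Q are computed; the per-call version above is only meant to make the definitions concrete.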