Louis Encephalitis virus (SLE) (Figure 1B). Analysis of systematically selected WNV E protein sequences suggested that the PAAP motif was present in about 90% of the analyzed sequences while the frequency of the PSAP motif was less than 10% (Figure 1C). The YCYL motif was present in more than 95% of the WNV sequences analyzed.
Table 1 depicts the occurrences of the PXAP and YCYL motifs in the protein non-redundant (nr) database. As expected, sequence motifs that serve some biological function occur more often than by chance [39, 40], although it deserves mention that these motifs are maintained within the Flavivirus E proteins, which are themselves highly conserved. While sequence analysis revealed the predominance of the PAAP motif over PSAP, it is unclear what advantage the PSAP motif would confer in the case of WNV. From studies in HIV and of host proteins like
Hrs (Hepatocyte growth factor Receptor Substrate), it is well known that the PSAP motif is a strong binding partner of Tsg101 [41].

Figure 1 Sequence analysis of Flavivirus Envelope proteins. (A) Outline of WNV structural proteins C, PrM and E. (B) Presence of conserved 461PS/AAP464 and 349YCYL352 motifs in the Flavivirus envelope protein. Selected Flavivirus proteins were downloaded from NCBI [42], aligned with MAFFT [43] and the respective motif regions visualized in Jalview [44] using ClustalX-like coloring based on physicochemical properties and conservation. Virus names are shown on the left with NCBI GI numbers. (C) Frequency of YCYL and PAAP motif variants in WNV envelope. Significant protein hits (E<0.001) were first identified with Delta-BLAST [45], starting with the sequence of the envelope glycoprotein structure (PDB:2hg0), against NCBI’s non-redundant protein database restricted to West Nile virus sequences only. All hits were next aligned with MAFFT after discarding those without sequence information for the YCYL or PAAP region and removing 100% identical
sequences using Jalview. The resulting set of 286 WNV sequences was analyzed for the respective motif occurrences.

Table 1 Occurrences of the PXAP and YCYL motifs in the protein nr database

| Motif | Actual # occurrences | Actual frequency | Expected # occurrences* | Expected frequency* |
|-------|----------------------|------------------|-------------------------|---------------------|
| PXAP  | 2802870              | 3.05e-04         | 1867974                 | 2.03e-04            |
| YCYL  | 11945                | 1.30e-06         | 10851                   | 1.18e-06            |

26,682,258 protein sequences in the non-redundant (nr) protein database, downloaded from NCBI on 1st July 2013, were searched for the presence of the PXAP and YCYL motifs. The relative abundance of each of the relevant amino acids in the nr database was used to calculate the expected occurrences of the motifs by chance.
*The expected occurrences and frequency were based on the relative abundance of each of the 20 amino acid residues in the nr database.
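
The expected-by-chance calculation described for Table 1 can be illustrated with a minimal Python sketch. It is not the authors' script: the function names, the toy sequences, the counting of overlapping matches and the normalization by the number of four-residue windows are all assumptions made for illustration. The sketch treats residues as independent, so the expected per-position frequency of a motif is the product of the relative abundances of its fixed residues, with X matching any residue.

```python
import re
from collections import Counter

def count_motif(sequences, pattern):
    """Count matches of a regex motif across sequences, including overlaps."""
    # A lookahead lets matches overlap (assumption: overlaps are counted).
    regex = re.compile(f"(?=({pattern}))")
    return sum(len(regex.findall(seq)) for seq in sequences)

def residue_frequencies(sequences):
    """Relative abundance of each residue over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def expected_motif_frequency(motif, freqs):
    """Chance frequency of a motif per window; 'X' matches any residue."""
    p = 1.0
    for aa in motif:
        if aa != "X":
            p *= freqs.get(aa, 0.0)
    return p

def n_windows(sequences, k):
    """Number of length-k windows available across all sequences."""
    return sum(max(len(s) - k + 1, 0) for s in sequences)

# Hypothetical toy data standing in for the nr database.
toy_sequences = ["PSAPYCYL", "PAAPAA"]

freqs = residue_frequencies(toy_sequences)
windows = n_windows(toy_sequences, 4)

observed_pxap = count_motif(toy_sequences, "P.AP")
expected_freq_pxap = expected_motif_frequency("PXAP", freqs)
expected_pxap = expected_freq_pxap * windows

print("observed PXAP:", observed_pxap)
print("observed frequency:", observed_pxap / windows)
print("expected frequency:", expected_freq_pxap)
print("expected occurrences:", expected_pxap)
```

On a database-scale input the sequences would be streamed from a FASTA file rather than held in a list. The independence assumption ignores local composition bias, so the expected values are a rough baseline only.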