

Sequence-derived hydropathy

To introduce information about hydrophobicity into the mapping procedure, we simply input four-dimensional vectors as follows:

\begin{displaymath}
v_i = \left( \begin{array}{c}
x_i \\
y_i \\
z_i \\
f_{\overline{h}}\,\overline{h}_i
\end{array} \right)
\end{displaymath} (15)

where $x_i$, $y_i$ and $z_i$ are the carbon-$\alpha $ coordinates of residue $i$, and $\overline{h}_i$ is the mean hydropathy calculated from position $i$ in a multiple sequence alignment, using the scale of Kyte and Doolittle [Kyte and Doolittle, 1982] given in Table 4.3. $f_{\overline{h}}$ is a weight factor, described below. The mapped hydrophobicity of residue $i$ is defined as the value of the hydrophobicity component of the map vector $r_w$ closest to the input vector $v_i$ (see Equation 5.1).
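The construction of $v_i$ and the lookup of the mapped hydrophobicity can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function names are hypothetical, gaps and unknown residues are assumed to be skipped when averaging a column, and the closest map vector is assumed to be found with the Euclidean distance. The scale values are the standard Kyte and Doolittle hydropathy values.

```python
import math

# Kyte-Doolittle hydropathy values (standard published scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(column):
    """Mean hydropathy of one alignment column (a string of residues).

    Assumption: gaps and unknown residue codes are skipped.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in column.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("column contains no scorable residues")
    return sum(values) / len(values)


def input_vector(ca_xyz, column, f_h):
    """Build v_i = (x_i, y_i, z_i, f_h * mean hydropathy of column i)."""
    x, y, z = ca_xyz
    return (x, y, z, f_h * mean_hydropathy(column))


def mapped_hydropathy(v, map_vectors):
    """Hydropathy component of the map vector closest to v.

    Assumption: 'closest' means smallest Euclidean distance in all four
    components. The returned value still carries the weight f_h; divide
    by f_h to compare it with unweighted hydropathies.
    """
    r_w = min(map_vectors, key=lambda r: math.dist(v, r))
    return r_w[3]
```

For example, `input_vector((1.0, 2.0, 3.0), "IIV-", 3.0)` averages the three scorable residues in the column and multiplies the result by the weight before appending it to the coordinates.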

The vector $v_i$ has mixed units: ångströms for the coordinates and usually kcal/mol for the hydropathy (the free energy of transfer from an apolar solvent to a polar solvent, for example from octanol to water). This immediately raises the problem of what weight to apply to the hydropathy component so that the two types of information are correctly balanced. We were not in favour of normalising each vector component by its mean and standard deviation, because this would rescale the $x$, $y$ and $z$ axes independently and so distort the three-dimensional coordinates. Instead, we tested different weight factors, $f_{\overline{h}}$, for the hydropathy component empirically, as described below.
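The role of the weight can be made explicit by writing out the squared distance between two input vectors. This assumes that distances are measured with the Euclidean metric, which is not stated in this section:

\begin{displaymath}
\left\vert v_i - v_j \right\vert^2 = (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 + f_{\overline{h}}^2 \left( \overline{h}_i - \overline{h}_j \right)^2
\end{displaymath}

Under this assumption, a difference of one hydropathy unit counts as much as a spatial displacement of $f_{\overline{h}}$ Å, while the three spatial terms are left unscaled, so the geometry of the structure is not distorted.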

With $f_{\overline{h}} = 1$ the mapping, shown in Figure 5.8, is broadly similar to that of structure alone: structural constraints dominate, and the mapping of hydropathy preserves only its low-resolution features (see Figure 5.8(c)) and loses the marked alternation in the $\beta $-sheet. A larger hydropathy weighting ($f_{\overline{h}} = 3$) creates a better balance between the two types of information. The result, shown in Figure 5.9, is a mapping which preserves the structural and hydrophobic relationships of amphipathic sheet residues, whilst smoothing the hydrophobicity of loop residues. The local relationships of residues which are close both in space and in hydrophobicity are retained by the mapping at the expense of hydrophobic residues that are less well correlated spatially (in loop regions, for example). As a result, more space in the map is devoted to the former, and the map reference vectors more closely resemble the input vectors for residues with `important' or spatially correlated hydrophobicity. Conversely, an isolated hydrophobic residue will map to the same map vector as its less hydrophobic neighbours.

Figure 5.8: Mapping of structure and mean hydropathy for domain 1sxaA0 with weighting factor $f_{\overline{h}} = 1$. (a) Rasmol cartoon view of 1sxaA0. (b) Map trajectory from the program visTraj. The map vector hydropathy component is shown for each unit by the shading of the circles: white=hydrophilic, black=hydrophobic. (c) Mapped hydropathy shown against structure (compare with the raw hydrophobicities in Figure 5.7). The mapped value for residue $i$ is obtained from the hydropathy component of the closest map vector $r_w$ (see Section 5.2) to input vector $v_i$. (d) Distance maps derived from coordinates (top) and mapping (bottom); correlation coefficient = 0.75. From the map in (b) it can be seen that structural relationships are preserved whilst local ordering is not retained in the hydropathy information. This results in the smoothed hydrophobic `core' seen in (c).

Figure 5.9: Mapping of structure and mean hydropathy for domain 1sxaA0 with weighting factor $f_{\overline{h}} = 3$. (a) Mapping. (b) Mapped hydropathy. (c) Distance matrices (for full explanation, see Figure 5.8(b), (c) and (d) respectively). With a higher weighting of hydropathy relative to structural information, the mapping attempts to preserve local relationships in both properties simultaneously. The result is a less interpretable map (a); however, the mapped hydrophobicities displayed on the structure in (b) now exhibit the alternation observed in the raw hydropathy information (Figure 5.7), and these patterns of hydropathy are largely confined to structurally important $\beta $-sheet residues. Because of the increase in non-structural information presented to the mapping algorithm, the correlation coefficient between intra-map and intra-structure distances has dropped to 0.63.
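The correlation coefficients quoted in the captions of Figures 5.8 and 5.9 compare distances within the structure with distances within the map. A minimal sketch of such a comparison is given below. It rests on two assumptions that this section does not confirm: that the coefficient is a Pearson correlation over all residue pairs, and that each residue is represented in the map by a single point (for instance the position of its closest map unit). All function names are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distances for all pairs i < j, in a fixed order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_a * var_b)


def distance_map_correlation(structure_points, map_points):
    """Correlate intra-structure distances with intra-map distances.

    Both lists must be in the same residue order; the points may have
    different dimensionality (e.g. 3-D coordinates vs. 2-D map positions).
    """
    return pearson(
        pairwise_distances(structure_points),
        pairwise_distances(map_points),
    )
```

A map that reproduces the structural distances up to a uniform scale gives a coefficient of 1; values such as 0.75 or 0.63 indicate that some structural relationships have been traded for other information.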

Other values of $f_{\overline{h}}$ were tested, and the simplest way to visualise the results was as a GIF (Graphics Interchange Format) animation. Animations for domains 1sxaA0 and 1aak00 [Cook et al., 1992], using values of $f_{\overline{h}} = 1, \ldots, 9$, are available on the internet[*]. Viewing these animations is not essential. In summary, the highest values of $f_{\overline{h}}$ bias the mapping heavily towards the original hydropathy data, and the mapped hydropathy looks very much like Figure 5.7. As $f_{\overline{h}}$ decreases, the mapped hydropathy becomes more idealised and the information in the loops decreases (as in Figure 5.9), until a sudden change at $f_{\overline{h}} = 2$, where the alternating patterns are lost (as in Figure 5.8). The raw and mapped ($f_{\overline{h}} = 3$) hydrophobicities of a number of domains are shown in Figure 5.10. The hydrophobic feature extraction appears successful in each case (domains were picked at random from a list of 19 domains taken from CATH, each of between 100 and 300 residues and with at least 20 aligned sequences of not more than 70% pairwise identity). A quantitative evaluation of these mappings is undertaken in Section 5.5.

Figure 5.10: Raw and mapped hydrophobicity for a number of domains. (a) 1fkb00 raw, (b) 1fkb00 mapped, (c) 1gia02 raw, (d) 1gia02 mapped, (e) 1rblM0 raw, (f) 1rblM0 mapped, (g) 1thm00 raw, (h) 1thm00 mapped, (i) 2ohxA2 raw, (j) 2ohxA2 mapped, (k) 5p2100 raw, (l) 5p2100 mapped.


Copyright Bob MacCallum - DISCLAIMER: this was written in 1997 and may contain out-of-date information.