Next: Fold essence Up: Mapping of three-dimensional protein Previous: Map discontinuities   Contents

Map dimensions

For the mapping in Figure 5.3 we used a $17 \times 17$ map grid to accommodate the 306 residues, on the assumption that the number of map nodes should be proportional to the volume of the domain, which is in turn proportional to the number of residues in the domain [Hao et al., 1992]. Through experimentation, a scaling factor of 1.0 map nodes per residue (i.e. 100 residues $\rightarrow 10 \times 10$ map) was found to give sensible maps.
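The sizing rule for a square map can be sketched in a few lines of Python. This is only an illustration: the function name `square_map_side` is hypothetical, and rounding the side length to the nearest integer is an assumption (the text does not say how the side is rounded), although it is consistent with both examples quoted above.

```python
import math


def square_map_side(n_residues, scale=1.0):
    """Side length of a square map with about scale * n_residues nodes.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    if n_residues <= 0 or scale <= 0:
        raise ValueError("n_residues and scale must be positive")
    nodes = scale * n_residues          # target number of map nodes
    return max(1, round(math.sqrt(nodes)))


if __name__ == "__main__":
    for residues in (100, 306):
        side = square_map_side(residues)
        print(f"{residues} residues: {side} x {side} map")
```

With the scaling factor of 1.0, the 306-residue domain gives a side of 17 because $\sqrt{306} \approx 17.5$ is rounded down here, so the grid has slightly fewer nodes (289) than residues.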

Self-organising maps need not be square. In fact, the dimensions of the map should reflect the distribution of input vectors; if the input vectors occupy a long, thin region of space, then mapping to a long, thin grid will give the best performance. With high-dimensional inputs, the first two principal components give an approximation of the length and breadth of the input space. We therefore use the ellipsoid approximation of the atomic coordinates of each domain [Taylor et al., 1983] to specify the ratio of the map dimensions $M$ to $N$. Three eigenvalues are obtained from the principal components analysis of the atomic coordinates. In order of descending magnitude they represent the lengths of the long, middle and short axes of the best-fitting ellipsoid and are defined as $a$, $b$ and $c$ respectively. The number of residues in the domain is $I$. The map dimensions $M$ and $N$ must satisfy the following equations:

\begin{displaymath}
MN = I \qquad (13)
\end{displaymath}

\begin{displaymath}
M/N = a/b \qquad (14)
\end{displaymath}

therefore $M = \sqrt{Ia/b}$ and $N = \sqrt{Ib/a}$. In practice, however, these maps appear to be too elongated, so we halve the amount by which the aspect ratio exceeds 1, so that $M/N = 1 + \frac{a/b - 1}{2}$. Figure 5.6 demonstrates the use of rectangular maps for an elongated domain: 1kapP1 [Baumann et al., 1993], a $\beta $-solenoid structure. The advantages of the rectangular map can be clearly seen: improved correlation between real and map-derived inter-residue distances (0.90 compared with 0.77) and fewer discontinuities.
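As a rough illustration of the calculation above, the following Python sketch derives $M$ and $N$ from $I$, $a$ and $b$. Several things here are assumptions rather than details from the text: the function name `map_dimensions` and the parameter `keep` are hypothetical, the constraint $MN = I$ is assumed to still hold once the aspect ratio has been reduced, and rounding to the nearest integer is a guess at how fractional dimensions are handled.

```python
import math


def map_dimensions(n_residues, a, b, keep=0.5):
    """Return (M, N) for a rectangular map.

    n_residues : number of residues I in the domain
    a, b       : long and middle axis lengths of the fitted ellipsoid
    keep       : fraction of the excess aspect ratio (a/b - 1) retained;
                 0.5 gives M/N = 1 + (a/b - 1)/2, 1.0 gives M/N = a/b
    """
    if n_residues <= 0 or a <= 0 or b <= 0:
        raise ValueError("n_residues, a and b must be positive")
    ratio = 1.0 + keep * (a / b - 1.0)    # reduced aspect ratio M/N
    m = math.sqrt(n_residues * ratio)     # from MN = I and M/N = ratio
    n = math.sqrt(n_residues / ratio)
    # Rounding is an assumption; the product M*N may then differ from I.
    return max(1, round(m)), max(1, round(n))


if __name__ == "__main__":
    # Illustrative values only, not taken from any real domain.
    print(map_dimensions(200, 3.0, 1.0))            # reduced ratio of 2
    print(map_dimensions(200, 2.0, 1.0, keep=1.0))  # unreduced ratio a/b
```

Because both dimensions are rounded independently, the number of map units will generally not equal $I$ exactly.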

Figure 5.6: Map dimensions affect the quality of mapping. (a) Rasmol cartoon view of domain 1kapP1. (b) Mapping to a $15 \times 15$ grid (total 225 units). (c) Mapping to a $23 \times 9$ grid (total 207 units). (d) Intra-molecular distance matrices for the mapping in (b). Correlation coefficient = 0.77. (e) Intra-molecular distance matrices for the mapping in (c). Correlation coefficient = 0.90. The rectangular map has produced a better mapping with fewer discontinuities, despite having fewer units.


Copyright Bob MacCallum - DISCLAIMER: this was written in 1997 and may contain out-of-date information.