The ClustalW program [Thompson et al., 1994] was used to perform multiple sequence alignments of the probe and hit sequences. As before, default settings were used, except that the `quicktree' option was enabled when more than 50 sequences were being aligned. In addition, when more than 50 hits were recovered, the sequences were weeded (see below) so that no two had more than 98% identity in their unaligned sequences. This step served primarily to remove duplicates. The multiple sequence alignment retains the gaps (`-' characters) introduced between sequence fragments.
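The two size-dependent rules above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used. The function names, the greedy keep-first weeding order, and the `clustalw` executable name are assumptions. The `identity` function is only a stand-in based on Python's `difflib`, since the measure of identity between unaligned sequences is not specified in this paragraph. `-INFILE`, `-ALIGN` and `-QUICKTREE` are standard ClustalW command-line options.

```python
from difflib import SequenceMatcher

MAX_IDENTITY = 0.98   # threshold stated in the text
LARGE_SET = 50        # size above which weeding / quicktree apply


def identity(a, b):
    """Stand-in identity measure for two unaligned sequences (assumption).

    Uses difflib's similarity ratio; the measure used in the paper
    is not given in this paragraph.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def weed_redundant(seqs, max_identity=MAX_IDENTITY):
    """Greedily keep sequences so that no kept pair exceeds max_identity."""
    kept = []
    for s in seqs:
        if all(identity(s, k) <= max_identity for k in kept):
            kept.append(s)
    return kept


def build_clustalw_command(infile, n_sequences, exe="clustalw"):
    """Build a ClustalW command line, adding -QUICKTREE for large inputs."""
    cmd = [exe, f"-INFILE={infile}", "-ALIGN"]
    if n_sequences > LARGE_SET:
        cmd.append("-QUICKTREE")
    return cmd
```

In use, one would call `weed_redundant` on the hits only when more than 50 were recovered, then pass the number of sequences actually being aligned to `build_clustalw_command`. Whether that count is taken before or after weeding is not stated in the text, so either choice is an assumption.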