[Biojava-l] Alignment objects
Nathan S. Haigh
n.haigh at sheffield.ac.uk
Wed Aug 9 15:13:49 UTC 2006
I think I'm having a few problems with alignments. I've generated a
protein alignment in the following way:
    String alnString1 =
        ">seq1\n" +
        "----FGHIKLMNPQRST\n" +
        ">seq2\n" +
        "ACDEFGHIKLMNPQRST\n";
    BufferedReader br1 = new BufferedReader(new StringReader(alnString1));
    FastaAlignmentFormat faf1 = new FastaAlignmentFormat();
    alignment = faf1.read(br1);
To collect the positions that contain gaps into a Location object, I
loop over the alignment columns as shown below. It seems hacky, since I
have to check whether the symbol name contains "[]" in order to identify
a gap. I'm sure there must be a better way to do this! One option would
be to calculate the frequency of each symbol (including gaps) at every
position in the alignment. I could then return a list of these
frequencies, one per position, which other methods could use to identify
positions with certain characteristics (such as those containing gaps).
Any ideas?
    for (int col = 1; col <= alignment.length(); col++) {
        for (Iterator labels = alignment.getLabels().iterator(); labels.hasNext(); ) {
            Object label = labels.next();
            Symbol sym = alignment.symbolAt(label, col);
            if (sym.getName().contains("[]")) {
                Location newLocation = LocationTools.makeLocation(col, col);
                gapped = this.appendLocation(gapped, newLocation);
            }
        }
    }
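
To make the frequency idea above more concrete, here is a rough sketch of
the kind of per-column count I have in mind. It deliberately works on
plain strings rather than the BioJava Alignment object, so the class name
ColumnFrequencies, the two method names, and the use of '-' as the gap
character are all assumptions made for illustration, not BioJava API. It
also assumes every row has the same length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of BioJava: counts characters per alignment column.
public class ColumnFrequencies {

    // Returns one map per column holding the count of each character, gaps included.
    // Assumes all rows have the same length as the first row.
    static List<Map<Character, Integer>> columnCounts(List<String> rows) {
        int length = rows.get(0).length();
        List<Map<Character, Integer>> counts = new ArrayList<>();
        for (int col = 0; col < length; col++) {
            Map<Character, Integer> colCounts = new TreeMap<>();
            for (String row : rows) {
                colCounts.merge(row.charAt(col), 1, Integer::sum);
            }
            counts.add(colCounts);
        }
        return counts;
    }

    // Returns the 1-based column numbers in which at least one row has a gap ('-').
    static List<Integer> gappedColumns(List<Map<Character, Integer>> counts) {
        List<Integer> gapped = new ArrayList<>();
        for (int i = 0; i < counts.size(); i++) {
            if (counts.get(i).containsKey('-')) {
                gapped.add(i + 1);
            }
        }
        return gapped;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "----FGHIKLMNPQRST",
            "ACDEFGHIKLMNPQRST");
        List<Map<Character, Integer>> counts = columnCounts(rows);
        System.out.println(gappedColumns(counts)); // prints [1, 2, 3, 4]
        System.out.println(counts.get(0));         // prints {-=1, A=1}
    }
}
```

With the counts computed once, other checks (fully conserved columns,
columns above some gap fraction, and so on) could reuse the same list
instead of looping over the alignment again.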
Cheers
Nath