[Biojava-l] IndexOutOfBounds Exception when performing Pairwise Alignment
Hannes Brandstätter-Müller
biojava at hannes.oib.com
Tue Dec 6 08:20:46 UTC 2011
On Tue, Dec 6, 2011 at 02:57, Andreas Prlic <andreas at sdsc.edu> wrote:
>> Now, I'm getting null return value - must be still something wrong in
>> the parameters...
>>
>> Where should I start looking for that?
>
> try different gap penalties, I think the default ones are for protein
> alignments and one of the blosum matrices...
> If that does not help, can you send some of the sequences that are
> causing problems? There should be more informative error messages..
There are no other gap penalties predefined, and using a custom simple
gap penalty with (gop=1, gep=1) also does not change the null outcome.
Here is a unit test case that fails for me:
public void testPSA() {
    String targetSeq =
        "CACGTTTCTTGTGGCAGCTTAAGTTTGAATGTCATTTCTTCAATGGGACGGA"
        + "GCGGGTGCGGTTGCTGGAAAGATGCATCTATAACCAAGAGGAGTCCGTGCGCTTCGACAGC"
        + "GACGTGGGGGAGTACCGGGCGGTGACGGAGCTGGGGCGGCCTGATGCCGAGTACTGGAACA"
        + "GCCAGAAGGACCTCCTGGAGCAGAGGCGGGCCGCGGTGGACACCTACTGCAGACACAACTA"
        + "CGGGGTTGGTGAGAGCTTCACAGTGCAGCGGCGAG";
    DNASequence target = new DNASequence(targetSeq,
        AmbiguityDNACompoundSet.getDNACompoundSet());
    String querySeq =
        "ACGAGTGCGTGTTTTCCCGCCTGGTCCCCAGGCCCCCTTTCCGTCCTCAGGAA"
        + "GACAGAGGAGGAGCCCCTCGGGCTGCAGGTGGTGGGCGTTGCGGCGGCGGCCGGTTAAGGT"
        + "TCCCAGTGCCCGCACCCGGCCCACGGGAGCCCCGGACTGGCGGCGTCACTGTCAGTGTCTT"
        + "CTCAGGAGGCCGCCTGTGTGACTGGATCGTTCGTGTCCCCACAGCACGTTTCTTGGAGTAC"
        + "TCTACGTCTGAGTGTCATTTCTTCAATGGGACGGAGCGGGTGCGGTTCCTGGACAGATACT"
        + "TCCATAACCAGGAGGAGAACGTGCGCTTCGACAGCGACGTGGGGGAGTTCCGGGCGGTGAC"
        + "GGAGCTGGGGCGGCCTGATGCCGAGTACTGGAACAGCCAGAAGGACATCCTGGAAGACGAG"
        + "CGGGCCGCGGTGGACACCTACTGCAGACACAACTACGGGGTTGTGAGAGCTTCACCGTGCA"
        + "GCGGCGAGACGCACTCGT";
    DNASequence query = new DNASequence(querySeq);
    SubstitutionMatrix<NucleotideCompound> matrix =
        SubstitutionMatrixHelper.getNuc4_4();
    SequencePair<DNASequence, NucleotideCompound> psa =
        Alignments.getPairwiseAlignment(query, target,
            PairwiseSequenceAlignerType.LOCAL, new SimpleGapPenalty(), matrix);
    assertNotNull(psa);
}
>> Is there a simple way to align (or score, don't need the full
>> alignment) a single DNA sequence against a List of sequences?
>
> You could do a multiple sequence alignment.
> http://www.biojava.org/wiki/BioJava:CookBook3:MSA
Yeah, but that also computes loads of unnecessary PSAs. I just need
the following:
I get some sequences (from a sequencing machine), and for each of
these sequences I want to look up in my (small) 'library' of reference
sequences which one is the most likely match. So I don't want PSAs
among the reference sequences, just my query against each ref seq.
Something like that should be in the BioJava library itself. The only
thing I found was a way to calculate PSAs for each pair of sequences
in a list (much like you need for an MSA). If BioJava could offer
query-vs-list scoring using the ConcurrencyTools stuff, that would be
cool. I really need to figure out the inner structure of the BioJava
classes and start implementing that stuff myself, but the factory
method stuff is kinda confusing to get the hang of.
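A minimal sketch of that query-against-each-reference lookup, in plain
Java rather than BioJava's API: it computes a Smith-Waterman local
alignment score (score only, no traceback) for the query against every
reference on a thread pool and returns the name of the best-scoring
one. The class name, the method names, and the match/mismatch/gap
values are assumptions made up for illustration; they are not BioJava
defaults, and a linear gap penalty stands in for the gop/gep scheme.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BestReferenceMatch {

    // Hypothetical scoring parameters, chosen for illustration only.
    static final int MATCH = 2;
    static final int MISMATCH = -1;
    static final int GAP = -2; // linear penalty per gapped position

    /** Smith-Waterman local alignment score, keeping only two matrix rows. */
    static int localScore(String query, String ref) {
        int[] prev = new int[ref.length() + 1];
        int[] curr = new int[ref.length() + 1];
        int best = 0;
        for (int i = 1; i <= query.length(); i++) {
            curr[0] = 0;
            for (int j = 1; j <= ref.length(); j++) {
                int sub = query.charAt(i - 1) == ref.charAt(j - 1) ? MATCH : MISMATCH;
                int score = Math.max(0, prev[j - 1] + sub);   // diagonal: (mis)match
                score = Math.max(score, prev[j] + GAP);       // gap in the reference
                score = Math.max(score, curr[j - 1] + GAP);   // gap in the query
                curr[j] = score;
                best = Math.max(best, score);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return best;
    }

    /** Scores the query against each named reference in parallel; returns the best name. */
    static String bestMatch(String query, Map<String, String> references) {
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            Map<String, Future<Integer>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : references.entrySet()) {
                final String ref = e.getValue();
                Callable<Integer> task = () -> localScore(query, ref);
                pending.put(e.getKey(), pool.submit(task));
            }
            String bestName = null;
            int bestScore = -1;
            for (Map.Entry<String, Future<Integer>> e : pending.entrySet()) {
                int score = e.getValue().get(); // blocks until this score is ready
                if (score > bestScore) {        // ties keep the earlier reference
                    bestScore = score;
                    bestName = e.getKey();
                }
            }
            return bestName;
        } catch (Exception ex) {
            throw new IllegalStateException("scoring failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> refs = new LinkedHashMap<>();
        refs.put("refA", "TTTTTTTT");
        refs.put("refB", "GGACGTACGTGG");
        System.out.println(bestMatch("ACGTACGT", refs));
    }
}
```

This does N alignments for N references instead of the N*(N-1)/2 an
all-pairs run would need, and skipping the traceback keeps memory at
two rows per comparison.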
As soon as I figure this out, I'm going to improve the hell out of the
cookbook examples. Those are next to useless for my scenario.
Hannes