[Bioperl-l] Bio::AlignIO ignores questionmarks?

Albert Vilella avilella at gmail.com
Fri Apr 14 10:01:01 EDT 2006


It seems like missing_char is more for SimpleAlign than for AlignIO. So
in the case of fasta files with '?' chars, they will be ignored at line 113
of Bio::AlignIO::fasta.pm.

So you can add '\?' to the regular expression on that line of fasta.pm.

That will parse it correctly, although I am not sure whether fasta
format should or shouldn't allow '?' chars in the file.
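
To illustrate the effect, here is a minimal standalone sketch. The
character class in it is only an assumption about what that cleanup step
looks like, not a copy of the actual fasta.pm source, and strip_default and
strip_keep_missing are hypothetical names used just for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser's cleanup step: drop every character
# that is not a letter, '.', or '-'. The real pattern in fasta.pm may differ.
sub strip_default {
    my ($seq) = @_;
    $seq =~ s/[^A-Za-z.\-]//g;
    return $seq;
}

# Same cleanup with '\?' added to the allowed set, so missing-data
# characters survive and the downstream columns do not shift.
sub strip_keep_missing {
    my ($seq) = @_;
    $seq =~ s/[^A-Za-z.\-\?]//g;
    return $seq;
}

my $raw = "ACGT????ACGT--";
print strip_default($raw), "\n";       # the '?' run is removed
print strip_keep_missing($raw), "\n";  # the '?' run is preserved
```

With the first version the sequence gets shorter, which would explain the
shifted nucleotides and the padding at the end of the alignment.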

Anyone?

Cheers,

    Albert.

On Thu, 2006-04-13 at 20:38 -0400, Kai Müller wrote:
> hi,
> 
> I'm very new to BioPerl and have a possibly silly question.
> When using Bio::AlignIO to load a set of sequences, the question marks are
> simply lost (they refer to missing characters as opposed to gap characters
> [-] or ambiguity [N]). I thought that 'missing_char()' might help, but it
> didn't (I probably used it the wrong way).
> 
> When $filename contains sequences with ????, the following snippet
> produces an alignment with the ???? lost, the downstream nucleotides simply
> shifted, and the resulting length differences filled by '---' at the 3' end:
> 
> 
> use Bio::AlignIO;
>
> # $filename holds the path to the fasta alignment
> my $aln_in = Bio::AlignIO->new(-file => $filename, -format => 'fasta');
> my $aln = $aln_in->next_aln();
> $aln->gap_char('-');
> $aln->missing_char('?');
>
> # write the alignment to STDOUT in clustalw format
> my $testout = Bio::AlignIO->new(-fh => \*STDOUT, -format => 'clustalw');
> $testout->write_aln($aln);
> 
> 
> 
> Can somebody give me a hint here?
> 
> thanks and all the best,
> 
> Kai Müller



