[Bioperl-l] BioPerl 1.6 and parsing multiple EMBL records
Hotz, Hans-Rudolf
hrh at fmi.ch
Mon Jan 11 15:42:22 UTC 2010
On 1/11/10 3:16 PM, "Peter" <biopython at maubp.freeserve.co.uk> wrote:
> Hi,
>
> I'm running bioperl-live from SVN, just updated to revision 16648.
>
> $ perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"'
> 1.0069
>
> I am trying to get Bio::SeqIO to convert a multiple record EMBL
> file into GenBank format, piping the data via stdin/stdout using
> the following trivial Perl script:
>
> #!/usr/bin/env perl
> use Bio::SeqIO;
> my $in = Bio::SeqIO->new(-fh => \*STDIN, -format => 'embl');
> my $out = Bio::SeqIO->new(-format => 'genbank');
> while (my $seq = $in->next_seq) { $out->write_seq($seq) };
>
> This only seems to find the first EMBL record in my example
> files. For example, this simple file has just two contig records:
> http://biopython.open-bio.org/SRC/biopython/Tests/EMBL/Human_contigs.embl
>
> This is just the first two records taken from a much larger EMBL file
> rel_con_hum_01_r102.dat downloaded and uncompressed from:
> ftp://ftp.ebi.ac.uk/pub/databases/embl/release/rel_con_hum_01_r102.dat.gz
These entries belong to the CON data class, see:
http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3_4_14
They do not contain any sequence information themselves.
If you take the 'expanded' entries from
ftp://ftp.ebi.ac.uk/pub/databases/embl/expanded_con/release/rel_con_hum_01_r102.dat.gz
your script will work.
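
To check which kind of file you have before handing it to Bio::SeqIO, something like the following might help. It is only a rough sketch in plain Perl (no BioPerl needed) and rests on two assumptions: that CON entries carry their assembly on CO lines, and that entries with real sequence data have an SQ line. The helper name scan_embl is made up for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: summarise each EMBL record read from a filehandle.
# Assumes records start with an ID line and end with a '//' line.
sub scan_embl {
    my ($fh) = @_;
    my (@records, %cur);
    while (my $line = <$fh>) {
        if ($line =~ /^ID\s+(\S+?);?\s/) {
            # New record: remember its identifier and reset the flags
            %cur = (id => $1, has_co => 0, has_sq => 0);
        }
        elsif ($line =~ /^CO\s/) { $cur{has_co} = 1 }   # contig assembly line
        elsif ($line =~ /^SQ\s/) { $cur{has_sq} = 1 }   # sequence header line
        elsif ($line =~ m{^//}) {
            # End of record: store a copy of the summary
            push @records, {%cur} if %cur;
            %cur = ();
        }
    }
    return @records;
}

# Only run when called as a script with a file name argument
if (!caller && @ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    for my $r (scan_embl($fh)) {
        printf "%s\tCO=%d\tSQ=%d\n", @{$r}{qw(id has_co has_sq)};
    }
    close $fh;
}
```

Run it with the EMBL file name as the only argument. Records reported with CO=1 and SQ=0 would be the unexpanded CON entries, and the total number of lines printed tells you how many records the file really contains.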
Hans
> Trying both these examples as input, BioPerl just gives a single
> GenBank record as output (the first EMBL entry in the input).
>
> Is this a BioPerl bug, or am I missing something?
>
> Peter