[emboss-dev] Regression in GenBank/GenPept parsing?
Peter Rice
pmr at ebi.ac.uk
Tue Jul 21 13:30:17 UTC 2009
Peter wrote:
> On Tue, Jul 21, 2009 at 11:40 AM, Peter Rice<pmr at ebi.ac.uk> wrote:
>> GenPept format expects to find 9 fields on the LOCUS line.
>> RefseqP format expects only 8.
>>
>> The difference is GenPept format including the original GenPept locus name.
>
> Which 8 or 9 fields?
'LOCUS'
identifier
Genbank-locus-name (GenPept format only)
seqlen (numeric)
'aa'
molecule-type (controlled vocabulary - we ignore the protein ones
for now)
'circular' or 'linear'
division (expecting 'UNC' for unclassified)
date (last modified date)
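
As a rough illustration of the field list above (not the EMBOSS code), here is a minimal Python sketch. It assumes the LOCUS line can be split on whitespace and that the field count alone distinguishes GenPept (9) from RefseqP (8). The function name, the dictionary keys and the sample lines are all made up for the example.

```python
# Illustrative sketch only: field names follow the list above, but the
# whitespace-splitting approach and all identifiers are assumptions.
GENPEPT_FIELDS = [
    "keyword",             # literal 'LOCUS'
    "identifier",
    "genbank_locus_name",  # GenPept format only
    "seqlen",              # numeric
    "units",               # literal 'aa'
    "molecule_type",       # controlled vocabulary
    "topology",            # 'circular' or 'linear'
    "division",            # e.g. 'UNC' for unclassified
    "date",                # last modified date
]
REFSEQP_FIELDS = [f for f in GENPEPT_FIELDS if f != "genbank_locus_name"]


def parse_locus(line):
    """Split a LOCUS line into named fields, accepting 9 or 8 tokens."""
    tokens = line.split()
    if len(tokens) == len(GENPEPT_FIELDS):
        names = GENPEPT_FIELDS
    elif len(tokens) == len(REFSEQP_FIELDS):
        names = REFSEQP_FIELDS
    else:
        raise ValueError("expected 8 or 9 fields, found %d" % len(tokens))

    record = dict(zip(names, tokens))
    if record["keyword"] != "LOCUS" or record["units"] != "aa":
        raise ValueError("not a protein LOCUS line")
    if record["topology"] not in ("circular", "linear"):
        raise ValueError("unexpected topology: %r" % record["topology"])
    record["seqlen"] = int(record["seqlen"])
    return record


# Made-up sample lines, not taken from real database entries.
nine = "LOCUS ABC12345 XYZ_HUMAN 120 aa PRT linear UNC 21-JUL-2009"
eight = "LOCUS ABC12345 120 aa PRT linear UNC 21-JUL-2009"
genpept = parse_locus(nine)
refseqp = parse_locus(eight)
```

With these sample lines, only the 9-field record carries a genbank_locus_name entry; any other field count is rejected rather than guessed at.
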
> Grand. Will there be an EMBOSS 6.1.1 in a week or so then (addressing
> this, the FASTQ @ problem, and any other minor issues)?
There will be a patch file in the
ftp://emboss.open-bio.org/pub/EMBOSS/patches/ directory.
For those (like me) who prefer to update manually, there will also be
replacement file(s) in the fixes directory.
> http://biopython.open-bio.org/SRC/biopython/ is just a dump from
> our repository (hourly or something). If you just download the latest
> Biopython source code, this will have all the unit test files etc:
> http://biopython.org/DIST/biopython-1.51b.tar.gz
Super, thanks.
> Ask if you need clarification on what any of the test data files are
> for. In some cases searching the Tests/test_*.py files may have
> informative comments.
Thanks. The plan is to include them in the EMBOSS QA tests, so I will
take a look at the inputs and at what you check for in the outputs. At
first glance it looks straightforward.
regards,
Peter