[emboss-dev] Regression in GenBank/GenPept parsing?

Peter biopython at maubp.freeserve.co.uk
Tue Jul 21 17:19:58 UTC 2009


On Tue, Jul 21, 2009 at 2:30 PM, Peter Rice<pmr at ebi.ac.uk> wrote:
>
> Peter wrote:
>> On Tue, Jul 21, 2009 at 11:40 AM, Peter Rice<pmr at ebi.ac.uk> wrote:
>>> GenPept format expects to find 9 fields on the LOCUS line.
>>> RefseqP format expects only 8.
>>>
>>> The difference is GenPept format including the original GenPept locus name.
>>
>> Which 8 or 9 fields?
>
> 'LOCUS'
> identifier
> Genbank-locus-name (GenPept format only)
> seqlen             (numeric)
> 'aa'
> molecule-type      (controlled vocabulary - we ignore the protein ones for now)
> 'circular' or 'linear'
> division           (expecting 'UNC' for unclassified)
> date               (last modified date)

Do you have some publicly available examples of these? And if so,
are you happy for them to be included within Biopython for unit tests?
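
To check I have read the field list above correctly, here is a rough
Python sketch of it. This is not EMBOSS or Biopython code: the function
name, the dictionary keys, and the assumption that the LOCUS line can
simply be split on whitespace are all mine, and the sample lines in the
comments are made up rather than taken from real records.

```python
# Field names are hypothetical labels for the nine fields listed above.
GENPEPT_FIELDS = [
    "keyword",        # literal 'LOCUS'
    "identifier",
    "genbank_locus",  # GenPept format only
    "seqlen",         # numeric
    "units",          # literal 'aa'
    "molecule_type",  # controlled vocabulary
    "topology",       # 'circular' or 'linear'
    "division",       # e.g. 'UNC' for unclassified
    "date",           # last modified date
]
# RefseqP is described as the same list minus the GenBank locus name.
REFSEQP_FIELDS = [name for name in GENPEPT_FIELDS if name != "genbank_locus"]


def split_locus_line(line):
    """Split a protein LOCUS line into named fields (illustrative only).

    Assumes whitespace-separated tokens: 9 tokens is treated as GenPept,
    8 tokens as RefseqP, anything else is rejected.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % line)
    if len(tokens) == 9:
        flavour, names = "genpept", GENPEPT_FIELDS
    elif len(tokens) == 8:
        flavour, names = "refseqp", REFSEQP_FIELDS
    else:
        raise ValueError("expected 8 or 9 fields, found %d" % len(tokens))
    fields = dict(zip(names, tokens))
    if fields["units"] != "aa":
        raise ValueError("expected 'aa' after the sequence length")
    if fields["topology"] not in ("circular", "linear"):
        raise ValueError("expected 'circular' or 'linear'")
    fields["seqlen"] = int(fields["seqlen"])
    return flavour, fields


# Made-up example lines (MOLTYPE is a placeholder token, not a real value):
#   split_locus_line("LOCUS ID0001 LOC0001 120 aa MOLTYPE linear UNC 21-JUL-2009")
#   split_locus_line("LOCUS ID0001 120 aa MOLTYPE linear UNC 21-JUL-2009")
```

If that matches what the parser expects, then any record whose LOCUS
line has fewer than 8 whitespace-separated tokens would be rejected by
both formats, which is why real example files would help.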

>> Grand. Will there be an EMBOSS 6.1.1 in a week or so then (addressing
>> this, the FASTQ @ problem, and any other minor issues)?
>
> There will be a patch file in the
> ftp://emboss.open-bio.org/pub/EMBOSS/patches/ directory
>
> For those (like me) who prefer to manually update there will also be
> replacement file(s) in the fixes directory.

Would there eventually be an EMBOSS 6.1.1 release for the less
technical users who won't want to mess about with patches or
replacing single files? I hope we don't have to wait 40 days! ;)
[This is a joke referencing St Swithin's day and associated legends]

Peter C.



