[Biojava-dev] [Bug 2273] New: More problems writing uniprot files

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Fri Apr 13 05:08:33 UTC 2007


http://bugzilla.open-bio.org/show_bug.cgi?id=2273

           Summary: More problems writing uniprot files
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org


I found a few problems when writing UniProt files. Using P04941 as a
test example:

1. The ID line is not written in a fixed format (this is probably not
actually a bug):

(before/after - read/write)
ID   KV6A7_MOUSE             Reviewed;         107 AA.
ID   KV6A7_MOUSE     Reviewed;             107 AA.

2. The reference title is truncated by one character at the end after each
read/write operation:

RT   phenyloxazolone and its early diversification.";
RT   phenyloxazolone and its early diversification";
RT   phenyloxazolone and its early diversificatio";
...
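
The truncation pattern above looks like an unconditional "drop the last
character" step somewhere in the read/write path, although the actual cause
in BioJava has not been identified here. As a minimal sketch (not BioJava
code; the class RtTitle and its methods are hypothetical names), the
delimiters of an RT value can be removed only when they are actually
present, so that repeated read/write cycles leave the title unchanged:

```java
class RtTitle {
    // Strip the trailing ';' and the surrounding double quotes of an RT value,
    // but only if each delimiter is really there. Nothing else is removed.
    static String unwrap(String rtValue) {
        String s = rtValue.trim();
        if (s.endsWith(";")) {
            s = s.substring(0, s.length() - 1);
        }
        if (s.startsWith("\"")) {
            s = s.substring(1);
        }
        if (s.endsWith("\"")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    // Re-add the delimiters when writing the title back out.
    static String wrap(String title) {
        return "\"" + title + "\";";
    }

    public static void main(String[] args) {
        String title = unwrap("phenyloxazolone and its early diversification.\";");
        // Simulate three read/write cycles; the title should not shrink.
        for (int i = 0; i < 3; i++) {
            title = unwrap(wrap(title));
        }
        System.out.println("RT   " + title + "\";");
    }
}
```

With this approach the final period of the title survives any number of
cycles, because only the quote and semicolon delimiters are ever removed.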

3. The FT line is not formatted correctly. This is a bug because the FT line
has a fixed format, in which the I of Ig should be at position 35:

(before/after - read/write)
FT   CHAIN         1   >107       Ig kappa chain V-VI region NQ2-48.2.2.
FT   CHAIN        1  107>    Ig kappa chain V-VI region NQ2-48.2.2.
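
For reference, the fixed layout can be sketched with a single format string.
This is only an illustration, not BioJava's writer; the class name FtLine is
hypothetical, and the column assignments (key in columns 6-13, 'from'
right-justified in 15-20, 'to' right-justified in 22-27, description from
column 35) are assumed from the flat-file layout that the original entry
above follows:

```java
class FtLine {
    // Build one FT line using fixed columns:
    //   1-2   "FT"
    //   6-13  feature key, left-justified
    //   15-20 'from' endpoint, right-justified
    //   22-27 'to' endpoint, right-justified
    //   35-   description
    static String format(String key, String from, String to, String description) {
        return String.format("FT   %-8s %6s %6s       %s", key, from, to, description);
    }

    public static void main(String[] args) {
        String line = format("CHAIN", "1", ">107",
                "Ig kappa chain V-VI region NQ2-48.2.2.");
        System.out.println(line);
        // 1-based column where the description starts; prints 35
        System.out.println(line.indexOf("Ig") + 1);
    }
}
```

Note that the endpoint is passed as the string ">107" so that the fuzzy
marker stays in front of the number, as in the original entry.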

4. SQ line: are these really the same CRC64 value?

SQ   SEQUENCE   107 AA;  11557 MW;  72488DA9EF354934 CRC64;
SQ   SEQUENCE 107  AA; 11564 MW; ffffffffe278ca323958dd50 CRC64;
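
The 24-digit value in the written line has the shape that results when a
64-bit checksum is printed as two 32-bit halves and the upper half is
sign-extended. That is only a guess at the cause, not something confirmed
from the BioJava source. The sketch below (the class Crc64HexDemo and its
methods are hypothetical) reproduces that shape and shows a fixed-width
alternative:

```java
class Crc64HexDemo {
    // Assumed faulty pattern: the upper 32 bits are widened back to a long,
    // which sign-extends when the top bit is set and yields 16 hex digits
    // instead of 8.
    static String halvesSignExtended(long crc) {
        int hi = (int) (crc >>> 32);
        int lo = (int) crc;
        return Long.toHexString((long) hi) + Integer.toHexString(lo);
    }

    // Fixed-width alternative: always 16 upper-case hex digits.
    static String fixedWidth(long crc) {
        return String.format("%016X", crc);
    }

    public static void main(String[] args) {
        // Lower 64 bits of the value from the written SQ line above.
        long crc = 0xE278CA323958DD50L;
        System.out.println(halvesSignExtended(crc)); // ffffffffe278ca323958dd50
        System.out.println(fixedWidth(crc));         // E278CA323958DD50
    }
}
```

Even with the leading ffffffff removed, E278CA323958DD50 differs from the
72488DA9EF354934 in the original entry, so the formatting alone would not
account for the mismatch.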

- George




