[Biopython-dev] [Bug 3069] Support for EMBL-like files from IMGT

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Tue May 18 16:10:29 UTC 2010


http://bugzilla.open-bio.org/show_bug.cgi?id=3069


biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Support for EMBL-line files |Support for EMBL-like files
                   |from IMGT                   |from IMGT




------- Comment #18 from biopython-bugzilla at maubp.freeserve.co.uk  2010-05-18 12:10 EST -------
(In reply to comment #16)
> (In reply to comment #15)
> > Uri - Could you explain what your code was trying to do with the record
> > header parsing? An example or two would be great. Thanks!
> 
> So the approach I used was to keep the feature parser the exact same as it was
> in the EMBL parser.  In the parse_header function, I would determine for each
> record what the indentation was, and then changed FEATURE_QUALIFIER_INDENT and
> FEATURE_QUALIFIER_SPACER for each record.  This way, the standard EMBL parser
> would work fine, and there would never be any problems if the feature key was
> adjacent to the location qualifier. 
> 

I see now. If IMGT has consistent FH and FT lines that we can trust, that would
be quite elegant... On the other hand, to fix the nasty locations we are forced
to subclass parse_features anyway.
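
For anyone following along, the per-record indentation idea could be sketched
roughly as below. This is only an illustration, not Uri's actual code: the
function names are made up, and it assumes each record carries an FH line
containing the usual "Location/Qualifiers" column header, whose position gives
the indentation to use for FEATURE_QUALIFIER_INDENT and
FEATURE_QUALIFIER_SPACER.

```python
# Hypothetical helpers, not part of Biopython.
EMBL_DEFAULT_INDENT = 21  # column where qualifiers start in standard EMBL files


def detect_feature_indent(record_lines, default=EMBL_DEFAULT_INDENT):
    """Guess the qualifier indentation for one record from its FH line.

    Assumes the record has a line like
    'FH   Key             Location/Qualifiers'; falls back to the
    standard EMBL value if no such line is found.
    """
    for line in record_lines:
        if line.startswith("FH") and "Location/Qualifiers" in line:
            return line.index("Location/Qualifiers")
    return default


def qualifier_spacer(indent):
    """Build the prefix expected on qualifier continuation lines."""
    return "FT" + " " * (indent - 2)
```

A parser could call detect_feature_indent() once per record header and set the
two constants from the result before handing over to the unchanged feature
parser. Whether the FH lines in the IMGT files are reliable enough for this is
exactly the open question above.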

(In reply to comment #17)
> Also, here is a script that will fix the location errors with the '>'
> symbols. 
>
> Run as:
> 
> python fix_ligm_locations.py imgt.dat imgt.fixed.dat
> 
> http://gist.github.com/405146
> 

I've used your regular expression solution in my branch now,
http://github.com/biopython/biopython
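
As a rough illustration of the kind of fix involved (this is not the
expression from the gist, and it assumes the errors take the form "123>" where
">123" was intended), a regular expression can move a trailing '>' in front of
the position it follows. The names here are made up for the example.

```python
import re

# Assumption: the bad locations put '>' after the position, e.g. '1..501>'.
_MISPLACED_GT = re.compile(r"(\d+)>")


def fix_location(location):
    """Move a '>' that trails a position so that it precedes it."""
    return _MISPLACED_GT.sub(r">\1", location)
```

Locations that are already valid, such as "<1..>501", contain no digit
immediately followed by '>' and so pass through unchanged.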

Remind me to add your name as a contributor once this gets merged to the trunk.

