[Biopython-dev] [Bug 1758] genbank parser chokes on /transl_except
bugzilla-daemon at portal.open-bio.org
Tue Mar 8 17:19:21 EST 2005
http://bugzilla.open-bio.org/show_bug.cgi?id=1758
------- Additional Comments From biopython-bugzilla at maubp.freeserve.co.uk 2005-03-08 17:19 -------
You can test this with accession NT_033779 available here:
ftp://ftp.ncbi.nih.gov/genomes/Drosophila_melanogaster/CHR_2/NT_033779.gbk
I think the problem is that the /transl_except=... entry spans multiple lines,
but is not wrapped in quotes (as done normally for multi-line entries).
Due to bug 1747 I haven't tried loading this with the current GenBank parser as
I don't have enough RAM. However, for the record, even my personal GenBank
parser (patch on bug 1747) doesn't yet cope with the /transl_except=... entry.
Are this file and others like it "wrong", i.e. should they quote the entry?
Or do we need to cope with this as well (not sure how painful that would be!)?
Snippet from the file:
CDS complement(join(18108857..18109603,18109665..18110692,
18111046..18111608,18111671..18112909,18113657..18114058,
18115560..18116014))
/gene="kel"
/locus_tag="CG7210"
/codon_start=1
/transl_except=(pos:complement(18111697..18111699),
aa:OTHER)
/protein_id="NP_476589.4"
/db_xref="GI:45549017"
/db_xref="FLYBASE:FBgn0001301"
/db_xref="GeneID:35084"
/translation="MIALSALLTKYTIGIMSNLSNGNSNNNNQQQQQQQQGQNPQQPA
QNEGGAGAEFVAPPPGLGAAVGVAAMQQRNRLLQQQQQQHHHHQNPAAEGSGLERGSC
LLRYASQNSLDESSQKHVQRPNGKERGTVGQYSNEQHTARSFDAMNEMRKQKQLCDVI
LVADDVEIHAHRMVLASCSPYFYAMFTSFEESRQARITLQSVDARALELLIDYVYTAT
VEVNEDNVQVLLTAANLLQLTDVRDACCDFLQTQLDASNCLGIREFADIHACVELLNY
AETYIEQHFNEVIQFDEFLNLSHEQVISLIGNDRISVPNEERVYECVIAWLRYDVPMR
EQFTSLLMEHVRLPFLSKEYITQRVDKEILLEGNIVCKNLIIEALTYHLLPTETKSAR
TVPRKPVGMPKILLVIGGQAPKAIRSVEWYDLREEKWYQAAEMPNRRCRSGLSVLGDK
VYAVGGFNGSLRVRTVDVYDPATDQWANCSNMEARRSTLGVAVLNGCIYAVGGFDGTT
GLSSAEMYDPKTDIWRFIASMSTRRSSVGVGVVHGLLYAVGGYDGFTRQCLSSVERYN
PDTDTWVNVAEMSSRRSGAGVGVLNNILYAVGGHDGPMVRRSVEAYDCETNSWRSVAD
MSYCRRNAGVVAHDGLLYVVGGDDGTSNLASVEVYCPDSDSWRILPALMTIGRSYAGV
CMIDKPMXMEEQGALARQAASLAIALLDDENSQAEGTMEGAIGGAIYGNLAPAGGAAA
AAAPAAPAQAPQPNHPHYENIYAPIGQPSNNNNNSGSNSNQAAAIANANAPANAEEIQ
QQQQPAPTEPNANNNPQPPTAAAPAPSQQQQQQQAQPQQPQRILPMNNYRNDLYDRSA
AGGVCSAYDVPRAVRSGLGYRRNFRIDMQNGNRCGSGLRCTPLYTNSRSNCQRQRSFD
DTESTDGYNLPYAGAGTMRYENIYEQIRDEPLYRTSAANRVPLYTRLDVLGHGIGRIE
RHLSSSCGNIDHYNLGGHYAVLGHSHFGTVGHIRLNANGSGVAAPGVAGTGTCNVPNC
QGYMTAAGSTVPVEYANVKVPVKNSASSFFSCLHGENSQSMTNIYKTSGTAAAMAAHN
SPLTPNVSMERASRSASAGAAGSAAAAVEEHSAADSIPSSSNINANRTTGAIPKVKTA
NKPAKESGGSSTAASPILDKTTSTGSGKSVTLAKKTSTAAARSSSSGDTNGNGTLNRI
SKSSLQWLLVNKWLPLWIGQGPDCKVIDFNFMFSRDCVSCDTASVASQMSNPYGTPRL
SGLPQDMVRFQSSCAGACAAAGAASTIRRDANASARPLHSTLSRLRNGEKRNPNRVAG
NYQYEDPSYENVHVQWQNGFEFGRSRDYDPNSTYHQQRPLLQRARSESPTFSNQQRRL
QRQGAQAQQQSQQPKPPGSPDPYKNYKLNADNNTFKPKPIAADELEGAVGGAVAEIAL
PEVDIEVVDPVSLSDNETETTSSQNNLPSTTNSNNLNEHND"
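
For illustration only, here is a minimal sketch of one way a parser could cope with the unquoted multi-line entry in the snippet above. It is not the Biopython GenBank parser, nor the patch on bug 1747; the function name join_qualifier_lines is hypothetical. It assumes that the qualifier lines of a single feature have already been extracted with their leading whitespace stripped, and that a new qualifier always starts with "/" unless a quoted value is still open.

```python
def join_qualifier_lines(lines):
    """Merge continuation lines into the qualifier they belong to.

    Assumption (not taken from any existing parser): within one feature,
    a line starting with "/" opens a new qualifier unless the previous
    qualifier still has an unclosed double quote. Every other line is
    treated as a continuation, quoted or not.
    """
    merged = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # An odd number of double quotes means a quoted value is still open.
        in_open_quote = bool(merged) and merged[-1].count('"') % 2 == 1
        if line.startswith("/") and not in_open_quote:
            merged.append(line)
        elif merged:
            # Quoted free text is rejoined with a space; sequence data and
            # unquoted values such as /transl_except=(...) are rejoined
            # without one.
            is_translation = merged[-1].startswith("/translation=")
            sep = " " if in_open_quote and not is_translation else ""
            merged[-1] += sep + line
        else:
            raise ValueError("continuation line before any qualifier: %r" % line)
    return merged


example = [
    "/codon_start=1",
    "/transl_except=(pos:complement(18111697..18111699),",
    "aa:OTHER)",
    '/protein_id="NP_476589.4"',
]
for qualifier in join_qualifier_lines(example):
    print(qualifier)
```

With this approach the two /transl_except lines become a single entry before any key/value splitting happens, so the rest of the parser would not need to know whether the value was quoted.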
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.