[Biopython] problem parsing embl file
Sameet Mehta
msameet at gmail.com
Mon Jun 28 15:20:49 EDT 2010
Hi,
I am trying to parse an EMBL file created in 2004. The file contains a
single record for the entire chromosome. I have tried the following
two approaches:
from Bio import SeqIO
r = SeqIO.parse( file( "chromosome1.contig.embl" ), "embl" ).next()
r = SeqIO.read( file( "chromosome1.contig.embl" ), "embl" )
I get the following error:
ValueError Traceback (most recent call last)
/home/sameet/NIH-work/downloads/2004_release/<ipython console> in <module>()
/usr/lib64/python2.6/site-packages/Bio/SeqIO/__init__.pyc in
read(handle, format, alphabet)
516 iterator = parse(handle, format, alphabet)
517 try:
--> 518 first = iterator.next()
519 except StopIteration:
520 first = None
/usr/lib64/python2.6/site-packages/Bio/GenBank/Scanner.pyc in
parse_records(self, handle, do_features)
418 #This is a generator function
419 while True:
--> 420 record = self.parse(handle, do_features)
421 if record is None : break
422 assert record.id is not None
/usr/lib64/python2.6/site-packages/Bio/GenBank/Scanner.pyc in
parse(self, handle, do_features)
401 feature_cleaner = FeatureValueCleaner())
402
--> 403 if self.feed(handle, consumer, do_features):
404 return consumer.data
405 else:
/usr/lib64/python2.6/site-packages/Bio/GenBank/Scanner.pyc in
feed(self, handle, consumer, do_features)
383 consumer.sequence(sequence_string)
384 #Calls to consumer.base_number() do nothing anyway
--> 385 consumer.record_end("//")
386
387 assert self.line == "//"
/usr/lib64/python2.6/site-packages/Bio/GenBank/__init__.pyc in
record_end(self, content)
1047 and self._expected_size != len(sequence):
1048 raise ValueError("Expected sequence length %i, found %i." \
-> 1049 % (self._expected_size, len(sequence)))
1050
1051 if self._seq_type:
ValueError: Expected sequence length 666, found 5580032.
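The message compares an expected size (presumably taken from a length declared in the record's header lines) with the number of residues actually read. A minimal sketch for checking this by hand, without Biopython, is given here. It assumes the ID line and the SQ line each carry a length written as "<n> BP" and that sequence lines follow the SQ line up to "//". The function name embl_length_report is hypothetical, not part of any library.

```python
import re

# Matches a length written as "<digits> BP", as assumed for ID and SQ lines.
LENGTH_RE = re.compile(r"(\d+)\s+BP")


def embl_length_report(lines):
    """Return (id_length, sq_length, counted) for the first record in `lines`.

    id_length and sq_length are the lengths declared on the ID and SQ lines
    (None if no "<n> BP" is found); counted is the number of letters found
    on the lines between the SQ line and the "//" terminator.
    """
    id_length = None
    sq_length = None
    counted = 0
    in_sequence = False
    for line in lines:
        if line.startswith("//"):
            break  # end of the first record
        if line.startswith("ID") and id_length is None:
            match = LENGTH_RE.search(line)
            if match:
                id_length = int(match.group(1))
        elif line.startswith("SQ"):
            match = LENGTH_RE.search(line)
            if match:
                sq_length = int(match.group(1))
            in_sequence = True
        elif in_sequence:
            # Count only letters; spaces and the trailing position
            # counter on each sequence line are skipped.
            counted += sum(1 for ch in line if ch.isalpha())
    return id_length, sq_length, counted


# Small made-up record, for illustration only.
sample = [
    "ID   TEST01     standard; DNA; FUN; 25 BP.",
    "XX",
    "SQ   Sequence 25 BP; 7 A; 6 C; 6 G; 6 T; 0 other;",
    "     acgtacgtac gtacgtacgt acgta                                              25",
    "//",
]
print(embl_length_report(sample))
```

For the real file, the same function could be called on an open handle, for example embl_length_report(open("chromosome1.contig.embl")). If the three numbers disagree, that would show which line of the file carries the unexpected length.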
Can you tell me if I am doing anything wrong? I am following the
instructions given on the Bio.SeqIO wiki page.
Thanks for the help.
Sameet
--
Sameet Mehta, Ph.D.,
Phone: (301) 842-4791