[Biopython] instances

Mon Jun 29 10:55:07 EDT 2009

Hi Peter

Thanks for the reply. I certainly didn't write my own parser; I just
made use of the GenBank one in Biopython (I'm using 1.49). I started
from one of the examples Brad posted some years ago and adapted it
(some things didn't work at first, but with some tweaking it worked
fine).

I have referred to the examples in the Tutorial and Cookbook, which have
been very helpful, but I am very new to this and am still trying to
figure out where and why everything goes. Would you suggest I recode the
.py file to take advantage of SeqIO (I'm sure it wouldn't be that
difficult)? I would be most willing if it would help with this
problem.

I tried your suggestion and got the following error:

Traceback (most recent call last):
  File "/media/RESCUE/HBx_Bioinformatics/reannotate.py", line 166, in <module>
    corestart = corecur_seq[0].position
  File "/var/lib/python-support/python2.6/Bio/SeqFeature.py", line 265, in __getattr__
    raise AttributeError("Cannot evaluate attribute %s." % attr)
AttributeError: Cannot evaluate attribute position.

So I guess it doesn't have a position attribute; pressing tab gives me
__doc__, __getattr__, __init__, __module__, __repr__, __str__, _start,
_end
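
The attribute list above suggests that corecur_seq[0] is a location
object holding two position objects (_start and _end), and that the
integer lives on those rather than on the location itself. The following
is a minimal sketch of that idea, not Biopython code: PositionStandIn,
LocationStandIn, slice_by_location and the coordinates are hypothetical
stand-ins invented for illustration, and whether the real objects in
1.49 have exactly this shape is an assumption.

```python
# Hypothetical stand-ins, NOT Biopython's classes: they only mimic the shape
# suggested by the tab-completion output (a location holding _start and _end,
# each of which is a position object with an integer .position attribute).
class PositionStandIn(object):
    def __init__(self, position):
        self.position = position  # plain integer


class LocationStandIn(object):
    def __init__(self, start, end):
        self._start = PositionStandIn(start)
        self._end = PositionStandIn(end)


def slice_by_location(sequence, location):
    """Return sequence[start:end] using the integers inside the location."""
    start = location._start.position
    end = location._end.position
    return sequence[start:end]


genome = "ACGT" * 10           # made-up 40-base sequence
core = LocationStandIn(4, 12)  # made-up coordinates

# Slicing with the position objects themselves raises TypeError,
# because they are not integers.
try:
    genome[core._start:core._end]
    slicing_with_objects_failed = False
except TypeError:
    slicing_with_objects_failed = True

# Slicing with the integers stored on the position objects works.
core_seq = slice_by_location(genome, core)
print(core_seq)
```

If the real objects match this shape, corecur_seq[0]._start.position and
corecur_seq[0]._end.position would be the integers to slice with, though
relying on underscore-prefixed attributes is fragile.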

Thanks
Liam

On Mon, Jun 29, 2009 at 4:25 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On 6/29/09, Liam Thompson <dejmail at gmail.com> wrote:
>> Hi everyone
>>
>> Ok, so I managed to write a parser for GenBank files (I will post it to
>> script central once completed; it works well with single genes from
>> genomic sequences) which can search for a gene in a genomic
>> sequence and copy it out as FASTA.
>
> I hope you didn't spend time writing a whole new GenBank
> parser, Biopython already has one which works pretty well ;)
> From the rest of your email it sounds like you are actually using
> this (the Bio.GenBank module, which is also used internally
> by Bio.SeqIO).
>
>> ...
>> I then attempt to print the sequence at the given coordinates
>>
>>  if corecur_seq > 0:
>>             print "core sequence only \n"
>>             corestart = corecur_seq[0]._start
>>             coreend = corecur_seq[0]._end
>>             coreseq = corecur_seq[1]
>>             print coreseq[corestart:coreend]
>>
>> getting the following error message
>>
>> Traceback (most recent call last):
>>   File "/media/RESCUE/HBx_Bioinformatics/reannotate.py", line 171, in <module>
>>     print coreseq[corestart:coreend]
>>   File "/var/lib/python-support/python2.6/Bio/Seq.py", line 132, in __getitem__
>>     return Seq(self._data[index], self.alphabet)
>> TypeError: object cannot be interpreted as an index
>
> I would guess that corestart and coreend are NOT integers. To
> do slicing, you will need integers. Based on the later bits of your
> email you discovered they are Biopython position objects (not
> integers):
>
>> I think the error is (although I don't know, I am pretty new to python
>> and programming in biopython) with the variable type of
>> corestart and coreend, both defined as <type 'instance'> and when I
>> print them on the shell I get
>>
>> Bio.SeqFeature.ExactPosition(1900)
>>
>> Bio.SeqFeature.ExactPosition(2452)
>>
>> as an example. Do I need to convert these to integers? I have tried,
>> but I think I would need to replace or copy out the number
>> into a different variable?
>
> A position object has a position attribute you should be using
> if you just need an integer. I think (without knowing exactly
> what your code is doing) that this would work:
>
> corestart = corecur_seq[0].position
> coreend = corecur_seq[0].position
> print current_entry.seq[corestart:coreend]
>
>> Specific thanks to Peter, Andrew Dalke and Brad who posted
>> numerous examples on their pages and on the mailing lists
>> which have helped me tremendously.
>>
>> I would appreciate any comments.
>
> Be careful as lots of Andrew's examples may be out of date
> now.
>
> What version of Biopython are you using, and have you been
> looking at a recent version of the tutorial? We currently
> recommend using Bio.SeqIO to parse GenBank files, although
> it does internally use Bio.GenBank.
>
> http://biopython.org/DIST/docs/tutorial/Tutorial.html
> http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
>
> The latest version of the tutorial (included with Biopython 1.51b)
> discusses the SeqRecord and SeqFeature objects and their
> locations more prominently (they get a whole chapter now).
> Most of this section would still apply directly to older versions
> of Biopython.
>
> Peter
>
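
As a rough illustration of the Bio.SeqIO route recommended above, the
sketch below looks up a feature by its gene qualifier and returns it as
FASTA text. It assumes a recent Biopython in which SeqFeature.extract()
is available (it is not in 1.49). The function name extract_gene_fasta
and its parameters are hypothetical, and it further assumes the gene is
annotated as a CDS feature carrying a /gene qualifier.

```python
def extract_gene_fasta(genbank_path, gene_name, feature_type="CDS"):
    """Return FASTA text for the first matching feature, or None if absent.

    Hypothetical helper for illustration; requires Biopython when called.
    """
    # Imported here so the sketch can be loaded without Biopython installed.
    from Bio import SeqIO

    for record in SeqIO.parse(genbank_path, "genbank"):
        for feature in record.features:
            if feature.type != feature_type:
                continue
            # Qualifier values are lists of strings, e.g. {"gene": ["X"]}.
            if gene_name in feature.qualifiers.get("gene", []):
                # extract() applies the feature location (including strand),
                # so no manual start/end integers are needed.
                gene_seq = feature.extract(record.seq)
                return ">%s_%s\n%s\n" % (record.id, gene_name, gene_seq)
    return None
```

A call such as extract_gene_fasta("example.gb", "X"), with a made-up file
and gene name, would return the FASTA text for that gene if a matching
feature exists, which sidesteps the position-to-integer conversion
altogether.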

-- 
-----------------------------------------------------------
Antiviral Gene Therapy Research Unit
University of the Witwatersrand
Faculty of Health Sciences, Room 7Q07
7 York Road, Parktown
2193

Tel: 2711 717 2465/7
Fax: 2711 717 2395
Email: liam.thompson at students.wits.ac.za / dejmail at gmail.com