[Biopython] How to get intron/exon boundaries?

Sean Davis sdavis2 at mail.nih.gov
Mon Nov 30 00:55:03 UTC 2009


On Fri, Nov 27, 2009 at 7:18 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
> On Fri, Nov 27, 2009 at 5:03 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>> On Fri, Nov 27, 2009 at 10:56 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
>>> I'm wondering how to get intron/exon boundaries for all the genes.
>>> Could somebody show me what functions I should use?
>>
>> What do you want to know? The co-ordinates of the introns/exons,
>> or just to get the coding sequence?
>
> I want the co-ordinates.

Do you mean coordinates in genomic space or on the transcript?
Which organism?  And which annotation system do you want to
use: Ensembl, UCSC, or NCBI?

>> What kind of data are you looking at? For GenBank or EMBL
>> files this is encoded in the CDS feature locations. For GFF
>> files I think this information is given explicitly,
>
> Would you please let me know how to get the CDS feature locations from
> GenBank and EMBL? What are GFF files?
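
For GenBank and EMBL, the CDS feature location is a string such as
join(100..200,300..400), where each a..b pair is one coding exon in
1-based, inclusive coordinates. Biopython's Bio.SeqIO will parse these
for you (each CDS feature carries a location object), but a rough
stdlib-only sketch of the idea might look like the following. The
function names cds_exons and introns are made up for illustration, and
the sketch assumes simple locations only: a single optional outer
complement(...), no single-base parts, and no references to other
accessions. Note that a CDS location only covers coding exons, so UTR
exons will not show up here.

```python
import re

def cds_exons(location):
    """Return (exons, strand) for a simple GenBank/EMBL CDS location string.

    exons is a sorted list of (start, end) pairs, 1-based and inclusive.
    Assumption: strand is decided only by an outer complement(...) wrapper.
    """
    strand = -1 if location.startswith("complement(") else 1
    # Each a..b pair is one exon; "<" and ">" mark partial ends and are skipped.
    pairs = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    exons = sorted((int(a), int(b)) for a, b in pairs)
    return exons, strand

def introns(exons):
    """Introns are the gaps between consecutive exons (1-based, inclusive)."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]

exons, strand = cds_exons("join(100..200,300..400,500..650)")
print(exons, strand)
print(introns(exons))
```

The intron boundaries fall out as the positions just after one exon
ends and just before the next one starts.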

For GFF (General Feature Format, a tab-delimited annotation format),
searching Google for "GFF format" will get you a long way.
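
To give a feel for what you would find: in GFF the exons are listed
explicitly, one feature per line, so pulling out the coordinates is
mostly a matter of splitting columns. Here is a minimal sketch,
assuming GFF3-style input (nine tab-separated columns, 1-based
inclusive start/end, and a Parent=... attribute naming the
transcript). The function name exons_from_gff and the sample lines are
invented for illustration; GFF2/GTF files use a different attribute
syntax and would need adjusting.

```python
from collections import defaultdict

def exons_from_gff(lines):
    """Group exon coordinates by their Parent attribute (GFF3 assumed)."""
    exons = defaultdict(list)
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        exons[attrs.get("Parent", "unknown")].append((seqid, start, end, strand))
    # Sort each transcript's exons by start coordinate.
    return {parent: sorted(rows, key=lambda r: r[1]) for parent, rows in exons.items()}

sample = [
    "##gff-version 3",
    "chr1\texample\tgene\t100\t650\t.\t+\t.\tID=gene1",
    "chr1\texample\texon\t300\t400\t.\t+\t.\tID=e2;Parent=tx1",
    "chr1\texample\texon\t100\t200\t.\t+\t.\tID=e1;Parent=tx1",
]
print(exons_from_gff(sample))
```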

Sean


