[Biopython] How to get sequences upstream of TSS of genes?

Peng Yu pengyu.ut at gmail.com
Fri Oct 16 14:52:00 UTC 2009


On Fri, Oct 16, 2009 at 3:29 AM, Giovanni Marco Dall'Olio
<dalloliogm at gmail.com> wrote:
> On Thu, Oct 15, 2009 at 11:17 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
>> I have a set of genes. I want to get the 5 kb sequence upstream of
>> the TSS of each gene.
>
> You can do that with biomart:
> - http://www.ensembl.org/biomart/martview/a90f00892a48e04d438f762f551bf48a/a90f00892a48e04d438f762f551bf48a
>
> Select Ensembl 56 as the database and Mus musculus as the species, go to
> Filters and fill in the 'Id list limit' form with the required gene IDs,
> then go to Attributes, select Sequences, and check 'Upstream Flank -
> 5000'.
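
For reference, the 'Upstream Flank - 5000' option described above amounts
to a coordinate calculation around the TSS. Below is a minimal Python
sketch of that calculation, not BioMart's actual implementation. It
assumes Ensembl-style 1-based inclusive gene coordinates with strand given
as +1 or -1, and it takes the TSS to be the gene start on the plus strand
and the gene end on the minus strand. The names upstream_window,
upstream_sequence and chrom_seq are hypothetical, chosen only for
illustration.

```python
# Hypothetical helpers for illustration; not part of Biopython or BioMart.
# Coordinates are assumed to be 1-based and inclusive (Ensembl convention).

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def upstream_window(start, end, strand, flank=5000):
    """Return (win_start, win_end) of the region upstream of the TSS.

    On the plus strand the TSS is taken to be `start`, so the window lies
    before it. On the minus strand the TSS is taken to be `end`, so the
    window lies after it in genomic coordinates.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    if strand == 1:
        win_end = start - 1
        win_start = max(1, start - flank)  # do not run off the chromosome start
    else:
        win_start = end + 1
        win_end = end + flank
    return win_start, win_end


def upstream_sequence(chrom_seq, start, end, strand, flank=5000):
    """Slice the upstream flank out of a chromosome sequence string.

    The result is reported 5'->3' relative to the gene, so minus-strand
    flanks are reverse-complemented.
    """
    win_start, win_end = upstream_window(start, end, strand, flank)
    win_end = min(win_end, len(chrom_seq))  # do not run off the chromosome end
    seq = chrom_seq[win_start - 1:win_end]  # convert to 0-based, half-open slice
    if strand == -1:
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

The chromosome string could come from, for example, a FASTA file of the
mouse genome read with Bio.SeqIO. The gene coordinates and strand still
have to be looked up separately (for instance, exported from BioMart).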

I have gene names (for example, Krt83). Which gene ID type should I choose?

> As for doing that in Python, I am not sure whether there are Python
> interfaces to BioMart. Galaxy (http://main.g2.bx.psu.edu/) is written in
> Python, so they must have written a library for that somewhere, but I
> don't know their code.
>
> If you use R (remember that you can mix Python and R with rpy2), there
> is a nice package in Bioconductor called biomaRt.
>
>
>> I have the following specific questions. Could somebody help me? Thank you!
>>
>> Which database can I access to get the mouse genome?
>> Given a gene name, what function should I call to get the gene's location?
> --
> Giovanni Dall'Olio, phd student
> Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)
>
> My blog on bioinformatics: http://bioinfoblog.it
>