[Biopython] Fetching fasta sequences by accession number

Sat Apr 21 20:02:36 UTC 2018

Thanks a lot. I coded it up, and only after running it did I realize I was
misinformed. I don't have the accession numbers; I have gene names like
HS71L_HUMAN, CATIP_HUMAN, ZMY12_HUMAN, etc. Any suggestions?

Regards.

On Fri, Apr 20, 2018 at 2:44 PM, Iddo Friedberg <idoerg at gmail.com> wrote:

> UniProt has several APIs for programmatic access:
> http://www.uniprot.org/help/programmatic_access
> I am not sure there is a module in Biopython that uses them, but it should
> be easy to do: just use a script to retrieve this generic URL:
>
> https://www.uniprot.org/uniprot/P12345.fasta
>
> where "P12345" is replaced by whatever UniprotID you have.
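A minimal sketch of such a script, assuming the URL pattern quoted above still resolves (the live UniProt service may have moved since this message was written). The names `fasta_url`, `fetch_fasta`, and the `read_url` hook are hypothetical, introduced only for illustration:

```python
from urllib.request import urlopen

# URL pattern quoted from the message above; the live service may differ.
URL_TEMPLATE = "https://www.uniprot.org/uniprot/{}.fasta"


def fasta_url(uniprot_id):
    """Build the per-entry FASTA URL for one UniProt ID."""
    return URL_TEMPLATE.format(uniprot_id.strip())


def fetch_fasta(uniprot_ids, read_url=None):
    """Return one concatenated FASTA string for all IDs.

    read_url is a hypothetical hook (URL -> text) so the download step
    can be swapped out; by default it downloads with urllib.
    """
    if read_url is None:
        def read_url(url):
            with urlopen(url) as response:
                return response.read().decode("utf-8")
    records = []
    for uniprot_id in uniprot_ids:
        text = read_url(fasta_url(uniprot_id))
        # Make sure each record ends with a newline before concatenating.
        records.append(text if text.endswith("\n") else text + "\n")
    return "".join(records)
```

The returned string could then be written to a single file, for example `open("query.fasta", "w").write(fetch_fasta(["P12345"]))`, ready for upload.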
>
>
> Then you can upload your concatenated FASTA file to the HMMER site; I am
> not sure what their size limit is.
>
> If it's only Swiss-Prot you are interested in, and you have the disk space,
> I suggest you download it, download HMMER and whatever reference databases
> you wish to run against, and do it all locally, especially if you have a
> large number of sequences to process. Biopython can read Swiss-Prot or
> UniProt XML files via SeqIO.
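The local route can be sketched as follows, assuming an uncompressed Swiss-Prot flat file has already been downloaded. Biopython's "swiss" parser stores the entry name (for example HS71L_HUMAN) in `record.name` and the accession in `record.id`, so names like the ones at the top of this thread can be matched directly. The function names and file paths are hypothetical:

```python
def select_by_name(records, wanted_names):
    """Yield only the records whose .name is in wanted_names."""
    wanted = set(wanted_names)
    for record in records:
        if record.name in wanted:
            yield record


def extract_to_fasta(swiss_path, wanted_names, fasta_path):
    """Copy the wanted Swiss-Prot entries into a FASTA file.

    Returns the number of records written.
    """
    # Imported here so the filter above works without Biopython installed.
    from Bio import SeqIO

    records = SeqIO.parse(swiss_path, "swiss")
    return SeqIO.write(select_by_name(records, wanted_names), fasta_path, "fasta")
```

A call might look like `extract_to_fasta("uniprot_sprot.dat", ["HS71L_HUMAN", "CATIP_HUMAN"], "query.fasta")`. The file is streamed record by record, so memory use stays small even for the full database.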
>
> To download uniprot: http://www.uniprot.org/downloads
>
>
> On Fri, Apr 20, 2018 at 1:11 PM, Ahmad Abdelzaher <underoath006 at gmail.com>
> wrote:
>
>> Thank you for the reply. The accession numbers that I want to fetch the
>> FASTA sequences for are UniProt accession numbers. I want to search for
>> homologs of these sequences on HMMER: https://www.ebi.ac.uk/Tools/hmmer/
>>
>> Any suggestions on how to do so?
>>
>> On Fri, Apr 20, 2018 at 10:16 AM, Iddo Friedberg <idoerg at gmail.com>
>> wrote:
>>
>>> There is an example here for downloading multiple GenBank entries:
>>> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc131
>>>
>>> Depending on the actual database you are downloading from, you can use
>>> rettype="fasta" or convert a GenBank file to FASTA as shown here:
>>> http://biopython.org/wiki/Converting_sequence_files
>>>
>>> The possible rettype and retmode values depend on the database you are
>>> fetching from and are determined by the EFetch API. More about that here:
>>> https://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.EFetch
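For NCBI accessions, the rettype="fasta" approach might look like the sketch below. It assumes the "protein" database and an arbitrary batch size of 200; both are illustrative choices, and the helper names are hypothetical. NCBI asks for a contact email address on every Entrez request:

```python
def batches(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_ncbi_fasta(accessions, email, db="protein", batch_size=200):
    """Download FASTA text for NCBI accessions, one request per batch."""
    # Imported here so the batching helper works without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # contact address sent along with each request
    chunks = []
    for batch in batches(accessions, batch_size):
        handle = Entrez.efetch(
            db=db, id=",".join(batch), rettype="fasta", retmode="text"
        )
        try:
            chunks.append(handle.read())
        finally:
            handle.close()
    return "".join(chunks)
```

The result is plain FASTA text, so it can be saved as-is or parsed with `SeqIO.parse(..., "fasta")`.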
>>>
>>> HTH,
>>>
>>> Iddo
>>>
>>>
>>> On Fri, Apr 20, 2018 at 6:25 AM, Ahmad Abdelzaher <
>>> underoath006 at gmail.com> wrote:
>>>
>>>> How can I batch download fasta sequences by accession number? Is there
>>>> a Biopython method that can do that? Any other suggestions or alternatives?
>>>>
>>>> Regards.
>>>>
>>>> _______________________________________________
>>>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>>>> http://mailman.open-bio.org/mailman/listinfo/biopython
>>>>
>>>
>>>
>>>
>>> --
>>> Iddo Friedberg
>>> http://iddo-friedberg.net/contact.html
>>> ++++++++++[>+++>++++++>++++++++>++++++++++>+++++++++++<<<<<-]>>>>++++.>
>>> ++++++..----.<<<<++++++++++++++++++++++++++++.-----------..>>>+.-----.
>>> .>-.<<<<--.>>>++.>+++.<+++.----.-.<++++++++++++++++++.>+.>.<++.<<<+.>>
>>> >>----.<--.>++++++.<<<<------------------------------------.
>>>
>>
>>
>