[Biopython] Pulling Alignment From PSI-Blast Output

Michiel de Hoon mjldehoon at yahoo.com
Tue Feb 8 01:20:09 UTC 2011


One option is to have PSI-Blast write its output as XML and then check whether the information you need is present there. If it is, you can parse the XML with the read() function in Bio.Entrez. You may find that Bio.Entrez needs an additional DTD file before it can parse the PSI-Blast XML output (Bio.Entrez will tell you which one it needs and where to store it). If so, please let us know so that we can include the required DTDs in the next release of Biopython.
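As a rough sketch of the "check what is in the XML" step, the snippet below walks BLAST-style XML with Python's standard-library xml.etree.ElementTree rather than Bio.Entrez.read(), so it runs without any extra DTD files. The element names (Iteration, Hit, Hsp, Hsp_qseq, Hsp_hseq) follow the usual BLAST XML layout. The embedded sample document, the sequence IDs, and the helper name hsps_by_iteration are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for PSI-Blast XML output. The IDs and sequences
# are illustrative only, not real search results.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_hits>
        <Hit>
          <Hit_id>hypothetical|HIT_A</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>7</Hsp_query-to>
              <Hsp_qseq>MKT-LLVA</Hsp_qseq>
              <Hsp_hseq>MKSALLIA</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>2</Iteration_iter-num>
      <Iteration_hits>
        <Hit>
          <Hit_id>hypothetical|HIT_A</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>7</Hsp_query-to>
              <Hsp_qseq>MKT-LLVA</Hsp_qseq>
              <Hsp_hseq>MKSALLIA</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>hypothetical|HIT_B</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>2</Hsp_query-from>
              <Hsp_query-to>7</Hsp_query-to>
              <Hsp_qseq>KTLLVA</Hsp_qseq>
              <Hsp_hseq>RTLMVG</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def hsps_by_iteration(xml_text):
    """Map each PSI-Blast iteration number to a list of aligned HSP records."""
    root = ET.fromstring(xml_text)
    result = {}
    for iteration in root.iter("Iteration"):
        number = int(iteration.findtext("Iteration_iter-num"))
        rows = []
        for hit in iteration.iter("Hit"):
            hit_id = hit.findtext("Hit_id")
            for hsp in hit.iter("Hsp"):
                rows.append({
                    "hit_id": hit_id,
                    "query_from": int(hsp.findtext("Hsp_query-from")),
                    "query_to": int(hsp.findtext("Hsp_query-to")),
                    "qseq": hsp.findtext("Hsp_qseq"),  # gapped query string
                    "hseq": hsp.findtext("Hsp_hseq"),  # gapped hit string
                })
        result[number] = rows
    return result


alignments = hsps_by_iteration(SAMPLE_XML)
final_round = alignments[max(alignments)]  # hits from the last iteration
for row in final_round:
    print(row["hit_id"], row["query_from"], row["query_to"], row["hseq"])
```

Note that each HSP is only a pairwise query-to-hit alignment. Turning these into a single multiple alignment means stacking them on the query coordinates and handling any gaps inserted into the query, which this sketch does not attempt.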

--Michiel.

--- On Mon, 2/7/11, Brett Bowman <bnbowman at gmail.com> wrote:

> From: Brett Bowman <bnbowman at gmail.com>
> Subject: [Biopython] Pulling Alignment From PSI-Blast Output
> To: biopython at biopython.org
> Date: Monday, February 7, 2011, 5:30 PM
> I'm trying to use the PSI-Blast results from a series of proteins to detect
> distant homologues, using HMMs of various sorts. Currently I'm pulling down
> the sequence IDs with PSI-Blast, downloading the full sequences from NCBI,
> then aligning everything with ClustalW or Muscle. However this is eating up
> way more processor time than I have to spare, so I want to just pull the
> full multi-sequence alignment from the PSI-blast results if possible (OUTFMT
> option #3 or 4), for use in building the HMMs. But it doesn't look like
> AlignIO has a module for reading the peculiar format that PSI-Blast
> generates...
> 
> Has this been done before, or will I need to write my own parser?
> 
> Brett Bowman
> Woelk Lab
> UCSD School of Medicine