[Bioperl-l] trying to save blast hit sequences to fasta file
Xianran Li
xianranli78 at yahoo.com.cn
Thu Aug 2 08:56:04 UTC 2007
----- Original Message -----
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>; <bioperl-l at bioperl.org>
Cc: <jay at jays.net>
Sent: Thursday, August 02, 2007 3:06 PM
Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file
> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string?
The hit_string() returns the aligned sequences of the subject in your database and the query_string() is the aligned sequences of the query. These two things will be the same unless there are some mutations and or gaps within the alignment.
>
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?
Did you specify the $id before
my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
If you didn't, then all the sequences retrieved will get the same id. The following is a simply way to avoid this problem.
my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>"fasta");
my $i;
while(my $result = $blast_report->next_result()) {
while(my $hit = $result->next_hit()) {
while(my $hsp = $hit->next_hsp()) {
$i ++;
my $hseq = $hsp->hit_string();
$hseq =~ s/-//g; #### remove the gap within the aligment
my $id = $i; ###### specifiy the id
my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
# $seq_out->write_seq($hseq);
$seq_out->write_result($hseq_obj);
}
}
}
Xianran
>
> Thank you very much for your help.
> Alicia
>
> > Alicia,
> >
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > > my $hseq = $hsp->hit_string();
> > > # $seq_out->write_seq($hseq);
> > > $seq_out->write_result($hseq);
> >
> > You have encountered two common problems for BioPerl beginners:
> >
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> >
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> >
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> >
> > --
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> >
> >
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-lÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿá¶Úÿÿ÷'þf¢—üš†Šÿ
More information about the Bioperl-l
mailing list