[Bioperl-l] Getting sequences by ID

Yuval Itan y.itan at ucl.ac.uk
Thu Apr 6 15:59:06 UTC 2006


Thanks a lot for your help, guys. Problem solved, although I did it in a rather
inelegant way, mixing Bioperl with too much I/O:

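use strict;
use warnings;
use Bio::SeqIO;

# large FASTA file to read from
my $seqio_obj = Bio::SeqIO->new(-file   => "/home/Yuval/unproc_pseudo.build35.cdna",
                                -format => "fasta");

# file to write into
my $seq_out = Bio::SeqIO->new(-file   => ">/home/Yuval/unproc_pseudo_truncated.build35.cdna",
                              -format => "fasta");

my $id_file = "/home/Yuval/Pseudo_human35like_trunctuated_IDs.txt"; # truncated IDs

while (my $seq_obj = $seqio_obj->next_seq) # reading the ID of each gene in the big file
{
    my $seq_id = $seq_obj->display_id;
    open(my $ids, "<", $id_file) or die "Cannot open $id_file: $!";
    while (my $wanted = <$ids>) # reading wanted IDs
    {
        chomp $wanted;              # drop the newline so it cannot spoil the match
        next unless length $wanted; # skip blank lines
        # \Q...\E treats the IDs as literal text rather than as regex patterns
        if ($wanted =~ /\Q$seq_id\E/ || $seq_id =~ /\Q$wanted\E/) # ID match (substring)
        {
            $seq_out->write_seq($seq_obj); # writing the truncated FASTA sequence
            last;                          # write each sequence only once
        }
    }
    close($ids);
}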
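
The loop above reopens the ID list once for every sequence in the big file. A less I/O-heavy variant might read the wanted IDs into a hash first and then make a single pass over the FASTA file. The following is only a sketch under some assumptions: the ID file holds one ID per line, each ID equals the sequence's display_id exactly (no substring matching), and the helper names read_wanted_ids and filter_fasta as well as the command-line arguments are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: read one ID per line into a hash for fast lookup.
sub read_wanted_ids {
    my ($path) = @_;
    my %wanted;
    open(my $fh, "<", $path) or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;   # trim whitespace, including the newline
        next unless length $line;  # skip blank lines
        $wanted{$line} = 1;
    }
    close($fh);
    return \%wanted;
}

# Hypothetical helper: copy only the wanted sequences, in a single pass.
sub filter_fasta {
    my ($in_path, $out_path, $wanted) = @_;
    my $in  = Bio::SeqIO->new(-file => $in_path,     -format => "fasta");
    my $out = Bio::SeqIO->new(-file => ">$out_path", -format => "fasta");
    my $written = 0;
    while (my $seq = $in->next_seq) {
        # Assumption: the ID must equal display_id exactly
        next unless $wanted->{ $seq->display_id };
        $out->write_seq($seq);
        $written++;
    }
    return $written;
}

# Usage (hypothetical): perl filter_fasta.pl ids.txt big.fasta subset.fasta
if (@ARGV == 3) {
    my ($ids_path, $in_path, $out_path) = @ARGV;
    my $count = filter_fasta($in_path, $out_path, read_wanted_ids($ids_path));
    print "Wrote $count sequences to $out_path\n";
}
```

If the IDs in the list are only fragments of the FASTA headers, the exact-match test would need to be adapted. For repeated lookups in the same large file, an index built with Bio::DB::Fasta and queried with get_Seq_by_id is another option.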

Cheers,

Yuval

On Thursday 06 April 2006 14:04, Brian Osborne wrote:
> Yuval,
>
> See:
>
> http://www.bioperl.org/wiki/HOWTO:Beginners#Indexing_for_Fast_Retrieval
>
> Also see:
>
> http://www.bioperl.org/wiki/Bioperl_scripts
>
>
> Brian O.
>
> On 4/5/06 6:00 PM, "Yuval Itan" <y.itan at ucl.ac.uk> wrote:
> > Hi Torsten,
> >
> > I would be grateful for your advice regarding Bioperl, after fiddling
> > around trying to write a Perl script for this from scratch.
> > I have a large fasta file of about 20,000 genes, and another file which
> > is a list of about 2,000 gene IDs (no sequences), all included in the
> > large file. I need to create a fasta file which will include only the
> > genes with these specific 200 IDs. I was wondering if there is a method
> > in Bioperl that will allow me to do the following pseudocode:
> >
> > For each $ID from 200_IDs_set_file
> > {
> > $my_seq = get_sequence_by_ID(from large_fasta_file, $ID)
> > write $my_seq into file
> > }
> >
> > Many thanks for any hint!
> >
> > Yuval
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l


