[Bioperl-l] piping values into an existing GENBANK file

Jason Stajich jason.stajich at gmail.com
Mon May 7 13:38:15 EDT 2012


Your question is unclear; it would help if you showed what you want the output to look like.

Are you trying to conditionally add COG info to only certain CDSes? If so, you need a hash or other dataset that defines which CDSes should get values. Then interrogate each CDS feature to get its name, for example ($locus) = $feat->get_tag_values('locus_tag'), and use that info to determine which features you will update.
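A minimal sketch of that lookup approach follows. The locus tags in %cog_for and the sub name annotate_cds_by_locus are made up for illustration; only primary_tag, has_tag, get_tag_values and add_tag_value are BioPerl feature methods. The has_tag check is there because get_tag_values throws an exception when the tag is missing.

```perl
use strict;
use warnings;

# Hypothetical lookup table: locus_tag => COG category.
# These locus tags are invented for illustration only.
my %cog_for = (
    ABC_0001 => 'motility',
    ABC_0003 => 'metabolism',
);

# Add a "note" tag to every CDS whose locus_tag appears in the lookup hash.
# $features is an array ref of feature objects, e.g. [ $seq->get_SeqFeatures ].
# Returns the number of features that were updated.
sub annotate_cds_by_locus {
    my ($features, $cog_for) = @_;
    my $updated = 0;
    for my $feat (@$features) {
        next unless $feat->primary_tag eq 'CDS';     # skip gene, source, etc.
        next unless $feat->has_tag('locus_tag');     # avoid an exception on missing tag
        my ($locus) = $feat->get_tag_values('locus_tag');
        next unless exists $cog_for->{$locus};       # not in our dataset
        $feat->add_tag_value('note', $cog_for->{$locus});
        $updated++;
    }
    return $updated;
}
```

With a sequence object read through Bio::SeqIO, the call would look like annotate_cds_by_locus([ $seq_object->get_SeqFeatures ], \%cog_for).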

If you want to write it back out as GenBank, just initialize another SeqIO object that writes GenBank format and pass the sequence object to it. Since the feature objects are updated in place, they will be written out with the new info attached to the sequence.
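For the case described in the original question, where the Nth value belongs to the Nth CDS, a rough sketch of the whole round trip might look like the following. The sub name add_values_in_order, the command-line handling and the placeholder values are assumptions for illustration, and only the first record of the input file is processed, as in the original script. Bio::SeqIO is loaded inside the script branch so the helper can also be used on its own.

```perl
use strict;
use warnings;

# Pair the Nth CDS feature with the Nth element of @$values and add it as a
# "note" tag. Non-CDS features (gene, source, ...) are skipped and do not
# consume a value. Returns the number of CDS features that received a value.
sub add_values_in_order {
    my ($seq, $values) = @_;
    my $i = 0;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        last if $i >= @$values;                      # ran out of values
        $feat->add_tag_value('note', $values->[$i++]);
    }
    return $i;
}

# Script usage (hypothetical file names): perl annotate.pl in.gbk out.gbk
if (@ARGV == 2) {
    require Bio::SeqIO;
    my ($infile, $outfile) = @ARGV;
    my @COGlist = qw(motility General metabolism unknown);   # placeholder values

    my $in  = Bio::SeqIO->new(-file => $infile,     -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'genbank');

    my $seq = $in->next_seq;                 # first record only
    add_values_in_order($seq, \@COGlist);    # features are updated in place
    $out->write_seq($seq);                   # new tags are written with the record
}
```

A single counter replaces the inner loop over the array, so each CDS gets exactly one value. The leading ">" in the -file argument is how SeqIO is told to open the file for writing.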

jason
On Apr 21, 2012, at 2:22 AM, Alavi, Mohammadali (0313xxx) wrote:

> Hello All,
> I already have a GENBANK file to which I need to add some features. To be precise, I want to add data (the COG function) to the CDSs present in the GENBANK file. The data (COG functions) I need to add is stored in an array such that the first value belongs to the first CDS in the GENBANK file, the second value to the second CDS, and so on. I tried adding the data in tag/value style to the CDSs (as described in the HOWTO:Feature-Annotation provided by BioPerl), which basically works. The problem is that I do not know how to tell Perl/BioPerl to take one single value at a time and add it in tag/value style to a CDS, then take the next (and only the next) value and add it to the NEXT CDS, and so on. Here is the code I used. As you see, using for $item(@array) is not appropriate, since it adds all the values of my array to all CDSs!
> So is there a way of piping in values one after another into CDSs one after another in a file using Bioperl?! or maybe how about another way of doing it in regular Perl? I would appreciate any help on that very much!
> 
> 
> Bioperl I'm using: 1.6.1
> The Active Perl I'm using : 5.12.4 (on Windows Vista)
> 
> 
> #!/bin/perl
> use Bio::SeqIO;
> use Bio::SeqFeature::Generic;
> use warnings;
> 
> # Think of this as the array whose values I would like to add to my file; the real one
> # has, of course, as many values as the number of CDSs in the GENBANK file.
> @COGlist = qw(motility General metabolism nunknown);
> 
> 
> 
> $seqio_object = Bio::SeqIO -> new(-file => "file.gbk", -format => "genbank");
> $seq_object = $seqio_object -> next_seq;
> for $feat_object ($seq_object -> get_SeqFeatures) {
>     for $item (@COGlist) { # this would add all elements of the array to all of CDSs and is therefore wrong!
>         $feat_object -> add_tag_value("note", $item);
>     }
>
>     for $tags ($feat_object -> get_all_tags) {
>         print "tag:" . $tags . "\n";
>         for $values ($feat_object -> get_tag_values($tags)) {
>             print "value: " . $values . "\n";   # as one might imagine this does not give the output I have been looking for :-))
>         }
>     }
> }
> 

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
