[Bioperl-l] Two 'host' tags?

Jason Stajich jason.stajich at gmail.com
Fri Jul 6 19:39:52 UTC 2012


Hi Jeremy -

You are printing once for every feature in the loop (e.g. both the source and the misc_RNA features), which is why each sequence gets two identifier lines. You want to loop through the features, pick out the one whose primary tag is 'source', and only change or print the info when you see that one. So you could put an if( $feature->primary_tag eq 'source' ) in there, or something similar. Alternatively, I've left your code pretty much intact and just simplified it a bit.
You should also try to use Bio::SeqIO to write the output instead of printing it yourself.
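
As a rough illustration of the two suggestions above (this is only a sketch, not the code in the gist), something along these lines should work. The helper name host_of and the command-line file names are made up for the example; the Bio::SeqIO and feature calls are the standard BioPerl ones.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: return the first 'host' value found on a
# 'source' feature of the sequence, or undef if there is none.
sub host_of {
    my ($seq) = @_;
    for my $feature ( $seq->get_SeqFeatures ) {
        # Ignore everything that is not the source feature (e.g. misc_RNA).
        next unless $feature->primary_tag eq 'source';
        next unless $feature->has_tag('host');
        my ($host) = $feature->get_tag_values('host');
        return $host;
    }
    return;
}

# Assumed usage: perl host2fasta.pl input.gb output.fa
# (script and file names are placeholders)
if ( @ARGV == 2 ) {
    my ( $gb_file, $fa_file ) = @ARGV;
    my $in  = Bio::SeqIO->new( -file => $gb_file,    -format => 'genbank' );
    my $out = Bio::SeqIO->new( -file => ">$fa_file", -format => 'fasta' );
    while ( my $seq = $in->next_seq ) {
        my $host = host_of($seq);
        next unless defined $host;    # skip records with no host tag
        $seq->display_id($host);      # host becomes the FASTA identifier
        $out->write_seq($seq);        # exactly one record per sequence
    }
}
```

Because the write happens once per sequence rather than once per feature, the second, empty identifier line should not appear.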

I updated the code at the link below to be simpler. Right now it warns you that you are printing IDs that contain spaces. That is something you should think about for your output file, since many tools treat only the text up to the first space as the ID, but I don't know your downstream plans. You could also put other info in the description field if you wanted to capture the accession number or the endophyte name too.
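
If the spaces do turn out to be a problem downstream, one possible workaround (just a sketch; the helper name fasta_safe_id and the sample host name are invented for illustration) is to replace whitespace with underscores before setting the ID:

```perl
use strict;
use warnings;

# Hypothetical helper: trim a host name and turn runs of whitespace
# into single underscores so it can serve as a FASTA identifier.
sub fasta_safe_id {
    my ($name) = @_;
    ( my $id = $name ) =~ s/^\s+|\s+$//g;    # strip leading/trailing space
    $id =~ s/\s+/_/g;                        # inner whitespace -> underscore
    return $id;
}

# Made-up host name, for illustration only.
print fasta_safe_id('Festuca rubra'), "\n";    # prints "Festuca_rubra"
```

With a Bio::Seq object you could then call $seq->display_id( fasta_safe_id($host) ) and keep the extra information with $seq->desc( $seq->accession_number ), so the accession ends up in the description part of the header line.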

https://gist.github.com/3062285

Best,
Jason

On Jul 6, 2012, at 10:56 AM, Jeremy Hayward wrote:

> Hi--  Clueless newbie here, for which apologies.
> 
> I've posted a description of my problem, inputs and outputs, at Gist
> 2816510; https://gist.github.com/2816510
> 
> Briefly, I'm trying to take a genbank file (.gb), and create a FASTA
> file with a specific identifier line for each sequence. Specifically,
> I want the "host" tag as the identifier. With the help of the Bioperl
> beginner readme and the HOWTO's (which are great!) I've worked out how
> to loop through my sequences and get the 'host' tag for each one. For
> some reason, I get two identifier lines for each sequence. I guess the
> problem is in the 'for' loop--it's running the stuff below it twice,
> once with the actual 'host' tag data and once with...nothing? Not
> sure.
> 
> I think I can work out how to use s/ and a regex just to delete the
> second identifier line, but that feels like I'm avoiding the problem
> instead of fixing it. Any help appreciated!
> 
> 
> Many thanks,
> 
> --Jeremy Hayward
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
