[Bioperl-l] Parsing Genbank
Mark A. Jensen
maj at fortinbras.us
Wed Dec 2 20:52:28 UTC 2009
Yes, 1.006 is 1.6. There is a later update 1.6.1, but it sounds
as if there is a bug. If you can provide data that can reproduce
it, as Chris suggests, we can get onto it.
thanks MAJ
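The mapping between 1.006 and 1.6 follows Perl's usual convention of reading a decimal version in three-digit groups. The sketch below illustrates this with the core version module. The helper name dotted_version is made up for illustration, and the value 1.006001 for the 1.6.1 release is an assumption rather than something stated in this thread.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: convert a decimal Perl version string into its
# dotted ("normal") form using the core version module.
sub dotted_version {
    my ($decimal) = @_;
    return version->parse($decimal)->normal;
}

# Pass the versions as strings so no floating-point rounding is involved.
print dotted_version('1.006'),    "\n";   # v1.6.0
print dotted_version('1.006001'), "\n";   # v1.6.1 (assumed value for 1.6.1)
```

Passing the number as a string keeps trailing zeros intact, which matters for values such as 1.006000.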
----- Original Message -----
From: Brandi Cantarel
To: Mark A. Jensen
Sent: Wednesday, December 02, 2009 3:38 PM
Subject: Re: [Bioperl-l] Parsing Genbank
How can I tell what version I am using? When I use the command from the website:
perl -MBio::Root::Version -e 'printf "%vd\n", $Bio::Root::Version::VERSION'
I get 1.006, but the BioPerl lib was updated in July, so it is probably version 1.6.0, since that was the last stable release.
Brandi
On Dec 2, 2009, at 2:48 PM, Mark A. Jensen wrote:
With fake seq data and that header, I don't get a problem:
  DB<2> x $cds->location
0  Bio::Location::Simple=HASH(0x37b1df4)
   '_end' => 974
   '_location_type' => 'EXACT'
   '_root_verbose' => 0
   '_seqid' => 'subjpool12_contig3'
   '_start' => 911
   '_strand' => '-1'
Are you using the latest BioPerl (1.6.1 or the trunk)?
MAJ
----- Original Message -----
From: "Brandi Cantarel" <bcantarel at som.umaryland.edu>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, December 02, 2009 2:29 PM
Subject: Re: [Bioperl-l] Parsing Genbank
Here is some of my code; the real code actually enters the data into a database.
use Bio::SeqIO;
my $in = Bio::SeqIO->new(-file   => $gbkfile,
                         -format => 'genbank');
W1: while (my $seq = $in->next_seq()) {
    my @feats = $seq->get_all_SeqFeatures();
    my $j = 0;
    F1: foreach my $cds (@feats) {
        # skip everything that is not a CDS feature
        next F1 unless ($cds->primary_tag() eq 'CDS');
        ###>> debugger stops here for above output
        # do something with the cds start and cds end
    }
}
LOCUS       subjpool12_contig3       974 bp    DNA     linear   UNK 19-Nov-2009
ACCESSION   subjpool12_contig3
KEYWORDS    .
SOURCE      human metagenome
  ORGANISM  human metagenome
            unclassified sequences; organismal metagenomes,metagenomes.
FEATURES             Location/Qualifiers
     source          1..974
                     /mol_type="genomic DNA"
                     /isolation_source="Homo sapiens"
                     /organism="human metagenome"
                     /collection_date="19-Nov-2009"
     CDS             complement(911..974)
                     /locus_tag="subjpool12_contig3|metagene|gene_2"
                     /translation="IRIMTVELINPYIRHVEHST"
                     /score="2.52804"
                     /product="hypothetical protein"
                     /note="score=2.52804"
                     /note="score=2.52804"
                     /note="frame=1"
ORIGIN
#some sequence….
From this example, I would like to get the coordinates 911 and 974, rather than 1 and 64.
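For reference, the expected values can be read straight off the location string in the CDS line. The following is only a minimal sketch in plain Perl; parse_simple_location is a hypothetical helper, not BioPerl's own parser, and it assumes a single contiguous range with exact ends (no join(), no < or > markers, no remote references).

```perl
use strict;
use warnings;

# Hypothetical helper: parse a simple GenBank location such as
# "911..974" or "complement(911..974)" into (start, end, strand).
# It dies on anything more complex than a single exact range.
sub parse_simple_location {
    my ($loc) = @_;
    my $strand = 1;
    if ($loc =~ /^complement\((.+)\)$/) {
        $strand = -1;      # feature lies on the minus strand
        $loc    = $1;      # keep only the inner range
    }
    my ($start, $end) = $loc =~ /^(\d+)\.\.(\d+)$/
        or die "Unsupported location: $loc\n";
    return ($start, $end, $strand);
}

my ($start, $end, $strand) = parse_simple_location('complement(911..974)');
print "start=$start end=$end strand=$strand\n";   # start=911 end=974 strand=-1
```

These are the same start, end, and strand values that appear in the debugger output for $cds->location quoted earlier in the thread.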
~~~~~~~~~~~~~~~~~~~~
Brandi Cantarel, PhD
Bioinformatics Analyst
Institute for Genome Sciences
School of Medicine
University of Maryland, Baltimore
On Dec 2, 2009, at 2:09 PM, Mark A. Jensen wrote:
Hi Brandi-
If $cds is a Bio::SeqFeature::Generic, that's weird (I believe); if it's an ordinary Bio::Seq, that's normal.
Can you elaborate by posting your code?
cheers,
MAJ
----- Original Message -----
From: "Brandi Cantarel" <bcantarel at som.umaryland.edu>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, December 02, 2009 1:36 PM
Subject: [Bioperl-l] Parsing Genbank
Hi all,
I am not sure if this is normal, but when I use Bio::SeqIO to parse GenBank files, it changes the coordinates of features on the minus strand.
For example, I have a sequence with a CDS on the minus strand that runs from 911 to 974. The sequence is 974 nt long.
x $cds->start
1
x $cds->end
64
How can I get the original coordinates? Is there a command for that or will I have to just do the math?
Feature or Bug?
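If it does come down to doing the math, the conversion is short. This is a minimal sketch, assuming the reported 1..64 is a 1-based inclusive interval counted from the other end of the 974 nt sequence; revcomp_to_forward is a hypothetical helper name, not a BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical helper: map a 1-based inclusive interval given relative to
# the reverse complement back to forward-strand coordinates.
sub revcomp_to_forward {
    my ($start, $end, $seq_length) = @_;
    # The interval flips end for end, so the old end becomes the new start.
    return ($seq_length - $end + 1, $seq_length - $start + 1);
}

my ($fwd_start, $fwd_end) = revcomp_to_forward(1, 64, 974);
print "$fwd_start..$fwd_end\n";   # 911..974
```

Applying the same function twice returns the original interval, which is a quick sanity check on the arithmetic.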
~~~~~~~~~~~~~~~~~~~~
Brandi Cantarel, PhD
Bioinformatics Analyst
Institute for Genome Sciences
School of Medicine
University of Maryland, Baltimore
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l