[Bioperl-l] Fwd: problem with bioperl (where's the Mus?)
Anand C. Patel
acpatel at usa.net
Sun Aug 23 01:13:14 UTC 2009
Turns out that using the default namespace bioperl doesn't change
anything.
The common name for "house mouse" is still stored with name_class
"genbank common name" in the taxon_name table, whereas I think the
module is looking for "common name".
It's not behaving differently despite reloading the sequences.
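To illustrate the suspected mismatch, here is a minimal sketch in plain Perl. It does not touch the database or any Bioperl-db code; the two rows are illustrative data shaped like taxon_name entries, and first_name_of_class is a hypothetical helper, not a Bioperl function.

```perl
use strict;
use warnings;

# Illustrative rows shaped like BioSQL taxon_name entries (name, name_class).
# Nothing here is read from a real database.
my @taxon_names = (
    { name => 'Mus musculus', name_class => 'scientific name' },
    { name => 'house mouse',  name_class => 'genbank common name' },
);

# Hypothetical helper: return the first name whose class is in the
# accepted list, or undef if none matches.
sub first_name_of_class {
    my ($rows, @accepted) = @_;
    my %ok = map { $_ => 1 } @accepted;
    for my $row (@$rows) {
        return $row->{name} if $ok{ $row->{name_class} };
    }
    return;
}

# A lookup that accepts only 'common name' finds nothing for these rows;
# one that also accepts 'genbank common name' finds 'house mouse'.
my $strict  = first_name_of_class(\@taxon_names, 'common name');
my $lenient = first_name_of_class(\@taxon_names,
                                  'common name', 'genbank common name');

print 'strict:  ', (defined $strict ? $strict : '(none)'), "\n";
print 'lenient: ', $lenient, "\n";
```

Whether the module really filters on 'common name' alone is my guess, not something I have confirmed in its source.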
I've created a horrible munge that fixes it for cosmetic purposes:
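my $species = $seq->species;
my $justspecies = $species->scientific_name();
my $binspecies = $species->binomial();
my $gbstring2 = $gbstring;
# pass 1: binomial -> scientific_name string; pass 2: the reverse, so
# every occurrence ends up as the binomial (\Q..\E quotes regex chars)
$gbstring2 =~ s/\Q$binspecies\E/$justspecies/g;
$gbstring2 =~ s/\Q$justspecies\E/$binspecies/g;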
But this does not strike me as a long term solution.
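For anyone who wants to try the workaround without a database, the two-pass substitution can be sketched as a standalone function. restore_binomial is a hypothetical name, and the short and full strings are passed in as plain text rather than taken from a species object.

```perl
use strict;
use warnings;

# Hypothetical helper: make every occurrence of $short read as $full.
sub restore_binomial {
    my ($text, $short, $full) = @_;
    # Collapse existing full names first so the genus is not doubled.
    $text =~ s/\Q$full\E/$short/g;
    # Then expand every occurrence of the short name.
    $text =~ s/\Q$short\E/$full/g;
    return $text;
}

my $record = "SOURCE musculus\n"
           . "ORGANISM musculus\n"
           . "DEFINITION Mus musculus chloride channel\n";

print restore_binomial($record, 'musculus', 'Mus musculus');
```

This rewrites any text that contains the short name, including text inside longer words, which is part of why it is only cosmetic.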
Thanks,
Anand
On Aug 22, 2009, at 6:21 PM, Hilmar Lapp wrote:
>
> On Aug 22, 2009, at 6:44 PM, Anand C. Patel wrote:
>
>> [...]
>> I think I know what's broken. Using load_seqdatabases.pl, I'd put
>> a set of sequences from genbank into a biosql db in mysql.
>>
>> I'd also loaded the ncbi taxonomy using the load_ncbi_taxonomy.pl
>> script from biosql.
>
> Did you load the NCBI taxonomy first, or afterwards?
>
>>
>> When I searched for house (as in house mouse), I found that the
>> name of the type of taxon class was "genbank common name".
>>
>> When I searched for musculus, it does appear as a type of
>> "scientific name".
>
> It is the 'scientific name' class names that Bioperl-db will put onto
> the lineage array.
>
>> [...]
>> I'm not just getting warnings. I'm getting errors. Tons of them.
>> It's a wonder it's working at all.
>
> I'm not sure what you're referring to, but what you pasted into your
> email were neither errors nor warnings but a debugging log (and what
> it prints looks like it's working fine). You triggered that by
> setting -verbose to a value greater than 0. If you don't want
> debugging output, then you can just leave off that argument (no
> debugging output is the default).
>
>>
>> I started with the getentry.cgi script in the cgi-bin folder, and
>> stripped most of it away.
>
> I see - which reminds me that I need to look at that script; I'm
> afraid it hasn't been updated for a long time (that doesn't mean
> though that it can't work - the core API has been stable for years).
>
>>
>> Code:
>> #!/usr/bin/perl
>>
>> [...]
>> if( $@ || !defined $seq) {
>> print "Got fetch exception of...\n<pre>$@\n</pre>";
>> exit(0);
>> }
>
> Wouldn't you want to put that right after the eval() clause?
>
> -hilmar
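
The point about checking right after the eval can be sketched as follows. This is a minimal illustration, not the getentry.cgi code: fetch_seq is a hypothetical stand-in for the database fetch, and try_fetch is a hypothetical wrapper.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database fetch; it dies for any
# accession other than the one hard-coded here.
sub fetch_seq {
    my ($acc) = @_;
    die "no such accession: $acc\n" unless $acc eq 'NM_017474';
    return { accession => $acc };
}

# Run the fetch inside eval and inspect $@ immediately afterwards,
# before any other statement has a chance to reset it.
sub try_fetch {
    my ($acc) = @_;
    my $seq = eval { fetch_seq($acc) };
    if ($@ || !defined $seq) {
        return (undef, $@ || "fetch returned nothing\n");
    }
    return ($seq, undef);
}

my ($seq, $err) = try_fetch('XX_000000');
print defined $seq
    ? "got $seq->{accession}\n"
    : "Got fetch exception of...\n$err";
```

Keeping the check adjacent to the eval matters because any later eval, including one inside a called module, can overwrite $@.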
>
>>
>>
>>>
>>> On Aug 22, 2009, at 4:17 PM, Chris Fields wrote:
>>>
>>>> Anand,
>>>>
>>>> You should always post emails to the bioperl-l mailing list,
>>>> never to individual developers (you'll get an answer much
>>>> faster). Keep responses on the list as well.
>>>>
>>>> Though I use bioperl-db some, I'm probably not the best person to
>>>> ask. Does anyone know what's going on with this? Does this have
>>>> to do with the Species/Taxon refactoring?
>>>>
>>>> chris
>>>>
>>>> Begin forwarded message:
>>>>
>>>>> From: "Anand C. Patel" <acpatel at gmail.com>
>>>>> Date: August 22, 2009 2:57:42 PM CDT
>>>>> To: cjfields at illinois.edu
>>>>> Subject: problem with bioperl (where's the Mus?)
>>>>>
>>>>> Dr. Fields,
>>>>>
>>>>> I'm struggling with what seems to be a strange quirk in Bioperl
>>>>> +/- Bioperl-db/BioSQL.
>>>>>
>>>>> I've successfully loaded in genbank sequences into a biosql
>>>>> database.
>>>>>
>>>>> When I try to write a genbank sequence back out, a curious thing
>>>>> happens -- the Genus is missing from the SOURCE and ORGANISM
>>>>> areas.
>>>>>
>>>>> Despite reporting:
>>>>> primary tag: source
>>>>> tag: chromosome
>>>>> value: 3
>>>>>
>>>>> tag: db_xref
>>>>> value: taxon:10090
>>>>>
>>>>> tag: map
>>>>> value: 3 74.5 cM
>>>>>
>>>>> tag: mol_type
>>>>> value: mRNA
>>>>>
>>>>> tag: organism
>>>>> value: Mus musculus
>>>>> The sequence when printed out via SeqIO looks like this:
>>>>> LOCUS NM_017474 2935 bp dna linear ROD 13-AUG-2009
>>>>> DEFINITION Mus musculus chloride channel calcium activated 3
>>>>> (Clca3), mRNA.
>>>>> ACCESSION NM_017474 XM_978159
>>>>> VERSION NM_017474.2 GI:255918210
>>>>> KEYWORDS .
>>>>> SOURCE musculus
>>>>> ORGANISM musculus
>>>>> Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria;
>>>>> Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata;
>>>>> Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda;
>>>>> Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Glires;
>>>>> Rodentia; Sciurognathi; Muroidea; Muridae; Murinae; Mus.
>>>>> Confession -- I have a final project due Monday wherein I boldly
>>>>> elected to interface Bioperl, MySQL, Perl, and CGI.
>>>>> (I'm an MD getting my MS in Bioinformatics.)
>>>>> After many misadventures, I'm getting to the point where I could
>>>>> actually complete the objectives, but this bug is rather
>>>>> problematic.
>>>>> Thanks,
>>>>> Anand
>>>>> Anand C. Patel, MD
>>>>> Assistant Professor of Pediatrics
>>>>> Division of Allergy/Pulmonary Medicine
>>>>> Department of Pediatrics
>>>>> Washington University School of Medicine
>>>>> 660 South Euclid Ave, Campus Box 8052
>>>>> St. Louis, MO 63110
>>>>> acpatel at wustl.edu
>>>>> acpatel at gmail.com
>>>>> acpatel at jhu.edu
>>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>
>