[Bioperl-l] Mutation IO

Chris Fields cjfields at uiuc.edu
Thu Jan 18 15:56:42 UTC 2007


I haven't dabbled with the Mutation/Variation stuff, but couldn't one
use a reference sequence (as Heikki suggests) and then use
SeqFeatures for the alleles?  You could tag each seqfeature with the
allele name for downstream work.  You could then add a SeqIO writer
(Jason's suggestion), or just add a helper sub to Bio::SeqUtils that
converts the variation data in a Bio::SeqI into the string you
want, based on the Seq object and the allele(s) you specify.
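
A rough sketch of what such a helper could look like, in plain Perl so it
runs without BioPerl installed. Everything here is an assumption for
illustration: render_genotype is a hypothetical name (nothing like it exists
in Bio::SeqUtils as far as I know), and the plain hashes stand in for
SeqFeatures carrying 1-based start/end coordinates and the two alleles of one
individual.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SeqUtils. Each variation is a plain
# hash standing in for a SeqFeature: 1-based inclusive start/end on the
# reference plus the two alleles observed in one individual.
sub render_genotype {
    my ($ref, @variations) = @_;
    my $out = '';
    my $pos = 1;    # next reference position not yet written
    for my $var (sort { $a->{start} <=> $b->{start} } @variations) {
        die "overlapping variations at $var->{start}\n" if $var->{start} < $pos;
        # copy the unchanged reference up to the variation
        $out .= substr($ref, $pos - 1, $var->{start} - $pos);
        # an empty allele means the base is missing, written as '_'
        my @alleles = map { length($_) ? $_ : '_' } @{ $var->{alleles} };
        $out .= '[' . join('/', @alleles) . ']';
        $pos = $var->{end} + 1;
    }
    # append whatever reference sequence follows the last variation
    return $out . substr($ref, $pos - 1);
}

my $text = render_genotype(
    'acgtacgt',
    { start => 2, end => 2, alleles => ['c', 't'] },   # heterozygous SNP
    { start => 5, end => 5, alleles => ['a', ''] },    # one copy deleted
);
print "$text\n";    # prints a[c/t]gt[a/_]cgt
```

With real SeqFeatures the start, end and allele values would come from the
feature's location and tags instead of hash keys; the string assembly would
stay the same.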

While working on Location stuff, I noticed this is how variations are  
represented in normal GenBank files, using the primary feature tag of  
'variation' or 'misc_difference' (I think there are a few others):

http://tinyurl.com/22coeq

Using SeqFeatures also allows for deletions/insertions:

http://tinyurl.com/2e2egw

http://tinyurl.com/277a6g
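
For the deletion/insertion case, the location string of the feature already
says which kind of change it is: as far as I know, a single position or a
range is a site that gets replaced or deleted, while a between-base location
such as 1020^1021 marks an insertion point. A minimal sketch of telling them
apart, assuming only these three simple forms (parse_simple_location is a
hypothetical name; for real records use the Bio::Location classes, which also
handle joins, complements and fuzzy ends):

```perl
use strict;
use warnings;

# Hypothetical parser for three simple GenBank location forms only.
sub parse_simple_location {
    my ($loc) = @_;
    return { start => $1, end => $1, kind => 'point'   } if $loc =~ /^(\d+)$/;
    return { start => $1, end => $2, kind => 'range'   } if $loc =~ /^(\d+)\.\.(\d+)$/;
    return { start => $1, end => $2, kind => 'between' } if $loc =~ /^(\d+)\^(\d+)$/;
    die "unsupported location: $loc\n";
}

for my $loc ('1020', '1020..1023', '1020^1021') {
    my $p = parse_simple_location($loc);
    print "$loc => $p->{kind} ($p->{start}-$p->{end})\n";
}
```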


chris

On Jan 18, 2007, at 1:39 AM, Heikki Lehvaslaiho wrote:

> Marian,
>
> Do not try to cram too much into one class. BIC format is apparently a
> useful shorthand for some cases, but representing it in memory with
> objects in an extensible way is another thing.
>
> Your example below describes an individual's diploid genotype. Putting
> that into one sequence object is not a good idea. The way to model it is
> to have a reference sequence, then define an individual that carries that
> sequence in a diploid or haploid (sex chromosomes) setting, and list the
> alleles that person has in the reference sequence coordinate system. You
> might also want to separate the alleles by chromosome.
>
> Representing, reporting and modelling genotype information is something
> that has been of interest to me and a group of other people for some
> time. An early draft of a web site about a genotyping standard can be
> found here: http://www.openpml.org. It is being worked on heavily and
> more material will be added soon.
>
> 	-Heikki
>
>
> On Thursday 18 January 2007 01:31, marian thieme wrote:
>> Jason, you're right, it is probably some kind of abuse of the BioPerl
>> API, but it's a very quick way to get results, because I don't need to
>> cope with replacing substrings. On the other hand, if you are using the
>> Root.pm class in other scripts, it can probably cause some malfunction
>> (including a crash of your application). It is probably no big matter to
>> provide a filestream IO class which reads/writes the sequence and
>> translates to/from IUPAC characters. But one thing I don't see at
>> present: how would you represent more complex mutations, such as a change
>> of a few bases? OK, here we could represent each position separately.
>> But what about a deletion? I don't know whether there is an IUPAC
>> character for that. Let's consider this case:
>> 1.) the original base at some position is a
>> 2.) some individual has an a on one copy and is missing that base on the
>> other, or perhaps both copies are missing the a. In BIC notation you
>> can write [a/_] or [_/_] respectively. Any idea how to resolve this?
>>
>> Marian
>>
>>> From: Jason Stajich <jason at bioperl.org>
>>> To: marian thieme <marian.thieme at lycos.de>
>>> Subject: Re: [Bioperl-l] Bio::Root::Root/Bio::LiveSeq::Mutation
>>> Date: Wed, 17 Jan 2007 08:40:45 -0800
>>>
>>> I think you are ignoring the fact that errors are thrown for a
>>> reason, not just to annoy you.
>>>
>>> Why not store the data in Bio::Seq objects as IUPAC ambiguity codes
>>> and write a special writer class in Bio::SeqIO which converts the
>>> ambiguity codes to your specified encoding?
>>> There are examples of how to write your own Bio::SeqIO class in the
>>> HOWTO tutorials when we talk about extending the toolkit. There is
>>> also all the code to decompose an ambiguity code into the bases it
>>> represents.
>>>
>>>
>>> -jason
>>>
>>> On Jan 16, 2007, at 2:20 AM, marian thieme wrote:
>>>> Hi, as I told this list some time ago, I want to output
>>>> heterozygous DNA sequences of different individuals.
>>>> We need to output variations in the following manner:
>>>> [a/g] if there is a locus where one allele has an "a" and the other
>>>> has a "g". (Also known as BIC db format or something like this.)
>>>> My approach is to use Bio::LiveSeq::Mutation (a class?) to
>>>> change the specific position in the sequence.
>>>>
>>>>
>>>> Bio::SeqUtils->mutate($seqobj, Bio::LiveSeq::Mutation->new(
>>>>   -seq => "[a/g]",
>>>>   -seqori => $seqori,
>>>>   -pos => $pos,
>>>>   -len => $length));
>>>>
>>>> But unfortunately this raises an exception saying that some
>>>> unexpected characters occur. Hence I went into the code of Root.pm
>>>> and made a small change, commenting out line 359 in Root.pm:
>>>>
>>>> if( $ERRORLOADED ) {
>>>> #       print STDERR "  Calling Error::throw\n\n";
>>>>
>>>>        # Enable re-throwing of Error objects.
>>>>        # If the error is not derived from Bio::Root::Exception,
>>>>        # we can't guarantee that the Error's value was set properly
>>>>        # and, ipso facto, that it will be catchable from an eval{}.
>>>>        # But chances are, if you're re-throwing non-Bio::Root::Exceptions,
>>>>        # you're probably using Error::try(), not eval{}.
>>>>        # TODO: Fix the MSG: line of the re-thrown error. Has an extra line
>>>>        # containing the '----- EXCEPTION -----' banner.
>>>>        if( ref($args[0])) {
>>>>            if( $args[0]->isa('Error')) {
>>>>                my $class = ref $args[0];
>>>>                $class->throw( @args );
>>>>            } else {
>>>>                my $text .= "\nWARNING: Attempt to throw a non-Error.pm object: " . ref$args[0];
>>>>                my $class = "Bio::Root::Exception";
>>>>                $class->throw( '-text' => $text, '-value' => $args[0] );
>>>>            }
>>>>        } else {
>>>>            $class ||= "Bio::Root::Exception";
>>>>
>>>>            my %args;
>>>>            if( @args % 2 == 0 && $args[0] =~ /^-/ ) {
>>>>                %args = @args;
>>>>                $args{-text} = $text;
>>>>                $args{-object} = $self;
>>>>            }
>>>>
>>>> (Line 359:)   #$class->throw( scalar keys %args > 0 ? %args : @args ); # (%args || @args) puts %args in scalar context!
>>>>        }
>>>>    }
>>>>
>>>>
>>>> After I altered this line everything works fine. But I know that
>>>> this can be considered a workaround at best.
>>>>
>>>> 2 Questions:
>>>>
>>>> Do you think it is worth providing a class that can natively
>>>> cope with this?
>>>> Do I need to expect unwanted behavior from other scripts or
>>>> classes?
>>>>
>>>> Regards,
>>>> Marian
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> --
>>> Jason Stajich
>>> Miller Research Fellow
>>> University of California, Berkeley
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>   _/  _/  _/  University of Western Cape, South Africa
>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





