[Bioperl-l] Mutation IO

Chris Fields cjfields at uiuc.edu
Thu Jan 18 16:45:21 UTC 2007


Agreed.  The only reason I use tinyurls is that long URLs get wrapped by mail clients.

For future eyes:

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=1335776
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=558492
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=468423
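
For anyone wanting to try the reference-sequence-plus-features idea quoted below, here is a minimal sketch in plain Perl. It makes no BioPerl calls. The sub name render_bic and the hash keys (start, end, alleles) are made up for illustration, and plain hashes stand in for whatever feature objects you would really use. It assumes 1-based inclusive coordinates on the reference and non-overlapping variations, and it prints an empty allele as '_'.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: render a reference sequence with
# bracketed alleles. Each variation is a plain hash with 1-based inclusive
# 'start'/'end' coordinates on the reference and the observed alleles;
# an empty allele string means the base is missing and is printed as '_'.
sub render_bic {
    my ($ref, @variations) = @_;
    my $out = '';
    my $pos = 1;    # next reference position (1-based) not yet emitted
    for my $var (sort { $a->{start} <=> $b->{start} } @variations) {
        die "overlapping variation at $var->{start}\n" if $var->{start} < $pos;
        # copy the unchanged reference bases that precede this variation
        $out .= substr($ref, $pos - 1, $var->{start} - $pos);
        my @alleles = map { length($_) ? $_ : '_' } @{ $var->{alleles} };
        $out .= '[' . join('/', @alleles) . ']';
        $pos = $var->{end} + 1;
    }
    # copy whatever follows the last variation
    $out .= substr($ref, $pos - 1) if $pos <= length $ref;
    return $out;
}

my $reference  = 'acgtacgt';
my @variations = (
    { start => 2, end => 2, alleles => [ 'c', 't' ] },    # heterozygous SNP
    { start => 5, end => 5, alleles => [ 'a', ''  ] },    # base missing on one allele
);
print render_bic($reference, @variations), "\n";    # prints a[c/t]gt[a/_]cgt
```

Keeping the alleles as a list per position means the [a/_] and [_/_] cases need no special character in the stored sequence. Only the output step knows about the bracket notation.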

chris


On Jan 18, 2007, at 10:25 AM, Mauricio Herrera Cuadra wrote:

> Folks,
>
> A bit off-topic here: it would be better if we posted full URLs
> instead of shortened ones. TinyURLs can expire, which would leave
> dead links in the mailing list archive and make it less useful for
> future reference.
>
> Regards,
> Mauricio.
>
> Chris Fields wrote:
>> I haven't dabbled with the Mutation/Variation stuff, but couldn't
>> one use a reference sequence (as Heikki suggests) and then use
>> SeqFeatures for the alleles?  You could tag the seqfeature with
>> the allele name for downstream work.  You could maybe add a SeqIO
>> writer (Jason's suggestion) or just add a helper sub to
>> Bio::SeqUtils for converting any variation data in a Bio::SeqI
>> into the string you want, based on allele(s) you specify and the
>> Seq object.
>>
>> While working on Location stuff, I noticed this is how variations
>> are represented in normal GenBank files, using the primary
>> feature tag of 'variation' or 'misc_difference' (I think there
>> are a few others):
>>
>> http://tinyurl.com/22coeq
>>
>> Using SeqFeatures also allows for deletions/insertions:
>>
>> http://tinyurl.com/2e2egw
>> http://tinyurl.com/277a6g
>>
>> chris
>>
>> On Jan 18, 2007, at 1:39 AM, Heikki Lehvaslaiho wrote:
>>> Marian,
>>>
>>> Do not try to cram too much into one class. BIC format is
>>> apparently a useful shorthand for some cases, but representing
>>> that in memory using objects in an extensible way is another
>>> thing.
>>>
>>> Your example below describes an individual's diploid genotype.
>>> Putting that into one sequence object is not a good idea. The way
>>> to model that is to have a reference sequence, then define an
>>> individual that has that sequence in a diploid or haploid (sex
>>> chromosomes) setting, and list the alleles that person has in the
>>> reference sequence coordinate system. You might be interested in
>>> separating the alleles by chromosome, too.
>>>
>>> Representing, reporting and modelling genotype information is
>>> something that has been of interest to me and a group of other
>>> people for some time. An early draft of a web site about a
>>> genotyping standard can be found here: http://www.openpml.org.
>>> It is being worked on heavily and more material will be added
>>> soon.
>>>
>>> 	-Heikki
>>>
>>>
>>> On Thursday 18 January 2007 01:31, marian thieme wrote:
>>>> Jason, you're right, it is probably some kind of abuse of the
>>>> BioPerl API, but it's a very quick way to get results, because I
>>>> don't need to cope with replacing substrings. On the other hand,
>>>> if you are using the modified Root.pm class in other scripts, it
>>>> can probably cause some malfunction (including a crash of your
>>>> application). It is probably no big deal to provide a filestream
>>>> IO class which reads/writes the sequence and translates it
>>>> to/from IUPAC chars. But one thing I don't see at present: how
>>>> would you represent more complex mutations, such as a change of a
>>>> few bases? OK, here we could represent each position separately.
>>>> But what about other kinds of mutation? I don't know if there is
>>>> an IUPAC char that covers them! Let's consider this case:
>>>> 1.) the original base at some position is an a
>>>> 2.) some individual has an a on one allele and the other allele
>>>> is missing that base, or perhaps both alleles are missing the a.
>>>> In BIC notation you can write [a/_] or [_/_] respectively. Any
>>>> idea how to resolve this?
>>>>
>>>> Marian
>>>>
>>>>> From: Jason Stajich <jason at bioperl.org>
>>>>> To: marian thieme <marian.thieme at lycos.de>
>>>>> Subject: Re: [Bioperl-l] Bio::Root::Root/Bio::LiveSeq::Mutation
>>>>> Date: Wed, 17 Jan 2007 08:40:45 -0800
>>>>>
>>>>> I think you are ignoring the fact that errors are thrown for a
>>>>> reason, not just to annoy you.
>>>>>
>>>>> Why not store the data in Bio::Seq objects as IUPAC ambiguity
>>>>> codes and write a special writer class in Bio::SeqIO which
>>>>> converts the ambiguity codes to your specified encoding?
>>>>> There are examples of how to write your own Bio::SeqIO class in
>>>>> the HOWTO tutorials where we talk about extending the toolkit.
>>>>> There is also all the code to decompose an ambiguity code into
>>>>> the bases it represents.
>>>>>
>>>>>
>>>>> -jason
>>>>>
>>>>> On Jan 16, 2007, at 2:20 AM, marian thieme wrote:
>>>>>> Hi, as I told this list some time ago, I want to output
>>>>>> heterozygous DNA sequences of different individuals.
>>>>>> We need to output variations in the following manner:
>>>>>> [a/g] if there is a locus where one allele has an "a" and the
>>>>>> other has a "g". (Also known as BIC db format or something like
>>>>>> this.)
>>>>>> My approach is to use the Bio::LiveSeq::Mutation (class?) to
>>>>>> change the specific position in the sequence.
>>>>>>
>>>>>>
>>>>>> Bio::SeqUtils->mutate($seqobj, Bio::LiveSeq::Mutation->new(
>>>>>>   -seq => "[a/g]",
>>>>>>   -seqori => $seqori,
>>>>>>   -pos => $pos,
>>>>>>   -len => $length));
>>>>>>
>>>>>> But unfortunately this raises an exception saying that some
>>>>>> unexpected chars occur. Hence I went into the code of Root.pm and
>>>>>> made a small change: commenting out line 359 in Root.pm:
>>>>>>
>>>>>> if( $ERRORLOADED ) {
>>>>>> #       print STDERR "  Calling Error::throw\n\n";
>>>>>>
>>>>>>        # Enable re-throwing of Error objects.
>>>>>>        # If the error is not derived from Bio::Root::Exception,
>>>>>>        # we can't guarantee that the Error's value was set properly
>>>>>>        # and, ipso facto, that it will be catchable from an eval{}.
>>>>>>        # But chances are, if you're re-throwing non-Bio::Root::Exceptions,
>>>>>>        # you're probably using Error::try(), not eval{}.
>>>>>>        # TODO: Fix the MSG: line of the re-thrown error. Has an extra line
>>>>>>        # containing the '----- EXCEPTION -----' banner.
>>>>>>        if( ref($args[0])) {
>>>>>>            if( $args[0]->isa('Error')) {
>>>>>>                my $class = ref $args[0];
>>>>>>                $class->throw( @args );
>>>>>>            } else {
>>>>>>                my $text .= "\nWARNING: Attempt to throw a non-Error.pm object: " . ref$args[0];
>>>>>>                my $class = "Bio::Root::Exception";
>>>>>>                $class->throw( '-text' => $text, '-value' => $args[0] );
>>>>>>            }
>>>>>>        } else {
>>>>>>            $class ||= "Bio::Root::Exception";
>>>>>>
>>>>>>            my %args;
>>>>>>            if( @args % 2 == 0 && $args[0] =~ /^-/ ) {
>>>>>>                %args = @args;
>>>>>>                $args{-text} = $text;
>>>>>>                $args{-object} = $self;
>>>>>>            }
>>>>>>
>>>>>>            # (Line 359:)
>>>>>>            #$class->throw( scalar keys %args > 0 ? %args : @args ); # (%args || @args) puts %args in scalar context!
>>>>>>        }
>>>>>>    }
>>>>>>
>>>>>>
>>>>>> After I altered this line everything works fine. But I know that
>>>>>> this can be considered a workaround at best.
>>>>>>
>>>>>> 2 Questions:
>>>>>>
>>>>>> Do you think it is worth providing a class that is natively
>>>>>> able to cope with this?
>>>>>> Should I expect unwanted behavior in some scripts or
>>>>>> classes?
>>>>>>
>>>>>> Regards,
>>>>>> Marian
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>> --
>>>>> Jason Stajich
>>>>> Miller Research Fellow
>>>>> University of California, Berkeley
>>>>> lab: 510.642.8441
>>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>> -- 
>>> ______ _/      _/_____________________________________________________
>>>       _/      _/
>>>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>>>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>>   _/  _/  _/  University of Western Cape, South Africa
>>>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>>> ___ _/_/_/_/_/________________________________________________________
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>
> -- 
> MAURICIO HERRERA CUADRA
> arareko at campus.iztacala.unam.mx
> Laboratorio de Genética
> Unidad de Morfofisiología y Función
> Facultad de Estudios Superiores Iztacala, UNAM
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
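
P.S. A rough sketch of the output half of Jason's suggestion quoted above (store IUPAC ambiguity codes, convert them only when writing). This is plain Perl rather than a real Bio::SeqIO writer. The sub name iupac_to_bic is made up for illustration, and it only handles the six standard two-base ambiguity codes. As Marian points out, there is no IUPAC code for "one allele deleted", so [a/_] cannot be produced this way.

```perl
use strict;
use warnings;

# Standard IUPAC two-base nucleotide ambiguity codes.
my %IUPAC_PAIR = (
    r => [qw(a g)], y => [qw(c t)], s => [qw(c g)],
    w => [qw(a t)], k => [qw(g t)], m => [qw(a c)],
);

# Hypothetical helper, not a BioPerl method: expand each two-base ambiguity
# code in a sequence string into bracketed [x/y] notation. Matching is
# case-insensitive and the expanded alleles are always written in lower case;
# all other characters are passed through unchanged.
sub iupac_to_bic {
    my ($seq) = @_;
    $seq =~ s{([ryswkm])}{ '[' . join('/', @{ $IUPAC_PAIR{lc $1} }) . ']' }gie;
    return $seq;
}

print iupac_to_bic('acrtty'), "\n";    # prints ac[a/g]tt[c/t]
```

In a real writer class the same substitution would run on the sequence string just before it is printed, so nothing in Root.pm has to be touched.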






