FW: [Biojava-l] orderNSymbols and Alphabets
Schreiber, Mark
mark.schreiber@agresearch.co.nz
Fri, 2 Mar 2001 10:55:31 +1300
Sorry,
Should have sent this to the whole group
Mark
-----Original Message-----
From: Schreiber, Mark
Sent: Friday, March 02, 2001 9:58 AM
To: 'Matthew Pocock'
Subject: RE: [Biojava-l] orderNSymbols and Alphabets
Hi,
Attached is a program which details some of my adventures in orderNSymbol
land. It may be of use as a demo/tutorial.
Thanks to those who showed me how to do it.
Mark
> -----Original Message-----
> From: Matthew Pocock [mailto:mrp@sanger.ac.uk]
> Sent: Tuesday, February 27, 2001 11:55 PM
> To: Thomas Down
> Cc: Schreiber, Mark; 'biojava-l@biojava.org'
> Subject: Re: [Biojava-l] orderNSymbols and Alphabets
>
>
> ...and to make the n'th-order symbol list to use with the distribution,
> you can use one of:
>
> SymbolListViews.orderNSymbolList(source, order)
> SymbolListViews.windowedSymbolList(source, windowWidth)
>
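As far as I understand the two views, orderNSymbolList gives overlapping
n-mers (one per start position), while windowedSymbolList gives consecutive
non-overlapping windows, so the source length has to be a multiple of the
window width. The sketch below illustrates that difference with plain Java
strings rather than BioJava. The names NmerViews, overlappingNmers and
windowedNmers are made up for illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain strings stand in for BioJava SymbolLists.
public class NmerViews {

    // Overlapping view: one n-mer per start position (step of 1).
    static List<String> overlappingNmers(String seq, int order) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + order <= seq.length(); i++) {
            out.add(seq.substring(i, i + order));
        }
        return out;
    }

    // Non-overlapping view: consecutive windows (step equal to the width).
    static List<String> windowedNmers(String seq, int width) {
        if (seq.length() % width != 0) {
            throw new IllegalArgumentException(
                "sequence length must be a multiple of the window width");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < seq.length(); i += width) {
            out.add(seq.substring(i, i + width));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(overlappingNmers("atggcg", 3)); // [atg, tgg, ggc, gcg]
        System.out.println(windowedNmers("atggcg", 3));    // [atg, gcg]
    }
}
```

For a codon distribution the windowed form is the one that reads the
sequence in frame; the overlapping form is what you would want for
dinucleotide or k-mer statistics.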
> Thomas Down wrote:
>
> > On Tue, Feb 27, 2001 at 05:04:59PM +1300, Schreiber, Mark wrote:
> >
> >> Hi
> >>
> >> What is the simplest way to create an orderN alphabet or symbol that
> >> can be used in a distribution?
> >
> >
> > Cross product alphabets are created via the AlphabetManager:
> >
> > Alphabet codons = AlphabetManager.getCrossProductAlphabet(
> >     Collections.nCopies(3, DNATools.getDNA()));
> >
> > This method will work on any arbitrary List of Alphabets.
> >
> > You can then retrieve symbols from that alphabet:
> >
> > List symbols = DNATools.createDNA("atg").toList();
> > Symbol startCodon = codons.getSymbol(symbols);
> >
> > This method works on an arbitrary list of Symbols (but obviously
> > these must match the alphabet -- you'll get an IllegalSymbolException
> > otherwise).
> >
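To make the idea of a cross-product alphabet concrete, here is a minimal
sketch in plain Java that does not use BioJava at all. It enumerates every
combination of one symbol per position and looks up a compound symbol from
a list of components, rejecting components that are not in the alphabet.
The names CrossProductSketch, crossProduct and lookup are hypothetical, and
IllegalArgumentException only stands in for BioJava's
IllegalSymbolException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: strings stand in for Symbols and Alphabets.
public class CrossProductSketch {

    // Build every combination of one symbol per position.
    static List<String> crossProduct(List<String> alphabet, int copies) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < copies; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                for (String sym : alphabet) {
                    next.add(prefix + sym);
                }
            }
            result = next;
        }
        return result;
    }

    // Join component symbols into one compound symbol, checking each
    // component against the alphabet first.
    static String lookup(List<String> alphabet, List<String> symbols) {
        StringBuilder sb = new StringBuilder();
        for (String s : symbols) {
            if (!alphabet.contains(s)) {
                throw new IllegalArgumentException("illegal symbol: " + s);
            }
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> dna = Arrays.asList("a", "c", "g", "t");
        List<String> codons = crossProduct(dna, 3);
        System.out.println(codons.size());                              // 64
        System.out.println(lookup(dna, Arrays.asList("a", "t", "g")));  // atg
    }
}
```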
> > Hope this helps,
> >
> > Thomas.
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
>
>
Attachment: CrossProductTest.java
/*
 *                    BioJava development code
 *
 * This code may be freely distributed and modified under the
 * terms of the GNU Lesser General Public Licence.  This should
 * be distributed with the code.  If you do not have a copy,
 * see:
 *
 *      http://www.gnu.org/copyleft/lesser.html
 *
 * Copyright for this code is held jointly by the individual
 * authors.  These should be listed in @author doc comments.
 *
 * For more information on the BioJava project and its aims,
 * or to join the biojava-l mailing list, visit the home page
 * at:
 *
 *      http://www.biojava.org/
 *
 */


package testbed;

import org.biojava.bio.*;
import org.biojava.utils.*;
import org.biojava.bio.dist.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;
import java.util.*;

/**
 * Title:        CrossProductTest
 * Description:  A test of the nmer alphabet and distribution concepts
 *
 * This program demonstrates the use of cross-product (nmer) alphabets and
 * distributions. A codon distribution is created from a sequence. This
 * distribution is then used to generate another random sequence. The
 * probability of this new sequence is then calculated. This program also
 * demonstrates how a cross-product alphabet may be displayed to STDOUT.
 *
 * Thanks to Matthew and Thomas for hints and suggestions.
 *
 * @author Mark Schreiber
 * @version 1.0
 */

public class CrossProductTest {

  double prob = 1.0; // emission probability

  public CrossProductTest() throws NestedException {
    try {
      // create a cross product of three DNA alphabets, i.e. a codon alphabet.
      Alphabet tri = AlphabetManager.getCrossProductAlphabet(
          Collections.nCopies(3, DNATools.getDNA()));

      // create a distribution for the alphabet and a trainer.
      Distribution d = DistributionFactory.DEFAULT.createDistribution(tri);
      DistributionTrainer dt = new SimpleDistributionTrainer(d);
      DistributionTrainerContext context = new SimpleDistributionTrainerContext();

      // create a DNA sequence.
      SymbolList seq = DNATools.createDNA(
        "atgatgatggtggcggaggatgggcgcgcggtggaaacaacaattaca" +
        "tagcaccccataccaatagacacagatggcggtgtgaacagataagac" +
        "gcttagacacaaatgacacacggggccggggaatatttttaaatacaa" +
        "cggctctctttataggcgcgcctttaaatataggcgcgcgcgggccta" +
        "tttataaatatttttagaccacacccatatcatacgacaagaagccat" +
        "ccaaatacggataacacccctagaggggaaccccgttatattttacac"
      );

      // create a trimer view on the sequence.
      SymbolList subseq = SymbolListViews.windowedSymbolList(seq, 3);

      // add trimer counts to the distribution.
      Iterator iter = subseq.iterator();
      while (iter.hasNext()) {
        Object item = iter.next();
        dt.addCount(context, (AtomicSymbol) item, 1.0);
      }
      // train the model using the weights given.
      dt.train(0.0); // no pseudo-counts from the null model.

      for (int i = 1; i <= 20; i++) { // generate a new sequence of 20 codons
        Symbol sym = d.sampleSymbol();
        // get the symbols that make up sym.
        List syms = ((BasisSymbol) sym).getSymbols();
        // print the codon
        iter = syms.iterator();
        while (iter.hasNext()) {
          Symbol s = (Symbol) iter.next();
          System.out.print(s.getToken());
        }
        // update the probability of the emission so far
        prob *= d.getWeight(sym);
      }
      System.out.println("\nProbability of emission = " + prob);

    } catch (Exception e) {
      throw new NestedException(e);
    }
  }

  public static void main(String[] args) {
    try {
      // the constructor runs the whole demo
      new CrossProductTest();
    } catch (NestedException ne) {
      ne.printStackTrace(System.out);
    }
  }
}
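
The counting and probability steps in the program above can also be
sketched without BioJava, which may help when checking the numbers by hand.
This is only a rough stand-in under stated assumptions: codons are plain
strings, training is simple count normalisation with no pseudo-counts, and
the names CodonDistributionSketch, train and logProbability are invented
for the example. One deliberate difference from the attachment: the sketch
sums log weights instead of multiplying raw weights, since a long product
of small probabilities can underflow to zero.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a string-based stand-in for a trained codon distribution.
public class CodonDistributionSketch {

    // Count non-overlapping codons and normalise the counts to weights.
    static Map<String, Double> train(String seq) {
        Map<String, Double> counts = new LinkedHashMap<>();
        int total = 0;
        for (int i = 0; i + 3 <= seq.length(); i += 3) {
            counts.merge(seq.substring(i, i + 3), 1.0, Double::sum);
            total++;
        }
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            weights.put(e.getKey(), e.getValue() / total);
        }
        return weights;
    }

    // Log probability of emitting the codons of seq, one after another.
    // A codon never seen in training has weight 0, so the result is -infinity.
    static double logProbability(Map<String, Double> weights, String seq) {
        double logP = 0.0;
        for (int i = 0; i + 3 <= seq.length(); i += 3) {
            Double w = weights.get(seq.substring(i, i + 3));
            if (w == null) {
                return Double.NEGATIVE_INFINITY;
            }
            logP += Math.log(w);
        }
        return logP;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = train("atgatgggc");
        System.out.println(weights);
        System.out.println(logProbability(weights, "atgggc"));
    }
}
```

With the training string "atgatgggc" the weights are 2/3 for atg and 1/3
for ggc, so "atgggc" has probability 2/9.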