[Bioperl-l] Bio::Matrix::Substitution alpha version

Allen Smith easmith@beatrice.rutgers.edu
Sun, 29 Sep 2002 22:13:17 -0400


On Sep 29,  3:40pm, Jason Stajich wrote:

BTW, my apologies for forgetting to CC you on the initial email - I knew
there was someone I forgot...

> On Fri, 27 Sep 2002, Allen Smith wrote:
> 
> > Hi. I have an alpha version of Bio::Matrix::Substitution and
> > Bio::Matrix::SubstitutionI ready for public inspection. It includes not
> > only these modules (which do have POD documentation) but a
> > t/MatrixSubstitution.t test file and a couple of data files for testing
> > in the t/data/ subdirectory.
> >
> > Three things:
> > 	A. What's the recommended way to submit this? SSH is not readily
> > 	   available locally, incidentally, because IRIX does not have a
> > 	   /dev/random. I can make the code available for HTTP access
> > 	   without problems, BTW.
> 
> I'd rather that you own this code and commit it directly to the CVS
> repository once we've reviewed it.  The submitted patches system requires
> too much bandwidth for core developers so we'd rather people put code in
> and allow others to make suggestions and improvements.  This is done with
> CVS over SSH so you'll need to decide whether your concerns over non-secure
> SSH from IRIX are a big enough deal to prevent you from contributing.

In this case, the security concerns are rather likely to be much more on
your end - if _y'all_ find it acceptable to be doing SSH that may not be
giving as much security as normal, then that is your decision. One need not
have sshd running locally and providing access in order to use ssh for
outgoing connections, after all (unless things have changed since the last
time I checked). But I thought that you should be aware of this (possible)
concern.

> We can contact you off list about getting an account for this.

Please do.

> If you want to make it available via HTTP in the meantime we can look it
> over and commit it.

It is now available via:
	http://cesario.rutgers.edu/easmith/computers/MatrixSubstitution.tar.gz

For viewing of separate files:
	http://cesario.rutgers.edu/easmith/computers/Bio-Matrix-Substitution.pm
	http://cesario.rutgers.edu/easmith/computers/Bio-Matrix-SubstitutionI.pm
	http://cesario.rutgers.edu/easmith/computers/MatrixSubstitution.t.pl
(t/MatrixSubstitution.t)
	http://cesario.rutgers.edu/easmith/computers/t-data-blosum62.mat.txt
(t/data/blosum62.mat)
	http://cesario.rutgers.edu/easmith/computers/t-data-gcg_test.mat.txt
(t/data/gcg_test.mat)

For viewing of HTML-converted POD documentation:
	http://cesario.rutgers.edu/easmith/computers/Bio-Matrix-Substitution.pm.html
	http://cesario.rutgers.edu/easmith/computers/Bio-Matrix-SubstitutionI.pm.html

> > 	B. This is _just_ those two modules. Incorporation into the rest of
> > 	   Bioperl (including SimpleAlign and maybe Bio::Tools::OddCodes) is

Speaking of Bio::Tools::OddCodes, my email to Derek Gatherer
(D.Gatherer@organon.nhe.akzonobel.nl) bounced.

> > 	   a further project. Two major things will be needed for this:
> 
> > 		1. To efficiently match "these are the AAs/whatever that are
> > 		   closely related according to the matrix" to "these are
> > 		   the AAs/whatever that we _have_", as in the
> > 		   substitution/consensus groups in SimpleAlign and
> > 		   Bio::Tools::OddCodes, the best method I've come up with
> > 		   is conversion of the presence/absence of each AA/whatever
> > 		   into a bit in a bitstring, using vec, followed by bit
> > 		   operations. This is considerably faster than using
> > 		   regexes or Set::Scalar. It would be by far the best
> > 		   if this were also made into a new module (or set of
> > 		   modules). Anyone have a good name?
> > 		2. A proper means of associating matrices with alignments
> > 		   _and with sequences_, and having this be easily
> > 		   extensible to associate "this matrix is the best one to
> > 		   use in these spots along this sequence, unless the other
> > 		   sequence says to use something else" (as in, for

[...]
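
To make the bitstring idea in point 1 above concrete, here is a minimal
sketch. It is written in Python purely for illustration; in Perl the same
packing would be done with vec and the bitwise string operators. The
alphabet ordering, the group names, and the group memberships are all
hypothetical placeholders, not values derived from any substitution matrix
or taken from SimpleAlign or Bio::Tools::OddCodes.

```python
# One bit per residue; the ordering is arbitrary (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BIT = {aa: 1 << i for i, aa in enumerate(AMINO_ACIDS)}


def to_mask(residues):
    """Pack the set of residues present into one integer, one bit each."""
    mask = 0
    for aa in residues:
        mask |= BIT[aa]
    return mask


# Hypothetical "closely related" groups. In real use these would be
# computed from the matrix scores rather than written by hand.
GROUPS = {
    "aliphatic": to_mask("ILVM"),
    "aromatic": to_mask("FWY"),
    "basic": to_mask("HKR"),
}


def covering_groups(column):
    """Return the names of groups that contain every residue in the column."""
    present = to_mask(column)
    # A group covers the column when no present bit falls outside the group.
    return [name for name, group in GROUPS.items() if present & ~group == 0]


print(covering_groups("ILLV"))  # residues I, L, V
print(covering_groups("KRF"))   # mixes basic and aromatic residues
```

Each column costs one mask construction plus one AND/NOT per group, which
is where the speed advantage over regexes or set objects would come from.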

> 
> You can use Bio::Location::Split instead of Bio::Range to represent
> non-contiguous ranges.

Ah, thank you - I will take a look at this. I suspected there was something
I was missing...

	-Allen

-- 
Allen Smith			http://cesario.rutgers.edu/easmith/
September 11, 2001		A Day That Shall Live In Infamy II
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." - Benjamin Franklin