[Biojava-dev] Serialization problems, "-" turns to "n" after
serializing sequence
mark.schreiber at novartis.com
Wed Oct 19 21:05:56 EDT 2005
Hello -
I found out what was happening. It is not a problem with serialization but
a problem with the createDNASequence method, which wasn't dealing well
with gaps. There is another method, DNATools.createGappedDNASequence(),
that is supposed to do what you want. Ideally you shouldn't use the
createDNASequence method with gap symbols. I have changed it now so that
if it detects a gap symbol it calls the createGapped method. This is in
CVS. Your test seems to work now.
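
A minimal sketch of the "detect a gap, then delegate" idea described above,
using only the standard library. The class and method names here
(GapDispatchSketch, createUngapped, createGapped, create) are hypothetical
stand-ins for illustration; this is not the code that was committed to CVS,
and the strings returned only show which path was taken.

```java
public class GapDispatchSketch {

    /** Hypothetical stand-in for the plain (gap-unaware) DNA parsing path. */
    static String createUngapped(String dna) {
        return "ungapped:" + dna;
    }

    /** Hypothetical stand-in for the gap-aware DNA parsing path. */
    static String createGapped(String dna) {
        return "gapped:" + dna;
    }

    /** Delegates to the gap-aware path when the input contains a '-' character. */
    static String create(String dna) {
        if (dna.indexOf('-') >= 0) {
            return createGapped(dna);
        }
        return createUngapped(dna);
    }

    public static void main(String[] args) {
        System.out.println(create("ATGC"));        // takes the ungapped path
        System.out.println(create("---ATGC---"));  // takes the gapped path
    }
}
```

With this shape, callers keep using a single entry point and gapped input
is routed to the gap-aware path without further changes on their side.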
More generally, I may need to apply this to RNATools and ProteinTools as
well. I'll have a look.
- Mark
Mark Schreiber/GP/Novartis at PH
Sent by: biojava-dev-bounces at portal.open-bio.org
10/19/2005 11:19 AM
To: Kalle Näslund <kalle.naslund at genpat.uu.se>
cc: biojava-dev at biojava.org, (bcc: Mark Schreiber/GP/Novartis)
Subject: Re: [Biojava-dev] Serialization problems, "-" turns to "n" after
serializing sequence
Hello -
What should happen is that a method called readResolve() should be called
by the JVM on deserialization to replace the gap symbol that was
deserialized with the gap symbol of the local AlphabetManager.
This prevents you from having a gap that is not == the gap provided by the
alphabet manager. It seems that somehow it is instead being replaced by
the ambiguity symbol n.
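
A minimal sketch of the readResolve() pattern described above, using only
java.io. GapSymbol here is a hypothetical stand-in class, not BioJava's
actual gap symbol, and the static INSTANCE field plays the role of the
symbol held by the local AlphabetManager. It only illustrates how
deserialization can hand back the canonical local instance so that ==
comparisons keep working.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

public class ReadResolveSketch {

    /** Hypothetical stand-in for a gap symbol with one canonical instance. */
    static final class GapSymbol implements Serializable {
        private static final long serialVersionUID = 1L;

        static final GapSymbol INSTANCE = new GapSymbol();

        private GapSymbol() {
        }

        /** Called during deserialization; swaps the fresh copy for the local instance. */
        private Object readResolve() throws ObjectStreamException {
            return INSTANCE;
        }
    }

    /** Serializes an object to a byte array and reads it back. */
    static Object roundTrip(Object original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object copy = roundTrip(GapSymbol.INSTANCE);
        // Identity, not just equality: the deserialized object is the local instance.
        System.out.println(copy == GapSymbol.INSTANCE);
    }
}
```

Without the readResolve() method, the round trip would produce a second
GapSymbol object and the identity check would fail.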
It may take me a while to get around to looking at this. If you find it,
please let me know. If I forget, please remind me : )
- Mark
Kalle Näslund <kalle.naslund at genpat.uu.se>
Sent by: biojava-dev-bounces at portal.open-bio.org
10/19/2005 02:04 AM
To: biojava-dev at biojava.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-dev] Serialization problems, "-" turns to "n" after
serializing sequence
Hi!
I seem to be stuck with a serialization issue, somewhere deep in the
alphabet code. The problem is that "-" turns into "n". This happens
both with fairly new CVS code and with the 1.4 release code.
The code I am using is the following:
import java.util.*;
import java.io.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;
import org.biojava.utils.*;
import org.biojava.bio.*;

/**
 * Temp class, just to check out some serialization issues im having.
 *
 * @author kalle
 */
public class AlignmentSerializationTest {

    public void run() throws Exception {
        Sequence dnaSeq1 =
            DNATools.createDNASequence("---ATGC---ATGC---", "seq1" );
        dumpInfoAboutSequence( dnaSeq1 );

        System.out.println("Writing alignment to disk");
        File file = new File("/tmp/ali.obj");
        FileOutputStream fOS = new FileOutputStream( file );
        ObjectOutputStream oOS = new ObjectOutputStream( fOS );
        oOS.writeObject( dnaSeq1 );
        oOS.close();
        fOS.close();

        System.out.println( "Loading alignment from disk" );
        FileInputStream fIS = new FileInputStream( file );
        ObjectInputStream oIS = new ObjectInputStream( fIS );
        Sequence serSeq = ( Sequence )oIS.readObject();
        dumpInfoAboutSequence( serSeq );
    }

    public static void main( String[] flags ) throws Exception {
        AlignmentSerializationTest myAST = new AlignmentSerializationTest();
        myAST.run();
    }

    private void dumpInfoAboutSequence( Sequence sequence ) throws Exception {
        System.out.println("Name :" + sequence.getName() );
        System.out.println("Alphabet :" + sequence.getAlphabet() );
        System.out.println("GapSymbol :" +
            sequence.getAlphabet().getGapSymbol() );
        System.out.println("Sequence :" + sequence.seqString() );
        System.out.println("Tokeniz :" +
            sequence.getAlphabet().getTokenization( "token" ) );
    }
}
And the output I get is:
Name :seq1
Alphabet :org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@1bc887b
GapSymbol :org.biojava.bio.symbol.SimpleBasisSymbol: []
Sequence :---atgc---atgc---
Tokeniz :org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper@120cc56
Writing alignment to disk
Loading alignment from disk
Name :seq1
Alphabet :org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@1bc887b
GapSymbol :org.biojava.bio.symbol.SimpleBasisSymbol: []
Sequence :nnnatgcnnnatgcnnn
Tokeniz :org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper@120cc56
I have spent some time using a debugger and stepping through the BioJava
code, but realised that it will most likely take me a lot of time. I was
hoping that some of you who have more experience with the alphabet code
could at least point me in the right direction, if not outright recognize
the bug =)
Kind regards, Kalle
_______________________________________________
biojava-dev mailing list
biojava-dev at biojava.org
http://biojava.org/mailman/listinfo/biojava-dev