From bugzilla-daemon at newportal.open-bio.org Tue Aug 1 05:08:22 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Tue, 1 Aug 2006 05:08:22 -0400
Subject: [Biojava-dev] [Bug 2046] Cannot read in more than one serialized
ProfileHMM object, attempt crashes
In-Reply-To:
Message-ID: <200608010908.k7198MQX029674@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2046
------- Comment #3 from holland at ebi.ac.uk 2006-08-01 05:08 -------
I did a bit more investigation, and found that on each pass of the
deserializer, two sets of symbols from two alphabets are involved. The symbols
from the first alphabet are deserialized correctly each time, but the second
time round the second alphabet symbols are _not_. See log below from the
example code above plus a slightly modified SimpleDistribution#readObject:
    SymbolWeightMemento[] swm = (SymbolWeightMemento[]) stream.readObject();
    for (int m = 0; m < swm.length; ++m) {
        try {
            System.err.println("Looking for: "+swm[m].symbol);
            System.err.println("Alphabet is: "+alpha);
            weights[indexer.indexForSymbol(swm[m].symbol)] = swm[m].weight;
        } catch (IllegalSymbolException ex) {
            throw new IOException(
                "Symbol in serialized stream can't be found in the alphabet");
        }
    }
I am not sure what is causing this. Could someone who knows more about
deserializing alphabets have a look?
---
Writing HMM
Wrote 22561 bytes
Reading HMM
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@16cd7d5
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
...
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
Looking for: org.biojava.bio.dp.SimpleDotState: d-2
Alphabet is: Transitions from d-1
Looking for: org.biojava.bio.dp.SimpleEmissionState@70610a
...
Looking for: org.biojava.bio.dp.SimpleEmissionState@a7dd39
Alphabet is: Transitions from m-8
Read HMM
Reading HMM again!
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@16cd7d5
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
...
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@cdedfd
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
Looking for: org.biojava.bio.dp.SimpleDotState: d-2
Alphabet is: Transitions from d-1
Exception in thread "main" java.io.IOException: Symbol in serialized stream can't be found in the alphabet
    at org.biojava.bio.dist.SimpleDistribution.readObject(SimpleDistribution.java:101)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at java.util.HashMap.readObject(HashMap.java:1015)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at sandbox.TestByteArray.testSerialize(TestByteArray.java:49)
    at sandbox.TestByteArray.main(TestByteArray.java:114)
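The pattern in the log (the first read of the bytes succeeds, a second read of the same bytes fails) can be exercised with a small round-trip harness. The sketch below is not the sandbox.TestByteArray class from the stack trace, which was not posted to the list; the class name RoundTripTwice and the HashMap payload are stand-ins, and a real test would serialize the ProfileHMM instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

public class RoundTripTwice {
    /** Serialize an object graph into a byte array. */
    static byte[] write(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(o);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    /** Deserialize one object from a fresh stream over the given bytes. */
    static Object read(byte[] data) {
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
            try {
                return in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("Deserialization failed", e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in payload (assumption); a real test would use the ProfileHMM here.
        Map<String, Double> payload = new HashMap<String, Double>();
        payload.put("d-2", 0.25);

        byte[] data = write(payload);
        System.out.println("Wrote " + data.length + " bytes");

        Object first = read(data);   // corresponds to "Reading HMM"
        Object second = read(data);  // corresponds to "Reading HMM again!"

        // Both reads should succeed and yield equal but distinct objects.
        System.out.println("equal: " + first.equals(second));
        System.out.println("same instance: " + (first == second));
    }
}
```

Each read opens a fresh ObjectInputStream, so any state that survives between the two reads has to live in static or otherwise shared objects of the classes being deserialized.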
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at newportal.open-bio.org Fri Aug 4 04:10:08 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Fri, 4 Aug 2006 04:10:08 -0400
Subject: [Biojava-dev] [Bug 2046] Cannot read in more than one serialized
ProfileHMM object, attempt crashes
In-Reply-To:
Message-ID: <200608040810.k748A8i6009880@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2046
------- Comment #4 from holland at ebi.ac.uk 2006-08-04 04:10 -------
When reading/writing SimpleSymbolLists, even with ambiguity symbols in, this
problem does not occur.
Specifically the problem occurs upon the second deserialization of the
org.biojava.bio.dp.SimpleDotState 'd-2' symbol, which belongs to the
'Transitions from d-1' alphabet.
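The thread does not identify the root cause. One general way a lookup like this can fail is when a container finds its members by object identity while each deserialization creates a fresh copy of the member. The sketch below illustrates that mechanism and the usual readResolve remedy using hypothetical classes (Token, CanonicalToken); it is not BioJava's Symbol or Alphabet code and is not a confirmed diagnosis of this bug.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class IdentityAfterDeserialization {
    /** Plain serializable value: every deserialization yields a new instance. */
    static class Token implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Token(String name) { this.name = name; }
    }

    /** Same idea, but readResolve maps each copy back to one canonical instance. */
    static class CanonicalToken implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final Map<String, CanonicalToken> REGISTRY =
            new HashMap<String, CanonicalToken>();
        final String name;
        private CanonicalToken(String name) { this.name = name; }

        static synchronized CanonicalToken forName(String name) {
            CanonicalToken t = REGISTRY.get(name);
            if (t == null) {
                t = new CanonicalToken(name);
                REGISTRY.put(name, t);
            }
            return t;
        }

        // Invoked by ObjectInputStream once the fields have been read.
        private Object readResolve() {
            return forName(name);
        }
    }

    /** Serialize an object and immediately deserialize it again. */
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(o);
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Token plain = new Token("d-2");
        CanonicalToken canonical = CanonicalToken.forName("d-2");
        // An identity-based lookup would not find the fresh copy of 'plain'.
        System.out.println("plain same instance:     " + (roundTrip(plain) == plain));
        System.out.println("canonical same instance: " + (roundTrip(canonical) == canonical));
    }
}
```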
From bugzilla-daemon at newportal.open-bio.org Wed Aug 9 19:24:21 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 19:24:21 -0400
Subject: [Biojava-dev] [Bug 2067] New: Errors under Java1.5
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
Summary: Errors under Java1.5
Product: BioJava
Version: 1.4
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: seq
AssignedTo: biojava-dev at biojava.org
ReportedBy: pedromor at ufl.edu
Modified org.biojava.bio.seq.db.BioIndex.java to compile under Java 1.5 with
generics enabled. Eclipse 3.2 with JDK 1.5.07 complained that the
compare(Object,Object) method could not be resolved under the rules for
generics enabled in Java 1.5.
Below is an inlined version of the amendments to the file.
//-----------------------------------------------------------------
/*
* BioJava development code
*
* This code may be freely distributed and modified under the
* terms of the GNU Lesser General Public Licence. This should
* be distributed with the code. If you do not have a copy,
* see:
*
* http://www.gnu.org/copyleft/lesser.html
*
* Copyright for this code is held jointly by the individual
* authors. These should be listed in @author doc comments.
*
* For more information on the BioJava project and its aims,
* or to join the biojava-l mailing list, visit the home page
* at:
*
* http://www.biojava.org/
*
*/
package org.biojava.bio.seq.db;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.RandomAccessFile;
import java.util.AbstractList;
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;
import org.biojava.bio.BioError;
import org.biojava.bio.BioException;
import org.biojava.bio.seq.io.SequenceBuilderFactory;
import org.biojava.bio.seq.io.SequenceFormat;
import org.biojava.bio.seq.io.SymbolTokenization;
/**
* The original object for indexing sequence files.
*
* This class may not be thread-safe.
*
* @author Matthew Pocock
* @author Thomas Down
*/
public class BioIndex implements IndexStore {
private static Comparator<String> STRING_CASE_SENSITIVE_ORDER = new
Comparator<String>() {
public int compare(String a, String b) {
return a.compareTo(b);
}
};
private File indexDirectory;
private int fileCount;
private File[] fileIDToFile;
private FileAsList indxList;
private Map secondaryKeyToFileAsList;
private Set idSet = new ListAsSet();
private String name;
private SequenceFormat format;
private SequenceBuilderFactory sbFactory;
private SymbolTokenization symbolTokenization;
{
fileCount = 0;
fileIDToFile = new File[4];
}
public BioIndex(
File indexDirectory,
String namespace,
int idLength
) throws IOException, BioException {
if(indexDirectory.exists()) {
throw new BioException(
"Can't create new index as directory already exists: " +
indexDirectory
);
}
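// remember the directory; commit() writes fileids.dat into it
this.indexDirectory = indexDirectory;
// create directory
indexDirectory.mkdirs();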
// create BIOINDEX.dat
{
File bioindex = new File(indexDirectory, "BIOINDEX.dat");
bioindex.createNewFile();
PrintWriter pw = new PrintWriter(new FileWriter(bioindex));
pw.println("index\tflat/1");
pw.close();
}
// create fileids.dat
PrintWriter fileidsWriter;
{
File fileids = new File(indexDirectory, "fileids.dat");
fileids.createNewFile();
fileidsWriter = new PrintWriter(
new FileWriter(
fileids
)
);
}
// create config.dat
PrintWriter configWriter;
{
File config = new File(indexDirectory, "config.dat");
config.createNewFile();
configWriter = new PrintWriter(new FileWriter(config));
configWriter.println("namespace\t" + namespace);
}
// create index file
{
String uniqueName = "key_" + namespace + ".key";
File unique = new File(indexDirectory, uniqueName);
unique.createNewFile();
int recordLen =
idLength + // id
1 + // tab
4 + // 9999 files
1 + // tab
String.valueOf(Long.MAX_VALUE).length() + // space for any long
1 + // tab
String.valueOf(Integer.MAX_VALUE).length() + // space for any int
"\n".length() // new line (OS dependent)
;
indxList = new IndexFileAsList(
new RandomAccessFile(unique, "rw"),
recordLen
);
fileidsWriter.println(uniqueName + "\t" + recordLen);
}
// other field initialization to get things going
fileCount = 0;
fileIDToFile = new File[4];
configWriter.close();
fileidsWriter.close();
}
/**
* Load an existing index file.
*
* If indexDirectory does not exist, or is not a bioindex store, this will
* fail.
*/
public BioIndex(
File indexDirectory
) throws IOException, BioException {
this.indexDirectory = indexDirectory;
if(!indexDirectory.exists()) {
throw new BioException(
"Tried to load non-existent index: " +
indexDirectory
);
}
// read in the global config
{
System.out.println("Global");
Map config = new HashMap();
BufferedReader fi = new BufferedReader(
new FileReader(
new File(indexDirectory, "config.dat")
)
);
for(String line = fi.readLine(); line != null; line = fi.readLine()) {
int tab = line.indexOf("\t");
config.put(line.substring(0, tab), line.substring(tab + 1));
}
String namespace = (String) config.get("namespace");
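// resolve the key file inside the index directory, not the working directory
RandomAccessFile indxFile = new RandomAccessFile(
new File(indexDirectory, "key_" + namespace + ".key"), "rw");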
int recLen = guessRecLen(indxFile);
indxList = new IndexFileAsList(indxFile, recLen);
}
// set up file set
{
System.out.println("Files");
fileCount = 0;
fileIDToFile = new File[4];
BufferedReader fi = new BufferedReader(
new FileReader(
new File(indexDirectory, "fileids.dat")
)
);
for(String line = fi.readLine(); line != null; line = fi.readLine()) {
StringTokenizer sTok = new StringTokenizer(line, "\t");
int id = Integer.parseInt(sTok.nextToken());
File file = new File(sTok.nextToken());
long fileLength = Long.parseLong(sTok.nextToken());
if(file.length() != fileLength) {
throw new BioException("File length changed: " + file + " "
+ file.length() + " vs " + fileLength);
}
fileIDToFile[id] = file;
}
}
}
private File getFileForID(int fileId) {
return fileIDToFile[fileId];
}
private int getIDForFile(File file) {
// scan list
for(int i = 0; i < fileCount; i++) {
if(file.equals(fileIDToFile[i])) {
return i;
}
}
// extend fileIDToFile array
if(fileCount >= fileIDToFile.length) {
File[] tmp = new File[fileIDToFile.length + 4]; // 4 is magic number
System.arraycopy(fileIDToFile, 0, tmp, 0, fileCount);
fileIDToFile = tmp;
}
// add the unseen file to the list
fileIDToFile[fileCount] = file;
return fileCount++;
}
public String getName() {
return this.name;
}
public int guessRecLen(RandomAccessFile file)
throws IOException {
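file.seek(0L);
int b = 0;
// also stop at end of file (-1) so a file without a line terminator cannot loop forever
while(b != '\n' && b != '\r' && b != -1) {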
b = file.read();
}
int offset = (int) file.getFilePointer();
if(b == '\n') { // \n
return offset + 1;
} else {
b = file.read();
if(b == '\n') { // \r\n
return offset + 2;
} else { // \r
return offset + 1;
}
}
}
public Index fetch(String id)
throws IllegalIDException, BioException {
int indx = Collections.binarySearch(
indxList,
id,
indxList.getComparator()
);
if(indx < 0) {
throw new IllegalIDException("Can't find sequence for " + id);
}
return (Index) indxList.get(indx);
}
public void store(Index indx) {
indxList.add(indx);
}
public void commit()
throws BioException {
indxList.commit();
try {
// write files
{
PrintStream fo = new PrintStream(
new FileOutputStream(
new File(indexDirectory, "fileids.dat")
)
);
for(int i = 0; i < fileCount; i++) {
fo.print(i);
fo.print('\t');
fo.print(fileIDToFile[i]);
fo.print('\t');
fo.print(fileIDToFile[i].length());
fo.println();
}
fo.close();
}
} catch (Exception e) {
rollback();
throw new BioException("Unable to commit. Rolled back to be safe",e);
}
}
public void rollback() {
indxList.rollback();
}
public Set getIDs() {
return idSet;
}
public Set getFiles() {
return new HashSet(Arrays.asList(fileIDToFile));
}
public SequenceFormat getFormat() {
return format;
}
public SequenceBuilderFactory getSBFactory() {
return sbFactory;
}
public SymbolTokenization getSymbolParser() {
return symbolTokenization;
}
private interface Commitable {
public void commit()
throws BioException;
public void rollback();
}
// records stored as:
// seqID(\w+) \t fileID(\w+) \t start(\d+) \t length(\d+) ' ' * \n
private abstract class FileAsList
extends AbstractList
implements /* RandomAccess, */ Commitable {
private RandomAccessFile mappedFile;
private int commitedRecords;
private int lastIndx = -1; // no record cached yet
private Object lastRec;
private byte[] buffer;
public FileAsList(RandomAccessFile mappedFile, int recordLength) {
this.mappedFile = mappedFile;
buffer = new byte[recordLength];
}
public Object get(int indx) {
if(indx < 0 || indx >= size()) {
throw new IndexOutOfBoundsException();
}
if(indx == lastIndx) {
return lastRec;
}
long offset = (long) indx * buffer.length; // widen first to avoid int overflow
try {
mappedFile.seek(offset);
mappedFile.readFully(buffer);
} catch (IOException ioe) {
throw new BioError("Failed to seek for record",ioe);
}
lastRec = parseRecord(buffer);
lastIndx = indx;
return lastRec;
}
public int size() {
try {
return (int) (mappedFile.length() / (long) buffer.length);
} catch (IOException ioe) {
throw new BioError("Can't read file length",ioe);
}
}
public boolean add(Object o) {
generateRecord(buffer, o);
try {
mappedFile.seek(mappedFile.length());
mappedFile.write(buffer);
} catch (IOException ioe) {
throw new BioError("Failed to write index",ioe);
}
return true;
}
public void commit() {
Collections.sort(indxList, indxList.getComparator());
commitedRecords = indxList.size();
}
public void rollback() {
try {
mappedFile.setLength((long) commitedRecords * (long) buffer.length);
} catch (Throwable t) {
throw new BioError(
"Could not roll back. " +
"The index store will be in an inconsistent state " +
"and should be discarded. File: " + mappedFile, t
);
}
}
protected abstract Object parseRecord(byte[] buffer);
protected abstract void generateRecord(byte[] buffer, Object item);
protected abstract Comparator getComparator();
}
private class IndexFileAsList extends FileAsList {
private Comparator INDEX_COMPARATOR = new Comparator() {
public int compare(Object a, Object b) {
String as;
String bs;
if(a instanceof Index) {
as = ((Index) a).getID();
} else {
as = (String) a;
}
if(b instanceof Index) {
bs = ((Index) b).getID();
} else {
bs = (String) b;
}
return STRING_CASE_SENSITIVE_ORDER.compare(as, bs);
}
};
public IndexFileAsList(RandomAccessFile file, int recordLength) {
super(file, recordLength);
}
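protected Object parseRecord(byte[] buffer) {
// fields are tab-separated: id, fileID, start, length (space padded)
int lastI = 0;
int newI = 0;
while(buffer[newI] != '\t') {
newI++;
}
String id = new String(buffer, lastI, newI - lastI);
lastI = ++newI; // step over the tab
while(buffer[newI] != '\t') {
newI++;
}
File file = getFileForID(Integer.parseInt(new String(buffer, lastI,
newI - lastI).trim()));
lastI = ++newI; // step over the tab
while(buffer[newI] != '\t') {
newI++;
}
long start = Long.parseLong(new String(buffer, lastI, newI - lastI));
lastI = newI + 1;
// the last field is followed by padding spaces and the newline
int length = Integer.parseInt(
new String(buffer, lastI, buffer.length - lastI).trim()
);
return new SimpleIndex(file, start, length, id);
}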
protected void generateRecord(byte[] buffer, Object item) {
Index indx = (Index) item;
String id = indx.getID();
int fileID = getIDForFile(indx.getFile());
String start = String.valueOf(indx.getStart());
String length = String.valueOf(indx.getLength());
int i = 0;
byte[] str;
str = id.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = String.valueOf(fileID).getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = start.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = length.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
while(i < buffer.length - 1) {
buffer[i++] = ' ';
}
buffer[i] = '\n';
}
public Comparator getComparator() {
return INDEX_COMPARATOR;
}
}
private static final class Record {
private final String key;
private final String value;
public Record(String key, String value) {
this.key = key;
this.value = value;
}
public String getKey() {
return key;
}
public String getValue() {
return value;
}
public int hashCode() {
return key.hashCode();
}
}
private class SecondaryIDFileAsList extends FileAsList {
private Comparator RECORD_COMPARATOR = new Comparator() {
public int compare(Object a, Object b) {
String as;
String bs;
if(a instanceof Record) {
as = ((Record) a).getKey();
} else {
as = (String) a;
}
if(b instanceof Record) {
bs = ((Record) b).getKey();
} else {
bs = (String) b;
}
return STRING_CASE_SENSITIVE_ORDER.compare(as, bs);
}
};
public SecondaryIDFileAsList(RandomAccessFile file, int recordLength) {
super(file, recordLength);
}
public Object parseRecord(byte[] buffer) {
int tab = 0;
while(buffer[tab] != '\t') {
tab++;
}
String key = new String(buffer, 0, tab);
String value = new String(buffer, tab + 1, buffer.length - tab - 1).trim();
return new Record(key, value);
}
protected void generateRecord(byte[] buffer, Object item) {
Record rec = (Record) item;
byte[] str;
int indx = 0;
str = rec.getKey().getBytes();
for(int i = 0; i < str.length; i++) {
buffer[indx++] = str[i];
}
buffer[indx++] = '\t';
str = rec.getValue().getBytes();
for(int i = 0; i < str.length; i++) {
buffer[indx++] = str[i];
}
while(indx < buffer.length - 1) {
buffer[indx++] = ' ';
}
buffer[buffer.length - 1] = '\n';
}
protected Comparator getComparator() {
return RECORD_COMPARATOR;
}
}
private class ListAsSet
extends AbstractSet {
public Iterator iterator() {
return indxList.iterator();
}
public int size() {
return indxList.size();
}
}
}
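As a reading aid for the file above: IndexFileAsList stores fixed-length records (tab-separated fields, padded with spaces, terminated by a newline), which is what lets get() seek to record n by multiplication. Below is a minimal self-contained sketch of that layout. The class FixedRecord and its method names are illustrative only and are not part of BioJava; the sketch also assumes ASCII ids.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FixedRecord {
    /** Encode id, fileID, start and length into one record of recordLen bytes. */
    static byte[] encode(String id, int fileId, long start, int length, int recordLen) {
        byte[] fields = (id + "\t" + fileId + "\t" + start + "\t" + length)
            .getBytes(StandardCharsets.US_ASCII);
        if (fields.length > recordLen - 1) {
            throw new IllegalArgumentException(
                "Record does not fit in " + recordLen + " bytes");
        }
        byte[] buffer = new byte[recordLen];
        Arrays.fill(buffer, (byte) ' ');                 // pad with spaces
        System.arraycopy(fields, 0, buffer, 0, fields.length);
        buffer[recordLen - 1] = '\n';                    // record terminator
        return buffer;
    }

    /** Split a record back into its four fields, dropping padding and newline. */
    static String[] decode(byte[] buffer) {
        String text = new String(buffer, StandardCharsets.US_ASCII).trim();
        return text.split("\t");
    }

    /** Byte offset of record number indx; long arithmetic avoids int overflow. */
    static long offsetOf(int indx, int recordLen) {
        return (long) indx * recordLen;
    }

    public static void main(String[] args) {
        byte[] rec = encode("AB000001", 3, 1024L, 512, 40);
        System.out.println(Arrays.toString(decode(rec)));
        System.out.println(offsetOf(100000000, 40));
    }
}
```

Because every record has the same length, a sorted file can be binary-searched by id without loading it into memory, which appears to be the intent of fetch() above.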
From bugzilla-daemon at newportal.open-bio.org Wed Aug 9 20:18:12 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 20:18:12 -0400
Subject: [Biojava-dev] [Bug 2068] New: bytecode.jar incorrectly specified in
build.xml
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2068
Summary: bytecode.jar incorrectly specified in build.xml
Product: BioJava
Version: 1.4
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Others
AssignedTo: biojava-dev at biojava.org
ReportedBy: pedromor at ufl.edu
When downloading BioJava, I had to tweak build.xml to include
bytecode-0.92.jar in the classpath section. At first I could not find the
bytecode package because I didn't realize that it was in that jar file; line 51
of build.xml refers to "bytecode.jar" and not "bytecode-0.92.jar".
You should package the required jar files in a lib/ directory, and include them
all in the classpath via the build.xml file.
Thank you!
Pedro
From bugzilla-daemon at newportal.open-bio.org Wed Aug 9 21:51:11 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 21:51:11 -0400
Subject: [Biojava-dev] [Bug 2068] bytecode.jar incorrectly specified in
build.xml
In-Reply-To:
Message-ID: <200608100151.k7A1pB9c018349@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2068
mark.schreiber at novartis.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID
------- Comment #1 from mark.schreiber at novartis.com 2006-08-09 21:51 -------
This is not a bug. The ant build file is for the CVS download. In the CVS
download of biojava-live the bytecode.jar is correctly named.
Downloads of the bytecode.jar and biojava.jar in preassembled form don't need
the ant build script.
From bugzilla-daemon at newportal.open-bio.org Wed Aug 9 21:59:18 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 21:59:18 -0400
Subject: [Biojava-dev] [Bug 2067] Errors under Java1.5
In-Reply-To:
Message-ID: <200608100159.k7A1xIUQ018974@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
mark.schreiber at novartis.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME
------- Comment #1 from mark.schreiber at novartis.com 2006-08-09 21:59 -------
This works with NetBeans and with ant from the command line. Maybe it is an
Eclipse bug.
BioJava currently doesn't use generics, so as a workaround you could possibly
'switch off' generics.
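For readers unfamiliar with the Java 1.5 rule at issue: an anonymous class built on the raw Comparator type has to implement compare(Object, Object), whereas compare(String, String) only implements the interface when the type argument is supplied. A minimal sketch, independent of BioJava (the class name ComparatorForms is made up for illustration):

```java
import java.util.Arrays;
import java.util.Comparator;

public class ComparatorForms {
    // Pre-generics (source level 1.4) style: raw type, parameters are Object.
    @SuppressWarnings("rawtypes")
    static final Comparator RAW_ORDER = new Comparator() {
        public int compare(Object a, Object b) {
            return ((String) a).compareTo((String) b);
        }
    };

    // Java 1.5 style: with the type argument, compare(String, String)
    // is the method that implements the interface.
    static final Comparator<String> TYPED_ORDER = new Comparator<String>() {
        public int compare(String a, String b) {
            return a.compareTo(b);
        }
    };

    public static void main(String[] args) {
        String[] ids = { "b", "B", "a" };
        Arrays.sort(ids, TYPED_ORDER); // case-sensitive: upper case sorts first
        System.out.println(Arrays.toString(ids));
    }
}
```

Mixing the two forms, a raw Comparator with a compare(String, String) method, leaves compare(Object, Object) unimplemented, which matches the kind of complaint described in the report.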
From mateusz at kaduk.net Thu Aug 10 03:23:12 2006
From: mateusz at kaduk.net (Mateusz Kaduk)
Date: Thu, 10 Aug 2006 09:23:12 +0200
Subject: [Biojava-dev] One character amino acid representation
Message-ID: <1155194592.7371.8.camel@localhost.localdomain>
Hi,
How do I get the one-character amino acid code from a Symbol?
The getName() method returns a three-character string.
Thanks in advance,
From mark.schreiber at novartis.com Thu Aug 10 03:47:55 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Thu, 10 Aug 2006 15:47:55 +0800
Subject: [Biojava-dev] One character amino acid representation
Message-ID:
You need to use a SymbolTokenization to convert a Symbol to a character (or
String).
    Symbol tyrosine = ProteinTools.tyr();
    Alphabet protein = ProteinTools.getAlphabet();
    SymbolTokenization st = protein.getTokenization("token");
    String token = st.tokenizeSymbol(tyrosine);
    //token should be equal to "Y"
For a SymbolList you can use
    SymbolList sl = ... //make a SymbolList
    st.tokenizeSymbolList(sl);
I'm really not sure why we didn't just have a single tokenize() method and
overload it but it is stuck that way now.
- Mark
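For anyone reading without BioJava on the classpath, the effect of the "token" tokenization on the 20 standard residues can be pictured as a lookup from the three-letter name to the one-letter code. The sketch below is only a stand-in with a made-up class name (OneLetterCodes); it is not how BioJava implements SymbolTokenization and it ignores ambiguity symbols.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class OneLetterCodes {
    private static final Map<String, Character> CODES = new HashMap<String, Character>();
    static {
        // Standard three-letter to one-letter amino acid codes.
        String[] table = {
            "ALA:A", "ARG:R", "ASN:N", "ASP:D", "CYS:C",
            "GLN:Q", "GLU:E", "GLY:G", "HIS:H", "ILE:I",
            "LEU:L", "LYS:K", "MET:M", "PHE:F", "PRO:P",
            "SER:S", "THR:T", "TRP:W", "TYR:Y", "VAL:V"
        };
        for (String entry : table) {
            CODES.put(entry.substring(0, 3), entry.charAt(4));
        }
    }

    /** Return the one-letter code for a three-letter residue name such as "TYR". */
    static char oneLetter(String threeLetterName) {
        Character code = CODES.get(threeLetterName.toUpperCase(Locale.ROOT));
        if (code == null) {
            throw new IllegalArgumentException("Unknown residue: " + threeLetterName);
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(oneLetter("TYR"));
    }
}
```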
_______________________________________________
biojava-dev mailing list
biojava-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-dev
From bugzilla-daemon at newportal.open-bio.org Thu Aug 10 04:43:49 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Thu, 10 Aug 2006 04:43:49 -0400
Subject: [Biojava-dev] [Bug 2067] Errors under Java1.5
In-Reply-To:
Message-ID: <200608100843.k7A8hnqK004361@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
holland at ebi.ac.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
------- Comment #2 from holland at ebi.ac.uk 2006-08-10 04:43 -------
BioJava will compile and run just fine under Java 1.5.
However, when compiling the source code in an IDE such as Eclipse you must tell
Eclipse to compile using source level 1.4, _not_ 1.5, otherwise it will
complain about constructs that are not legal in 1.5 but are perfectly legal in
1.4.
So, this is not a bug with BioJava. It can easily be resolved by changing your
Eclipse project settings to source level 1.4.
From gwaldon at geneinfinity.org Thu Aug 24 19:59:40 2006
From: gwaldon at geneinfinity.org (george waldon)
Date: Thu, 24 Aug 2006 16:59:40 -0700
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID: <200608242359.k7ONxerp092068@mmm1924.dulles19-verio.com>
Hi,
The keyword BASE COUNT has been deprecated in GenBank releases since October
2003. I'd like to remove it completely from the output of GenbankFormat when
writing sequences.
Is this OK with everybody?
Thanks,
George
From mark.schreiber at novartis.com Fri Aug 25 02:43:48 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 25 Aug 2006 14:43:48 +0800
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID:
Probably OK. Will this also be removed from the GenbankFormat in org.biojavax?
Also, what will happen if I have a legacy Sequence annotated with that
keyword and I try to write it out with the new format?
- Mark
From gwaldon at geneinfinity.org Fri Aug 25 13:46:15 2006
From: gwaldon at geneinfinity.org (george waldon)
Date: Fri, 25 Aug 2006 10:46:15 -0700
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID: <200608251746.k7PHkFth036355@mmm1924.dulles19-verio.com>
-----Original Message-----
>From: mark.schreiber at novartis.com
>Will this be removed from the GenebankFormat in
>org.biojavax ?
- Yes
>Also, what will happen if I have a legacy Sequence annotate
> with that keyword and I try to write it out with the new
>format?
It will be lost from the output, but not as information: the counts can easily
be recovered from the sequence data. Nothing else will happen; I don't think
this line is parsed anyway, and certainly not in biojavax. Note that until
yesterday the output was incorrect: the COUNT keyword was missing and only BASE
was written out. This is now corrected in the trunk. I think removing it
completely is a logical move and is very safe. If you look at the latest
GenBank release notes (GenBank release 154.0, June 15 2006), you'll see that
more things are going to change, and we need to keep our output up to date.
- George
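To illustrate the point that the counts are recoverable from the sequence itself, here is a minimal sketch that tallies the bases of a plain string and prints them in the style of the old BASE COUNT line. The class name BaseCount is made up, the column spacing only approximates the flat-file layout, and a real implementation would work on a BioJava SymbolList rather than a String.

```java
import java.util.Locale;

public class BaseCount {
    /** Count a, c, g and t (case-insensitive); anything else is tallied as "others". */
    static int[] count(String sequence) {
        int[] counts = new int[5]; // a, c, g, t, others
        for (int i = 0; i < sequence.length(); i++) {
            switch (Character.toLowerCase(sequence.charAt(i))) {
                case 'a': counts[0]++; break;
                case 'c': counts[1]++; break;
                case 'g': counts[2]++; break;
                case 't': counts[3]++; break;
                default:  counts[4]++; break;
            }
        }
        return counts;
    }

    /** Format the counts in the style of the old BASE COUNT line (spacing approximate). */
    static String format(int[] c) {
        String line = String.format(Locale.ROOT,
            "BASE COUNT  %7d a%7d c%7d g%7d t", c[0], c[1], c[2], c[3]);
        if (c[4] > 0) {
            line += String.format(Locale.ROOT, "%7d others", c[4]);
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(format(count("acgtacgtaaccnn")));
    }
}
```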
From mark.schreiber at novartis.com Sun Aug 27 21:50:09 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 28 Aug 2006 09:50:09 +0800
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID:
OK, go ahead with the change.
Are you OK to watch for format changes?
Thanks,
- Mark
From bugzilla-daemon at newportal.open-bio.org Tue Aug 1 09:08:22 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Tue, 1 Aug 2006 05:08:22 -0400
Subject: [Biojava-dev] [Bug 2046] Cannot read in more than one serialized
ProfileHMM object, attempt crashes
In-Reply-To:
Message-ID: <200608010908.k7198MQX029674@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2046
------- Comment #3 from holland at ebi.ac.uk 2006-08-01 05:08 -------
I did a bit more investigation, and found that on each pass of the
deserializer, two sets of symbols from two alphabets are involved. The symbols
from the first alphabet are deserialized correctly each time, but the second
time round the second alphabet symbols are _not_. See log below from the
example code above plus a slightly modified SimpleDistribution#readObject:
SymbolWeightMemento[] swm = (SymbolWeightMemento[]) stream.readObject();
for (int m = 0; m < swm.length; ++m) {
try {
System.err.println("Looking for: "+swm[m].symbol);
System.err.println("Alphabet is: "+alpha);
weights[indexer.indexForSymbol(swm[m].symbol)] = swm[m].weight;
} catch (IllegalSymbolException ex) {
throw new IOException("Symbol in serialized stream can't be found
in the alphabet");
}
}
I am not sure what is causing this. Could someone who knows more about
deserializing alphabets have a look?
---
Writing HMM
Wrote 22561 bytes
Reading HMM
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@16cd7d5
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
...
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
Looking for: org.biojava.bio.dp.SimpleDotState: d-2
Alphabet is: Transitions from d-1
Looking for: org.biojava.bio.dp.SimpleEmissionState@70610a
...
Looking for: org.biojava.bio.dp.SimpleEmissionState@a7dd39
Alphabet is: Transitions from m-8
Read HMM
Reading HMM again!
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@16cd7d5
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
...
Looking for: org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol@cdedfd
Alphabet is: org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@64f6cd
Looking for: org.biojava.bio.dp.SimpleDotState: d-2
Alphabet is: Transitions from d-1
Exception in thread "main" java.io.IOException: Symbol in serialized stream can't be found in the alphabet
        at org.biojava.bio.dist.SimpleDistribution.readObject(SimpleDistribution.java:101)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
        at java.util.HashMap.readObject(HashMap.java:1015)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
        at sandbox.TestByteArray.testSerialize(TestByteArray.java:49)
        at sandbox.TestByteArray.main(TestByteArray.java:114)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at newportal.open-bio.org Fri Aug 4 08:10:08 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Fri, 4 Aug 2006 04:10:08 -0400
Subject: [Biojava-dev] [Bug 2046] Cannot read in more than one serialized
ProfileHMM object, attempt crashes
In-Reply-To:
Message-ID: <200608040810.k748A8i6009880@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2046
------- Comment #4 from holland at ebi.ac.uk 2006-08-04 04:10 -------
When reading/writing SimpleSymbolLists, even with ambiguity symbols in, this
problem does not occur.
Specifically the problem occurs upon the second deserialization of the
org.biojava.bio.dp.SimpleDotState 'd-2' symbol, which belongs to the
'Transitions from d-1' alphabet.
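One general mechanism that can produce this kind of symptom is loss of object identity across deserialization. This is offered only as an illustration, not as a confirmed diagnosis of this bug. The sketch below uses hypothetical classes (CanonicalSymbol, PlainSymbol, SymbolRoundTrip), none of which are BioJava code. It shows that a deserialized object is a fresh instance unless the class supplies readResolve() to hand back a canonical one, so any lookup that compares symbols by identity would fail for the fresh copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical symbol type that keeps exactly one instance per name.
class CanonicalSymbol implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Map<String, CanonicalSymbol> REGISTRY =
            new HashMap<String, CanonicalSymbol>();

    private final String name;

    private CanonicalSymbol(String name) {
        this.name = name;
    }

    static synchronized CanonicalSymbol forName(String name) {
        CanonicalSymbol s = REGISTRY.get(name);
        if (s == null) {
            s = new CanonicalSymbol(name);
            REGISTRY.put(name, s);
        }
        return s;
    }

    // Invoked by ObjectInputStream after the fields are read:
    // replaces the freshly built copy with the registered instance.
    private Object readResolve() {
        return forName(name);
    }
}

// Same shape, but with no readResolve(): every read yields a new object.
class PlainSymbol implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    PlainSymbol(String name) {
        this.name = name;
    }
}

class SymbolRoundTrip {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(o);
        out.close();
        ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        PlainSymbol plain = new PlainSymbol("d-2");
        CanonicalSymbol canonical = CanonicalSymbol.forName("d-2");
        System.out.println("plain same instance: " + (roundTrip(plain) == plain));
        System.out.println("canonical same instance: " + (roundTrip(canonical) == canonical));
    }
}
```

Whether the state symbols in a ProfileHMM follow either of these patterns would need to be checked in the BioJava source.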
From bugzilla-daemon at newportal.open-bio.org Wed Aug 9 23:24:21 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 19:24:21 -0400
Subject: [Biojava-dev] [Bug 2067] New: Errors under Java1.5
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
Summary: Errors under Java1.5
Product: BioJava
Version: 1.4
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: seq
AssignedTo: biojava-dev at biojava.org
ReportedBy: pedromor at ufl.edu
Modified org.biojava.bio.seq.db.BioIndex.java to compile under Java 1.5 with
generics enabled. Eclipse 3.2 with JDK 1.5.07 complained that the
compare(Object,Object) method could not be resolved under the rules for
generics enabled in Java 1.5.
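For readers unfamiliar with the Java 5 rules, here is a minimal standalone sketch of a comparator declared with a type argument, which is the form that compiles cleanly at source level 1.5. The class name TypedComparatorSketch and the helper sorted() are hypothetical and are not part of BioJava.

```java
import java.util.Arrays;
import java.util.Comparator;

class TypedComparatorSketch {
    // With the type argument, compare(String, String) is exactly the
    // method that Comparator<String> declares, so no casts are needed.
    static final Comparator<String> STRING_CASE_SENSITIVE_ORDER =
            new Comparator<String>() {
                public int compare(String a, String b) {
                    return a.compareTo(b);
                }
            };

    // Returns a sorted copy so the caller's array is left untouched.
    static String[] sorted(String[] ids) {
        String[] copy = ids.clone();
        Arrays.sort(copy, STRING_CASE_SENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // Upper-case letters sort before lower-case ones in this ordering.
        System.out.println(Arrays.toString(sorted(new String[] {"b", "B", "a"})));
    }
}
```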
Below is an inlined version of the amendments to the file.
//-----------------------------------------------------------------
/*
* BioJava development code
*
* This code may be freely distributed and modified under the
* terms of the GNU Lesser General Public Licence. This should
* be distributed with the code. If you do not have a copy,
* see:
*
* http://www.gnu.org/copyleft/lesser.html
*
* Copyright for this code is held jointly by the individual
* authors. These should be listed in @author doc comments.
*
* For more information on the BioJava project and its aims,
* or to join the biojava-l mailing list, visit the home page
* at:
*
* http://www.biojava.org/
*
*/
package org.biojava.bio.seq.db;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.RandomAccessFile;
import java.util.AbstractList;
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;
import org.biojava.bio.BioError;
import org.biojava.bio.BioException;
import org.biojava.bio.seq.io.SequenceBuilderFactory;
import org.biojava.bio.seq.io.SequenceFormat;
import org.biojava.bio.seq.io.SymbolTokenization;
/**
* The original object for indexing sequence files.
*
* This class may not be thread-safe.
*
* @author Matthew Pocock
* @author Thomas Down
*/
public class BioIndex implements IndexStore {
private static Comparator<String> STRING_CASE_SENSITIVE_ORDER =
new Comparator<String>() {
public int compare(String a, String b) {
return a.compareTo(b);
}
};
private File indexDirectory;
private int fileCount;
private File[] fileIDToFile;
private FileAsList indxList;
private Map secondaryKeyToFileAsList;
private Set idSet = new ListAsSet();
private String name;
private SequenceFormat format;
private SequenceBuilderFactory sbFactory;
private SymbolTokenization symbolTokenization;
{
fileCount = 0;
fileIDToFile = new File[4];
}
public BioIndex(
File indexDirectory,
String namespace,
int idLength
) throws IOException, BioException {
if(indexDirectory.exists()) {
throw new BioException(
"Can't create new index as directory already exists: " +
indexDirectory
);
}
// create directory
indexDirectory.mkdirs();
// create BIOINDEX.dat
{
File bioindex = new File(indexDirectory, "BIOINDEX.dat");
bioindex.createNewFile();
PrintWriter pw = new PrintWriter(new FileWriter(bioindex));
pw.println("index\tflat/1");
pw.close();
}
// create fileids.dat
PrintWriter fileidsWriter;
{
File fileids = new File(indexDirectory, "fileids.dat");
fileids.createNewFile();
fileidsWriter = new PrintWriter(
new FileWriter(
fileids
)
);
}
// create config.dat
PrintWriter configWriter;
{
File config = new File(indexDirectory, "config.dat");
config.createNewFile();
configWriter = new PrintWriter(new FileWriter(config));
configWriter.println("namespace\t" + namespace);
}
// create index file
{
String uniqueName = "key_" + namespace + ".key";
File unique = new File(indexDirectory, uniqueName);
unique.createNewFile();
int recordLen =
idLength + // id
1 + // tab
4 + // 9999 files
1 + // tab
String.valueOf(Long.MAX_VALUE).length() + // space for any long
1 + // tab
String.valueOf(Integer.MAX_VALUE).length() + // space for any int
"\n".length() // new line (OS dependent)
;
indxList = new IndexFileAsList(
new RandomAccessFile(unique, "rw"),
recordLen
);
fileidsWriter.println(uniqueName + "\t" + recordLen);
}
// other field initialization to get things going
fileCount = 0;
fileIDToFile = new File[4];
configWriter.close();
fileidsWriter.close();
}
/**
* Load an existing index file.
*
* If indexDirectory does not exist, or is not a bioindex store, this will
* barf.
*/
public BioIndex(
File indexDirectory
) throws IOException, BioException {
this.indexDirectory = indexDirectory;
if(!indexDirectory.exists()) {
throw new BioException(
"Tried to load non-existent index: " +
indexDirectory
);
}
// read in the global config
{
System.out.println("Global");
Map config = new HashMap();
BufferedReader fi = new BufferedReader(
new FileReader(
new File(indexDirectory, "config.dat")
)
);
for(String line = fi.readLine(); line != null; line = fi.readLine()) {
int tab = line.indexOf("\t");
config.put(line.substring(0, tab), line.substring(tab + 1));
}
String namespace = (String) config.get("namespace");
// resolve the key file relative to the index directory
RandomAccessFile indxFile = new RandomAccessFile(
new File(indexDirectory, "key_" + namespace + ".key"), "rw");
int recLen = guessRecLen(indxFile);
indxList = new IndexFileAsList(indxFile, recLen);
}
// set up file set
{
System.out.println("Files");
fileCount = 0;
fileIDToFile = new File[4];
BufferedReader fi = new BufferedReader(
new FileReader(
new File(indexDirectory, "fileids.dat")
)
);
for(String line = fi.readLine(); line != null; line = fi.readLine()) {
StringTokenizer sTok = new StringTokenizer(line, "\t");
int id = Integer.parseInt(sTok.nextToken());
File file = new File(sTok.nextToken());
long fileLength = Long.parseLong(sTok.nextToken());
if(file.length() != fileLength) {
throw new BioException("File length changed: " + file + " "
+ file.length() + " vs " + fileLength);
}
fileIDToFile[id] = file;
}
}
}
private File getFileForID(int fileId) {
return fileIDToFile[fileId];
}
private int getIDForFile(File file) {
// scan list
for(int i = 0; i < fileCount; i++) {
if(file.equals(fileIDToFile[i])) {
return i;
}
}
// extend fileIDToFile array
if(fileCount >= fileIDToFile.length) {
File[] tmp = new File[fileIDToFile.length + 4]; // 4 is magic number
System.arraycopy(fileIDToFile, 0, tmp, 0, fileCount);
fileIDToFile = tmp;
}
// add the unseen file to the list
fileIDToFile[fileCount] = file;
return fileCount++;
}
public String getName() {
return this.name;
}
public int guessRecLen(RandomAccessFile file)
throws IOException {
file.seek(0l);
int b = 0;
while(b != '\n' && b != '\r') {
b = file.read();
}
int offset = (int) file.getFilePointer();
if(b == '\n') { // \n
return offset + 1;
} else {
b = file.read();
if(b == '\n') { // \r\n
return offset + 2;
} else { // \r
return offset + 1;
}
}
}
public Index fetch(String id)
throws IllegalIDException, BioException {
int indx = Collections.binarySearch(
indxList,
id,
indxList.getComparator()
);
if(indx < 0) {
throw new IllegalIDException("Can't find sequence for " + id);
}
return (Index) indxList.get(indx);
}
public void store(Index indx) {
indxList.add(indx);
}
public void commit()
throws BioException {
indxList.commit();
try {
// write files
{
PrintStream fo = new PrintStream(
new FileOutputStream(
new File(indexDirectory, "fileids.dat")
)
);
for(int i = 0; i < fileCount; i++) {
fo.print(i);
fo.print('\t');
fo.print(fileIDToFile[i]);
fo.print('\t');
fo.print(fileIDToFile[i].length());
fo.println();
}
fo.close();
}
} catch (Exception e) {
rollback();
throw new BioException("Unable to commit. Rolled back to be safe",e);
}
}
public void rollback() {
indxList.rollback();
}
public Set getIDs() {
return idSet;
}
public Set getFiles() {
return new HashSet(Arrays.asList(fileIDToFile));
}
public SequenceFormat getFormat() {
return format;
}
public SequenceBuilderFactory getSBFactory() {
return sbFactory;
}
public SymbolTokenization getSymbolParser() {
return symbolTokenization;
}
private interface Commitable {
public void commit()
throws BioException;
public void rollback();
}
// records stored as:
// seqID(\w+) \t fileID(\w+) \t start(\d+) \t length(\d+) ' ' * \n
private abstract class FileAsList
extends AbstractList
implements /* RandomAccess, */ Commitable {
private RandomAccessFile mappedFile;
private int commitedRecords;
private int lastIndx;
private Object lastRec;
private byte[] buffer;
public FileAsList(RandomAccessFile mappedFile, int recordLength) {
this.mappedFile = mappedFile;
buffer = new byte[recordLength];
}
public Object get(int indx) {
if(indx < 0 || indx >= size()) {
throw new IndexOutOfBoundsException();
}
if(indx == lastIndx) {
return lastRec;
}
long offset = indx * buffer.length;
try {
mappedFile.seek(offset);
mappedFile.readFully(buffer);
} catch (IOException ioe) {
throw new BioError("Failed to seek for record",ioe);
}
lastRec = parseRecord(buffer);
lastIndx = indx;
return lastRec;
}
public int size() {
try {
return (int) (mappedFile.length() / (long) buffer.length);
} catch (IOException ioe) {
throw new BioError("Can't read file length",ioe);
}
}
public boolean add(Object o) {
generateRecord(buffer, o);
try {
mappedFile.seek(mappedFile.length());
mappedFile.write(buffer);
} catch (IOException ioe) {
throw new BioError("Failed to write index",ioe);
}
return true;
}
public void commit() {
Collections.sort(indxList, indxList.getComparator());
commitedRecords = indxList.size();
}
public void rollback() {
try {
mappedFile.setLength((long) commitedRecords * (long) buffer.length);
} catch (Throwable t) {
throw new BioError(
"Could not roll back. " +
"The index store will be in an inconsistent state " +
"and should be discarded. File: " + mappedFile, t
);
}
}
protected abstract Object parseRecord(byte[] buffer);
protected abstract void generateRecord(byte[] buffer, Object item);
protected abstract Comparator getComparator();
}
private class IndexFileAsList extends FileAsList {
private Comparator INDEX_COMPARATOR = new Comparator() {
public int compare(Object a, Object b) {
String as;
String bs;
if(a instanceof Index) {
as = ((Index) a).getID();
} else {
as = (String) a;
}
if(b instanceof Index) {
bs = ((Index) b).getID();
} else {
bs = (String) b;
}
return STRING_CASE_SENSITIVE_ORDER.compare(as, bs);
}
};
public IndexFileAsList(RandomAccessFile file, int recordLength) {
super(file, recordLength);
}
protected Object parseRecord(byte[] buffer) {
// fields are tab-separated: id, fileID, start, length (space-padded)
int lastI = 0;
int newI = 0;
while(buffer[newI] != '\t') {
newI++;
}
// String(byte[], offset, length): the third argument is a length
String id = new String(buffer, lastI, newI - lastI);
lastI = ++newI; // step past the tab
while(buffer[newI] != '\t') {
newI++;
}
File file = getFileForID(Integer.parseInt(
new String(buffer, lastI, newI - lastI).trim()));
lastI = ++newI; // step past the tab
while(buffer[newI] != '\t') {
newI++;
}
long start = Long.parseLong(new String(buffer, lastI, newI - lastI));
int length = Integer.parseInt(
new String(buffer, newI + 1, buffer.length - newI - 1).trim()
);
return new SimpleIndex(file, start, length, id);
}
protected void generateRecord(byte[] buffer, Object item) {
Index indx = (Index) item;
String id = indx.getID();
int fileID = getIDForFile(indx.getFile());
String start = String.valueOf(indx.getStart());
String length = String.valueOf(indx.getLength());
int i = 0;
byte[] str;
str = id.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = String.valueOf(fileID).getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = start.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
buffer[i++] = '\t';
str = length.getBytes();
for(int j = 0; j < str.length; j++) {
buffer[i++] = str[j];
}
while(i < buffer.length - 1) {
buffer[i++] = ' ';
}
buffer[i] = '\n';
}
public Comparator getComparator() {
return INDEX_COMPARATOR;
}
}
private static final class Record {
private final String key;
private final String value;
public Record(String key, String value) {
this.key = key;
this.value = value;
}
public String getKey() {
return key;
}
public String getValue() {
return value;
}
public int hashCode() {
return key.hashCode();
}
}
private class SecondaryIDFileAsList extends FileAsList {
private Comparator RECORD_COMPARATOR = new Comparator() {
public int compare(Object a, Object b) {
String as;
String bs;
if(a instanceof Record) {
as = ((Record) a).getKey();
} else {
as = (String) a;
}
if(b instanceof Record) {
bs = ((Record) b).getKey();
} else {
bs = (String) b;
}
return STRING_CASE_SENSITIVE_ORDER.compare(as, bs);
}
};
public SecondaryIDFileAsList(RandomAccessFile file, int recordLength) {
super(file, recordLength);
}
public Object parseRecord(byte[] buffer) {
int tab = 0;
while(buffer[tab] != '\t') {
tab++;
}
String key = new String(buffer, 0, tab);
String value = new String(buffer, tab + 1, buffer.length - tab - 1).trim();
return new Record(key, value);
}
protected void generateRecord(byte[] buffer, Object item) {
Record rec = (Record) item;
byte[] str;
int indx = 0;
str = rec.getKey().getBytes();
for(int i = 0; i < str.length; i++) {
buffer[indx++] = str[i];
}
buffer[indx++] = '\t';
str = rec.getValue().getBytes();
for(int i = 0; i < str.length; i++) {
buffer[indx++] = str[i];
}
while(indx < buffer.length - 1) {
buffer[indx++] = ' ';
}
buffer[buffer.length - 1] = '\n';
}
protected Comparator getComparator() {
return RECORD_COMPARATOR;
}
}
private class ListAsSet
extends AbstractSet {
public Iterator iterator() {
return indxList.iterator();
}
public int size() {
return indxList.size();
}
}
}
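The fixed-width record layout described in the comment above FileAsList (seqID, fileID, start, length, space padding, newline) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the BioJava implementation: the class FixedWidthRecordSketch and its methods are hypothetical, and ASCII-only field contents are assumed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FixedWidthRecordSketch {
    // Layout: id \t fileId \t start \t length, padded with spaces,
    // with '\n' in the final byte so every record has the same size.
    static byte[] encode(String id, int fileId, long start, int length, int recordLength) {
        byte[] fields = (id + "\t" + fileId + "\t" + start + "\t" + length)
                .getBytes(StandardCharsets.US_ASCII);
        if (fields.length > recordLength - 1) {
            throw new IllegalArgumentException(
                    "Record does not fit in " + recordLength + " bytes");
        }
        byte[] buffer = new byte[recordLength];
        Arrays.fill(buffer, (byte) ' ');
        System.arraycopy(fields, 0, buffer, 0, fields.length);
        buffer[recordLength - 1] = '\n';
        return buffer;
    }

    // Returns {id, fileId, start, length} as strings, padding removed.
    static String[] decode(byte[] buffer) {
        // Drop the trailing newline before splitting on tabs.
        String line = new String(buffer, 0, buffer.length - 1, StandardCharsets.US_ASCII);
        String[] parts = line.split("\t", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 tab-separated fields");
        }
        parts[3] = parts[3].trim();
        return parts;
    }

    public static void main(String[] args) {
        byte[] rec = encode("AB000001", 2, 1024L, 350, 40);
        System.out.println(Arrays.toString(decode(rec)));
    }
}
```

Because every record occupies the same number of bytes, record n starts at byte n * recordLength, which is what makes seeking and binary search over the file possible.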
From bugzilla-daemon at newportal.open-bio.org Thu Aug 10 00:18:12 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 20:18:12 -0400
Subject: [Biojava-dev] [Bug 2068] New: bytecode.jar incorrectly specified in
build.xml
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2068
Summary: bytecode.jar incorrectly specified in build.xml
Product: BioJava
Version: 1.4
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Others
AssignedTo: biojava-dev at biojava.org
ReportedBy: pedromor at ufl.edu
After downloading BioJava, I had to tweak build.xml to include
bytecode-0.92.jar in the classpath section. At first I could not find the
bytecode package because I didn't realize that it was in that jar file; line 51
of build.xml refers to "bytecode.jar" rather than "bytecode-0.92.jar".
You should package the required jar files in a lib/ directory, and include them
all in the classpath via the build.xml file.
Thank you!
Pedro
From bugzilla-daemon at newportal.open-bio.org Thu Aug 10 01:51:11 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 21:51:11 -0400
Subject: [Biojava-dev] [Bug 2068] bytecode.jar incorrectly specified in
build.xml
In-Reply-To:
Message-ID: <200608100151.k7A1pB9c018349@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2068
mark.schreiber at novartis.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID
------- Comment #1 from mark.schreiber at novartis.com 2006-08-09 21:51 -------
This is not a bug. The ant build file is for the CVS download. In the CVS
download of biojava-live the bytecode.jar is correctly named.
Downloads of the bytecode.jar and biojava.jar in preassembled form don't need
the ant build script.
From bugzilla-daemon at newportal.open-bio.org Thu Aug 10 01:59:18 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Wed, 9 Aug 2006 21:59:18 -0400
Subject: [Biojava-dev] [Bug 2067] Errors under Java1.5
In-Reply-To:
Message-ID: <200608100159.k7A1xIUQ018974@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
mark.schreiber at novartis.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME
------- Comment #1 from mark.schreiber at novartis.com 2006-08-09 21:59 -------
This works with NetBeans and with ant from the command line. Maybe it is an
Eclipse bug.
BioJava currently doesn't use generics, so as a workaround you could possibly
'switch off' generics.
From mateusz at kaduk.net Thu Aug 10 07:23:12 2006
From: mateusz at kaduk.net (Mateusz Kaduk)
Date: Thu, 10 Aug 2006 09:23:12 +0200
Subject: [Biojava-dev] One character amino acid representation
Message-ID: <1155194592.7371.8.camel@localhost.localdomain>
Hi,
How do I get the one-character amino acid code from a Symbol?
The getName() method returns a three-character string.
Thanks in advance,
From mark.schreiber at novartis.com Thu Aug 10 07:47:55 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Thu, 10 Aug 2006 15:47:55 +0800
Subject: [Biojava-dev] One character amino acid representation
Message-ID:
You need to use a SymbolTokenization to convert a Symbol to a character (or
String).
    Symbol tyrosine = ProteinTools.tyr();
    Alphabet protein = ProteinTools.getAlphabet();
    SymbolTokenization st = protein.getTokenization("token");
    String token = st.tokenizeSymbol(tyrosine);
    //token should be equal to "Y"
For a SymbolList you can use
    SymbolList sl = ... //make a SymbolList
    st.tokenizeSymbolList(sl);
I'm really not sure why we didn't just have a single tokenize() method and
overload it but it is stuck that way now.
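As a library-independent illustration of the three-letter to one-letter mapping asked about above, here is a plain lookup table for the 20 standard amino acids. It only sketches the idea of a tokenization; the class OneLetterCodeSketch is hypothetical and is not the BioJava API, and it assumes the input is a standard three-letter residue name.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class OneLetterCodeSketch {
    private static final Map<String, Character> CODES = new HashMap<String, Character>();

    static {
        // Standard three-letter name : one-letter code pairs.
        String[] pairs = {
            "ALA:A", "ARG:R", "ASN:N", "ASP:D", "CYS:C",
            "GLN:Q", "GLU:E", "GLY:G", "HIS:H", "ILE:I",
            "LEU:L", "LYS:K", "MET:M", "PHE:F", "PRO:P",
            "SER:S", "THR:T", "TRP:W", "TYR:Y", "VAL:V"
        };
        for (String pair : pairs) {
            CODES.put(pair.substring(0, 3), pair.charAt(4));
        }
    }

    // Case-insensitive lookup; unknown names are rejected rather than guessed.
    static char oneLetter(String threeLetter) {
        Character code = CODES.get(threeLetter.toUpperCase(Locale.ROOT));
        if (code == null) {
            throw new IllegalArgumentException("Unknown residue name: " + threeLetter);
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(oneLetter("TYR"));
    }
}
```

In BioJava itself the tokenization obtained from the alphabet remains the better route, since it also covers ambiguity and gap symbols that a fixed table like this one does not.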
- Mark
_______________________________________________
biojava-dev mailing list
biojava-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-dev
From bugzilla-daemon at newportal.open-bio.org Thu Aug 10 08:43:49 2006
From: bugzilla-daemon at newportal.open-bio.org (bugzilla-daemon at newportal.open-bio.org)
Date: Thu, 10 Aug 2006 04:43:49 -0400
Subject: [Biojava-dev] [Bug 2067] Errors under Java1.5
In-Reply-To:
Message-ID: <200608100843.k7A8hnqK004361@newportal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2067
holland at ebi.ac.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
------- Comment #2 from holland at ebi.ac.uk 2006-08-10 04:43 -------
BioJava will compile and run just fine under Java 1.5.
However, when compiling the source code in an IDE such as Eclipse you must tell
Eclipse to compile using source level 1.4, _not_ 1.5, otherwise it will
complain about constructs that are not legal in 1.5 but are perfectly legal in
1.4.
So, this is not a bug with BioJava. It can easily be resolved by changing your
Eclipse project settings to source level 1.4.
From gwaldon at geneinfinity.org Thu Aug 24 23:59:40 2006
From: gwaldon at geneinfinity.org (george waldon)
Date: Thu, 24 Aug 2006 16:59:40 -0700
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID: <200608242359.k7ONxerp092068@mmm1924.dulles19-verio.com>
Hi,
The BASE COUNT keyword has been deprecated in GenBank releases since October 2003. I'd like to remove it completely from the output of GenbankFormat when writing sequences.
Is this OK with everybody?
Thanks,
George
From mark.schreiber at novartis.com Fri Aug 25 06:43:48 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 25 Aug 2006 14:43:48 +0800
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID:
Probably OK. Will this also be removed from the GenbankFormat in org.biojavax?
Also, what will happen if I have a legacy Sequence annotated with that
keyword and I try to write it out with the new format?
- Mark
From gwaldon at geneinfinity.org Fri Aug 25 17:46:15 2006
From: gwaldon at geneinfinity.org (george waldon)
Date: Fri, 25 Aug 2006 10:46:15 -0700
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID: <200608251746.k7PHkFth036355@mmm1924.dulles19-verio.com>
-----Original Message-----
>From: mark.schreiber at novartis.com
>Will this be removed from the GenebankFormat in
>org.biojavax ?
- Yes
>Also, what will happen if I have a legacy Sequence annotate
> with that keyword and I try to write it out with the new
>format?
The line will no longer be written, but no information is lost: the base counts can easily be recomputed from the sequence data. Nothing else will happen; I don't think this line is parsed anyway, certainly not in biojavax. Note that until yesterday the output was incorrect: the COUNT keyword was missing and only BASE was written out. This is now corrected in the trunk. I think removing it completely is a logical and very safe move. If you look at the latest GenBank release notes (GenBank release 154.0, June 15 2006), you'll see that more things are going to change, and we need to keep our output up to date.
- George
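To illustrate the point that the base counts can be recovered from the sequence itself, here is a minimal standalone sketch. The class BaseCountSketch is hypothetical, works on a plain String rather than a BioJava SymbolList, and does not attempt to reproduce the GenBank column layout.

```java
class BaseCountSketch {
    // Returns counts in the order a, c, g, t, other (case-insensitive).
    static int[] countBases(String sequence) {
        int[] counts = new int[5];
        for (int i = 0; i < sequence.length(); i++) {
            switch (Character.toLowerCase(sequence.charAt(i))) {
                case 'a': counts[0]++; break;
                case 'c': counts[1]++; break;
                case 'g': counts[2]++; break;
                case 't': counts[3]++; break;
                default:  counts[4]++; break; // ambiguity codes, gaps, etc.
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = countBases("ACGTTNa");
        System.out.println(c[0] + " a, " + c[1] + " c, " + c[2] + " g, "
                + c[3] + " t, " + c[4] + " other");
    }
}
```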
From mark.schreiber at novartis.com Mon Aug 28 01:50:09 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 28 Aug 2006 09:50:09 +0800
Subject: [Biojava-dev] GenbankFormat and BASE COUNT
Message-ID:
OK, go ahead with the change.
Are you OK to watch for format changes?
Thanks,
- Mark