From andreas at sdsc.edu Wed Sep 19 11:13:47 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 19 Sep 2012 08:13:47 -0700
Subject: [Biojava-l] new features and git
Message-ID:

Hi,

I have been a bit silent the last few weeks, mostly due to traveling.
So, a quick status update:

Several people are working on new features:

- Daniel Asarnow is working on a new CATH parser
- Marco Vaz is working on a new Stockholm parser
- Carmelo Foti is working on an enhanced GFF3 parser
- Thiago Satake is working on a Genbank parser
- For the structure modules, we are working on enhancements for
  working with the biological assemblies of proteins

As you can see, quite a number of things are in the pipeline!

We also have the still open discussion about our migration to Git and
github. Overall the feedback has been positive and it would seem that
there is support for this change.

I will spend the next couple of days talking to various people and
will send out a proposal for the next releases and a timeline for
migrating to git.

Andreas

From nickengland at gmail.com Thu Sep 20 11:06:24 2012
From: nickengland at gmail.com (Nick England)
Date: Thu, 20 Sep 2012 16:06:24 +0100
Subject: [Biojava-l] [Biojava-dev] Problem converting ab1 to fastq files in Biojava 1.8
In-Reply-To:
References:
Message-ID:

Sebastian,

I have also tried to obtain the AB1 quality scores using BioJava, but
was not successful. I can obtain them from scf files, but not AB1.
Looking at the API it appears that they are not available, but I would
be happy to be shown wrong!

Nick

On 20 September 2012 15:49, Sebastian garcia lopez wrote:
> Good day to all,
>
> Excuse my English; I will try to explain my problem more clearly.
> I need to convert an ab1 file into a fastq file. The problem is that I do
> not know how to obtain the quality scores from the ab1 file.
> To obtain the sequence, I use ABIFChromatogram, and I can in fact obtain
> the sequence, but I do not know how to obtain the quality scores to build
> my fastq file. The problem is that the "trace-offsets" do not correspond
> to quality scores. In case it is helpful, here is a fragment of the code
> that I am using:
>
> ABIFChromatogram y = new ABIFChromatogram(); // almost there; still need to work out how to read these alignments
> y = ABIFChromatogram.create(new File(Path));
> Alignment to = y.getBaseCalls();
> SymbolList dnaSeq = to.symbolListForLabel("dna");
> SymbolList trace = to.symbolListForLabel("trace-offsets");
>
> System.out.println(dnaSeq.seqString());
> System.out.println(trace.seqString());
>
> If somebody knows how to obtain the scores from ab1 files in BioJava,
> please let me know.
>
> Thank you.
>
> --
> Sebastián García López
> Electronic Engineer
> Universidad Nacional de Colombia at Manizales
>
> Ms.Eng. Industrial Automation Student
> Control and Digital Signal Processing Research Group (GC&PDS)
> Universidad Nacional de Colombia at Manizales
> MCP-Microsoft Certified Professional
>
> Email: deltadedirac at gmail.com
> sgarcialop at unal.edu.co
> Skype: sebastiang55
> Mobile: +57 3147569794
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

From ayates at ebi.ac.uk Thu Sep 20 11:17:02 2012
From: ayates at ebi.ac.uk (Andy Yates)
Date: Thu, 20 Sep 2012 16:17:02 +0100
Subject: [Biojava-l] [Biojava-dev] Problem converting ab1 to fastq files in Biojava 1.8
In-Reply-To:
References:
Message-ID: <350A403F-D8E9-44FD-9826-B91E9B8ECDE1@ebi.ac.uk>

Hi,

It's been about 7-8 years since I last looked into the ABIFChromatogram
object, and all I remember is failing to extract quality scores. Can you
convert the AB1 file into an SCF trace and then use BioJava? The Staden io
package comes with a binary called convert_trace which will get you into SCF.
There is no guarantee the scores will be converted as well, but it's worth
a try I guess.

Andy

On 20 Sep 2012, at 16:06, Nick England wrote:

> Sebastian,
>
> I have also tried to obtain the AB1 quality scores using BioJava, but
> was not successful. I can obtain them from scf files, but not AB1.
> Looking at the API it appears that they are not available, but I would
> be happy to be shown wrong!
>
> Nick
>
> On 20 September 2012 15:49, Sebastian garcia lopez
> wrote:
>> Good day to all,
>>
>> Excuse my English; I will try to explain my problem more clearly.
>> I need to convert an ab1 file into a fastq file. The problem is that I do
>> not know how to obtain the quality scores from the ab1 file. To obtain the
>> sequence, I use ABIFChromatogram, and I can in fact obtain the sequence,
>> but I do not know how to obtain the quality scores to build my fastq
>> file. The problem is that the "trace-offsets" do not correspond to
>> quality scores. In case it is helpful, here is a fragment of the code
>> that I am using:
>>
>> ABIFChromatogram y = new ABIFChromatogram(); // almost there; still need to work out how to read these alignments
>> y = ABIFChromatogram.create(new File(Path));
>> Alignment to = y.getBaseCalls();
>> SymbolList dnaSeq = to.symbolListForLabel("dna");
>> SymbolList trace = to.symbolListForLabel("trace-offsets");
>>
>> System.out.println(dnaSeq.seqString());
>> System.out.println(trace.seqString());
>>
>> If somebody knows how to obtain the scores from ab1 files in BioJava,
>> please let me know.
>>
>> Thank you.
>>
>> --
>> Sebastián García López
>> Electronic Engineer
>> Universidad Nacional de Colombia at Manizales
>>
>> Ms.Eng. Industrial Automation Student
>> Control and Digital Signal Processing Research Group (GC&PDS)
>> Universidad Nacional de Colombia at Manizales
>> MCP-Microsoft Certified Professional
>>
>> Email: deltadedirac at gmail.com
>> sgarcialop at unal.edu.co
>> Skype: sebastiang55
>> Mobile: +57 3147569794
>>
>> _______________________________________________
>> biojava-dev mailing list
>> biojava-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From sharanya.raghunath at utah.edu Thu Sep 20 18:21:41 2012
From: sharanya.raghunath at utah.edu (SHARANYA RAGHUNATH)
Date: Thu, 20 Sep 2012 22:21:41 +0000
Subject: [Biojava-l] OBO parser
Message-ID: <0EFF41C89D14B44AB9BC5DEF88DA3462035C950B@X-MB12.xds.umail.utah.edu>

Hello,

I am using the parser provided by BioJava to handle OBO files. My ontology
contains relationships that are not is_a relationships, and I am unable to
parse the terms that are related by a different relationship type. For
example, if I try to parse the Triple "mitochondria part of cell", the
predicate and the object are null.

Is there a way to get around this problem? Can something other than Triples
be used to handle such complexities?

Thank you for your time,
Sharanya Raghunath

From kurka at mikro.biologie.tu-muenchen.de Tue Sep 25 08:07:22 2012
From: kurka at mikro.biologie.tu-muenchen.de (Hedwig Kurka)
Date: Tue, 25 Sep 2012 14:07:22 +0200
Subject: [Biojava-l] Error when parsing genbank files
Message-ID: <50619E7A.7090906@mikro.biologie.tu-muenchen.de>

Dear Mailing-List,

I have a problem parsing genbank files.
First: I know my parser works, because I have used it several times.
But for some files I get the following error:

org.biojava.bio.BioException: Could not read sequence
.
.
.
Caused by: org.biojava.bio.seq.io.ParseException:
A Exception Has Occurred During Parsing.
Please submit the details that follow to biojava-l at biojava.org or post a bug report to http://bugzilla.open-bio.org/

Format_object=org.biojavax.bio.seq.io.GenbankFormat
Accession=null
Id=null
Comments=Bad locus line
Parse_block=LOCUS ABCD01000001; 10777 bp linear genomic DNA 23-AUG-2012
Stack trace follows ....

at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:315)
at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
... 3 more

The LOCUS line in the genbank file is the following:

LOCUS ABCD01000001; 10777 bp linear genomic DNA 23-AUG-2012

Can you tell me what BioJava expects in that line, or what is superfluous
or missing?
I use biojava 1.82

Thanks in advance for your help.
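
A minimal sketch for checking the LOCUS line quoted above, assuming the
conventional GenBank flat-file layout (locus name, length in bp, molecule
type, optional topology, three-letter division code, date). The class
LocusLineCheck, its regular expression, and the division code BCT used in
the example are assumptions made for illustration only. This is not
BioJava's own parser logic, which may accept or reject different variants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of BioJava. */
class LocusLineCheck {

    // Assumed layout of a conventional GenBank LOCUS line:
    // LOCUS <name> <length> bp <molecule type> [linear|circular] <DIV> <dd-MMM-yyyy>
    private static final Pattern ASSUMED_LAYOUT = Pattern.compile(
        "^LOCUS\\s+\\S+\\s+\\d+\\s+bp\\s+\\S+\\s+"
        + "(?:(?:linear|circular)\\s+)?[A-Z]{3}\\s+\\d{2}-[A-Z]{3}-\\d{4}\\s*$");

    /** Returns a list of deviations from the assumed layout (empty if none were found). */
    static List<String> problems(String line) {
        List<String> found = new ArrayList<>();
        String[] tokens = line.trim().split("\\s+");
        // The second token is the locus name; a trailing ';' is not part of the assumed layout.
        if (tokens.length > 1 && tokens[1].endsWith(";")) {
            found.add("locus name ends with ';'");
        }
        if (!ASSUMED_LAYOUT.matcher(line).matches()) {
            found.add("fields do not match the assumed LOCUS layout");
        }
        return found;
    }

    public static void main(String[] args) {
        // The line from the report above.
        String reported = "LOCUS ABCD01000001; 10777 bp linear genomic DNA 23-AUG-2012";
        // An illustrative line in the assumed layout; the division code BCT is made up here.
        String conventional =
            "LOCUS       ABCD01000001           10777 bp    DNA     linear   BCT 23-AUG-2012";
        System.out.println(problems(reported));
        System.out.println(problems(conventional));
    }
}
```

Compared with the assumed layout, the reported line has a semicolon after
the locus name, puts the topology before a two-word molecule type
("genomic DNA"), and has no division code. Whether any one of these alone
triggers "Bad locus line" in BioJava would need to be checked against the
GenbankFormat source.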