[Biojava-l] Genome Assembly
Janier J. Ramírez
jjramirez at estudiantes.uci.cu
Wed Apr 24 18:24:44 UTC 2013
Hi!
I'm trying to assemble multiple reads using one of them as a reference. Is this possible using BioJava?
Greetings
----- Original Message -----
From: biojava-l-request at lists.open-bio.org
To: biojava-l at lists.open-bio.org
Sent: Wednesday, 24 April 2013 11:00:03
Subject: Biojava-l Digest, Vol 123, Issue 10
Today's Topics:
1. Re: DNA assembly (Chris Friedline)
----------------------------------------------------------------------
Message: 1
Date: Wed, 24 Apr 2013 10:58:22 -0400
From: Chris Friedline <cfriedline at vcu.edu>
Subject: Re: [Biojava-l] DNA assembly
To: Andreas Prlic <andreas at sdsc.edu>
Cc: "Biojava-l at lists.open-bio.org" <biojava-l at lists.open-bio.org>,
Khalil El Mazouari <khalil.elmazouari at gmail.com>
Message-ID: <DCB620AB-AE14-4A23-BCEE-BEB45D404BBA at vcu.edu>
Content-Type: text/plain; charset=iso-8859-1
There's also a good deal of alignment quality checking, thresholding, and scoring of the overlapping region that is necessary but maybe not all that straightforward. I suggest that you check out the PANDAseq paper, which describes their algorithm.
http://dx.doi.org/10.1186/1471-2105-13-31
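As a rough illustration of the thresholding and overlap scoring mentioned above, here is a minimal sketch in plain Java. It is not PANDAseq's algorithm and it does not use BioJava classes. The class name `OverlapMerger`, its method names, and the `minOverlap` and `minIdentity` parameters are all hypothetical. The sketch assumes both reads are already on the same strand and it ignores base quality scores entirely.

```java
final class OverlapMerger {

    /** Fraction of identical positions between the last len bases of a and the first len bases of b. */
    static double overlapIdentity(String a, String b, int len) {
        int matches = 0;
        int offset = a.length() - len;
        for (int i = 0; i < len; i++) {
            if (a.charAt(offset + i) == b.charAt(i)) {
                matches++;
            }
        }
        return (double) matches / len;
    }

    /**
     * Tries overlap lengths from longest to shortest and returns the first one
     * whose identity reaches minIdentity, or -1 if no length >= minOverlap qualifies.
     */
    static int findOverlap(String a, String b, int minOverlap, double minIdentity) {
        if (minOverlap < 1) {
            throw new IllegalArgumentException("minOverlap must be at least 1");
        }
        int maxLen = Math.min(a.length(), b.length());
        for (int len = maxLen; len >= minOverlap; len--) {
            if (overlapIdentity(a, b, len) >= minIdentity) {
                return len;
            }
        }
        return -1;
    }

    /**
     * Joins a and b across the accepted overlap. Inside the overlap the bases of a
     * are kept (an arbitrary choice for this sketch). Returns null if no overlap passes.
     */
    static String merge(String a, String b, int minOverlap, double minIdentity) {
        int len = findOverlap(a, b, minOverlap, minIdentity);
        if (len < 0) {
            return null;
        }
        return a + b.substring(len);
    }
}
```

For example, `OverlapMerger.merge("ACGTACGTTT", "CGTTTGGA", 4, 0.9)` finds the 5-base overlap `CGTTT` and joins the reads across it. A real pipeline would also want to resolve mismatches inside the overlap using quality values rather than always trusting the first read.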
Andreas is correct - the basic building blocks are already there.
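To make the two building blocks concrete (reverse complement one read, then do a local alignment), here is a small sketch in plain Java. It deliberately does not call BioJava, so none of these names are BioJava API: the class `ReadOrienter`, its methods, and the scoring values (+2 match, -1 mismatch, -2 gap) are assumptions made for illustration. It computes only the best Smith-Waterman score, without traceback, and uses it to decide which orientation of the second read fits the first.

```java
final class ReadOrienter {

    // Illustrative scoring values, chosen arbitrarily for this sketch.
    static final int MATCH = 2;
    static final int MISMATCH = -1;
    static final int GAP = -2;

    /** Reverse complement of an upper-case DNA string; any other symbol becomes N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (dna.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:  sb.append('N');
            }
        }
        return sb.toString();
    }

    /** Best Smith-Waterman local alignment score with a linear gap penalty. */
    static int localScore(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int best = 0;
        for (int i = 1; i <= s.length(); i++) {
            int[] curr = new int[t.length() + 1];
            for (int j = 1; j <= t.length(); j++) {
                int sub = s.charAt(i - 1) == t.charAt(j - 1) ? MATCH : MISMATCH;
                int score = Math.max(0, prev[j - 1] + sub);   // diagonal move, floored at 0
                score = Math.max(score, prev[j] + GAP);       // gap in t
                score = Math.max(score, curr[j - 1] + GAP);   // gap in s
                curr[j] = score;
                best = Math.max(best, score);
            }
            prev = curr;
        }
        return best;
    }

    /** Returns read2 in whichever orientation gives the higher local score against read1. */
    static String orient(String read1, String read2) {
        String rc = reverseComplement(read2);
        return localScore(read1, rc) > localScore(read1, read2) ? rc : read2;
    }
}
```

Once the second read is oriented this way, the overlapping region can be located and the two reads joined. For production use, the pairwise alignment classes described on the BioJava cookbook page cited further down this thread would be the natural replacement for `localScore`.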
Chris
On Apr 24, 2013, at 10:48 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
> It sounds like all you need is to get the reverse complement of one of your sequences and then do a local alignment. Both should be possible with BioJava...
>
> Andreas
>
>
> On Wed, Apr 24, 2013 at 7:29 AM, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
> Hi Chris,
>
> my application is deployed as a WAR file. I am trying to avoid, as much as possible, shelling out to non-Java programs, for maintainability reasons.
>
> I don't think I need a 'full' genome assembly tool (e.g. Velvet); it's overkill for my case: a cloned gene is sequenced in both directions. Normally one strand is sufficient. If the sequence quality is not good enough, the 2 strands are used to get the full-length gene. There is always a large overlap between the 2 strand sequences.
> I can QC the full length gene.
>
> Best
>
> khalil
>
>
>
>
>
>
>
> -----
>
> Confidentiality Notice: This e-mail and any files transmitted with it are private and confidential and are solely for the use of the addressee. It may contain material which is legally privileged. If you are not the addressee or the person responsible for delivering to the addressee, please notify that you have received this e-mail in error and that any use of it is strictly prohibited. It would be helpful if you could notify the author by replying to it.
>
>
>
> On 24 Apr 2013, at 16:04, Chris Friedline wrote:
>
> > Khalil,
> >
> > Why not just shell out to programs designed for this purpose and pull in the results? We are in the process of publishing a paper which uses PANDAseq to assemble overlapping PE reads. The latest version of mothur also does this.
> >
> > www.mothur.org
> > https://github.com/neufeld/pandaseq/wiki/PANDAseq-Assembler
> >
> > PANDAseq is particularly nice in this case, because you could read right from stderr and stdout streams. It's also wicked fast.
> >
> > Chris
> >
> > On Apr 24, 2013, at 4:08 AM, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
> >
> >> Hi,
> >>
> >> It's not a global sequence alignment nor genome assembly. It's just a DNA fragment sequenced from both ends with an overlapping region. I want to assemble the 2 reads in order to get the full length sequence. This assembly is a part of a complex analysis process that uses biojava.
> >> I agree, there are a lot of simple options for achieving this. But I need something in Java/BioJava.
> >>
> >> Best
> >>
> >> khalil
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 23 Apr 2013, at 23:38, Spencer Bliven wrote:
> >>
> >>> If you just have two contiguous sequences to align, you should just use a global sequence alignment. See http://biojava.org/wiki/BioJava:CookBook3:PSA for how to do this in BioJava, or it might be easier to just use one of the online services for this such as http://www.ebi.ac.uk/Tools/psa/.
> >>>
> >>> On the other hand, if you actually want to do genome assembly (ie from many overlapping reads), then there are much more computationally efficient methods. BioJava isn't really intended for large-scale genome assembly, so you'd want to use a sequence assembly tool (eg Velvet).
> >>>
> >>> -Spencer
> >>>
> >>>
> >>> On Tue, Apr 23, 2013 at 12:38 PM, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
> >>> Hi,
> >>>
> >>> I would like to assemble 2 overlapping DNA sequences. Is there something in biojava that may help in this task?
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>>
> >>
> >>
> >
> >
> >
> >
>
>
>
>
>
------------------------------
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l
End of Biojava-l Digest, Vol 123, Issue 10
******************************************
http://www.uci.cu