[Biojava-l] help with short reads ?

Peter Cock p.j.a.cock at googlemail.com
Mon Jul 18 06:29:50 UTC 2011


On Saturday, June 25, 2011, Jay Vyas <jayunit100 at gmail.com> wrote:
> Hi everyone.  A collaborator sent me some short reads in GZ format for 2
> bacterial genomes.

Files named something.gz are compressed, a bit like the .zip format
used on Windows. If the name ends with .tar.gz (or the shorthand .tgz)
then you have a collection of files (bundled with the Unix tool tar),
which has then been compressed (with gzip or a similar tool).

What operating system do you have? Even on Windows you can
deal with these files.
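
As a rough, platform-independent way to see what you have been sent, here
is a minimal Python sketch using only the standard library. The function
name peek_gz and the file name in the usage note are my own placeholders,
and I am only guessing that the contents are plain text (short reads are
often FASTQ, but I don't know what your collaborator used).

```python
import gzip
import tarfile


def peek_gz(path, n_lines=4):
    """Describe a .gz file without unpacking it to disk."""
    if tarfile.is_tarfile(path):
        # .tar.gz / .tgz: a bundle of files, so list the member names.
        with tarfile.open(path, "r:*") as archive:
            return {"kind": "tar", "members": archive.getnames()}
    # Plain .gz: a single compressed file, so read its first few lines.
    # Assumption: the contents are text (e.g. FASTQ or FASTA).
    lines = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            lines.append(line.rstrip("\n"))
            if len(lines) >= n_lines:
                break
    return {"kind": "gzip", "head": lines}
```

Calling peek_gz("something.gz") returns either the list of bundled file
names or the first few lines of the single compressed file, which is
usually enough to tell what format the reads are in.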

> I have NO IDEA how to process this data or convert it.  Any help or utilities
> out there?  If you're interested in collaborating on a publication, let me
> know.  We can get your name on it.  And it won't be much work for those
> of you that know about contig assembly.....
>
> For me, it's out of my league, I'm a protein guy....
>

[This seems rather tangential to BioJava and I don't think this
mailing list is the best place to discuss this.]

That sounds like it will be tough. If you can find a local team or
bioinformatician working with NextGen sequencing, that would help.
Try asking your sequencing center/provider if they have any kind
of local sequencing group. See also http://seqanswers.com

Good luck,

Peter