[Bioperl-l] Getting read position information from an ACE file?
Dan Bolser
dan.bolser at gmail.com
Fri Sep 18 17:09:09 UTC 2009
2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
> Dan -- I don't know much about Assembly, so can't help there. But can I
> encourage you and perhaps one or two others (steganographic content:
> fangly) to create a HOWTO stub out of this? Would be excellent-
I'd love to. ACE files are pretty ubiquitous, so any additional info on how
to work with them using BioPerl should help a lot of people.
The problem is that I'm one of those people ;-)
I'm working on an 'ace2tab.plx' script that should encompass this
info. I'm finding that some read IDs have a '.range' suffix, e.g.
"read123455.23-239", while others do not, e.g. "read123456". I'm not sure
where this suffix comes from, but I think it's telling me something about
partially aligned reads. The problem is that the coordinates I'm
seeing don't reflect that (they are just the start and end points
of the full read).
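
A minimal sketch of separating the two ID styles, written in Python rather than Perl purely for illustration. The helper name split_read_id is hypothetical, and treating the suffix as a "start-end" pair of integers is an assumption based only on the example ID above; what those numbers mean is exactly what is unclear here.

```python
import re

# Assumed ID layout: "<name>" optionally followed by ".<start>-<end>".
# The lazy name group lets names that contain dots still match.
READ_ID = re.compile(r"^(?P<name>.+?)(?:\.(?P<start>\d+)-(?P<end>\d+))?$")


def split_read_id(read_id):
    """Return (name, start, end); start and end are None without a suffix."""
    m = READ_ID.match(read_id)
    if m is None:
        raise ValueError("empty or malformed read ID: %r" % (read_id,))
    if m.group("start") is None:
        return m.group("name"), None, None
    return m.group("name"), int(m.group("start")), int(m.group("end"))
```

Calling split_read_id on each ID would at least let the suffixed and unsuffixed reads be tabulated separately, so the suffix can be compared against whatever coordinates the assembly reports.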
A 'proper' ace2tab script would be very nice.
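
As a rough idea of what such a script might tabulate, here is a sketch that reads the ACE records directly instead of going through Bio::Assembly::IO (again in Python, for illustration only). It relies on the standard ACE layout: "AF <read> <U|C> <padded start in consensus>", "RD <read> ..." and "QA <qual start> <qual end> <align start> <align end>", where the QA values are 1-based padded offsets within the read. The function name ace_read_rows and the output keys are hypothetical, and it assumes tag blocks (CT, RT, WA) contain no lines that begin with these record names. Whether gsMapper output follows the layout exactly is untested.

```python
def ace_read_rows(lines):
    """Yield one dict per read, built from the AF, RD and QA records."""
    placed = {}      # read name -> (strand, padded start in consensus)
    contig = None    # name from the most recent CO record
    current = None   # read name from the most recent RD record
    for line in lines:
        f = line.split()
        if not f:
            continue
        if f[0] == "CO" and len(f) >= 2:
            contig = f[1]
        elif f[0] == "AF" and len(f) >= 4:
            placed[f[1]] = (f[2], int(f[3]))
        elif f[0] == "RD" and len(f) >= 2:
            current = f[1]
        elif f[0] == "QA" and len(f) >= 5 and current in placed:
            q_start, q_end, a_start, a_end = (int(x) for x in f[1:5])
            strand, offset = placed[current]
            # Shift read offsets into padded consensus space. ACE uses -1
            # when a read has no usable range, so leave those as None.
            aligned = a_start > 0 and a_end > 0
            yield {
                "contig": contig,
                "read": current,
                "strand": strand,
                "qual_start": q_start,
                "qual_end": q_end,
                "align_start": a_start,
                "align_end": a_end,
                "cons_start": offset + a_start - 1 if aligned else None,
                "cons_end": offset + a_end - 1 if aligned else None,
            }
            current = None
```

Each row carries the quality-trimmed range and the aligned range in padded read coordinates, plus the aligned range in padded consensus coordinates; joining the values with tabs gives one line per read.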
> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE file?
>
>
>> Dear Perl Monkeys,
>>
>> I wrote a little demo script for Bio::Assembly::IO here:
>>
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>>
>>
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question: am I doing it right?
>>
>> In particular, I read about the 4 possible coordinates of a read in an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>>
>> Additionally, I'm not sure how to distinguish between the trimmed read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>>
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the quality
>> trimmed range of the read.
>>
>> The ACE file in question is produced by the gsMapper program, which is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>>
>>
>> Thanks very much for any help you can provide on any of the above issues.
>>
>> Sincerely,
>> Dan.