[Bioperl-l] Getting read position information from an ACE file?
Phillip San Miguel
pmiguel at purdue.edu
Mon Sep 21 12:01:03 UTC 2009
Dan Bolser wrote:
> 2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>
>> Dan -- I don't know much about Assembly, so can't help there. But can I
>> encourage you and perhaps one or two others (steganographic content:
>> fangly) to create a HOWTO stub out of this? Would be excellent.
>>
>
> I'd love to. ACE is pretty ubiquitous, so any additional info on how
> to work with them using BioPerl should help a lot of people.
>
> The problem is that I'm one of those people ;-)
>
>
> I'm working on an 'ace2tab.plx' script that should encompass this
> info. I'm finding that some read IDs carry a '.range' suffix, e.g.
> "read123455.23-239", while others do not, e.g. "read123456". I'm not
> sure where this ID comes from, but I think it's telling me something
> about partially aligned reads.
I think you are right. I have heard that Newbler (the 454 assembler)
does this insane thing, where it will rip reads apart into segments and
cluster parts of reads in different contigs.
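
A minimal sketch of separating the two ID shapes mentioned above, in
Python rather than Perl. The names split_read_id and ReadId are
hypothetical, and the pattern assumes the suffix always looks like
".start-end", which is inferred only from the single example in this
thread and not from any Newbler documentation.

```python
import re
from typing import NamedTuple, Optional

# Assumed ID shapes, taken only from the two examples in this thread:
#   "read123456"          -> whole read, no range suffix
#   "read123455.23-239"   -> read name plus a ".start-end" suffix
_RANGE_ID = re.compile(r"^(?P<name>.+)\.(?P<start>\d+)-(?P<end>\d+)$")


class ReadId(NamedTuple):
    """Hypothetical container for a read name and its optional range."""
    name: str
    start: Optional[int]
    end: Optional[int]


def split_read_id(read_id: str) -> ReadId:
    """Split a read ID into its base name and optional (start, end) range."""
    match = _RANGE_ID.match(read_id)
    if match is None:
        # No ".start-end" suffix: treat the whole string as the read name.
        return ReadId(read_id, None, None)
    return ReadId(match.group("name"),
                  int(match.group("start")),
                  int(match.group("end")))


if __name__ == "__main__":
    for rid in ("read123455.23-239", "read123456"):
        print(split_read_id(rid))
```

If the suffix really does mark the segment of the read placed in a
contig, the start and end recovered this way could be compared with the
coordinates reported in the ACE file to see where the two disagree.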
> The problem is that the coordinates I'm
> seeing don't reflect that (they are just the start and the end point
> of the full read).
>
That sounds similar to how phrap/consed handle "chimeric" reads. But my
experience is that phrap is pretty parsimonious with the number of
chimeric reads it will allow. (That isn't entirely fair to Newbler -- I've
never been able to get phrap to consistently assemble ESTs. Phrap seems
tuned to assemble BAC shotgun reads. ESTs seem to drive it a little
crazy. It will create contigs from a set of reads that have essentially
no similarity to each other, nor to the consensus sequence phrap creates
for them.)
--
Phillip