[Bioperl-l] Add a kind of hspsepQmax/hspsepSmax (like WuBlast has) in Bio::Search::Tiling::MapTiling
Frederic.SAPET at biogemma.com
Thu Apr 29 15:54:29 UTC 2010
Hi Mark
The kludge works.
I just had to undefine the previously calculated value of
identities:
my $identString = "identities_".$type."_exact_".$context;
$tiling->{$identString} = undef;
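Put together with the steps from Mark's message quoted below, the whole kludge can be sketched as one helper. This is only a rough sketch: the name install_filtered_map is made up for illustration, it pokes at the internal hash keys mentioned in this thread rather than using any public MapTiling method, and it assumes $tiling->coverage_map($type, $context) has already been called so that the internal map exists.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): keep only the coverage-map
# elements accepted by $keep, store the filtered map back on the tiling
# object, and undefine the cached identities so they get recalculated.
# Each map element is [ $int, $hsp ] with $int = [ $start, $end ], as
# described in Mark's message.
sub install_filtered_map {
    my ($tiling, $type, $context, $keep) = @_;
    my $map_key = "coverage_map_${type}_${context}";

    # $keep receives the interval [ $start, $end ] of each element
    my @kept = grep { $keep->( $_->[0] ) } @{ $tiling->{$map_key} || [] };

    $tiling->{$map_key} = \@kept;
    $tiling->{"identities_${type}_exact_${context}"} = undef;

    return scalar @kept;    # number of map elements that survived
}

# Possible usage (the 20000 cut-off is an arbitrary illustration):
#   my @map = $tiling->coverage_map('hit', 'p0');
#   install_filtered_map($tiling, 'hit', 'p0', sub { $_[0][0] >= 20000 });
#   my $ident = $tiling->identities('hit', 'exact', 'p0');
```
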
But I think that what I would like to do goes deeper than that.
Sometimes I get results like this:
Chr1 TBLASTN match_set 2164104 56772932 544 + .
ID=Sample2;alignLength=1105;eValue=0.0;fractionAligned=95.959595959596;gapNumber=67;Name=Sample2;percentageIdentity=60.0476992143659
Chr1 TBLASTN match_part 2164104 2174973 630 + 1
Parent=Sample2;Target=Sample2 71 358
Chr1 TBLASTN match_part 2216917 2218191 1014 + 1
Parent=Sample2;Target=Sample2 70 502
Chr1 TBLASTN match_part 2218504 2218665 181 + 1
Parent=Sample2;Target=Sample2 533 585
Chr1 TBLASTN match_part 56771229 56771357 230 + 1
Parent=Sample2;Target=Sample2 25 67
Chr1 TBLASTN match_part 56772054 56772932 1401 + 1
Parent=Sample2;Target=Sample2 71 364
I would like to see the HSPs separated into two distinct groups: the
first three match_parts lie around 2.16-2.22 Mb on Chr1 and the last two
around 56.77 Mb.
I tried to have a look inside the source code.
Is the method interval_tiling in MapTileUtils.pm a good place to start?
Could I add a new parameter there (something like hspsepQmax/hspsepSmax)?
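To make the idea concrete, here is a rough sketch of the grouping on plain [start, end] pairs. It is not MapTiling or MapTileUtils code: the function name group_by_max_sep and the 100000 bp threshold are made up for illustration, and the coordinates are the hit coordinates of the match_part lines above.

```perl
use strict;
use warnings;

# Split [start, end] intervals into groups: a new group starts whenever
# the gap between the furthest end seen in the current group and the
# next start is larger than $max_sep.
sub group_by_max_sep {
    my ($max_sep, @intervals) = @_;
    my @sorted = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @intervals;

    my (@groups, $group_end);
    for my $int (@sorted) {
        if (!@groups || $int->[0] - $group_end > $max_sep) {
            push @groups, [];            # open a new group
            $group_end = $int->[1];
        }
        push @{ $groups[-1] }, $int;
        $group_end = $int->[1] if $int->[1] > $group_end;
    }
    return @groups;
}

my @hit_ints = (
    [2164104,  2174973],  [2216917,  2218191], [2218504, 2218665],
    [56771229, 56771357], [56772054, 56772932],
);
my @groups = group_by_max_sep(100_000, @hit_ints);
printf "%d groups\n", scalar @groups;    # prints "2 groups"
```

With this threshold the first three intervals end up in one group and the last two in another, which is the separation I am after.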
Thank you.
Fred
"Mark A. Jensen" <maj at fortinbras.us> wrote on 26/04/2010 15:17:51:
> Hi Fred,
>
> I'll tell you how you can write a kludge; maybe you can expand it into
> a more general method.
>
> For your tblastn data, get the coverage map array:
>
> @map = $tiling->coverage_map('hit', 'p0');
>
> Each element of the map is a ref to a pair [$int, $hsp], where $int is
> itself a reference to a two-element array containing the coordinates of the
> hsp in context and $hsp is the hsp object itself. You can use these to
> filter the @map array.
>
> For your example, you can just get rid of the first @map element:
>
> shift @map;
>
> Replace the internal map for this type and context, so that
> the methods work on the modified map:
>
> $tiling->{'coverage_map_hit_p0'} = \@map;
>
> Then $tiling->identities('hit', 'exact', 'p0'), etc. give you the
> new values.
>
> HTH-
> MAJ
> ----- Original Message -----
> From: <Frederic.SAPET at biogemma.com>
> To: <bioperl-l at bioperl.org>
> Sent: Friday, April 23, 2010 11:16 AM
> Subject: [Bioperl-l] Add a kind of hspsepQmax/hspsepSmax (like WuBlast has) in
> Bio::Search::Tiling::MapTiling
>
>
> > Hello
> >
> > Based on the bp_search2gff.pl script and the Bio::Search::Tiling::MapTiling
> > documentation (http://www.bioperl.org/wiki/HOWTO:Tiling), I'm trying to
> > write a generic BLAST to GFF3 parser.
> >
> > My idea is to filter hits on frac_aligned and percent_identity values.
> >
> > I'm facing a problem with a BlastX result and the corresponding TBlastN.
> >
> > Please find my script and the two example files attached.
> >
> > The example is a piece of Maize Chromosome where a protein seems to be
> > duplicated.
> >
> > When I parse the BlastX file and retrieve the data from
> > a query view ( >tiling.pl BlastX query), I get:
> >
> > Chr6:159690000-159718000 BLASTX match_set 23971 25620 121.6 + .
> > ID=Os03g17980.2:1.1.1;alignLength=576;eValue=4.6e-137;fractionAligned=97.0530451866405;gapNumber=16;Name=Os03g17980.2;percentageIdentity=69.1552062868369
> > Chr6:159690000-159718000 BLASTX match_part 23971 24186 331 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 120 191
> > Chr6:159690000-159718000 BLASTX match_part 24820 24915 100 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 291 322
> > Chr6:159690000-159718000 BLASTX match_part 25195 25308 89 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 358 395
> > Chr6:159690000-159718000 BLASTX match_part 25390 25620 192 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 395 472
> >
> > Chr6:159690000-159718000 BLASTX match_set 918 2567 121.6 + .
> > ID=Os03g17980.2:1.2.1;alignLength=576;eValue=4.6e-137;fractionAligned=97.0530451866405;gapNumber=16;Name=Os03g17980.2;percentageIdentity=69.1552062868369
> > Chr6:159690000-159718000 BLASTX match_part 918 1148 192 - 0
> > Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 395 472
> > Chr6:159690000-159718000 BLASTX match_part 1230 1343 89 - 0
> > Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 358 395
> > Chr6:159690000-159718000 BLASTX match_part 1623 1718 100 - 0
> > Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 291 322
> > Chr6:159690000-159718000 BLASTX match_part 2352 2567 331 - 0
> > Parent=Os03g17980.2:1.2.1;Target=Os03g17980.2 120 191
> >
> > This is perfect: I retrieve two nice hits, with perfectly tiled HSPs.
> >
> > But with the TBlastN report (using a hit view: >tiling.pl TBlastN hit),
> > I get:
> > Chr6:159690000-159718000 TBLASTN match_set 7666 25620 121.6 + .
> > ID=Os03g17980.2:1.1.1;alignLength=303;eValue=4.9e-137;fractionAligned=98.8212180746562;gapNumber=18;Name=Os03g17980.2;percentageIdentity=66.0052390307793
> > Chr6:159690000-159718000 TBLASTN match_part 7666 7917 44 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 332 416
> > Chr6:159690000-159718000 TBLASTN match_part 23971 24186 331 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 120 191
> > Chr6:159690000-159718000 TBLASTN match_part 24820 24915 100 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 291 322
> > Chr6:159690000-159718000 TBLASTN match_part 25195 25308 89 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 358 395
> > Chr6:159690000-159718000 TBLASTN match_part 25390 25620 192 + 0
> > Parent=Os03g17980.2:1.1.1;Target=Os03g17980.2 395 472
> >
> > I lose one of my hits because another HSP is tiled onto it, so I discard
> > the hit when I filter the context using identity values (lines 42 to 54 of
> > my script).
> > This HSP is far away on the 5' side, so I would like to know whether it
> > would be possible to add (or to get help developing) a sort of
> > hspsepQmax/hspsepSmax (maximum allowed separation along the query (or
> > subject) sequence between two HSPs) as a new parameter during the tiling
> > phase?
> >
> >
> >
> > Thank you.
> >
> > Fred