[Bioperl-l] Chromosome coordinates

Boddu, Jayanand jboddu at illinois.edu
Thu Dec 1 16:38:00 UTC 2011


Hello,
I am a newbie to Perl scripting.
I have a file of short reads mapped to the maize genome.
The format is simple tabular BLASTN output:

READ_ID  Chr    % Similarity  Alignment  Mismatches  Gaps  READ Start  READ End  Chr Start  Chr End    E Value  Score
READ1    chrPt  100           17         0           0     1           17        35021      35037      0.21     34.2
READ1    chr10  100           17         0           0     1           17        128587356  128587372  0.21     34.2
READ1    chr6   100           17         0           0     1           17        160769803  160769787  0.21     34.2
READ1    chr5   100           17         0           0     1           17        172103083  172103067  0.21     34.2
READ1    chr4   100           17         0           0     1           17        213173683  213173699  0.21     34.2
READ1    chr3   100           17         0           0     1           17        23689132   23689116   0.21     34.2
READ2    chr8   100           17         0           0     1           17        161048603  161048587  0.21     34.2
READ2    chr6   100           17         0           0     1           17        155768884  155768868  0.21     34.2
READ2    chr5   100           17         0           0     1           17        32958812   32958828   0.21     34.2
READ2    chr3   100           17         0           0     1           17        212451090  212451074  0.21     34.2
READ2    chr2   100           17         0           0     1           17        2046449    2046465    0.21     34.2
READ2    chr1   100           17         0           0     1           17        223233801  223233785  0.21     34.2
READ2    chr1   100           17         0           0     1           17        277573037  277573021  0.21     34.2

As expected, the same read maps to multiple places on the same or different chromosomes.
I also have a GFF file with annotated gene coordinates.
I would like to run a Perl script to find out which reads fall within genes in the GFF file and which do not.
The script should:

1. Take the read coordinates on the genome (by chromosome);

2. Go to the GFF file;

3. Find the chromosome;

4. Find the gene (by coordinates);

5. Report the read, its coordinates, the chromosome, the gene, and the gene's coordinates.

It doesn't need to be in the same order.
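
The five steps above can be sketched roughly as follows. This is written in Python for illustration rather than Perl, and it rests on several assumptions that are not stated in the post: both files are tab-delimited, the BLAST file has the 12 columns shown in the table, the GFF file has the standard 9 columns with feature type "gene", and chromosome names are spelled the same way in both files. The function names `load_genes` and `annotate_hits`, and the gene `geneA` with its coordinates, are hypothetical. "Within" is taken to mean that the hit lies entirely inside the gene. Because some hits have Chr Start greater than Chr End, the two values are sorted before comparison.

```python
from collections import defaultdict


def load_genes(gff_lines):
    """Index 'gene' features by chromosome: {chrom: [(start, end, attributes), ...]}."""
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # keep only gene features
        genes[cols[0]].append((int(cols[3]), int(cols[4]), cols[8]))
    return genes


def annotate_hits(blast_lines, genes):
    """Yield (read, chrom, start, end, gene_attrs, gene_start, gene_end).

    One tuple per hit/gene pair. A hit that lies in no gene is yielded
    once with None in the three gene fields.
    """
    for line in blast_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12 or not cols[8].isdigit():
            continue  # skip the header row and malformed lines
        read, chrom = cols[0], cols[1]
        # Chr Start can exceed Chr End, so order the pair first.
        start, end = sorted((int(cols[8]), int(cols[9])))
        found = False
        for g_start, g_end, attrs in genes.get(chrom, []):
            if g_start <= start and end <= g_end:  # hit fully inside gene
                found = True
                yield (read, chrom, start, end, attrs, g_start, g_end)
        if not found:
            yield (read, chrom, start, end, None, None, None)


if __name__ == "__main__":
    # Hypothetical gene record; the two hits are taken from the table above.
    gff = ["chr10\tdemo\tgene\t128587000\t128588000\t.\t+\t.\tID=geneA\n"]
    blast = [
        "READ1\tchr10\t100\t17\t0\t0\t1\t17\t128587356\t128587372\t0.21\t34.2\n",
        "READ1\tchr6\t100\t17\t0\t0\t1\t17\t160769803\t160769787\t0.21\t34.2\n",
    ]
    for row in annotate_hits(blast, load_genes(gff)):
        print("\t".join("-" if v is None else str(v) for v in row))
```

The inner loop scans every gene on the chromosome for each hit, which is simple but slow for a whole genome. Sorting the genes per chromosome and using a binary search or an interval tree would be the usual next step.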
After this, I guess I could use a simple Microsoft Access query to pull out the reads that are not mapped to genes.
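
That last step amounts to a set difference, so it could also be done in the script itself. A minimal Python sketch, assuming the report has been reduced to (read, gene) pairs with `None` where a location hit no gene; the function name `reads_without_gene` and the gene `geneA` are hypothetical. It also assumes that a read counts as mapped to a gene if at least one of its locations falls in one.

```python
def reads_without_gene(rows):
    """Return sorted read IDs that hit no gene at any mapped location.

    rows: iterable of (read_id, gene_or_None) pairs, one per mapped location.
    """
    all_reads = set()
    genic_reads = set()
    for read_id, gene in rows:
        all_reads.add(read_id)
        if gene is not None:
            genic_reads.add(read_id)
    return sorted(all_reads - genic_reads)


if __name__ == "__main__":
    # READ1 hits a gene at one of its two locations; READ2 hits none.
    demo = [("READ1", "geneA"), ("READ1", None), ("READ2", None)]
    print(reads_without_gene(demo))
```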
I would greatly appreciate it if anyone has a script that does more or less a similar job.

Thanks
Jay



