[Bioperl-l] Repeatmasker scripts

Fri Mar 9 00:41:11 UTC 2007

I'm new here and new to BioPerl.  Please let me know if I am breaking any
rules.  

Does anyone know of scripts designed to parse RepeatMasker output so that
split repetitive elements can be recovered as a single row?  I tried a
search, but either I am searching for the wrong terms or there isn't anything
to find.  Below is an example of what I would like to do, but any comparable
system would be useful.

This is the output.  A single element can be split across two rows that
share the same ID in the last column.

591	7.1	0	2.4	Mluc_cont1.010442	1392	1478	-967	C	Tc2_ML1_coding	Unknown	0	1296	1212	1
3825	4.7	0.6	1	Mluc_cont1.010442	1470	1959	-486	C	Tc2_ML1_coding	Unknown	-808	488	1	1
1816	7	0	7	Mluc_cont1.010866	3614	3890	-836	C	Tc2_ML1_coding	Unknown	-1037	259	1	2
596	3.6	2.4	2.4	Mluc_cont1.011200	1155	1239	-847	C	Tc2_ML1_coding	Unknown	0	1296	1212	3
3848	5.2	0.8	0.6	Mluc_cont1.011200	1231	1717	-369	C	Tc2_ML1_coding	Unknown	-808	488	1	3

It would be nice to combine the multiple entries as follows or do something
similar.

591	7.1	0	2.4	Mluc_cont1.010442	1392	1959	-967	C	Tc2_ML1_coding	Unknown	0	1296	1212	1
1816	7	0	7	Mluc_cont1.010866	3614	3890	-836	C	Tc2_ML1_coding	Unknown	-1037	259	1	2
596	3.6	2.4	2.4	Mluc_cont1.011200	1155	1717	-847	C	Tc2_ML1_coding	Unknown	0	1296	1212	3
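
To make the goal concrete, here is a rough sketch of the merge in Python
(not BioPerl, and not an existing script).  It rests on several assumptions
that are not confirmed anywhere above: every record has exactly 15
whitespace-separated fields, there is no header, the query name, begin, end
and ID sit in columns 5, 6, 7 and 15, and fragments of one element share both
the query name and the ID.  The function names are made up for illustration.
The merged row keeps the first fragment's fields and only widens the query
begin/end, as in the example above.

```python
from collections import OrderedDict

N_FIELDS = 15                        # assumed number of columns per record
QUERY, BEGIN, END, ID = 4, 5, 6, 14  # assumed 0-based column positions


def read_records(text):
    """Cut the text into 15-field records, ignoring where lines wrap."""
    tokens = text.split()
    if len(tokens) % N_FIELDS:
        raise ValueError("token count is not a multiple of %d" % N_FIELDS)
    return [tokens[i:i + N_FIELDS] for i in range(0, len(tokens), N_FIELDS)]


def merge_fragments(records):
    """Combine records sharing a query name and ID into a single row."""
    merged = OrderedDict()  # keeps rows in order of first appearance
    for rec in records:
        key = (rec[QUERY], rec[ID])
        if key not in merged:
            # First fragment seen: its fields become the merged row.
            merged[key] = list(rec)
            continue
        row = merged[key]
        # Later fragment: only widen the query begin/end coordinates.
        row[BEGIN] = str(min(int(row[BEGIN]), int(rec[BEGIN])))
        row[END] = str(max(int(row[END]), int(rec[END])))
    return list(merged.values())


def merge_repeatmasker(text):
    """Return the merged rows as tab-separated strings."""
    return ["\t".join(row) for row in merge_fragments(read_records(text))]
```

Calling merge_repeatmasker() on the text of the output file would return one
tab-separated row per element.  Because the reader counts fields rather than
lines, it does not matter whether a record is wrapped over two lines.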

Any help would be appreciated.
