[Bioperl-l] Bio::Tools:Run::BEDTools using switches with intersect BED

Fields, Christopher J cjfields at illinois.edu
Mon Oct 3 18:34:27 UTC 2016


HI Mathew,

Not sure what’s going on, but can you submit this as a big report?

https://github.com/bioperl/bioperl-run/issues

My guess is there is a discrepancy in the attributes being passed to bedtools; I know this wrapper is older so it may be due to an API change.

chris

From: Bioperl-l <bioperl-l-bounces+cjfields=illinois.edu at mailman.open-bio.org<mailto:bioperl-l-bounces+cjfields=illinois.edu at mailman.open-bio.org>> on behalf of Matthew <mccormack at molbio.mgh.harvard.edu<mailto:mccormack at molbio.mgh.harvard.edu>>
Date: Monday, September 26, 2016 at 2:14 PM
To: BioPerl List <bioperl-l at mailman.open-bio.org<mailto:bioperl-l at mailman.open-bio.org>>
Subject: [Bioperl-l] Bio::Tools:Run::BEDTools using switches with intersect BED


I am using Bio::Tools::Run::BEDTools. I would like to intersect two BED files and print out the intersecting lines of the first BED file. I am  using the code below, but it is not printing out the first file entries.

 my $bedtools_fac2 = Bio::Tools::Run::BEDTools->new( -command => 'intersect');          $bedtools_fac2->set_parameters( -switches => 'write_entry_1');
 my $result_file2 = $bedtools_fac2->run( -bgv1 => $bed1, -bgv2 => $files);

The first file, $bed1, file looks like this:

chr1    5686    5887    1:5786    130    .    44.0    0.00    4.35    100    ABI3VP1    AT5G60130    col
chr1    14354    14555    1:14454    147    .    68.6    0.00    14.69    100    ABI3VP1    AT5G60130    col
chr1    21440    21641    1:21540    163    .    90.9    0.00    27.26    100    ABI3VP1    AT5G60130    col
chr1    41649    41850    1:41749    129    .    42.0    0.00    3.75    100    ABI3VP1    AT5G60130    col

The second file, $files, looks like this:
chr1    1    3630    AT1G01010     intergenic    +    AT1G01010    0    255,0,0
chr1    631    3630    AT1G01010     upstream_3000    +    AT1G01010    0    255,0,0
chr1    760    3759    AT1G01010     upstream_3000_translation_start    +    AT1G01010    0    255,0,0
chr1    2631    3630    AT1G01010     upstream_1000    +    AT1G01010    0    255,0,0
chr1    2760    3759    AT1G01010     upstream_1000_translation_start    +    AT1G01010    0    255,0,0

The output, $result_file2, looks like this:

chr1    4775    4976    AT1G01010.1     cds_updated    +    AT1G01010    0    255,0,0
chr1    4775    4976    AT1G01010.1    exon    +    AT1G01010    0    255,0,0
chr1    1635    1836    AT1G01010     intergenic    +    AT1G01010    0    255,0,0
chr1    1635    1836    AT1G01010     upstream_3000    +    AT1G01010    0    255,0,0

Columns 2 and 3, the numbers (4775  4976), in the output file may have come from the first file, but the rest seems to come from the 2nd file. The intersection is being done, but I would like the output to come from the first file. I can use -switches => 'write_entry_1' and get the same result with -switches => 'write_entry_2'

Matthew
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.open-bio.org/pipermail/bioperl-l/attachments/20161003/618ebe73/attachment.html>


More information about the Bioperl-l mailing list