[Biopython] [Samtools-help] Segmentation fault

Mic mictadlo at gmail.com
Tue Oct 18 12:05:01 UTC 2011


Thank you for your tip, but I got an error:
$ulimit -c unlimited
$SubsetBAM-P.py --BAM ex1.bam -f ex1.fa -o new.bam
Read fasta ...
Done!
creating subset....
chr1 Done 1464
EAS56_57:6:190:289:82 69 0 99 0 None 0 99 35
CTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA <<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<; [('MF',
192)]
chr2 Done 1806
B7_591:8:4:841:340 73 1 0 99 [(0, 36)] -1 -1 36
TTCAAATGAACTTCTGTAATTGAAAAATTCATTTAA
<<<<<<<<;<<<<<<<<;<<<<<;<;:<<<<<<<;; [('MF',
18), ('Aq', 77), ('NM', 0), ('UQ', 0), ('H0', 1), ('H1', 0)]
Done!
xxxxx
Segmentation fault (core dumped)
$file core
core: ERROR: cannot open `core' (No such file or directory)


I also inserted "print reads[0]" in the method GenerateSubsetBAM:

def GenerateSubsetBAM(bam_filename, ref_name):
    reads = []
    bam_fh = pysam.Samfile(bam_filename, "rb")

    for read in bam_fh.fetch(ref_name):
        reads.append(read)

    print ref_name + ' Done ' + str(len(reads))
    print reads[0]   # works fine!
    return (ref_name, reads)

and as output I got:

python SubsetBAM-P.py --BAM ex1.bam -f ex1.fa -o new.bam
Read fasta ...
Done!
creating subset....
chr1 Done 1464
EAS56_57:6:190:289:82 69 0 99 0 None 0 99 35
CTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA <<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<; [('MF',
192)]
chr2 Done 1806
B7_591:8:4:841:340 73 1 0 99 [(0, 36)] -1 -1 36
TTCAAATGAACTTCTGTAATTGAAAAATTCATTTAA
<<<<<<<<;<<<<<<<<;<<<<<;<;:<<<<<<<;; [('MF',
18), ('Aq', 77), ('NM', 0), ('UQ', 0), ('H0', 1), ('H1', 0)]
Done!
xxxxx
Segmentation fault

Why does reads['chr1'][0] caused the Segmentation fault?

Thank you in advance.


On Tue, Oct 18, 2011 at 9:44 PM, Martin Mokrejs <mmokrejs at fold.natur.cuni.cz
> wrote:

> Before running your python code, do (under bash):
> $ ulimit -c unlimited
> $ python mypython.py
> $ file core
> $ gdb /usr/bin/python ./core
> gdb> where
> gdb> bt full
> gdb> quit
> $
>
> Martin
>
> Mic wrote:
> > Hello,
> > Thank you for your email. I updated the code and find out that
> >     print reads['chr1']     #works fine
> > but
> >     print reads['chr1'][0]  #caused Segmentation fault
> >
> > Please find below the updated code:
> >
> > from Bio import SeqIO
> > import pysam
> > from optparse import OptionParser
> > import subprocess, os, sys
> > from multiprocessing import Pool
> > import functools
> >
> >
> > def GetReferenceInfo(referenceFastaPath):
> >   referencenames = []
> >   referencelengths = []
> >   referenceFastaFile = open(referenceFastaPath)
> >   for record in SeqIO.parse(referenceFastaFile, "fasta"):
> >     referencenames.append(record.name)
> >     referencelengths.append(len(record.seq))
> >   referenceFastaFile.close()
> >   return (referencenames, referencelengths)
> >
> >
> > def GenerateSubsetBAM(bam_filename, ref_name):
> >     reads = []
> >     bam_fh = pysam.Samfile(bam_filename, "rb")
> >
> >     for read in bam_fh.fetch(ref_name):
> >         reads.append(read)
> >
> >     print ref_name + ' Done ' + str(len(reads))
> >     return (ref_name, reads)
> >
> >
> > if __name__ == '__main__':
> >   parser = OptionParser("Usage: %prog -b BAMfile -f new_reference_fasta
> -o
> > outputBAM")
> >   parser.add_option("-b", "--BAM", type="string",
> dest="inputBAMFilepath",
> > help="Specify a BAM file")
> >   parser.add_option("-f", "--fasta", type="string", dest="fastaFilepath",
> > help="Specify a reference fasta file.")
> >   parser.add_option("-o", "--output", type="string",
> > dest="outputBAMFilepath", help="Specify an output BAM file.")
> >
> >   (opts, args) = parser.parse_args()
> >
> >   if (opts.inputBAMFilepath is None):
> >     print ("\nSpecify a BAM file. eg. -b large.bam\n")
> >     parser.print_help()
> >   elif not(os.path.exists(opts.inputBAMFilepath)):
> >     print ("\nReference BAM file does not exists: " +
> opts.inputBAMFilepath
> > +"\n")
> >   elif (opts.fastaFilepath is None):
> >     print ("\nSpecify a reference fasta file.  eg. -f Subset.fasta\n")
> >     parser.print_help()
> >   elif not(os.path.exists(opts.fastaFilepath)):
> >     print ("\nReference fasta file does not exists: " +
> opts.fastaFilepath
> > +"\n")
> >   elif os.path.exists(opts.outputBAMFilepath) and
> > not(raw_input(opts.outputBAMFilepath + " exists. Overwrite? (Y/n):
> ")=='Y'):
> >     print ("\nOutput BAM exists. Please specify alternative output file.
> >  eg. -o Subset.bam\n")
> >   else:
> >     print "Read fasta ..."
> >     (ref_names, ref_lengths) = GetReferenceInfo(opts.fastaFilepath)
> >     print 'Done!'
> >
> >     print "creating subset...."
> >     pool = Pool()
> >     GenerateSubsetBAM_wrapper = functools.partial(GenerateSubsetBAM,
> > opts.inputBAMFilepath)
> >     reads = dict(pool.imap_unordered(GenerateSubsetBAM_wrapper,
> ref_names))
> >     pool.close()
> >     print "Done!"
> >
> >     print reads['chr1']     #works fine
> >     print "xxxxx"
> >
> >     print reads['chr1'][0]  #caused Segmentation fault
> >
> > I run the code with the pysam-0.5 examples (pysam-0.5/tests) in the
> > following way:
> >
> > python SubsetBAM-P.py --BAM ex1.bam -f ex1.fa -o new.bam
> >
> > Read fasta ...
> > Done!
> > creating subset....
> > chr1 Done 1464
> > chr2 Done 1806
> > Done!
> > [<csamtools.AlignedRead object at 0x2b975635d168>, ...,
> > <csamtools.AlignedRead object at 0x2b35d89b6ca8>]
> > xxxxx
> > Segmentation fault
> >
> > Thank you in advance.
> >
> >
> > On Tue, Oct 18, 2011 at 7:00 PM, Peter Cock <p.j.a.cock at googlemail.com
> >wrote:
> >
> >> On Tue, Oct 18, 2011 at 4:44 AM, Mic <mictadlo at gmail.com> wrote:
> >>> Hello,
> >>> I have tried to generate a subset BAM, but I get a 'Segmentation fault'
> >> with
> >>> the following code:
> >>> from Bio import SeqIO
> >>> import pysam
> >>> from optparse import OptionParser
> >>> import subprocess, os, sys
> >>> from multiprocessing import Pool
> >>> import functools
> >>> ...
> >>
> >> I tried this and it seemed to get stuck much earlier. Could you
> >> cut down the example a bit by removing the multiprocessing?
> >>
> >> Peter
> >>
> >> P.S. Also you can remove the unused "import argparse" line.
> >>
> > _______________________________________________
> > Biopython mailing list  -  Biopython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> >
> >
>



More information about the Biopython mailing list