[Biopython] [Samtools-help] Segmentation fault

Peter Cock p.j.a.cock at googlemail.com
Tue Oct 18 12:58:47 UTC 2011


On Tue, Oct 18, 2011 at 11:26 AM, Mic <mictadlo at gmail.com> wrote:
> Hello,
> Thank you for your email. I updated the code and found out that
>     print reads['chr1']     #works fine
> but
>     print reads['chr1'][0]  #caused Segmentation fault
> Please find below the updated code:
> ...

Your pool version doesn't run on my machine; multiprocessing
fails with:
TypeError: type 'partial' takes at least one argument
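
As a hedged aside: this is the message functools.partial raises when
it is constructed with no arguments. On some older Python 2 releases
this may happen when multiprocessing tries to unpickle a partial object
sent to a worker, but whether that is what happens in the posted script
is an assumption, since that script is not reproduced here. A minimal
sketch that triggers only the exception itself (the helper name is
hypothetical):

```python
import functools


def call_partial_without_arguments():
    """Return the TypeError raised by functools.partial() with no arguments."""
    try:
        # partial needs at least the callable to wrap; calling it bare fails.
        functools.partial()
    except TypeError as err:
        return err
    return None


error = call_partial_without_arguments()
print(type(error).__name__)
print(error)
```

If the pool version builds its worker with functools.partial, swapping
in a plain module-level function is one thing to try.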

Here's a version using a single thread, which works fine
for me. What does it do on your machines? Either way,
this should help narrow down the cause of the segmentation fault.

from Bio import SeqIO
import pysam
import subprocess, os, sys

def GetReferenceInfo(referenceFastaPath):
    # Collect the name and length of every sequence in the reference FASTA.
    referencenames = []
    referencelengths = []
    referenceFastaFile = open(referenceFastaPath)
    for record in SeqIO.parse(referenceFastaFile, "fasta"):
        referencenames.append(record.name)
        referencelengths.append(len(record.seq))
    referenceFastaFile.close()
    return (referencenames, referencelengths)


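def GenerateSubsetBAM(bam_filename, ref_name):
    # Collect every read aligned to ref_name from the BAM file.
    reads = []
    bam_fh = pysam.Samfile(bam_filename, "rb")

    for read in bam_fh.fetch(ref_name):
        reads.append(read)

    print ref_name + ' Done ' + str(len(reads))
    # Returns a tuple: (reference name, list of reads).
    return (ref_name, reads)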

bam_filename = "ex1.bam"
fasta_filename = "ex1.fa"

print "Read fasta ..."
(ref_names, ref_lengths) = GetReferenceInfo(fasta_filename)
print 'Done!'

print "creating subset...."
reads = dict()
for ref in ref_names:
    reads[ref] = GenerateSubsetBAM(bam_filename, ref)
print "Done!"

print reads['chr1']     #works fine
print "xxxxx"
print reads['chr1'][0]  #also fine
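
One thing to keep in mind when comparing results: in the script above,
GenerateSubsetBAM returns a (ref_name, reads) tuple, so reads['chr1'][0]
is the reference name and the first read is at reads['chr1'][1][0].
A small sketch of that indexing, using a hypothetical stand-in function
and placeholder strings instead of pysam reads so it runs without a BAM
file:

```python
def generate_subset(ref_name, fetched_reads):
    """Hypothetical stand-in for GenerateSubsetBAM: same return shape, no pysam."""
    reads = list(fetched_reads)
    return (ref_name, reads)


# Placeholder strings stand in for the read objects pysam would return.
reads = {"chr1": generate_subset("chr1", ["read_a", "read_b"])}

name, chr1_reads = reads["chr1"]
first_item = reads["chr1"][0]      # the reference name, not a read
first_read = reads["chr1"][1][0]   # the first read in the list
print(first_item)
print(first_read)
```

So a test that is meant to touch an actual read object needs the second
form.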

--

Peter



