[Biopython] entire sequence file is unintentionally being loaded
Peter Cock
p.j.a.cock at googlemail.com
Wed Nov 9 20:45:24 UTC 2016
Great.
If you are still using the index approach, you might be able
to use the get_raw method to output the records without
ever needing to parse them into SeqRecord objects?
e.g.
from Bio import SeqIO

index1 = SeqIO.index(file1, "fastq")
index2 = SeqIO.index(file2, "fastq")
output_file = open("wanted.fastq", "wb")  # get_raw returns raw bytes, so write in binary mode
for key in my_list_of_keys:
    # Use key + "/1" and key + "/2" if you have old-style paired names
    output_file.write(index1.get_raw(key))
    output_file.write(index2.get_raw(key))
output_file.close()
Peter
On Wed, Nov 9, 2016 at 8:29 PM, Liam Thompson <dejmail at gmail.com> wrote:
> Hi Peter
>
> Apologies for the inadequate description, but you understood the gist of it.
>
> Thank you for the suggestions. You were right about zip(); I was unaware
> that it would defeat the memory-cautious iterators. The itertools.izip
> seems to have sorted things out as suggested, although now I need to spend
> some time speeding the whole script up.
>
> I will try .itervalues() as well. I did try it before, but it complained
> too, perhaps for different reasons. I will investigate and report back.
>
>
> Liam
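For reference, a minimal sketch of the itertools.izip approach mentioned
above. This assumes Python 2, where izip exists and the built-in zip is
eager (it builds a full list); on Python 3 the built-in zip is already
lazy. The file names, output names and my_list_of_keys are placeholders,
not anything from the original script:

from itertools import izip  # lazy pairing of the two parsers
from Bio import SeqIO

wanted = set(my_list_of_keys)  # IDs of the read pairs to keep

out1 = open("wanted_1.fastq", "w")
out2 = open("wanted_2.fastq", "w")
records1 = SeqIO.parse("reads_1.fastq", "fastq")
records2 = SeqIO.parse("reads_2.fastq", "fastq")
for rec1, rec2 in izip(records1, records2):
    # Both files are streamed record by record, so neither is loaded whole
    if rec1.id in wanted:
        SeqIO.write(rec1, out1, "fastq")
        SeqIO.write(rec2, out2, "fastq")
out1.close()
out2.close()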