[Bioperl-l] Enquiry on gi_taxid_nucl.dmp.gz
Roy Chaudhuri
roy.chaudhuri at gmail.com
Wed Aug 25 07:12:15 EDT 2010
> Also it would be safer for the split to be whitespace matching and that
> you want the two first columns from the file. Doing this would
> eliminate the need for the chomp on the line above.
>
> my ($gi, $taxid) = split(/\s+/, $_);
>
> instead of
>
> chomp;
> my ($gi, $taxid) = split(" ", $_,2);
Sorry to be pedantic, but according to perldoc -f split: "As a special
case, specifying a PATTERN of space (' ') will split on white space just
as "split" with no arguments does"
The only difference between patterns of " " and /\s+/ is that the latter
will return an initial null field if there is leading white space, which
may or may not be what you want.
$ perl -e 'print join("-", split(" ", " 1\t2 3")), "\n"'
1-2-3
$ perl -e 'print join("-", split(/\s+/, " 1\t2 3")), "\n"'
-1-2-3
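As a rough sketch of how this plays out on a line read from a file, the script below compares the three forms discussed above. The sample line is made up for illustration (a hypothetical gi/taxid pair with leading white space and a trailing newline), not taken from the real gi_taxid_nucl.dmp file.

```perl
use strict;
use warnings;

# Hypothetical input line: leading space, tab separator, trailing newline.
my $line = " 12345\t9606\n";

# Special-case pattern " ": leading white space is skipped, and the
# trailing newline is consumed as a separator, so no chomp is needed.
my @special = split(" ", $line);

# Regex /\s+/: the leading white space produces an initial empty field.
my @regex = split(/\s+/, $line);

# Pattern " " with a limit of 2: the second field is the untouched rest
# of the line, so it keeps the newline unless the line is chomped first.
my @limited = split(" ", $line, 2);

print "special: ", join("-", @special), "\n";
print "regex:   ", join("-", @regex), "\n";
print "limited: second field ",
      ($limited[1] =~ /\n\z/ ? "still ends" : "does not end"),
      " with a newline\n";
```

If the sketch is right, this shows that the chomp in the original code is only needed because of the limit argument, not because of the " " pattern itself.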
Cheers.
Roy.