[mdlug] csv files
John Wiersba
jrw32982 at yahoo.com
Sun Nov 15 20:49:14 EST 2009
I use the following Perl functions to parse CSV records into fields and to create CSV records from a list of fields. The functions do not handle records with embedded newlines. You can embed the functions in your own Perl code, or store the code in a file named CSV.pm in a directory on Perl's module search path (@INC) and then write code like the following two scripts.
-- John Wiersba
#!/usr/bin/perl
# convert csv records to colon-separated fields
use CSV;
while (<>) {
    my @fields = csv2fields($_);
    print join(":", @fields), "\n";
}
or
#!/usr/bin/perl
# convert colon-separated fields to csv records
use CSV;
while (<>) {
    chomp;
    my @fields = split /:/, $_, -1;   # limit of -1 keeps trailing empty fields
    print fields2csv(@fields), "\n";
}
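When splitting colon-separated input, note that Perl's split drops trailing empty fields unless it is given a negative limit, which matters for records that end in empty columns. A minimal sketch of the difference, using a made-up sample line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "a:b::";                   # sample record with two trailing empty fields

my @default = split /:/, $line;       # no limit: trailing empty fields are dropped
my @all     = split /:/, $line, -1;   # negative limit: trailing empty fields are kept

print scalar(@default), "\n";         # prints 2
print scalar(@all), "\n";             # prints 4
```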
Contents of CSV.pm:
package CSV;
use Exporter ();
use vars qw{@ISA @EXPORT};
@ISA = qw{Exporter};
@EXPORT = qw{csv2fields fields2csv};
sub fields2csv {
    my (@fields) = @_;
    for (@fields) {
        $_ = "<NULL>" unless defined $_;   # handle undefined value
        next unless /[,"]/;                # commas and quotes need special handling
        s/"/""/g;                          # double up quotes
        $_ = qq{"$_"};                     # surround with quotes
    }
    return join ",", @fields;
}
sub csv2fields {
    my ($csv) = @_;
    chomp $csv;
    $csv .= "," if $csv ne "";      # make it easier to parse
    my @fields;
    while ($csv =~
        / "(                        # $1: capture everything between quotes
              (?:""|[^"])*          # doubled quotes or non-quotes
          )"
          ([^,]+)?                  # $2: check for garbage before comma
          ,                         # terminating comma
        |
          (")                       # $3: unclosed quote
        |
          ([^,]*),                  # $4: capture everything between commas
        /gx
    ) {
        if (defined $2 || defined $3) {
            my $err2 = defined $2;
            $csv =~ s/,\z//;
            die(($err2 ? "data after quoted field" : "unclosed quote")
                . " in csv line: <$csv>\n");
        }
        my $field = $+;
        $field =~ s/""/"/g if defined $1;
        push @fields, $field;
    }
    return @fields;
}
1;
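A short round-trip check can show how the two functions fit together and how a malformed record is reported. This is only a sketch: it assumes CSV.pm is saved in the same directory as the script (Perl 5.26 and later no longer search the current directory by default, hence the use lib line), and the sample field values are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use FindBin qw($Bin);   # core module: directory containing this script
use lib $Bin;           # assumption: CSV.pm is stored next to this script
use CSV;

# Fields with a comma, embedded quotes, and an empty value.
my @in  = ('plain', 'has,comma', 'say "hi"', '');
my $rec = fields2csv(@in);
print "$rec\n";         # prints: plain,"has,comma","say ""hi""",

# Parsing the record again should give back the same four fields.
my @out = csv2fields($rec);
print scalar(@out), " fields\n";

# csv2fields dies on malformed input, so wrap the call in eval to trap it.
my @bad = eval { csv2fields('a,"unclosed') };
print "error: $@" if $@;
```

The last call fails with "unclosed quote in csv line: <a,"unclosed>", so a script that reads untrusted input may want the eval around each record rather than letting the whole run abort.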