July 13, 2013, 8:31 PM — To begin, the format of a CSV file is not quite as easy as when,where,what. If it were, awk would have no trouble selecting any column you might want to see with a command such as this:
$ awk -F, '{print $3}'
Instead, CSV files can contain fields that contain blanks and fields that contain commas. When a field in a CSV file contains commas, for example, the entire field needs to be enclosed in quotes so that applications such as Excel can determine the extent of the field.
1997,Infonet,"El Segundo, CA" 1998,InCap,"Corte Madera, CA" 1998,WPI,"San Francisco, CA" 1999,E*Trade,"San Francisco, CA" 2001,TCS,"Annapolis, MD"
Other fields may be quoted, but don't need to be.
Were we to try to pull out the third field in these records using only the command as a separator, we would get this:
"el "Corte "San "Annapolis
Clearly this isn't what we'd want.
Similarly, were we to use double quotes, we'd have a different sort of problem. The location field would be $2, but the first field would be 1997,Infonet,. And if we were lucky enough to find that each field was quoted, we'd still have to refer to the fields of interest in an odd manner, calling them $2, $4 and $6. $3, for example would be only a comma for each record. And, note, this only works if your CSV quotes every field religiously. Can you count on that?
"1997","Infonet","El Segundo, CA" "1998","InCap","Corte Madera, CA" "1998","WPI","San Francisco, CA" "1999","E*Trade","San Francisco, CA" "2001","TCS","Annapolis, MD"
So, what do you do? You can specify " as the separator and live with the $2, $4, $6 way of referring to your data fields or you can use the Text::CSV - comma-separated values manipulation routines from CPAN to separate the fields in your CSV file properly. Here's a very simple script that shows how it works.
#!/usr/bin/perl use Text::CSV; my $file = 'jobs.csv'; my $csv = Text::CSV->new(); open (CSV, "<", $file) or die $!; while () { if ($csv->parse($_)) { my @columns = $csv->fields(); while (defined($item = shift @columns)){ print $item, "\n"; } } else { my $err = $csv->error_input; print "Failed to parse line: $err"; } print "--\n"; } close CSV;
What is this script doing? First, it employs the Text::CSV module with the use Text::CSV line. This module isn't likely to be installed on your system by default, so you will likely have to download the Text-CSV-0.5.tar.gz file and then run through these steps to set it up:
perl Makefile.PL make make test make install
Once this is done, you can start using the module.
In the script above, we read a file named jobs.csv. In a while loop, we then break out each column using the my @columns = $csv->fields() line.
flickr / TripleScoop


















