[X-Unix] Semi-newbe question

Sun Nov 28 03:29:23 PST 2004

On Nov 28, 2004, at 4:30 AM, Russell McGaha wrote:

> Folks;
> 	I need a few pointers on an appropriate way to do the following:
> 		I've two files, call them file1 & file2,
> 		formatted as follows: text1, text2, .... textx [return] i.e. [ascii 
> chr(13)]
> 						textxy, ..., textxx [return]
>
> (in other words a csv text file) and I need to take the commas out and 
> replace them with returns. (What I'd REALLY like to do is combine them 
> [textx1,texty1, return textx2,texty2, ... etc.] but THAT'S gravy, I'd 
> settle for just the biscuit right now.)  BTW each file is a couple of 
> hundred K, and I've got to do this on a repetitive basis [one to two 
> times a week].
> 		I USED to be able to just whip up a small program to do the job, but 
> I'm woefully out of practice, so I'd be grateful for a cmd line 
> EXAMPLE of how to do this.

I asked earlier which CSV conventions the files follow, but we might 
not need to know, because Python's csv module has a dialect guesser.
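
To give an idea of what the guesser does, here is a minimal sketch that 
calls it on its own. The function name guess_delimiter, the candidates 
string and the fallback value are my own illustrative choices, not part 
of the csv module; only csv.Sniffer, its sniff() method and csv.Error 
come from the library.

```python
import csv


def guess_delimiter(sample, candidates=",;\t", fallback=","):
    """Return the delimiter csv.Sniffer detects in sample.

    candidates and fallback are assumptions made for this sketch:
    only these characters are considered, and fallback is returned
    when the sniffer cannot decide.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # sniff() raises csv.Error when no delimiter can be determined
        return fallback
```

Restricting the candidates keeps the sniffer from picking an ordinary 
letter as the delimiter when the sample is not really CSV.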

I wrote this simple Python script. In principle nothing extra needs to 
be installed, since Python comes with Mac OS X. The usage is:

     $ python combine_csv.py file1 file2 result

The script does the pairing you wanted, i.e., "[textx1,texty1, return 
textx2,texty2, ... etc.]", and assumes both input files have the same 
number of lines (otherwise the pairing itself is ill-defined).
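
Note that zip(), which the script uses for the pairing, simply stops at 
the shorter input, so a length mismatch would go unnoticed. If you would 
rather be told about it, something along these lines could replace the 
plain zip(). This is only a sketch: pair_strict and the _MISSING 
sentinel are names I made up for the illustration.

```python
from itertools import zip_longest

# Sentinel marking "this input ran out" (hypothetical helper name)
_MISSING = object()


def pair_strict(rows1, rows2):
    """Yield (row1, row2) pairs; raise ValueError if the lengths differ."""
    pairs = zip_longest(rows1, rows2, fillvalue=_MISSING)
    for number, (row1, row2) in enumerate(pairs, start=1):
        if row1 is _MISSING or row2 is _MISSING:
            raise ValueError("inputs differ in length at row %d" % number)
        yield row1, row2
```

It works on any two iterables, so it can be applied both to the rows of 
the two files and to the fields of two rows.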

I tested it a bit, but I am sure it has some hidden assumption. If it 
does not work, or it does not do what you wanted, or the dialect 
guesser can't guess the dialect, or whatever, please don't hesitate to 
report the problem.

-- fxn

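# Requires Python 3
import csv
import sys

# Process parameters
if len(sys.argv) != 4:
    print("Usage: python %s file1 file2 result" % sys.argv[0])
    sys.exit(1)
file1, file2, result = sys.argv[1:]

# Guess the CSV dialect from the start of the first file.
# newline="" leaves line-ending handling to the csv module, which
# matters for files that end lines with a bare carriage return.
with open(file1, newline="") as f:
    csv_sample = f.read(1024)
try:
    csv_dialect = csv.Sniffer().sniff(csv_sample)
except csv.Error:
    # sniff() raises csv.Error when it cannot determine the dialect
    print("Sorry, the files have an unknown CSV dialect")
    sys.exit(1)

# Process files: pair the n-th field of each row of file1 with the
# n-th field of the matching row of file2, one pair per output line
with open(file1, newline="") as f1, \
     open(file2, newline="") as f2, \
     open(result, "w", newline="") as out:
    # Read both inputs with the guessed dialect, not the default one
    csv1 = csv.reader(f1, dialect=csv_dialect)
    csv2 = csv.reader(f2, dialect=csv_dialect)
    csvr = csv.writer(out, dialect=csv_dialect)
    for row1, row2 in zip(csv1, csv2):
        for field1, field2 in zip(row1, row2):
            csvr.writerow([field1, field2])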
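
For the plain "biscuit" version (just replacing the commas with 
returns), a sketch along the same lines might look like the following. 
It assumes Python 3 and a comma delimiter, and it strips blanks around 
each field, which may or may not be what you want. The function name 
fields_one_per_line is made up for the example.

```python
import csv


def fields_one_per_line(src, dst, delimiter=","):
    """Write every field found in src to dst, one field per line.

    src is any iterable of text lines (for example an open file) and
    dst is any object with a write() method.
    """
    for row in csv.reader(src, delimiter=delimiter):
        for field in row:
            # Stripping surrounding blanks is an assumption of this sketch
            dst.write(field.strip() + "\n")
```

With real files, open the input with newline="" so that the csv module 
sees the original line endings, for instance 
fields_one_per_line(open("file1", newline=""), open("result", "w")).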