Linein Question

Post by George W. » Fri, 07 Jan 2005 05:10:01


I downloaded my T-Mobile phone bill from their web site in CSV format, but
they use '0D'x as the line terminator, and the Object Rexx LINEIN does not
stop at the CR.
Reading with CHARIN, adding a LF, and writing a new file works OK, but is
there an easier way to do this?
Is there some way to define the EOL character(s) for a particular Rexx program?

Thanks, George Barrowcliff
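
For reference, the CHARIN approach described in the question can be done without writing a second file. The following is only a minimal sketch: it assumes the whole file fits in memory and that CHARS reports the full file size, and the sample data used when no file name is passed is made up.

```rexx
/* Sketch: split CR-terminated data into records without LINEIN. */
parse arg infile                     /* optional file name (hypothetical usage) */
cr = '0D'x
if infile \= '' then do
  data = charin(infile, 1, chars(infile))   /* read the whole file at once */
  call stream infile, 'C', 'CLOSE'
end
else do                              /* made-up sample so the sketch runs stand-alone */
  data = '"a","b"' || cr || '"c","d"' || cr
end

n = 0
do while data \== ''
  parse var data rec (cr) data       /* text before the next CR; the rest stays in data */
  n = n + 1
  line.n = rec
end
line.0 = n                           /* record count, by the usual stem convention */

do i = 1 to line.0
  say i':' line.i
end
```

Each record ends up in the stem line. and can then be handled exactly as if LINEIN had returned it.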
 
 
 

Post by richg » Fri, 07 Jan 2005 05:39:39

See if you can find a program (or write one in Rexx) that converts Mac
line ends (CR) to PC line ends (CRLF). A similar program called unix2dos
converts Unix line ends (LF) to PC line ends (CRLF).

--
Rich Greenberg Marietta, GA, USA richgr atsign panix.com + 1 770 321 6507
Eastern time. N6LRT I speak for myself & my dogs only. VM'er since CP-67
Canines:Val, Red & Shasta (RIP),Red, husky Owner:Chinook-L
Atlanta Siberian Husky Rescue. www.panix.com/~richgr/ Asst Owner:Sibernet-L

 
 
 

Post by George W. » Fri, 07 Jan 2005 07:45:17


Looking at the T-Mobile data, I find an interesting variation on the
comma-separated value file format.

"11/17/04","Houston, TX","10:28 AM","281-610-3096","","3","-","-","-"

Note the extra comma between Houston and TX.





 
 
 

Post by richg » Fri, 07 Jan 2005 08:44:02

Since it's within a pair of quotes, that comma has no special meaning.

Post by Lee Peedi » Fri, 07 Jan 2005 08:56:52


Oh, yes it does if you're using Rexx parse - if you're loading it into
Excel then you are correct.
 
 
 

Post by richg » Fri, 07 Jan 2005 09:35:05

No, you must special-case it when parsing, or use something like:

parse var csv '"' p.1 '"."' p.2 '"."' p.3 '"."' p.4 . . . . . p.n '"'

If you know how many there will be. If you don't know how many ahead or
there are different numbers on each line, you need a loop, peeling off
one value per loop.
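
The loop described here could look roughly like the sketch below. It assumes every field on the line is wrapped in double quotes, as in the T-Mobile sample; the variable names are illustrative only.

```rexx
/* Sketch: peel one quoted field per pass; assumes ALL fields are quoted. */
csv = '"11/17/04","Houston, TX","10:28 AM","281-610-3096","","3","-","-","-"'
parse var csv '"' rest               /* drop the opening quote */
n = 0
do until rest == ''
  n = n + 1
  if pos('","', rest) > 0 then
    parse var rest p.n '","' rest    /* field ends at the next quote-comma-quote */
  else do
    parse var rest p.n '"' .         /* last field: stop at the closing quote */
    rest = ''
  end
end
p.0 = n

do i = 1 to p.0
  say '#'i p.i
end
```

Because the split target is the three-character string quote-comma-quote, the comma inside "Houston, TX" is left alone and the empty field "" comes back as a null string.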

Post by richg » Fri, 07 Jan 2005 09:37:58

Typo alert:


Should be:

parse var csv '"' p.1 '","' p.2 '","' p.3 '","' p.4 . . . . . p.n '"'
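
Written out in full for the nine-field sample line from earlier in the thread, the corrected template might look like this. It is a sketch under the assumption that every record has exactly nine quoted fields; the variable names are invented for illustration.

```rexx
/* Sketch: fixed-count template for a line with exactly nine quoted fields. */
csv = '"11/17/04","Houston, TX","10:28 AM","281-610-3096","","3","-","-","-"'
parse var csv '"' callDate '","' callPlace '","' callTime '","' callNumber ,
              '","' f5 '","' f6 '","' f7 '","' f8 '","' f9 '"'
say 'date  :' callDate
say 'place :' callPlace
say 'time  :' callTime
say 'number:' callNumber
say 'f5..f9:' '['f5']' '['f6']' '['f7']' '['f8']' '['f9']'
```

The trailing comma on the first line of the template is the Rexx continuation character, not part of the pattern.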

Post by Anthony Bo » Fri, 07 Jan 2005 10:03:41


There doesn't appear to be any method of doing so at present. However, it
would seem to me that such a thing could be achieved, relatively
transparently, in future REXX implementations via a new OPTION, say
something like:

OPTION LINE_TERMINATOR=line_term_string

This approach would seem preferable to adding an additional parameter to
LINEIN specifying the line terminator string.

Food for thought, anyway :) !

Cheers,

Anthony Borla
 
 
 

Post by Lee Peedi » Fri, 07 Jan 2005 10:43:52


Yep, I realize that - matter of fact there was a thread a while back
concerning parsing csv files with commas inside quotes. My point was
simply that a comma inside quotes is still a comma. :-)
 
 
 

Post by richg » Sat, 08 Jan 2005 00:41:51

And your point is correct.

That's why you can't use a simple comma as a parse target on a CSV file.
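
A short illustration of the problem, using a shortened three-field version of the sample line (the shortening is only for brevity):

```rexx
/* Sketch: splitting on a bare comma breaks a quoted field that contains one. */
csv = '"11/17/04","Houston, TX","10:28 AM"'
n = 0
do while csv \== ''
  parse var csv piece ',' csv        /* naive split on every comma */
  n = n + 1
  part.n = piece
  say n':' piece
end
part.0 = n
```

The line has three fields, but the naive split produces four pieces, because "Houston, TX" is cut in two at its embedded comma.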

Post by Dennis Nol » Sun, 09 Jan 2005 17:33:54


Translate comes to mind.

Regards
Dennis
 
 
 

Post by Sahanand » Sun, 09 Jan 2005 19:20:27

I have a routine (written in classic rexx) for parsing a CSV line into a
stem variable.
You are welcome to cut & paste it. It is CSV2STEM at
http://www.yqcomputer.com/

best wishes,
Jon








 
 
 

Post by trex » Tue, 11 Jan 2005 12:52:24

Cracking CSV files is a constant problem for me, as I use them to get into
and out of Excel a lot. My current choice is to use orexx with regular
expressions. Here in re-csv.rex is some fairly tight code that seems to
crack most CSV files that I've come across.

Watch for line wraps....

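/*ORexx*/
str = '"11/17/04","Houston, TX","10:28 AM","281-610-3096","","3","-","-","-"'
csv = .array~of()
/* a field is either a quoted string or a run of non-comma characters */
fpat = .RegularExpression~new('"[^"]*"|[^,]*')
i = 0
do while str~length() \= 0
  start = fpat~pos(str)            /* start of the next match */
  fin = fpat~position              /* position where that match ended */
  i = i + 1
  /* relies on the match starting at column 1, so fin doubles as its length */
  csv[i] = str~substr(start, fin)
  say "#"i csv[i]
  str = str~substr(fin + 2)        /* skip the field and the comma after it */
end
::requires "rxregexp.cls"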

output:
C:\test>orexx re-csv.rex
#1 "11/17/04"
#2 "Houston, TX"
#3 "10:28 AM"
#4 "281-610-3096"
#5 ""
#6 "3"
#7 "-"
#8 "-"
#9 "-"

REX




 
 
 

Post by Sahanand » Tue, 11 Jan 2005 18:11:48

That's a beaut,
Can you get it to allow for the placement of "" inside literals to represent
the literal " which I have seen occasionally?

Jon



 
 
 

Post by trex » Wed, 12 Jan 2005 00:39:59

Jon,
Here is the regex for "" inside literals and also for backslashed quotes \"
from the Unix world. Field #1 is just a number without quotes, which is very
common in CSV files. Field #3 is a text string with no quotes, also common
when the string contains no comma. Commas used as placeholders for empty
fields are also common (see field #11, after field #10 containing "3").
I've added a series of blank fields at the end of the string to see how it
performs with comma placeholders at the tail end. There should be 17 fields
in the test string. The last blank field, #17 (after the last comma), is
not parsed correctly. One can fix this in Rexx by adding a space after that
last comma. In my experience, Excel does not put blank-field commas at the
end of a line when writing out a CSV file anyway. I mention this little
issue because when one combines fields horizontally, record fields can get
out of order.
REX

Watch for linewraps....

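/*ORexx*/
/* test string: fields 1 and 3 are unquoted, as described above and shown in the output */
str = '22.13,"11/17/04",this and that,"Houston, TX","10:28 AM","that is "too early" for me!","I will need \"strong\" coffee","281-610-3096","","3",,"-","-","-",,,'

csv = .array~of()
/* quoted field with backslash-escaped quotes \", or a run of non-comma characters */
fpat = .RegularExpression~new('"([^"]|\\"")*"|[^,]*')
i = 0
do while str~length() \= 0
  start = fpat~pos(str)            /* start of the next match */
  fin = fpat~position              /* position where that match ended */
  i = i + 1
  csv[i] = str~substr(start, fin)
  say "#"i csv[i]
  str = str~substr(fin + 2)        /* skip the field and the comma after it */
end
::requires "rxregexp.cls"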
output:
C:\test>orexx re-csv3.rex

#1 22.13
#2 "11/17/04"
#3 this and that
#4 "Houston, TX"
#5 "10:28 AM"
#6 "that is "too early" for me!"
#7 "I will need \"strong\" coffee"
#8 "281-610-3096"
#9 ""
#10 "3"
#11
#12 "-"
#13 "-"
#14 "-"
#15
#16
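
As an alternative that needs no regex class, and that keeps the trailing blank field the loop above drops, a character-by-character scanner can be written in classic Rexx. This is only a sketch under stated assumptions: it handles the doubled-quote convention ("" inside a quoted field means one literal quote) but not backslash escapes, it strips the surrounding quotes from each field, and the sample line is made up for the demonstration.

```rexx
/* Sketch: character-by-character CSV splitter in classic Rexx.        */
/* Assumption: "" inside a quoted field stands for one literal quote;  */
/* backslash escapes are not handled.                                  */
line = '22.13,"11/17/04",this and that,"Houston, TX","say ""hi""","3",,"-",,'
n = 1                                /* current field number */
f.1 = ''
inq = 0                              /* 1 while inside a quoted field */
i = 1
do while i <= length(line)
  c = substr(line, i, 1)
  select
    when inq & c == '"' & substr(line, i + 1, 1) == '"' then do
      f.n = f.n || '"'               /* doubled quote: keep one, skip the other */
      i = i + 1
    end
    when c == '"' then inq = \inq    /* opening or closing quote */
    when c == ',' & \inq then do     /* separator outside quotes starts a new field */
      n = n + 1
      f.n = ''
    end
    otherwise f.n = f.n || c         /* ordinary character */
  end
  i = i + 1
end
f.0 = n

do k = 1 to f.0
  say '#'k '['f.k']'
end
```

For the sample line this yields ten fields: field 5 comes back as say "hi", and the two empty fields after the final commas are both kept.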

