cat gzip files

Post by ivan » Sat, 28 Apr 2007 07:52:57


Hi,

If I have a few thousand .gz files and I cat them together into one file,
should the result be a valid .gz file?

Thanks,

--
Ivan Novick
http://www.yqcomputer.com/
 
 
 

Post by Giorgos Ke » Sat, 28 Apr 2007 07:55:33


XXXX@XXXXX.COM writes:

Probably yes. A short test I ran locally was successful.

To be on the safe side, you can always use something like the following
though:

gzip -cd *.gz | gzip -9c - > /output/path/all-files.gz
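For anyone who would rather script it, here is a minimal Python sketch of the same decompress-and-recompress idea. The function name `recompress`, the temporary directory, and the `part0.gz`/`part1.gz` file names are made up for the demo. It streams each input through `shutil.copyfileobj`, so nothing has to fit in memory, and the output is written as one gzip stream rather than a chain of members.

```python
import gzip
import os
import shutil
import tempfile


def recompress(src_paths, dst_path, level=9):
    """Decompress each input in order and write one new gzip stream.

    Hypothetical helper, roughly equivalent to:
        gzip -cd a.gz b.gz ... | gzip -9c - > dst_path
    """
    with gzip.open(dst_path, "wb", compresslevel=level) as dst:
        for path in src_paths:
            with gzip.open(path, "rb") as src:
                # Copy in chunks so large inputs are never fully in memory.
                shutil.copyfileobj(src, dst)


# Small demo with two throwaway input files (names are illustrative only).
workdir = tempfile.mkdtemp()
sources = []
for i, text in enumerate([b"this is\n", b"a test\n"]):
    path = os.path.join(workdir, f"part{i}.gz")
    with gzip.open(path, "wb") as f:
        f.write(text)
    sources.append(path)

merged = os.path.join(workdir, "all-files.gz")
recompress(sources, merged)

with gzip.open(merged, "rb") as f:
    restored = f.read()
print(restored.decode(), end="")
```

Unlike the shell glob, this processes the inputs in whatever order the list gives them, so sort the list first if the order matters.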

 
 
 

Post by Jean-Rene » Sat, 28 Apr 2007 08:04:12

* XXXX@XXXXX.COM [2007.04.26 22:52]:

Nothing like trying:

$ cat <(echo "this is" | gzip -c) <(echo "a test" | gzip -c) | zcat
this is
a test

--
JR
 
 
 

Post by ivan » Sat, 28 Apr 2007 08:20:00


I think it generally works. I was wondering more whether it is a hack or
supported functionality of gzip.


So this unzips and then rezips the data, right? I think just doing cat is
much faster, and I have a lot of data to process. But again, I'm not sure
whether gzip is really meant to allow this:

cat *gz > new.gz

--
Ivan Novick
http://www.yqcomputer.com/
 
 
 

Post by Barry Marg » Sat, 28 Apr 2007 14:00:28

In article <20070426192003.293$ XXXX@XXXXX.COM >, XXXX@XXXXX.COM




Yes, from <http://www.yqcomputer.com/>:

A gzip file consists of a series of "members" (compressed data sets).
The format of each member is specified in the following section. The
members simply appear one after another in the file, with no
additional information before, between, or after them.

Thus, if file1.gz contains <member1a><member1b><member1c>, and file2.gz
is <member2a><member2b><member2c>, when you cat them together you'll get
<member1a><member1b><member1c><member2a><member2b><member2c>, which is
obviously a series of members, and thus also a valid gzip file.
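A quick way to check this without touching the disk is a small Python sketch like the one below. The two byte strings stand in for file1.gz and file2.gz (one member each, for brevity), and joining them with `+` plays the role of `cat`.

```python
import gzip

# Two independent gzip streams, standing in for file1.gz and file2.gz.
part1 = gzip.compress(b"this is\n")
part2 = gzip.compress(b"a test\n")

# Plain byte concatenation, which is all that `cat file1.gz file2.gz` does.
combined = part1 + part2

# gzip.decompress reads every member in turn and joins the results.
result = gzip.decompress(combined)
print(result.decode(), end="")
```

Nothing is rewritten between the members: the combined stream is just the two inputs back to back.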

--
Barry Margolin, XXXX@XXXXX.COM
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
 
 
 

Post by Mikko Rauh » Tue, 01 May 2007 10:00:37


And, to put it more bluntly, from man gzip, at least on my Ubuntu system:

ADVANCED USAGE
       Multiple compressed files can be concatenated. In this case,
       gunzip will extract all members at once. For example:

             gzip -c file1 > foo.gz
             gzip -c file2 >> foo.gz

       Then

             gunzip -c foo

       is equivalent to

             cat file1 file2

--
Mikko Rauhala - XXXX@XXXXX.COM - <URL: http://www.yqcomputer.com/ >
Transhumanist - WTA member - <URL: http://www.yqcomputer.com/ >
Singularitarian - SIAI supporter - <URL: http://www.yqcomputer.com/ >
 
 
 

Post by Tom » Sat, 05 May 2007 00:41:50

Something to consider: if you concatenate multiple files together, the
gzip trailer at the end of the result will not describe the whole file.
It is accurate only for the last member (provided that member is under
4 GB, depending on your version).
For example:
echo "This is the first file, and it's bigger than the next" >file1
echo "This is the second file." >file2

gzip -c file1 > file.gz ; gzip -c file2 >> file.gz

gzip -lv file.gz
method  crc       date   time   compressed  uncompr.  ratio    uncompressed_name
defla   5d70c4d3  May 3  10:36         122        25  -292.0%  file

ls -latr file*
-rw-r--r-- 1 me myself 54 May 3 10:35 file1
-rw-r--r-- 1 me myself 25 May 3 10:35 file2
-rw-r--r-- 1 me myself 122 May 3 10:36 file.gz

Note that the trailer at the end of the file contains the byte size and
CRC value for the last member only, not for the complete gzip file.
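If you need the real totals, one option is to walk the members yourself. The Python sketch below is only an illustration under a few assumptions: it rebuilds the two sample files in memory, reads the final 8 bytes (CRC32, then the uncompressed size modulo 2**32, both little-endian), and then uses a made-up helper, `member_sizes`, to decompress member by member. It also assumes there is no padding or junk after the last member.

```python
import gzip
import struct
import zlib

first = b"This is the first file, and it's bigger than the next\n"
second = b"This is the second file.\n"

# Same effect as: gzip -c file1 > file.gz ; gzip -c file2 >> file.gz
combined = gzip.compress(first) + gzip.compress(second)

# The last 8 bytes of the file are the trailer of the LAST member only.
trailer_crc, trailer_size = struct.unpack("<II", combined[-8:])


def member_sizes(data):
    """Return the uncompressed size of each gzip member in `data`.

    Hypothetical helper; assumes nothing follows the final member.
    """
    sizes = []
    while data:
        # wbits = 16 + MAX_WBITS tells zlib to expect a gzip wrapper.
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)
        out = d.decompress(data) + d.flush()
        sizes.append(len(out))
        # Whatever is left after this member is the start of the next one.
        data = d.unused_data
    return sizes


sizes = member_sizes(combined)
print("trailer size:", trailer_size)
print("per-member sizes:", sizes, "total:", sum(sizes))
```

This holds everything in memory, so for large files you would want to feed the decompressor in chunks instead.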

Not sure if that helps, but it was something I had to address once.
Tom