...
Code Block |
---|
>>> dset = f.create_dataset("BigDataset", shape=(32,185,388), dtype=np.int16, chunks=(1,185,388), compression="gzip")
>>> dset.compression
'gzip'
>>> dset.compression_opts
4 |
...
Code Block |
---|
dset = myfile.create_dataset("Dataset4", shape=(32,185,388), dtype=np.int16, chunks=(1,185,388), compression="lzf") |
LZF features:
- Works with all HDF5 types
- Fast compression and decompression
- Is only available in Python (ships with h5py); C source available
Extra filters in HDF5
SHUFFLE
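Rearranges the bytes of each chunk so that the first bytes of all elements are stored together, followed by the second bytes, and so on. Low and high bytes are thus treated separately, which usually makes the data more compressible.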
Code Block |
---|
>>> dset = myfile.create_dataset("Data", shape=(32,185,388), dtype=np.int16, chunks=(1,185,388), compression="gzip",
shuffle=True) |
SHUFFLE features:
- Available with all HDF5 distributions
- Very fast (negligible compared to the compression time)
- Only useful in conjunction with filters like GZIP or LZF
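The effect of byte shuffling can be illustrated without HDF5. The sketch below is not HDF5's filter itself: it uses only the Python standard library, synthetic 16-bit values, and two hypothetical helpers (`shuffle_bytes`, `unshuffle_bytes`) to show why grouping low and high bytes helps a compressor such as zlib, the algorithm behind GZIP.

```python
import random
import struct
import zlib

def shuffle_bytes(buf: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(buf[i::itemsize] for i in range(itemsize))

def unshuffle_bytes(buf: bytes, itemsize: int) -> bytes:
    """Inverse of shuffle_bytes: interleave the byte planes again."""
    n = len(buf) // itemsize
    out = bytearray(len(buf))
    for i in range(itemsize):
        out[i::itemsize] = buf[i * n:(i + 1) * n]
    return bytes(out)

# Synthetic int16 values (an assumption for illustration):
# noisy low byte, constant high byte.
rng = random.Random(0)
values = [1000 + rng.randint(-20, 20) for _ in range(50_000)]
raw = struct.pack(f"<{len(values)}h", *values)

shuffled = shuffle_bytes(raw, itemsize=2)
plain_size = len(zlib.compress(raw, 4))
shuffled_size = len(zlib.compress(shuffled, 4))

print("gzip only     :", plain_size, "bytes")
print("shuffle + gzip:", shuffled_size, "bytes")
```

On its own the shuffle step does not shrink anything; it only reorders bytes, which is why it is paired with a compression filter.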
FLETCHER32 Filter
Stores a checksum with each chunk so that data corruption is detected when the chunk is read back.
Code Block |
---|
dset = myfile.create_dataset("Data2", shape=(32,185,388), dtype=np.int16, chunks=(1,185,388), fletcher32=True, ...)
>>> dset.fletcher32
True |
FLETCHER32 features:
- Available with all HDF5 distributions
- Very fast
- Compatible with all lossless filters
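To illustrate what a check-sum provides, here is a minimal sketch of a generic Fletcher-32 sum in pure Python. It illustrates the algorithm family only: HDF5 computes and verifies its checksum internally for each chunk, and its exact byte handling may differ from this hypothetical `fletcher32` helper.

```python
def fletcher32(data: bytes) -> int:
    """Generic Fletcher-32 over little-endian 16-bit words.

    Input of odd length is padded with one zero byte.
    """
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1

# Checksum computed when the (made-up) chunk is written.
chunk = bytes(range(256)) * 4
stored = fletcher32(chunk)

# Simulate corruption: flip a single bit, then verify on "read".
corrupted = bytearray(chunk)
corrupted[100] ^= 0x01
ok = fletcher32(bytes(corrupted)) == stored
print("checksum matches after corruption:", ok)
```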
Code Block | ||
---|---|---|
| ||
gzip default compression_opts level=4
raw: gzip t1(create)=0.003280(sec) t2(+save)=0.216324(sec) input size=4594000(byte) ratio=1.583 shuffle=False fletcher32=False
raw: gzip t1(create)=0.003025(sec) t2(+save)=0.146706(sec) input size=4594000(byte) ratio=1.958 shuffle=True fletcher32=False
calib: gzip t1(create)=0.002738(sec) t2(+save)=0.168040(sec) input size=4594000(byte) ratio=2.072 shuffle=False fletcher32=False
calib: gzip t1(create)=0.002926(sec) t2(+save)=0.178174(sec) input size=4594000(byte) ratio=2.188 shuffle=True fletcher32=False
calib: gzip t1(create)=0.002579(sec) t2(+save)=0.182965(sec) input size=4594000(byte) ratio=2.187 shuffle=True fletcher32=True
calib: lzf t1(create)=0.003225(sec) t2(+save)=0.100822(sec) input size=4594000(byte) ratio=1.351 shuffle=False fletcher32=False
calib: lzf t1(create)=0.002815(sec) t2(+save)=0.086916(sec) input size=4594000(byte) ratio=1.473 shuffle= True fletcher32=False
raw: lzf t1(create)=0.003125(sec) t2(+save)=0.108339(sec) input size=4594000(byte) ratio=1.045 shuffle=False fletcher32=False
raw: lzf t1(create)=0.003071(sec) t2(+save)=0.075530(sec) input size=4594000(byte) ratio=1.698 shuffle= True fletcher32=False
Compression filter "szip" is unavailable
Compression filter "lzo" is unavailable
Compression filter "blosc" is unavailable
Compression filter "bzip2" is unavailable | ||
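Numbers like those above can be gathered with a few lines of h5py. The following is a sketch, not the script that produced the listing: the `measure` helper, the synthetic data, and the in-memory file are assumptions made for illustration, and the ratio is taken as the input size divided by the storage size reported by HDF5.

```python
import time

import h5py
import numpy as np

def measure(data, compression, shuffle=False, fletcher32=False):
    """Return (seconds to write, compression ratio) for one filter setup."""
    # In-memory file (core driver); nothing is written to disk.
    with h5py.File("bench.h5", "w", driver="core", backing_store=False) as f:
        t0 = time.perf_counter()
        dset = f.create_dataset(
            "data", data=data, chunks=(1,) + data.shape[1:],
            compression=compression, shuffle=shuffle, fletcher32=fletcher32)
        f.flush()  # make sure all chunks have passed through the filters
        dt = time.perf_counter() - t0
        ratio = data.nbytes / dset.id.get_storage_size()
    return dt, ratio

# Synthetic stand-in for detector data; real images compress differently.
rng = np.random.default_rng(0)
data = rng.normal(1000, 10, size=(32, 185, 388)).astype(np.int16)

results = {}
for compression in ("gzip", "lzf"):
    for shuffle in (False, True):
        dt, ratio = measure(data, compression, shuffle)
        results[(compression, shuffle)] = ratio
        print(f"{compression:5s} shuffle={shuffle!s:5s} "
              f"t={dt:.3f}s ratio={ratio:.3f}")
```

Timings depend strongly on the machine and the data, so only comparisons made on the same input are meaningful.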
Code Block | ||
>>> dset = myfile.create_dataset("Data", (1000,), compression="gzip",
shuffle=True) |