...
```
load_nda_from_file: Data from file nda-cxitut13-r0010-e000011-raw.npy:
    shape:(32, 185, 388) size:2296960 dtype:int16 [1028 1082 1101 1072 1131]...
H(8-bit) = 5.080
nda8 : shape:(2296960, 2) size:4593920 dtype:uint8 [ 4 4 58 4 77 4 48 4 107 4]...
# split data array for two with even and odd bytes:
nda8L: shape:(2296960,) size:2296960 dtype:uint8 [ 4 58 77 48 107 73 45 103 28 89]...
nda8H: shape:(2296960,) size:2296960 dtype:uint8 [4 4 4 4 4 4 4 4 4 4]...
H(low -byte) = 7.821
H(high-byte) = 0.376
```
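The byte-splitting idea behind the log above can be reproduced with a short NumPy sketch: view the `int16` array as bytes, separate the low and high byte planes, and compute the Shannon entropy of each. The random array below is a hypothetical stand-in for the `.npy` file, so the printed entropies will differ from the logged values:

```python
import numpy as np

def entropy_bits(a):
    """Shannon entropy in bits per symbol of the values in array a."""
    _, counts = np.unique(a, return_counts=True)
    p = counts / a.size
    return -np.sum(p * np.log2(p))

# Hypothetical detector-like data standing in for the .npy file in the log.
rng = np.random.default_rng(0)
nda = (1000 + rng.integers(0, 200, size=32 * 185 * 388)).astype('<i2')

# Reinterpret little-endian int16 as byte pairs: column 0 = low byte, 1 = high.
nda8 = nda.view(np.uint8).reshape(-1, 2)
nda8L, nda8H = nda8[:, 0], nda8[:, 1]

print('H(low -byte) = %.3f' % entropy_bits(nda8L))
print('H(high-byte) = %.3f' % entropy_bits(nda8H))
```

Because the values span only a narrow range, the high bytes are nearly constant and carry almost no entropy, which is exactly what makes the split representation compress well.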
Compression in HDF5
GZIP
"A number of compression filters are available in HDF5. By far the most commonly used is the GZIP filter. "
```python
dset = f.create_dataset("BigDataset", (1000,1000), dtype='f', compression="gzip")
dset.compression
```
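The gzip filter also accepts a compression level through `compression_opts` (0-9; higher trades speed for ratio). A minimal self-contained sketch, with the file name `compressed_demo.h5` chosen here for illustration:

```python
import numpy as np
import h5py

with h5py.File("compressed_demo.h5", "w") as f:
    # gzip level 9 favors compression ratio over speed
    dset = f.create_dataset("BigDataset", (1000, 1000), dtype='f',
                            compression="gzip", compression_opts=9)
    dset[...] = np.zeros((1000, 1000), dtype='f')
    print(dset.compression, dset.compression_opts)
```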
SZIP
"SZIP is a patented compression technology used extensively by NASA. Generally you only have to worry about this if you’re exchanging files with people who use satellite data. Because of patent licensing restrictions, many installations of HDF5 have the compressor (but not the decompressor) disabled."
...
SZIP features:
- Integer (1, 2, 4, 8 byte; signed/unsigned) and floating-point (4/8 byte) types only
- Fast compression and decompression
- A decompressor that is almost always available
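Whether a particular HDF5 build ships the SZIP encoder, the decoder, or both can be checked from h5py's low-level `h5z` module. A sketch, assuming only h5py (the encode/decode flags are the standard HDF5 filter-configuration bits):

```python
from h5py import h5z

# Is the SZIP filter compiled into this HDF5 build at all?
if h5z.filter_avail(h5z.FILTER_SZIP):
    info = h5z.get_filter_info(h5z.FILTER_SZIP)
    # Patent-restricted builds often report decoder-only support.
    print("encoder:", bool(info & h5z.FILTER_CONFIG_ENCODE_ENABLED))
    print("decoder:", bool(info & h5z.FILTER_CONFIG_DECODE_ENABLED))
else:
    print("SZIP not available in this build")
```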
LZF
"For files you’ll only be using from Python, LZF is a good choice. It ships with h5py; C source code is available for third-party programs under the BSD license. It’s optimized for very, very fast compression at the expense of a lower compression ratio compared to GZIP. The best use case for this is if your dataset has large numbers of redundant data points."
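A minimal LZF round-trip sketch, assuming only h5py and NumPy (the file name `lzf_demo.h5` is hypothetical); an all-zeros array stands in for the "large numbers of redundant data points" case the quote describes:

```python
import numpy as np
import h5py

# Highly redundant data is where LZF's speed/ratio trade-off pays off.
data = np.zeros((1000, 1000), dtype='f')

with h5py.File("lzf_demo.h5", "w") as f:
    f.create_dataset("RedundantData", data=data, compression="lzf")

with h5py.File("lzf_demo.h5", "r") as f:
    readback = f["RedundantData"][...]
    print(np.array_equal(readback, data))  # compression is lossless
```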
...