User Profile: RockRoll
Joined: Jul 30 '20
Last Activity: Aug 26 '20, 03:08 PM

  • Possible to have different datatypes among columns of an array?

    Hi Everyone,

    I create an expandable EArray of Nx4 columns. Some columns require the float64 datatype; the others can be managed with int32. Is it possible to vary the data type among the columns? Right now I just use one (float64, below) for all, but it takes huge disk space for large (>10 GB) files.

    For example, how can I ensure that the elements of columns 1-2 are int32 and those of columns 3-4 are float64?

    Code:
    a = f1.create_earray(f1.root,
    ...
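The question above is truncated, but the usual way to get per-column dtypes is a structured (record) dtype, which PyTables' Table (as opposed to a homogeneous EArray) is built around. A minimal NumPy sketch of the idea; the field names and row count are illustrative:

```python
import numpy as np

# Structured dtype: columns 1-2 as int32, columns 3-4 as float64.
# The field names ("c1" .. "c4") are hypothetical; any names work.
row_dtype = np.dtype([("c1", np.int32), ("c2", np.int32),
                      ("c3", np.float64), ("c4", np.float64)])

rows = np.zeros(3, dtype=row_dtype)  # three Nx4-style rows
rows["c1"] = [1, 2, 3]
rows["c3"] = [0.5, 1.5, 2.5]

# Each field keeps its own itemsize: 2*4 + 2*8 = 24 bytes per row
# instead of 4*8 = 32 bytes for an all-float64 row.
print(row_dtype.itemsize)  # 24
```

A PyTables `create_table` call accepts a description of exactly this shape, so the same space saving should carry over to the file on disk.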

  • Hey SioSio,

    Thanks for your assistance. That "n" was just a typo here. However, I solved the problem: it turns out that PyTables' "append" method was much faster than resizing the HDF5 file. Just wanted to mention it here in case anyone stops by in the future!

    Thanks again for your time!
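For anyone landing here later, the append-instead-of-resize pattern mentioned above looks roughly like this. A sketch, assuming PyTables is installed; the file path, node name, and chunk sizes are illustrative:

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "data.h5")
with tables.open_file(path, "w") as f1:
    # EArray with an unlimited first dimension (shape[0] == 0).
    a = f1.create_earray(f1.root, "a", atom=tables.Float64Atom(),
                         shape=(0, 4))
    # append() grows the array chunk by chunk; no explicit resize needed.
    for _ in range(3):
        a.append(np.random.rand(1000, 4))
    print(a.shape)  # (3000, 4)
```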


  • RockRoll
    started a topic Pandas: Merging Sorted Dataframes


    Hi,

    I have a large (Nx4, >10GB) array that I need to sort based on col.2.

    I am reading my data in chunks and sorting each chunk using Pandas, but I am unable to combine the sorted chunks into a final large Nx4 array sorted on col. 2. Here is what I have tried so far:

    Code:
    chunks = pd.read_csv(ifile[0], chunksize=50000, skiprows=0,
                         names=['col-1', 'col-2', 'col-3', 'col-4'])
    ...
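Since the code above is truncated, here is a hedged sketch of one way to combine chunks that have each already been sorted on col-2: a k-way merge with `heapq.merge`, which streams the sorted inputs and never needs the full array in memory. The tiny arrays below are stand-ins for the real chunks:

```python
import heapq

import numpy as np

# Stand-ins for two chunks, each already sorted on column 2 (index 1).
chunk_a = np.array([[1, 2.0, 0.1, 0.2],
                    [3, 5.0, 0.3, 0.4]])
chunk_b = np.array([[2, 3.0, 0.5, 0.6],
                    [4, 9.0, 0.7, 0.8]])

# heapq.merge lazily merges already-sorted iterables, so each chunk
# could just as well be streamed row by row from disk.
merged = np.array(list(heapq.merge(chunk_a, chunk_b,
                                   key=lambda row: row[1])))
print(merged[:, 1])  # column 2 is globally sorted
```

This is a classic external merge sort: sort each chunk (e.g. with pandas), write the sorted chunks out, then do a single merge pass over all of them.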

  • RockRoll
    started a topic Python/Numpy: Automating to save generated data


    I want to save a particular number of values in maps I create. For example, when creating (4064x1) values, I want to save the first (1000x1) in map1, the next (1000x1) in map2, and so on. The last map will have the remaining (64x1) elements. I need these maps later for fast processing.

    Now the issue is that I want to automate this, since the number 4064 varies with the data I analyze. Here is a simplistic version of something I tried that is working (L is...
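Since the example above is truncated, here is a hedged sketch of one way to automate the split for any length N; the `map1`, `map2`, ... names mirror the post and are illustrative:

```python
import numpy as np

N, size = 4064, 1000          # N varies with the data being analysed
data = np.arange(N)

# Consecutive blocks of `size`; the final block keeps the remainder.
maps = {f"map{i + 1}": data[start:start + size]
        for i, start in enumerate(range(0, N, size))}

print(len(maps), len(maps["map5"]))  # 5 64
```

Because NumPy slicing returns views, the blocks cost no extra memory; `np.array_split` is an alternative when equal-as-possible block counts (rather than a fixed block size) are wanted.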

  • RockRoll
    started a topic Extracting rows based on condition on one column


    Hi Everyone,

    Here is an interesting problem I am trying to solve.

    I have an (Nx4) array and want to extract the rows whose third-column element lies in a certain range. Are there existing capabilities in NumPy for this? Below is a simple example.

    PS: I know for loops can be used, comparing each element of col. 3 and saving the rows that meet the condition. But I want to use NumPy here (like slicing...
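NumPy handles this without a loop via boolean masking. A minimal sketch; the array values and range bounds are hypothetical:

```python
import numpy as np

a = np.array([[1,  2, 10,  4],
              [5,  6, 25,  8],
              [9, 10, 40, 12]])

lo, hi = 5, 30  # hypothetical range for the third column
mask = (a[:, 2] >= lo) & (a[:, 2] <= hi)  # one bool per row
rows = a[mask]  # only rows whose col-3 element is in [lo, hi]
```

The mask is evaluated in C, so this stays fast even for very large N; `np.where(mask)[0]` gives the matching row indices if those are needed instead.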

  • So here is what is happening:
    1. I choose a flexible-shape dset on line 9; flexible because I am dealing with large arrays whose size can vary with the input file size.
    2. I fill in some values of interest at line 24.
    3. At line 23, I am basically expanding the current size of dset by n (=1); the added row is filled with the values I create at line 24.

    Simply put, I am generating some numbers (line 22) and filling in dset...
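The steps above (grow the dataset by n, then fill the new row) can be sketched with h5py resizable datasets. This is a sketch under the assumption that the script uses h5py's `maxshape`/`resize`, as the later snippets in this thread suggest; the path and names are illustrative:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "data.h5")
with h5py.File(path, "w") as f_w:
    # Flexible shape: the first axis is unlimited (maxshape None).
    dset = f_w.create_dataset("dset", shape=(0, 4),
                              maxshape=(None, 4), chunks=True)
    for _ in range(3):
        row = np.random.rand(1, 4)              # values generated per step
        dset.resize(dset.shape[0] + 1, axis=0)  # expand by n (=1)
        dset[-1] = row                          # fill the appended row
```

Note that a per-row `resize` like this is exactly the pattern the poster later found slow compared with appending larger blocks at once.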


    Yeah, line 9 seems to be fine. f_w is providing a file object. Basically, I am creating a new "data.h5" and saving into its dset at line 24.

    I can also change line 9 to: dset = f_r.create_dataset('dataset_3', data=d1, maxshape=(None, None), chunks=True)

    This, instead of creating a new HDF5 file, creates a new dataset_3 in input.h5, but the computation time is unimpacted.

    My suspicion is something can...


    I tried that too. It gives "ValueError: Not a dataset (not a dataset)" at line 12, where e1 asks for dset1.

    I can't transfer dataset_1 and dataset_2 directly to a list/NumPy array, as dataset_1 and dataset_2 are really large.

    Any other thought?


    I closed it using "f_r.close()" at the end and it didn't change anything. Any other suggestions?


  • Fastest way to subtract elements of datasets of HDF5 file?

    Hey Everyone:

    Here is one interesting problem.

    Input: two arrays (Nx4, sorted on column 2) stored in dataset-1 and dataset-2 of an HDF5 file (input.h5). N is huge (the data originally belongs to a 10 GB file, hence the HDF5 storage).

    Output: the result of subtracting each column-2 element of dataset-2 from each column-2 element of dataset-1, keeping only differences (delta) between +/-4000, eventually saved in the dset of a new HDF5 file....
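Because both datasets are sorted on column 2, the +/-4000 window can be found without pairwise loops using `np.searchsorted`. A hedged sketch on tiny stand-in data; the real code would apply this per chunk of dataset-1:

```python
import numpy as np

# Stand-ins for the sorted column-2 values of dataset-1 and dataset-2.
col2_1 = np.array([0.0, 1000.0, 5000.0, 9000.0])
col2_2 = np.array([500.0, 4500.0, 8500.0])
delta = 4000.0

# For each element of dataset-1, [lo[i], hi[i]) is the slice of
# dataset-2 whose difference from it lies within +/-delta.
lo = np.searchsorted(col2_2, col2_1 - delta, side="left")
hi = np.searchsorted(col2_2, col2_1 + delta, side="right")

counts = hi - lo  # number of dataset-2 partners per dataset-1 element
```

Each `searchsorted` is a binary search (O(N log M) overall), so only the matching slices ever need to be read from the HDF5 file and differenced.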