Reading data for the AALTD'16 Time Series Classification Contest: Python

In the following, we assume that you have downloaded the contest data files (the code below reads train.h5 and test_task1.h5) and copied them into your Python working folder.

We also assume that you have Python installed with the numpy and h5py packages.
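
If you are unsure whether both packages are available, a quick check along the following lines may help. This is only a sketch: check_packages is a hypothetical helper name introduced here for illustration, not part of the contest material.

```python
import importlib


def check_packages(names):
    """Map each package name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" if the package does not expose __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in check_packages(["numpy", "h5py"]).items():
    print(name, version if version is not None else "is not installed")
```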

Start Python, then run:

In [1]:
import numpy, h5py

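def dset2npy(fp, dataset_name):
    """Read a whole HDF5 dataset from the open file fp into a numpy array."""
    read_dataset = fp[dataset_name]
    # Allocate an array with the same shape and dtype, then fill it in place.
    npy_arr = numpy.empty(read_dataset.shape, dtype=read_dataset.dtype)
    read_dataset.read_direct(npy_arr)
    return npy_arr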

fp = h5py.File("train.h5", "r")
x = dset2npy(fp, "time_series")
y = dset2npy(fp, "labels")
fp.close()
print(x.shape, y.shape)
(180, 51, 24) (180,)
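
As a side note, the same read can probably be written more compactly with a with block, which closes the file even if an error occurs, and with slicing on the dataset, which returns a numpy array. The sketch below builds a tiny synthetic stand-in file so that it runs on its own. The file name demo_train.h5, its contents, and the variable names are assumptions made for illustration; they are not contest data.

```python
import os
import tempfile

import h5py
import numpy

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_train.h5")

    # Synthetic stand-in (NOT contest data): 4 series instead of 180.
    with h5py.File(demo_path, "w") as fp:
        fp.create_dataset("time_series", data=numpy.zeros((4, 51, 24)))
        fp.create_dataset("labels", data=numpy.arange(4))

    # The file is closed automatically when the with block ends.
    with h5py.File(demo_path, "r") as fp:
        x_demo = fp["time_series"][...]  # full read into a numpy array
        y_demo = fp["labels"][...]

print(x_demo.shape, y_demo.shape)
```

With the real train.h5 in place of the stand-in, this should give the same arrays as dset2npy above.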

Now, let's have a look at the test data:

In [2]:
fp = h5py.File("test_task1.h5", "r")
x = dset2npy(fp, "time_series")
print(x.shape)
(180, 51, 24)

Of course, the labels are missing from the test file:

In [3]:
try:
    y = dset2npy(fp, "labels")
except KeyError:  # h5py raises KeyError when the dataset is not in the file
    print("Object labels does not exist")
fp.close()
Object labels does not exist
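
Instead of catching the exception, you can also ask the file which datasets it contains before reading. The sketch below again uses a small synthetic stand-in file, so that it runs without the contest data. The name demo_test.h5 and the variable names are assumptions made for illustration.

```python
import os
import tempfile

import h5py
import numpy

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_test.h5")

    # Synthetic stand-in (NOT contest data) that, like the test file, has no labels.
    with h5py.File(demo_path, "w") as fp:
        fp.create_dataset("time_series", data=numpy.zeros((4, 51, 24)))

    with h5py.File(demo_path, "r") as fp:
        dataset_names = list(fp.keys())  # names stored at the top level of the file
        has_labels = "labels" in fp      # membership test, no exception needed

print(dataset_names, has_labels)
```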