Using the fastai2 `Datasets` to make a time series dataset.

Transforms

Basic functions to process time-series data before assembling it into a `DataLoaders`.

class NormalizeTS[source]

NormalizeTS(verbose=False, make_ones=True, eps=1e-07, mean=None) :: ItemTransform

Normalize the time series.

norm = NormalizeTS()
o = (TSTensorSeq(torch.arange(10.)), TSTensorSeqy(torch.arange(10,15), x_len=10))
o_en = norm(o)
test_eq(o_en[0].mean(), 0)
test_eq(o_en[1].mean()==0, False)
dec_o = norm.decode(o_en)
test_eq(dec_o[0],o[0])
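The exact encoding `NormalizeTS` applies isn't spelled out here, but the tests above are consistent with standard-score scaling: subtract the mean, divide by the standard deviation (with an `eps` guard), and keep the statistics so the transform can be decoded. A minimal numpy sketch under that assumption; the helper names are hypothetical, not part of the library:

```python
import numpy as np

def normalize_ts(x, mean=None, eps=1e-7):
    # Assumption: standard-score scaling, mirroring NormalizeTS's
    # mean/eps parameters. Returns the stats needed to decode.
    m = x.mean() if mean is None else mean
    s = x.std() + eps
    return (x - m) / s, m, s

def denormalize_ts(x_enc, m, s):
    # Invert the encoding, as norm.decode does above.
    return x_enc * s + m

x = np.arange(10.)
x_enc, m, s = normalize_ts(x)      # x_enc has (close to) zero mean
x_dec = denormalize_ts(x_enc, m, s)  # recovers the original series
```

Passing a fixed `mean` (as in the `NormalizeTS(mean=9)` cell below) would shift the series by that value instead of its own mean, which is why the last element of `arange(10.)` ends up at 0 there.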

f,axs = plt.subplots(1,3, sharey=True)
ax = o[0].show(axs[0])
o[1].show(ax)
ax.plot([0,15],[0,0],'--')
ax = o_en[0].show(axs[1])
o_en[1].show(ax)
ax.plot([0,15],[0,0],'--')
ax = dec_o[0].show(axs[2])
dec_o[1].show(ax)
ax.plot([0,15],[0,0],'--')
norm = NormalizeTS(mean=9)
o = (TSTensorSeq(torch.arange(10.)), TSTensorSeqy(torch.arange(10,15), x_len=10))
o_en = norm(o)
test_eq(o_en[0][-1], 0)
test_eq(o_en[1][-1]==0, False)
dec_o = norm.decode(o_en)
test_eq(dec_o[0],o[0])

f,axs = plt.subplots(1,3, sharey=True)
ax = o[0].show(axs[0])
o[1].show(ax)
ax.plot([0,15],[0,0],'--')
ax = o_en[0].show(axs[1])
o_en[1].show(ax)
ax.plot([0,15],[0,0],'--')
ax = dec_o[0].show(axs[2])
dec_o[1].show(ax)
ax.plot([0,15],[0,0],'--')

TSDataLoaders

Utils

concat_ts_list[source]

concat_ts_list(train, val)

a = [np.random.randn(3,10)]*50
b = [np.random.randn(3,5)]*50
r = concat_ts_list(a,b)
test_eq(r[0].shape,(3,15))
test_eq(r[0], np.concatenate([a[0],b[0]],1))
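Based on the test above, `concat_ts_list` pairs each training series with its validation continuation and joins them along the time axis. A hypothetical numpy sketch of that behavior (the function name is illustrative, not the library's implementation):

```python
import numpy as np

def concat_ts_list_sketch(train, val):
    # Join each (channels, train_len) series with its
    # (channels, val_len) continuation along the time axis.
    return [np.concatenate([t, v], axis=1) for t, v in zip(train, val)]

a = [np.zeros((3, 10))] * 2
b = [np.ones((3, 5))] * 2
r = concat_ts_list_sketch(a, b)   # each r[i] has shape (3, 15)
```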

make_test[source]

make_test()

Splits every time series in `items` at `horizon + lookback`* steps from the end, where the last part goes into `val` and the first into `train`.

*If `keep_lookback`, only `horizon` is removed from `train`; otherwise `lookback` is removed as well.
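The split described above can be sketched in a few lines of numpy; this is a hypothetical reimplementation consistent with the `make_test` example further down, not the library's own code:

```python
import numpy as np

def make_test_sketch(items, horizon, lookback, keep_lookback=False):
    # val always gets the last horizon + lookback steps;
    # train drops only horizon steps when keep_lookback is set,
    # so the two splits overlap by the lookback window in that case.
    cut = horizon + lookback
    train_cut = horizon if keep_lookback else cut
    train = [ts[:, :-train_cut] for ts in items]
    val   = [ts[:, -cut:] for ts in items]
    return train, val

a = [np.arange(45.).reshape(3, 15)]
train, val = make_test_sketch(a, 5, 5)        # train: (3, 5), val: (3, 10)
train, val = make_test_sketch(a, 5, 5, True)  # train: (3, 10), val: (3, 10)
```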

make_test_pct[source]

make_test_pct()

Splits every time series in `items` at `pct` (percentage) of the length of the time series, where the last part goes into `val` and the first into `train`.
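A hypothetical numpy sketch of a percentage-based split like the one described, with per-series cut points (the function name is illustrative):

```python
import numpy as np

def make_test_pct_sketch(items, pct=0.2):
    # Cut each series individually: the last pct of its length
    # goes into val, the rest into train.
    train, val = [], []
    for ts in items:
        cut = int(ts.shape[-1] * pct)
        train.append(ts[:, :-cut])
        val.append(ts[:, -cut:])
    return train, val

a = [np.arange(30.).reshape(3, 10)]
train, val = make_test_pct_sketch(a, pct=0.2)  # train: (3, 8), val: (3, 2)
```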

a = [np.random.randn(3,15)]*50
train, val = make_test(a,5,5)
test_eq(train[0],a[0][:,:-10])
test_eq(val[0],a[0][:,-10:])

train, val = make_test(a,5,5,True)
test_eq(train[0],a[0][:,:-5])
test_eq(val[0],a[0][:,-10:])

Dataloaders


class TSDataLoaders[source]

TSDataLoaders(*loaders, path='.', device=None) :: DataLoaders

Basic wrapper around several DataLoaders.

dbunch = TSDataLoaders.from_folder(path, horizon = 14, step=5, bs=64, nrows=10, device = 'cpu', after_batch=noop)
for o in dbunch[0]:
    test_close(o[0].mean(),0)
    test_close(o[0].std(),1,eps=.1)    
Train:1255; Valid: 50; Test 10

TSDataLoaders.from_items[source]

TSDataLoaders.from_items(items:L, horizon:int, valid_pct=1.5, seed=None, lookback=None, step=1, incl_test=True, path:Path='.', device=None, norm=True, min_seq_len=None, max_std=2, bs=64, shuffle=False, num_workers=None, verbose=False, do_setup=True, pin_memory=False, timeout=0, batch_size=None, drop_last=False, indexed=None, n=None, wif=None, before_iter=None, after_item=None, before_batch=None, after_batch=None, after_iter=None, create_batches=None, create_item=None, create_batch=None, retain=None, get_idxs=None, sample=None, shuffle_fn=None, do_batch=None)

Create a `TSDataLoaders` from a list of time series.

The `DataLoader` for the test set will be saved as an attribute under `test`.

TSDataLoaders.from_folder[source]

TSDataLoaders.from_folder(data_path:Path, valid_pct=0.5, seed=None, horizon=None, lookback=None, step=1, nrows=None, skiprows=None, incl_test=True, path:Path='.', device=None, norm=True, min_seq_len=None, max_std=2, bs=64, shuffle=False, num_workers=None, verbose=False, do_setup=True, pin_memory=False, timeout=0, batch_size=None, drop_last=False, indexed=None, n=None, wif=None, before_iter=None, after_item=None, before_batch=None, after_batch=None, after_iter=None, create_batches=None, create_item=None, create_batch=None, retain=None, get_idxs=None, sample=None, shuffle_fn=None, do_batch=None)

Create from M-competition-style train/test csv files in `data_path`.

The `DataLoader` for the test set will be saved as an attribute under `test`.

dbunch = TSDataLoaders.from_folder(path, horizon = 14, step=5, bs=64, nrows=100)
dbunch.train.show_batch(max_n=4)
Train:9408; Valid: 500; Test 100
dbunch.test.show_batch(max_n=4)