Practice - 5 Tasks

Answer the questions below

1. Fill in the blank

easy

Complete the code to import the function used for splitting time series data.

ML Python

from sklearn.model_selection import [1]

A. train_test_split

B. KFold

C. TimeSeriesSplit

D. cross_val_score

2. Fill in the blank

medium

Complete the code to create a time series splitter with 3 splits.

ML Python

tscv = [1](n_splits=3)

A. TimeSeriesSplit

B. KFold

C. train_test_split

D. StratifiedKFold

3. Fill in the blank

hard

Complete the code so that the splitter yields train and test indices for the features and target in time series cross-validation.

ML Python

for train_index, test_index in tscv.[1](X):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]

A. train_test_split

B. split

C. split_data

D. cross_val_score

4. Fill in the blanks

hard

Fill both blanks to create a train-test split manually for time series data without shuffling.

ML Python

train_size = int(len(data) * [1])
train, test = data[:train_size], data[[2]:]

A. 0.8

B. 0.2

C. train_size

D. len(data) - train_size

5. Fill in the blanks

hard

Fill both blanks to create a dictionary of train and test indices using TimeSeriesSplit.

ML Python

splits = {}
for i, ([1], [2]) in enumerate(tscv.split(data)):
    splits[i] = {'train': [1], 'test': [2]}

A. train_index

B. test_index

C. train_idx

D. test_idx

Practice

1. Why is it important to keep the order of data when doing a train-test split for time series?

easy

A. Because time series data depends on the order of events and future data should not be used to predict past data.

B. Because random shuffling improves model accuracy in time series.

C. Because train and test sets must have the same number of samples.

D. Because test data should always come before train data.

Train-test split for time series in ML Python - Interactive Code Practice

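Solution

Step 1: Understand time series data nature

Observations in a time series are ordered in time, and each value can depend on the values that came before it.

Step 2: Importance of order in train-test split

Training on earlier data and testing on later data mirrors how a forecast is actually used. Shuffling would let the model see future values while it is evaluated on past ones.

Final Answer:

A. Because time series data depends on the order of events and future data should not be used to predict past data.

Quick Check:

Train on the past, test on the future.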
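
As a rough illustration of why order matters, the sketch below builds expanding-window splits in plain Python and checks that every training index comes before every test index. The function name `expanding_window_splits` and the 10-sample series are hypothetical; this is a simplified stand-in for the idea behind scikit-learn's `TimeSeriesSplit`, not its implementation.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Hypothetical helper: each fold trains on everything before its test
    window, so no future observation ends up in the training set.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for start in range(first_test_start, n_samples, test_size):
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test


folds = list(expanding_window_splits(n_samples=10, n_splits=3))
for train, test in folds:
    # The latest training index is always earlier than the first test index.
    assert max(train) < min(test)
    print("train:", train, "test:", test)
```

Each later fold trains on a longer history, which is the usual way to reuse one ordered series for several evaluations.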

Solution

Step 1: Understand slicing for time series split

Step 2: Check each code snippet

Final Answer:

Quick Check:

Solution

Step 1: Calculate split index

Step 2: Calculate test length

Final Answer:

Quick Check:
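
The two calculations named in the steps above can be sketched as follows. The series length of 100 and the 0.8 training fraction are assumed values for illustration, not figures taken from the question.

```python
data = list(range(100))      # assumed: 100 observations, oldest first
train_fraction = 0.8         # assumed training share

# Step 1: index where the training portion ends
split_index = int(len(data) * train_fraction)

# Step 2: everything from split_index onward belongs to the test set
test_length = len(data) - split_index

train, test = data[:split_index], data[split_index:]
print(split_index, test_length)
```

Using the same `split_index` on both sides of the slice keeps the two parts contiguous, with no gap and no overlap.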

Solution

Step 1: Understand train_test_split default behavior

Step 2: Why shuffling is a problem for time series

Final Answer:

Quick Check:
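
To make the shuffling problem concrete, the sketch below compares a shuffled split with an ordered one on a toy series, using only the standard library. It imitates the shuffling that scikit-learn's `train_test_split` applies by default (`shuffle=True`); passing `shuffle=False` to that function keeps the original order. The helper `has_future_leak` and the 20-point series are hypothetical.

```python
import random

timestamps = list(range(20))          # assumed toy series: 0 is oldest, 19 is newest
split_index = int(len(timestamps) * 0.75)


def has_future_leak(train, test):
    """Return True if any training point is later than the earliest test point."""
    return max(train) > min(test)


# Ordered split: the first 75% trains, the last 25% tests.
ordered_train, ordered_test = timestamps[:split_index], timestamps[split_index:]

# Shuffled split: same sizes, but the rows are permuted first.
rng = random.Random(0)
shuffled = timestamps[:]
rng.shuffle(shuffled)
shuffled_train, shuffled_test = shuffled[:split_index], shuffled[split_index:]

print("ordered leak: ", has_future_leak(ordered_train, ordered_test))
print("shuffled leak:", has_future_leak(shuffled_train, shuffled_test))
```

The ordered split never reports a leak, while a shuffled split almost always does, because some later timestamps land in the training portion.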

Solution

Step 1: Calculate split fraction for 2.5 years out of 3 years

Step 2: Use slicing to split data preserving order

Final Answer:

Quick Check:
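
The two steps above can be sketched as below, assuming three years of monthly observations (36 points). The monthly frequency and the variable names are assumptions, since the steps only mention 2.5 years out of 3.

```python
total_years = 3
train_years = 2.5

# Step 1: fraction of the series used for training
train_fraction = train_years / total_years

# Assumed data: 36 monthly observations, oldest first
data = list(range(36))

# round() rather than int() guards against floating-point error
# pushing the product just below a whole number
train_size = round(len(data) * train_fraction)

# Step 2: slice in order, with no shuffling
train, test = data[:train_size], data[train_size:]
print(len(train), len(test))
```

With these assumed numbers the first 30 months train the model and the final 6 months are held out for testing.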