$$Y = f(X)$$
Below is a contrived example of a supervised learning dataset where each row is an observation consisting of one input variable (X) and one output variable to be predicted (y).

| X | y |
|---|-----|
| 5 | 0.9 |
| 4 | 0.8 |
| 5 | 1.0 |
| 3 | 0.7 |
| 4 | 0.9 |

Supervised learning problems can be further grouped into regression and classification problems.
Time series data can be framed as supervised learning: given a sequence of numbers for a time series dataset, we can restructure the data to look like a supervised learning problem.

| time | measure |
|------|---------|
| 1    | 100     |
| 2    | 110     |
| 3    | 108     |
| 4    | 115     |
| 5    | 120     |

We can restructure this time series dataset as a supervised learning problem by using the value at the previous time step to predict the value at the next time step. Re-organizing the time series dataset this way, the data would look as follows:

| X   | y   |
|-----|-----|
| ?   | 100 |
| 100 | 110 |
| 110 | 108 |
| 108 | 115 |
| 115 | 120 |
| 120 | ?   |

- We can delete the first and last rows, since they contain missing values, before training a supervised model.
- The use of prior time steps to predict the next time step is called the sliding window method.
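
A minimal sketch of this sliding window restructuring using pandas, assuming the series values from the table above:

```python
import pandas as pd

# Univariate series from the example table above.
series = pd.Series([100, 110, 108, 115, 120], name="measure")

# Shift the series one step so the previous value becomes the input X.
framed = pd.DataFrame({
    "X": series.shift(1),  # value at the previous time step
    "y": series,           # value to be predicted
})

# Drop the edge rows that contain missing values before training.
print(framed.dropna())
```

A wider window follows the same pattern, adding one shifted column per lag.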
### Sliding Window With Multivariate Time Series

The number of observations recorded for a given time in a time series dataset matters. A series with a single variable observed at each time step is univariate; a series with two or more variables per time step is multivariate.

For example, suppose we have the following multivariate dataset:

| time | measure1 | measure2 |
|------|----------|----------|
| 1    | 0.2      | 88       |
| 2    | 0.5      | 89       |
| 3    | 0.7      | 87       |
| 4    | 0.4      | 88       |
| 5    | 1.0      | 90       |

Let’s also assume that we are only concerned with predicting `measure2`. We can re-frame this time series dataset as a supervised learning problem with a window width of one.

| X1  | X2  | X3  | y   |
|-----|-----|-----|-----|
| ?   | ?   | 0.2 | 88  |
| 0.2 | 88  | 0.5 | 89  |
| 0.5 | 89  | 0.7 | 87  |
| 0.7 | 87  | 0.4 | 88  |
| 0.4 | 88  | 1.0 | 90  |
| 1.0 | 90  | ?   | ?   |

We can see that, as in the univariate time series example above, we may need to remove the first and last rows in order to train our supervised learning model.
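
As a sketch, the same `shift()` trick extends to this multivariate framing: the previous step's measures plus the current `measure1` become inputs, and the current `measure2` is the target.

```python
import pandas as pd

# Multivariate series from the example table above.
df = pd.DataFrame({
    "measure1": [0.2, 0.5, 0.7, 0.4, 1.0],
    "measure2": [88, 89, 87, 88, 90],
})

framed = pd.DataFrame({
    "X1": df["measure1"].shift(1),  # measure1 at the previous time step
    "X2": df["measure2"].shift(1),  # measure2 at the previous time step
    "X3": df["measure1"],           # measure1 at the current time step
    "y":  df["measure2"],           # measure2 to be predicted
})

# Drop the edge row with missing values before training.
print(framed.dropna())
```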
If we need to predict both `measure1` and `measure2` for the next time step, we can transform the data as follows:

| X1  | X2  | y1  | y2  |
|-----|-----|-----|-----|
| ?   | ?   | 0.2 | 88  |
| 0.2 | 88  | 0.5 | 89  |
| 0.5 | 89  | 0.7 | 87  |
| 0.7 | 87  | 0.4 | 88  |
| 0.4 | 88  | 1.0 | 90  |
| 1.0 | 90  | ?   | ?   |

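The two-output variant is a small change to the previous sketch: both current measures become targets rather than inputs.

```python
import pandas as pd

# Same multivariate series; both measures at the previous step are inputs,
# and both measures at the current step are targets.
df = pd.DataFrame({
    "measure1": [0.2, 0.5, 0.7, 0.4, 1.0],
    "measure2": [88, 89, 87, 88, 90],
})

framed = pd.DataFrame({
    "X1": df["measure1"].shift(1),
    "X2": df["measure2"].shift(1),
    "y1": df["measure1"],
    "y2": df["measure2"],
})
print(framed.dropna())
```
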
### Sliding Window With Multiple Steps

The sliding window approach can also be used to forecast more than one future time step, known as multi-step forecasting.

Consider this univariate time series dataset:

| time | measure |
|------|---------|
| 1    | 100     |
| 2    | 110     |
| 3    | 108     |
| 4    | 115     |
| 5    | 120     |

We can frame this time series as a two-step forecasting dataset for supervised learning with a window width of one, as follows:

| X1  | y1  | y2  |
|-----|-----|-----|
| ?   | 100 | 110 |
| 100 | 110 | 108 |
| 110 | 108 | 115 |
| 108 | 115 | 120 |
| 115 | 120 | ?   |
| 120 | ?   | ?   |

Note that a supervised model only has X1 to work with in order to predict both y1 and y2.
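
A minimal sketch of this two-step framing; a negative `shift()` looks one step ahead for the second target.

```python
import pandas as pd

# Univariate series from the example table above.
series = pd.Series([100, 110, 108, 115, 120], name="measure")

framed = pd.DataFrame({
    "X1": series.shift(1),   # value at the previous time step
    "y1": series,            # first step to predict
    "y2": series.shift(-1),  # second step to predict (negative shift looks ahead)
})

# Drop the edge rows with missing values before training.
print(framed.dropna())
```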