
Commit 78b0228

Ajit Kumar Singh authored and committed
formatting some texts
1 parent 991493f commit 78b0228

File tree

1 file changed: +90 -90 lines changed

Introduction_TSA.md

Lines changed: 90 additions & 90 deletions
@@ -1,7 +1,7 @@
 # Introduction to Time Series Analysis
 
 Contents
----
+--
 - [Taxonomy of Time Series Analysis Domain](#taxonomy-of-time-series-analysis-domain)
 - [Best Practices for Forecasting Model Development](#best-practices-for-forecasting-model-development)
 - [Simple and Classical Forecasting Methods](#simple-and-classical-forecasting-methods)
@@ -141,10 +141,10 @@ Define your time series problem. Some topics to consider and motivating question
 8. *Contiguous vs. Discontiguous*: Are your observations contiguous or discontiguous?
 
 Some useful tools to help get answers include:
-- Data visualizations (e.g. line plots, etc.).
-- Statistical analysis (e.g. ACF/PACF plots, etc.). ˆ
-- Domain experts.
-- Project stakeholders.
+- Data visualizations (e.g. line plots, etc.)
+- Statistical analysis (e.g. ACF/PACF plots, etc.)
+- Domain experts
+- Project stakeholders
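As a quick illustration of these tools, here is a minimal sketch of a line plot plus ACF/PACF plots using pandas and statsmodels; both libraries and the twelve-point series are assumptions for the example, not part of the original text:

```python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Made-up placeholder series; substitute your own observations.
series = pd.Series(
    [112.0, 118.0, 132.0, 129.0, 121.0, 135.0,
     148.0, 148.0, 136.0, 119.0, 104.0, 118.0],
    index=pd.date_range("2023-01-01", periods=12, freq="MS"),
)

fig, axes = plt.subplots(3, 1, figsize=(8, 9))
series.plot(ax=axes[0], title="Line plot")  # basic data visualization
plot_acf(series, lags=6, ax=axes[1])        # autocorrelation function
plot_pacf(series, lags=5, ax=axes[2])       # partial autocorrelation function
plt.tight_layout()
plt.show()
```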
 
 #### Step 2: Design Test Harness
 
@@ -175,18 +175,18 @@ This list is based on a univariate time series forecasting problem, but you can
 
 Some data preparation schemes to consider include:
 
-- Differencing to remove a trend.
-- Seasonal differencing to remove seasonality. ˆ
-- Standardize to center.
-- Normalize to rescale.
-- Power Transform to make normal.
+- Differencing to remove a trend
+- Seasonal differencing to remove seasonality
+- Standardize to center
+- Normalize to rescale
+- Power Transform to make normal
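For concreteness, a minimal sketch of these preparation schemes with pandas and scikit-learn; the toy series and the seasonal period of 4 are assumed for illustration:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, PowerTransformer, StandardScaler

series = pd.Series([112.0, 118.0, 132.0, 129.0, 121.0, 135.0,
                    148.0, 148.0, 136.0, 119.0, 104.0, 118.0])

diffed = series.diff(1).dropna()   # differencing to remove a trend
sdiffed = series.diff(4).dropna()  # seasonal differencing, assuming a period of 4

values = series.to_numpy().reshape(-1, 1)
standardized = StandardScaler().fit_transform(values)  # center to zero mean, unit variance
normalized = MinMaxScaler().fit_transform(values)      # rescale into [0, 1]
transformed = PowerTransformer(method="box-cox").fit_transform(values)  # make more normal
```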
 
 This large amount of systematic searching can be slow to execute. Some ideas to speed up the evaluation of models include:
 
-- Use multiple machines in parallel via cloud hardware (such as Amazon EC2).
-- Reduce the size of the train or test dataset to make the evaluation process faster. ˆ
-- Use a more coarse grid of hyperparameters and circle back if you have time later. ˆ
-- Perhaps do not refit a model for each step in walk-forward validation.
+- Use multiple machines in parallel via cloud hardware (such as Amazon EC2)
+- Reduce the size of the train or test dataset to make the evaluation process faster
+- Use a more coarse grid of hyperparameters and circle back if you have time later
+- Perhaps do not refit a model for each step in walk-forward validation
 
 
 #### Step 4: Finalize Model
@@ -266,9 +266,9 @@ Configuring a SARIMA requires selecting hyperparameters for both the trend and s
 
 There are three trend elements that require configuration. They are the same as in the ARIMA model; specifically:
 
-- p: Trend autoregression order. ˆ
-- d: Trend difference order.
-- q: Trend moving average order.
+- p: Trend autoregression order
+- d: Trend difference order
+- q: Trend moving average order
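A minimal sketch showing where these trend elements (and the seasonal elements discussed next) plug into statsmodels' SARIMAX; the toy data and the chosen orders are arbitrary assumptions:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Toy monthly series: a repeating yearly pattern plus noise; substitute real data.
rng = np.random.default_rng(0)
pattern = np.tile([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118], 4)
series = pd.Series(
    pattern + rng.normal(0, 2, size=pattern.size),
    index=pd.date_range("2020-01-01", periods=48, freq="MS"),
)

# order=(p, d, q) carries the trend elements listed above;
# seasonal_order=(P, D, Q, m) carries the seasonal elements, with m the period.
model = SARIMAX(series, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
fit = model.fit(disp=False)
print(fit.forecast(steps=3))
```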
 
 **Seasonal Elements**
 
@@ -310,11 +310,11 @@ Hyperparameters:
 
 Hyperparameters:
 
-- Alpha (α): Smoothing factor for the level. ˆ
-- Beta (β): Smoothing factor for the trend.
-- Trend Type: Additive or multiplicative.
-- Dampen Type: Additive or multiplicative. ˆ
-- Phi (φ): Damping coefficient.
+- Alpha (α): Smoothing factor for the level
+- Beta (β): Smoothing factor for the trend
+- Trend Type: Additive or multiplicative
+- Dampen Type: Additive or multiplicative
+- Phi (φ): Damping coefficient
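A minimal double exponential smoothing sketch with statsmodels' Holt class; the keyword names follow recent statsmodels releases, and the data and smoothing values are placeholders:

```python
import pandas as pd
from statsmodels.tsa.holtwinters import Holt

series = pd.Series([10.0, 12.0, 13.5, 15.0, 16.8, 18.1, 20.0, 21.4])

# Additive trend with damping; exponential=True would model a multiplicative trend.
model = Holt(series, damped_trend=True, initialization_method="estimated")
fit = model.fit(
    smoothing_level=0.8,  # alpha: smoothing factor for the level
    smoothing_trend=0.2,  # beta: smoothing factor for the trend
    damping_trend=0.9,    # phi: damping coefficient
)
print(fit.forecast(3))
```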
 
 3. Triple Exponential Smoothing
 
@@ -328,14 +328,14 @@ ality.
 
 Hyperparameters:
 
-- Alpha (α): Smoothing factor for the level. ˆ
-- Beta (β): Smoothing factor for the trend.
-- Trend Type: Additive or multiplicative.
-- Dampen Type: Additive or multiplicative. ˆ
-- Phi (φ): Damping coefficient.
-- Gamma (γ): Smoothing factor for the seasonality. ˆ
-- Seasonality Type: Additive or multiplicative.
-- Period: Time steps in seasonal period.
+- Alpha (α): Smoothing factor for the level
+- Beta (β): Smoothing factor for the trend
+- Trend Type: Additive or multiplicative
+- Dampen Type: Additive or multiplicative
+- Phi (φ): Damping coefficient
+- Gamma (γ): Smoothing factor for the seasonality
+- Seasonality Type: Additive or multiplicative
+- Period: Time steps in seasonal period
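Likewise, a hedged sketch of triple exponential smoothing (Holt-Winters) that sets each listed hyperparameter explicitly; the quarterly toy data and the period of 4 are assumptions:

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Placeholder quarterly series with a yearly (period 4) cycle.
series = pd.Series([30.0, 21.0, 29.0, 31.0, 40.0, 28.0,
                    36.0, 39.0, 48.0, 35.0, 43.0, 46.0])

model = ExponentialSmoothing(
    series,
    trend="add",            # trend type: "add" or "mul"
    damped_trend=True,      # dampen the trend
    seasonal="add",         # seasonality type: "add" or "mul"
    seasonal_periods=4,     # period: time steps in one seasonal cycle
    initialization_method="estimated",
)
fit = model.fit(
    smoothing_level=0.6,     # alpha
    smoothing_trend=0.1,     # beta
    smoothing_seasonal=0.2,  # gamma
    damping_trend=0.95,      # phi
)
print(fit.forecast(4))
```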
 
 ---
 
@@ -351,13 +351,13 @@ $$Y = f(X)$$
 
 Below is a contrived example of a supervised learning dataset where each row is an observation comprised of one input variable (X) and one output variable to be predicted (y).
 
-| X | y |
-|---|-----|
-| 5 | 0.9 |
-| 4 | 0.8 |
-| 5 | 1.0 |
-| 3 | 0.7 |
-| 4 | 0.9 |
+| X | y |
+|---|-----|
+| 5 | 0.9 |
+| 4 | 0.8 |
+| 5 | 1.0 |
+| 3 | 0.7 |
+| 4 | 0.9 |
 
 Supervised learning problems can be further grouped into regression and classification problems.
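To make the regression case concrete, a small scikit-learn sketch of learning Y = f(X) from the contrived table above; the library and the choice of linear regression are assumptions for illustration:

```python
from sklearn.linear_model import LinearRegression

# The contrived dataset from the table above.
X = [[5], [4], [5], [3], [4]]
y = [0.9, 0.8, 1.0, 0.7, 0.9]

model = LinearRegression().fit(X, y)  # learn an approximation of Y = f(X)
print(model.predict([[4]]))           # predict y for a new observation with X = 4
```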
 
@@ -368,24 +368,24 @@ Supervised learning problems can be further grouped into regression and classifi
 
 Time series data can be phrased as supervised learning. Given a sequence of numbers for a time series dataset, we can restructure the data to look like a supervised learning problem.
 
-| time | measure |
-|------|---------|
-| 1 | 100 |
-| 2 | 110 |
-| 3 | 108 |
-| 4 | 115 |
-| 5 | 120 |
+| time | measure |
+|------|---------|
+| 1 | 100 |
+| 2 | 110 |
+| 3 | 108 |
+| 4 | 115 |
+| 5 | 120 |
 
 We can restructure this time series dataset as a supervised learning problem by using the value at the previous time step to predict the value at the next time step. Re-organizing the time series dataset this way, the data would look as follows:
 
-| X | y |
-|-----|-----|
-| ? | 100 |
-| 100 | 110 |
-| 110 | 108 |
-| 108 | 115 |
-| 115 | 120 |
-| 120 | ? |
+| X | y |
+|-----|-----|
+| ? | 100 |
+| 100 | 110 |
+| 110 | 108 |
+| 108 | 115 |
+| 115 | 120 |
+| 120 | ? |
 
 - We can delete the first and last rows, since they contain missing values, before training a supervised model.
 - The use of prior time steps to predict the next time step is called the sliding window method, as sketched below.
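A minimal pandas sketch of this sliding window transform; shift(1) supplies the value at the previous time step, and pandas is assumed to be available:

```python
import pandas as pd

series = pd.Series([100, 110, 108, 115, 120], name="measure")

# Use the value at the previous time step (X) to predict the current value (y).
frame = pd.DataFrame({"X": series.shift(1), "y": series})
print(frame)  # the NaN in the first row matches the "?" in the table above

# Delete rows with missing values before training a supervised model.
frame = frame.dropna()
```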
@@ -399,37 +399,37 @@ The number of observations recorded for a given time in a time series dataset ma
 
 For example, suppose we have the following dataset:
 
-| time | measure1 | measure2 |
-|------|----------|----------|
-| 1 | 0.2 | 88 |
-| 2 | 0.5 | 89 |
-| 3 | 0.7 | 87 |
-| 4 | 0.4 | 88 |
-| 5 | 1.0 | 90 |
+| time | measure1 | measure2 |
+|------|----------|----------|
+| 1 | 0.2 | 88 |
+| 2 | 0.5 | 89 |
+| 3 | 0.7 | 87 |
+| 4 | 0.4 | 88 |
+| 5 | 1.0 | 90 |
 
 Let’s also assume that we are only concerned with predicting measure2. We can re-frame this time series dataset as a supervised learning problem with a window width of one.
 
-| X1 | X2 | X3 | y |
-|-----|-----|-----|-----|
-| ? | ? | 0.2 | 88 |
-| 0.2 | 88 | 0.5 | 89 |
-| 0.5 | 89 | 0.7 | 87 |
-| 0.7 | 87 | 0.4 | 88 |
-| 0.4 | 88 | 1.0 | 90 |
-| 1.0 | 90 | ? | ? |
+| X1 | X2 | X3 | y |
+|-----|-----|-----|-----|
+| ? | ? | 0.2 | 88 |
+| 0.2 | 88 | 0.5 | 89 |
+| 0.5 | 89 | 0.7 | 87 |
+| 0.7 | 87 | 0.4 | 88 |
+| 0.4 | 88 | 1.0 | 90 |
+| 1.0 | 90 | ? | ? |
 
 We can see that, as in the univariate time series example above, we may need to remove the first and last rows in order to train our supervised learning model.
 
 If we need to predict both `measure1` and `measure2` for the next time step, we can transform the data as follows (both framings are sketched in code below):
 
-| X1 | X2 | y1 | y2 |
-|-----|-----|-----|-----|
-| ? | ? | 0.2 | 88 |
-| 0.2 | 88 | 0.5 | 89 |
-| 0.5 | 89 | 0.7 | 87 |
-| 0.7 | 87 | 0.4 | 88 |
-| 0.4 | 88 | 1.0 | 90 |
-| 1.0 | 90 | ? | ? |
+| X1 | X2 | y1 | y2 |
+|-----|-----|-----|-----|
+| ? | ? | 0.2 | 88 |
+| 0.2 | 88 | 0.5 | 89 |
+| 0.5 | 89 | 0.7 | 87 |
+| 0.7 | 87 | 0.4 | 88 |
+| 0.4 | 88 | 1.0 | 90 |
+| 1.0 | 90 | ? | ? |
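A hedged pandas sketch reproducing both multivariate framings above; the column choices follow the two tables, and pandas is assumed:

```python
import pandas as pd

df = pd.DataFrame({
    "measure1": [0.2, 0.5, 0.7, 0.4, 1.0],
    "measure2": [88, 89, 87, 88, 90],
})

# Window width of one, predicting measure2 only (the X1, X2, X3, y table).
single_target = pd.DataFrame({
    "X1": df["measure1"].shift(1),
    "X2": df["measure2"].shift(1),
    "X3": df["measure1"],
    "y":  df["measure2"],
})

# Window width of one, predicting both measures (the X1, X2, y1, y2 table).
both_targets = pd.DataFrame({
    "X1": df["measure1"].shift(1),
    "X2": df["measure2"].shift(1),
    "y1": df["measure1"],
    "y2": df["measure2"],
})

# Rows containing NaN correspond to the "?" rows and are dropped before training.
print(single_target.dropna())
print(both_targets.dropna())
```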
 
 ### Sliding Window With Multiple Steps
 
@@ -438,24 +438,24 @@ If we need to predict both `measure1` and `measure2` for the next time step, we
 
 Consider this univariate time series dataset:
 
-| time | measure |
-|------|---------|
-| 1 | 100 |
-| 2 | 110 |
-| 3 | 108 |
-| 4 | 115 |
-| 5 | 120 |
+| time | measure |
+|------|---------|
+| 1 | 100 |
+| 2 | 110 |
+| 3 | 108 |
+| 4 | 115 |
+| 5 | 120 |
 
 We can frame this time series as a two-step forecasting dataset for supervised learning with a window width of one, as follows:
 
-| X1 | y1 | y2 |
-|-----|-----|-----|
-| ? | 100 | 110 |
-| 100 | 110 | 108 |
-| 110 | 108 | 115 |
-| 108 | 115 | 120 |
-| 115 | 120 | ? |
-| 120 | ? | ? |
+| X1 | y1 | y2 |
+|-----|-----|-----|
+| ? | 100 | 110 |
+| 100 | 110 | 108 |
+| 110 | 108 | 115 |
+| 108 | 115 | 120 |
+| 115 | 120 | ? |
+| 120 | ? | ? |
 
 Note that a supervised model only has X1 to work with in order to predict both y1 and y2, as the sketch below shows.
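Finally, a small pandas sketch of the two-step framing; shift(-1) supplies the second step ahead, and pandas is assumed:

```python
import pandas as pd

series = pd.Series([100, 110, 108, 115, 120], name="measure")

# Window width of one (X1 = value at t-1), forecasting two steps ahead.
frame = pd.DataFrame({
    "X1": series.shift(1),
    "y1": series,            # first step to be predicted
    "y2": series.shift(-1),  # second step to be predicted
})
# Rows with NaN match the "?" rows above and are dropped before training.
print(frame.dropna())
```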
 