Stats notes
I/ The OLS regression model can be represented as:
Y_i = β_0 + β_1 X_i + ε_i
Where:
Y_i is the observed value of the dependent variable.
X_i is the observed value of the independent variable.
β_0 and β_1 are the intercept and slope coefficients, respectively.
ε_i is the error term.
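As an illustration of the model above, here is a minimal sketch of how the intercept and slope can be estimated by least squares when there is a single independent variable. The function name fit_simple_ols and the data are made up for this example; they are not part of the original notes.

```python
# Minimal sketch: closed-form OLS estimates for one independent variable.
# The function name and the data are illustrative assumptions.

def fit_simple_ols(x, y):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    # The fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
# Residuals are the sample counterparts of the error term.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

With an intercept in the model, the residuals sum to zero (up to rounding), which is a quick sanity check on any fit.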
A. R-squared (R²):
For an OLS model that includes an intercept, R-squared lies between 0 and 1.
R-squared measures the proportion of the variance in the dependent variable that is explained by the independent variables in the model.
A higher R-squared indicates that more variance in the dependent variable is explained by the independent variables, suggesting a better fit of the model to the data.
However, R-squared alone doesn't determine the quality of the model. It's important to consider the context and purpose of the analysis.
There isn't a universally agreed-upon threshold for what constitutes a "good" R-squared, but values closer to 1.0 are generally preferred.
For example, an R-squared of 0.70 means that 70% of the variance in the dependent variable is explained by the independent variables in the model.
Limitations of R-squared: While R-squared is a useful measure of goodness of fit, it doesn't establish a causal relationship between the independent and dependent variables, nor does it indicate whether the model is correctly specified.
Additionally, R-squared never decreases when more independent variables are added to the model, so it can be inflated by variables that are not truly related to the dependent variable.
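The inflation effect can be checked numerically. The sketch below computes R-squared as 1 - SS_res / SS_tot and compares a model using x alone with one that also includes a pure-noise column. The helper names (solve_linear_system, r_squared) and the simulated data are assumptions made for this example, and the normal equations are solved by hand only to keep the sketch dependency-free.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta

def r_squared(columns, y):
    """Fit OLS with an intercept on the predictor columns; return R-squared."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

random.seed(0)
n = 30
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 3) for xi in x]
noise = [random.gauss(0, 1) for _ in range(n)]  # unrelated to y by construction

r2_one = r_squared([x], y)
r2_two = r_squared([x, noise], y)
print(f"R-squared with x only:      {r2_one:.4f}")
print(f"R-squared with x and noise: {r2_two:.4f}")
```

The second value is never smaller than the first, even though the added column carries no information about y.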
B. Adjusted R-squared:
Adjusted R-squared is similar to R-squared but penalizes the inclusion of additional independent variables in the model. It accounts for the degrees of freedom used by each independent variable.
Adjusted R-squared is often preferred when comparing models with different numbers of predictors.
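A higher adjusted R-squared indicates a better fit of the model, similar to R-squared. Unlike R-squared, it can decrease when an uninformative predictor is added, and it can be negative.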
Again, there isn't a strict threshold for what constitutes a "good" adjusted R-squared, but values closer to 1.0 are generally preferred.
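The penalty can be made concrete with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1), where n is the number of observations and k the number of independent variables. The sketch below applies it to the same R-squared of 0.70 with different numbers of predictors; the function name and the sample sizes are chosen only for illustration.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept.

    Uses 1 - (1 - R^2) * (n - 1) / (n - k - 1).
    """
    dof = n_obs - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof

# Same R-squared, same sample size, different numbers of predictors.
few = adjusted_r_squared(0.70, n_obs=30, n_predictors=3)
many = adjusted_r_squared(0.70, n_obs=30, n_predictors=10)
print(f"3 predictors:  {few:.4f}")
print(f"10 predictors: {many:.4f}")
```

The model that needed ten predictors to reach 0.70 gets the lower adjusted value, which is why this measure is the fairer one when comparing models of different sizes.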
C. P-value associated with coefficients:
The p-value associated with a coefficient is the probability of observing an estimate at least as extreme as the one obtained if the null hypothesis is true (i.e., if the true population coefficient is zero).
A lower p-value suggests that the coefficient is statistically significant, meaning that an estimate this far from zero would be unlikely to arise by chance alone if the true coefficient were zero.
The conventional threshold for statistical significance is often set at 0.05 (5%). Coefficients with p-values less than 0.05 are typically considered statistically significant.
However, the interpretation of p-values should be considered alongside other factors, such as effect size and practical significance.
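To show where a coefficient's p-value comes from, here is a rough sketch for the slope in a one-variable model: the estimate is divided by its standard error to get a t-statistic, which is compared with a Student's t distribution with n - 2 degrees of freedom. The tail probability is obtained here by numerically integrating the t density with Simpson's rule, purely to avoid external dependencies; a statistics library would normally be used instead. All function names and the data are assumptions for this example.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p_value(t_stat, df, steps=10_000):
    """P(|T| >= |t_stat|), using Simpson's rule on [0, |t_stat|].

    steps must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1.0 - 2.0 * central_mass)

def slope_test(x, y):
    """Return (slope, standard error, t-statistic, two-sided p-value)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2  # two estimated coefficients: intercept and slope
    se_b1 = math.sqrt(ss_res / df / sxx)
    t_stat = b1 / se_b1
    return b1, se_b1, t_stat, two_sided_p_value(t_stat, df)

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, 1.5, 3.8, 3.1, 5.2, 4.4, 6.9, 5.8]

slope, se, t_stat, p = slope_test(x, y)
print(f"slope={slope:.3f}, se={se:.3f}, t={t_stat:.2f}, p={p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

The slope itself is the effect size; the p-value only says how surprising that slope would be if the true coefficient were zero, which is why the two should be read together.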