Important
- Model Generalizability – “All models are wrong, but some are useful”; a model must sacrifice some fit to the current data in order to generalize to new data and contexts.
- Zero Correlation Consequence – When the correlation between the independent and dependent variables is zero, the model collapses to a default value: the mean of Y (a flat line), as demonstrated in the U-shaped-data sketch after this list.
- Data Visualization – It is crucial to plot data (e.g., scatterplots) before calculating correlation, to catch relationships (like U-shaped curves) that a simple Pearson’s $r$ would incorrectly report as zero association (see the U-shaped-data sketch after this list).
- Significance Testing for Correlation – Do not rely on p-values to deem a correlation significant, as they are heavily influenced by sample size; focus instead on the magnitude (effect size) of the correlation (see the sample-size sketch after this list).
- Optimization Algorithm – This course exclusively uses Ordinary Least Squares (OLS) to estimate regression parameters; other algorithms are not required.
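
A minimal U-shaped-data sketch, assuming NumPy and a deterministic toy dataset (both illustrative, not from the course materials): Pearson’s $r$ reports zero association for a perfect nonlinear relationship, and the fitted OLS line collapses to a flat line at the mean of Y.

```python
import numpy as np

x = np.linspace(-3, 3, 101)   # symmetric around zero
y = x ** 2                    # perfect U-shaped relationship

# Pearson's r is ~0 despite the perfect (nonlinear) relationship
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson's r: {r:.4f}")

# OLS estimates: b = cov(x, y) / var(x), a = mean(y) - b * mean(x)
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
print(f"slope: {b:.4f}, intercept: {a:.4f}, mean of y: {y.mean():.4f}")
# slope ~0 and intercept ~mean(y): the model falls back to a flat line at Y's mean
```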
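A sample-size sketch, assuming SciPy and simulated data (the weak true correlation of roughly 0.1 is an arbitrary choice for illustration): the same small effect typically fails a significance test at n = 30 yet passes easily at n = 3000, which is why the magnitude of $r$, not the p-value, is the quantity to interpret.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

for n in (30, 300, 3000):
    x = rng.normal(size=n)
    y = 0.1 * x + rng.normal(size=n)   # true correlation is weak (~0.1)
    r, p = stats.pearsonr(x, y)
    print(f"n = {n:4d}: r = {r:+.3f}, p = {p:.4f}")
# r stays small at every n, but p typically shrinks below 0.05 as n grows
```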
Core Concepts
- Statistical Model Structure: Most statistical models follow the formula Data = Model + Error, where Data is the dependent variable, Model is the explanatory function of the independent variables, and Error is the residual, unexplained variation (a decomposition checked numerically in the OLS sketch after this list).
- Regression Analysis: A method used to explain or predict data by applying optimization algorithms (like OLS) to minimize the error component and improve the model’s fit.
- Ordinary Least Squares (OLS): The foundational optimization algorithm, solvable in a single closed-form step, used to estimate model parameters by minimizing the sum of squared distances (errors) between the observed data points and the assumed model line (see the OLS sketch after this list).
- Covariance: A measure of the shared variance between two variables, indicating that they vary together; it is unstandardized and therefore tied to the specific units of measurement (see the units sketch after this list).
- Pearson Correlation ($r$): The standardized covariance between two continuous variables, bounded between -1 and 1, which allows for the comparison of relationship intensity and direction across different scales.
- Simple Linear Regression (SLR) Model: A linear function ($y_i = a + bx_i + e_i$) where $a$ is the intercept (the value of $y$ when $x = 0$) and $b$ is the slope (the change in $y$ for a one-unit change in $x$).
- Fixed vs. Random Effects: Fixed effects assume a single stable effect plus random error, while random effects (used in repeated measures) account for variance attributable to the subjects themselves (e.g., within-person variance across time).
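
A units sketch, assuming NumPy and hypothetical simulated height/weight data: rescaling height from meters to centimeters changes the covariance by a factor of 100, while Pearson’s $r$ (the covariance standardized by both standard deviations) is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, size=200)                          # meters
weight = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, size=200)   # kg
height_cm = height_m * 100                                         # same data, new units

cov_m = np.cov(height_m, weight, ddof=1)[0, 1]
cov_cm = np.cov(height_cm, weight, ddof=1)[0, 1]
r_m = np.corrcoef(height_m, weight)[0, 1]
r_cm = np.corrcoef(height_cm, weight)[0, 1]

print(f"cov in meters: {cov_m:.3f}   cov in cm: {cov_cm:.3f}")   # differs 100-fold
print(f"r   in meters: {r_m:.3f}   r   in cm: {r_cm:.3f}")       # identical
# r is the covariance divided by the product of the two standard deviations:
print(np.isclose(r_m, cov_m / (height_m.std(ddof=1) * weight.std(ddof=1))))
```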
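An OLS sketch, assuming NumPy and simulated data with arbitrary true parameters ($a = 2.0$, $b = 0.5$): the slope and intercept come from the closed-form OLS solution, and the Data = Model + Error decomposition is recovered exactly from the residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=100)   # Data = Model + Error

# Closed-form OLS estimates for simple linear regression
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)    # slope
a = y.mean() - b * x.mean()                           # intercept

model = a + b * x        # Model: the explained part
error = y - model        # Error: the residual, unexplained part
print(f"a = {a:.3f}, b = {b:.3f}")
print(np.allclose(y, model + error))     # Data = Model + Error holds
print(f"residuals sum to ~0: {error.sum():.2e}")   # a property of OLS with an intercept
```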
Theories and Frameworks
- Frequentism/Bayesianism: Statistical frameworks that provide specific assumptions about data behavior used by algorithms to generate parameter estimates.
- Gauss-Markov Theorem (GMT): A theorem stating that, when a set of statistical assumptions is met, the Ordinary Least Squares (OLS) estimator is the best linear unbiased estimator (BLUE).
- Under-determination of Theory by Data: A philosophical concept stating that multiple different models can fit the same data, making it difficult to determine which theoretical model is “correct” based on data alone.
Notable Individuals
- George Box: Statistician famous for the quote “All models are wrong, but some are useful”.
- Karl Pearson: Developed the standardized correlation coefficient and popularized hypothesis testing.
- Francis Galton: Darwin’s cousin, who popularized correlation and regression by building on the work of others.
- Ronald Fisher: Associated with the famous “Fisher’s Iris data set” used to demonstrate regression and ANOVA.

