Resources
The Breakdown
Important
- Interpreting results and extracting percentages/prevalence rates from papers – You need to understand how to interpret research results and how to extract percentages and prevalence rates, such as the prevalence rate of anxiety among specific demographic groups from example papers. This skill is important for the midterm.
- Questionnaires – These contain a set of written questions or items used to gather information from participants. They can be closed-ended, open-ended, or mixed. Understanding what questionnaires are is important.
- Structured, Semi-structured, and Unstructured Interviews – Interviews can be conducted face-to-face, on the phone, or online. Structured interviews have pre-defined questions, semi-structured interviews have pre-defined questions but allow for follow-up questions, and unstructured interviews have no questions planned beforehand, only a general topic. These different types of interviews have been mentioned previously.
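- Sample size in cross-sectional studies and Type One Error – For cross-sectional studies, the sample size should be large enough to give adequate statistical power, that is, a good chance of detecting an effect that really exists. A Type One error (a false positive) happens when there is no real effect, but a researcher concludes from the results that there is one. Its rate is set mainly by the chosen significance level (for example 0.05), while a larger sample mainly lowers the risk of missing a real effect (a Type Two error). You will learn more about Type One error in this course.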
- Statistical procedures – You will learn a little bit about statistical procedures used in quantitative methods in this course.
- Controlling for confounding variables – Strategies to control for confounding variables in a study include excluding participants based on certain criteria (like IQ), balancing groups, or collecting information on socio-demographic backgrounds and using that data in the analysis to see if there are differences. Collecting and including everyone initially, gathering important demographic information, and using that information in the analysis is suggested as a good strategy.
- Factor Analysis – This is a statistical technique that shows the relation between a latent factor (or construct) and each variable or item (the factor loading), and also estimates the error variance for each item. It can also show relationships among multiple factors. This is a concept you will encounter.
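As a small illustration of the first point in this list, a prevalence rate is the number of cases divided by the number of people in the group, expressed as a percentage. The sketch below assumes made-up counts for two hypothetical groups; the function name `prevalence_percent` and all numbers are illustrative and do not come from any paper.

```python
def prevalence_percent(cases: int, sample_size: int) -> float:
    """Return prevalence as a percentage of the group."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= cases <= sample_size:
        raise ValueError("cases must be between 0 and sample_size")
    return 100.0 * cases / sample_size


# Hypothetical counts (cases with anxiety, group size); not real data.
groups = {"women": (45, 300), "men": (30, 250)}

rates = {name: prevalence_percent(c, n) for name, (c, n) in groups.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

When a paper reports only counts in a table, this is the calculation needed to recover the percentage for each demographic group.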
Core concepts
- Quantitative vs. Qualitative Studies – Quantitative studies aim to test hypotheses and measure variables using numerical data to identify patterns, relationships, and differences. Qualitative studies aim for in-depth information, often from smaller sample sizes, with conclusions based on interpretation by the investigator using nonstatistical techniques. Quantitative research focuses on numerical data and uses statistical analysis.
- Data Collection Methods – Common methods in quantitative research include self-reports, questionnaires, scales, interviews, or tasks. Qualitative research commonly uses interviews, focus groups, messages, or content from interactions or social media to analyse content. Questionnaires are a set of written questions. Interviews can be structured, semi-structured, or unstructured. Tasks, like a numerical comparison task or memorisation task, can also be used to collect data.
- Focus Groups – Used in qualitative studies, focus groups gather insights and opinions from a small group of participants (usually 6 to 12) on a specific topic. A moderator guides the discussion using open-ended questions. The moderator and researcher record opinions, interactions, and reactions.
- Data Analysis – In quantitative studies, data analysis typically involves statistical procedures such as correlation, regression, t-test, and ANOVA. In qualitative studies, analysis is based on interpretation using different approaches, which can be influenced by the researcher’s perspective.
- Non-Experimental Methods – These include study types like cross-sectional and longitudinal studies. Cross-sectional studies use tools such as self-reports, scales, and questionnaires and require a large sample size for sufficient statistical power.
- Constructs – Psychological constructs, like depression, anxiety, or love, are complex and not directly observable. Researchers measure constructs by observing behaviours or using items associated with the construct. Factor analysis can help understand the structure of constructs based on multiple variables or items.
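The link between sample size and statistical power mentioned for cross-sectional studies can be illustrated with a rough simulation. This is only a sketch under simplifying assumptions: two groups drawn from normal distributions with a known standard deviation of 1, compared with a two-sided z-test at the 0.05 level. The function name `rejection_rate` and the effect size of 0.5 are hypothetical choices.

```python
import math
import random


def rejection_rate(n, true_diff, sims=2000, seed=0):
    """Share of simulated two-group studies where |z| > 1.96.

    Assumes normal data with known SD = 1 in both groups (a simplification).
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n)  # standard error of the difference in means
    rejections = 0
    for _ in range(sims):
        mean_a = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        mean_b = sum(rng.gauss(true_diff, 1.0) for _ in range(n)) / n
        z = (mean_b - mean_a) / se
        if abs(z) > 1.96:
            rejections += 1
    return rejections / sims


# No real difference: every "significant" result is a false positive.
false_positive_rate = rejection_rate(n=50, true_diff=0.0)

# A real difference of 0.5 SD: the rejection rate is the power.
power_small = rejection_rate(n=20, true_diff=0.5)
power_large = rejection_rate(n=100, true_diff=0.5)

print(f"false positive rate: {false_positive_rate:.3f}")
print(f"power with n=20 per group:  {power_small:.3f}")
print(f"power with n=100 per group: {power_large:.3f}")
```

In this setup the false positive rate should stay near the chosen 0.05 level, while the chance of detecting the real difference rises noticeably with the larger sample.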
Shit I didn’t Understand
- Recall bias is a potential issue in studies, particularly those collecting past information, where participants may not accurately remember events or details. Longitudinal studies that use appropriate time intervals and data collection methods aim to reduce this bias.
- Regression refers to a family of statistical techniques used to examine the relationship between one or more independent variables (predictors) and a dependent variable (outcome). They are used for prediction, understanding, or analysing how the outcome changes as the predictor(s) change. The type of regression used depends on the nature of the outcome variable:
  - Linear Regression: Used when the outcome variable is continuous or quantitative.
    - Simple Linear Regression involves one outcome variable and one single predictor.
    - Multiple Linear Regression (also called multivariable regression) involves one outcome variable and two or more independent variables (predictors).
  - Logistic Regression: Used when the outcome variable is binary, meaning it has two possible categories (e.g., having/not having a disorder, yes/no).
  - Ordinal Regression: Used when the outcome variable is ordinal, meaning its categories have a specific order (e.g., satisfaction levels from low to high, education levels).
  - Multivariate Regression: Used when a study has multiple outcome variables, which is distinct from having multiple predictors.
- Continuous variables, also referred to as quantitative variables, are quantities that are typically measured by assigning a number to each individual. Examples include height, people’s level of talkativeness, how depressed they are, or the number of siblings they have. In the context of correlational studies, such as examining “the relation between learning styles and personality traits”, these traits would be measured using scales or questionnaires that produce numerical scores, allowing them to be treated as continuous variables for analysis, like using Pearson correlation.
- A Fit line (or regression line) is a line drawn on a scatterplot to visualise the relationship or trend between two variables. It helps to show the overall pattern of the relationship, such as whether it is positive or negative, and its strength.
- An Intercept is represented by B0 or beta 0 in a regression equation. It represents the expected value of the outcome variable when all predictor variables are zero. It can also be thought of as the “start point” of the relationship or the value of the outcome when no predictors are considered in the model. The intercept is a calculated value and is not always zero; while it can sometimes be fixed as zero in certain models, it is typically a value determined by the analysis.
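To make the Pearson correlation mentioned above more concrete, here is a minimal sketch of how r can be computed from two sets of numerical scores. The variable names and the scores are invented for illustration (they stand in for questionnaire totals), and `pearson_r` is a hypothetical helper, not a function from any particular statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale scores for five participants; not real data.
extraversion = [2, 4, 5, 7, 9]
group_learning_preference = [1, 3, 6, 6, 9]

r = pearson_r(extraversion, group_learning_preference)
print(f"r = {r:.2f}")
```

A value of r close to +1 corresponds to points lying near an upward-sloping fit line, a value close to -1 to a downward-sloping one, and a value near 0 to no clear linear trend.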
More About Regression
Regression is a statistical technique used by researchers to examine the relationship between one or more independent variables, called predictors, and a dependent variable, called an outcome. The purpose of regression is for prediction, understanding, or analysing how the dependent variable changes as the independent variable(s) change. Once a relationship between two variables has been established, regression can be used to make predictions about the value of one variable given the value of another. The variable used to make the prediction is often called the predictor variable, and the variable being predicted is called the outcome variable or criterion variable.
The type of regression used depends on the nature of the outcome variable.
- Linear regression is used when the outcome variable is continuous or quantitative.
- Ordinal regression is used when the outcome variable is ordinal.
- Logistic or binary logistic regression is used when the outcome variable is binary.
Let’s focus on the formulae for linear regression as detailed in the sources. There are different types of linear regression depending on the number of predictor variables.
Simple Linear Regression
Simple linear regression models the relationship between one dependent variable (Y) and one independent variable (X). The general equation for simple linear regression is:
Y = b0 + b1X1 + e
In this equation:
- Y refers to the outcome variable that you want to predict.
- b0 (or beta 0) refers to the intercept or constant value. It is the expected value of the outcome when the predictor is zero, which can be thought of as the start point of the fit line. It is always included in the equation.
- b1 (or beta coefficient B1) refers to the slope or the coefficient for the first predictor. It indicates the change in Y for a one-unit change in X.
- X1 refers to the predictor variable.
- e refers to the error term for this regression equation.
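The terms of this equation can be seen in a small worked example. The sketch below estimates b0 and b1 with ordinary least squares, which is the usual estimation method for linear regression but is an assumption here since the notes do not name one. The data and the helper `fit_simple_regression` are made up for illustration.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for Y = b0 + b1*X1 + e using ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the predictor must vary")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx               # slope: change in Y per one-unit change in X
    b0 = mean_y - b1 * mean_x    # intercept: predicted Y when X is zero
    return b0, b1


# Hypothetical data; not from the course material.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_regression(x, y)
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # estimates of e

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

The residuals are the part of each observed Y that the line does not explain, which is what the error term e stands for in the equation.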
When interpreting regression coefficients like b1 (or beta), we primarily focus on whether the relationship is positive or negative. A positive coefficient means that as the predictor variable (X) increases, the outcome variable (Y) also increases. A negative coefficient means that as X increases, Y decreases. For a positive beta, every one-unit increase in X is associated with an average increase in Y of beta units. For a negative beta, every one-unit increase in X is associated with an average decrease in Y equal to the absolute value of beta. Unlike a correlation coefficient (Pearson’s r), the size of an unstandardised beta is not read as the strength of the relationship, because it depends on the units in which X and Y are measured; other measures in regression are used for strength, which you will learn later.
Multiple Linear Regression
Multiple linear regression is a statistical technique used to model the relationship between one dependent variable (Y) and two or more independent variables (X1, X2, …). It is also sometimes called multivariable regression. A model with multiple outcome variables is instead called multivariate regression. The general equation for multiple linear regression is:
Y = b0 + b1X1 + b2X2 + b3X3 + … + biXi + e
In this equation:
- Y is the outcome variable.
- b0 is the intercept.
- b1, b2, b3, … bi are the coefficients for the predictors. These regression weights indicate how large a contribution a predictor variable makes, on average, to the prediction of the outcome variable. They also indicate how much the outcome variable changes for each one-unit change in the predictor variable. A key advantage of multiple regression is that it can show whether a predictor variable contributes to the outcome variable over and above the contributions made by other predictor variables (i.e., statistically controlling for other predictors).
- X1, X2, X3, … Xi are the predictor variables.
- e is the error term.
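As a rough sketch of how these coefficients can be estimated, the code below fits a two-predictor model by solving the least-squares normal equations with plain Gaussian elimination. This is one possible approach chosen so the example needs no external libraries; statistical software would normally be used instead. The function name and the data are hypothetical, and the outcome values were generated exactly from Y = 1 + 2*X1 + 3*X2 (no error term) so the recovered coefficients are easy to check.

```python
def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] from least squares via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # leading 1 = intercept
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
        + [sum(X[i][p] * y[i] for i in range(n))]
        for p in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot_row][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[p][k] for p in range(k)]


# Hypothetical predictors (X1, X2) and outcome built as Y = 1 + 2*X1 + 3*X2.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
outcome = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

coefficients = fit_multiple_regression(predictors, outcome)
print([round(b, 3) for b in coefficients])
```

Each slope here is the change in Y for a one-unit change in its predictor while the other predictor is held constant, which is the sense in which multiple regression statistically controls for other predictors.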
The equation for multiple linear regression can also include interaction terms when examining moderation. A moderator variable can change the direction or strength of the relationship between other variables. An interaction term is the product of two predictors. For example, if you are predicting Depression (Y) from Social Support (X1) and Gender (X2) and want to include Gender as a moderator of the relationship between Social Support and Depression, the equation might look like:
Depression = b0 + b1(Social Support) + b2(Gender) + b3(Social Support * Gender) + e
In this equation, b3 is the coefficient for the interaction term (Social Support * Gender). A more complex example with multiple predictors and an interaction term is given for predicting Vaccine Mistrust (Y) from Everyday Discrimination (X1), Major Discrimination (X2), Conspiracy Belief (X3), Health Literacy (X4), and Black Participants (X5), with an interaction between Black Participants and Health Literacy (X4 * X5):
Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + b6(X4 * X5) + e
Here, b6 is the coefficient for the interaction term.
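One way to see what the interaction coefficient does is to plug numbers into the Depression equation. The sketch below assumes Gender is coded 0 or 1 and uses invented coefficient values; neither the coding nor the numbers come from the course material. With that coding, the slope of Social Support is the Social Support coefficient for the group coded 0, and that coefficient plus the interaction coefficient for the group coded 1.

```python
def predict_depression(social_support, gender, b0, b1, b2, b3):
    """Predicted Depression from the moderation equation (error term omitted)."""
    return b0 + b1 * social_support + b2 * gender + b3 * (social_support * gender)


def support_slope(gender, b1, b3):
    """Change in predicted Depression per one-unit increase in Social Support."""
    return b1 + b3 * gender


# Hypothetical coefficients; gender is assumed to be coded 0 or 1.
coef = {"b0": 20.0, "b1": -1.0, "b2": 3.0, "b3": -0.5}

slope_group_0 = support_slope(0, coef["b1"], coef["b3"])
slope_group_1 = support_slope(1, coef["b1"], coef["b3"])

print(f"slope of social support when gender = 0: {slope_group_0}")
print(f"slope of social support when gender = 1: {slope_group_1}")
print(predict_depression(4, 0, **coef), predict_depression(4, 1, **coef))
```

Because the two slopes differ, the relationship between Social Support and Depression depends on Gender, which is what it means for Gender to act as a moderator.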

