What is multiple regression and how is it different from the regression with one independent variable?

  1. Career development
  2. Multiple (Linear) Regression: Formula, Examples and FAQ

By Indeed Editorial Team

Updated June 30, 2022 | Published July 7, 2021

Updated June 30, 2022

Published July 7, 2021

Mathematical calculations such as those used in regression analysis can help you to predict future outcomes in a variety of industries. Statistical analysis of data is often beneficial to both businesses and institutions that aim to be prepared for all possibilities.

Multiple regression is a specific statistical technique that can help people understand the relationship between one dependent variable and two or more independent variables. Because multiple regression allows for more variance, it provides analysts with the ability to make optimal predictions of the response variable’s outcomes.

In this article, we will explain what multiple regression is, go over the formula, explain with examples how you can use multiple regression to forecast events and answer frequently asked questions about the multiple linear regression model.

What is multiple regression?

Multiple regression, also known as multiple linear regression (MLR), is a statistical technique that uses two or more explanatory variables to predict the outcome of a response variable. In other words, it can explain the relationship between multiple independent variables against one dependent variable. These independent variables serve as predictor variables, while the single dependent variable serves as the criterion variable. You can use this technique in a variety of contexts, studies and disciplines, including in econometrics and financial inference.

Multiple vs. linear regression

This form of regression analysis expands upon linear regression, which is the simplest form of regression. Simple linear regression creates linear mathematical relationships between one independent variable and one dependent variable, represented by y = a + ßx, where y can only result in one outcome based on the variable x. For example, in the equation 20 + 2x, where x = 5, y can only be 30.

Related: Linear vs. Nonlinear Equations: Understanding the Key Differences

Multiple linear regression formula

Here’s the formula for multiple linear regression, which produces a more specific calculation:

y = ß0 + ß1x1 + ß2x2 + ... + ßpxp

The variables in this equation are:

  • y is the predicted or expected value of the dependent variable.

  • x1, x2, and xp are three independent or predictor variables.

  • ß0 is the value of y when all the independent variables are equal to zero.

  • ß1, ß2, and ßp are the estimated regression coefficients. Each regression coefficient represents the change in y relative to a one-unit change in the respective independent variable.

Because of the multiple variables, which can be linear or nonlinear, this regression analysis model allows for more variance and precision when it comes to predicting outcomes, as well as understanding the impact of each explanatory variable on the model’s total variance.

Read more: Multiple Regression Analysis: Definition and How To Calculate

5 multiple regression examples

Here are some examples of how you might use multiple linear regression analysis in your career:

1. Real estate example

You’re a real estate professional who wants to create a model to help predict the best time to sell homes. You'd like to sell homes at the maximum sales price, but multiple factors can affect the sales price. These variables include the age of the house, the value of other homes in the neighborhood, quantitative measurements of the public school system regarding student performance and the number of nearby parks, among other factors.

You can build a prediction model off these four independent variables to predict the maximum sales price of homes. You can adjust the variables if any of these factors change in terms of their coefficient values.

Related: Descriptive vs. Inferential Statistics: What's the Difference?

2. Business example

You own stock in a publicly traded company and want to know if now would be a good time to sell your stock. Several variables could affect the value of the stock price, including the company's profitability, the company's costs, the company's competition and the company's assets. You can build a prediction model off these four independent variables to help decide whether you should sell the stock immediately or continue holding the stock.

Related: What Is a Stockholder?

3. Public health example

You're an epidemiologist studying the spread of an infectious disease. You want to predict the future spread of this illness based upon current known infections. Multiple independent variables can affect the number of future infections, including the population size, population density, air temperature, asymptomatic carriers and whether the population has achieved herd immunity. You can conduct statistical modeling and multiple linear regression analysis on empirical data to predict an outcome accounting for potential changes in the coefficient values of the predictor variables.

Related: 12 Types of Epidemiologists

4. Sports example

You're an athlete who believes highly in your ability to excel and succeed in competition. You believe you perform better in competition because of your high self-confidence. Other athletes with similar mindsets share the same beliefs. Multiple independent variables could affect athletic performance, including one's self-confidence, sex, age, experience and willingness to take risks in competition. Researchers can conduct a broader study to predict how a change to any of these variables may affect an athlete's performance.

Related: How To Build Self-Confidence in the Workplace

5. Health care example

You're a biostatistician conducting a medical study. You want to create a way to predict a child's future height. Several independent variables can affect a child's growth, including environmental factors and the child's nutrition. You can conduct a multiple linear analysis to predict a child's future height under a scenario of the coefficient values of these variables changing.

Related: 10 Types of Variables in Research and Statistics

Multiple linear regression FAQ

Here are answers to some frequently asked questions about multiple linear regression (MLR):

What's the difference between simple and multiple linear regression techniques?

As noted above, the simple linear method measures one independent variable against one dependent variable. The multiple linear technique is used when there are at least two independent variables against the one dependent variable. Structurally, the multiple linear regression technique is bigger in scope than the simple linear technique. It’s also the more specific model.

Related: 13 Types of Graphs and Charts (Plus When To Use Them)

When can you use multiple linear regression?

You can use multiple linear regression analysis anytime you have three or more measurement variables to assess. One of the measurement variables is the dependent variable, also known as the y variable. The rest of the variables are the independent variables, also known as the x variables.

What is the purpose of multiple linear regression?

The purpose of using this statistical technique is to find an equation that can best predict the y variable as a linear function of the x variables. This can help you plan for the future or be better prepared for multiple possibilities. Businesses can use this method to help evaluate their long-term outlook, for example. Other professionals can use this method to create research predictions.

Related: Quantitative Forecasting vs. Qualitative Forecasting

Is there any disadvantage to using multiple linear regression?

There can be a disadvantage to using this technique if the regression analysis involves two independent variables that are highly similar, or considered to be correlated. This situation could present a multicollinearity, which occurs when two explanatory variables have a strong linear relationship that could inflate the coefficients and create some problems in your analysis.

Example: A medical professional aims to predict a person's blood pressure by using weight and diet as the sole predictor variables. Those variables, however, are substantially similar for predicting high blood pressure, and the similarity may increase the standard errors of the coefficients and require regression model revisions. Removing a redundant term from the regression model or using more advanced techniques can help overcome a multicollinearity challenge.

Related: 12 Data Modeling Tools for Data Analysis

How do you visualize multiple linear regression?

A graphical model of simple linear regression would show a single line crossing through data points that best capture the relationship between the slope’s values and the terms found along the y-intercept. With multiple linear regression, there will be more than one line: one for every independent variable. So using the multiple linear regression formula:

y = ß0 + ß1x1 + ß2x2 + ... + ßpxp

Where x1, x2, and xp are three independent variables, a graph would show three slopes to interpret. In the scatter plot graph below, for example, which shows a simple linear regression, you can imagine two additional lines in a multiple regression model.

What is multiple regression and how is it different from the regression with one independent variable?

Image description

Graphical representation of a scatter plot graph. This could also represent a simple linear regression equation.