# Multivariate Linear Regression

Multivariate linear regression is similar to ordinary linear regression, but instead of a single dependent variable Y, it has multiple output variables. It may be written as

**Y = XB + U**,

where **Y** is a matrix holding a series of multivariate measurements (each column being a set of measurements on one of the dependent variables), **X** is a matrix of observations on the independent variables, which may be a design matrix (each column being a set of observations on one of the independent variables), **B** is a matrix of parameters that usually have to be estimated, and **U** is a matrix of errors (noise).
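To make the shapes in **Y = XB + U** concrete, here is a minimal sketch that simulates data from the model and estimates **B** by ordinary least squares. The sizes `n`, `p`, `m`, the coefficient values in `B_true`, and the noise scale are all assumptions chosen for illustration; least squares is one common way to estimate **B**, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: n observations, p predictors, m dependent variables.
n, p, m = 200, 3, 2

X = rng.normal(size=(n, p))            # n x p matrix of predictors
B_true = np.array([[1.0, -2.0],
                   [0.5,  0.0],
                   [3.0,  1.5]])       # p x m parameters (made up for the demo)
U = 0.1 * rng.normal(size=(n, m))      # n x m matrix of errors
Y = X @ B_true + U                     # n x m matrix of responses

# Least-squares estimate of B; lstsq accepts a matrix on the right-hand side
# and solves for all m response columns at once.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(B_hat.shape)  # (3, 2)
```

Each column of `B_hat` holds the coefficients for one dependent variable, mirroring the column layout of **Y**.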

**Multivariate Linear Regression vs Multiple Linear Regression**

In **multivariate regression** there is more than one dependent variable, each with its own variance (or distribution). There may be one or several predictor variables. In **multiple regression** there is just one dependent variable, i.e. y, but there are multiple predictor variables.

The difference lies in the number of variables on either side of the equation. In multivariate regression, the goal is to exploit the fact that there is (generally) correlation between the response variables (or outcomes).

Put differently, multivariate regression can be viewed as a multiple regression with a matrix of dependent variables, each with its own variance. When we say multiple regression, we mean only one dependent variable with a single distribution or variance, and more than one predictor variable. To summarise, *multiple* refers to more than one predictor variable, whereas *multivariate* refers to more than one dependent variable.
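In code, the distinction shows up in the shape of the response. The following sketch uses random placeholder data (not a real dataset) and NumPy's least-squares solver to fit both kinds of model against the same three predictors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = rng.normal(size=(n, 3))   # three predictors in both cases

# Multiple regression: one dependent variable, so y is a vector of length n.
y = rng.normal(size=n)
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Multivariate regression: two dependent variables, so Y is an n x 2 matrix.
Y = rng.normal(size=(n, 2))
B, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(b.shape)  # (3,)   one coefficient per predictor
print(B.shape)  # (3, 2) one coefficient per predictor and per outcome
```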

**Why use Multivariate Linear Regression?**

Suppose that in a medical trial the predictors are weight, age, and race, and the outcome variables are blood pressure and cholesterol. We could, in theory, create **two "multiple regression"** models, one regressing blood pressure on weight, age, and race, and a second regressing cholesterol on those same factors. Alternatively, we could create a **single multivariate regression** model that predicts *both* blood pressure and cholesterol simultaneously from the three predictor variables.

*The idea is that the multivariate regression model may be better (more predictive) to the extent that it can learn from the correlation between blood pressure and cholesterol in patients.*
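The medical example can be sketched numerically. Everything below is synthetic: the predictor distributions, the coefficients in `B_true`, and the assumed error correlation of 0.6 are made up for illustration, and race is left out to keep the sketch short. One point worth checking: with plain least squares and identical predictors, the joint fit yields the same coefficient estimates as two separate fits. What the joint view adds is an estimate of how the two outcomes' errors move together, which is the correlation the text refers to.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Synthetic predictors (illustrative values, not real patient data).
weight = rng.normal(80.0, 12.0, size=n)          # kg
age = rng.normal(50.0, 10.0, size=n)             # years
X = np.column_stack([np.ones(n), weight, age])   # intercept + two predictors

# Errors for the two outcomes, drawn with an assumed correlation of 0.6.
noise_cov = np.array([[1.0, 0.6],
                      [0.6, 1.0]])
U = rng.multivariate_normal(np.zeros(2), noise_cov, size=n)

# Hypothetical coefficients; columns are blood pressure and cholesterol.
B_true = np.array([[90.0, 120.0],
                   [0.3,    0.5],
                   [0.4,    0.8]])
Y = X @ B_true + U

# One multivariate regression: both outcomes fitted together.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Two separate multiple regressions, one per outcome.
b_bp, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)
b_chol, *_ = np.linalg.lstsq(X, Y[:, 1], rcond=None)
B_separate = np.column_stack([b_bp, b_chol])

# Under ordinary least squares the coefficient estimates coincide.
same_coefficients = np.allclose(B_joint, B_separate)

# The joint residuals give an estimate of the correlation between outcomes.
residuals = Y - X @ B_joint
residual_corr = np.corrcoef(residuals, rowvar=False)[0, 1]

print(same_coefficients)
print(round(residual_corr, 2))
```

The residual correlation is the piece of information that two independent models never look at; methods that go beyond plain least squares can use it for joint tests or prediction intervals across outcomes.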
