Multiple Regression - Gaussian Model

Paul Bastide - Ibrahim Bouzalmat

14/02/2024

What we know

Model

Model:

$$y = X\beta + \epsilon$$

  • $y$ random vector of $n$ responses
  • $X$ non-random $n \times p$ matrix of predictors
  • $\epsilon$ random vector of $n$ errors
  • $\beta$ non-random, unknown vector of $p$ coefficients

Assumptions:

  • (H1) $\operatorname{rg}(X) = p$
  • (H2) $\mathbb{E}[\epsilon] = 0$ and $\operatorname{Var}[\epsilon] = \sigma^2 I_n$

OLS Estimators

$$\hat{\beta} = \underset{\beta \in \mathbb{R}^p}{\operatorname{argmin}} \|y - X\beta\|^2$$

$$\hat{\beta} = (X^TX)^{-1}X^Ty \qquad \mathbb{E}[\hat{\beta}] = \beta \qquad \operatorname{Var}[\hat{\beta}] = \sigma^2 (X^TX)^{-1}.$$

$$\hat{y} = X\hat{\beta} = P^X y \qquad \hat{\epsilon} = y - \hat{y} = (I_n - P^X)y = P^{X^\perp} y$$

$$\mathbb{E}[\hat{\epsilon}] = 0 \qquad \operatorname{Var}[\hat{\epsilon}] = \sigma^2 P^{X^\perp}$$

$$\mathbb{E}[\hat{y}] = X\beta \qquad \operatorname{Var}[\hat{y}] = \sigma^2 P^X$$

$$\operatorname{Cov}[\hat{\epsilon}; \hat{y}] = 0.$$

$$\hat{\sigma}^2 = \frac{1}{n-p}\|\hat{\epsilon}\|^2 \qquad \mathbb{E}[\hat{\sigma}^2] = \sigma^2$$
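These matrix formulas translate directly into a few lines of R. Below is a minimal sketch (the simulated `X`, `y` and the use of `solve`/`crossprod` are illustrative choices, not part of the course material):

## OLS quantities computed from the matrix formulas (illustrative sketch)
set.seed(1)
n <- 50; p <- 3
X <- cbind(1, runif(n), runif(n))        # design matrix with an intercept column
beta <- c(-1, 3, -1)
y <- X %*% beta + rnorm(n)

beta_hat   <- solve(crossprod(X), crossprod(X, y))  # (X^T X)^{-1} X^T y
y_hat      <- X %*% beta_hat                        # fitted values P^X y
eps_hat    <- y - y_hat                             # residuals P^{X perp} y
sigma2_hat <- sum(eps_hat^2) / (n - p)              # unbiased estimator of sigma^2

## same coefficients as lm()
cbind(by_hand = drop(beta_hat), lm = coef(lm(y ~ X - 1)))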

Questions

  • Confidence intervals for $\hat{\beta}$ and $\hat{\sigma}^2$?

  • Prediction intervals?

  • Assumptions on the moments are not enough.
    We need assumptions on the specific distribution of the $\epsilon_i$.

  • Most common assumption: the $\epsilon_i$ are Gaussian.

Gaussian Model

Gaussian Model

Model:

$$y = X\beta + \epsilon$$

  • $y$ random vector of $n$ responses
  • $X$ non-random $n \times p$ matrix of predictors
  • $\epsilon$ random vector of $n$ errors
  • $\beta$ non-random, unknown vector of $p$ coefficients

Assumptions:

  • (H1) $\operatorname{rg}(X) = p$
  • (H2) $\epsilon \sim \mathcal{N}(0, \sigma^2 I_n)$

Reminder - Notations

The column and row vectors of $X$ are written as:

$$X = \begin{pmatrix} \mathbf{x}^1 & \cdots & \mathbf{x}^p \end{pmatrix} = \begin{pmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_n \end{pmatrix}$$

Hence: $y_i = \mathbf{x}_i \beta + \epsilon_i$ for all $1 \leq i \leq n$.

Maximum Likelihood Estimators

Distribution of y

  • Model: $$y_i = \mathbf{x}_i\beta + \epsilon_i \quad \text{with} \quad \epsilon_i \overset{iid}{\sim} \mathcal{N}(0, \sigma^2) \quad \text{for} \quad 1 \leq i \leq n$$

  • Distribution of $y_i$: $$y_i \overset{ind}{\sim} \mathcal{N}(\mathbf{x}_i\beta, \sigma^2) \quad \text{for} \quad 1 \leq i \leq n$$

Likelihood of y

  • Distribution of $y_i$: $$y_i \overset{ind}{\sim} \mathcal{N}(\mathbf{x}_i\beta, \sigma^2) \quad \text{for} \quad 1 \leq i \leq n$$
  • Likelihood of $y = (y_1, \dots, y_n)^T$: $$L(\beta, \sigma^2 \,|\, y_1, \dots, y_n) = \prod_{i=1}^n L(\beta, \sigma^2 \,|\, y_i) \quad [\text{ind.}]$$

and $L(\beta, \sigma^2 \,|\, y_i) = \; ?$

Likelihood of y

  • Distribution of $y_i$: $$y_i \overset{ind}{\sim} \mathcal{N}(\mathbf{x}_i\beta, \sigma^2) \quad \text{for} \quad 1 \leq i \leq n$$

  • Likelihood of $y = (y_1, \dots, y_n)^T$: $$L(\beta, \sigma^2 \,|\, y_1, \dots, y_n) = \prod_{i=1}^n L(\beta, \sigma^2 \,|\, y_i) \quad [\text{ind.}]$$

and

$$L(\beta, \sigma^2 \,|\, y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \mathbf{x}_i\beta)^2}{2\sigma^2}\right)$$

Likelihood of y

  • Likelihood of $y = (y_1, \dots, y_n)^T$: $$L(\beta, \sigma^2 \,|\, y_1, \dots, y_n) = \prod_{i=1}^n L(\beta, \sigma^2 \,|\, y_i) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \mathbf{x}_i\beta)^2}{2\sigma^2}\right) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mathbf{x}_i\beta)^2\right)$$

$$\log L(\beta, \sigma^2 \,|\, y_1, \dots, y_n) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mathbf{x}_i\beta)^2$$
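As a quick check, this log-likelihood can be coded directly and compared with the sum of Gaussian log-densities (a minimal sketch; the toy `X`, `y` and the function name `loglik` are illustrative):

## Gaussian log-likelihood written from the formula above (illustrative sketch)
loglik <- function(beta, sigma2, y, X) {
  n <- length(y)
  -n / 2 * log(2 * pi * sigma2) - sum((y - X %*% beta)^2) / (2 * sigma2)
}
## sanity check on toy data: matches the sum of Gaussian log-densities
set.seed(4)
X <- cbind(1, runif(10)); y <- X %*% c(-1, 3) + rnorm(10)
loglik(c(-1, 3), 1, y, X)
sum(dnorm(y, mean = X %*% c(-1, 3), sd = 1, log = TRUE))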

ML Estimators

  • The Maximum Likelihood estimators are: $$(\hat{\beta}_{ML}, \hat{\sigma}^2_{ML}) = \underset{(\beta, \sigma^2) \in \mathbb{R}^p \times \mathbb{R}_+}{\operatorname{argmax}} \; \log L(\beta, \sigma^2 \,|\, y)$$

with: $$\log L(\beta, \sigma^2 \,|\, y) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mathbf{x}_i\beta)^2$$

ML Estimators - $\beta$

$$\log L(\beta, \sigma^2 \,|\, y) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\underbrace{\sum_{i=1}^n (y_i - \mathbf{x}_i\beta)^2}_{\text{Sum of Squares } \|y - X\beta\|^2}$$

  • For any $\sigma^2 > 0$, $$\hat{\beta}_{ML} = \underset{\beta \in \mathbb{R}^p}{\operatorname{argmax}} \; \log L(\beta, \sigma^2 \,|\, y) = \underset{\beta \in \mathbb{R}^p}{\operatorname{argmin}} \; \|y - X\beta\|^2 = \hat{\beta}_{OLS}$$

For $\beta$, the ML estimator is the same as the OLS estimator.

ML Estimators - $\sigma^2$

$$\hat{\sigma}^2_{ML} = \underset{\sigma^2 \in \mathbb{R}_+}{\operatorname{argmax}} \; \log L(\hat{\beta}, \sigma^2 \,|\, y) = \underset{\sigma^2 \in \mathbb{R}_+}{\operatorname{argmax}} \; \left\{-\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mathbf{x}_i\hat{\beta})^2\right\} = \underset{\sigma^2 \in \mathbb{R}_+}{\operatorname{argmin}} \; \left\{\frac{n}{2}\log(2\pi\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^n \hat{\epsilon}_i^2\right\}$$

We get:

$$\hat{\sigma}^2_{ML} = \frac{1}{n}\sum_{i=1}^n \hat{\epsilon}_i^2 = \frac{\|\hat{\epsilon}\|^2}{n} = \frac{\|y - X\hat{\beta}\|^2}{n}$$

ML Estimators - $\sigma^2$ - Remarks

  • The ML estimator is different from the unbiased estimator of the variance we saw earlier.

$$\hat{\sigma}^2_{ML} = \frac{\|\hat{\epsilon}\|^2}{n} \neq \frac{\|\hat{\epsilon}\|^2}{n - p} = \hat{\sigma}^2$$

  • The ML estimator of the variance is biased: $$\mathbb{E}[\hat{\sigma}^2_{ML}] = \frac{n-p}{n}\sigma^2 \neq \sigma^2$$
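A small simulation makes the distinction concrete (illustrative sketch; the toy data and variable names are not from the course):

## Biased ML estimator vs unbiased estimator of the variance (sketch)
set.seed(2)
n <- 30
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x); p <- length(coef(fit))
sigma2_ml  <- mean(residuals(fit)^2)            # ||eps_hat||^2 / n       (biased)
sigma2_hat <- sum(residuals(fit)^2) / (n - p)   # ||eps_hat||^2 / (n - p) (unbiased)
c(ML = sigma2_ml, unbiased = sigma2_hat, lm = summary(fit)$sigma^2)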

Reminder - Gaussian Vectors

Reminder: Gaussian Distribution 1/4

Let $X$ be a Gaussian r.v. with expectation $\mu$ and variance $\sigma^2$: $X \sim \mathcal{N}(\mu, \sigma^2)$.

It admits a probability density function (pdf): $$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \quad \forall x \in \mathbb{R}.$$

Moments: $$\mathbb{E}[X] = \mu \qquad \operatorname{Var}[X] = \sigma^2$$

Reminder: Gaussian Vector 2/4

Let $X$ be a Gaussian vector with expectation $\mu$ and variance $\Sigma$: $X \sim \mathcal{N}(\mu, \Sigma)$.

It admits a probability density function (pdf): $$f(x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right) \quad \forall x \in \mathbb{R}^n.$$

Any linear combination of its coordinates is Gaussian.

Moments: $$\mathbb{E}[X] = \mu \qquad \operatorname{Var}[X] = \mathbb{E}[(X - \mu)(X - \mu)^T] = \Sigma$$

Reminder: Gaussian Vector 3/4

Property:
If $X \sim \mathcal{N}(\mu, \Sigma)$, then for any matrix $a$ and vector $b$ of compatible dimensions,
$Y = aX + b \sim \mathcal{N}(a\mu + b, \; a\Sigma a^T)$.

Property:
Let $X \sim \mathcal{N}(\mu, \Sigma)$, with $\Sigma$ invertible. Then $\Sigma^{-1}$ is symmetric, positive definite, and it is possible to find $\Sigma^{-1/2}$ invertible such that: $$[\Sigma^{-1/2}]^T \Sigma^{-1/2} = \Sigma^{-1}$$

Then: $$Y = \Sigma^{-1/2}(X - \mu) \sim \mathcal{N}(0, I).$$

Reminder: Gaussian Vector 4/4

Proof:
$$Y = \Sigma^{-1/2}(X - \mu)$$

Take $$a = \Sigma^{-1/2} \quad\text{and}\quad b = -\Sigma^{-1/2}\mu$$

Then: $$a\mu + b = \Sigma^{-1/2}\mu - \Sigma^{-1/2}\mu = 0$$

and: $$a\Sigma a^T = \Sigma^{-1/2}\Sigma[\Sigma^{-1/2}]^T = I.$$
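In R, a valid $\Sigma^{-1/2}$ can be built from a Cholesky factor (one possible choice among others). The sketch below standardizes a simulated Gaussian vector and checks that the result has roughly zero mean and identity variance; `MASS::mvrnorm` and the toy $\mu$, $\Sigma$ are illustrative choices.

## Standardizing a Gaussian vector with a matrix square root (sketch)
Sigma <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
mu <- c(1, -1)
set.seed(3)
X <- MASS::mvrnorm(10000, mu = mu, Sigma = Sigma)   # rows are samples
R <- chol(Sigma)                                    # Sigma = R^T R
Y <- t(solve(t(R), t(X) - mu))                      # Y = [R^T]^{-1} (X - mu)
round(colMeans(Y), 2)                               # approximately (0, 0)
round(var(Y), 2)                                    # approximately I_2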

Distribution of the Coefficients - $\sigma^2$ known

Distribution of $\hat{\beta}$

  • We already know the moments of the estimator:

$$\hat{\beta} = (X^TX)^{-1}X^Ty \qquad \mathbb{E}[\hat{\beta}] = \beta \qquad \operatorname{Var}[\hat{\beta}] = \sigma^2(X^TX)^{-1}.$$

  • As $y$ is a Gaussian vector and $\hat{\beta}$ is a linear transformation of $y$, $\hat{\beta}$ is also a Gaussian vector.

  • Hence, when the variance $\sigma^2$ is known, we get:

$$\hat{\beta} \sim \mathcal{N}\left(\beta, \; \sigma^2(X^TX)^{-1}\right).$$
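A short simulation illustrates this result by comparing the empirical covariance of $\hat{\beta}$ over repeated datasets with $\sigma^2(X^TX)^{-1}$ (illustrative sketch with a toy design):

## Empirical check of the distribution of beta_hat (sketch)
set.seed(9)
n <- 40; X <- cbind(1, runif(n)); beta <- c(1, 2); sigma <- 1
beta_hats <- replicate(5000, {
  y <- X %*% beta + rnorm(n, sd = sigma)
  drop(solve(crossprod(X), crossprod(X, y)))
})
round(var(t(beta_hats)), 3)              # empirical covariance of beta_hat
round(sigma^2 * solve(crossprod(X)), 3)  # theoretical sigma^2 (X^T X)^{-1}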

Cochran Theorem

Chi Squared Distribution

Definition:
Let $X_1, \dots, X_p$ be $p$ iid standard normal r.v.: $X_i \sim \mathcal{N}(0, 1)$. Then $$X = \sum_{i=1}^p X_i^2 \sim \chi^2_p$$

is a Chi squared r.v. with $p$ degrees of freedom.

Property:
Let $X$ be a Gaussian vector of size $p$ with expectation $\mu$ and variance $\Sigma$: $X \sim \mathcal{N}(\mu, \Sigma)$. Then, if $\Sigma$ is invertible, $$(X - \mu)^T \Sigma^{-1} (X - \mu) \sim \chi^2_p$$

is a Chi squared r.v. with $p$ degrees of freedom.

Chi Squared Distribution - Proof

$X$ is a Gaussian vector $X \sim \mathcal{N}(\mu, \Sigma)$ with $\Sigma$ invertible, so: $$Y = \Sigma^{-1/2}(X - \mu) \sim \mathcal{N}(0, I_p).$$

And: $$(X - \mu)^T \Sigma^{-1} (X - \mu) = Y^T Y = \sum_{i=1}^p Y_i^2$$

with $Y_1, \dots, Y_p$ being $p$ iid standard normal r.v.

Hence: $$(X - \mu)^T \Sigma^{-1} (X - \mu) = \sum_{i=1}^p Y_i^2 \sim \chi^2_p.$$
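The quadratic form property is easy to check by simulation, e.g. with `mahalanobis`, which computes $(x-\mu)^T\Sigma^{-1}(x-\mu)$ row by row (illustrative sketch, toy parameters):

## Simulation check of the chi-squared property (sketch)
set.seed(5)
p <- 3
mu <- c(1, 2, 3)
Sigma <- diag(p) + 0.5                          # a positive definite covariance
X <- MASS::mvrnorm(10000, mu = mu, Sigma = Sigma)
Q <- mahalanobis(X, center = mu, cov = Sigma)   # (X - mu)^T Sigma^{-1} (X - mu)
c(empirical = mean(Q), theoretical = p)         # mean of a chi^2_p is p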

Cochran Theorem

Theorem: Let

  • $Y$ a Gaussian vector $Y \sim \mathcal{N}(\mu, \sigma^2 I_n)$;
  • $\mathcal{M}$ a subspace of $\mathbb{R}^n$ of dimension $p$;
  • $P$ the orthogonal projection matrix on $\mathcal{M}$;
  • $P^\perp = I_n - P$ the orthogonal projection matrix on $\mathcal{M}^\perp$.

Then we have:

  1. $PY \sim \mathcal{N}(P\mu, \sigma^2 P)$ and $P^\perp Y \sim \mathcal{N}(P^\perp\mu, \sigma^2 P^\perp)$;
  2. $PY$ and $P^\perp Y$ are independent;
  3. $\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 \sim \chi^2_p$ and $\frac{1}{\sigma^2}\|P^\perp(Y - \mu)\|^2 \sim \chi^2_{n-p}$

Cochran Theorem - Proof - hints

Theorem:

  • $Y$ a Gaussian vector $Y \sim \mathcal{N}(\mu, \sigma^2 I_n)$;
  • $P$ the orthogonal projection matrix on $\mathcal{M}$ of dimension $p$;
  • $P^\perp = I_n - P$ the orthogonal projection matrix on $\mathcal{M}^\perp$.
  1. $PY \sim \mathcal{N}(P\mu, \sigma^2 P)$ and $P^\perp Y \sim \mathcal{N}(P^\perp\mu, \sigma^2 P^\perp)$;
  2. $PY$ and $P^\perp Y$ are independent;
  3. $\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 \sim \chi^2_p$ and $\frac{1}{\sigma^2}\|P^\perp(Y - \mu)\|^2 \sim \chi^2_{n-p}$

Hints:
1. $X \sim \mathcal{N}(\mu, \Sigma) \implies aX + b \sim \mathcal{N}(a\mu + b, a\Sigma a^T)$.
2. $P = U\Delta U^T$ with $UU^T = I_n$ and $\Delta = \begin{pmatrix} I_p & 0 \\ 0 & 0 \end{pmatrix}$.
2. (bis) $Z = U^TY \sim \mathcal{N}(\;\cdot\;, \;\cdot\;)$ and $\Delta Z = \begin{pmatrix} Z_p \\ 0_{n-p} \end{pmatrix}$

Cochran Theorem - Proof - (1)

From the linear property of Gaussian vectors: $$PY \sim \mathcal{N}\left(P\mu, \; P[\sigma^2 I_n]P^T\right)$$

but: $$P[\sigma^2 I_n]P^T = \sigma^2 PP^T = \sigma^2 PP \quad [\text{orthogonal}] \quad = \sigma^2 P \quad [\text{projection}]$$

Same for $P^\perp Y$.

Cochran Theorem - Proof - (2) - 1/2

As $P$ is an orthogonal projection matrix: $$P = U\Delta U^T \qquad UU^T = I_n \qquad \Delta = \begin{pmatrix} I_p & 0 \\ 0 & 0 \end{pmatrix}$$

Define: $$Z = U^TY \sim \mathcal{N}\left(U^T\mu, \; U^T[\sigma^2 I_n]U = \sigma^2 I_n\right).$$

Then: $$\Delta Z = \begin{pmatrix} Z_p \\ 0_{n-p} \end{pmatrix} \quad \text{independent from} \quad (I_n - \Delta)Z = \begin{pmatrix} 0_p \\ Z_{n-p} \end{pmatrix}$$

Cochran Theorem - Proof - (2) - 2/2

$$P = U\Delta U^T \qquad UU^T = I_n \qquad \Delta = \begin{pmatrix} I_p & 0 \\ 0 & 0 \end{pmatrix}$$

$$\Delta Z = \begin{pmatrix} Z_p \\ 0_{n-p} \end{pmatrix} \quad \text{independent from} \quad (I_n - \Delta)Z = \begin{pmatrix} 0_p \\ Z_{n-p} \end{pmatrix}$$

This leads to: $U\Delta Z = U\Delta U^TY = PY$
independent from $U(I_n - \Delta)Z = U(I_n - \Delta)U^TY = P^\perp Y$.

Cochran Theorem - Proof - (3) - 1/3

$$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 = \frac{1}{\sigma^2}\|U\Delta U^T(Y - \mu)\|^2 = \; ?$$

$$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 = \frac{1}{\sigma^2}\left[U\Delta U^T(Y - \mu)\right]^T\left[U\Delta U^T(Y - \mu)\right] = \frac{1}{\sigma^2}\left[\Delta U^T(Y - \mu)\right]^T U^T U \left[\Delta U^T(Y - \mu)\right] = \frac{1}{\sigma^2}\left(\Delta U^TY - \Delta U^T\mu\right)^T\left(\Delta U^TY - \Delta U^T\mu\right)$$

Cochran Theorem - Proof - (3) - 2/3

$$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 = \frac{1}{\sigma^2}\left(\Delta U^TY - \Delta U^T\mu\right)^T\left(\Delta U^TY - \Delta U^T\mu\right)$$

But: $$\Delta U^TY = \Delta Z = \begin{pmatrix} Z_p \\ 0_{n-p} \end{pmatrix} \quad \text{and} \quad \Delta U^T\mu = \begin{pmatrix} (U^T\mu)_p \\ 0_{n-p} \end{pmatrix}$$

Hence: $$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 = \frac{1}{\sigma^2}\left(Z_p - (U^T\mu)_p\right)^T\left(Z_p - (U^T\mu)_p\right)$$

Cochran Theorem - Proof - (3) - 3/3

$$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 = \frac{1}{\sigma^2}\left(Z_p - (U^T\mu)_p\right)^T\left(Z_p - (U^T\mu)_p\right) = \left(Z_p - (U^T\mu)_p\right)^T\left[\sigma^2 I_p\right]^{-1}\left(Z_p - (U^T\mu)_p\right)$$

But: $$Z \sim \mathcal{N}(U^T\mu, \sigma^2 I_n) \quad \text{hence} \quad Z_p \sim \mathcal{N}\left((U^T\mu)_p, \sigma^2 I_p\right)$$

Thanks to the previous lemma: $$\frac{1}{\sigma^2}\|P(Y - \mu)\|^2 \sim \chi^2_p$$

Same for $\frac{1}{\sigma^2}\|P^\perp(Y - \mu)\|^2 \sim \chi^2_{n-p}$.
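The statement can also be checked numerically: project a Gaussian vector on the orthogonal of a $p$-dimensional column space and compare the normalized squared norm with a $\chi^2_{n-p}$ (illustrative sketch with a toy design):

## Numerical check of Cochran's theorem (sketch)
set.seed(6)
n <- 20; p <- 2
X <- cbind(1, runif(n))                        # columns span a p-dimensional subspace
P <- X %*% solve(crossprod(X)) %*% t(X)        # orthogonal projection on M(X)
P_perp <- diag(n) - P
mu <- X %*% c(1, 2); sigma <- 2
Q <- replicate(5000, {
  Y <- mu + rnorm(n, sd = sigma)
  sum((P_perp %*% (Y - mu))^2) / sigma^2       # should follow a chi^2_{n-p}
})
c(empirical = mean(Q), theoretical = n - p)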

Distribution of $\hat{\sigma}^2$

Distribution of $\hat{\sigma}^2$

When the variance $\sigma^2$ is known, we get:

$$\frac{(n-p)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}$$

and

$\hat{\beta}$ and $\hat{\sigma}^2$ are independent.

Proof: Use Cochran Theorem

We have:

  • $Y$ a Gaussian vector $Y \sim \mathcal{N}(X\beta, \sigma^2 I_n)$;
  • $\mathcal{M}(X) = \operatorname{span}\{\mathbf{x}^1, \dots, \mathbf{x}^p\}$;
  • $P^X = X(X^TX)^{-1}X^T$ the orthogonal projection on $\mathcal{M}(X)$;
  • $P^{X^\perp} = I_n - P^X$

And:

  • $\hat{\beta} = (X^TX)^{-1}X^Ty$
  • $\hat{\sigma}^2 = \dfrac{\|\hat{\epsilon}\|^2}{n-p} = \dfrac{\|y - X\hat{\beta}\|^2}{n-p}$

Proof - Chi squared

$$\hat{\sigma}^2 = \frac{\|y - X\hat{\beta}\|^2}{n-p} = \frac{\|y - \hat{y}\|^2}{n-p} = \frac{\|y - P^Xy\|^2}{n-p} = \frac{\|P^{X^\perp}y\|^2}{n-p}$$

Hence: $$\frac{n-p}{\sigma^2}\hat{\sigma}^2 = \frac{1}{\sigma^2}\|P^{X^\perp}y\|^2$$

Moreover, $P^{X^\perp}X\beta = 0$, so $P^{X^\perp}y = P^{X^\perp}(y - X\beta)$, and from Cochran's Theorem: $$\frac{n-p}{\sigma^2}\hat{\sigma}^2 = \frac{1}{\sigma^2}\|P^{X^\perp}(y - X\beta)\|^2 \sim \chi^2_{n-p}.$$

Proof - Independence

$$\hat{\beta} = (X^TX)^{-1}X^Ty = (X^TX)^{-1}(X^TX)(X^TX)^{-1}X^Ty = (X^TX)^{-1}X^T\left(X(X^TX)^{-1}X^T\right)y = (X^TX)^{-1}X^TP^Xy$$

and $\hat{\sigma}^2 = \dfrac{1}{n-p}\|P^{X^\perp}y\|^2$

But $P^Xy$ and $P^{X^\perp}y$ are independent from Cochran's theorem.

Hence, $\hat{\beta}$ and $\hat{\sigma}^2$ are independent.
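Both conclusions can be illustrated with a quick simulation: the rescaled $\hat{\sigma}^2$ has the mean of a $\chi^2_{n-p}$, and $\hat{\sigma}^2$ is empirically uncorrelated with the coefficients, consistent with independence (illustrative sketch, toy design):

## Simulation: distribution of sigma2_hat and independence from beta_hat (sketch)
set.seed(7)
n <- 50; p <- 3; sigma <- 1.5
X <- cbind(1, runif(n), runif(n)); beta <- c(-1, 3, -1)
sims <- replicate(5000, {
  y <- X %*% beta + rnorm(n, sd = sigma)
  fit <- lm(y ~ X - 1)
  c(coef(fit)[2], summary(fit)$sigma^2)        # (beta_hat_1, sigma2_hat)
})
mean((n - p) * sims[2, ] / sigma^2)            # close to n - p, mean of a chi^2_{n-p}
cor(sims[1, ], sims[2, ])                      # close to 0, consistent with independence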

Distribution of the Coefficients - $\sigma^2$ unknown

Distribution of $\hat{\beta}$

When $\sigma^2$ is known:

$$\hat{\beta} \sim \mathcal{N}\left(\beta, \; \sigma^2(X^TX)^{-1}\right).$$

i.e. for $1 \leq k \leq p$:

$$\hat{\beta}_k \sim \mathcal{N}\left(\beta_k, \; \sigma^2\left[(X^TX)^{-1}\right]_{kk}\right)$$

i.e.

$$\frac{\hat{\beta}_k - \beta_k}{\sqrt{\sigma^2\left[(X^TX)^{-1}\right]_{kk}}} \sim \mathcal{N}(0, 1).$$

  • Problem: $\sigma^2$ is generally unknown.
  • Solution: replace $\sigma^2$ by $\hat{\sigma}^2$.

Distribution of $\hat{\beta}$

When $\sigma^2$ is unknown, for any $1 \leq k \leq p$: $$\frac{\hat{\beta}_k - \beta_k}{\sqrt{\hat{\sigma}^2\left[(X^TX)^{-1}\right]_{kk}}} \sim \mathcal{T}_{n-p}.$$

Notation: $$\hat{\sigma}^2_k = \hat{\sigma}^2_{\hat{\beta}_k} = \hat{\sigma}^2\left[(X^TX)^{-1}\right]_{kk}$$

Attention:
$$\left[(X^TX)^{-1}\right]_{kk} \neq \left[(X^TX)_{kk}\right]^{-1}$$
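The quantities $\hat{\sigma}^2_k$ are exactly the squared standard errors reported by `lm`; the sketch below recomputes them from $\hat{\sigma}^2\left[(X^TX)^{-1}\right]_{kk}$ on simulated data (data and variable names are illustrative):

## Standard errors of the coefficients from the formula (sketch)
set.seed(8)
n <- 100
x_1 <- runif(n); x_2 <- runif(n)
y <- -1 + 3 * x_1 - x_2 + rnorm(n)
fit <- lm(y ~ x_1 + x_2)
X <- model.matrix(fit); p <- ncol(X)
sigma2_hat <- sum(residuals(fit)^2) / (n - p)
se_hat <- sqrt(sigma2_hat * diag(solve(crossprod(X))))   # sqrt(sigma2_hat [(X^T X)^{-1}]_kk)
cbind(by_hand = se_hat, lm = summary(fit)$coefficients[, "Std. Error"])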

Reminder: Student Distribution

  • $Z \sim \mathcal{N}(0, 1)$
  • $X \sim \chi^2_p$
  • $Z$ and $X$ independent

Then $$T = \frac{Z}{\sqrt{X/p}} \sim \mathcal{T}_p$$

Distribution of $\hat{\beta}$ - Proof

$$\frac{\hat{\beta}_k - \beta_k}{\sqrt{\sigma^2\left[(X^TX)^{-1}\right]_{kk}}} \sim \mathcal{N}(0, 1) \quad\text{and}\quad \frac{(n-p)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}$$

$$\frac{\hat{\beta}_k - \beta_k}{\sqrt{\sigma^2\left[(X^TX)^{-1}\right]_{kk}}} \quad\text{and}\quad \frac{(n-p)\hat{\sigma}^2}{\sigma^2} \quad \text{independent}$$

Hence: $$\frac{\dfrac{\hat{\beta}_k - \beta_k}{\sqrt{\sigma^2\left[(X^TX)^{-1}\right]_{kk}}}}{\sqrt{\dfrac{(n-p)\hat{\sigma}^2}{\sigma^2} \Big/ (n-p)}} = \frac{\hat{\beta}_k - \beta_k}{\sqrt{\hat{\sigma}^2\left[(X^TX)^{-1}\right]_{kk}}} \sim \mathcal{T}_{n-p}$$

Confidence Intervals and Tests

Confidence intervals

$$\frac{\hat{\beta}_k - \beta_k}{\sqrt{\hat{\sigma}^2_k}} \sim \mathcal{T}_{n-p} \quad\text{and}\quad \frac{n-p}{\sigma^2}\hat{\sigma}^2 \sim \chi^2_{n-p}$$

With probability $1 - \alpha$: $$\beta_k \in \left[\hat{\beta}_k \pm t_{n-p}(1 - \alpha/2)\sqrt{\hat{\sigma}^2_k}\right] \qquad \sigma^2 \in \left[\frac{(n-p)\hat{\sigma}^2}{c_{n-p}(1 - \alpha/2)}; \frac{(n-p)\hat{\sigma}^2}{c_{n-p}(\alpha/2)}\right]$$

where $t_{n-p}(\cdot)$ and $c_{n-p}(\cdot)$ denote the quantile functions of the $\mathcal{T}_{n-p}$ and $\chi^2_{n-p}$ distributions.
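In R, the interval for $\beta_k$ is given by `confint`, and the interval for $\sigma^2$ can be computed from the $\chi^2_{n-p}$ quantiles with `qchisq` (illustrative sketch on toy data):

## Confidence intervals from the formulas (sketch)
set.seed(10)
n <- 100; x <- runif(n); y <- 1 + 2 * x + rnorm(n, sd = 0.5)
fit <- lm(y ~ x); p <- length(coef(fit)); alpha <- 0.05
confint(fit, level = 1 - alpha)                   # intervals for the beta_k
s2 <- sum(residuals(fit)^2) / (n - p)             # sigma2_hat
(n - p) * s2 / qchisq(c(1 - alpha / 2, alpha / 2), df = n - p)  # interval for sigma^2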

Tests

$$\text{Hypothesis:} \quad H_0: \beta_k = 0 \quad\text{vs}\quad H_1: \beta_k \neq 0$$

$$\text{Test Statistic:} \quad T_k = \frac{\hat{\beta}_k}{\sqrt{\hat{\sigma}^2_k}} \underset{H_0}{\sim} \mathcal{T}_{n-p}$$

$$\text{Rejection Region:} \quad \mathbb{P}[\text{Reject} \,|\, H_0 \text{ true}] \leq \alpha \quad\Longrightarrow\quad \mathcal{R}_\alpha = \left\{t \in \mathbb{R} \;\middle|\; |t| \geq t_{n-p}(1 - \alpha/2)\right\}$$

$$p\text{ value:} \quad p = \mathbb{P}_{H_0}\left[|T_k| > |T_k^{obs}|\right]$$
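The test statistic and the $p$ value can be recomputed by hand and compared with the `t value` and `Pr(>|t|)` columns of `summary(lm)` (sketch, reusing the fitted model `fit`, `n` and `p` from the previous sketch):

## t statistics and p values for H0: beta_k = 0, by hand (sketch)
tab <- summary(fit)$coefficients
t_k <- tab[, "Estimate"] / tab[, "Std. Error"]            # T_k = beta_hat_k / sigma_hat_k
p_val <- 2 * pt(abs(t_k), df = n - p, lower.tail = FALSE)
cbind(t_k, p_val, tab[, c("t value", "Pr(>|t|)")])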

Test - Null Distribution

Test - Do not reject $H_0$

Test - Reject $H_0$

Test - $p$ value

Simulated Dataset

Simulated Dataset

$$y_i = -1 + 3 x_{i1} - x_{i2} + \epsilon_i$$

set.seed(12890926)

## Predictors
n <- 100
x_1 <- runif(n, min = -2, max = 2)
x_2 <- runif(n, min = 0, max = 4)

## Noise
eps <- rnorm(n, mean = 0, sd = 1)

## Model sim
beta_0 <- -1; beta_1 <- 3; beta_2 <- -1
y_sim <- beta_0 + beta_1 * x_1 + beta_2 * x_2 + eps

Simulated Dataset - Fit

$$y_i = -1 + 3 x_{i1} - x_{i2} + \epsilon_i$$

fit <- lm(y_sim ~ x_1 + x_2)
summary(fit)
## 
## Call:
## lm(formula = y_sim ~ x_1 + x_2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.54413 -0.71088  0.00976  0.66096  1.98562 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.83037    0.19076  -4.353 3.33e-05 ***
## x_1          2.87065    0.08777  32.706  < 2e-16 ***
## x_2         -1.07506    0.08229 -13.064  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9761 on 97 degrees of freedom
## Multiple R-squared:  0.9376, Adjusted R-squared:  0.9363 
## F-statistic: 728.9 on 2 and 97 DF,  p-value: < 2.2e-16

Simulated Dataset - Wrong Fit

## unrelated noise variable x_3
x_3 <- runif(n, min = -4, max = 0)
## Fit
fit <- lm(y_sim ~ x_1 + x_2 + x_3); summary(fit)
## 
## Call:
## lm(formula = y_sim ~ x_1 + x_2 + x_3)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6257 -0.7069  0.0366  0.6483  1.9548 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.99755    0.25455  -3.919 0.000167 ***
## x_1          2.87506    0.08789  32.711  < 2e-16 ***
## x_2         -1.07733    0.08233 -13.086  < 2e-16 ***
## x_3         -0.08755    0.08826  -0.992 0.323698    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9762 on 96 degrees of freedom
## Multiple R-squared:  0.9382, Adjusted R-squared:  0.9363 
## F-statistic: 486.2 on 3 and 96 DF,  p-value: < 2.2e-16

Prediction

Predict a new point

  • We fitted $\hat{\beta}_1, \dots, \hat{\beta}_p$ on $\left((y_1, \mathbf{x}_1), \dots, (y_n, \mathbf{x}_n)\right)$
  • A new line of predictors $\mathbf{x}_{n+1}$ comes along. How can we guess $y_{n+1}$?
  • We use the same model: $y_{n+1} = \mathbf{x}_{n+1}\beta + \epsilon_{n+1}$
    with $\epsilon_{n+1} \sim \mathcal{N}(0, \sigma^2)$, independent from all the $(\epsilon_i)_{1 \leq i \leq n}$.
  • We predict $y_{n+1}$ with: $\hat{y}_{n+1} = \mathbf{x}_{n+1}\hat{\beta}$
  • Question: What is the error $\hat{\epsilon}_{n+1} = y_{n+1} - \hat{y}_{n+1}$?

Prediction Error - Known variance

The prediction error $\hat{\epsilon}_{n+1} = y_{n+1} - \hat{y}_{n+1}$ is such that: $$\hat{\epsilon}_{n+1} \sim \mathcal{N}\left(0, \; \sigma^2\left(1 + \mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T\right)\right)$$

Proof:

  • $\hat{\epsilon}_{n+1}$ is Gaussian as the difference of two independent Gaussian variables ($\hat{y}_{n+1}$ depends only on $\epsilon_1, \dots, \epsilon_n$, while $y_{n+1}$ depends on $\epsilon_{n+1}$).
  • We already know its mean and variance (see previous lesson).

Remark: $\mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T$ is a scalar.

Prediction Error - Unknown variance

We get: $$\frac{y_{n+1} - \hat{y}_{n+1}}{\sqrt{\hat{\sigma}^2\left(1 + \mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T\right)}} \sim \mathcal{T}_{n-p}$$

Proof:

$$\frac{y_{n+1} - \hat{y}_{n+1}}{\sqrt{\hat{\sigma}^2\left(1 + \mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T\right)}} = \frac{\dfrac{y_{n+1} - \hat{y}_{n+1}}{\sqrt{\sigma^2\left(1 + \mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T\right)}}}{\sqrt{\dfrac{(n-p)\hat{\sigma}^2}{\sigma^2} \Big/ (n-p)}}$$

Prediction - Confidence Interval

With probability $1 - \alpha$: $$y_{n+1} \in \left[\hat{y}_{n+1} \pm t_{n-p}(1 - \alpha/2)\sqrt{\hat{\sigma}^2\left(1 + \mathbf{x}_{n+1}(X^TX)^{-1}\mathbf{x}_{n+1}^T\right)}\right]$$
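On the simulated dataset of the previous section, this interval is what `predict(..., interval = "prediction")` returns; the sketch below recomputes it from the formula for one hypothetical new point $(x_1, x_2) = (1, 2)$, reusing `y_sim`, `x_1`, `x_2` from the simulation chunk above:

## Prediction interval: predict() vs the formula (sketch)
fit <- lm(y_sim ~ x_1 + x_2)
x_new <- data.frame(x_1 = 1, x_2 = 2)
predict(fit, newdata = x_new, interval = "prediction", level = 0.95)
## by hand
X <- model.matrix(fit); n <- nrow(X); p <- ncol(X)
x_vec <- c(1, 1, 2)                                  # (intercept, x_1, x_2)
s2 <- sum(residuals(fit)^2) / (n - p)
y_new <- sum(x_vec * coef(fit))
half <- qt(0.975, df = n - p) *
  sqrt(s2 * (1 + drop(t(x_vec) %*% solve(crossprod(X)) %*% x_vec)))
c(fit = y_new, lwr = y_new - half, upr = y_new + half)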

Advertising

Advertising

fit_all <- lm(sales ~ TV + radio + newspaper, data = ad)
summary(fit_all)
## 
## Call:
## lm(formula = sales ~ TV + radio + newspaper, data = ad)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8277 -0.8908  0.2418  1.1893  2.8292 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.938889   0.311908   9.422   <2e-16 ***
## TV           0.045765   0.001395  32.809   <2e-16 ***
## radio        0.188530   0.008611  21.893   <2e-16 ***
## newspaper   -0.001037   0.005871  -0.177     0.86    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.686 on 196 degrees of freedom
## Multiple R-squared:  0.8972, Adjusted R-squared:  0.8956 
## F-statistic: 570.3 on 3 and 196 DF,  p-value: < 2.2e-16
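The same machinery applies to the advertising fit; for instance, 95% confidence intervals for the coefficients are obtained with `confint` (a sketch, output not shown):

confint(fit_all, level = 0.95)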