OK, so welcome to lecture 4 of the module.
In this lecture we're looking at nonlinear regression.
We'll be dropping the assumption of linearity, so, as I said last week, we're looking at Section 4.2 of the main textbook.
So here's a plan of what we'll do in this session. First of all, I'll present a couple of examples of nonlinear models, as a kind of motivation for why you might want to move away from a linear model.
And then we'll cover some situations where you can have a nonlinear model but essentially use ordinary least squares, a method for linear models.
And then we move on to nonlinear least squares: we'll introduce it formally, compute the first-order conditions, look at the statistical properties and the implications for F testing, and then we'll move on to the Lagrange multiplier test.
That test is based on nonlinear least squares, and that will be it. So that's the plan; we start by looking at some examples.
So this is the usual linear model; that's what we're moving away from in this lecture. As the first example, we look at a model for coffee sales. One of the variables in this model is the deal rate D, and as D goes up you're having higher and higher price reductions. So a higher value for D we would expect to lead to a greater quantity of coffee sales. If D is 1.05, it means a 5% price reduction; if D is 1.15, it means a 15% price reduction. And generally, as the price is reduced, the quantity demanded goes up.
So that's the basic idea. However, there's more that we could ask there.
So we could specify the model as linear in one of these two ways. This one is linear in Q and D; this one is linear in the logs, so sometimes it's called a log-linear model, sometimes a log-log model.
I have a bit of discussion down here about the terminology, but both of these models can be estimated using ordinary least squares: either using Q and D as the variables, or log Q and log D.
And if we use the log specification, there's a nice interpretation of the parameter beta two: it's the elasticity of Q with respect to D.
So we can show that beta two is equal to this, where you essentially have the ratio of a small change in Q relative to Q, to a small change in D relative to D. So it's a measure of the elasticity of Q with respect to D.
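To spell that claim out, here is a minimal sketch of the standard derivation, assuming the slide's log-log model is $\log Q = \beta_1 + \beta_2 \log D + \varepsilon$ (the exact notation on the slide may differ):

```latex
\log Q = \beta_1 + \beta_2 \log D + \varepsilon
\quad\Longrightarrow\quad
\beta_2 = \frac{d \log Q}{d \log D} = \frac{dQ/Q}{dD/D}
```

That is, beta two is the ratio of the proportional change in Q to the proportional change in D, which is exactly the elasticity.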
One issue with Model 1 is that the elasticity is constant. So you have a nice interpretation, but the elasticity is a constant, and perhaps, if I go back a slide:
as the deal rate gets higher and higher, eventually all the people who want to buy coffee have bought coffee, and no further price reduction is going to get any more people to buy. In other words, the elasticity of demand may not be constant; it may actually be decreasing as D increases, so at higher values of D the elasticity may be lower than at lower values of D. That's not something you can model or test within this relatively simple linear framework, because beta two is constrained to be a constant, and the thing on the right-hand side, the elasticity, is equal to beta two, so the elasticity is a constant.
So you can actually make nonlinear models with elasticity-of-demand interpretations where the elasticity isn't constant, and then you can ask other questions, like the one we just asked: is it the case that demand is less sensitive at larger values of D? Or, equivalently, at larger values of log D?
So this is a model from the textbook. You can have a look at Chapter 4, Section 4.2.
So this is a nonlinear model: we have a parameter appearing as an exponent here, so it's clearly nonlinear in the parameters. This is log Q, this is D.
But we can take the derivative and find the elasticity of demand again, the elasticity of Q with respect to D. So if we take the derivative, via the chain rule, the elasticity is this.
Well, this is clearly now depending on D: whether the elasticity goes up or down with D depends on the values of beta two and beta three, but it clearly depends on D. So the elasticity, as we've defined it, depends on D. We can estimate the model if we could somehow transform it to a linear model, which I don't believe we can do here, or if we use a method for nonlinear models. The method we look at today is nonlinear least squares, and it allows you to estimate beta one, beta two, and beta three directly, using the nonlinear specification of the model.
Once you've got the estimates of the parameters, you have the estimated function for the elasticity, and then you can test hypotheses about the parameters beta two and beta three to answer questions about the elasticity.
So suppose beta three is equal to 0, or tends to 0 (beta three is in the denominator here). Then this elasticity is tending to beta two, and in the limit we essentially have constant elasticity. So we can test for beta three equal to 0, and by doing that we're testing for constant elasticity again.
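A small numerical sketch of this limit. I'm assuming here that the slide's model is of the Box-Cox type, log Q = b1 + b2*(D^b3 - 1)/b3, which would put beta three in the denominator as described and give an elasticity of b2*D^b3; the exact form on the slide may differ, so treat this as an illustration only.

```python
import numpy as np

def elasticity(d, beta2, beta3):
    # Elasticity implied by the assumed specification
    # log Q = b1 + b2 * (D**b3 - 1) / b3:
    # d(log Q)/d(log D) = b2 * D**b3   (by the chain rule)
    return beta2 * d ** beta3

beta2 = -2.0
d = 1.15  # deal rate corresponding to a 15% price reduction
for beta3 in [1.0, 0.5, 0.1, 0.001]:
    print(beta3, elasticity(d, beta2, beta3))
# As beta3 shrinks towards 0, D**beta3 -> 1, so the elasticity
# tends to beta2: the constant-elasticity (log-log) case is the limit.
```

So testing beta three = 0 is exactly a test of constant elasticity within this family.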
We could also test whether demand is elastic, in other words whether the elasticity is greater than one or less than one, and so on.
So the second example is a model for the proportion of food expenditure in household income. That proportion is the dependent variable, called Y here, and as explanatory variables we have the household's total consumption expenditure and the household size.
So we may expect, for example, that as total household consumption expenditure goes up, food expenditure might go up linearly with it. Or perhaps, as total consumption expenditure goes up, none of it, or very little, is going on food expenditure; perhaps it's being spent on other things instead.
So in a nonlinear model, we can potentially test that.
So we propose this model down here. X3 we're expecting to have a positive effect on Y, but let's look at X2 in particular.
So if beta two is equal to 0, then we don't have X2 in the model; in the restricted model it drops out. That would be the case where an increase in total consumption expenditure isn't affecting food expenditure.
And this case here, beta three equal to 1, is where, provided beta two is not equal to 0, food expenditure is increasing linearly in X2.
So if beta three is equal to 1, then we have a linear relationship between X2 and Y.
So there's more that can be said in these two cases, but hopefully you can see that we can potentially test these sorts of hypotheses, a linear effect versus no effect, in this context with a nonlinear model. We could also test for the sign of beta three, and so on.
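For concreteness, one specification consistent with these remarks would be the following; the exact form and parameter numbering are my assumption, so do check it against the slides:

```latex
y_i = \beta_1 + \beta_2\, x_{2i}^{\beta_3} + \beta_4\, x_{3i} + \varepsilon_i
```

Under this form, $\beta_2 = 0$ removes $x_2$ from the model entirely, while $\beta_3 = 1$ (with $\beta_2 \neq 0$) makes $y$ linear in $x_2$, matching the two hypotheses just discussed.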
So, as a prelude to going into nonlinear least squares properly, we focus on an alternative way of representing a linear model. We're used to this format for the model in the module so far, where we have a vector of observations for the dependent variable, a matrix whose rows are the observations on the regressors, and, on the right-hand side, a vector of coefficients beta and a vector of error terms. But we can pick out a single observation from this set of observations. If we pick out the i-th element of Y, so the i-th observation, and also the i-th row of X, calling that x_i, then x_i'beta gives us the part of X beta that corresponds to y_i. So we have y_i, the i-th observation of Y; we're using the i-th row of X, multiplied by beta; and we also have the i-th error term.
So x_i is a K x 1 vector: it's the i-th row of the n x K matrix X. And obviously this model is linear in the parameters beta. It's just the normal linear model, represented in a different way for the i-th observation.
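The equivalence of the two representations can be checked numerically; a minimal sketch with made-up data (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 5, 3
X = rng.normal(size=(n, K))        # n observations on K regressors
beta = np.array([1.0, -0.5, 2.0])  # coefficient vector
eps = rng.normal(size=n)           # error terms

y = X @ beta + eps                 # matrix form: y = X beta + eps

i = 2
x_i = X[i]                         # i-th row of X, a K-vector
# observation-wise form: y_i = x_i' beta + eps_i
assert np.isclose(y[i], x_i @ beta + eps[i])
```

So picking out row i of the matrix form gives exactly the observation-wise equation.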
So a nonlinear model could be written like this, where f is some function relating the coefficients and the i-th observation of the X data. In our linear specification, f is just a linear combination of the x's, with the betas as the coefficients, but f could be nonlinear. So a nonlinear model for the i-th observation can be written like this; it's in the same format we have here, except that f is now a particular function, a mapping of x_i and beta.
This is more general, and sometimes it's possible to write a nonlinear function as a linear combination of functions of the regressors. So here we have a nonlinear function, but perhaps we can write it in this format, where we have something that's linear in the parameters. If we now write y_i as this plus the error term, replacing it with what we have on the right-hand side here, then Y, with regressors F1, F2, up to FK, can be estimated by a linear method, by ordinary least squares. We've transformed the regressors: at each observation i we have a vector of regressors, but they're now these possibly nonlinear functions of the original ones. However, we're still linear in the parameters, so the parameters beta one to beta K can be estimated by ordinary least squares by regressing Y on F1 to FK. Sometimes that's possible, and then we can avoid having to use nonlinear least squares.
So here's an example: we have a function that's clearly nonlinear in the regressors. However, we can define F1 to be 1 and F2 to be 1/x2, and then this is y_i equal to beta one F1 plus beta two F2. So we just regress Y on the constant and F2, this new regressor, and we can estimate that by ordinary least squares.
So, less formally, we're just defining a new variable z_i equal to 1/x_2i and then regressing Y on a constant and z_i. You may have seen this idea before, but formally what we're doing is splitting up a nonlinear function into a linear combination of transformed regressors.
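This transform-then-OLS idea can be sketched as follows. The data and parameter values are made up for illustration; the point is only that OLS on the transformed regressor recovers the parameters of a model that is nonlinear in x2 but linear in the betas.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x2 = rng.uniform(1.0, 5.0, size=n)
beta1, beta2 = 2.0, 3.0
# Model nonlinear in the regressor: y = b1 + b2 / x2 + noise
y = beta1 + beta2 / x2 + 0.1 * rng.normal(size=n)

# Define the transformed regressor z = 1/x2; the model is now
# linear in (beta1, beta2) with regressors F1 = 1 and F2 = z.
z = 1.0 / x2
F = np.column_stack([np.ones(n), z])
est, *_ = np.linalg.lstsq(F, y, rcond=None)
print(est)  # estimates close to the true (beta1, beta2) = (2.0, 3.0)
```

So no nonlinear optimisation is needed here; a single OLS regression on the transformed variable does the job.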
OK, another situation where we can avoid nonlinear least squares is where we can transform the original model. So here we have a Cobb-Douglas production function, usually seen in macroeconomics. We can see it's nonlinear: the parameters we'd like to estimate are in the exponents. However, if we take the log of both sides of this theoretical model, we get something that's linear in the parameters, and then we can just add an error term.
So if we add an error term to this, it becomes our statistical model, and we can just regress log Y on a constant, which gives an estimate of log A, and on log K and log L, and then we get estimates of alpha and beta from least squares.
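The log transformation of the Cobb-Douglas model can be sketched like this, with simulated data and made-up parameter values (note this assumes a multiplicative error, so that it becomes additive after taking logs):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
K = rng.uniform(1.0, 10.0, size=n)   # capital
L = rng.uniform(1.0, 10.0, size=n)   # labour
A, alpha, beta = 2.0, 0.3, 0.6
# Cobb-Douglas with multiplicative error: Y = A * K^alpha * L^beta * exp(eps)
Y = A * K**alpha * L**beta * np.exp(0.05 * rng.normal(size=n))

# Taking logs gives a model linear in the parameters:
#   log Y = log A + alpha * log K + beta * log L + eps
Xmat = np.column_stack([np.ones(n), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(Xmat, np.log(Y), rcond=None)
logA_hat, alpha_hat, beta_hat = coef
print(np.exp(logA_hat), alpha_hat, beta_hat)  # close to (2.0, 0.3, 0.6)
```

So a model that looks nonlinear is handled by ordinary least squares after a simple transformation of both sides.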
In some cases, obviously, it's not possible to use either of those methods, and then we need to minimize the sum of squared residuals directly, using the residuals we get from the nonlinear model. So if we estimate beta, we'll have the predicted values for Y given the nonlinear model, and the difference between the true value and the predicted value using our nonlinear least squares estimate of beta will be the residual. So in nonlinear least squares we want to choose beta in order to minimize the sum of squared residuals, where y_i here is just a single observation.
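The minimisation just described can be sketched numerically. This is only an illustration, not the formal treatment we'll develop: I use scipy's generic `least_squares` routine on simulated data from a model of the form y = b1 + b2*x^b3, similar in spirit to the examples above; the function form, data, and starting values are all made up.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(1.0, 10.0, size=n)
b_true = np.array([1.0, 0.5, 1.5])
y = b_true[0] + b_true[1] * x ** b_true[2] + 0.1 * rng.normal(size=n)

def residuals(b):
    # e_i(b) = y_i - f(x_i, b); nonlinear least squares chooses b
    # to minimise the sum of squared residuals sum_i e_i(b)**2
    return y - (b[0] + b[1] * x ** b[2])

fit = least_squares(residuals, x0=np.array([0.5, 1.0, 1.0]))
print(fit.x)  # estimates close to b_true
```

Unlike OLS, there's no closed-form solution here; the minimiser is found iteratively from a starting value, which is why the formal first-order conditions and statistical properties need the separate treatment we turn to next.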