What to expect from Econometrics 1

Christiaan Cakici
26 November 2018

The bachelor's programme in Econometrics consists of courses in a variety of fields, including economics, computer programming and mathematics. It might seem strange that students first encounter econometrics courses only in the second year of their bachelor. The reason is that advanced knowledge of probability theory and statistics is required. As a result, first-year students might not have a clear picture of what exactly econometrics entails. The purpose of this guide is to give a short introduction to the course Econometrics 1 and therefore to econometrics in general.

Let us start with a primer on probability theory and statistics. Statistical estimators can have many properties, but in general an estimator is considered “good” if it has at least the following properties: unbiasedness, consistency and efficiency. These concepts will be explained in depth in the course Probability Theory and Statistics 3. Here we focus only on the intuitive interpretation of these desirable properties. An estimator is unbiased (Dutch: zuiver) if its expectation is equal to the true value, consistent if the probability that it lies arbitrarily close to the true value converges to 1 as the sample size goes to infinity, and efficient if it has the smallest variance among the estimators it is compared with (typically all unbiased estimators). The intuitive idea behind these properties is as follows: unbiasedness means that if you get to guess many times, your guess is correct on average; consistency means that if you are given all the information in the world, your estimator will be accurate; and efficiency means that the estimator is more tightly concentrated around the true value than other possible estimators.
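
The three properties can be made tangible with a small simulation. The sketch below is only an illustration under assumptions that are not part of the course material: the data are drawn from a normal distribution with a hypothetical true mean of 2, and the sample mean is compared with the sample median as two candidate estimators of that mean.

```python
import numpy as np

def simulate_estimators(n, reps=2000, mu=2.0, sigma=1.0, seed=0):
    """Draw `reps` samples of size `n` and estimate mu on each of them.

    mu, sigma, reps and seed are illustrative choices, not course values.
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(mu, sigma, size=(reps, n))
    means = samples.mean(axis=1)          # estimator 1: sample mean
    medians = np.median(samples, axis=1)  # estimator 2: sample median
    return means, medians

for n in (10, 100, 1000):
    means, medians = simulate_estimators(n)
    # Unbiasedness: the average of the estimates is close to mu = 2.
    # Consistency: the spread of the estimates shrinks as n grows.
    # Efficiency: for normal data the mean varies less than the median.
    print(n, means.mean(), means.var(), medians.var())
```

In this setup both estimators are centred on the true value, but the variance of the sample mean is visibly smaller than that of the median at every sample size, which is what "more efficient" means here.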

In the course Econometrics 1 you begin by deriving the OLS (Ordinary Least Squares) estimator. The OLS estimator is basically a formula that fits a straight line through data points such that the sum of the squared vertical distances (i.e. the squared errors) between the line and the data points is minimized. You take the square of each distance because otherwise positive and negative errors would cancel each other out.
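
As a rough sketch of what that formula does, the snippet below fits a line to simulated data using the standard closed-form solution, which solves the normal equations (X'X)b = X'y. The data-generating line (intercept 1, slope 0.5), the noise level and the function name `ols` are assumptions made for illustration only.

```python
import numpy as np

def ols(X, y):
    """Return the coefficients that minimise the sum of squared residuals."""
    # Solve the normal equations (X'X) b = X'y instead of inverting X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=200)  # hypothetical true line plus noise

X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
beta_hat = ols(X, y)
residuals = y - X @ beta_hat

print(beta_hat)         # estimated intercept and slope
print(residuals.sum())  # numerically zero when an intercept is included
```

The residuals sum to (numerically) zero, which shows why minimising the plain sum of errors would not work: positive and negative errors cancel, so the sum says nothing about how well the line fits.
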
As discussed in the paragraph above, we want an estimator to be unbiased, consistent and efficient. So does the OLS estimator have these properties? This question cannot be answered in general, because the properties depend on the data. A very elegant solution to this problem is to ask ourselves: “Which assumptions do we have to make about the data such that the OLS estimator is unbiased, consistent and efficient?” These assumptions are known as the Classical Linear Regression Model (CLRM) assumptions. Making these assumptions might sound like a self-fulfilling prophecy (because we assume our estimator is what we want it to be), but this is not the case. The harsh truth is that the dataset you are working with almost always violates at least one of the CLRM assumptions. Why would you make assumptions if you already know they will be violated in practice? It turns out that often a transformation exists such that the transformed model does satisfy the CLRM assumptions. Common transformations are pre-multiplying the model by a matrix (a linear transformation, taught in Mathematics 3) or taking the logarithm of your data.
To summarize, you first make assumptions such that you get the results you desire, then realize these assumptions will probably be violated once you start doing research in practice, and finally apply some magic to your data such that the assumptions hold after all. This might seem unnecessary and over-the-top, but it is of course impossible to discuss a new model for every different kind of dataset that exists. That is, this method of modelling is very general and can be applied to a wide range of datasets. Because of this general approach, there is a big difference between theory and practice. Both the theoretical part (the OLS estimator and the CLRM assumptions) and the practical part (violations of the assumptions and how to solve them) are taught in Econometrics 1, which might be one of the reasons this course is considered to be very difficult. Students who fail to understand the theoretical part of a certain topic will not be able to apply this knowledge in practice.
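
To give a feel for the "magic", here is a minimal sketch of the logarithm transformation. It assumes a hypothetical multiplicative model y = a * x^b * exp(error), which is not linear in x and y but becomes linear after taking logs: log(y) = log(a) + b * log(x) + error. The values a = 3 and b = 1.5 are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=500)
# Hypothetical multiplicative model: not a straight line in (x, y).
y = 3.0 * x**1.5 * np.exp(rng.normal(0, 0.2, size=500))

# After taking logs the model is linear, so OLS can be applied as usual.
X_log = np.column_stack([np.ones_like(x), np.log(x)])
coef, *_ = np.linalg.lstsq(X_log, np.log(y), rcond=None)
log_a_hat, b_hat = coef

print(np.exp(log_a_hat), b_hat)  # estimates of a and b
```

The same estimator is used as before; only the data it is fed have changed. That is the general pattern: keep the method, transform the model until the assumptions are plausible.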

Other topics discussed in Econometrics 1 are the detection of a violation of one of the assumptions and the proofs of the properties of the OLS estimator. Detection is done by performing statistical tests. We can again distinguish between a theoretical and a practical part: students are required to derive the tests and the distributions of the test statistics themselves, and to use these results to detect a violation in practice. The most important topic in the entire course is probably proving that an estimator is unbiased, consistent and efficient. You will first prove this for the OLS estimator under the CLRM assumptions, but as discussed before, these assumptions typically do not hold in practice. Hence you need to redo all the proofs for every possible violation (the CLRM consists of 7 assumptions), for all estimators discussed during the course and for several other cases. You always know how to begin a proof, since the first steps are similar to the basic case of the OLS estimator under the CLRM assumptions, but at some point you cannot make a certain step and need to get creative.
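
As an example of what such a detection test can look like, the sketch below checks for non-constant error variance (heteroskedasticity). The guide does not name specific tests, so this choice is an assumption: it uses the n * R-squared (Koenker) form of the Breusch-Pagan test, which regresses the squared OLS residuals on the regressors. The simulated data and the function name are hypothetical.

```python
import numpy as np

def lm_heteroskedasticity_stat(X, y):
    """n * R^2 from regressing squared OLS residuals on X (X includes a constant)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u2 = (y - X @ beta) ** 2                      # squared residuals
    gamma = np.linalg.lstsq(X, u2, rcond=None)[0]
    fitted = X @ gamma
    r2 = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
    return len(y) * r2

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1, 10, size=n)
X = np.column_stack([np.ones(n), x])
y_homo = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)        # constant error variance
y_hetero = 1.0 + 0.5 * x + rng.normal(0, 1, size=n) * x  # error variance grows with x

CRIT_5PCT = 3.841  # 5% critical value of a chi-squared distribution with 1 degree of freedom
stat_homo = lm_heteroskedasticity_stat(X, y_homo)
stat_hetero = lm_heteroskedasticity_stat(X, y_hetero)
for label, stat in (("constant variance", stat_homo), ("growing variance", stat_hetero)):
    print(label, round(stat, 2), "reject" if stat > CRIT_5PCT else "do not reject")
```

With one regressor besides the constant, the statistic is compared with a chi-squared distribution with 1 degree of freedom. Deriving why that distribution applies is the theoretical part; running the comparison on real data is the practical part.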

Luckily the lecturers (yes, plural: one “theoretical” teacher for the lectures and one “practical” teacher for the computer sessions) are very skilled and willing to jump through hoops to explain the material and answer your questions. To any student taking this course I recommend asking the teachers many questions and occasionally yelling into your pillow to release some of the frustration.