Course outline
This is a course in the econometric analysis of discrete data. A huge variety of models is used in this context. We will focus on four that arguably form the foundation of the area: the fundamental model of binary choice (and a number of variants), models for ordered choices, models for count data, and the most basic model for multinomial choice, the multinomial logit model. The course will consist of a series of discussions of models and methods, each followed by a laboratory session that applies the techniques just discussed to ‘live’ data sets. Discussions will cover the topics listed below. Practicals will consist of directed exercises and student assignments to be completed singly or in groups. There is a home page for this course:
http://people.stern.nyu.edu/wgreene/DiscreteChoice2013.htm
All of the materials used in this course are available for download at this home page. Some of these will be copied to your lab computer. Others may be downloaded at your convenience.
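For orientation, the four model families named in the course description are commonly written in the standard textbook forms sketched below. The notation ($\mathbf{x}_i$, $\boldsymbol\beta$, $\mu_j$, $\lambda_i$) is an assumption chosen for illustration and is not taken from this outline; the course slides may use different symbols.

```latex
\begin{align*}
% Binary choice: F is the standard normal CDF (probit) or the logistic CDF (logit)
\Pr(y_i = 1 \mid \mathbf{x}_i) &= F(\mathbf{x}_i'\boldsymbol\beta) \\[4pt]
% Ordered choice: latent regression with thresholds \mu_j
y_i^* = \mathbf{x}_i'\boldsymbol\beta + \varepsilon_i, \qquad
y_i = j &\iff \mu_{j-1} < y_i^* \le \mu_j \\[4pt]
% Count data: Poisson regression with log-linear conditional mean
\Pr(y_i = k \mid \mathbf{x}_i) &= \frac{e^{-\lambda_i}\lambda_i^{k}}{k!}, \qquad
\lambda_i = \exp(\mathbf{x}_i'\boldsymbol\beta), \quad k = 0, 1, 2, \dots \\[4pt]
% Multinomial logit over alternatives j = 1, ..., J
\Pr(y_i = j \mid \mathbf{x}_{i1}, \dots, \mathbf{x}_{iJ}) &=
\frac{\exp(\mathbf{x}_{ij}'\boldsymbol\beta)}{\sum_{m=1}^{J} \exp(\mathbf{x}_{im}'\boldsymbol\beta)}
\end{align*}
```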
The main textbook reference for the course is Econometric Analysis, 7th ed., by Greene, W. (Prentice Hall, 2012). Six specific chapters are available on the course home page: 11, Panel Data; 12, Estimation Methods; 14, Maximum Likelihood; 15, Simulation Based Estimation; 17, Discrete Choice Models; and 18, Discrete Choices and Event Counts. A second reference is Applied Choice Analysis by Hensher, D., Rose, J. and Greene, W. (Cambridge University Press, 2005).
I will also draw on three surveys:
“Modeling Ordered Choices,” W. Greene (with D. Hensher), Cambridge University Press, 2010.
“Functional Form and Heterogeneity in Models for Count Data,” W. Greene, Foundations and Trends in Econometrics, 2007.
“Discrete Choice Modeling,” W. Greene, Palgrave Handbook of Econometrics, Vol. 2, 2009.
The received literature on discrete choice models is vast; one could easily compose a list of thousands of articles and base a full semester course on them. A few articles, chosen mainly to illustrate particular techniques or ideas, are:
“Economic Choices,” McFadden, D., American Economic Review, 2001. (Nobel Prize lecture)
“Mixed MNL Models for Discrete Response,” McFadden, D. and Train, K., Journal of Applied Econometrics, 2000.
“Convenient Estimators for the Panel Probit Model: Further Results,” Greene, W., Empirical Economics, 2003.
“Non-parametric regression for binary dependent variables,” Frolich, M., Econometrics Journal, 2006.
“The Mixed Logit Model: The State of Practice,” Hensher, D. and Greene, W., 2002.
“Choosing Between Conventional, Electric and LPG/CNG Vehicles in Single-Vehicle Households,” Hensher, D. and Greene, W., 2000.
“Deriving Willingness to Pay Estimates from Observation Specific Parameters,” Hensher, D., Greene, W. and Rose, J., 2004.
“A Latent Class Model for Discrete Choice Analysis: Contrast with Mixed Logit,” Greene, W. and Hensher, D., 2003.
“Finishing High School and Starting College,” Evans, W. and Schwab, R., QJE, 1995.
These articles are all posted on the course home page.
Topic Outline and Course Agenda
Session Topics
Day 1
1 Introduction to the course, methodology, software, modeling concepts, regression basics
2 Standard models for binary choice
Lab 1: Software. Using NLOGIT, Regression, Computations, Binary Choice Modeling
3 Analysis of binary choice: marginal effects, fit measures, prediction, hypothesis tests
4 Panel data models for binary choice, random effects, fixed effects, Mundlak formulation, incidental parameters problem, dynamic probit model
Lab 2: Binary Choice Models, Estimation, Testing Hypotheses, Prediction, Analysis, Panel Data Models
Day 2
5 Choice model extensions: endogenous variables, bivariate probit models, simultaneous equations, sample selection, multivariate probit model, marginal effects, prediction and analysis
6 Panel data, heterogeneity, simulation and latent class models
Lab 3: Model Building; Extended Binary Choice Models, Heterogeneity
7 Ordered choice models, ordered outcomes, estimation and inference, generalized models, recent developments
8 Models for count data: applications in health econometrics
Lab 4: Model Building for Ordered Choices, Poisson and Negative Binomial Models for Count Data
Day 3
9 The multinomial logit model, random utility models, IIA, nested logit modeling, fit and prediction, marginal effects, model simulation, heteroscedasticity, use of the MNL model
10 Extensions of the MNL model, heteroscedasticity, multinomial probit, nested logit
Lab 5: Multinomial logit models and multinomial choice models, multinomial probit
11 Heterogeneity in multinomial choice models, latent class and mixed logit models
12 Repeated observations, panel data, revealed vs. stated preference data
Lab 6: Multinomial choice models with random parameters, latent class models, stated and revealed preference data
Conclusion: Topics in discrete choice models; discussion, closing remarks
Followup: Student Project
Course Materials
The course materials include displays (PowerPoint presentations), a set of assignments, and several data sets. Each of the numbered sessions is accompanied by a set of slides, Part#-title.pptx. There are electronic copies of these on the computer you are using, and you have a hard (paper) copy of them in your course materials.
Part 1: Methodology
Part 2: Binary Choice; Estimation
Part 3: Binary Choice; Inference
Part 4: Panel Data Models for Binary Choice
Part 5: Bivariate and Multivariate Choice and Model Extensions
Part 6: Modeling Heterogeneity
Part 7: Ordered Choice
Part 8: Count Data Models
Part 9: Multinomial Logit Models
Part 10: Multinomial Logit Extensions; Nested Logit
Part 11: Heterogeneity in Multinomial Choice Models; Latent Class and Mixed Logit
Part 12: Stated Preference Data and Models
The software for the course is NLOGIT (Econometric Software, Inc., Version 5.0, 2012). The course will include several hands-on sessions (two each day) using NLOGIT. NLOGIT is installed on all the lab computers that we will use during this course. There is a short introduction to NLOGIT on the course home page that will get you started using the program. I will also give instruction on using the program during the labs.
The computations that we will do in these lab sessions are completely prepackaged. You do not actually need to learn how to use NLOGIT to do the labs; the purpose of the course is to learn about the models, not the software. We will use NLOGIT to illustrate the models and methods that we discuss in class. Six tutorial sets of slides show how to use the program, if you would like to work through them as we do the exercises. These are:
LabPart1-GettingStarted.pptx
LabPart2-BinaryChoice.pptx
LabPart3-BinaryChoiceExtensions.pptx
LabPart4-OrderedChoice&CountData.pptx
LabPart5-MultinomialChoice.pptx
LabPart6-StatedPreference.pptx
The six lab sessions will each cover a specific part of the course. The presentation is made in the form of a series of ‘Assignments,’ that is, a set of calculations that you will carry out to illustrate and learn about the topics we discuss in class. Hard copies of the assignments will be distributed with the course materials. Since the assignments involve large amounts of NLOGIT computation, I have packaged all of the commands you will need to carry them out in NLOGIT command files,
LabAssignment#.lim (e.g., LabAssignment1.lim)
You need only upload the relevant data then execute the scripts to carry out the assignments. You will then edit and change the scripts to experiment with the models and the program.
We will use several data sets in this course. The data files are in the form of ‘limdep project files,’ name.lpj. The file type .lpj is ‘registered’ with Windows on your computer, so you can launch NLOGIT and upload the data file by double clicking the project file name. The data files that we will be analyzing are the following:
dairy: Balanced panel data, 247 dairy farms, 6 years, 1482 observations. Milk output, 4 inputs.
amEx: American Express database, 13,444 observations. Contains data on several binary and count variables.
mnc:
a. Discrete choice simulated data. Panel data set on brand choices. 800 people, 4 choices, 8 replications.
b. Survey data on mode choice for travel between Sydney and Melbourne. 210 individual choices. Used for study of multinomial choice models.
healthcare: Unbalanced panel data, 7209 individuals, 1 to 7 observations. Many health care variables. Binary choice and count data and a self administered ordered choice on health satisfaction.
labor: Mroz’s 753 observations on women’s labor supply. Used to study binary choice and sample selection.
panelprobit: Panel data, 1270 firms, 5 years. Binary choice data on innovation behavior. From published study on FDI, Imports and innovation.
sprp: Mixed stated and revealed preference choices. 672 individuals. 4 choice tasks, one between 2 alternatives, three among 4 alternatives.
10th Summer School—EEG, Universidade do Minho, Braga, Portugal
Economic Policies Research Unit—NIPE