Handling multi-collinearity using principal component analysis with the panel data model

Keywords: fixed effect estimation, Hausman test, panel data, principal component analysis, robust regression


When designing a statistical model, applied researchers strive for estimators that are consistent, unbiased, and efficient. Labor productivity is an important economic indicator that is closely linked to economic growth, competitiveness, and living standards within an economy. This paper proposes a one-way error component panel data model for labor productivity. Multi-collinearity is one of the problems that can arise in panel data, and it was detected among the explanatory variables considered here. The principal component technique was therefore used to obtain new, uncorrelated components for estimation. The application of the panel data model focused on the impact of public capital, private capital stock, labor, and the state unemployment rate on gross state product. The analysis was based on three estimation methods: fixed effect, random effect, and pooling. The challenge is to obtain estimators with good properties under reasonable assumptions and to ensure that statistical inference is valid through robust standard errors. In the application, fixed effect estimation turned out to play a key role in the estimation of panel data models. Based on the results of hypothesis testing, the real-data results showed that the fixed effect model was more accurate than the random effect and pooling models. In addition, robust estimation was used to obtain more efficient estimators, since heteroscedasticity was confirmed.
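The workflow summarized in the abstract (replace collinear regressors with principal components, then estimate a fixed effect model) can be illustrated with a minimal sketch. This is not the authors' code or data: the panel is synthetic, the dimensions, coefficients, and number of retained components are arbitrary assumptions, and the helper names within_transform, principal_components, and fixed_effects_fit are hypothetical. Only NumPy is used.

```python
import numpy as np


def within_transform(a, groups):
    """Subtract each unit's mean: the 'within' (fixed effect) transformation."""
    a = np.asarray(a, dtype=float)
    out = a.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= a[mask].mean(axis=0)
    return out


def principal_components(X, n_components):
    """PCA on standardized regressors via the correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    return Z @ loadings, loadings, eigvals


def fixed_effects_fit(y, X, groups):
    """Within estimator: least squares on unit-demeaned data."""
    y_w = within_transform(y, groups)
    X_w = within_transform(X, groups)
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return beta


# Synthetic panel (assumption: 10 units observed over 8 periods).
rng = np.random.default_rng(0)
n_units, n_periods = 10, 8
groups = np.repeat(np.arange(n_units), n_periods)
n = groups.size

# Columns 0 and 1 are built to be almost perfectly collinear.
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.05 * rng.normal(size=n),
    base + 0.05 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
unit_effect = rng.normal(size=n_units)[groups]
y = unit_effect + X @ np.array([0.5, 0.5, 0.3, -0.2]) + 0.1 * rng.normal(size=n)

# Keep 3 of 4 components (arbitrary choice for illustration).
scores, loadings, eigvals = principal_components(X, n_components=3)
score_corr = np.corrcoef(scores, rowvar=False)

# Fixed effect regression of y on the retained component scores.
beta_pc = fixed_effects_fit(y, scores, groups)
print("eigenvalues:", np.round(eigvals, 3))
print("FE coefficients on components:", np.round(beta_pc, 3))
```

The component scores are mutually uncorrelated by construction, which is what removes the collinearity among the regressors. The coefficients in beta_pc refer to the components, not to the original variables, so interpreting them requires mapping back through the loadings.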



Author Biographies

Ahmed Hassen Youssef, Cairo University

Department of Applied Statistics and Econometrics

Engy Saeed Mohamed, Cairo University

Department of Applied Statistics and Econometrics

Shereen Hamdy Abdel Latif, Cairo University

Department of Applied Statistics and Econometrics



How to Cite
Youssef, A. H., Mohamed, E. S., & Abdel Latif, S. H. (2023). Handling multi-collinearity using principal component analysis with the panel data model. EUREKA: Physics and Engineering, (1), 177-188. https://doi.org/10.21303/2461-4262.2023.002582