Basic Regression on Urban Population Growth and GDP per Capita
Continuing with Regression Modelling in Practice… If you have been following along with my work, you will know that I am interested in the relationship between urbanization and economic development, and that my general question is whether urbanization drives economic growth. In the past two courses, Data Analysis Tools and Data Management and Visualization, I established that there is a correlation between urban population and GDP per capita. For this assignment, my primary explanatory variable is the urban population growth rate and my response variable is GDP per capita; both figures are from 2010.

This is my code in Python:

    import pandas
    import numpy
    import seaborn
    import matplotlib.pyplot as plt
    import statsmodels.formula.api as smf
    import statsmodels.stats.multicomp as multi

    # Load the data and treat zeros as missing values
    gapminder = pandas.read_csv('Data1.csv', low_memory=False)
    gapminder['GDP2010'] = gapminder['GDP2010'].replace(0, numpy.nan)
    gapminder['GDPGrowth2010'] = gapminder['GDPGrowth2010'].replace(0, numpy.nan)
    gapminder['UrbanPop2010'] = gapminder['UrbanPop2010'].replace(0, numpy.nan)
    gapminder['UrbanPopGrowth2010'] = gapminder['UrbanPopGrowth2010'].replace(0, numpy.nan)

    # Keep only the variables of interest and drop rows with missing values
    gapminder = gapminder[['Country', 'UrbanPop2010', 'UrbanPopGrowth2010', 'GDP2010', 'GDPGrowth2010']]
    gapminder = gapminder.dropna()

    PopDes = gapminder['UrbanPopGrowth2010'].describe()
    print(PopDes)

    # Center the explanatory variable on its mean (.copy() avoids modifying a slice)
    RegData = gapminder[['Country', 'UrbanPopGrowth2010', 'GDP2010']].copy()
    RegData['UrbanPopGrowth2010'] = RegData['UrbanPopGrowth2010'] - RegData['UrbanPopGrowth2010'].mean()
    print(RegData.describe())

    # Fit the linear regression of GDP per capita on centered urban population growth
    UrbanReg = smf.ols(formula='GDP2010 ~ UrbanPopGrowth2010', data=RegData).fit()
    print(UrbanReg.summary())

    # Scatter plot with the fitted regression line
    seaborn.regplot(x='UrbanPopGrowth2010', y='GDP2010', fit_reg=True, data=RegData)
    plt.xlabel('Urban Population Growth …
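As a side note on the centering step in the code above (subtracting the mean from UrbanPopGrowth2010), here is a minimal sketch of what centering does to a simple linear fit. It uses made-up synthetic numbers, not the Gapminder data, and it uses numpy.polyfit rather than statsmodels just to keep the example small. The variable names growth and gdp are hypothetical stand-ins for the two columns.

```python
import numpy as np

# Synthetic, made-up data for illustration only (NOT the figures from Data1.csv).
rng = np.random.default_rng(0)
growth = rng.uniform(-1.0, 6.0, size=50)  # hypothetical urban growth rates (%)
gdp = 20000.0 - 2500.0 * growth + rng.normal(0.0, 3000.0, size=50)  # hypothetical GDP per capita

# Center the explanatory variable by subtracting its mean.
growth_centered = growth - growth.mean()

# Fit a straight line by least squares; polyfit returns [slope, intercept] for degree 1.
slope_raw, intercept_raw = np.polyfit(growth, gdp, 1)
slope_cen, intercept_cen = np.polyfit(growth_centered, gdp, 1)

print(f"raw:      slope={slope_raw:.2f}, intercept={intercept_raw:.2f}")
print(f"centered: slope={slope_cen:.2f}, intercept={intercept_cen:.2f}")
print(f"mean of response: {gdp.mean():.2f}")
```

In this sketch the slope is the same with or without centering. Only the intercept changes: after centering, it equals the mean of the response, so it can be read as the expected GDP per capita at the average urban population growth rate rather than at a growth rate of zero.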