R packages we use in this section
library(tidyverse)
library(stargazer)
Research Question
Theory: Social capital enhances local government’s performance.
Source: Robert Putnam (1994), Making Democracy Work: Civic Traditions in Modern Italy, Princeton, NJ: Princeton University Press.
Theory
Differences in local governments’ performance can be explained by the degree of social capital in each locality.
Social capital can be defined as “the networks of relationships among people who live and work in a particular society, enabling that society to function effectively”.
It involves the effective functioning of social groups through interpersonal relationships, a shared sense of identity, a shared understanding, shared norms, shared values, trust, cooperation, and reciprocity.
Social capital helps people cooperate with one another.
→ In areas with more social capital, people trust and cooperate with one another more, which leads to higher-quality government performance.
✔ Goldberg’s argument (1996)
Does gov_p differ by location?
Put the data file in the data folder in your RProject folder, then read it in and name it putnam.
putnam <- read_csv("data/putnam.csv")
names(putnam)
[1] "region" "gov_p" "cc" "econ" "location"
Data:
| Types of variables | Variables | Details |
|---|---|---|
| Outcome | gov_p | Performance of Italian local governments |
| Predictor | region | Abbreviation of Italian local governments |
| Predictor | cc | Civic Community Index |
| Predictor | econ | Economy Index (the larger, the better) |
| Predictor | location | Area dummy (north, south) |
DT::datatable(putnam)
putnam %>%
  ggplot(aes(x = location, y = gov_p, fill = location)) +
  geom_boxplot() +
  labs(x = "Location Dummy", y = "gov_p",
       title = "Government Performance in Italy by Location") +
  stat_smooth(method = lm, se = FALSE)
It looks like there is a clear difference between north and south.
Conduct a t-test (unpaired)
t.test(putnam$gov_p[putnam$location == "north"],
putnam$gov_p[putnam$location == "south"])
Welch Two Sample t-test
data: putnam$gov_p[putnam$location == "north"] and putnam$gov_p[putnam$location == "south"]
t = 6.8253, df = 14.552, p-value = 6.737e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
4.607777 8.808890
sample estimates:
mean of x mean of y
11.83333 5.12500
Result
・Average gov_p (North) = 11.833
・Average gov_p (South) = 5.125
・The difference (North − South = 6.708) is statistically significant at the 1% significance level (p-value = 6.737e-06)
→ As Goldberg (1996) argues, there is a clear difference in government performance between north and south.
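The same Welch test can also be written with the formula interface of t.test(), which avoids subsetting the outcome by hand. The sketch below is only an illustration: the data frame toy and its six values are hypothetical and are not the Putnam data.

```r
# Hypothetical toy data (NOT the Putnam data): three regions per area
toy <- data.frame(
  gov_p    = c(12, 11, 13, 5, 6, 4),
  location = c("north", "north", "north", "south", "south", "south")
)

# Formula interface: outcome ~ grouping variable.
# The Welch (unequal-variance) test is the default, as in the call above.
res <- t.test(gov_p ~ location, data = toy)

# Group means are reported in alphabetical order of the groups (north, south)
mean_north <- unname(res$estimate[1])
mean_south <- unname(res$estimate[2])
north_minus_south <- mean_north - mean_south

res$p.value          # p-value of the two-sided test
north_minus_south    # difference in means, north minus south
```

With the real data, the equivalent call would be t.test(gov_p ~ location, data = putnam).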
Does econ explain gov_p?
It seems that the economy (econ) is related to government performance (gov_p) in Italy.
However, it is not yet clear whether this holds in both the northern and the southern areas.
Draw a scatter plot between econ and gov_p
putnam %>%
ggplot(aes(econ, gov_p)) +
geom_point() +
theme_bw() +
labs(x = "econ", y = "gov_p",
title = "Economic situation and Government Performance in Italy") +
stat_smooth(method = lm, se = FALSE)
Run a simple regression of gov_p on econ.
model_1 <- lm(gov_p ~ econ, data = putnam)
summary(model_1)
Call:
lm(formula = gov_p ~ econ, data = putnam)
Residuals:
Min 1Q Median 3Q Max
-4.3386 -1.7733 0.0086 0.8336 5.5114
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0108 1.3847 2.174 0.043264 *
econ 0.5889 0.1200 4.909 0.000113 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.659 on 18 degrees of freedom
Multiple R-squared: 0.5724, Adjusted R-squared: 0.5487
F-statistic: 24.1 on 1 and 18 DF, p-value: 0.0001131
\[\widehat{gov_p}\ = 3.01 + 0.589econ\]
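To read the fitted line: each one-point increase in econ is associated with a 0.589-point increase in predicted gov_p. As a rough illustration, a region with econ = 10 (a value chosen only for this example) would have a predicted score of

\[\widehat{gov_p}\ = 3.01 + 0.589 \times 10 = 8.90\]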
Check the class of each variable in putnam.
str(putnam)
spec_tbl_df [20 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ region : chr [1:20] "Ab" "Ba" "Cl" "Cm" ...
$ gov_p : num [1:20] 7.5 7.5 1.5 2.5 16 12 10 11 11 9 ...
$ cc : num [1:20] 8 4 1 2 18 17 13 16 17 15.5 ...
$ econ : num [1:20] 7 3 3 6.5 13 14.5 12.5 15.5 19 10.5 ...
$ location: chr [1:20] "south" "south" "south" "south" ...
- attr(*, "spec")=
.. cols(
.. region = col_character(),
.. gov_p = col_double(),
.. cc = col_double(),
.. econ = col_double(),
.. location = col_character()
.. )
→ Change the class of location from character to numeric
→ Save the result as a new data frame named df2
df2 <- mutate(putnam,
              location = as.numeric(location == "north"))  # north = 1, south = 0
DT::datatable(df2)
To check whether the economy (econ) is related to government performance (gov_p) in both the northern and the southern areas, we need to include econ and location in the regression model simultaneously.
model_2 <- lm(gov_p ~ econ + location, data = df2)
Replace {r} with {r, results = "asis"} as the chunk option.
stargazer(model_2, type = "html")
| | Dependent variable: gov_p |
|---|---|
| econ | -0.019 |
| | (0.220) |
| location | 6.884*** |
| | (2.229) |
| Constant | 5.222*** |
| | (1.347) |
| Observations | 20 |
| R2 | 0.726 |
| Adjusted R2 | 0.694 |
| Residual Std. Error | 2.190 (df = 17) |
| F Statistic | 22.531*** (df = 2; 17) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
SRF for model_2:
\[\widehat{gov_p}\ = 5.222 - 0.019econ + 6.884location\]
We see that econ is not significantly related to gov_p once location is controlled for.
We see that location is significantly related to gov_p.
→ When location = 1 (that is, when the local government is located in the North), predicted government performance is higher by 6.884 points, holding econ constant.
By substituting location = 0 and 1, we get the following two regression functions:
Note: The two slopes are identical!
location = 0
\[\widehat{gov_p}\ = 5.22 - 0.019econ\]
location = 1
\[\widehat{gov_p}\ = 12.11 - 0.019econ\]
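The substitution can be checked with a few lines of arithmetic. This is a minimal sketch that copies the rounded model_2 coefficients from the table above; the helper predict_gov_p() and the value econ = 10 are hypothetical and used only for illustration.

```r
# Coefficients copied from the model_2 table (rounded to three decimals)
b0         <- 5.222   # Constant
b_econ     <- -0.019  # slope on econ
b_location <- 6.884   # shift for location = 1 (north)

# Hypothetical helper: predicted gov_p for a given econ and location
# (location: 0 = south, 1 = north)
predict_gov_p <- function(econ, location) {
  b0 + b_econ * econ + b_location * location
}

# Intercepts of the two lines (econ = 0)
intercept_south <- predict_gov_p(econ = 0, location = 0)  # 5.222
intercept_north <- predict_gov_p(econ = 0, location = 1)  # 5.222 + 6.884 = 12.106

# The north-south gap is the same at any value of econ,
# because both lines share the slope on econ
gap_at_10 <- predict_gov_p(10, 1) - predict_gov_p(10, 0)  # 6.884
```

Because location enters the model only additively, it moves the intercept and leaves the slope on econ unchanged, which is why the two lines are parallel.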
Result
・The relationship between the economic situation (econ) and government performance (gov_p) is a spurious correlation: it disappears once location is controlled for.
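How a third variable can produce this pattern is easier to see with simulated data. The sketch below is an illustration under stated assumptions, not a reanalysis of the Putnam data: all numbers are made up, and the simulation is built so that location drives both econ and gov_p while econ has no direct effect on gov_p.

```r
set.seed(1)  # make the simulated numbers reproducible

n <- 200
location <- rep(c(0, 1), each = n / 2)   # 0 = south, 1 = north (simulated)
econ     <- 5 + 8 * location + rnorm(n)  # location raises econ
gov_p    <- 5 + 7 * location + rnorm(n)  # location raises gov_p; econ plays no role
sim <- data.frame(gov_p, econ, location)

# Slope on econ without and with the location control
slope_simple     <- coef(lm(gov_p ~ econ, data = sim))[["econ"]]
slope_controlled <- coef(lm(gov_p ~ econ + location, data = sim))[["econ"]]

round(c(simple = slope_simple, controlled = slope_controlled), 3)
```

In this setup the simple regression should show a clearly positive slope on econ, while the slope should shrink toward zero once location is added, mirroring the change from model_1 to model_2.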
Does cc explain gov_p?
It seems that the Civic Community Index (cc) is related to government performance (gov_p) in Italy.
However, it is not yet clear whether this holds in both the northern and the southern areas.
Draw a scatter plot between cc and gov_p
putnam %>%
ggplot(aes(cc, gov_p)) +
geom_point() +
theme_bw() +
labs(x = "cc", y = "gov_p",
title = "Civic Community Index and Government Performance in Italy") +
stat_smooth(method = lm, se = FALSE)
Run a simple regression of gov_p on cc.
model_3 <- lm(gov_p ~ cc, data = putnam)
summary(model_3)
Call:
lm(formula = gov_p ~ cc, data = putnam)
Residuals:
Min 1Q Median 3Q Max
-2.5043 -1.3481 -0.2087 0.9764 3.4957
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.71115 0.84443 3.211 0.00485 **
cc 0.56730 0.06552 8.658 7.81e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.789 on 18 degrees of freedom
Multiple R-squared: 0.8064, Adjusted R-squared: 0.7956
F-statistic: 74.97 on 1 and 18 DF, p-value: 7.806e-08
\[\widehat{gov_p}\ = 2.711 + 0.567cc\]
Check the class of each variable in putnam.
str(putnam)
spec_tbl_df [20 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ region : chr [1:20] "Ab" "Ba" "Cl" "Cm" ...
$ gov_p : num [1:20] 7.5 7.5 1.5 2.5 16 12 10 11 11 9 ...
$ cc : num [1:20] 8 4 1 2 18 17 13 16 17 15.5 ...
$ econ : num [1:20] 7 3 3 6.5 13 14.5 12.5 15.5 19 10.5 ...
$ location: chr [1:20] "south" "south" "south" "south" ...
- attr(*, "spec")=
.. cols(
.. region = col_character(),
.. gov_p = col_double(),
.. cc = col_double(),
.. econ = col_double(),
.. location = col_character()
.. )
→ Change the class of location from character to numeric
→ Save the result as a new data frame named df2
df2 <- mutate(putnam,
              location = as.numeric(location == "north"))  # north = 1, south = 0
DT::datatable(df2)
To check whether the Civic Community Index (cc) is related to government performance (gov_p) in both the northern and the southern areas, we need to include cc and location in the regression model simultaneously.
model_3 <- lm(gov_p ~ cc + location, data = df2)
Replace {r} with {r, results = "asis"} as the chunk option.
stargazer(model_3, type = "html")
| | Dependent variable: gov_p |
|---|---|
| cc | 0.571** |
| | (0.215) |
| location | -0.048 |
| | (2.678) |
| Constant | 2.698** |
| | (1.121) |
| Observations | 20 |
| R2 | 0.806 |
| Adjusted R2 | 0.784 |
| Residual Std. Error | 1.841 (df = 17) |
| F Statistic | 35.402*** (df = 2; 17) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
SRF for model_3:
\[\widehat{gov_p}\ = 2.698 + 0.571cc - 0.048location\]
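We see that cc is significantly related to gov_p even after location is controlled for.
We see that location is not significantly related to gov_p once cc is controlled for.
→ Regardless of the value of location (that is, whether the local government is in the North or in the South), government performance does not differ significantly, holding cc constant.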
By substituting location = 0 and 1, we get the following two regression functions:
Note: The two slopes are identical!
location = 0
\[\widehat{gov_p}\ = 2.698 + 0.571cc\]
location = 1
\[\widehat{gov_p}\ = 2.650 + 0.571cc\]
Result
・The Civic Community Index (cc) matters in predicting government performance both in the north and in the south of Italy.
What explains gov_p?
model_4 <- lm(gov_p ~ cc + econ + location, data = df2)
stargazer(model_1, model_2, model_3, model_4,
          type = "html")
Dependent variable: gov_p
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| econ | 0.589*** | -0.019 | | -0.269 |
| | (0.120) | (0.220) | | (0.199) |
| cc | | | 0.571** | 0.700*** |
| | | | (0.215) | (0.230) |
| location | | 6.884*** | -0.048 | 0.858 |
| | | (2.229) | (2.678) | (2.698) |
| Constant | 3.011** | 5.222*** | 2.698** | 3.495** |
| | (1.385) | (1.347) | (1.121) | (1.243) |
| Observations | 20 | 20 | 20 | 20 |
| R2 | 0.572 | 0.726 | 0.806 | 0.826 |
| Adjusted R2 | 0.549 | 0.694 | 0.784 | 0.794 |
| Residual Std. Error | 2.659 (df = 18) | 2.190 (df = 17) | 1.841 (df = 17) | 1.797 (df = 16) |
| F Statistic | 24.097*** (df = 1; 18) | 22.531*** (df = 2; 17) | 35.402*** (df = 2; 17) | 25.370*** (df = 3; 16) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | | |
Conclusions
・The Civic Community Index (cc) matters in predicting government performance
・Economic situation (econ) does not matter in predicting government performance once cc and location are controlled for
・The location (location) does not matter in predicting government performance once cc is controlled for