POL269 - Political Data Research: Solutions Seminar 10

What is the Effect of the Death of the Leader on the Level of Democracy?

Based on Benjamin F. Jones and Benjamin A. Olken. 2009. Hit or Miss? The Effect of Assassinations on Institutions and War. American Economic Journal: Macroeconomics, 1(2): 55–87.

All materials presented here build on the resources for instructors designed by Elena Llaudet and Kosuke Imai in Data Analysis for Social Science: A Friendly and Practical Introduction (Princeton University Press).

Let’s go back to the leaders.csv data we have considered in previous seminars. Remember that there is a longstanding debate in the study of international relations on whether individual political leaders make a difference. To investigate this issue, we’ll estimate the effect of the death of the leader on the level of democracy of a country. For this purpose, we will analyse data on assassinations and assassination attempts against political leaders from 1875 to 2004.

Whether an assassination attempt occurs or not is not a random process. However, once an assassination attempt has occurred, one could argue that whether the attempt is successful or not is the result of small elements of randomness, such as the timing and path of the weapon. As a result, we can consider (at least for now) that, after an assassination attempt, the death of a leader is close to random and, thus, the assassination attempts where the leader ended up dying should be, on average, comparable to the assassination attempts where the leader ended up surviving. If this is true, then we can estimate the average causal effect of the death of the leader by computing the difference-in-means estimator.
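In symbols, the difference-in-means estimator mentioned above can be sketched as follows. The subscript labels are our own shorthand, not notation from the original materials: \(\overline{Y}_{\textrm{died}}\) is the average outcome across attempts where the leader died, and \(\overline{Y}_{\textrm{survived}}\) is the average outcome across attempts where the leader survived.

\[
\textrm{difference-in-means estimator} = \overline{Y}_{\textrm{died}} - \overline{Y}_{\textrm{survived}}
\]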

To measure the level of democracy of the country, we will use polity scores. Polity scores categorize the regime of a country on a 21-point scale ranging from -10 (hereditary monarchy) to +10 (consolidated democracy). The Polity Project has produced polity scores for all countries from 1800 onward. For example, here are the 2018 polity scores.

The dataset is stored in a file called “leaders.csv”. Table 1 shows the names and descriptions of the variables in this dataset, where the unit of observation is the assassination attempt.

Table 1: Variables in “leaders.csv”

Variable	Description
year	year of the assassination attempt
country	country name
leadername	name of the leader
died	whether leader died: 1=yes, 0=no
politybefore	polity scores before the assassination attempt (in points)
polityafter	polity scores after the assassination attempt (in points)

In this problem set, we practice answering questions related to causal studies: (1) What is the estimated average treatment effect? and (2) Is the effect statistically significant at the 5% level?

As always, we start by loading and looking at the data (remember to set your working directory first!):

leaders <- read.csv("leaders.csv") # reads and stores data
head(leaders) # shows first observations

  year     country       leadername died politybefore polityafter
1 1929 Afghanistan Habibullah Ghazi    0           -6   -6.000000
2 1933 Afghanistan       Nadir Shah    1           -6   -7.333333
3 1934 Afghanistan      Hashim Khan    0           -6   -8.000000
4 1924     Albania             Zogu    0            0   -9.000000
5 1931     Albania             Zogu    0           -9   -9.000000
6 1968     Algeria      Boumedienne    0           -9   -9.000000

What is the estimated average causal effect of the death of a leader on the level of democracy?

Given our research question, what should be our outcome variable (Y)? Visualise its distribution, using the ggplot2 package and a suitable method.

The outcome variable should be polityafter since that is the variable that records the level of democracy observed in our country cases after assassination attempts have taken place. Based on the histogram of this variable below, we can see that there is a good spread of levels of democracy in our sample.

library(ggplot2)
ggplot(data = leaders, aes(x = polityafter)) + geom_histogram() #plotting a histogram of variable polityafter

Given our research question, what should be our treatment variable (X)? Visualise its distribution and comment on the proportion of leaders who die after assassination attempts (i.e., is the proportion large or small?).

The treatment variable should be died since that is the variable that records whether or not the leader died as a result of the assassination attempt. Based on the table of proportions calculated below, we can see that the leader died in roughly one in five of the assassination attempts in our sample (54 of 250, or 21.6%), so the proportion is fairly small.

library(tidyverse)
leaders %>% count(died) %>% mutate(prop = prop.table(n)) #calculates the proportion of sample cases where leaders did and did not die after assassination attempts

  died   n  prop
1    0 196 0.784
2    1  54 0.216
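Because died is coded 0/1, its mean is the proportion of attempts in which the leader died. A minimal base R sketch of this check follows; the vector died is rebuilt here from the counts in the table above (196 zeros and 54 ones) purely for illustration, rather than read from leaders.csv.

```r
# Rebuild the treatment variable from the counts reported above
# (illustrative stand-in for leaders$died)
died <- c(rep(0, 196), rep(1, 54))

# The mean of a 0/1 variable is the proportion of 1s
prop_died <- mean(died)
prop_died # 0.216

# Base R alternative to count() + mutate(): table of proportions
prop.table(table(died))
```

With the real data loaded, mean(leaders$died) should return the same proportion.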

Now that we have both our Y and X variables, fit a linear model to the data in such a way that the estimated slope coefficient is equivalent to the difference-in-means estimator you are interested in and store the fitted model in an object called fit.

You can do this in either of two equivalent ways:

fit <- lm(polityafter ~ died, data = leaders) # linear regression, y = polityafter, x = died; variables are looked up in the data argument

fit <- lm(leaders$polityafter ~ leaders$died) # same model; variables are referenced directly with $, so no data argument is needed

Placing fit and the assignment operator (<-) before the call to lm() asks R to store the fitted regression model in an object called fit. This is a convenient way to keep regression results if you plan to come back to them later.

What is the estimated slope coefficient?

fit  #typing the name of our stored regression object to return/view results


Call:
lm(formula = leaders$polityafter ~ leaders$died)

Coefficients:
 (Intercept)  leaders$died  
      -1.895         1.132

The estimated slope coefficient is 1.132. Because our regression model includes a single X variable, the binary treatment indicator died, this slope is equivalent to the difference-in-means estimator.
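The equivalence between the slope and the difference-in-means estimator can be checked directly. The sketch below uses a small made-up data frame (toy, with hypothetical values that are not taken from leaders.csv) so that it runs on its own.

```r
# Hypothetical toy data (NOT the leaders.csv values): 6 attempts, 2 fatal
toy <- data.frame(
  died        = c(0, 0, 0, 0, 1, 1),
  polityafter = c(-6, -8, -9, 3, -7, 5)
)

# Difference-in-means: mean outcome when died == 1 minus mean when died == 0
diff_means <- mean(toy$polityafter[toy$died == 1]) -
  mean(toy$polityafter[toy$died == 0])

# Slope from regressing the outcome on the binary treatment variable
toy_fit <- lm(polityafter ~ died, data = toy)
slope <- unname(coef(toy_fit)["died"])

all.equal(slope, diff_means) # TRUE: the two estimates coincide
```

Replacing toy with leaders should reproduce the slope reported above, assuming the data are loaded as in the earlier step.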

Now, let’s answer the question: What is the estimated average treatment effect?

We estimate that the death of the leader increases the country’s polity scores after the assassination attempt by 1.13 points, on average.

Note: the mathematical definition of \(\widehat{\beta}\) is the \(\triangle \widehat{Y}\) associated with \(\triangle X = 1\). In this case, \(\widehat{\beta}\) is the \(\triangle \widehat{\textrm{polityafter}}\) associated with \(\triangle \textrm{died} = 1\). Hence, the death of the leader (that is, when died increases by one unit, from 0 to 1) is associated with an increase in polity scores after the assassination attempt of 1.13 points, on average. The unit of measurement is points because Y is non-binary and measured in points, so \(\triangle \widehat{Y}\) is also in points.

In addition, because here X is the treatment variable, \(\widehat{\beta}\) is equivalent to the difference-in-means estimator so we should use causal language in its interpretation. Instead of “associated with an increase,” we should say “increases” or “causes an increase of.”
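To see the one-unit change concretely, the fitted line can be evaluated at both values of died. This is a worked sketch using the coefficients from the output above, rounded to two decimal places:

\[
\widehat{\textrm{polityafter}} = -1.89 + 1.13 \times \textrm{died}
\]

\[
\textrm{died} = 0: \; \widehat{\textrm{polityafter}} = -1.89
\qquad
\textrm{died} = 1: \; \widehat{\textrm{polityafter}} = -1.89 + 1.13 = -0.76
\]

The difference between the two predicted values, \(-0.76 - (-1.89) = 1.13\), is the estimated slope.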

Final answer: We estimate that the death of the leader increases the country’s polity scores after the assassination attempt by 1.13 points, on average. This average treatment effect should be valid if the assassination attempts where the leader ended up dying are comparable to the assassination attempts where the leader ended up surviving, that is, if there are no confounding variables present.

Is the effect statistically significant at the 5% level?

Let’s start by specifying the null and alternative hypotheses. Please provide both the mathematical notations and their meaning.

The null and alternative hypotheses are:

\(H_0 {:} \,\, \beta{=}0\) (meaning: a leader’s death after an assassination attempt has no average causal effect on the level of democracy observed in the nation they govern, at the population level).

\(H_1 {:} \,\, \beta{\neq}0\) (meaning: a leader’s death after an assassination attempt either increases or decreases the level of democracy observed in the nation they govern, on average, at the population level).

Note that the null and alternative hypotheses refer to \(\beta\), which is the true average causal effect at the population level, not to \(\widehat{\beta}\), which is the estimated average causal effect at the sample level.

What is the value of the observed test statistic? Using summary(fit) to return your stored regression results may be useful here.

summary(fit)  #returning the coefficients stored in the regression item fit, created earlier


Call:
lm(formula = leaders$polityafter ~ leaders$died)

Residuals:
   Min     1Q Median     3Q    Max 
-9.238 -5.238 -2.105  4.895 11.895 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -1.8946     0.4659  -4.067 6.41e-05 ***
leaders$died   1.1322     1.0024   1.130     0.26    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.522 on 248 degrees of freedom
Multiple R-squared:  0.005118,  Adjusted R-squared:  0.001107 
F-statistic: 1.276 on 1 and 248 DF,  p-value: 0.2598

The value of the observed test statistic is 1.13, when rounded to two decimal places.

Note: The observed test statistic for regression coefficients equals \(\widehat{\beta}\) divided by the standard error of \(\widehat{\beta}\). Here, it equals 1.132212/1.002369 = 1.13, which is exactly what R provides as the t-value for the coefficient of the variable died, that is, the value in the cell in the second row, third column of the table above.
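The division in the note above can be reproduced in R. This minimal sketch types in the two values quoted in the note rather than extracting them from fit; the object names beta_hat, se_beta and t_obs are our own.

```r
# Values quoted in the note above
beta_hat <- 1.132212 # estimated slope coefficient
se_beta  <- 1.002369 # standard error of the slope

# Observed test statistic: estimate divided by its standard error
t_obs <- beta_hat / se_beta
round(t_obs, 2) # 1.13
```

With the fitted model available, coef(summary(fit)) returns the same coefficient table as a matrix, so the estimate and standard error can be read from its "Estimate" and "Std. Error" columns instead of being typed by hand.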

What is the associated p-value?

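The associated p-value is 2.597630e-01, which is 0.260 when rounded to three decimal places (the summary output displays it as 0.26). The e-01 is scientific notation: it means the number is multiplied by 10 to the power of -1, so we divide 2.597630 by 10 to obtain the value.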

We can interpret this as indicating that, if the null hypothesis were true, the probability of observing a test statistic at least as large as 1.13 in absolute value would be about 26% (0.260 * 100 = 26%). This is quite a large probability, well above 5%, so we do not find sufficient evidence to reject the null hypothesis.
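The two-sided p-value can be recovered from the observed test statistic and the residual degrees of freedom (248, as reported in the summary output). The sketch below uses the t distribution, which is what R's summary() reports for lm fits; the object names are our own.

```r
t_obs <- 1.132212 / 1.002369 # observed test statistic
df    <- 248                 # residual degrees of freedom from summary(fit)

# Two-sided p-value: probability, under the null hypothesis, of a test
# statistic at least as extreme as t_obs in either tail
p_value <- 2 * pt(-abs(t_obs), df = df)
round(p_value, 2) # 0.26
```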

Now, let’s answer the question: Is the effect statistically significant at the 5% level? Please provide your reasoning.

No, the effect is not statistically significant at the 5% level. Because (a) the absolute value of the observed test statistic is lower than 1.96 (1.13<1.96), and/or (b) the p-value is greater than 0.05 (0.260>0.05), we fail to reject the null hypothesis. Note that failing to reject the null hypothesis is not the same as showing that it is true: we cannot conclude that a leader’s death after an assassination attempt has a non-zero average causal effect on the level of democracy observed in the nation they govern, at the population level, but we have not shown that the effect is exactly zero either.

In other words, we do not find sufficient evidence to conclude that the death of a leader after an assassination attempt has an average effect different from zero on the level of democracy observed in the nation they govern, at the population level.

Note: You do not need to provide both reasons, (a) and (b). One of them suffices since both procedures should lead to the same conclusion.
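The two decision rules can be compared side by side in R. This is a minimal sketch using the rounded values reported above; the critical value 1.96 is the 97.5th percentile of the standard normal distribution, computed here with qnorm(), and all object names are our own.

```r
t_obs   <- 1.13  # observed test statistic (rounded)
p_value <- 0.260 # two-sided p-value (rounded)
alpha   <- 0.05  # significance level

# Critical value for a two-sided test at the 5% level (about 1.96)
crit <- qnorm(1 - alpha / 2)

reject_by_t <- abs(t_obs) > crit # rule (a): compare |t| to the critical value
reject_by_p <- p_value < alpha   # rule (b): compare the p-value to alpha

c(rule_a = reject_by_t, rule_b = reject_by_p) # both FALSE: fail to reject
```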