POL269 - Political Data Research: Solutions Seminar 6

What is the Effect of the Death of the Leader on the Level of Democracy?

Based on Benjamin F. Jones and Benjamin A. Olken. 2009. Hit or Miss? The Effect of Assassinations on Institutions and War. American Economic Journal: Macroeconomics, 1(2): 55–87.

All materials presented here build on the resources for instructors designed by Elena Llaudet and Kosuke Imai in Data Analysis for Social Science: A Friendly and Practical Introduction (Princeton University Press).

There is a longstanding debate in the study of international relations on whether individual political leaders make a difference. To explore this issue, let’s estimate the causal effect of the death of the leader on the level of democracy of a country. For this purpose, we will analyse data on assassinations and assassination attempts against political leaders from 1875 to 2004.

Whether an assassination attempt occurs or not is not a random process. However, once an assassination attempt has occurred, one could argue that whether the attempt is successful or not is the result of small elements of randomness, such as the timing and path of the weapon. As a result, we can consider (at least for now) that, after an assassination attempt, the death of a leader is close to random and, thus, the assassination attempts where the leader ended up dying should be, on average, comparable to the assassination attempts where the leader ended up surviving. If this is true, then we can estimate the average causal effect of the death of the leader by computing the difference-in-means estimator.

To measure the level of democracy of the country, we will use polity scores. Polity scores categorize the regime of a country on a 21-point scale ranging from -10 (hereditary monarchy) to +10 (consolidated democracy). The Polity Project has produced polity scores for all countries from 1800 onward. For example, here are the 2018 polity scores.

The dataset is stored in a file called “leaders.csv”. Table 1 shows the names and descriptions of the variables in this dataset, where the unit of observation is an assassination attempt.

Table 1: Variables in “leaders.csv”

Variable	Description
year	year of the assassination attempt
country	country name
leadername	name of the leader
died	whether leader died: 1=yes, 0=no
politybefore	polity scores before the assassination attempt (in points)
polityafter	polity scores after the assassination attempt (in points)

In this problem set, we practice fitting a linear model to estimate average causal effects.

As always, we start by loading and looking at the data (remember to set your working directory first!):

leaders <- read.csv("leaders.csv") # reads and stores data
head(leaders) # shows first observations

  year     country       leadername died politybefore polityafter
1 1929 Afghanistan Habibullah Ghazi    0           -6   -6.000000
2 1933 Afghanistan       Nadir Shah    1           -6   -7.333333
3 1934 Afghanistan      Hashim Khan    0           -6   -8.000000
4 1924     Albania             Zogu    0            0   -9.000000
5 1931     Albania             Zogu    0           -9   -9.000000
6 1968     Algeria      Boumedienne    0           -9   -9.000000

First, let’s identify our Y and X variables. Given that we are interested in estimating the average causal effect of the death of a leader on the polity scores of a country:
1. The Y variable should be polityafter, which is a non-binary variable since it can take more than two values. Since we are trying to measure the effect of the death of the leader on the level of democracy after the assassination attempt, polityafter should be the outcome variable (Y). Note that an even better Y variable would be the difference between polityafter and politybefore, which would allow us to measure the effect that the death of the leader had on the change in the level of democracy. However, this is more advanced than what we do in this class.
2. The X variable should be died, which is a binary variable since it can only take 1s and 0s. Since we are trying to measure the effect of the death of the leader on the level of democracy, died should be the treatment variable (X). All treatment variables we will consider in this class are binary. They equal 1 when the observation was treated, 0 when the observation was not treated. In this case, the treatment is the death of the leader and, thus, the treatment variable is died, which equals 1 when the leader died and 0 when the leader did not die.
To compute the difference-in-means estimator directly (just as we did in seminar 3), we do the following:

First, we install the tidyverse packages if we have not done so already, and load these into R. The tidyverse set of packages contains the summarise() function which we use to calculate the means that are used in our difference-in-means estimator.

install.packages("tidyverse") # installing tidyverse packages, only do this if you have not already done so!

library(tidyverse) # loading tidyverse packages, which include the summarise() function used to calculate means

Recall that the difference-in-means estimator is calculated by subtracting the average outcome for the control group (here, the cases where the assassination attempt did not result in death, died == 0) from the average outcome for the treatment group (here, the cases where the assassination attempt resulted in death, died == 1). We can do this in two different ways, shown below, which give very similar answers. The small difference between them comes from rounding: the first method subtracts the group means as printed by the tibble (three significant digits), while the second subtracts the means as printed with seven significant digits. Both approaches are correct; the second is simply more precise.

leaders %>% group_by(died) %>% summarise(mean = mean(polityafter)) # calculates the average of polityafter by died

# A tibble: 2 × 2
   died   mean
  <int>  <dbl>
1     0 -1.89 
2     1 -0.762

-0.762 - (-1.89)

[1] 1.128

leaders %>% filter(died ==1) %>% summarise(mean = mean(polityafter)) # calculates the average of polityafter in cases where assassination attempts succeed

        mean
1 -0.7623457

leaders %>% filter(died == 0) %>% summarise(mean = mean(polityafter)) # calculates the average of polityafter in cases where assassination attempts fail

       mean
1 -1.894558

-0.7623457 - (-1.894558)

[1] 1.132212

Depending on which method you use, the difference-in-means estimator is 1.128 or 1.132; both round to 1.13 at two decimal places.
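The rounding discrepancy disappears if the group means are never rounded before subtracting. The following is a minimal base R sketch of that idea, not the method used in the seminar. To stay self-contained it uses only the six rows printed by head(leaders) above (stored in a hypothetical object called leaders_head), so the number it produces is purely illustrative and is not the estimate for the full dataset.

```r
# First six rows of leaders.csv, copied from the head(leaders) output above.
# leaders_head is a hypothetical name used only for this sketch.
leaders_head <- data.frame(
  died        = c(0, 1, 0, 0, 0, 0),
  polityafter = c(-6, -7.333333, -8, -9, -9, -9)
)

# Group means kept at full precision (no intermediate rounding)
mean_died     <- mean(leaders_head$polityafter[leaders_head$died == 1])
mean_survived <- mean(leaders_head$polityafter[leaders_head$died == 0])

# Difference-in-means estimator: treatment mean minus control mean
diff_in_means <- mean_died - mean_survived
diff_in_means
```

With the full dataset loaded, replacing leaders_head with leaders should give the same value as the second method above.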

We need to think about the direction (positive/negative), size (distance from 0) and unit of measurement when interpreting this effect, as well as the assumptions we must make when using the difference-in-means estimator. Considering all of this together, we can say that, assuming that the chance of survival after an attempted assassination is approximately random, we estimate that the death of a leader through assassination increases the polity score of that country, and thus its observed level of democracy, by around 1 point, on average.

We now use the base R lm() function to fit a line to the data and summarize the relationship between X and Y. The lm() function requires an argument in the form Y~X, either specified as:

lm(leaders$polityafter ~ leaders$died)


Call:
lm(formula = leaders$polityafter ~ leaders$died)

Coefficients:
 (Intercept)  leaders$died  
      -1.895         1.132

Or as:

lm(polityafter ~ died , data = leaders)


Call:
lm(formula = polityafter ~ died, data = leaders)

Coefficients:
(Intercept)         died  
     -1.895        1.132

The first way of specifying our linear regression model uses the $ operator to tell R we are looking for the columns polityafter and died within the leaders dataframe/object, while the second way explicitly sets out the dataframe to be used with the data argument and therefore does not need to specify the dataframe again when listing the variable/column names in the lm() function. We can see that both ways of specifying this linear regression provide exactly the same results, so it is again up to you to use whichever specification you prefer.
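As a quick check of how the lm() output relates to the manual calculation, the sketch below compares the fitted coefficients with the group means. It is an illustration under stated assumptions: it uses only the six rows shown by head(leaders) (in a hypothetical object leaders_head), so the values differ from the full-data output above, but the relationship being checked is the same.

```r
# First six rows of leaders.csv, copied from the head(leaders) output above.
# leaders_head is a hypothetical name used only for this sketch.
leaders_head <- data.frame(
  died        = c(0, 1, 0, 0, 0, 0),
  polityafter = c(-6, -7.333333, -8, -9, -9, -9)
)

# Fit the regression using the data argument
fit <- lm(polityafter ~ died, data = leaders_head)
coef(fit)

# Manual group means for comparison
control_mean   <- mean(leaders_head$polityafter[leaders_head$died == 0])
treatment_mean <- mean(leaders_head$polityafter[leaders_head$died == 1])

# The intercept should match the control-group mean, and the slope
# should match the difference in means (up to floating-point tolerance)
all.equal(unname(coef(fit)["(Intercept)"]), control_mean)
all.equal(unname(coef(fit)["died"]), treatment_mean - control_mean)
```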

The general formula for a fitted regression line is $\widehat{Y} = \widehat{\alpha} + \widehat{\beta} X$. We can write the specific fitted line for our regression model by substituting each term with our model values, i.e., substitute $\widehat{Y}$ for the name of the outcome variable (keeping the hat), substitute $\widehat{\alpha}$ for the estimated value of the intercept coefficient, substitute $\widehat{\beta}$ for the estimated value of the slope coefficient, and substitute $X$ for the name of the treatment variable.

This gives the following fitted line (after rounding regression estimates to 2 d.p.): $\widehat{\textrm{polityafter}}$ = -1.89 + 1.13 died. The intercept equals the control-group mean, -1.894558, which lm() prints as -1.895 and which rounds to -1.89 at 2 d.p.

Note: The Y variable is polityafter, $\widehat{\alpha}$=-1.89, $\widehat{\beta}$=1.13, and the X variable is died.
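To see what the fitted line implies for each group, one can plug died = 0 and died = 1 into it. The sketch below does this with predict(), again using only the six rows shown by head(leaders) (hypothetical object leaders_head), so the numbers are illustrative rather than the full-data estimates.

```r
# First six rows of leaders.csv, copied from the head(leaders) output above.
# leaders_head is a hypothetical name used only for this sketch.
leaders_head <- data.frame(
  died        = c(0, 1, 0, 0, 0, 0),
  polityafter = c(-6, -7.333333, -8, -9, -9, -9)
)

fit <- lm(polityafter ~ died, data = leaders_head)

# Extract the estimated intercept and slope
alpha_hat <- unname(coef(fit)["(Intercept)"])
beta_hat  <- unname(coef(fit)["died"])

# Predicted polityafter when the leader survives (0) and dies (1)
predicted <- predict(fit, newdata = data.frame(died = c(0, 1)))
predicted

# By hand: alpha_hat for died = 0, alpha_hat + beta_hat for died = 1
c(alpha_hat, alpha_hat + beta_hat)
```

The two printed vectors should agree, which is just the fitted line evaluated at the two possible values of the treatment variable.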

Yes, we can see that the estimated slope coefficient of our regression model is equivalent to the difference-in-means estimator we calculated earlier. Both equal 1.13 at 2 d.p. (the manual calculation from the rounded group means gives 1.128, which also rounds to 1.13).
A full substantive interpretation of the estimated slope coefficient (including the unit of measurement) would read something like this…

We estimate that the death of the leader increases the country’s polity scores after the assassination attempt by 1.13 points, on average.

Note: the mathematical definition of $\widehat{\beta}$ is the $\triangle \widehat{Y}$ associated with $\triangle X$=1. In this case, $\widehat{\beta}$ is the $\triangle \widehat{\textrm{polityafter}}$ associated with $\triangle \textrm{died}$ = 1. Hence, the death of the leader (that is, when died increases by one unit, from 0 to 1) is associated with an increase in polity scores after the assassination attempt of 1.13 points, on average. The unit of measurement is points because Y is non-binary and measured in points, so $\triangle \widehat{Y}$ should also be in points.

In addition, because here X is the treatment variable, $\widehat{\beta}$ is equivalent to the difference-in-means estimator so we should use causal language in its interpretation. Instead of “associated with an increase,” we should say “increases” or “causes an increase of.”
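The equivalence can be sketched in one line of algebra. Here $\overline{Y}_{\textrm{died}=1}$ and $\overline{Y}_{\textrm{died}=0}$ are shorthand, introduced only for this sketch, for the average polityafter in the treatment and control groups. The second equality relies on the fact that, with a single binary X, the least-squares fitted value for each group equals that group's mean:

$$\widehat{\beta} = (\widehat{\alpha} + \widehat{\beta} \times 1) - (\widehat{\alpha} + \widehat{\beta} \times 0) = \overline{Y}_{\textrm{died}=1} - \overline{Y}_{\textrm{died}=0} = -0.7623457 - (-1.894558) \approx 1.13$$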

Final answer: We estimate that the death of the leader increases the country’s polity scores after the assassination attempt by 1.13 points, on average. This average treatment effect should be valid if the assassination attempts where the leader ended up dying are comparable to the assassination attempts where the leader ended up surviving, that is, if there are no confounding variables present.

There are a number of aspects we must consider when coming up with a conclusion regarding the average causal effect of the death of a leader on the polity scores of a country. Let’s start by figuring out each key element separately.

First, we consider the assumptions we make. We assume that the assassination attempts where the leader ended up dying (the treatment group) are comparable to the assassination attempts where the leader ended up surviving (the control group). If this assumption were not true the difference-in-means estimator would NOT produce a valid estimate of the average treatment effect.

Then, we consider whether this assumption is reasonable. Even though the dataset does not come from a randomized experiment, whether a leader dies after an assassination attempt is arguably close to random, so we can be fairly confident that this assumption is, in fact, reasonable.
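One informal way to probe this assumption, which is not part of the original solution, is to compare a pre-treatment characteristic across the two groups: if survival is close to random, the average politybefore should be similar for leaders who died and leaders who survived. The sketch below shows the mechanics on the six rows printed by head(leaders) only (hypothetical object leaders_head). With a single treated observation in those rows it says nothing about the full dataset.

```r
# First six rows of leaders.csv, copied from the head(leaders) output above.
# leaders_head is a hypothetical name used only for this sketch.
leaders_head <- data.frame(
  died         = c(0, 1, 0, 0, 0, 0),
  politybefore = c(-6, -6, -6, 0, -9, -9)
)

# Average pre-attempt polity score in each group
before_died     <- mean(leaders_head$politybefore[leaders_head$died == 1])
before_survived <- mean(leaders_head$politybefore[leaders_head$died == 0])

# A difference close to zero is consistent with (but does not prove)
# the two groups being comparable before the attempt
before_died - before_survived
```

Running the same comparison on the full leaders dataframe would be the meaningful version of this check.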

We then state the treatment and outcome. Treatment = the death of the leader. Outcome = polity scores after the assassination attempt.

Finally, we consider the direction, size and unit of measurement of the average causal effect. We see an increase of 1.13 points, on average. It is an increase because we are measuring change (the change in the outcome variable caused by the treatment) and the difference-in-means estimator is positive. The unit of measurement is the same as that of our outcome variable: since Y is non-binary and measured in points, the difference-in-means estimator and the regression coefficient are also measured in points.

Full answer: Assuming that the assassination attempts where the leader ended up dying are comparable to the assassination attempts where the leader ended up surviving (an assumption that might be reasonable if the death of the leader after an assassination attempt is close to random), we estimate that the death of the leader increases the country’s polity scores after the assassination attempt by 1.13 points, on average.