POL269 - Political Data Research: Seminar 8

Do Negative Political TV Ads Decrease Voter Turnout?

All materials presented here build on the resources for instructors designed by Elena Llaudet and Kosuke Imai in Data Analysis for Social Science: A Friendly and Practical Introduction (Princeton University Press).

Today’s seminar consists of two parts. In the first, we practise one of the most useful skills we can teach you: the ability to evaluate social scientific studies. All of you will need, at one point or another, to read a published research article, make sense of it, and figure out whether you can trust the findings. So we are going to begin this week’s seminar by practising just that.

Part 1

For this purpose, we are going to read sections of: Ansolabehere, Stephen, Shanto Iyengar, Adam Simon and Nicholas Valentino. (1994) Does Attack Advertising Demobilize the Electorate? The American Political Science Review, 88(4), pp.829-838.

You can find the article with the highlighted sections I want you to focus on uploaded to the POL269 website.

Please answer the following questions based on the highlighted sections of the text.

Is this a causal study? In other words, is the aim of the study to estimate the causal effect of a treatment on an outcome? Yes or no?
Is this a randomised experiment or an observational study? Explain your reasoning.
What is the treatment whose effect the study aims to estimate? (Technically, the study is interested in the effects of two different treatments, but in this seminar we focus on just one of them.)
What is the outcome variable?
What was the unit of observation? Or, in other words, what does each observation represent?
How many people participated in this study? Hint: you may need to look into the table containing the results of the analysis.
What was the estimated average causal effect of the treatment on the outcome? In other words, what were the findings of the study? Make sure to include the assumption, why the assumption is reasonable, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect.

Part 2

The second part of today’s seminar focuses on confounding variables, and on how controlling for them changes our estimates of average causal effects.

In Seminar 6, we used the dataset “leaders.csv” to estimate the causal effect of the death of a leader on the level of democracy of a country, and showed that the difference-in-means estimator is equivalent to the slope coefficient estimated by a linear regression in cases where regression models include just an outcome (Y) and treatment (X) variable.
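As a reminder of the equivalence described above, here is a minimal sketch on simulated data. The variable names (x, y) and the numbers used to generate them are illustrative assumptions, not values from leaders.csv.

```r
# Simulated data: all names and numbers here are hypothetical
set.seed(269)
n <- 200
x <- rep(c(0, 1), each = n / 2)   # binary treatment: 100 untreated, 100 treated
y <- 2 + 1.5 * x + rnorm(n)       # outcome with some random noise

# Difference-in-means estimator
diff_in_means <- mean(y[x == 1]) - mean(y[x == 0])

# Slope coefficient from a regression with only Y and X
fit <- lm(y ~ x)
slope <- unname(coef(fit)["x"])

# The two estimates agree (up to floating-point rounding)
all.equal(diff_in_means, slope)
```

The equality holds here because the model contains only the outcome and a binary treatment, so the fitted line passes through the two group means.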

In this seminar, we show that the difference-in-means estimator is generally not equivalent to the slope coefficient on the treatment estimated by a linear regression when the regression model includes confounding variables in addition to the outcome (Y) and treatment (X) variables.
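To see why a control variable can break the equivalence, consider the following sketch on simulated data. Everything in it is an assumption made for illustration: a hypothetical confounder z that affects both the treatment x and the outcome y, and a true treatment effect set to 2.

```r
# Simulated data: all names and numbers here are hypothetical
set.seed(269)
n <- 1000
z <- rnorm(n)                       # confounder
x <- as.numeric(z + rnorm(n) > 0)   # treatment is more likely when z is high
y <- 1 + 2 * x + 3 * z + rnorm(n)   # true effect of x on y is set to 2

# Difference-in-means estimator (ignores z)
diff_in_means <- mean(y[x == 1]) - mean(y[x == 0])

# Regression without controls: slope equals the difference in means
slope_simple <- unname(coef(lm(y ~ x))["x"])

# Regression with the confounder as a control: slope on x changes
slope_controls <- unname(coef(lm(y ~ x + z))["x"])

c(diff_in_means = diff_in_means,
  slope_simple = slope_simple,
  slope_controls = slope_controls)
```

In this simulation the difference in means mixes the effect of x with the effect of z, while the slope on x in the model with z should land much closer to the value of 2 used to generate the data.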

Table 1 provides a quick reminder of the names and descriptions of the variables in the leaders dataset, where the unit of observation is an assassination attempt.

Table 1: Variables in “leaders.csv”

Variable	Description
year	year of the assassination attempt
country	country name
leadername	name of the leader
died	whether leader died: 1=yes, 0=no
politybefore	polity scores before the assassination attempt (in points)
polityafter	polity scores after the assassination attempt (in points)

As always, we start by loading and looking at the data (remembering to set our working directory first!):

leaders <- read.csv("leaders.csv") # reads and stores data
head(leaders) # shows first observations

  year     country       leadername died politybefore polityafter
1 1929 Afghanistan Habibullah Ghazi    0           -6   -6.000000
2 1933 Afghanistan       Nadir Shah    1           -6   -7.333333
3 1934 Afghanistan      Hashim Khan    0           -6   -8.000000
4 1924     Albania             Zogu    0            0   -9.000000
5 1931     Albania             Zogu    0           -9   -9.000000
6 1968     Algeria      Boumedienne    0           -9   -9.000000

Use the base R lm() function to fit a line to the data and summarise the relationship between X and Y, so that we can estimate the causal effect of the death of a leader on the level of democracy of a country. Remember that the lm() function requires a formula of the form Y~X, and that we can identify the variables either by referring to each one with the $ operator (e.g., leaders$died) or by using the variable names alone and passing the data frame through the data argument (data = leaders).
Now use the base R lm() function to fit a line to the data and summarise the relationship between X and Y, so that we can estimate the causal effect of the death of a leader on the level of democracy of a country while controlling for the confounding effect of each country’s prior democracy score, using the variable politybefore.
Interpret the slope coefficient for died estimated in this ‘with controls’ regression model and consider how it differs from the slope estimated in the simpler regression model. Are the difference-in-means estimator and the slope coefficient still equivalent in this ‘with controls’ regression model?
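If the lm() syntax is hard to recall, the sketch below shows the calls on a tiny data frame built from the six rows printed by head(leaders) above. It assumes polityafter is the outcome. The object name leaders_head is hypothetical, and six rows are far too few to answer the questions, so run the same calls on the full leaders data frame for your answers.

```r
# First six rows of leaders.csv, copied from the head(leaders) output.
# This is for syntax practice only; it is NOT the full dataset.
leaders_head <- data.frame(
  died         = c(0, 1, 0, 0, 0, 0),
  politybefore = c(-6, -6, -6, 0, -9, -9),
  polityafter  = c(-6, -7.333333, -8, -9, -9, -9)
)

# Form 1: refer to each variable with the $ operator
fit_dollar <- lm(leaders_head$polityafter ~ leaders_head$died)

# Form 2: use variable names and pass the data frame via the data argument
fit_data <- lm(polityafter ~ died, data = leaders_head)

# Difference-in-means estimator, for comparison with the slope on died
diff_in_means <- mean(leaders_head$polityafter[leaders_head$died == 1]) -
  mean(leaders_head$polityafter[leaders_head$died == 0])

# Adding a control: extra variables go on the right-hand side with +
fit_controls <- lm(polityafter ~ died + politybefore, data = leaders_head)

coef(fit_data)       # intercept and slope on died
coef(fit_controls)   # intercept, slope on died, slope on politybefore
```

Both forms fit the same model; the only difference is how the coefficients are labelled in the output. The data argument form is usually easier to read once a model has more than one variable on the right-hand side.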