All materials presented here build on the resources for instructors designed by Elena Llaudet and Kosuke Imai in Data Analysis for Social Science: A Friendly and Practical Introduction (Princeton University Press).
Today’s seminar will consist of two parts. The first will practice one of the most useful skills we can teach you, which is the ability to evaluate social scientific studies. All of you will need, at one point or another, to read a published research article, make sense of it, and figure out whether you can trust the findings. So we are going to begin this week’s seminar by practicing just that.
For this purpose, we are going to read sections of: Ansolabehere, Stephen, Shanto Iyengar, Adam Simon, and Nicholas Valentino. (1994) "Does Attack Advertising Demobilize the Electorate?" The American Political Science Review, 88(4), pp. 829–838.
You can find the article with the highlighted sections I want you to focus on uploaded to the POL269 website.
Please answer the following questions based on the highlighted sections of the text.
Yes, it is a causal study because it aims to estimate causal effects.
It is a randomised experiment because the treatment was assigned at random.
Exposure to a negative political TV advertisement (rather than a non-political one) is the treatment.
Intention to vote.
Each observation represents one participant in the experiment.
1,655 people.
Let’s start by figuring out each key element separately.
What’s the assumption? We assume that the participants who were exposed to a negative political TV advertisement (the treatment group) were comparable to the participants who were exposed to a non-political TV advertisement (the control group). If this assumption were not true, the difference-in-means estimator would NOT produce a valid estimate of the average treatment effect.
Why is the assumption reasonable? Because negative political TV advertisements were assigned at random OR because the data come from a randomised experiment. Remember that random treatment assignment makes the treatment and control groups identical to each other in all observed and unobserved pre-treatment characteristics, on average.
What’s the treatment? Being exposed to a negative political TV advertisement.
What’s the outcome? Intention to vote.
What’s the direction, size, and unit of measurement of the average causal effect? A decrease of 2.5 percentage points, on average. It is a decrease because we are measuring change—the change in the outcome variable caused by the treatment—and the difference-in-means estimator is negative.
The difference-in-means estimator = the proportion of participants who intend to vote among those exposed to negative political TV advertisements − the proportion of participants who intend to vote among those exposed to non-political TV advertisements = 58% − 61% ≈ −2.5 percentage points. (The two percentages shown are rounded; the unrounded difference is 2.5 percentage points.)
Assuming that the participants who were exposed to a negative political TV advertisement were comparable to the participants who were exposed to a non-political TV advertisement (a reasonable assumption since negative political TV advertisements were assigned at random), we estimate that being exposed to a negative political TV advertisement decreases intention to vote by about 2.5 percentage points, on average.
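To make the arithmetic concrete, here is a minimal R sketch of the difference-in-means estimator applied to individual-level data. The data frame and variable names (`ads`, `negative`, `intends_to_vote`) are purely illustrative; they are not the names used in the original study's data.

```r
# Hypothetical individual-level data:
# negative = 1 if exposed to a negative political ad, 0 if exposed to a
# non-political ad; intends_to_vote = 1 if yes, 0 if no
ads <- data.frame(
  negative        = c(1, 1, 1, 0, 0, 0),
  intends_to_vote = c(1, 0, 1, 1, 1, 0)
)

# Difference-in-means estimator: proportion intending to vote in the
# treatment group minus the proportion in the control group
mean(ads$intends_to_vote[ads$negative == 1]) -
  mean(ads$intends_to_vote[ads$negative == 0])
```

Note that the proportion of a binary (0/1) variable is just its mean, which is why `mean()` computes the proportions here.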
The second part of today’s seminar focuses on confounding variables, and how controlling for these changes our estimates of average causal effects.
In Seminar 6, we used the dataset “leaders.csv” to estimate the causal effect of the death of a leader on the level of democracy of a country, and showed that the difference-in-means estimator is equivalent to the slope coefficient estimated by a linear regression in cases where regression models include just an outcome (Y) and treatment (X) variable.
In this seminar, we show that the difference-in-means estimator is not equivalent to the slope coefficient estimated by a linear regression in cases where regression models include confounding variables, in addition to the outcome (Y) and treatment (X) variable.
Table 1 provides a quick reminder of the names and descriptions of the variables in the leaders dataset, where the unit of observation is assassination attempts.
Table 1: Variables in “leaders.csv”
| Variable | Description |
|---|---|
| year | year of the assassination attempt |
| country | country name |
| leadername | name of the leader |
| died | whether leader died: 1 = yes, 0 = no |
| politybefore | polity scores before the assassination attempt (in points) |
| polityafter | polity scores after the assassination attempt (in points) |
As always, we start by loading and looking at the data (remembering to set our working directory first!):
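Assuming the file "leaders.csv" is saved in your working directory, the loading step might look like this:

```r
leaders <- read.csv("leaders.csv")  # load the dataset into a data frame
head(leaders)                       # look at the first few observations
```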
year country leadername died politybefore polityafter
1 1929 Afghanistan Habibullah Ghazi 0 -6 -6.000000
2 1933 Afghanistan Nadir Shah 1 -6 -7.333333
3 1934 Afghanistan Hashim Khan 0 -6 -8.000000
4 1924 Albania Zogu 0 0 -9.000000
5 1931 Albania Zogu 0 -9 -9.000000
6 1968 Algeria Boumedienne 0 -9 -9.000000
lm(leaders$polityafter ~ leaders$died)
Call:
lm(formula = leaders$polityafter ~ leaders$died)
Coefficients:
(Intercept) leaders$died
-1.895 1.132
Or, equivalently:
lm(polityafter ~ died, data = leaders)
Call:
lm(formula = polityafter ~ died, data = leaders)
Coefficients:
(Intercept) died
-1.895 1.132
As our regression model includes just an outcome (Y) and treatment (X) variable, the slope coefficient estimated here (1.13) is equivalent to the difference-in-means estimator. We demonstrated this in the seminar in week 6.
We interpret this slope coefficient/difference-in-means estimator as follows: Assuming that the assassination attempts where the leader ended up dying are comparable to the assassination attempts where the leader ended up surviving (an assumption that might be reasonable if the death of the leader after an assassination attempt is close to random), we estimate that the death of the leader increases the country's polity score after the assassination attempt by 1.13 points, on average.
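We can check this equivalence directly in R: the difference in mean post-attempt polity scores between attempts where the leader died and attempts where the leader survived should match the slope coefficient from the simple regression, up to rounding.

```r
# Difference-in-means estimator, computed by hand
mean(leaders$polityafter[leaders$died == 1]) -
  mean(leaders$polityafter[leaders$died == 0])

# Slope coefficient from the simple regression
coef(lm(polityafter ~ died, data = leaders))["died"]

# Both should print approximately 1.132
```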
To control for countries' pre-existing levels of democracy, we add the confounding variable politybefore to the regression model, like so:
lm(leaders$polityafter ~ leaders$died + leaders$politybefore)
Call:
lm(formula = leaders$polityafter ~ leaders$died + leaders$politybefore)
Coefficients:
(Intercept) leaders$died leaders$politybefore
-0.4346 0.2616 0.8375
Or, equivalently:
lm(polityafter ~ died + politybefore, data = leaders)
Call:
lm(formula = polityafter ~ died + politybefore, data = leaders)
Coefficients:
(Intercept) died politybefore
-0.4346 0.2616 0.8375
Note that to include confounding variables in a linear regression model you simply add these on the right hand side of your regression equation, using a + sign to add each additional variable to the other X variables.
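For example, if we wanted to control for a second variable as well, say year (used here purely to illustrate the syntax, not because we claim it is a confounder in this study), we would write:

```r
# Each additional X variable is joined to the model with a + sign
lm(polityafter ~ died + politybefore + year, data = leaders)
```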
After controlling for countries' democracy scores recorded prior to the assassination attempt, we estimate that the death of a leader after an assassination attempt increases a country's polity score by only 0.26 points, on average.
The substantial reduction in the size of the slope coefficient once we control for countries' prior democracy scores suggests that much of the apparent effect of a leader's death on a country's level of democracy, which we observed in the previous model, was actually driven by pre-existing differences in countries' prior democracy scores.
Interpreting the regression coefficient for politybefore tells us that a one-point increase in a country's democracy score prior to an assassination attempt is associated with a 0.84-point increase in that same score after the attempt, holding whether the leader died constant. In other words, polity scores before and after assassination attempts are strongly correlated, which suggests that, on the whole, successful assassination attempts do not make a large difference to the average levels of democracy observed in the countries considered here.
This exercise provides a clear demonstration of the fact that the difference-in-means estimator is not equivalent to the slope coefficient estimated by a linear regression in cases where regression models include confounding variables, in addition to the outcome (Y) and treatment (X) variable.
When looking to estimate causal effects while controlling for confounding variables, we should use linear regression, with the confounders included as additional predictors, rather than the simple difference-in-means estimator.