Solutions Seminar 3

Elizabeth Simon
2024-02-12

Do Women Promote Different Policies than Men?

Based on Raghabendra Chattopadhyay and Esther Duflo. 2004. Women as Policy Makers: Evidence from a Randomized Policy Experiment in India, Econometrica, 72(5): 1409–43.

All materials presented here build on the resources for instructors designed by Elena Llaudet and Kosuke Imai in Data Analysis for Social Science: A Friendly and Practical Introduction (Princeton University Press).

Let’s continue working with the data from the experiment in India. As a reminder, Table 1 shows the names and descriptions of the variables in this dataset, where the unit of observation is villages.

Table 1: Variables in “india.csv”

Variable Description
village village identifier (“Gram Panchayat number_village number”)
female whether village was assigned a female politician: 1=yes, 0=no
water number of new (or repaired) drinking water facilities in the village since random assignment
irrigation number of new (or repaired) irrigation facilities in the village since random assignment

In this problem set, we will practice (1) how to estimate an average treatment effect using data from a randomized experiment and (2) how to write a conclusion statement.

As always, we will start by loading and looking at the data (don’t forget to set your working directory first!):

india <- read.csv("india.csv") # reads and stores data
head(india) # shows first observations
       village female water irrigation
1 GP1_village2      1    10          0
2 GP1_village1      1     0          5
3 GP2_village2      1     2          2
4 GP2_village1      1    31          4
5 GP3_village2      0     0          0
6 GP3_village1      0     0          0

  1. To estimate the average causal effect of having a female politician on the number of new (or repaired) drinking water facilities we would use the difference-in-means estimator.

  2. To find the average number of new (or repaired) drinking water facilities in villages with a female politician, we first need to install and load the tidyverse packages:

install.packages("tidyverse") # installs the tidyverse packages (only needed once)
library(tidyverse) # loads the tidyverse packages, including dplyr, which provides the summarise() function

We can then compute the mean of water separately for villages with and without female politicians like so:

india %>% group_by(female) %>% summarise(mean = mean(water)) # calculates the average of water by female
# A tibble: 2 × 2
  female  mean
   <int> <dbl>
1      0  14.7
2      1  24.0

We write the name of the dataframe, india, before the first ‘pipe’ operator %>% because the variable water, whose mean we want, is stored in this dataframe. The group_by() command then groups the rows of india by the variable female: one group of villages with female politicians, represented by the value 1 (see Table 1), and one group of villages with male politicians, represented by the value 0. Inside summarise(), the first mean (to the left of =) is the name of the column that will hold the result, and the second, mean(water), is the function call that computes the average of water within each group defined by female.
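To see the grouping step at a small scale, the sketch below rebuilds only the six rows printed by head(india) and runs the same pipeline on them. The object names india_head and water_by_female are chosen here for illustration and are not part of the original materials. Because only six villages are used, the resulting means differ from the full-sample values of 14.7 and 24.0.

```r
library(dplyr)  # provides %>%, group_by(), and summarise()

# The six rows shown by head(india); NOT the full dataset
india_head <- data.frame(
  village    = c("GP1_village2", "GP1_village1", "GP2_village2",
                 "GP2_village1", "GP3_village2", "GP3_village1"),
  female     = c(1L, 1L, 1L, 1L, 0L, 0L),
  water      = c(10L, 0L, 2L, 31L, 0L, 0L),
  irrigation = c(0L, 5L, 2L, 4L, 0L, 0L)
)

# Same pipeline as in the text: one row of output per value of female
water_by_female <- india_head %>%
  group_by(female) %>%
  summarise(mean = mean(water))

water_by_female
# female == 0: mean of (0, 0)         = 0
# female == 1: mean of (10, 0, 2, 31) = 10.75
```

Checking the two group means by hand against the six rows is a quick way to confirm what group_by() and summarise() are doing before trusting the output on the full dataset.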

Our output tells us that the average number of new (or repaired) drinking water facilities in villages with a female politician (female == 1) is 24 facilities.

  3. What is the average number of new (or repaired) drinking water facilities in villages with a male politician? Please answer with a full sentence.

The output presented in question 2 also answers this question: the average number of new (or repaired) drinking water facilities in villages with a male politician (female == 0) is about 15 facilities. We round 14.7 to the nearest whole number, 15, because facilities are counted in whole units.

  4. What is the estimated average causal effect of having a female politician on the number of new (or repaired) drinking water facilities? Please provide a full substantive answer. Make sure to include the assumption, why the assumption is reasonable, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect.

The difference-in-means estimator is calculated by subtracting the average outcome for the control group from the average outcome for the treatment group.

Difference-in-means estimator = average outcome for treatment group - average outcome for control group

Difference-in-means estimator = 24 - 14.7

Difference-in-means estimator = 9.3
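The same subtraction can be done in R rather than by hand. The following is a minimal sketch in base R: diff_in_means() is a hypothetical helper written for this illustration (it is not part of the original materials), and it is demonstrated on just the six rows printed by head(india), so its result here is not the full-sample estimate of 9.3.

```r
# Hypothetical helper: mean outcome among treated units (treated == 1)
# minus mean outcome among control units (treated == 0)
diff_in_means <- function(outcome, treated) {
  mean(outcome[treated == 1]) - mean(outcome[treated == 0])
}

# The six rows shown by head(india); NOT the full dataset
india_head <- data.frame(
  female = c(1, 1, 1, 1, 0, 0),
  water  = c(10, 0, 2, 31, 0, 0)
)

# (10 + 0 + 2 + 31) / 4 - (0 + 0) / 2 = 10.75 on these six rows only
diff_in_means(india_head$water, india_head$female)

# With the full dataset loaded as `india`, the equivalent call would be:
# diff_in_means(india$water, india$female)
```

Computing the estimator directly from the data avoids carrying rounded group means (24 and 14.7) into the subtraction.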

The positive estimate produced here shows that there tend to be more new (or repaired) drinking water facilities in villages with female politicians than in those with male politicians. Specifically, villages with female politicians have about 9 more drinking water facilities, on average.

Assuming that the villages that were randomly assigned to have a female politician were comparable to the villages that were randomly assigned to NOT have a female politician (a reasonable assumption since the female politicians were assigned at random), we estimate that having a female politician increases the number of new or repaired drinking water facilities by 9 facilities, on average.