
Cause-and-effect research



Hi all,

Based on a certain topic in the General Discussion that has since been removed, I thought it might be useful to provide a quick-and-dirty overview of what makes a credible cause-and-effect study in the natural and social sciences. You might also see these called impact assessments, interventions, program evaluations, causal inferences, or causal research.

I teach quantitative Program Evaluation at the University of Calgary and the London School of Economics, where a major focus is selecting appropriate 'research designs' that let us identify cause and effect and isolate it from mere correlation. To put it another way, you have likely heard the phrase, "correlation does not equal causation." A question should naturally flow from this statement, but often does not: so what does equal causation?

The pithy answer is a counterfactual: an example, subject, or scenario where all things are held constant except one key factor, which lets us identify that factor's influence on some sort of outcome. Put another way, if "X causes Y," the counterfactual to this would be, "if X does not occur, then Y does not occur."
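To put that in the potential-outcomes notation you'll find in textbooks (a quick bit of shorthand, nothing below depends on it): each subject i has two potential outcomes, Y_i(1) if exposed to X and Y_i(0) if not, and the causal effect for subject i is Y_i(1) - Y_i(0). The catch is that we can only ever observe one of the two for any given subject, which is why everything that follows is about constructing a credible stand-in for the outcome we can't see.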

When we seek to identify this factor and its influence, we have to assume certain things about it in order to determine its causal impact. These are:

1. Excludability - the outcome we observe for a subject is due to the factor of interest, to the exclusion of other factors related to the factor of interest.

2. Independence - whether the subject receives or experiences the factor of interest is unrelated to the subject's own traits or choices (i.e., no self-selection into treatment).

3. Non-interference (also called the 'Stable unit treatment value assumption' or SUTVA): the potential outcome of any subject is not affected by another subject also being influenced by the factor of interest.

If we want to safely assume these things with regards to a factor and its influence on a particular outcome, randomization is a powerful tool. Let's say we want to understand the influence of a vaccine on a population of interest. There are many factors that may influence the outcome of interest in this study, which is the rate of sickness from a particular disease: general health and well-being, mental illness, current acute illnesses, etc. But if we take a representative sample of that population, we can safely assume it experiences these random influences at the same rate as the population as a whole.

From there, we (the scientists) can randomly assign people to a 'treatment group'. For the purposes of studies, treatment does not necessarily mean a medical treatment. It just means that someone is assigned to receive the factor of interest. This could be a pamphlet from a political scientist studying GOTV campaigns, LinkedIn advertising about professional career mentoring, or whatever. In this example, the treatment is the vaccine. We call these types of studies 'experiments' or, more specifically, Randomized Controlled Trials (RCTs). They are generally considered the 'gold standard' in cause-and-effect research.

I have done this randomization myself. In my case, it was with the R statistical package 'randomizr', which acts like a random number generator. Everyone who got assigned to the treatment group was given the number 1; everyone else got the number 0. The 0s comprise the control group.

Now, going back to the above. If our sample is representative of the population, then any random events that may affect the outcome of interest (in this case, sickness from a disease) should be evenly distributed between the 1s and the 0s. If we don't see that before we administer the treatment, and rates are skewed, it might be because there's some sort of bias in our sampling. But if it is indeed evenly distributed, we can assume any post-treatment differences are the result of vaccine status, and not common factors shared between the two groups. In other words, we can exclude those factors. Likewise, we know that someone cannot self-select into the treatment group vs. the placebo, so we know assignment is independent of subjects' own traits. Non-interference *can* be trickier, but if we segregate subjects from each other, we can assume they aren't impacting each other's outcomes. Sometimes this means literally keeping research subjects apart under observation. Other times, it might just mean making sure they don't have any pre-study connections, making them sign a non-disclosure, etc.
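For the curious, here's roughly what that looks like in code. This is a minimal sketch, not my actual study code; the sample size and the covariate are made up purely for illustration:

```r
# install.packages("randomizr")  # if not already installed
library(randomizr)

set.seed(42)                      # so the assignment is reproducible
N <- 500                          # hypothetical sample size
Z <- complete_ra(N = N, m = 250)  # exactly 250 subjects get a 1 (treatment); the rest get 0 (control)
table(Z)

# A quick pre-treatment balance check on a made-up covariate:
age <- rnorm(N, mean = 45, sd = 12)
tapply(age, Z, mean)              # the two group means should be close if randomization worked
```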

After the treatment is administered, we can simply subtract the average outcome of the control group that got the placebo from the average outcome of the treatment group that got the vaccine. There are fancier ways to do this, but if randomization is done correctly, then a difference in averages is often good enough.
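Continuing the toy sketch from above (it reuses N and Z), the estimate really is just a subtraction of group means, with a t-test to gauge the uncertainty. The outcome data here is simulated, with an effect baked in purely for illustration:

```r
# Y is a hypothetical 0/1 sickness outcome, with a made-up vaccine effect
Y <- rbinom(N, size = 1, prob = ifelse(Z == 1, 0.05, 0.15))

ate_hat <- mean(Y[Z == 1]) - mean(Y[Z == 0])  # difference in group means
ate_hat                                       # the estimated average treatment effect
t.test(Y ~ Z)                                 # is the difference distinguishable from zero?
```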


Part II: Observational Studies

Sometimes, randomization is not possible. This can be due to ethical reasons, practical ones, or it might simply not be of interest. To give you an example, in my own research, I study the impacts of volunteerism on charitable activities, pro-social behaviours, political outcomes, etc. More specifically, I try to understand whether the volunteerism of one subject impacts these outcomes in someone else. Currently, this research is focused on sponsoring refugees.

It would be unethical for me to assign a random group of Americans to sponsor a refugee. Even if it weren't unethical, it might not be possible given budgetary constraints to do this study, assuming that I'd have to pay for the initial matching and transport of refugees. Perhaps most importantly, I am interested in the impact of real life public policies as formulated by governments, which are, by design, not assigned randomly. Therefore, an RCT assigning people to sponsor a refugee is neither ethical, practical, nor of interest to me.

So, we do an observational study instead, or a 'quasi-experiment.' It's called a quasi-experiment because, if designed correctly, we can account for excludability, independence, and non-interference, but only through means other than randomization. Put another way, in a quasi-experiment, the researchers themselves cannot randomly assign treatment status; someone else assigned it instead. To use a real-life example, I did a paper years ago estimating the change in enrolment by international students at Canadian universities following a certain travel policy implemented by the United States. In this case, treatment status (whether your country was banned from travelling to the US or not) was assigned by the US government, not me.

Quasi-experiments must be designed very carefully to make sure we aren't actually measuring other factors' influence on our outcome of interest. We must also make sure that we aren't just measuring traits that lead people to self-select into treatment status (independence), or that people who are treated by an intervention aren't influencing each other's outcomes (non-interference). Because we can't rely on randomization, we have to employ other techniques. Chief among these is matching research subjects who are as similar as possible, except that one happens to be in the treated group, owing to one distinct characteristic. I will provide two examples:

1) Under Italian law, a municipality of 5,000 people or more receives a 33% wage subsidy for its mayor. We can assume that towns in the same region with 4,999 and 5,000 people respectively are similar in many ways, except that one receives the subsidy and the other does not. Treatment status therefore comes down to having one extra resident, whose presence is more plausibly due to random factors than to some systematic difference between towns. In a study on this subject, it was found that towns just barely over the threshold attracted mayoral candidates with a much higher level of education, meaning more white-collar mayors. We call this a "Regression Discontinuity Design", by the way, meaning there was some discontinuous intervention somewhere along a continuous line (in this case, a line of people). A toy sketch of how such a design is estimated appears after these two examples.

2) David Card won a Nobel Prize in 2021, in part for work beginning with a famous 1994 study, co-authored with the late Alan Krueger, on the impacts of minimum wage increases on fast-food employment in Pennsylvania and New Jersey. NJ was raising its minimum wage, but PA was not. They surveyed restaurants across the river from each other before NJ raised its minimum wage and afterwards. If Card and Krueger had just measured restaurants in NJ pre- and post-increase, it would be impossible to exclude other factors. What if NJ had undergone a recession or a boom in the intervening period? What if its supply chains broke down? What if there was something distinct about its demographic pool of employees? Any one of these (or all of them) could influence firm responses to a minimum wage increase. By surveying restaurants on both sides of the state line (which were sometimes only divided by a street or a bridge), you could account for many of these: both PA and NJ would likely be influenced by the same economic cycle at the same time, and restaurants across from each other likely relied on the same supply chains and pool of employees.

To calculate the effect, they took the average employment rate in NJ post-increase and subtracted the average pre-increase, giving the change over time in NJ. They then did the same averaging and differencing exercise for PA over the same window, thereby capturing the change over time in PA. Finally, they subtracted the PA difference from the NJ difference, thereby eliminating any state-specific differences between them (maybe PA's economy is inherently stronger). The result is that state-level and time-level differences are stripped out, leaving only the impact of the minimum wage on employment rates. I'll leave you to find out the results of their study, but the design has since become more refined, and is now called "difference-in-differences": the difference between the first two differenced averages.
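As promised, here is a toy sketch of how a regression discontinuity like the Italian mayors example is often estimated: a local linear regression around the cutoff, allowing the slope to differ on each side. The data is simulated, not from the actual study:

```r
# Simulated illustration of a regression discontinuity at a 5,000-person cutoff
set.seed(1)
n    <- 1000
pop  <- round(runif(n, 3000, 7000))   # hypothetical municipal populations
D    <- as.numeric(pop >= 5000)       # 1 if the town's mayor gets the wage subsidy
educ <- 12 + 0.0005 * (pop - 5000) + 1.5 * D + rnorm(n)  # made-up outcome: mayor's years of schooling

towns <- data.frame(pop, D, educ)
bw    <- 500                                   # look only at towns near the cutoff
near  <- subset(towns, abs(pop - 5000) <= bw)

# Local linear regression, letting the slope differ on each side of the cutoff
fit <- lm(educ ~ D * I(pop - 5000), data = near)
coef(fit)["D"]  # the jump at the threshold is the estimated effect
```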
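And the difference-in-differences arithmetic itself, with toy numbers (emphatically not Card and Krueger's actual figures, so as not to spoil their results):

```r
# Hypothetical average employment per restaurant, purely to show the arithmetic
nj_pre <- 20; nj_post <- 21   # NJ before/after its minimum wage increase
pa_pre <- 23; pa_post <- 22   # PA over the same window (no increase)

nj_change <- nj_post - nj_pre       # first difference: change over time in NJ
pa_change <- pa_post - pa_pre       # second difference: change over time in PA
did       <- nj_change - pa_change  # difference-in-differences estimate
did                                 # here: 1 - (-1) = 2
```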

You can start to get a sense from the above of how researchers need to account for other factors, assignment to treatment status, and spillover effects, and how difficult that is without randomization. You essentially need to 'find', rather than select, a treatment group and an appropriate control group.

In my own research on refugee sponsorship, I have tried to do this with the following:

1) Compare current refugee sponsors to pending sponsors: people who have signed up to sponsor a refugee but haven't yet met them.

2) Collect information on zip codes.

The first helps address the fact that sponsoring a refugee is not assigned independently, and that we can't otherwise exclude the other factors that come with choosing to sponsor, since pending sponsors have made the same choice. Likewise, zip codes help me make sure that people in either group aren't talking to each other or influencing each other's outcomes (non-interference).
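For concreteness, here is a sketch of what that comparison might look like, using entirely made-up data standing in for my survey; lm_robust from the estimatr package clusters the standard errors by zip code, since sponsors in the same area may influence one another:

```r
library(estimatr)  # install.packages("estimatr") if needed

set.seed(7)
n <- 400
sponsors <- data.frame(
  current = rbinom(n, 1, 0.5),                       # 1 = currently sponsoring, 0 = pending
  zip     = sample(paste0("zip", 1:40), n, replace = TRUE)
)
sponsors$donated <- rbinom(n, 1, 0.30 + 0.10 * sponsors$current)  # hypothetical outcome

# Compare current vs. pending sponsors, clustering standard errors by zip code
fit <- lm_robust(donated ~ current, data = sponsors, clusters = zip)
summary(fit)
```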

To wrap this all together: whenever you see a study come out, don't just add a tally point for your side of a particular debate because the study confirms your views. Investigate how the study was designed. If it wasn't randomized, did they select an appropriate treatment and control group? To lightly touch on the example from the banned thread (without wanting to necromance its topic), we need to make sure that there aren't systematic differences between our population of interest and comparison groups; otherwise we cannot be sure whether we're measuring the impact of an intervention on them or their demographic traits.

Anyways, thank you for attending class today. Remember that papers are due in a few weeks and you will be penalized for going over the word limit.
