Science | SIOS | Spiegeloog 427: Anomalies

Robust Effects, No Theory Attached

By SIOS Editors | September 22, 2023

INTRO

It’s a busy night in the theater of soon-to-be-forgotten psychology essays: researchers are preregistering their guesses regarding the performance; the puppeteer is frenetically replacing his marionettes’ strings; and you, dear viewer, are on your way to the seat predicted on your ticket – 95% HPD credible interval [11, 14]. As you sit down in seat 13, the lights dim, the puppeteer hides behind his tiny stage, and the play is about to begin.

The curtains part and unveil two marionettes laid out like rags on the stage. With a flick of the puppeteer’s wrists, the two figures spring to life and seem to be uncannily staring at you, almost as if you’re meant to be the performer and them, the audience.

I’m pleased to see so many of you gathered here, – proclaimed the puppeteer, with only his head poking out from below the stage, – this theater rarely receives this many attendees. Today’s piece is titled ‘GRP 2021 Final Assignment’ and will be performed by my two favorites: Casper and Diane. The occasion is the reproducibility crisis in psychological research, and Casper and Diane can’t help but disagree on how to solve it.

Human bias is the root cause, plain and simple, – Casper said determinedly, – and the most practical way to combat it is through Registered Reports (Chambers, 2013).

Hmmm, I don’t know, – Diane replied coyly, – sounds debatable.

Oh, and debate they will. I leave the stage to you two, – the puppeteer declared as he ducked behind the marionette stage, – and by ‘you two’, I do mean me…, – he murmured to himself and smirked. – I should really find friends.

PART I

So, tell me, Casper, – Diane said as she stepped towards him, – why do you think we’re in the middle of a replication crisis (Open Science Collaboration, 2015)?

Why, isn’t it obvious? – Casper looked surprised and began gabbling. – We live, or rather perform, in a research landscape where researchers are rewarded for “novel” and “interesting” results more than for methodological rigor, by publishers who wouldn’t lift a finger to save a scientific field where most findings are best assumed to be wrong, because they might lose a few points off a meaningless “impact factor”, – he gasped for air and continued like a motor, – when even a first-year psychology student knows that using the mean of a highly skewed distribution in this context just doesn’t make any sense (Chambers, 2019; Ioannidis, 2005)! – smoke started coming out of his wooden joints. – No wonder researchers p-hack and HARK the hell out of their data and somehow end up successfully “finding” more than 80% of their initial predictions (Allen & Mehler, 2019), while often being as uncooperative as possible when someone tries to replicate their findings, or even derogatory when the results of the replication are “wrong” (Chambers, 2019), it’s–, – the sound of a string snapping interrupted Casper, – I think that was mine.
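(An aside for the viewer: Casper’s jab at the mean is easy to check. Here is a minimal sketch with invented citation counts – not data from any real journal – showing how a single blockbuster paper drags the mean, the quantity an impact factor reports, far above what a typical paper receives.)

```python
import numpy as np

# Invented citation counts for twenty papers in one journal year:
# most are barely cited, one is a blockbuster.
citations = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2,
                      3, 3, 4, 4, 5, 6, 7, 9, 12, 250])

print(f"mean (what an impact factor reports): {citations.mean():.2f}")      # 15.65
print(f"median (what a typical paper gets):   {np.median(citations):.1f}")  # 2.5
```

(The “average” paper appears to collect almost sixteen citations, while the typical paper collects two or three – the skew does all the work.)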

There goes your left elbow, – Diane noted. – I know that, and I’m just as frustrated as you are, but what’s the underlying cause of all of this? – Diane already knew how she’d answer, but she wanted to hear Casper out.

Human bias, as I said, – Casper spoke calmly again. – Researchers oftentimes aren’t even aware of how arbitrary the analysis path they choose is, or how variable their sample is.

And why is that?

Why? Well, that’s just how it is.

Why does physics not suffer from this on such a large scale then? – Diane prodded.

Physics is different. In physics you don’t have much choice in how to analyze your data, and oftentimes you don’t even need data, because we trust the predictions that physicists make (Fried, 2020).

Exactly, because physics has what?

Oh boy, I see where you’re going, – Casper crossed his arms. – Because physics has theory.

There we go, – Diane sounded oddly victorious, – predictions in physics are strongly linked to theory, and if a hypothesis has to go out the window because it wasn’t supported, then so does the theory (Oberauer & Lewandowsky, 2019). What psychology needs most to counteract human bias is strong theory, not reliable statistics.

Alright, but that’s still not an argument against Registered Reports. If researchers ever want to reach some kind of theoretical psychology, then we still need robust empirical findings to build those theories (Muthukrishna & Henrich, 2019). And the best shot at minimizing human bias in psychological research that we have right now is Registered Reports – clear separation of exploratory and confirmatory findings, elimination of p-hacking and HARKing on the researchers’ end, and combating of publication bias on the publishers’ end (Chambers, 2019; Wagenmakers et al., 2012). Besides, you make it sound so simple, as if researchers in psychology just forgot that theory exists.

Have they not though? – Diane’s victorious tone had vanished. – Look around you, Casper – the Big Five (Costa & McCrae, 2008), g (Spearman, 1904), Ekman’s basic emotions (Ekman, 2005), – these are our theories, or at least that’s what we believe. We’ve conflated statistical models and theories so much that we may as well have forgotten that theory exists (Borsboom, 2013; Meehl, 1978; Proulx & Morey, 2021). And I’m worried that by focusing on methodological rigor and robust effects, we’re only treating the symptoms and not the root cause.

What’s the root cause then?

PART II

Before we get there, let me ask you: what makes psychology a science? – Diane asked genuinely.

I feel like I’m doomed to fail by answering, – Casper chuckled. – Well, if we adopt a Popperian perspective: our phenomena are quantifiable, our hypotheses testable, and our theories falsifiable (Chambers, 2019; Popper, 2010).

Yeah, and both you and I are aware that these assumptions hold up flimsily at best: most of our measures are not additive, so even if psychological phenomena are quantifiable, we cannot check this assumption through the way we do measurement (Michell, 1997, 2008); most of our hypotheses are so ill-defined that whether a given hypothesis has actually been tested is in the eye of the beholder (Scheel, 2021); and the prevalence of NHST, and thus the uninformativeness of null findings, leads to an inability to falsify and “kill” theories in most research (Ferguson & Heene, 2012). Adopting a Popperian perspective just leads to game over, so we need a different perspective.
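(Another aside: Diane’s point about uninformative nulls can be checked with a quick simulation. The numbers here are invented – two groups of n = 20 and a conventional two-sided t-test – but they show that a non-significant result is the most likely outcome whether the true effect is exactly zero or merely small, so a lone “p > .05” can hardly kill a theory.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2021)
n, sims = 20, 10_000  # 20 participants per group, 10,000 simulated studies

def nonsignificant_rate(effect):
    """Fraction of two-sample t-tests that come out 'null' (p > .05)
    when the true standardized effect size equals `effect`."""
    misses = 0
    for _ in range(sims):
        control = rng.normal(0.0, 1.0, n)
        treatment = rng.normal(effect, 1.0, n)
        if stats.ttest_ind(control, treatment).pvalue > 0.05:
            misses += 1
    return misses / sims

print(f"p > .05 when the true effect is d = 0.0: {nonsignificant_rate(0.0):.0%}")  # ~95%
print(f"p > .05 when the true effect is d = 0.3: {nonsignificant_rate(0.3):.0%}")  # ~84%
```

(Under both scenarios the test usually fails to reject, so the null finding barely discriminates between “no effect” and “effect, but underpowered study”.)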

Are you gonna try making me read Kuhn again? – Casper groaned.

I’ll summarize it. Science is a puzzle, and our job as scientists is to arrange the pieces into a coherent whole. We have an idea of that coherent whole because we share a common intellectual framework, a worldview, a paradigm. Think Newtonian mechanics in physics. This shared paradigm determines what observations are important and what methodological tools we need for them – it decides which puzzle pieces are relevant and in what ways we can combine them. For a while, everything looks great – inertia, F = ma, action is equal to reaction; over time though, we begin uncovering pieces that are impossible to fit to the others – the speed of light being independent of the observer, time being distorted by gravity, – and suddenly, we require a new paradigm, so that the pieces we’ve uncovered so far can fit again. And on, and on it goes. Science swings back and forth between ages of normality and ages of revolution (Kuhn, 1962; Proulx & Morey, 2021).

Right, and how does psychology fit here?

Psychology is really a bunch of different psychologies, and some of them are going through an age of normality – developmental psychology has the paradigms of Piaget, Vygotsky, and Erikson, and these guide research on language development and executive function, with no revolution in sight (Proulx & Morey, 2021). Most psychologies are not like this though. When faced with the prospect of revolution and confronted with a paradigmatic void, they didn’t go through with the revolution.

Well, why not?

To put it simply, revolutions are scary. Revolutions mean that reality is shifting, uncertain, and it’s hard to deal with that. According to Camus (1991), there are three general ways of dealing with an uncertain reality: first, you can adopt a dogmatic belief, e.g., religion, and suddenly everything is fine again, because we’ve reintroduced certainty into our worldview.

Alright, but science and dogma don’t mix, so that’s not gonna work.

Precisely, so we try a second option – you adopt a different kind of certainty: everything is lost, nothing matters, better to just end it all. This isn’t much better than the first option, and just doesn’t seem very productive, you know? Third, we accept the uncertainty inherent to life and continue grasping at and tumbling towards a more understandable reality. This last option neatly aligns with creating a new paradigm, as Kuhn would put it, but most psychologies didn’t end up choosing the third option. When faced with a paradigmatic void, with an uncertain reality, we filled that void in with statistical models – synthetic certainties (Proulx & Morey, 2021). And so, focusing on theory doesn’t even make sense in the current paradigm, if you can even call it a paradigm. What makes sense is to focus on effects.

Alright, but won’t focusing on effects just lead to another revolution? Will this not solve itself in due time then?

It won’t and it won’t. We’re in the middle of a replication crisis (Open Science Collaboration, 2015), as you said – the perfect grounds for a revolution – yet the current solutions to it, let’s stick to Registered Reports, are only effects-focused. When confronted with the instability of synthetic certainty as a replacement for a paradigm, for something theory-based, we react with even more statistics. In a way, this is plain pathological. This is the root cause. This is what we’re not acknowledging, – Diane paused, as if she was waiting for something to happen.

Took you a while to get to your answer.

I’m like Scheherazade, I guess. Some stories take more time to finish, – Diane laughed. – Well? It’s morning now, what do you think of my story?

Hmmm, yeah, off with your head, – Casper said, unamused.

Damn it.

PART III

Look, it’s not the content of what you said that I have issues with, it’s the implication. Because right now, it seems like the only way to get out of this mess we’re in is to travel back in time 50 years, when the steam from the cognitive revolution was starting to cool off, and start implementing theory construction courses in all psychology programs.

I mean, that’s one way of solv–.

Do you really believe that psychology is just doomed now? That we’re just as scientific as astrology? – Casper asked sarcastically.

…Yes?

Girl, no you don’t.

Yeah, you’re right… I’m full of crap.

Yeah, because at the end of the day, when your friends and family ask you over the holidays: “So what are you doing research on now?”, you’re not gonna say: “Actually, I’m just doing statistical rituals”; what you’re gonna say is: “Hierarchical Bayesian neurocognitive modeling”.

That is a nice image…

Thank God we’re still a science, right? Except we’re not now, Diane! You stripped us of our science clothes! What do we tell our friends and family now?!

Just show them pictures of your head covered with EEG electrodes and some electrical activity, – Diane looked at the audience, – always works. And what do we tell ourselves?

Well obviously, talking about it isn’t gonna do all the work. We need some sort of mass action.

Well, are Registered Reports out the window?

No, – Diane sighed, – better fewer robust effects than a graveyard of undead theories (Ferguson & Heene, 2012), but we must be aware that if we stop at mass adoption of Registered Reports, then we’re just taking one step forward into synthetic certainty. What we should be doing is taking a step back to get a better view of this incomprehensible patchwork of a research field, realize that we’re screwed when it comes to theory-based psychology, and try something different.

‘Different’ being what?

Well, for starters, acknowledge that the link between hypothesis and statistical test is murky, and that a statistical test, be it NHST or Bayesian, might not always be appropriate (Amrhein et al., 2019; Scheel, 2021). As a field, we’re more like linguistics than physics, yet we run experiments at the drop of a hat (Zwaan, 2013). In linguistics, it’s commonly understood that some hypotheses can’t be tested, and so a lot of work happens through speculative articles, thought experiments, essays. And even in physics – Schrödinger’s cat is a thought experiment, yet it was essential for the understanding of quantum superposition (Schrödinger, 1935).

I agree, there’s room in psychology for essays, but we don’t do experiments just for the fun of it – empirical studies are the currency of our field, like it or not; students need them to get their degrees, faculty need them to secure tenure and receive grants (Zwaan, 2013). And if we wanted to discuss the institutional pressures for experimentation, we’d need a whole new essay.

Agreed, let’s stick to experiments then. If I had to come up with a positive of Registered Reports, it’s that they’re hard – researchers are forced to admit that they don’t know what would falsify many of the hypotheses in psychological research (Scheel et al., 2021), and I hope that this makes researchers more likely to fill that paradigmatic void with actual theory. In the case of replications, researchers are confronted with a multitude of auxiliary hypotheses that weren’t specified in the original study – did the manipulation actually work, how random was the sampling, were the instruments in perfect order (Amrhein et al., 2019; Scheel et al., 2021)? Researchers are again asking the questions that have been mostly ignored for years.

See? There is hope.

I wouldn’t call it hope, it’s just trying something different…, – Diane didn’t know what else to say, but she didn’t mind it. – How’s your left elbow?

Still immobile, – Casper felt the same.

What’s wrong?

What happens when this play ends?

We get our strings ripped out and wait for the next essay.

No, I know that. I mean what happens to psychology?

I don’t know, just check Twitter, I gue–, – Casper collapsed on the stage.

Cool, – Diane followed.

As the two marionettes lay on the stage, you couldn’t help but feel that they were still staring at you, waiting for your performance to start.

Student Initiative for Open Science

This article has been written as part of an ongoing collaborative project with the Student Initiative for Open Science (SIOS). The Amsterdam-based initiative is focused on educating undergraduate- and graduate-level students about good research practices.

