If Shrimp, Then Towel (extended scientific cosplay edition)

Science & Beers (June 26th) – 10 minutes!

Author: Dr Charles T. Gray, Datapunk

Affiliation: Good Enough Data & Systems Lab

Published: May 3, 2025

Cultural identity

When I meet people in Denmark, they invariably say,

You’re Australian? Put another shrimp on the barbie!

But Australians call them prawns, and what we put on the barbeque are sausages—snags in Australian Ocker—so it’s more accurate to say, put another snag on the barbie.

Crocodile Dundee (1986) [1] is a bit confused about Australian identity.¹


Australians are confused about Australian identity.

Here’s how it goes when I meet another Australian.

Nice to meet you, Charles. And where do you come from?

Melbourne.

I mean, where do you really come from?

Melbourne.

No, I mean, where were you born?

Melbourne.

But, where were your parents born?

Melbourne.

Well, where were their parents born?

And here’s where it gets a bit tricky. Because the only real Australians are Aboriginals, the rest of us got off the boat pretty recently. But if I were to ask

Where the fuck were your grandparents born?

I’d be rude.


So, to make sense of my Jewish Chinese-Transylvanian Australian diasporic identity, I enrolled in cultural studies—because the only way to sort that out was to read essays written in the 1970s by a Palestinian-American deconstructing 19th century opera [2].

I found myself in philosophical seminars. Some theorist would deliver a dense analysis of Elizabeth Bishop’s treatment of the villanelle in her poem One Art [3] to be met with a pompous question from some git in the audience,

Yes, but hwhat is your ontology?

I’ve never felt very comfortable with terms like ontology and epistemology, can never seem to pin down their meaning. My understanding of ontology is that it is the study of being, how the existence of a thing is defined—which, frankly, has never once helped me catch a train.

Professional identity

I graduated into the global financial crisis of the late 2000s.

So, the world didn’t even give me a job to identify with, and told me my education was less than worthless.



It was time for pragmatism; I decided to get a ‘real’ job, be an accountant—or somesuch.

I enrolled in a mathematics degree to learn how to crunch numbers.

Mathematical identity

Here’s where mathematics starts you off:

Suppose there exists an $x$,

and, worse still,

Let the identity define the map from $x$ to $x$.


But the longer you stay in mathematics, the more grounded it becomes.

Mathematics becomes a guide for answering real-world questions, such as,

What is the correct dosage to treat this condition?

and

How do we ensure the trains run on time?

Answering questions

You might be vaguely familiar with current methodologies for answering questions with real-world data—heard these mentioned in passing, once or twice:

  • algorithms;
  • statistics;
  • data science;
  • machine learning;
  • large language models; and
  • artificial intelligence.

You know, grrl talk.



I’ve been doing grrl talk with data—in research, start-ups, and corporate—for nigh-on 15 years.



And now I’ve seen how it’s done, I’m here to say,

I have no earthly idea how the trains run on time.

Mind your Ps & Qs

To understand how broken data methodology is in practice, think of a question like this:

given we have some data, then we may conclude this result,

formally speaking,

$p \implies q$,

which reads as,

if $p$, then $q$.


A company (all companies) may wish to know, given our previous revenue, how much money will we make next year?

Suppose they made 2 million in the first year, 1 million in the second, and 3 million in the third. We might take the average and say fourth-year revenue will be somewhere around 2 million, hedging our bets.

So the $p$s here are previous annual revenue data points, and the $q$ is the estimate for next year.
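As a minimal sketch of that arithmetic, assuming nothing beyond the three figures quoted above (the variable names are illustrative, not part of any real pipeline):

```python
from statistics import mean

# Revenue figures from the example above, in millions: the p's.
previous_revenue = [2.0, 1.0, 3.0]

# Naive estimate for the fourth year, the q: the plain average of the p's.
estimate = mean(previous_revenue)

print(f"Estimated fourth-year revenue: {estimate} million")
```

Three numbers in, one number out. The hard part is not the computation, it is agreeing on which numbers count as revenue in the first place.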

Easy, right?



The modern data stack

But data now arrives in billions of rows each day. The bigger the company, the more data, the more complex the stack.²

It’s now necessary to have stacks of people to solve the equation:

  • gather the $p$s (data engineers);
  • produce the $q$s (business analysts and data scientists); and
  • carry the $p$s to the $q$s (platform engineers).

Notably, none of these people define the question, that is, the $p \implies q$.

Instead, they’re employed by leadership (the ones who pose the question) who, very helpfully, went to business school, where it seems they were taught that defining $p$ or $q$ is someone else’s job.

If shrimp, then towel

What will our revenue be next year?

Every day, analysts confidently present in boardrooms the results of data analysis,

If shrimp, then towel—with 97.2342% accuracy!

wherein shrimp and towel stand for numbers that have nothing to do with the question posed and no one knows how far they’ve drifted from intention.

[Diagram: a directed graph of humans and data entities, with edges stakeholder → analyst; analyst → engineer; analyst → analysis; engineer → source_1; engineer → source_2; source_1 → join; source_2 → join; join → output (analytical observation); output → analysis; analysis → stakeholder.]

Figure 1: Structured Intelligence System of a Questionable Analytical Observation. If the analyst does not adequately define the desired analytical observation, then the analysis is spurious.

And, grrl, it gets so much worse than the data theatre of industry.

Pull up a chair for a manicure, drop your nails in a dish to soak—’cause I’m gonna spill about how science gets done.

Scientific cosplay

If there’s only one ‘real scientist’ in the room, no real science gets done.

Paradoxically, analytical inquiry is centred on a scientist who thinks of data production like a chicken laying an egg, with principal investigators towelled-off by “AI” they never asked for.

No one disputes the impressive flourish of their algorithmic bend and snap but when someone like me asks them,

How do you trust your inputs?

They don’t snap back.

They flinch.

Then I get

Stay in your lane, data peasant.

Apparently I do math, stats, code, and data—

y’know, grrl stuff.

Hierarchical fragility

So, I offer scientific collaboration to work together to find a way to mitigate problematic data practices.

Countless men have slid into my DMs to offer allyship

…but when I suggest we instantiate:

- unique
- not_null

to validate their assumptions, I get:

Too theoretical;

I don’t have headspace;

I downloaded Category Theory for the Sciences [5], but at 600 pages, I haven’t had time to open it.

So, I ask,

Why do you need to read 600 pages of abstract math to instantiate unique and not_null?
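For a sense of scale, here is a rough sketch of what those two checks amount to. Only the names unique and not_null come from the text; the plain-Python helpers and the sample identifiers below are hypothetical, made up for illustration, and real data tooling may treat missing values differently.

```python
def not_null(values):
    """Return True if no value is missing (None)."""
    return all(v is not None for v in values)


def unique(values):
    """Return True if no value appears more than once (None counts as a value)."""
    values = list(values)
    return len(set(values)) == len(values)


# Hypothetical sample identifiers, invented for this example only.
sample_ids = ["a1", "a2", "a2", None]

print("not_null:", not_null(sample_ids))
print("unique:", unique(sample_ids))
```

Both checks fail on this sample: one identifier is missing and one is repeated. No category theory was opened in the making of this example.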

I’m told by the scientist they spoke to the “data guy”, an applied science PhD student who’s tasked with

  • building the data platform;
  • designing data architecture;
  • advanced data science;
  • all for peanut research assistant wages;

and who is so gaslit, so overworked, they would not dare naysay the ‘real scientist’—because data’s just an egg to lay, right?

And between them, they decided anything I suggest is just “not viable”.

Now I have a communication problem.

Apparently, it’s a me thing.

Faux allyship

They wish they could help, but…

They need to bend the p,

snap the q,

the $p \implies q$ show must go on.

But don’t worry, they’re my number one supporter, and totally here for me as a friend.

The stage

So, I step aside, leaving them the stage, back into my lane of logic, math, and philosophical nonsense to ponder:

If $p$ is false, and $q$ is true, then $p \implies q$ is vacuously true.
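To see why, a small sketch that prints the full truth table for material implication (the helper name implies is an illustrative choice, not something from the talk):

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


# Enumerate every combination of truth values for p and q.
for p, q in product([True, False], repeat=2):
    print(f"p={p!s:<5} q={q!s:<5} p implies q: {implies(p, q)}")
```

Whenever p is false, the implication comes out true whatever q is, so a false premise can never be caught out by its conclusion.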

Which means…

they can bend the $p$, snap the $q$

and still strut out a result that looks rigorous—

without anyone noticing it’s epistemically hollow.

It was never about math.

It was the question.

And when the system dictates who is allowed to ask the question—

it already knows who gets to matter in the answer.

Answering the right question

Logicians are developing ways to step back and ask,

Is our $p \implies q$ faithful in this complex system of people interoperating with tools?

A powerful way to do this is to construct an olog.



An olog is an ontological diagram of a system [5].



A bunch of us nerds have banded together (unpaid) at the Good Enough Data & Systems Lab to olog the shitfuckery of data science,



and the universalities we are documenting are terrifying in terms of human cost—at scale.


The Good Enough Data & Systems Lab is a voluntary collective of epistemologists, data scientists, and thinkers who work on diagnosing the harm caused by misapplied algorithms.


So, to ensure we are not doing if shrimp, then towel science, the question at the heart of life, the universe, and data is:

What the fuck is your ontology?



References

[1] Faiman P. Crocodile Dundee. 1986.
[2] Said EW. Orientalism. 25th anniversary edition with a new preface by the author. New York: Vintage Books Edition; 2014.
[3] Bishop E. The Complete Poems, 1927-1979. Farrar Straus Giroux; 1983.
[4]
[5] Spivak DI. Category Theory for the Sciences. MIT Press; 2014.

Footnotes

  1. To be fair, I’ve never actually seen Crocodile Dundee, and this line is the only thing I know about the film.

  2. ‘A modern data stack is a collection of tools and cloud data technologies used to collect, process, store, and analyze data. All the tools and technologies in a modern data stack are designed to handle large volumes of data, support real-time analytics, and enable data-driven decision-making.’ [4]