
Or the phenomenon of things that are only real in language  

Most of us have at some point seen some variation of the line about tomatoes being fruit:

 “Intelligence is knowing that a tomato is a fruit. Wisdom is knowing not to put it in a fruit salad.”

It’s also true.  

Vegetables, as a category, are a social construct. In fact, we have to define (though usually context does the work) whether we mean “culinary vegetables” or “botanical vegetables” when looking at what people mean when they say “vegetable”.

Botanical: A plant with an edible body/stem. Think artichoke, broccoli, etc. Versus fruit: an edible development of a plant’s ovary, containing seed(s).  

Culinary: An edible plant part that goes into a savory dish. Versus fruit: an edible plant part that typically goes into a sweet dish.  

Quite often, the jokes about tomatoes being fruit are a sort of “gotcha” from someone who has learned the scientific definition but not necessarily that words, meanings, and contexts fall into different categories. Or from someone who is aware but just thinks it’s funny.

Of course, even these categories are sometimes fuzzier than we’d like to think. Sometimes people glaze their tofu or carrots in orange juice and still call the carrots a vegetable and the orange a fruit, even though that breaks the culinary definitions.

Why?  

Because vegetables are a social construct. Defining them is more about emotional feel than scientific validation. It’s like when you call something a ball, you’re not measuring the properties and use cases of the object being a ball, you’re matching the ball to your existing categories of what it means to be a ball and running with it. It is round, it is a ball. Early on, you may have thought anything even remotely close to round was a ball. Ask your nearest 18-month-old, ball or no ball? Apple? Ball.  

Quite often, broad categorization is good enough. The system is as complex as it has to be in order to enable communication, because humans rely on a sort of minimum viable product of language, good enough.  

It also brings us to scenarios like rhubarb. Scientifically, it’s a vegetable. Emotionally, it’s a fruit.  

Somehow, despite being used almost exclusively in pies, glazes, teas, and candies, rhubarb stays in the vegetable club. Why? We don’t have any other (social construct) fruits that are stem vegetables. There’s no precedent. Stems are vegetables. Period.

Or figs, which most people would agree are fruits but which, in botany, are not true fruits but syconia (an inverted stem with multiple ovaries).

And what about fungi? Mushrooms are the fruiting bodies of fungi.  

They’re also so far removed from plants that they’re closer to being animals. Yet, we define them as vegetables.  

In fact, if you look at the literal definitions of the word vegetable, fungi shouldn’t be in it. They are not plants. Yet, emotionally, they are edible growing things, and that’s enough to be categorized as vegetables.

If it looks like a duck and quacks like a duck… 

Vegetables are also far from the only example of this phenomenon.  

After all, what exactly is it that makes a fish a fish?  

Looking at fish phylogeny, you’d have to admit that lobe-finned fish, including the Coelacanth, are more closely related to humans than to a tilapia.  


Makes sense, sure, carry on.  

So, a fish is a limbless cold-blooded vertebrate animal with gills and fins, living wholly in water. Cool. Got it.

We haven’t seen that go wrong in popular misquotes of Diogenes at all.  

Yet, we do seem to consistently do this kind of broad categorization by sort of sticking blocks together (cold-blooded + gills +…). And to a large part, the cause is how we use language.  

The Lego Hypothesis: Making sense of complexity through reusing parts 

In Symbols (2023) Richard Sproat discusses the two schools of thought surrounding written language:  

  • Inclusivist: Any system of graphic symbols used to convey some amount of thought.
  • Exclusivist: Any system of graphic symbols that can be used to convey any and all thought.

Sproat goes on to explain that a system of writing must necessarily be exclusivist in order to be linguistic writing, capturing human language, rather than simply encoding a set of pre-defined and not-necessarily simple but certainly rigid thoughts. 

Writing needs to be able to cover all aspects of human thought, because language can.  

So, language has to be able to convey any and all human thought.  

That’s necessarily a vastly complex undertaking. Most of us will have thousands of thoughts in a given day. Not just about what it means to be a fish or a vegetable, but also what to have for breakfast, whether a fish counts for breakfast, how a project from work is going, or the concept of being alone and hungry in a bombed out building in Gaza while also being glad to not be in that situation.  

Humans are complex, the world is complex, and therefore language must communicate complex things in as simple a fashion as possible. The system is as complex as it has to be to enable communication, because that’s its purpose.

Humans achieve this in much the same way we achieve writing. We use simple building blocks and reassemble them into infinitely complex combinations and patterns.  

Using just 26 letters, the Latin alphabet is used by about 70% of the global population to share any and all thought. That’s achieved using a hierarchical system of phonemes (units of sound), graphemes (letters representing units of sound), and morphemes (smallest units with their own meaning), which can be mixed and matched to add even more variation and possibilities to how language is written.

Those units are also paired together in surprisingly predictable ways. In fact, it’s how organizations like Google automatically identify which language you’re typing. For example, you’d be pretty surprised to see a K following a TH in the English language, but not if there was a space in between.  
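As a rough illustration of the TH/K point, here is a toy Python sketch that counts three-character runs in a scrap of English. The sample sentence and the function name are made up for this example, and a real language detector would be trained on far more text; this is not a description of how Google or anyone else actually does it.

```python
from collections import Counter

def letter_trigrams(text):
    """Count every run of three characters, keeping spaces so word boundaries count."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalpha() or ch == " ")
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))

# Tiny invented sample; a real model would use a large corpus.
SAMPLE = "the bath king thinks that both kids think the path kept them there"

counts = letter_trigrams(SAMPLE)
print("the:", counts["the"])  # a very common English trigram
print("thk:", counts["thk"])  # TH directly followed by K
print("th :", counts["th "])  # TH followed by a space (as in "bath king")
```

In this sample, TH followed directly by K never shows up, while TH followed by a space does. Compare enough of those counts across languages and you have the start of a guess about which one you’re looking at.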

In fact, this is the same basic idea you see in DNA. Most people are familiar with the 4 bases, which pair up to form a double helix structure. But DNA also reuses the same small units again and again to encode information.

In The Language of Living Matter (2021) Bernd-Olaf Küppers delves into the similarities between these two systems of organizing complexity. Both have patterns of using standardized pre-made libraries of pieces which are used again and again to reduce complexity.  

Every living thing is made up of the same base “language”, assembled level by level:

  • 4 genetic letters (nucleotides).
  • 64 nucleotide triplets, called codons, built from those letters.
  • Somewhere under 1000 genes, made using codons and nucleotides.
  • Close to 1000 scriptons/operons, each containing up to 15 genes.
  • 1 or more replicons, each using several hundred scriptons.
  • 1 or more segregons, each using 1 or more replicons.
  • 1 genome, using a few segregons.
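The jump from 4 letters to 64 codons is just counting: three positions, four choices each, so 4 × 4 × 4. A minimal Python sketch of that arithmetic, using the standard base letters A, C, G, and T (the variable names are only for illustration):

```python
from itertools import product

BASES = "ACGT"  # the four DNA nucleotides

# Every ordered triplet of bases is one codon.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # 4 * 4 * 4 = 64
print(codons[:4])   # ['AAA', 'AAC', 'AAG', 'AAT']
```

Four reusable pieces, stuck together three at a time, already give you 64 distinct blocks to build with.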

Languages share this phenomenon.  

A book is made up of chapters, which include sections, which include paragraphs, which include sentences, which include clauses, which include words, which include morphemes, which include graphemes. And at the base level, tiny units of meaning are strung together into larger units of meaning, which are re-used again and again for efficiency.  

E.g., English adding un- or re- to a word: neither has its own meaning in the real world, but in language each meaningfully changes what is being said using a predefined snippet (a morpheme).

Or a JavaScript engineer using a code library to look up and reuse repeating elements rather than coding them again from scratch. Back to the section header: everything is Lego.

Things self-organize in fairly complex ways from relatively simple bases. Some things go with each other (T and H) others do not (TH and K).  

So where do categories come in?  

We organize those things in categories so that we can access our building blocks as quickly and as easily as possible. To simplify the vast expanse of possibilities that come with life.  

You don’t have to dive down into the phylogeny of life and declare that you would like the fruiting body of a fungus, the stem vegetable of a flowering plant (asparagus), and the caryopsis (one-seeded fruit) of a grass (rice) for “vegetables with rice” to be accurate. The socially constructed category of vegetables does that for you.

But that does sort of open up an entirely different can of worms, doesn’t it?  

Because if language defines what things are, if your language is full of social constructs that define things at a high level, then does the language you speak define how you experience the world?  

Linguistic relativity and determinism

If you’re raised with a language that categorizes eels as being something other than fish, do you experience the world in that new way? 

Well yes, and no.  

The theory is called linguistic determinism: the idea that your language defines the categories by which you define your world. Being born in a part of the world with one language will change your perception of the world compared with being born in a different part of the world with a different language.

The Sapir-Whorf hypothesis is one of the most popular forms of linguistic determinism, originally detailing how language has consequences for human cognition and behavior. Whorf’s theories were redefined into testable experiments to see if cultures with different classifications of color actually experience color differently. Answer, yes, and also no. It’s complicated.  

One of the simplest examples I can use to outline the concept is directions. In Hawaii, it is common (to this day) for locals to orient based on the coastline and the mountains. Makai (to the sea) and Mauka (to the mountains) effectively allow orientation as well as left and right.  

So what makes orientation based on coastline and mountains any more true or valid than orienting based on dominant/non-dominant hand? Well, nothing, depending on the environment you’re in.  

The fuzziness of truth  

A category is an attempt to define a truth: when is something one thing and not another? That’s actually a difficult question to answer. And we’ve known that for most of recorded history/thought.

Eubulides of Miletus is famous for his question “how many grains make a heap”, pointing out that it’s difficult to tell what in fact does make a heap. One grain is not a heap. Two grains is also not a heap.  

The sorites paradox (from the Greek soros, for heap) points out that if one grain more or less never turns a non-heap into a heap, then adding one grain a million times should never make a heap either. (A series of individually correct steps can lead to a wrong result.)
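To see why the paradox bites, it can help to write both approaches down. The Python sketch below is only an illustration: the function names are invented, and the 10,000-grain cutoff is an arbitrary number picked for the example, not a real definition of a heap.

```python
def heap_by_small_steps(grains):
    """Follow the paradox literally: one extra grain never changes the verdict."""
    verdict = False  # premise: a single grain is not a heap
    for _ in range(2, grains + 1):
        # premise: if the previous pile was not a heap, this one is not either
        verdict = verdict or False
    return verdict

def heap_by_threshold(grains, threshold=10_000):
    """Pick a cutoff instead. The threshold is a made-up, arbitrary number."""
    return grains >= threshold

print(heap_by_small_steps(1_000_000))  # False: the logic never lets a heap appear
print(heap_by_threshold(9_999))        # False
print(heap_by_threshold(10_000))       # True: one grain suddenly makes all the difference
```

Neither version feels right. The first never admits a heap exists, and the second pretends there is a single magic grain.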

The thing about heaps is that they aren’t a mathematical concept.  

A pair is mathematical. It’s a set of two, unless you’re dealing with scissors, in which case it is two blades (probably we use pair because scissors used to come apart, a pair of blades) or trousers (potentially also because they used to come in two pieces but maybe also because they cover two legs) or glasses (which again, used to come in singles).  

But a heap? Well, no, that’s more like rhubarb, it’s logically a vegetable but emotionally a fruit. You have to decide emotionally when something is a heap or not.  

And the thing is, that’s true of most categories.  

Take this black and white color gradient:  

When does the black stop being black? When does the white stop being white? Most of us will make a selection based on emotional feel, and chances are, the black stays black to you longer than the white does. Why is that? Why is black defined as black in further grades of impurities than white?  

When does a shrub become a tree? When is it a stream instead of a river? Can things be in multiple categories simultaneously? Of course they can.  

In fact, most things can be in as many categories as you’d like. A plant can be both a shrub and a tree. The categories are in many ways simply arbitrary: human attempts to label naturally existing phenomena.

The question only tends to be “how true is it that this item fits into this category?” You could look at scientific measurements and analysis: “It is a tree when it is over X meters tall with a single stem that branches out some ways off the ground…” But most people will make that judgement emotionally, based on factors like leaves, branches, size, etc.

It’s very hard to say that a sapling is a tree. It might be very easy to label an American hazelnut as a tree if it’s old and very large (despite it being a shrub using the botanical definition). 

Instead of trying to apply scientific fact to every category, most people tend to use a sort of emotional truth gradient, much like you probably did when deciding when the blacks stopped being black in the gradient above:  

So, the question becomes: at which point is something true, and at which point is it false? At which point do you categorize this woody-stemmed plant with green leaves as a tree? And more importantly, is that the same for everyone in your culture or language, let alone for people outside them?
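One way to picture a truth gradient is as a number between 0 and 1 rather than a yes or no. The Python sketch below does this for “how black is this gray?”. The function name and both cutoffs are invented for illustration; they are not measured values, and your own cutoffs would almost certainly differ.

```python
def degree_black(lightness, still_black=0.2, not_black=0.6):
    """Return how 'black' a gray feels, from 1.0 (fully) down to 0.0 (not at all).

    lightness runs from 0.0 (pure black) to 1.0 (pure white).
    Both cutoffs are made up for this example.
    """
    if lightness <= still_black:
        return 1.0
    if lightness >= not_black:
        return 0.0
    # Between the cutoffs, the "blackness" fades in a straight line.
    return (not_black - lightness) / (not_black - still_black)

for lightness in (0.0, 0.2, 0.4, 0.6, 1.0):
    print(lightness, round(degree_black(lightness), 2))
```

Shift either cutoff and the same gray becomes more or less “black”, which is roughly what happens when two people, or two cultures, draw the line in different places.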

Of course, what we’re really getting into here is objectivism. Linguistic categories provide the link between objectivist metaphysics (the nature of the world, agnostic of human understanding) and epistemology (the nature of human cognition, language, and knowledge). Given any property or collection of properties, one has a category, linking the real world to the epistemological.

How one defines and organizes those properties makes the difference. And more importantly, those categories are often arbitrary rather than grounded in absolute truth because there is no such thing as absolute truth to a category.  

Boundaries are fuzzy, things are often more than one category, some things fall in between and can be hard to define. And people can learn categories from multiple sources, which might have the same qualifications, but not the same contents.  

Categories and linguistic determinism 

So, what happens when someone with concepts/categories of directions as Makai/Mauka is introduced to concepts of left/right and the cardinal directions (north/south/east/west)? Do they then have more concepts with which to navigate their world more effectively? Yes. Does their understanding of their world change? Not really.

That’s (weak) linguistic determinism. Having a concept from a language gives the speaker of that language the tools to use that concept.  

That’s opposed to strong linguistic determinism, where having the concept directly influences your experience of the world. (Which is a bit of nonsense, but you can see the concept in the film Arrival (2016), where learning a heptapod language allowed the heroine to change her experience of time.)

On the other hand, being able to categorize vegetables as savory plant-based ingredients versus categorizing them as edible stems of plants doesn’t make a lot of difference to your daily life, except, in the latter case, you might need a different definition to define which ingredients you see as suitable for putting on a pizza and which should really be used in jam instead. Looking at you, pineapples.  

Still, categories are one of the things that make language transmittable. We package everything into neat little concepts that we can reuse, share, learn easily, etc.

But are those concepts universal? No! You wouldn’t say that knowing left and right gave you a universal grasp of what direction is. Just like you wouldn’t say that learning a new concept for time gave you an ability to see into the future.  

Similarly, you probably shouldn’t go around saying there’s only one way to be a vegetable. After all, botanical definitions have their place as much as culinary ones do.  

It’s just up to you whether you’re comfortable adding rhubarb to your risotto just because it is, indeed, a vegetable. How much it is true that it is a vegetable and how much it is true it is a fruit is really up to you to decide… but good luck communicating that to others.