Language is a Bottleneck for Thought

Marco Giancotti

April 25, 2024

Cover image: The Telephone (circa 1884), Georg Pauli

Why is it so hard to convey your feelings? Why do words so often fail you when you try to infect a friend with your enthusiasm for a person, a place, an idea? How do you teach someone file version management in the fewest possible words? For someone who writes, this kind of frustration is part of the process. I feel it right now. But anyone who uses words or symbols to commune with others, no matter the format, is intimately familiar with the feeling. We take it for granted. But why is it so?

I believe the answer is not in the complexity of those concepts, nor in a nihilistic worldview where we're all alone in this world and beyond salvation, but in a system-centered interpretation of language.

To see that, we need only two concepts I've written about before (but will summarize below): models and framings.

The Problem with Words

A model is a system that predicts the future. It predicts only a small subset of the future, and it comes with a good deal of uncertainty, but it's an invaluable tool for any living organism. This includes mental models, the more-or-less-conscious, customizable models we humans cultivate in our heads. They're how you know that a sip of water will make you feel less thirsty, and that a certain kind of joke would not be appreciated by Linda. Ah, Linda.

Models are able to make predictions of things before they happen because they are simpler than the things being simulated. They do away with the irrelevant details, and for that they are faster. That doing away is done with framings. The way I define it, a framing is a choice of boundaries, or a choice of what to treat as a black box and what to see as interactions between black boxes. We don't need to think about what goes on inside the boundaries of those black boxes, only what they are prone to do, and how they can interact with other black boxes. Often we treat people as black boxes, for example, but also machines, places, the weather, you name it.

Now, all of this has little to do with language. The way we think (and a lot of what our bodies do without our need to think about it) is based on this kind of black-box mechanism, where we abstract away the things that don't matter at the moment in order to make our much-needed predictions. It happens inside each of our crania. We grow our mental models and tend to them as our most precious instruments—appendages, almost.

But as long as there's more than one human being on Earth, we'll want to communicate. Sharing our good framings is a powerful way to help each other. This leads to the question: if your mental models exist as the firing of neurons and the shifting of chemicals, how do you transfer them to someone else using only your muscles?

It's not really possible. Or at least, it's not feasible.

One problem is that language, whatever its form, comes with its own framing built in, what I call the Fundamental Framing of Human Language (FFHL). This is dictated by the use of words and sentences to communicate, and by the need for those words to be understood by many. It means that, the moment we try to put our thoughts into words, we're forced to map our framings into the framing of the shared vocabulary. This is rarely a neat one-to-one mapping.

If I wanted to explain to you what the excellent pizza I had last night tasted like, I'd have to reach for a series of words like "crispy", "fresh", "earthy", "harmony of flavors", and so on and so forth. These are emphatically not the terms in which I think about the flavor of pizza while I'm savoring it. I don't use a black box for "Mozzarella Cheese", which interacts in some way or another with the "San Marzano Tomato sauce" black box, and so on. My experience is purely non-verbal: it has boundaries and simplifications, of course, but not those implied by those words.

If I were the one baking the pizza, my framing would be different, and I might indeed have black box concepts for each separate ingredient. But, even in that case, I would probably tune their boundaries to my needs. For instance, I might think of the cheese only in its custom-sliced form, and of the sauce as really being whole tomatoes picked no more than N days ago. In explaining the pizza-making process to others, I would have to leave out some of that nuance.

In other words, we always make a lossy conversion from one framing to another. That doesn't mean that we can't build more sophisticated and accurate framings by piling on more words, only that the very foundations are different. It's a bit like converting an analog photograph to digital. It often works just fine, but there are differences that can be of consequence (tiny details lost due to pixelation, color space reduction, and digital artifacts, to name a few).

Another problem with language is that the best way to frame and model a system depends entirely on your purpose, instant by instant. You need to make predictions because you have a purpose you're trying to achieve, be it quenching your thirst or getting hired by a company. Purpose determines what the "irrelevant" bits are that you can stuff into the black boxes. It doesn't make sense to speak of framing and modeling the world without knowing what your goal is.

We churn through our purposes really fast, too. Once the thirst is quenched, your new goal might be to wash the glass: and suddenly what was the refreshment-yielding "Glass of Water" black box a minute ago has been reframed into a bacteria-collecting "Dirty Glass" black box. You're trying to predict another aspect of reality, so you dynamically change your framing.

But what are the chances that the people you're talking or writing to have the same purpose as you, at the same exact time?

Scant, the chances are.

Other people usually have different purposes, so what for you is an optimal, well-functioning framing and model, for other people is less than optimal.

In the 2019 movie Little Women, two of the main characters (Jo and Laurie) are so close that they often borrow each other's clothes. You might think they suit both well enough... until you learn that the costume department had to tailor two versions of each garment due to their very different proportions.

It's a bit like sharing clothes with a sibling or a friend. Unless you both have exactly the same body size and shape, the same hair and skin tone, the same personality, it's likely that whatever suits one of you very well will look less than stellar on the other.

We can't make that kind of relativism go away with language. All we can make is compromises. How do we do that, and does it work?

Mitigating the Bottleneck

Of course, there's always the show-don't-tell approach. Instead of describing the pizza, have the other person taste it directly. Instead of writing instructions on how to swing a racket, have the kid play tennis. But we're talking about language now.

Our default language-based strategy is to lean into the FFHL and pick vocabulary with the best balance between being fit to the purpose of your specific audience and being accessible, i.e. known to them. For example, the Wikipedia article about the videogame Baldur's Gate 3 begins like this:

Baldur's Gate 3 is a role-playing video game with single-player and cooperative multiplayer elements. Players can create one or more characters and form a party along with a number of pre-generated characters to explore the game's story.

The assumption of the author is that the reader knows these specialized terms:

role-playing
video game
single-player
multiplayer
party
pre-generated character
...

That's a fair assumption. For most of the visitors on that page, the goal is something like "decide whether the game is worth playing for me" or "understand the properties that set this game apart from the others". In other words, they're either gamers or game researchers (?). Given that context, framing "role-playing video game" as a black box (for instance) is a decent framing, and one that people are already used to converting to and from in their heads. But it's still not even close to conveying what every player of that game understands directly: what this particular game feels like to play, and how you play it in practice.

So, besides employing off-the-shelf terminology, we use two more strategies: increasing the resolution and making up new terms.

Words are cheap, so when the first few aren't enough to approximate the framings inside our heads, we can always add more words after them to increase the resolution. We can break down the concept and go into detail about what we really mean. This has the benefit, usually, of making the message accessible to a broader audience—more multi-purpose—because with more moving parts, people will be able to make them interact in more ways in their own mental models.

The downside of increasing the resolution is that it takes an unbelievable amount of time compared to the immediacy of the thoughts you're thinking. (This is when they say that a picture is worth a thousand words, and things to that effect.)

The other strategy is to invent new words. This doesn't solve the problem of explaining something for the first time, but it saves you a lot of work once the new term is established within the group. It's why scientific disciplines (and really all groups with common purposes) are so rife with jargon. They cluster in groups that have similar purposes, and make words that define boundaries that work well enough to be shared.

In Tetris, some really good players sometimes use this kind of crazy maneuver with the T-shaped piece to win the game:

A complex move involving the rotation of a T-shaped tetris piece quickly as it touches down.

It would take a lot of words every time one wants to refer to this move, so they've come up with a name for it: "T-spin". Now anyone in the Tetris community can mention a T-spin in passing and the others will immediately map that back to their own framing of the maneuver.

Of course, neologisms have the opposite downside compared to breaking down things with more words, that is, they make the message more inaccessible to outsiders.

Is Language an Obstacle to Thinking?

What about the thinking you do inside your head, when you're all alone? Ideally, that should allow you to disregard communicability and make the trade-off 100% in favor of optimizing your mental models. Only in your own head, with your uncompromised and well-understood—or at least "felt"—purposes, can you really feel free to redraw all the boundaries, to ignore whatever prepackaged framings your culture may have forced upon you, and finally think clearly.

That's the ideal. In reality, things aren't so straightforward. For one thing, it's unclear how much our "thought language" (the language we use inside our heads) affects the way we think and forces us into predetermined patterns of thought. This is one of the longest-running debates in linguistics, and there are good arguments on both sides.

With my limited knowledge of the topic, I'd conservatively settle for a midway point: our language probably affects our thinking, even when alone, at least in a "soft" way, not preventing us at all costs from forming certain thoughts that it doesn't support, but making them more difficult and less intuitive.

I also believe we always keep in a corner of our minds, even when trying to figure things out independently, the knowledge that we'll probably have to explain this stuff to someone else later on. Some people may unpack their thinking imagining that they're actually talking to someone. This might encourage us to remain in the shared playing field of the language, without overthrowing too many of its forms and rules. We're inherently social creatures, and that probably affects what we do even when we are alone.

We don't know enough to say anything clear about this, but if the above hypothesis is true, then language could pose some obstacles to clear, free thought. On the other hand, we would only get a tiny fraction of our knowledge if we didn't have language, so it's an easy crime to forgive.

Pick Your Own Answer

In summary, language is incapable of transferring "optimal" framings and models between people. At most, it can transfer the best framings among those feasible to share among people.

That's a big difference. Does it mean that we're doomed to be dumber because of all the limitations of words? Let me answer four times.

Yes, we're doomed, because we lose something in the conversion to and from sub-optimal language-based framings. Probably all major breakthroughs in thought have happened inside individual heads ruminating in thoughts beyond language: the Einsteins and the Shakespeares, the Maxwells and the Aristotles. Only after making their breakthroughs did those people make the effort to put them into imperfect words for others to clumsily word-grapple with. We don't know if they were ever fully understood by anyone.

No, we're not doomed, because we have the ability, with enough exposure to the intended context and purpose, to re-hydrate the sub-optimal framings of language into good models for ourselves. You can feel this when something that has been explained to you at length finally clicks for you. It's not the words carrying the message that have made it click, but your own language-less puzzle-solving inside your head. What's more, even if the framings and models you reconstruct in your head are a bit different from the ones in the source's head, that's often a good thing. Your model might be a novel upgrade on the original one, or it could be better adapted to your own variant of the purpose.

Language bottlenecks were a more acute problem during WW2.

Yes, we're doomed, because "languagification" is a big overhead. A lot of the time, when we say "I need to think it through", we mean that we need to spend time working out how to put it into words, not understanding or solving a problem. I think that much less "thinking through" would be necessary if we didn't have to eventually share what we're thinking.

No (this is the last one), we're not doomed, because of the obvious network effects. If the overhead for language were really too big to be worth the trouble, it would never have evolved in the first place. The effort and the data loss implied in communication are usually offset by the greater number of thinking heads, each with its own chance at making the necessary breakthroughs.

So What?

After all those words, I hope that at least the following (tentative) takeaways have made it to your head more or less intact.

First, intended context and goal are paramount when communicating. They're the basis for any thought process. If the speaker/writer/gesturer/... isn't clear on the receivers' assumed context and purpose, or vice versa, the communication will likely fail. Always clarify your assumptions.

Second, it's tempting to "think in words", using categories familiar to many, but that might be a mistake. Try to break free from the external constraints of language when thinking by yourself. Use imagery, sounds, emotions, space, and gut instead: they have more dimensions and more elastic boundaries.

Third, demanding "exactitude of language" is a laughable idea.

Fourth, communicating your thoughts is hard and it's fraught with errors and frustration and, more often than not, it's well worth it all. ●