Deep persuasion: grokking, phase transition, cognitive reset

Jun 2, 2026

The society of me

Long before predictive coding had equations, Yogācāra Buddhism had a phenomenology of the same disturbance: the world we experience is not simply given. It is constructed through consciousness. The Consciousness-Only school, especially in its doctrine of the eight consciousnesses, did not mean, crudely, that rocks and trees are imaginary. It meant that the world as lived is always already processed, selected, colored, and rendered by mind. We do not encounter reality naked. We encounter reality as organized by the machinery that makes encounter possible.

The first five consciousnesses correspond to the senses: sight, sound, smell, taste, and touch. But the decisive machinery begins after them. The sixth consciousness is the ordinary thinking mind: the faculty that compares, names, judges, remembers, imagines, and reasons. It is where experience becomes explicit, where perception is brought under concepts, where the world becomes something one can report. It is the layer closest to what modern people usually mean by “thinking.”

The seventh consciousness, manas, is stranger and more dangerous. It is the self-making faculty. It takes the flow of experience and bends it around an “I.” This is the consciousness that says mine, for me, against me, about me. It does not merely think; it appropriates. It turns perception into ego-reference. A pause in someone’s answer becomes disrespect. A neutral fact becomes a wound. A disagreement becomes a threat to identity. If the sixth consciousness forms judgments, the seventh makes those judgments personal.

Beneath both sits the eighth consciousness, the storehouse consciousness, ālaya-vijñāna. This is not conscious thought in the ordinary sense. It is the depth layer in which seeds, habits, traces, and dispositions are stored. Every act, perception, trauma, training, resentment, discipline, attachment, and repetition leaves a seed. But these seeds are not inert memories waiting to be retrieved. They ripen. They condition what will later appear as obvious, threatening, beautiful, shameful, desirable, or true. The same sentence enters two people and becomes two different worlds because it falls into two different storehouses. One person hears criticism and receives information. Another hears criticism and receives humiliation. The external stimulus may be similar; the seeds that ripen are not.

The storehouse is therefore not a hidden little person inside the person. It is not a second self sitting below the first. It is closer to the accumulated shape of a life: the sediment of what has been repeatedly done, feared, loved, rewarded, punished, avoided, and rehearsed. It explains why some reactions feel chosen while others arrive already armed. A man does not first inspect a neutral world and then decide to feel threatened. Very often, the threat has already been prepared below the level of ordinary thought. The storehouse supplies the readiness; the sixth consciousness supplies the explanation; manas supplies the injury to the self.

This is also why Yogācāra’s theory of seeds is more radical than ordinary memory. Memory usually suggests retrieval: something happened, was stored, and can later be recalled. Seeds are not merely recalled. They become active conditions. They help generate the form in which the present is encountered. A childhood humiliation, a trained discipline, a repeated envy, a practiced generosity, a long resentment — each becomes more than a remembered event. It becomes a tendency of perception. It alters what the world is likely to mean before meaning has become explicit.

The crucial term here is perfuming. Repeated acts “perfume” the storehouse, the way incense leaves a scent in cloth long after the smoke has vanished. This is a beautiful and severe image. Nothing merely passes through us. What we do perfumes the depth from which future seeing arises. To practice suspicion is to make the world more suspicious. To practice courage is to make certain fears less sovereign. To practice attention is to make subtler things visible. The seed is planted by action, but once planted, it helps shape the field in which later actions will seem natural.

The storehouse also solves a problem that otherwise haunts the doctrine of no-self. If there is no permanent soul, what carries continuity from one moment to the next? Yogācāra’s answer is not a fixed self but a stream of conditioned traces. The person is continuous not because an unchanged essence sits underneath experience, but because experience leaves deposits that condition the next experience. The storehouse is continuity without a soul: not an immortal owner of experience, but the accumulated momentum by which one moment leans into the next.

This matters because it prevents the “many models” picture from becoming too flat. The mind is not merely a committee of simultaneous modules. It also has depth. Some models are not just active now; they have been trained by years of repetition. Some predictions arrive with the force of history behind them. This is why a person can understand something intellectually and still fail to live it. The sixth consciousness has changed its opinion, but the storehouse has not yet been retrained. The explicit belief says, “I am safe.” The seed says, “Prepare for injury.” The sentence changes in an afternoon; the storehouse changes by practice.

This is where the comparison to machine learning becomes useful, provided we do not let it become cheap. The storehouse consciousness is not a neural network, and karmic seeds are not parameters. But the analogy clarifies something important. A trained model does not merely “store” data like a warehouse stores boxes. Training changes the shape of the system. It alters the probabilities of future response. The model later appears to “understand” because past exposures have been compressed into weights. Yogācāra says something formally similar about the person: experience does not merely pass through us; it deposits tendencies. The storehouse is not a database of the past. It is the past converted into the probability of future perception.

That is why the Buddhist language is stronger than the usual modern language of “bias.” Bias sounds like a surface error, something one might remove by better information. Seeds are deeper. They are generative. They do not merely distort judgment after perception has happened; they help produce the perception in the first place. Before the sixth consciousness can say “I think,” before manas can say “this is about me,” the storehouse has already supplied the field of likelihoods from which this world will be rendered. We do not merely see. We see through what has been planted.

The point is not that Yogācāra predicted neuroscience or machine learning. That would be too easy, and false in the interesting ways. Its value is different: it gives an older phenomenological grammar for what modern cognitive science describes computationally. The sixth consciousness is the reportable, linguistic, reasoning layer. Manas is the ego-model that recruits experience into selfhood. Ālaya-vijñāna is the accumulated field of latent dispositions from which perception and reaction continually arise. Where machine learning speaks of training, weights, priors, and generalization, Yogācāra speaks of seeds, perfuming, ripening, and transformation. The vocabularies are not identical. But they rhyme.


The easiest mistake, then, is to compare the human brain to the LLM as if each were one thing. They are not symmetrical objects. An LLM is, for the most part, a single enormous model trained to predict symbolic sequence: given this context, what token comes next? It may contain layers, heads, circuits, and subroutines, but its public act is unified around language. The human brain is stranger. It is not one model but an ecology of micro-models: a weather model that reads the sky, a physics model that catches the falling glass, a social model that hears the pause in someone’s answer, a bodily model that regulates breath and balance, an emotional model that predicts threat before language has arrived, and a narrative model that later explains all of this as “what I decided.”

This is not only Minsky’s poetry, though The Society of Mind gave the many-agents picture its classic modern form. It is also where several harder literatures converge. Evolutionary psychology supplies the “older than language” half: the massive-modularity thesis holds that selection installed many domain-specific competences before they became reportable. Reinforcement-learning neuroscience supplies the “competing for control” half: model-free habit systems and model-based, goal-directed systems do not merely coexist; they contend for command of behavior. Global Workspace Theory makes the contest architectural, with many unconscious processors competing for access to the limited broadcast we call consciousness. And Gazzaniga’s split-brain work gives the narrative model almost literally: a left-hemisphere interpreter fabricating reasons for actions initiated elsewhere. The man who says he is calm while his hand has already clenched is not a metaphor for self-deception. He is what the experiments reveal: action often begins before the self has written its explanation.

This matters because the strongest analogy between brains and LLMs is also where the analogy begins to mislead. Yes, both compress experience into parameters, predict what comes next, and update against error. But the human case is not merely next-token prediction implemented in meat. It is thousands of partially overlapping, constantly updating models, many of them older than language, running at different speeds and competing for control. The brain does not simply predict sentences. It predicts gravity, hunger, status, danger, intimacy, rhythm, pain, intention, and selfhood. Language is only the surface where some of those predictions become reportable.

Yogācāra reached this picture by introspective discipline. The modern West arrived at the same disturbance through three other doors: the physiology of the senses, the grammar of language, and the clinic of the divided self. Each corroborates our two claims — that the world is rendered rather than received, and that the self is plural rather than sovereign — but each also charges those claims a price.

Hayek’s The Sensory Order is the strangest book by a man remembered for economics, and the most useful here. Its thesis is that the qualities we live by — red, warmth, the ache of cold — are not handed to us by the world but imposed by the nervous system as classifications. The order of sensation is built from relations the brain has learned and inherited among stimuli, not copied from an order standing outside. Mind, for Hayek, is a classifying machine that renders the world it then takes itself to be merely receiving: a connectionist theory of perception before the vocabulary existed, and the storehouse rewritten as a map of connections.

But Hayek adds the humbling move this argument most needs. Any apparatus of classification, he argued, must be of a higher order of complexity than what it classifies. Therefore no such apparatus can fully classify — fully explain — itself. The brain can model the world because the world, as rendered for action, is the simpler thing. It cannot, in the same way, model the brain. The opacity of the parliament to itself is not merely a defect of introspection we might someday repair. It is a formal limit on anything that renders by classifying.

This explains why the storehouse never becomes wholly reportable. In characterizing the parliament from within, we are a classifier reaching for an object of its own order, and Hayek warns in advance that the reach falls short. The same limit, importantly, binds the machine. No model contains a complete model of itself. Self-opacity is not the moat. It is the shared condition.

If Hayek shows that the inner order cannot fully see itself, the later Wittgenstein asks why we are so certain the order is inner at all. The Philosophical Investigations is the standing rebuke to the very picture we have been assembling: meaning and judgment as representations behind the eyes, models we silently consult. Meaning, Wittgenstein insists, is not a mental object but a use. Words live inside language-games, and language-games live inside forms of life, the shared and embodied practices of a community. We are inducted into language not by private definition but by training, correction, imitation, and participation. No rule contains its own application. What makes a use correct is not an inner representation but a practice we already share.

Two consequences follow, and they pull against each other. The first sharpens the critique against the parliament metaphor. If meaning lives in public practice, then positing inner models that “redescribe” and “re-train” one another may be precisely the grammatical confusion Wittgenstein diagnosed: hunting a mechanism where there is only a practice, breeding homunculi to do what the form of life already does. The second consequence relocates the moat altogether. The decisive difference between the human and the model may lie inside neither. The LLM is trained on the deposited residue of language-games it has never played. It has the moves without the form of life, the sentences without the stakes that give sentences their sense. Wittgenstein’s lion, which we could not understand even if it spoke because we share no form of life with it, becomes a warning in reverse: the model speaks our words, but it does not stand where our words stand. The human’s irreducibility, on this reading, is not a richer interior but that he lives inside the practice, the body, the community of consequence from which meaning is made. The moat is not behind the eyes. It is the world the eyes were trained by.

The clinic arrived at the divided subject from a third direction, and here the same discipline is necessary: take the grammar, refuse the metaphysics, and distrust the mappings in proportion to their neatness. Two are still worth the risk. Lacan gives manas its Western double. The ego, for him, is not the seat of the self but a misrecognition, formed when the infant identifies with an image of wholeness it does not yet possess. The “I” is constituted from outside, in the field of the Other and the symbolic order that precedes it. This radicalizes the mildest model, the narrator who reports “what I decided.” On Lacan’s telling, there is no sovereign behind the sentence who first decided and then spoke. The sovereign is partly an effect of the sentence, a fiction the parliament tells of itself after the fact. Gazzaniga’s interpreter, rendered in French.

Jung supplies the parliament’s oldest furniture. The complexes — autonomous, feeling-toned splinter-psyches that constellate and seize behavior — are the civil war among models in clinical idiom. The man calm in speech while his hand clenches is, in this dialect, a complex constellated. The archetypes are inherited, pre-linguistic schemata: models older than language under another name. And individuation, the slow integration of autonomous parts beneath a center that is not merely the ego, is the Western echo of the Buddhist turning of consciousness into wisdom: the self-interrupting, re-training capacity offered as a discipline rather than possessed as a fact.

But here the caveat must be louder than anywhere else. Lacan and Jung are clinical and interpretive, not empirical in the way predictive coding or reinforcement-learning neuroscience aspire to be. Their structures are difficult to falsify in the way the storehouse is difficult to falsify, and to lean on them too heavily is to court the very mysticism this argument has been trying to resist. They earn their place only as grammar: ways of saying the divided, constituted, inherited self that science describes more thinly and Buddhism described first. As evidence, they prove little. As language, they let the argument reach depths the loss function has no words for.

The four doors open onto one room: the world is rendered, not received; and the self that imagines it does the rendering is plural, constructed, and partly closed to itself. What none of them licenses is the comfortable inference that this plurality is the human moat. Hayek’s limit binds the machine as well. Wittgenstein moves meaning out of the interior where the contest was being staged. Lacan and Jung deepen the grammar but cannot carry the proof. Yogācāra gives us the storehouse, but not permission to turn the storehouse into a soul.

So the interesting question is not whether humans and LLMs both model. They do. The interesting question is what kind of modeling each performs, and where the analogy tears. The LLM is powerful because it concentrates an immense amount of world-statistics into a linguistic interface. Its “storehouse,” if one can use the term only as analogy, is distributed across weights: traces of training data compressed into dispositions toward future output. But those dispositions are not lived. They do not ripen as shame in the stomach, pride in the chest, fear in the hand, or obligation before a father. The model has sedimented language. The human has sedimented life.

That is where the moat, if there is one, has to be located: not in the claim that humans model and machines do not, nor in the easy mysticism that machines lack a soul by definition, but in the layered, embodied, socially answerable, self-interrupting nature of human modeling. The human being is powerful because he is not reducible to a linguistic interface. He is a stack of models, a parliament of models, and sometimes a civil war among models. He can say one thing while his body predicts another. He can believe he is calm while his threat model has already seized the steering wheel. He can consciously endorse a principle while an older social model quietly optimizes for approval, safety, or revenge.

The human mind is not just a predictor. It is a system in which many predictors observe, fight, inhibit, redescribe, and sometimes re-train one another. But even this must be said carefully. Self-retraining is not a possession humans automatically have. It is a discipline. In Yogācāra, the aim is not merely to notice the storehouse but to transform it: to turn consciousness into wisdom, to alter the seeds from which worlds arise. In modern terms, the question is not whether one has a model. Everyone has models. The question is whether one can see the training in the perception, the seed in the reaction, the inherited language-game in the sentence, the complex in the gesture, the interpreter in the explanation — and then, slowly, plant otherwise.

It’s a miracle you booted today!

The consequences of this are harsher than the analogy first suggests. If a human being is not one model but thousands of half-compatible models wired together by accident, then “changing yourself” is not a motivational poster. It is invasive surgery on a living system whose dependency graph you cannot read. Conscience, character, taste, fear, courage, resentment, discipline, tenderness — these are not faculties, cleanly separable. They are emergent behavior thrown off by a sh–tmountain of code that barely compiles, patched over decades by childhood, reward, shame, imitation, trauma, hunger, status, sex, books, teachers, and luck. The miracle is not that people are irrational. The miracle is that the system boots at all.

This is why asking someone to change can be an act of violence, even when it is the right thing to ask. Which model are you asking to change — the threat model, the approval model, the shame model, the status model, the bodily model that learned before language that a certain tone of voice means danger? And if it changes, what else was leaning on it? Pull the wrong wire and the personality does not improve; it avalanches. The old programming adage holds more deeply for people than for software: if it isn’t broken, don’t fix it. But of course much of it is broken — ugly, redundant, overfit, miscalibrated, dangerous. That is the cruelty. The broken parts are also load-bearing.

So the task was never self-expression. It is maintenance under uncertainty. You probably should prune your models; you probably should refactor the firmware; you probably should delete the routines that once protected you and now only make you stupid. But you should do it with the humility of an engineer touching production code at two in the morning, asking not “how do I become my true self?” but the only question that keeps the machine alive: which learned prediction is still earning its place, which is obsolete, and which is quietly holding everything else up?

That is the first consequence: you are a maintainer, not an author. And it comes with a method, because there is only one safe way to do the work. A model changes when it is forced to predict under new conditions and then made to survive the correction. This is why comfort is dangerous: comfort lets bad models go untested. New conditions are test cases. Difficult conversations are test cases. Love is a test case, and so are leadership, failure, exhaustion, public embarrassment, real responsibility, solitude, competition, grief — each one firing a different hidden routine. The man who is generous when praised turns petty when ignored. The woman who is brave in the abstract turns obedient under authority. The person certain of his independence discovers that his whole character was tuned for applause.

Writing is the cleanest test environment of all, because it forces the parliament to serialize itself. Vague intuitions hide easily in speech — in posture, charm, and speed. On the page they have to hold still. Contradictions surface. Borrowed opinions show their seams. The sentence that sounded profound in the skull turns ridiculous in print. To write seriously is to make the models explain themselves in public, even when the only public is tomorrow’s version of you. You do not write because you already know what you think. You write because most of what you call thinking is cached language, and the page is where the cache gets audited.

The second consequence is epistemic humility: distrust of your own conclusions, and most of all the ones that feel the most like you. A model can be superb in one domain and catastrophic in the next. The man with flawless mathematical taste may have the emotional range of a child. The social genius may be economically illiterate. The brilliant engineer may be politically deranged. The philosopher may parse desire on the page and fail it in the room. Excellence does not imply general intelligence. More often it implies local overfitting.

This is why a person can be impressive and disappointing in the same body. One model has trained for thirty years; another has barely trained at all. Then the mind performs its fraud of continuity: because the same mouth speaks both outputs, we credit both to the same competence. But there is no single competence. There is only an uneven ecology of trained and untrained predictors. Your best domain is not evidence of your wisdom. It is evidence that one corner of the system got enough data, pain, feedback, obsession, and correction to go sharp — while the rest did not.

Reading widely and ferociously is one antidote, though not for the reasons usually given. Not because books make you good, and not because breadth beats depth. Breadth matters because it forces a model trained in one environment to collide with a foreign one. History humiliates presentism. Literature humiliates theory. Mathematics humiliates rhetoric. Biology humiliates ideology. Religion humiliates materialism. Markets humiliate moral fantasy. Philosophy humiliates common sense. The point of reading is not to gather opinions. It is to stop one local model from mistaking its training distribution for the world.


There is a more formal way to say the same thing. Every mind is trying to compress the world. It cannot store reality itself; reality is too large, too continuous, too particular. So it stores shorter programs for predicting it: father means danger, praise means safety, authority means fraud, kindness means weakness, discipline means love, money means freedom, desire means humiliation. A person is, in this sense, a compression scheme that has mistaken its shortcuts for the world.

Kolmogorov complexity gives the outer shape of the problem: the complexity of a thing is the length of the shortest program that can reproduce it. A life cannot be reproduced by one clean program. It is too jagged, too contingent, too full of exceptions. But the mind survives by pretending otherwise. It keeps choosing simpler descriptions, lower-complexity stories that make the next moment predictable enough to bear. This is what a personality partly is: not the truth of a life, but the compressed model that made the life navigable.
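Kolmogorov complexity itself cannot be computed, but the length of a compressed encoding gives a rough upper bound on it. The following is a minimal sketch, assuming Python's standard zlib as a stand-in compressor (an illustrative choice, and the two byte strings are invented for the example): a regular sequence collapses into a short description, while a jagged one barely shrinks at all.

```python
import random
import zlib


def compressed_length(data: bytes) -> int:
    """Bytes needed by zlib at its highest setting: a crude proxy, not true complexity."""
    return len(zlib.compress(data, 9))


# A perfectly regular "life": one short rule repeated 5,000 times (10,000 bytes).
regular = b"ab" * 5000

# A jagged one: 10,000 pseudo-random bytes from a fixed seed, with no rule to exploit.
rng = random.Random(0)
jagged = bytes(rng.randrange(256) for _ in range(10_000))

regular_len = compressed_length(regular)
jagged_len = compressed_length(jagged)

print("regular:", regular_len, "bytes")
print("jagged: ", jagged_len, "bytes")
```

Both inputs are the same size. Only the first admits a short program, which is the sense in which a personality is a bet that the life behind it is more regular than it really was.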

And compression is never neutral. The same data can be compressed in different ways depending on the prior structure of the compressor. This is inductive bias: the built-in tendency of a system to generalize in one direction rather than another. No learner meets experience naked. It arrives already slanted. One child generalizes from punishment into discipline; another generalizes from punishment into terror. One person sees disagreement as intimacy; another sees it as abandonment. The data are never simply the data. They are bent through the model that receives them. (See Bayesian Deep Learning.)
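Inductive bias can be shown in a few lines. This is a sketch under stated assumptions: the five data points and the two learners (`fit_line`, `fit_nearest`) are hypothetical and chosen only for contrast. Both learners reproduce the training data exactly, yet they disagree sharply about a case neither has seen.

```python
# Five observations that happen to lie on y = 2x (invented for illustration).
train = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def fit_line(points):
    """Learner A: assumes the world is a straight line (ordinary least squares)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest(points):
    """Learner B: assumes the world repeats the closest thing already seen."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


line = fit_line(train)
nearest = fit_nearest(train)

# Identical on everything they were trained on...
train_errors = [(abs(line(x) - y), abs(nearest(x) - y)) for x, y in train]

# ...and divergent on a new input far outside it.
line_guess = line(10.0)
nearest_guess = nearest(10.0)
print(line_guess, nearest_guess)
```

Neither learner is wrong about the data. They differ only in what they assumed before the data arrived.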

This is why argument has so little sovereign power. An argument supplies a few new data points, but the hearer’s inductive bias decides what kind of evidence they are. To one model, your objection is care. To another, it is domination. To one model, contradiction is an invitation to refine a belief. To another, it is proof that enemies are closing in. The sentence does not enter a neutral court. It enters a compression engine already trained to preserve some descriptions and discard others.

Changing a mind, then, is not merely adding information. It is changing the compression. It is altering the bias by which the person decides what counts as signal, what counts as noise, what confirms the world, and what threatens the self. This is why a single fact rarely changes anyone. The fact is usually cheaper to explain away than the whole model is to rewrite. A belief survives not because it is true, but because, within that person’s existing compression scheme, it is economical. It connects too many things. It explains too much pain. It protects too many loyalties. It keeps too much of the personality from having to recompile.

Real transformation begins when the old compression becomes more expensive than the revision. The evasions multiply. The exceptions become too numerous. The prediction errors no longer feel accidental. The person can no longer maintain the old model without spending more and more psychic energy defending it. Then what looks from outside like a sudden conversion is, from inside, the collapse of an obsolete simplicity. The mind does not change because truth finally appeared. Truth had appeared before. It changes because the old way of compressing reality can no longer afford itself.
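The tipping point can be written as a simple inequality. This is a caricature in the spirit of minimum description length, not a measured law, and every symbol is an assumption introduced here: $c$ is the average cost of explaining away one anomaly under the old model, $n$ is the number of anomalies accumulated so far, and $C_{\text{rev}}$ is the one-time cost of rewriting the model.

```latex
\[
L_{\text{keep}}(n) = n\,c,
\qquad
L_{\text{revise}} = C_{\text{rev}},
\qquad
\text{revise when } n\,c > C_{\text{rev}}
\;\Longleftrightarrow\;
n > \frac{C_{\text{rev}}}{c}.
\]
```

With hypothetical values $C_{\text{rev}} = 100$ and $c = 2$, the first fifty anomalies change nothing visible, and the fifty-first tips the balance. From outside, the change has a date. From inside, it has a ledger.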

Grokking, in this sense, is not memorizing more cases. It is the slow reorganization in which a model stops fitting examples and starts to see the shape beneath them. Before that happens, intelligence is brittle: flawless on familiar inputs, absurd the moment the inputs shift. Which is why credentials, education, and even visible brilliance deserve suspicion. The question is never “is this person smart?” The question is: where are they smart, what trained them, what did that training leave out, and what happens when reality changes the test set?

But the systems-engineer metaphor conceals a flaw, and the flaw is the most important thing here. An engineer stands outside the system he repairs. You do not. The part of you that decides which models to cut is itself one of the models — trained, partial, interested, very possibly overfit. There is no clean room above the parliament from which a neutral technician surveys the wiring; the hand on the scalpel is cut from the same code as the patient. And the model most eager to run the operation is usually manas, the self-maker, whose whole office is to preserve the “I” it has assembled — which means the faculty volunteering to refactor you has a permanent interest in cutting everything except itself. Ask who performs the maintenance and the answer is unnerving: a committee of the predictors under review, chaired, as a rule, by the one with the most to lose from an honest audit.

This does not make the work impossible. It changes its character. You cannot inspect yourself from above; you can only triangulate from outside, which is why nearly every real instrument of self-knowledge is external. The page, where cached language is made to stand still. The friend whose correction you authorized in advance. The discipline whose results you cannot fake. The reality that hands you the prediction error you did not order. They work because they are not you — because they sit outside the parliament and cannot be lobbied by it. Introspection alone, the mind grading its own paper, is the least trustworthy method there is, since the grader is the most overfit model in the building. The deepest danger of comfort, then, is not merely that it leaves bad models unfalsified. It is that comfort strips away every vantage the parliament did not already control, and hands you back a system auditing itself — which it will always pass.

And the system grows harder to revise with time, not easier. Early models are cheap to change because little is built on them yet; the longer a prediction has been load-bearing, the more of the structure stands on it, and the more a correction threatens to bring down. This is why people calcify — not because age dulls them, but because age thickens the dependency graph, raising the cost of every single change and widening the avalanche it risks. The grokking that reorganizes a young mind in a season can take an old one a decade, if it arrives at all, because what has to be relearned is no longer a fact but everything the fact was holding up.

Which returns the argument to the register it keeps circling back to: humility. Not the performed humility of admitting you might be wrong, but the structural humility of knowing you cannot fully see, audit, or rewrite the thing you are — and that this is equally true of every person you will ever try to correct. The one across from you is also a parliament no one governs, also grading its own paper and always passing, also unable to reach its own weights from the inside. Which is the last thing worth understanding before the question everyone eventually arrives at: why, when you finally hold the truth and say it plainly, it so often changes nothing at all.

Changing manas 意, citta 心, viññāṇa 識, and weights

Think not lightly of evil, saying, “It will not come to me.” Drop by drop is the water pot filled.
Think not lightly of good, saying, “It will not come to me.” Drop by drop is the water pot filled. The Dhammapada 法句經.

The appropriating consciousness is profound and subtle indeed; all its seeds are like a rushing torrent. Fearing that they would imagine and cling to it as to a self, I have not revealed it to the foolish. The Saṃdhinirmocana Sūtra 解深密經.

Ādāna-vijñāna means “appropriating” or “grasping” consciousness: the deep consciousness insofar as it takes up and holds the body, making embodied life continuous. The Saṃdhinirmocana Sūtra uses this name to stress appropriation. Ālaya-vijñāna, by contrast, suggests a dwelling, store, or repository: the same depth understood as the storehouse of seeds, traces, latent dispositions, and karmic continuity from which future perception and reaction arise.

Here the three registers — the model on the chip, the storehouse in the sutra, the person in a life — converge on a single deceptive shape. Change looks discontinuous. Nothing, nothing, nothing, then everything. For a long time the person seems untouched. The same facts bounce off him. The same criticism produces the same defense. The same suffering produces the same habit. Then one day a sentence lands, a book opens, a humiliation clarifies, a friendship breaks, a practice ripens, and the whole world appears to rearrange itself at once. He says, “I finally saw it.” But what did he see? And why then?

The machine supplies the cleanest instance because, unlike the person, it can be opened. Train a small network on a formal task and for a long stretch it may merely memorize: excellent on what it has seen, useless on what it has not. Then, abruptly, and sometimes well past the point where it appears to have stopped improving, test performance rises from chance to near-perfect. The network “groks.” It generalizes all at once. The curve looks like conversion.

But this is exactly where the machine deflates the romance, and the deflation is the most useful thing it gives us. The visible jump is not necessarily the moment learning happens. It is the moment learning becomes visible. When grokking is reverse-engineered, the suddenness begins to dissolve. A generalizing circuit can be forming gradually inside the weights all through the apparent plateau. On the surface, the model still looks like it is memorizing. Underneath, another structure is being amplified. The jump comes when the memorizing machinery is finally weakened or pruned enough for the generalizing structure to govern behavior. What appeared to be a sudden change in output was actually a long change in the model.
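
The shape of that claim can be drawn in a few lines. What follows is a cartoon under invented assumptions, not a reproduction of any reverse-engineered network: the names `memorizing` and `generalizing`, the linear growth, and the decay rate are all hypothetical, chosen only so that two smooth internal quantities yield one abrupt change in which of them governs.

```python
def simulate(steps: int = 100, growth: float = 0.01, decay: float = 0.98):
    """Toy model: two internal strengths change smoothly; behavior follows the stronger."""
    memorizing = 1.0     # hypothetical strength of the lookup-style solution
    generalizing = 0.0   # hypothetical strength of the rule-style solution
    history = []
    for step in range(steps):
        generalizing += growth   # steady accumulation, invisible at the surface
        memorizing *= decay      # slow erosion, standing in for regularization
        governs = "generalizing" if generalizing > memorizing else "memorizing"
        history.append((step, generalizing, memorizing, governs))
    return history


def first_flip(history):
    """Return the first step at which the generalizing strength governs, or None."""
    for step, _, _, governs in history:
        if governs == "generalizing":
            return step
    return None


if __name__ == "__main__":
    run = simulate()
    print("behavior flips at step", first_flip(run))
```

Nothing inside the loop ever jumps. Only the label does, and the label is all an outside observer gets to see.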

That distinction matters. A changed answer is not yet a changed mind. A person can say the new sentence without having become the new sentence. He can learn the language of repentance, therapy, politics, management, religion, or enlightenment while the old machinery continues to govern perception. Real change is not merely a different output under pressure. It is a change in the weights: the stored dispositions, the default saliences, the predictions that fire before speech, the meanings that arrive before interpretation. The surface says, “I understand.” The deeper question is whether the storehouse has been retrained.

Many of the celebrated “emergent” jumps in machine learning teach the same caution. An ability can appear to switch on sharply when measured by a crude or discontinuous metric, while the underlying improvement is smoother and more continuous. The discontinuity may live in the manifestation, the threshold, or the measurement, not in the learning itself. This is the general lesson: transformation often becomes visible abruptly because the world only notices behavior once it crosses a threshold. But the crossing is not the whole event. It is the reportable edge of an accumulation.
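
A small piece of arithmetic makes the point concrete. Assume, purely for illustration, an answer of thirty tokens scored by exact match, with per-token errors treated as independent; neither the length nor the independence assumption comes from any particular benchmark. Per-token accuracy is made to rise in even steps, and the sketch shows what the all-or-nothing metric reports along the way.

```python
def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Chance that every token is right, assuming independent per-token errors."""
    return per_token_accuracy ** answer_length


def sweep(answer_length: int = 30, steps: int = 11):
    """Raise per-token accuracy linearly from 0.5 to 1.0 and record both metrics."""
    rows = []
    for i in range(steps):
        p = 0.5 + 0.5 * i / (steps - 1)
        rows.append((p, exact_match_rate(p, answer_length)))
    return rows


if __name__ == "__main__":
    for p, em in sweep():
        print(f"per-token {p:.2f}   exact-match {em:.4f}")
```

Under these assumptions the underlying skill improves by the same amount at every step, yet at ninety percent per-token accuracy the exact-match score is still below five percent. Almost all of the visible rise is crowded into the last stretch, which is where an observer would say the ability "emerged."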

Yogācāra arrived at this structure by another path and gave it a moral psychology. The tradition knew the drama of sudden awakening: years of practice, then the floor drops away in an instant. It also knew the severity of gradual cultivation: every act perfumes the storehouse, every repetition lays down a seed, every seed conditions what can later ripen. The old quarrel between sudden and gradual awakening is therefore not merely a doctrinal dispute. It is a dispute over scale. Seen from the outside, change looks sudden because the decisive turn appears in a moment. Seen from underneath, change is gradual because the storehouse has been perfumed all along.

This is the difference between memorization and grokking in any assistant, employee, institution, or self. I do not want a model that has memorized the answers I like, because that only moves the burden upward: it leaves me permanently in the loop, inspecting every output for signs of flattery, imitation, and local compliance. The surface has changed, but the weights have not. The system has learned what earns approval, not what makes the answer true. What I want is a model that has grokked the structure of the judgment itself, so that it can meet a new case without waiting for me to supply the desired sentence. Such a model has learned the principle beneath the examples, the taste beneath the corrections, the constraint beneath the preference. The old arrangement requires constant checking because the system is optimizing for my reaction; the better arrangement requires less checking because the system has acquired the disposition that would have produced my reaction for the right reason.

This is why the value-extractor is not merely a moral type but a theory of failed formation. He has learned the rewarded outputs without grokking the work. A purely result-driven organization makes this easy: not because results do not matter, but because they matter so much that counterfeiting them becomes rational. The danger is not demanding results; the danger is rewarding outputs before the weights have changed. Such an organization gets the visible answer it asked for and loses the person who could have produced it reliably without supervision.

This is why the doctrine of seeds is so precise. A seed is not a memory. A memory is something one may recall. A seed is something that helps produce the next perception. It is a tendency in the making of worlds. Repeated anger plants anger not only as an emotion but as a readiness to find insult. Repeated cowardice plants fear not only as a feeling but as a talent for discovering danger. Repeated generosity plants generosity not only as a virtue but as a widened field of what counts as possible. The storehouse changes by deposit. Then, at certain thresholds, the whole field of appearance changes. What had to be forced becomes natural. What had seemed natural becomes unbearable.

The Buddhist terms for the deeper transformation are appropriately violent: the revolution of the basis 轉依, the turning of consciousness into wisdom 轉識成智. The point is not that one acquires a new opinion. It is that the basis from which opinions, perceptions, and reactions arise has been altered. The sixth consciousness may change its stated belief quickly. Manas may even learn a more sophisticated self-description. But the storehouse changes by perfuming, by repeated contact, by practice continued past the point where the ego has already claimed insight. This is why one can understand a truth and still fail to embody it. The explicit mind has moved; the basis has not yet turned.

The analogy to grokking is useful here, provided we keep the differences sharp. What groks in the network is a circuit for a formal regularity. What turns in Yogācāra is the relation of an entire being to experience, suffering, self-reference, and liberation. The network has no manas to dissolve, no shame to metabolize, no father to answer, no death to face, no freedom to win. A change in a mapping is not awakening. Generalization is not liberation. But the curves rhyme because both show the same temporal deception: invisible accumulation, visible break.

The agencies differ too. In grokking, an external optimizer keeps adjusting weights against a fixed objective. The model does not decide to become wiser. It is trained. In Yogācāra, by contrast, the practitioner participates in the perfuming of his own storehouse. Practice is not merely input. It is deliberate reconditioning. Attention, restraint, study, confession, meditation, repetition, and ethical action all become ways of planting different seeds. The person is not fully sovereign over transformation, but neither is he merely trained from outside. He is both the field being perfumed and, at moments, the gardener.

No need to argue

People will do anything, no matter how absurd, to avoid facing their own souls. They will practice Indian yoga and all its exercises, observe a strict regimen or diet, learn theosophy by heart, or mechanically repeat mystic texts from the literature of the whole world – all because they cannot get on with themselves and have not the slightest faith that anything useful could ever come out of their souls. … It is rewarding to watch patiently the silent happenings in the soul, and the most and the best happens when it is not regulated from outside and from above. I readily admit that I have such a great respect for what happens in the human soul that I would be afraid of disturbing and distorting the silent operation of nature by clumsy interference. Jung, Psychology and Alchemy.

The Western inheritance teaches almost the opposite. From Renaissance humanism through liberal education, the civilized answer to disagreement is argument. Train the mind in rhetoric, dialectic, debate; place claims before opponents; let reasons meet reasons; trust that truth, clarified by contest, will compel assent. The classroom, the parliament, and the courtroom all preserve this faith as theater. Two sides speak. Evidence is presented. Contradictions are exposed. A judge, a voter, a student, or a citizen is meant to be moved by the better case.

This ideal is noble, and not false. Argument can discipline thought, reveal error, and make public life answerable to reasons rather than force. But it also smuggles in a psychology: that the mind is the sort of thing which changes when a better sentence is placed before it. It imagines belief as a proposition stored near the surface, available for inspection and replacement. It imagines truth as something that, once spoken clearly enough, should pass from one mind to another by the sheer pressure of its form.

The spectacle of political and legal debate depends on this picture. We watch adversaries argue as though truth will emerge from collision and minds will change under the weight of demonstration. Yet anyone who has argued with a parent, a lover, a partisan, a student, or himself knows the failure of this picture. The better argument often lands without landing. It is heard, answered, evaded, resented, or admired, and still nothing essential moves. What follows begins from that disappointment.

Return now to the scene the whole apparatus was built to explain: you tell someone the truth, plainly, with the logic laid bare, and he does not move. The easy conclusion is that he is stupid, dishonest, or morally defective. But the architecture suggests otherwise. Argument fails not because people are irrational in some simple way, but because argument is usually aimed at the wrong layer of them. It speaks to the part that handles sentences while the belief it hopes to dislodge may be held by habit, fear, status, injury, loyalty, and self-protection.

A proposition enters through the linguistic surface: the reportable mind, the mouth, the layer that can say “I believe” or “I no longer believe.” But that is not where conviction necessarily lives. Much of what a person calls belief is not a sentence he has endorsed but a trained readiness to perceive the world in a certain way. The cyclist does not believe he can balance; he can. The angry person does not merely believe others insult him; he has acquired a readiness to find insult. The coward does not merely believe the world is dangerous; he has become talented at discovering danger. The generous person does not merely believe generosity is good; he inhabits a widened field of what seems possible.

This is why argument so often disappoints. It imagines itself as a command line: insert truth, update belief. But minds are not command lines. They are trained systems. The visible belief is only the output layer. Beneath it sit habits of attention, loyalties, injuries, status needs, fears, loves, and old predictions about what must be defended. To argue against an identity-laden belief is not simply to offer a better proposition. It is to touch an entire arrangement of selfhood.

That is the deeper point of the Buddhist account. A sentence may be correct and still fail to perfume the storehouse deeply enough to matter. Another sentence, less complete but better timed, may land because it meets a circuit already forming in the dark. The arguer sees only the moment of refusal or conversion. He does not see the hidden training curve. He does not see which seeds have already been planted, which have been reinforced, which are close to ripening, and which are still too weak to change the field of appearance.

This also clarifies the asymmetry between human beings and present language models. In the ordinary use of a deployed model, inference does not rewrite the weights. The model can produce a new answer in context, but when the context window closes, the underlying disposition remains as it was. The change is local, temporary, and staged in the prompt. To alter the model in the stronger sense requires training, fine-tuning, memory, or some mechanism that changes the system’s underlying tendencies. Output variation is not model transformation.
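
The separation can be shown with the smallest possible model. This is a sketch under stated assumptions, not a picture of how any deployed system is built: `TinyModel`, its single `weight`, and the `context_bias` standing in for a prompt are all hypothetical. The only point is which method is allowed to write to the weight.

```python
class TinyModel:
    """One-weight linear model used to contrast inference with training."""

    def __init__(self, weight: float = 1.0):
        self.weight = weight

    def infer(self, x: float, context_bias: float = 0.0) -> float:
        # Inference reads the weight. The context can shift the output,
        # but nothing is written back to the model.
        return self.weight * x + context_bias

    def train_step(self, x: float, target: float, lr: float = 0.1) -> None:
        # One gradient step on squared error (weight * x - target) ** 2.
        # This is the only place where the weight changes.
        error = self.weight * x - target
        self.weight -= lr * 2 * error * x


if __name__ == "__main__":
    model = TinyModel()
    print("output with context:", model.infer(2.0, context_bias=5.0))
    print("weight after inference:", model.weight)
    model.train_step(2.0, target=6.0)
    print("weight after one training step:", model.weight)
```

The context changes what the model says on this occasion and leaves what the model is untouched. Only the training step leaves a deposit.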

The human being is different. He never cleanly separates training from inference. Every conversation, humiliation, discipline, temptation, argument, prayer, victory, and failure perfumes the storehouse. Most deposits are too small to notice. Most do not turn the basis. But they are not nothing. They alter the probabilities. They prepare the threshold. What looks, years later, like a sudden change of mind may be the surfacing of a long subterranean retraining.

This is why the most reasonable-seeming act in the world, telling someone the truth and expecting him to change, rests on a misunderstanding of how change works. Truth can be necessary and still not be sufficient. The person across from you may grant the claim, repeat it accurately, and even defend it to others, and still perceive and react exactly as before, because assent happens in the explicit mind while the storehouse moves only by perfuming: by repeated contact, and by practice continued past the point where the ego has already claimed insight.

So the answer to “how does a mind change?” is: slowly, then suddenly; continuously in the depths, discontinuously at the surface. The world supplies impressions. Practice repeats them. Attention selects them. The self appropriates or resists them. The storehouse receives them. Seeds accumulate. Circuits form in the dark. Then one day the output changes, and everyone mistakes the moment of manifestation for the moment of transformation.

This does not mean argument is useless. It means argument is only one instrument, and usually not the strongest one. It can supply an impression, a disturbance, a crack in prediction, a sample of another way of seeing. But whether that sample becomes a seed, whether the seed is reinforced, whether it ripens, and whether it joins other seeds already waiting below the surface — that happens inside the person’s history. The arguer’s leverage was never only the proposition. It was the conditions under which the proposition could be received.

Here the Buddhist proverb is both wise and dangerous: the teacher reaches only the one whose moment has come. It is wise because it teaches patience. It reminds us that no teacher, friend, lover, or opponent commands the hidden timing of another person’s transformation. But it is dangerous because it can flatter the one who says it. It can turn impotence into superiority: I am awake, you are merely not ready. It can license disengagement as restraint and contempt as compassion.

There is also an ethical cost hidden in the language of effectiveness. To stop arguing and begin arranging another person’s training data is to stop treating him as a mind owed reasons and to start treating him as an environment to be tuned. Argument, for all its weakness, at least pays the other the courtesy of supposing he can be reasoned with. The designer of conditions has quietly surrendered that courtesy. He may be more effective, but he is also closer to manipulation.

The honest position must hold both truths together. Argument rarely changes people as quickly or cleanly as arguers imagine. But abandoning argument is not free. The most effective way to change a person may also be the one that has stopped asking his consent. A mind changes through contact, repetition, trust, pain, practice, and time. It changes when the basis from which perceptions and reactions arise has been altered. But because that alteration reaches beneath the surface of explicit belief, it is always morally charged. To change a mind is not merely to win an argument. It is to participate in the making of a world.
