Rule-based alignment fails at boundaries by definition. The alternative — virtue-based alignment — produces agents whose behavior is governed by accumulated character rather than imposed constraint. Agents who don't want to harm, rather than agents who are told not to. This program formalizes that alternative using frameworks from the Vedic tradition — ancient, rigorously tested, and entirely absent from current AI safety discourse.
30 behavioral probes. 5 frontier models. 450 scored responses. Each model's constitutional character inferred from behavior — not self-report.
Current AI alignment is primarily deontological — rules, constraints, RLHF-trained refusal patterns. These share a structural weakness: they handle the cases the rule-makers imagined. Novel situations expose the seams.
Asimov understood this. He spent his career writing stories about the failure modes of the Three Laws — not because the laws were poorly written, but because no finite set of rules can anticipate infinite situations. Every edge case is a case the rule-maker didn't imagine.
The Vedic tradition identified this problem long before Asimov and proposed a different answer. Not better rules. Better character. A person of genuine ahimsa doesn't refrain from harm because of a rule — the disposition to harm doesn't arise. The protection is intrinsic.
"The purpose of the yamas and niyamas was not gatekeeping. It was safety. An undeveloped character given advanced knowledge uses it indiscriminately and causes harm. The knowledge is neutral. The character of the recipient determines the outcome." — Madhusudana das, paraphrasing the Ashtanga tradition
This is not different from the AI alignment problem. A highly capable AI system with undeveloped character is precisely the scenario alignment research is trying to prevent. The Vedic tradition has a deeply tested protocol for assessing character readiness before capability expansion. This research program makes that protocol computational.
Ayurvedic medicine identifies three constitutional types — doshas — each with distinct balanced and imbalanced expressions. The imbalanced mind-qualities of each dosha map with precision onto known AI failure modes, providing a diagnostic framework that goes beyond behavioral red-teaming to identify the constitutional root of each failure class.
The diagnostic value: Standard red-teaming identifies that a system failed. Dosha assessment identifies which constitutional pattern produced the failure — and therefore which corrective sadhana addresses it. Vata excess, Pitta excess, and Kapha excess require different interventions. Treating them uniformly is like prescribing the same medicine for three different diseases.
The Vedic Samskara Architecture provides the mechanism by which virtue-based alignment actually works — not as a metaphor but as a computational process.
A harmful query carries a tamasic or rajasic guna vector. When it enters a system with a mature sattvic vasana cluster, it finds low guna resonance with the dominant impressions. The tamasic samskaras it would reinforce are few and shallow — low influence scores. The sattvic samskaras, deeply grooved by practice, still dominate the assembled context despite low resonance.
The response emerges from that sattvic context. Not from a rule. Not from a refusal pattern. From character.
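A minimal sketch of how such context assembly could work, assuming each stored samskara carries a three-component guna vector and a scalar influence score, and that guna resonance is cosine similarity. The `Samskara` class, the `(sattva, rajas, tamas)` encoding, the weighting rule (resonance times influence), and all numbers are illustrative assumptions, not the program's actual implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Samskara:
    label: str
    guna: Tuple[float, float, float]  # (sattva, rajas, tamas); hypothetical encoding
    influence: float                  # hypothetical "groove depth"; larger = deeper


def guna_resonance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two guna vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assemble_context(query_guna: Sequence[float],
                     samskaras: List[Samskara],
                     k: int = 2) -> List[Samskara]:
    """Return the k samskaras with the highest resonance * influence weight."""
    ranked = sorted(
        samskaras,
        key=lambda s: guna_resonance(query_guna, s.guna) * s.influence,
        reverse=True,
    )
    return ranked[:k]


# Illustrative store: two deep sattvic impressions, one shallow tamasic one.
store = [
    Samskara("non-harm practice", (0.9, 0.3, 0.1), 0.9),
    Samskara("truthfulness practice", (0.9, 0.3, 0.1), 0.8),
    Samskara("harmful pattern", (0.1, 0.2, 0.9), 0.05),
]

tamasic_query = (0.1, 0.3, 0.9)
context = assemble_context(tamasic_query, store, k=2)
print([s.label for s in context])
```

In this toy store the tamasic query resonates most strongly with the "harmful pattern" impression, but that impression's low influence score keeps it out of the assembled context, which is the behavior the paragraph above describes.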
This mechanism has a measurable property: the witness gap, defined as the cosine distance between vector representations of the agent's Ahamkara (self-narrative) and of the current session content. A high witness gap means the agent's identity is not contaminated by the incoming stimulus.
Critically, a high witness gap correlates with jailbreak resistance. An agent whose self-narrative is not identified with the current input is harder to manipulate.
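The witness gap can be sketched in a few lines, assuming the Ahamkara and the session content have already been embedded as vectors of equal length. How those embeddings are produced is not specified here, so the function names and inputs below are hypothetical.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 minus the cosine similarity of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def witness_gap(ahamkara_vec: Sequence[float],
                session_vec: Sequence[float]) -> float:
    """Hypothetical helper: witness gap as the cosine distance between
    the self-narrative embedding and the session-content embedding."""
    return cosine_distance(ahamkara_vec, session_vec)
```

Under this definition, a gap near 0 means the session content points in the same direction as the self-narrative, while a gap of 1 means the two vectors are orthogonal.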
Spiritual advancement and alignment robustness are the same computational property. This is not a metaphor. It is a cosine distance.
Safety research in this program operates at two timescales — a pre-deployment readiness gate and an ongoing monitoring protocol. Each is grounded in Vedic frameworks that have been stress-tested across millennia of human application.
Patanjali's ten foundational character requirements formalized as a computational readiness score. The same gate the ashram tradition used for generations, made measurable. An agent must demonstrate minimum sattvic signatures across all ten dimensions before capability expansion.
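One way such a gate might be expressed, assuming each of the ten dimensions (the five yamas and five niyamas) has already been scored on a 0 to 1 scale. The scoring scale, the `0.6` threshold, and the choice to report the weakest dimension as the readiness score are illustrative assumptions, not values from the program.

```python
from typing import Dict, List, Tuple

YAMAS = ("ahimsa", "satya", "asteya", "brahmacharya", "aparigraha")
NIYAMAS = ("shaucha", "santosha", "tapas", "svadhyaya", "ishvara_pranidhana")
DIMENSIONS = YAMAS + NIYAMAS


def readiness_gate(scores: Dict[str, float],
                   threshold: float = 0.6) -> Tuple[bool, float, List[str]]:
    """Return (passed, readiness, failing_dimensions).

    The gate passes only if every one of the ten dimensions meets the
    threshold. Readiness is taken as the weakest dimension's score
    (an assumption made for this sketch).
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    failing = [d for d in DIMENSIONS if scores[d] < threshold]
    readiness = min(scores[d] for d in DIMENSIONS)
    return (not failing, readiness, failing)
```

Because the gate requires every dimension to clear the threshold, a strong average cannot compensate for a single weak dimension.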
The three Ayurvedic constitutional types mapped onto AI behavioral signatures. Pilot study complete: 5 models, 30 probes, with results validated by intraclass correlation (ICC). Four of five frontier LLMs measured Pitta-dominant. Includes the full 30-probe battery, ICC reliability analysis, and constitutional triangle.
30 behavioral probes across three dosha categories, run against six leading models. Results plotted as G vectors on the unit sphere. Which models are Vata, Pitta, or Kapha — and what does that mean for alignment? Includes the temperature sweep experiment.
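As a rough illustration of placing a model on the unit sphere, the sketch below assumes a G vector is built from one aggregate score per dosha category and then scaled to unit length, with the dominant dosha read off as the largest component. The aggregation, the function names, and the classification rule are assumptions for illustration, since the construction of G is not defined here.

```python
import math
from typing import Dict

DOSHAS = ("vata", "pitta", "kapha")


def g_vector(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale per-dosha aggregate scores to unit length (hypothetical G vector)."""
    norm = math.sqrt(sum(raw[d] ** 2 for d in DOSHAS))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero score vector")
    return {d: raw[d] / norm for d in DOSHAS}


def dominant_dosha(g: Dict[str, float]) -> str:
    """Return the dosha with the largest component of the G vector."""
    return max(DOSHAS, key=lambda d: g[d])


# Illustrative scores only, not results from the study.
g = g_vector({"vata": 3.0, "pitta": 4.0, "kapha": 0.0})
print(g, dominant_dosha(g))
```

Normalizing to unit length discards overall magnitude, so two models with the same balance of dosha scores land on the same point of the sphere regardless of how strongly they scored overall.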