“Have you seen a movie called Don’t Look Up? That’s exactly what’s happening right now. It’s not only Covid; we’ve had fires, wars everywhere. Nature is cross with us, and we’re not ready to look up.”
SanskAI, the creator and administrator of sanskrit-ai.com, spoke to me from Chandigarh. (He asked not to be named in this story.) His website is a collection of resources on the Sanskrit language and its applications in artificial intelligence and technology. He’d built it soon after the pandemic began, having become curious about Sanskrit after watching a YouTube video on its “existential significance.” He now believed the answers to humanity’s problems lay in the scriptures, which “we don’t know how to decode anymore.”
“Because nobody knows Sanskrit!” he said. “Nobody’s ready to look at it because the moment you talk ‘Sanskrit,’ you talk ‘Hindu’, and ‘Hindu’ has become a cuss word in today’s times.”
SanskAI professed he didn’t know how to solve this. “But my thing is, can we bring Sanskrit into technology, which is the future?” He was enthusiastic about things like Raji, an action-adventure video game set in ancient India. “You’ll also see Sanskrit hard rock, with a lot of youngsters creating songs with shlokas in them. This is a good way of getting people involved,” he said. “The beauty of Indian culture, or rather sanatan dharm”—Hindu philosophy and practice—“is that it’s not a matter of life and death. If somebody makes a death metal song out of a Sanskrit shloka, nobody’s going to kill someone. Try it in another religion. They’ll come for you.”
At one point, SanskAI had written to Google asking why Sanskrit was not among the languages the company offered translation in. He said he hadn’t received a reply. “They are translating to Chinese and the Congos and the bongos and weird languages, even Afrikaans and stuff like that,” he said. “Why not Sanskrit?” The answer was simple. “The West cannot be trusted to take any interest in this direction, because why should they? There is a huge conspiracy to keep everything under wraps, because otherwise everything will be credited to Hindu culture.”
Claim Game
In 2022 India, SanskAI’s anxieties about Sanskrit aren’t fringe obsessions.
Government spending on Sanskrit has never been higher. In January 2022, the Karnataka government, led by India’s ruling Bharatiya Janata Party, allotted ₹320 crore to the state’s Sanskrit University. In February 2020, replies to a question raised in the Lok Sabha revealed that the Union HRD ministry had spent ₹643 crore over three years on promoting Sanskrit. [1] (Government expenditure on the environment in those three years amounted to ₹608 crore.) On 12 May, two months after my call with SanskAI, Google announced that Sanskrit, “the number one most requested language,” will be one of eight Indian languages to be added to Google Translate.
The government’s interest in Sanskrit coincides with its interest in artificial intelligence and machine learning. Prime Minister Narendra Modi has often stated that India should become a “global hub” for AI. In the 2021 Union Budget, the government announced the National Language Translation Mission. Later, the Ministry of Electronics and IT placed a proposal of ₹450 crore for this work before the Cabinet. The project, called Bhashini, aims to use AI, ML and speech recognition technologies to “build the next generation government apps and websites that will be ‘conversational’ like Amazon’s Alexa or Apple’s Siri.”
I wondered about the place Sanskrit would have in this scheme. In the last few years, several state officials and sources have variously claimed the following: “Sanskrit is the most suitable language for AI and ML according to NASA,” “Sanskrit is the language for future supercomputers” and “talking computers,” “Sanskrit is the best programming language,” and “Sanskrit is the most scientific language.” [2]
These initiatives had a philosophical basis: the idea of Sanskrit as a forward-looking language that is not only compatible with technological ideas but also anticipates technological progress. It’s a state project with a colonial legacy. When the British cited the superiority of their language and technologies, India’s elites, largely dominant-caste Hindu men, began to seek parallels for scientific thought in their own past.
“It was a space of translation,” the historian Gyan Prakash writes, “an arena where the colonised elite reinterpreted the Vedas and repositioned them in relation to modern science.” Deriving from the Sanskrit root “vid,” meaning “to know,” the Vedas were understood to contain “timeless and absolute truths.”
In this way, a Hindu elite rebranded these texts, and eventually helped synonymise the modern Indian nation with their own culture. Vedic practices were often given scientific justifications, which spread far and wide in upper-caste homes. One popular theory, for instance, suggested that the sacrificial fire of Vedic rites purified the air around by breaking up polluting particles. This is now a common belief, the subject of WhatsApp forwards and tweets about the “scientific basis” of Hindu religious practices.
AI and ML are among the most current vehicles for these claims from antiquity. Content about “Sanskrit and AI” proliferates on the news, Twitter, Instagram, Facebook, Quora, YouTube, Pinterest, and several online blogs and forums—which is how I came across SanskAI and his website. I also came upon a number of research papers, many of which looked similar to one another.
But a lot of the content was also quite technical. I was particularly intrigued by a YouTube video titled “Why Protect Sanskrit?” in which the Indian-American entrepreneur and author Rajiv Malhotra declares that Sanskrit should be credited for the last two decades of development in Natural Language Processing, the sub-field of AI concerned with getting machines to process and interpret human languages. Malhotra is the founder of the Infinity Foundation, a non-profit that “specialises in the field of civilisational studies applying the Dharma lens to examine a broad range of topics.”
I also kept coming across social media posts invoking the American space agency NASA’s endorsement of Sanskrit. Many of them referenced a certain paper published in AI Magazine in 1985, titled “Knowledge Representation in Sanskrit and Artificial Intelligence.” SanskAI, too, recalled, “Sanskrit was spoken about with artificial intelligence way back… by that guy, the American scientist…” he trailed off, before the name came back to him. “Rick Briggs.” (Briggs’ address in the paper is listed as: RIACS, NASA AMES Research Center, Moffet Field, California 93405. [3] )
Reading this paper unlocked several lines of inquiry. Was Sanskrit really responsible for recent developments in NLP? Was it the best language for AI and ML? Would Rick Briggs be able to explain further? How do ideas about Sanskrit’s technical validity relate to its elite cultural position?
Asking these questions took me back to the 1980s. From there, and through the lens of AI and NLP, I began to understand the complex motivations and turns of history that have shaped India’s approach to technology.
The Pen is Black
“There was a trend in the 1960s, 1970s and 1980s,” said Deepak Kumar, a professor of computer science at Bryn Mawr College, Pennsylvania, “where in AI, we were trying to capture the meaning of English.”
I was led to Kumar by the only other trace of Rick Briggs I managed to find online, apart from the 1985 paper and references to it in the wider discourse. This one, also published in AI Magazine, was a follow-up to the 1985 paper: not a theoretical work but a “review of the First National Conference on Knowledge Representation and Inference in Sanskrit.”
The KRIS conference owed its origins to Briggs’ 1985 paper. In December 1986, the who’s who of Indian computer science flew Briggs down to Bangalore to discuss some of the ideas he had proposed in his paper. Kumar didn’t personally attend the Bangalore conference, but he’d co-authored a paper presented there—among the most important, according to Briggs’ report, because it addressed the possibility of utilising Sanskrit for NLP. [4]
On our call in February, Kumar explained the ideas that sparked it all. “If I say ‘This pen is black’, any phone or computer will be able to translate that into another language,” he said. “But if I turned around and asked, ‘Now that you know that, what is the colour of this pen,’ it wouldn't be able to answer that. This is because the AI technology in question is only translating, without any conception of the meaning of the sentence.”
“The way machine translation is done right now is purely driven by statistical and machine learning techniques,” Kumar continued. Large amounts of data in the most commonly used languages are generated every second on the internet. With enough parallel data between two languages, ML-based AI can spot patterns to know that “black” in English, for instance, appears as “kaala” in Hindi, without having to know that these are colours or, further, that colours are properties of the entity ‘pen.’
Knowledge representation, on the other hand, is concerned with making machines understand and infer from information. Instead of learning patterns “bottom up” from big data, it aims to code rules so that the machine knows how to interpret information from the “top down.” This kind of rule-based engine belongs to a collection of methods in artificial intelligence called Symbolic AI. “Because Sanskrit had this formal language and formal grammar,” said Kumar, “whatever you say could fit into a knowledge representation system.”
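Kumar’s pen example can be made concrete with a minimal sketch of a “top down” store. This is only an illustration under stated assumptions: the names `KnowledgeBase`, `tell` and `ask` are hypothetical, the toy grammar accepts only sentences of the form “This X is Y,” and none of it is drawn from Briggs’ or Kumar’s papers.

```python
import re


class KnowledgeBase:
    """Toy symbolic store: facts are kept as (entity, attribute) -> value."""

    # Hand-coded rule (an assumption for this sketch): which attribute
    # each value word describes.
    VALUE_TYPES = {"black": "colour", "red": "colour", "long": "length"}

    def __init__(self):
        self.facts = {}

    def tell(self, sentence):
        # Rule: "This <entity> is <value>" asserts a property of the entity.
        match = re.fullmatch(r"this (\w+) is (\w+)", sentence.lower().strip(" ."))
        if not match:
            raise ValueError("sentence does not fit the toy grammar")
        entity, value = match.groups()
        attribute = self.VALUE_TYPES[value]
        self.facts[(entity, attribute)] = value

    def ask(self, entity, attribute):
        # Returns None when nothing is known about that attribute.
        return self.facts.get((entity, attribute))


kb = KnowledgeBase()
kb.tell("This pen is black")
print(kb.ask("pen", "colour"))
```

Because the rule records that “black” is a colour and that the colour belongs to the pen, the follow-up question can be answered. A system that only maps strings to strings has nothing to look up.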
In Sanskrit, every sentence is “seen as a little drama played out by an Agent”—the doer—“and a set of other actors,” writes Paul Kiparsky, “which may include a Recipient, Goal, Instrument, Location and Source.” [5] This allows a sentence’s meaning to be represented in these six basic categories, and by the relationships between them, independent of the actual words in it. [6]
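One way to picture this is as a record with one slot per role. The sketch below is an assumption-laden illustration, not an implementation of Panini’s system: the `Frame` class and its field names are hypothetical, and the two frames are filled in by hand rather than produced by a parser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """A sentence's meaning as an action plus six role slots."""
    action: str
    agent: Optional[str] = None
    recipient: Optional[str] = None
    goal: Optional[str] = None
    instrument: Optional[str] = None
    location: Optional[str] = None
    source: Optional[str] = None


# "Maria gives a book to John in the library."
active = Frame(action="give", agent="Maria", recipient="John",
               goal="book", location="library")

# "In the library, a book is given to John by Maria."
passive = Frame(action="give", location="library", goal="book",
                recipient="John", agent="Maria")

# Different wording and word order, same underlying frame.
print(active == passive)
```

The comparison depends only on who plays which role, which is the sense in which the representation is independent of the actual words.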
The Panini of Kiparsky’s title was a Sanskrit grammarian who’s thought to have lived between 600 and 400 BC. His Ashtadhyayi is a treatise containing 3976 rules describing how the spoken language of the time operated. The rules outline how classical Sanskrit words are derived from about 2000 base verbal roots, or dhatus. “Every word in Sanskrit can be traced back to the dhatu,” Dheepa Sundaram, a Sanskritist, digital religion scholar, and semiotician explained. Each dhatu is in turn generated from distinct linguistic units—phonemes and morphemes are terms of the art—such that Ashtadhyayi functions as an algorithm. It consumes phonemes and morphemes to produce words and sentences. [7]
Briggs’ and Kumar’s papers from the 1980s were studies of how Sanskrit’s algorithmic grammar could be used to create a rule-based or Symbolic AI engine to process a natural language. “That was sort of the prevailing theory at that time,” said Kumar. “Fast forward to today, and technology has come a long way.” From being the dominant paradigm of the mid-1950s to the 1990s—so much so that it is termed GOFAI or Good Old-Fashioned AI—Symbolic AI has increasingly been discarded in favour of data-driven approaches, which include methods like Deep Learning and Neural Networks.
Today, Deep Learning powers most AI applications in common use, including Google Translate. [8] Contrary to Malhotra’s claim in his “Why Protect Sanskrit” video, Sanskrit or rule-based Paninian engines have nothing to do with these translation technologies. “Nothing, not even Sanskrit or even the structure of English, comes into play in that,” said Kumar.
Panini’s analysis of grammar did, however, anticipate many ideas in modern linguistics and computer science. It’s regarded as the first known instance of the application of algorithmic thinking to a domain outside of logic and mathematics. [9] But the same formal grammar and closed nature that gave Sanskrit its parallels with ideal programming languages also made it less suitable for spoken use. “Open languages, like English, are ones that keep evolving,” said Kumar. “A closed or formal language, while desirable for computing applications, is not a good human language, by its very nature.”
Briggs’ paper was not referring to the spoken version of Sanskrit, which came with ambiguity as any natural language does, but a conversion to a certain shastric version of Sanskrit. Shastric Sanskrit, the author and programmer Vikram Chandra writes, was meant to formulate logical relations with scientific precision, and is different from the spoken form. Dheepa Sundaram, the semiotician, did not recognise the term “shastric Sanskrit,” and said that linguists and scholars, when they talk about Sanskrit, are referring to either Vedic or classical Sanskrit.
Sundaram taught Sanskrit for over 15 years. But she decided not to publish a dissertation on rasa consciousness after coming up against a fundamental concern: she couldn’t separate Sanskrit’s formal elegance and beauty from the hierarchies embedded in it. “I really enjoyed that work, but one of the things I did not engage meaningfully with was the casteism and cultural supremacy that is embedded in those meanings, in those dhatus.”
“Look at words like brahmana,” she said. “It comes from the verb root bra- which means ‘that which pervades everything.’ The Upanishads are amazing texts but they are casteist, elitist texts. Upanishad means ‘sit close to the teacher,’ but it’s also a certain group of people sitting next to a certain kind of teacher. And frankly, we”—meaning Brahmin women, like herself and me—“were not included in that group.” She’d once asked a temple priest in the US for a copy of a Sanskrit prayer. “He said, ‘Just listen and take it in, but don’t try to understand it.’”
Conference Crossovers
Before my call with Deepak Kumar ended, I asked him about Briggs, whom I’d had no success in tracking down.
“I don’t know,” Kumar said. “I've never met him. If I have, it was probably just in passing at a conference. The only thing I know was that he worked at NASA in space administration. In the 1970s and 1980s, there was a lot of linguistic interest in Sanskrit, even in Europe. And around that time, a lot of US agencies were very interested in machine translation. That may have been his connection, but I’m completely speculating at this point. I mean, it could be that he spent time in India and learned Sanskrit in an ashram. Who knows?”
Among the others Briggs had mentioned in his 1987 report was H.N. Mahabala, the doyen of Indian computer science education. Mahabala set up the first CS departments in several institutions across India. A teacher and mentor to the likes of the Infosys co-founders Narayana Murthy and Kris Gopalakrishnan, he also introduced the first course on AI at IIT Kanpur, which was at the forefront of computer culture in India.
Over coconut water at his home in Banashankari, Bengaluru, Mahabala told me what he knew. “Rick Briggs was a student of Sanskrit in Berkeley, California. One fine day, he found that what was being said in his Sanskrit class was very similar to what we have to do in computers,” Mahabala said. “He was one of the first to point out the commonality between the two.” (In the months following our meeting, I learnt that a certain “Richard Briggs” had been an undergraduate student at UC Berkeley from 1981 to 1984, where he’d got a B.A. in Linguistics and Computer Science. The university would not share an alumnus’s contact details, so I haven’t been able to confirm that this Richard is our Rick.)
Briggs’ 1985 paper came to the attention of the Akshar Bharati group, the first cohort of Indian scholars to work on Sanskrit and AI. Many of its members were part of the first generation of engineers who had graduated from premier technology institutes in independent India, and some had gone on to universities abroad.
Professor Rajeev Sangal, who became president of the NLP Association of India in 2002 and currently teaches at IIIT Hyderabad, was one of the few who returned to India, to work on “problems relevant to Indian society.” “What could be better than language processing?” he said on a Zoom call. “Because it would be great to make texts available across Indian languages with the help of machine learning.”
In 1984, soon after Sangal had become a faculty member at IIT Kanpur, he met Professor Vineet Chaitanya. After a decade teaching at BITS Pilani, Chaitanya had decided to take sanyas—spiritually renouncing secular life—at the Chinmaya Mission, an organisation dedicated to disseminating the philosophies of Vedanta. The Mission’s website describes Vedanta as “the essential core of Hinduism” and “the universal science of life.” “There was a Vedanta course in which we were also taught Sanskrit,” Chaitanya said. “When I saw Panini’s Ashtadhyayi, I thought other computer professionals should know about it too.”
Sangal and Chaitanya banded together with other scholars to work on “machine translation through the lens of Panini.” One day, in 1985, they opened an issue of the American publication AI Magazine. “Our old teacher Professor H.N. Mahabala contacted us,” Sangal remembered, “and he said, ‘I know you people are working on this. What do you think of Rick Briggs?’” [10]
That’s when the group decided to organise a conference on Sanskrit and AI and invite Briggs. “Getting a scientist to come was an uphill task,” said Sangal. “The cost of airfare was equal to a year’s salary for faculty. Professor Mahabala said ‘Don't worry, I’ll raise the money. This is an important thing we must do in India.’”
The conference took place in Bangalore’s Shankara Mutt—a monastic institute run by Sri Paramananda Bharathi Swamiji, a physicist from IIT Madras who’d taken sanyas. Two tutorials preceded the conference. The pandits were taught AI and computer programming, while the scientists were familiarised with Sanskrit literature. “For some time, we didn’t understand what they wanted and they didn’t understand what we said,” Mahabala recalled. “But some students who had studied Sanskrit in Melukote came along. They took our programming classes and acted as a go-between with us and them.”
“The people from the AI and computer community were enthused,” recalled Chaitanya, “but the Sanskrit community was a bit defensive.” Sangal remembered that Briggs was “mercilessly questioned” about his ideas, both at the conference and the places he visited later to give lectures. “In India, we have a complex about our own knowledge,” he said. “Unless it comes from the West, we don’t accept it. But it cuts both ways. Because Rick Briggs was writing it, many people accepted it, but, on the other hand, the diehards were also dismissive, asking ‘What does he know about Sanskrit? These broad connections don’t prove anything.’”
But Sangal and his group defended Briggs’ paper. “It connects many different things, but leaves it at that,” he said. “It is showing you this large landscape of what can be done. How to do it in detail will have to be worked out through further research.”
This work is now a dominant strand of AI research in India.
Bridge on the River AI
Sangal belongs to the small and controversial school of thought, in India and internationally, which disagrees that the machine translation problem can be solved with big data alone. “In research, what they’re finding is that the ML-based method gives impressive results in the beginning, but then it saturates,” he explained.
On a scoring system called the BiLingual Evaluation Understudy, Sangal said, any translation that scores more than 40 out of a possible 100 is judged to be comprehensible. “But when you have a BLEU score of 40 and then you run material in a specialised domain—say engineering, politics or social science—the score can come down to 20. The system starts performing poorly because the big data is not a representative sample. Linguistic theories mixed with big data can help and this is where Panini comes to our aid.”
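For readers unfamiliar with the metric, here is a heavily simplified sketch of how a BLEU-style score is computed. It is an assumption for illustration only: the function `simple_bleu` is hypothetical, it compares one sentence against one reference, and it uses only one- and two-word sequences, whereas standard BLEU is computed over a whole test set with sequences of up to four words.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU on a 0-100 scale."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: a candidate n-gram is credited at most as
        # many times as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: translations shorter than the reference lose points.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("the pen is red", "the pen is black"))
```

The score rewards word sequences shared with a human reference translation. That is why it tends to drop when a system trained on general text meets specialised vocabulary it has rarely seen.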
In this context, Sangal told me more about the Union government’s Bhashini project. “The goal is to do speech-to-speech machine translation for about 15 Indian languages in the next three years. Yes, human editing will be needed, but the productivity of human editing will be 3x. We’ve received funding for projects based on big data, based on theories, based on Sanskrit—which is backed in a strong way in that mission.”
Another researcher whose projects have been recently funded under Bhashini is Pawan Goyal, an associate professor of computer science at IIT Kharagpur. He told me about how a hybrid approach—big data mixed with the rule-based engines—would work. “If you have sufficient parallel data between a pair of languages, we don’t need any others, but what if we don’t have sufficient parallel data?”
In Goyal’s scheme, Sanskrit works as a bridge language. “When we’re going from language A to language B,”—Goyal took the pair of Kannada and Hindi as example—“for the language A, we take a small dataset, maybe 10,000 sentences or even less, and we try to annotate it with its linguistic characteristics.” These are mapped to Sanskrit, which is already available in the annotated format and provides an exhaustive grammar. From a list of possible solutions, the most accurate one is then chosen as the solution in language B. [11]
But even this use for a bridge language or rule-based technique is being made redundant by a Deep Learning solution. Monojit Choudhury, principal data and applied scientist at Microsoft India, told me about the new Multilingual Zero-Shot Model, which supports translation between as many as 93 languages at once, removing the need for a single bridge language. [12] Google Translate uses the same model. (Editor’s note: Microsoft India and this publication are partners in an ongoing multimedia collaboration, Paradigm Shift. The company has no editorial association with or discretion over this story.)
The shift to Deep Learning and statistical techniques had left many AI researchers in the lurch, Choudhury said. “What happens with Deep Learning is that a lot of expertise built over the years suddenly becomes irrelevant. And it’s a very uncomfortable place to be in. I totally empathise because I have been through the same.”
He was wary that a global technology revolution may pass India by. “When data-driven NLP was becoming popular worldwide in the 1990s and 2000s, we didn’t move so quickly. And that’s why for Indian languages, technologies were far behind. Today, when the second revolution is happening”—in Deep Learning—“if we don’t jump into that boat, we will lag further behind.”
While a few Indian projects are using Deep Learning-based techniques for Indian languages, such as AI for Bharat, Choudhury worried India was still lacking fundamental innovation. “In the NLP space, are we building some technology that the rest of the world is taking up? I don’t see that happening much.” He attributed this to the lack of computing infrastructure and a critical mass of researchers. “Today, it’s very different from how research used to happen in the 1950s or 1960s. When it comes to Deep Learning, it’s all about muscle power. You really need huge teams, coordinated effort, large collaborations.”
Goyal and Sangal are among just 50 to 75 principal AI researchers in India, according to a 2018 report produced by Itihaasa, a technology research not-for-profit founded by Kris Gopalakrishnan. Choudhury put the total number of AI researchers in India at 250, with 30-45 working specifically on NLP. “For a country the size of India,” he said, “that number is still very small.”
A Science of Nationalism
The researchers working on Sanskrit and AI are fairly removed from related conversations in non-academic spaces. Kumar thought the SanskAI-style claims were perhaps being “kept alive by people who are really passionate about Sanskrit.” The early rule-based engines from the time of the Bangalore conference were, after all, “toy systems,” he said. “Those systems become so huge that managing technical difficulties becomes next to impossible. This doesn’t amount to a criticism of Sanskrit per se, but for that whole set of conceptual ideas (related to Symbolic AI).”
Goyal and Sangal both thought unscientific claims about Sanskrit to be counter-productive. “As academics, we try to avoid overclaims,” Sangal cautioned. “It’s good to make a lesser claim and then prove it.” Ultimately, Sangal felt, Sanskrit’s role in linguistics and computing had been established “very strongly.”
This left me thinking about the kind of people and institutions that were most invested in emphasising Sanskrit’s importance in AI research. Among the 105 links at the bottom of SanskAI’s website, alongside resources such as Ashtadhyayi Github, are sites like intellectualkshatriya.com, hindugenocide.com, reclaimtemples.com, and vivekagnihotri.com, all dedicated to political positions on the spectrum of Hindu fundamentalism. Some of the websites further link to Malhotra’s Infinity Foundation.
SanskAI said his work wasn’t affiliated with the Foundation, though he agreed with Malhotra’s views on Sanskrit and his overall mission. “Malhotra is also talking about artificial intelligence, how they”—the West—“want to take over, how we have been hijacked,” SanskAI said. “The only question is, how do we reach out to more people and get them to understand what they’re missing out on?”
State-funded technology institutions, too, are contributing to the discourse with a new—and very public—vigour. [13] Before the pandemic, the main activity for the Sanskrit club of IIT Roorkee was to present Vedic chanting during the institute’s convocation ceremony. Now, after moving online, the club reaches over 20,000 people across various platforms, conducting workshops on approaching “Sanskrit texts with a scientific temper” and inviting experts to speak on subjects like “Can We Build Conscious Machines? A Vedantic and Modern Perspective.” After the completion of its first free-of-cost spoken Sanskrit course in July 2020, the Union HRD minister Ramesh Pokhriyal urged other institutes to embrace such “small steps” for “something bigger.”
“The way to achieving an Aatmanirbhar Bharat,” Pokhriyal declared, “is dependent on the knowledge of the ancient texts, which were all written in Sanskrit.” At an earlier event in 2019, Pokhriyal even charged the IITs and NITs to prove that Sanskrit is the most scientific language: “If the scientists at NASA pronounce Sanskrit as the most scientific language,” he asked listeners, “why do you have a problem?”
Raghudev, currently a third-year student and volunteer at the IIT Roorkee Sanskrit club, appeared uninterested in what NASA had to say about Sanskrit. (He preferred not to use his last name.) “I don’t need to look at an outsider to get approval for things of my own,” he said. “It was actually old news that Sanskrit is the best language for coding. I don’t believe in such types of things. There are many things floating in the market in the name of Vedas, in the name of Puranas, in the name of Sanskrit. Almost 50 percent of that is fake. We approach everything with a scientific temper. And with cryptoanalysis. [14] And then we have faith in Sanskrit.”
South Asians have been grappling with the contradictions of approval from a hegemonic Western world for well over a century. Indian Hindus have assumed no small amount of that historical baggage. “One narrative about why India was colonised by the British was that we weren’t as technologically developed as the West,” said the Indian-American media scholar Rohit Chopra, who studies how Hindu nationalist understandings of technology and national identity are communicated online.
“That narrative,” Chopra continued, “was woven with this other story: that ancient Hindu civilisation was a technologically advanced society, but fell into an era of darkness because of Muslim invasions. Ever since, science and technology have been viewed as a means for reviving that ancient glory but—crucially—also for Hindu self-actualisation.”
That might explain why, for the Hindu upper-caste milieu, a veneration of science has never meant jettisoning the religious urge. Jawaharlal Nehru’s exhortations about India’s need to cultivate the “scientific temper” meant that “the Indian middle classes have come to see science as primarily spectacular technology,” wrote the sociologist Ashis Nandy in 1988. And this expectation could explain “why science is advertised and sold in India the way consumer products are sold in any market economy.” It could also explain why elites consider science “a cure-all for the ills of Indian society.”
The contributions of Panini and the Sanskrit grammatical tradition to the fields of linguistics and computing are everlasting. But their continued use in AI technology isn’t free from being such an advertisement. Long after Nehru’s vision, science continues to be deployed in service of the state, instead of the other way around.
Sanjana Ramachandran is a writer, marketer and engineer who has worked across media, tech, consumer goods and start-ups. She is a South Asia Speaks fellow from the Class of ’22, working on a collection of essays titled Famous Last Questions. Her writings can be accessed on her website.