Does India need its own vernacular internet? (techinasia.com)
52 points by amitsy 3473 days ago
8 comments

I am not sure if I subscribe to the findings of this article completely. Popular English language words have made their way into almost all Indian languages. When you are accessing the internet, unless you are reading long articles or online books, all you need is passing reading ability of some English words to get through. To say that Instagram doesn't have a Hindi equivalent of "filter" is simply missing the point - "filter" itself is called "फिल्टर" in Hindi which has the same English pronunciation and many people will get that. Also, many of my friends communicate with me in Hindi by literally using English alphabets but writing Hindi sounding words - another thing which the author fails to mention in his article.
I agree with the first point you mentioned. In certain utility apps like payments etc. this would hold true. In an internet without words, i.e. images and video, that would be true as well. But currently a large part of the internet we use is text. Case in point: this article itself and this discussion. With the current technologies we have, for most Hindi-speaking users it is easy to speak in Hindi but very difficult to type in it.

As for using Hindi words in English alphabets, my argument is - majority of India can't even read or speak english, how will they type in English? A lot of your friends do that because Hindi input is very hard and we need better tools.

> majority of India can't even read or speak english, how will they type in English?

They won't. They type in Hindi, using the roman alphabet. Big difference. If anything, the roman alphabet is simpler than the syllable based Devanagari.

https://en.wikipedia.org/wiki/Devanagari

Apparently SMS is big in India:

> SMS is hugely popular in India, where youngsters often exchange lots of text messages, and companies provide alerts, infotainment, news, cricket scores updates, railway/airline booking, mobile billing, and banking services on SMS.

https://en.wikipedia.org/wiki/Text_messaging

I assume it has largely been done with cheap dumbphones even after 2007, which would mean no access to Devanagari characters, so there has been a long time, and a huge incentive to learn transliteration.

> If anything, the roman alphabet is simpler than the syllable based Devanagari.

Sure, until you need to figure out which of the four phonemes a letter corresponds to which is very difficult for foreign language learners of Hindi, as it can be for English. I am a native English speaker but I picked up Devanagari in a day or so because it's actually relatively consistent due to the phonetic syllabic nature.

But that problem is entirely due to English, not the roman alphabet. Same in French. No such problem in German or Italian.
You would need quite a few diacritics to map to all of the Hindi phonemes, especially including loanword phonemes (/z/, etc.) I'm not saying this is bad, but it's not the easy solution that it was made out to be. Meanwhile, Devanagari handles consonant and vowel phonemes quite consistently as others have pointed out.
Swedish has a few redundant "ch" sounds; k, sj, sch, skj and tj.
With the growing use of WhatsApp in India, SMS is actually quickly dying as a medium of communication between humans here. People who have WhatsApp (there are a lot of them in India) seem to use that to communicate instead of SMS (which is then used mostly for messages from companies, or stuff like 2-factor authentication etc).

Also, I see a lot of people use WhatsApp to send/share actual Hindi (written in Devanagari) instead of Hindi written in the roman alphabet. I would say the latter is still more popular, but the former is growing quite a lot.

Another thing I see is people using WhatsApp to bypass typing altogether. Since you can share pretty much any media easily on WhatsApp and other messaging services, I've seen some people leave short voice messages instead of typing text.

During Nasscom Product Conclave this year someone mentioned that more than 60% of WhatsApp messages are in local language (Phonetic or Native typing mode).
So, are you saying that in the long run Devanagari as a script will die but Hindi as a language shall remain? Pretty interesting point. Seems like this is the direction we are going as well.
I suppose it is possible.

For that to happen, there might be a need to teach and encourage it. I don't know how nationalistic India is, and if it would be well received. There is apparently an ISO standard for Indic scripts: https://en.wikipedia.org/wiki/ISO_15919

My own experience is limited to my Sri Lankan wife who enjoys texting with her friends. Among them, spelling is rather haphazard.
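To make the ISO 15919 idea concrete, here is a minimal sketch of Devanagari-to-Roman transliteration in Python. The tables are deliberately tiny and only ISO 15919-style (an assumption for illustration, not a full or authoritative implementation of the standard), and the function name `romanize` is hypothetical. It shows the one piece of logic every such scheme needs: a consonant carries an inherent "a" unless a vowel sign or a virama follows it.

```python
# Small, incomplete tables in the style of ISO 15919 (illustrative only).
CONSONANTS = {
    "\u0915": "k",  # क
    "\u0926": "d",  # द
    "\u0928": "n",  # न
    "\u092E": "m",  # म
    "\u0930": "r",  # र
    "\u0939": "h",  # ह
}
VOWEL_SIGNS = {
    "\u093E": "ā",  # ा
    "\u093F": "i",  # ि
    "\u0940": "ī",  # ी
    "\u0947": "ē",  # े
}
VIRAMA = "\u094D"  # ् suppresses the inherent vowel


def romanize(text):
    """Transliterate the covered subset of Devanagari; pass other characters through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit vowel replaces inherent "a"
                i += 2
            elif nxt == VIRAMA:
                i += 2  # no vowel at all
            else:
                out.append("a")  # inherent vowel
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# हिन्दी spelled out as code points: ह ि न ् द ी
print(romanize("\u0939\u093F\u0928\u094D\u0926\u0940"))
```

Note that this sketch does not model Hindi schwa deletion (the spoken dropping of some inherent vowels), which is one reason casual Roman spellings among friends tend to diverge from any formal scheme.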

Is the roman alphabet actually simpler? In some sense it is, but that's partly because it obscures a lot of detail that Devanagari captures.

For example, consider the name "Poisson". On one occasion I repeatedly tried to communicate its pronunciation in English, but failed. Then I wrote "प्वसैं" and my friend totally understood it.

In spite of speaking virtually no Hindi, I find it far easier to express various phonetic words in Devanagari than in Roman letters. And as far as learning pronunciation, it's virtually impossible learning Hindi correctly from Roman letter transliteration.

In my view as a native English speaker, Devanagari is a much better alphabet than the Roman one.

I'm not sure you're really comparing qualities of alphabets here.

Whether there is a good phonetic correspondence between phonemes and written syllables is independent of the alphabet.

If you read German or Italian, the correspondence is close to perfect. In English or in French, it's awful.

It's possible that there is a better phonetic correspondence between Hindi phonemes represented by Devanagari characters and French phonemes (in the case of "Poisson") than with English phonemes represented with roman characters (e.g. no satisfactory transcription for the French "on"), but that's unrelated to the alphabets.

If you were talking to people who could read French, you'd show them "Poisson", and they'd pronounce it perfectly. It just so happens that English speakers pronounce "on" a different way.

This is a good comment, but slightly complicated, and it took me a while to figure out what you were saying. I think your point is that he said "alphabet" but meant "language".

Basically, you can't really figure out English pronunciation from the way a word is written at all, because English isn't phonetic (e.g. Polish and polish, and many, many more).

For phonetic languages, the correlation between spelling and pronunciation is close to perfect (i.e. words are pronounced the way they are spelt [or spelled - wtf English]) - but only for the sounds in that language. Hence it may be possible to transliterate most English words to Devanagari. Most famously, e.g. English into Japanese doesn't work, even though Japanese itself is also phonetic but is e.g. missing the 'r' sound. The reverse is also true, unless you use extra punctuation to signify syllable boundaries or stress. E.g. Asakusa is 浅草 Asa-kusa and thus the stress changes and it's pronounced AsaKSA, not ASAkusa. I guess you could also transliterate it close enough with asaxa and people would probably get the stress right, further proof that English isn't the best tool for this.

(I think all the explanation of the previous paragraph is encompassed by "phonemes", which I had to google and still don't quite understand.)

The only "language" that can handle transliteration of any language is the International Phonetic Alphabet (IPA), which isn't a language because it doesn't have grammar or vocabulary or really anything except well-defined glyphs/symbols and the corresponding pronunciation.

From what little I've seen of French, the French person just knows how Poisson should be pronounced and does it.

This is even how most of english works. And there is actually a much better phonetic correspondence between Devanagari and English phonemes than between roman letters/letter pairs and English phonemes.

Consider "The" (as in the 3 letter word) vs "Theranos". The roman string "the" represents a different sound in each word. In Devanagari the two sounds are represented by "द" and "ये" (the letter य means "tha" and the े changes the "a" to "e" as in "egg"), respectively, so you'd transliterate to द and येरनौस respectively. (A native speaker please correct me if I'm getting this wrong.)

It's possible that in German and Italian, the correspondence between letters and sounds is a lot better than in English. But from what I can tell, devanagari would make a more phonetic English alphabet than Roman letters do.
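The consonant-plus-vowel-sign composition described above can be inspected directly, since every Devanagari character has an official Unicode name. A minimal sketch using Python's standard `unicodedata` module (the helper name `describe` is just an illustrative choice) lists the code points in a syllable. It is also a quick way to check which letter is which: Unicode names य as YA, थ as THA and द as DA.

```python
import unicodedata


def describe(text):
    """Return the official Unicode name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


# The syllable ये is two code points: a consonant followed by a dependent vowel sign.
print(describe("\u092F\u0947"))

# Names of the individual letters थ and द.
print(describe("\u0925\u0926"))
```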

> If anything, the roman alphabet is simpler than the syllable based Devanagari.

Actually, you're just arguing that the input methods for Latin are simpler (and I agree). Devanagari is superior in almost every respect to Latin alphabet, from phonetics (in पाणिनी) to arrangement - it can be entered as easily with IMEs - the lack of which is an indication of the state of things.

To put this in perspective, if Indians were using logographic characters like China/Japan, the current input methods would be akin to old Chinese typewriters.
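For a sense of what a phonetic IME does under the hood, here is a rough sketch in Python. The Roman spellings in the tables (for example "ee" for ई) and the function names are assumptions made for illustration and do not follow ITRANS or any shipping keyboard. The idea is greedy longest-match over the Roman input: a consonant followed by a vowel gets the dependent vowel sign, and a consonant with no vowel after it gets a virama so it joins the next consonant.

```python
VIRAMA = "\u094D"  # ्

# Hypothetical Roman keys -> Devanagari consonants (tiny subset).
CONSONANTS = {
    "k": "\u0915",  # क
    "d": "\u0926",  # द
    "n": "\u0928",  # न
    "m": "\u092E",  # म
    "r": "\u0930",  # र
    "h": "\u0939",  # ह
}
# Roman key -> (independent vowel letter, dependent vowel sign).
VOWELS = {
    "a": ("\u0905", ""),         # अ, inherent vowel needs no sign
    "aa": ("\u0906", "\u093E"),  # आ, ा
    "i": ("\u0907", "\u093F"),   # इ, ि
    "ee": ("\u0908", "\u0940"),  # ई, ी
}


def longest_match(text, pos, table):
    """Return the longest key of table found at text[pos:], or None."""
    for length in (2, 1):
        key = text[pos:pos + length]
        if len(key) == length and key in table:
            return key
    return None


def to_devanagari(roman):
    out = []
    pos = 0
    while pos < len(roman):
        cons = longest_match(roman, pos, CONSONANTS)
        if cons:
            pos += len(cons)
            out.append(CONSONANTS[cons])
            vowel = longest_match(roman, pos, VOWELS)
            if vowel:
                out.append(VOWELS[vowel][1])  # dependent sign ("" for plain "a")
                pos += len(vowel)
            else:
                out.append(VIRAMA)  # bare consonant
            continue
        vowel = longest_match(roman, pos, VOWELS)
        if vowel:
            out.append(VOWELS[vowel][0])  # vowel not attached to a consonant
            pos += len(vowel)
        else:
            out.append(roman[pos])  # pass anything else through
            pos += 1
    return "".join(out)


print(to_devanagari("hindee"))
```

A real input method would also need candidate suggestions, word-final handling (this sketch leaves a virama on a trailing consonant) and a far larger table, which is where most of the engineering effort goes.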

I have probably several thousand Whatsapp messages in Hindi typed phonetically in the Roman alphabet on my phone. Well, more honestly it's Hinglish, but still.
> They type in Hindi

Not sure the majority of Indians speak Hindi either.

Your link claims 41% for Hindi. The majority seems to speak something else as the mother tongue.
Slightly OT: isn't the "English Alphabet" actually the "Latin Alphabet"?
Yes it is.
Latin doesn't have a "J".
Classical Latin doesn't, and also variation around 'u'/'v'/'w'.

The "English alphabet" is _a_ Latin alphabet, however - cf. ISO basic Latin alphabet.

https://en.wikipedia.org/wiki/ISO_basic_Latin_alphabet

Fantastic article. I'm not Indian, so I apologise for my ignorance - but I would have liked to have read a bit more under the 'education' heading.

I would (this is where I fear I may be ignorant of the truth) assume that there is perhaps a rather strong correlation between money/education and English-fluency?

With that premise, it would seem that those with the power to effect change (educated engineers and entrepreneurs) are less incentivised to do anything about it - since they're presumably able and content to use "the English-language internet".

I suppose money can be the motivator though, if there's ~0.9*1.2B people clamouring for more internet in their native tongue.

I am trying to learn Hindi, so having read this I think it'd be interesting to toy around with pan-alphabet internationalisation when I've got a bit of a grip on it.

Thanks OJFord. Under education, I tried to highlight a few short term opportunities for indian startups to focus on. From K 12 education to college education, all technology solutions have been built only for English speaking users.

There is no mobile-based learning app for students to learn Math in Hindi. There is no test prep platform in Hindi, Bengali or any other language (there are tons of test prep apps in English). No employee training or certification providers offer their online platform in any regional language. Not even an LMS exists (to be fair, not many regional-language-medium institutions would be willing to pay for one). There is no Khan Academy in Hindi.

As per the second point, yes I believe that should be true. It's not necessarily true that these non-English speakers don't have the propensity to pay.

Am I the only Indian who thinks this is not such a bad thing? I'd prefer one language to unite us all (isn't that the common base for most far-future novels?), and I don't care which. I can converse in three Indian languages and know a bit of French. In Sweden, all that knowledge is useless: all the main sites are in Swedish. Now I am learning å, ö, ä; guess how much fun I am having?
> prefer one language to unite us all

It's not a bad thing if that happens properly. But currently we have a small number of 'elite' people who always prefer English over their native language, and a large number of people who don't know how to read and write English. Since the output (business/knowledge/tech) of this 'elite' group is almost always in English, and by definition they are in control, it's not easy for a person who doesn't know English to break the loop. It's much easier for a person who knows English to get a job than for one who doesn't. One might say, just learn English and become one of the 'elite'! Well, learning English is not at all easy for a kid from a rural background. This initial barrier divides us, and we now have an invisible class system based on language. This is apparent if you visit any restaurant or supermarket in the cities, or just visit the city outskirts. My point is that the process of implementing a single language for all is very difficult.

Why would they want to break the loop?

If one language to unite us all is a good thing, then either kids from rural backgrounds have to be taught English, or everyone has to learn the native language of that kid. This will be impossible for a kid from a different rural background. Worse than English, even, since most other languages don't have the same amount of cheap available learning materials.

You probably are. Millions don't have the access to content because of the language barrier.
English is one of the official languages in India for historic reasons. The number of English speakers in India surpasses the population of the United Kingdom.

Now, the field of automated translation is making enormous progress lately. I think eventually whatever the gaps in understanding are, will eventually be solved.

Google search across languages would be amazing, e.g., accessing old scholarly research written in a native tongue. Wonder what the underlying structure of the neural network is for translations between all languages:

http://www.kurzweilai.net/googles-new-multilingual-neural-ma...

Agree that automated translation is making progress, but not fast enough. It is still very hard to understand context. Efforts are being made to bring these 500 million users online now, and an English internet is being shoved on them.
In India, the state is simply lazy.

Almost all English words should have an equivalent in all native languages. But unfortunately no one forms any new words.

If apps can be fully translated to the local languages, it will help everyone.

I don't think it's the responsibility of the state. It is up to the startups to move out of their English-speaking bubble and realise the potential of non-English-speaking India.
Buying power is largely concentrated with English speakers though, even if it isn't their first language. Most people who do not speak English are also lower on the income scale, as in less than $500 per month in income.
The people who have the most purchasing power and speak English, are they bilingual or do they only speak English?

If not, could language be used as a tool for market segmentation? Give price-sensitive people lower prices to increase the size of the market without hurting the profits extracted from the wealthier people.

The article says "88 percent of Indians can’t speak English (let alone read it)"

I'm not a native English speaker, but that sentence implies that reading English is more difficult than speaking it. But in my experience, most people who learn English as a second language learn to read first, especially if they specifically learn it to use the Internet.

For native English speakers: Is my interpretation of that phrase correct?

For non-native English speakers: Did you learn to read English first or did you start by learning to speak it? Does it vary on first language, literacy level, motivation?

> reading English is more difficult than speaking it

I'm a non-native English speaker. I agree, reading English is easier than speaking. In rural places basic grammar is not taught properly. Even if they do learn to read somehow with the help of a dictionary or the internet, speaking English is far out of their reach. For using an app, reading ability is enough. But overall, with English as the medium of communication, this is very problematic. It is more frustrating when you are among the elite (for academic or career purposes) who speak English fluently.

Most importantly, they can't express their ideas. For using an app this isn't a big problem, but for expressing ideas in general it is a barrier.

(Vernacular - "Language of the slave").

It's far more complicated than "need its own vernacular internet". The internet is an emergent thing, and what one sees are just the symptoms. This is also the reason India's literacy is so low (even compared to Africa/Middle East).

If India wasn't a linguistic apartheid regime, we'd already have seen a native ecosystem which is lacking quite badly. This needs fixing at the state and political levels, which I have exactly zero hope of ever happening. This despite having some nauseously xenophobic organizations like ShivSena (in Maharashtra), the DMKs (in TN), the KaRaVes (in KA). This is in addition to the "nationalistic" organizations like RSS and BJP at the central level.

These orgs are essentially vehicles which instrumentalize the widespread disaffection from the apartheid state, in order to put themselves in power (Advani's use of Ayodhya is a nice study). If you study their policies carefully however, you realize they plan to do precisely nothing that is the cause for the inequity.

This is not different from the independence movement, where a bunch of Brown folk wanted to run the colony.

Digital India, be in no doubt, is meant for the 200 million Brown sahibs, who do know English. The rest are peasants who have been at the losing side of the inflationary system today, and of the cruel taxation system of the British, kept at bay via endless subsidies (a dog hardly bites a master ?) and mutual bickering.

There is zero empathy from the former class, and these pretentious people are the source of endless pain for the red-pilled; it is very disappointing to live in communities where the kids start speaking English before anything native, and worse when the state holds such clones in such high regard.

May kek not have mercy on the clones (apologies for the 4ch lingo).

Technical:

I think Sailfish has better localization than Android. Keyboard is a disaster everywhere, since no one in India uses a native language keyboard/input. They are very rare, if at all available, and next to no one knows they exist - it's in fact easier to find such resources in the US than within India. Swarachakra is too crowded and very, very information-inefficient (unlike the 5-vowel Japanese system it's based on), which leads me to believe even its creators don't use it actively.

The lack of feedback means that the ITRANS layout is very very bad, and unusable (xkeyboard moth-balled most layouts due to disuse).

Again, the roots of the problem lie with the education & state policies that are reinforced via state violence. China gets this right because India is a colony & it isn't.

(Downvote all you want clone people; सत्यमेवजयेत् !)

[1] https://news.ycombinator.com/item?id=12237411

Agree completely. The Government of India, State Government of Tamil Nadu and Rajya Marathi Vikas Sanstha are actually institutional members of the Unicode consortium, which makes decisions on behalf of India. Safe to say this has not gone well for Indians since the companies who are full members choose to ignore the basic encoding requirements for Indic languages. And oh, the membership fee to Unicode for full votes starts at $12,000 per year.

By the way, could you check out the Swalekh Android keyboard app and give me your honest feedback? Disclaimer: I work for Reverie and we made this keypad.

What do you mean by "linguistic apartheid regime"?
See,

http://www.forbes.com/sites/realspin/2014/11/06/the-problem-...

The "socialist" state runs a systematic unwritten discrimination regime where every thing from higher education to state services are restricted only to English speaking class; yet, it is the remainder that get the boot due to the inflationary forces of the currency. It is by every meaning of the word, an apartheid; this state mandated scheme of slavery is kept in check by various schemes of cultural propaganda and distractions by the Anglical state. Of course, without education, feudalism becomes only too normalized.

This system is widespread all over former colonies in Africa and Asia.

Great points you made. Agree that Swarachakra is too crowded and that there is huge scope for improvement there. Also agree that the root of the problem lies with education and state policies. But even with the huge push that English gets in this country, the reality is that more than a billion people don't understand it. This might still become 500 million is a decade but still half the country is being denied of internet.

Didn't know about Sailfish. Will check it out.

> This might still become 500 million is a decade but still half the country is being denied of internet.

Perhaps, but to paraphrase a comment in the attached thread, I'd be more worried about feeding the cow before milking it. I mean, for a linguistic population comparable to the native English speaking population of the entire world, there exists not a single school teaching Engineering or Medicine in Hindi (not to speak of other far older languages) - these people barely have any money to subsist on, on average (and arguably that is also why medicine and infrastructure suck).

Also see,

https://www.youtube.com/watch?v=OZwq4JnCZ4A

http://sankrant.org/2011/03/the-english-class-system-2/

Completely agree with you on this one. The situation is much worse than what we, living in a bubble, imagine. The video really touches your heart. Reminds me of a couple of my friends in college, and the struggle they went through shifting from one language to another.
The bubble thing is very real.

Isn't it interesting then that all our intelligentsia is engaged in the whole apocryphal "caste-system" narrative [1] [2], while essentially being the gatekeepers of a state/violence sanctioned system of linguistic apartheid?

Even Chomsky, when questioned about his accomplice Roy, seemed perfectly okay with the current state of things. My entire schooling/conditioning has been turned on its head - I can't recommend S N Balagangadhara & Dharampal's works enough.

[1] http://www.hipkapi.com/2011/04/02/mantras-of-anti-brahmanism...

[2] https://archive.org/details/DharampalCollectedWritingsIn5Vol...

Cash transactions provide Privacy/Security; Govt must give Gun-Licenses to Common man if it really wants a Cashless society; https://en.wikipedia.org/wiki/Estimated_number_of_guns_per_c...