by giancarlostoro 3692 days ago
I was going to say... my main interest in this project is precisely for Biblical studies. I could talk about analyzing the Bible for hours, but let's just say there's way more depth than many even realize. The Aleph Tav in relation to the Book of Revelation is one such example: many translations omit it, but the Aleph Tav Study Bible explores it in depth. There could be many discoveries made with these kinds of projects that are missed by just about anyone only reading a translation.

There are a ton of Jewish idioms in the Bible that many don't understand at all, including "No man knows the day or the hour", which is a traditional Jewish wedding idiom. Lots and lots of things could be explored with enough data and resources.


I'd think that the advantage of machine translation is on corpora that are not known up front (i.e. user-supplied text) or corpora that are exceptionally large.

If you have a small(ish), well-known text, I don't think you will get much insight from machine translation. Certainly there are plenty of uses for computer text analysis/mining in biblical studies, but I doubt translation is one of them. And for obscure idioms or hapax legomena, machine translation definitely can't help you, because by definition there are no other sources to rely on.
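
To make "hapax legomena" concrete, here is a minimal sketch of the text-mining side of this: listing the words that occur exactly once in a corpus. The function name `hapax_legomena`, the naive tokenizer and the sample sentence are all assumptions for illustration. A real study of Hebrew or Greek would need proper lemmatization, since inflected forms of one word would otherwise count as separate words.

```python
import re
from collections import Counter

def hapax_legomena(text):
    """Return the words that occur exactly once in `text`, in order of first appearance."""
    # Naive tokenizer (an assumption): lowercase the text, then take runs of
    # Unicode letters, allowing internal apostrophes (e.g. "don't").
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    counts = Counter(words)
    # Counter keeps insertion order, so the result follows first appearance.
    return [word for word in counts if counts[word] == 1]

# Hypothetical sample text, only to show the call.
sample = "In the beginning was the word, and the word was with God"
print(hapax_legomena(sample))
```

A word that shows up in this list has, within the corpus, no second context to compare against, which is the point made above about having no other sources to rely on.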

With a sufficient level of precision, there's room for machine analysis to "reveal" things we are ignoring out of custom. A lot of text analysis done by people is full of biases and deferral to authorities.

E.g. I remember getting into an argument with a teacher at school over the interpretation of a poem. "His" interpretation, which was really the interpretation of some authority who'd written a book, was blatantly contradicted by the text, unless you assumed the author had suddenly forgotten all his basic grammar, despite all the evidence everywhere else that he was always very precise in this respect.

Of course, in some of these kinds of instances, it will be incredibly hard to overcome the retort that any "revelation" is just a bug.

In a more general sense, people are typically exceedingly bad at parsing text, judging by how often online debates devolve into bickering caused largely by misunderstanding the other party's argument. Often to the extent of even ending up arguing against people who you agree with. Having tools that help clarify the parsing for people might be interesting in that respect too.

Well, I wouldn't look for idioms, but it would be interesting to throw information such as "Strong's Concordance" into the mix. I've yet to really think of an application for this library, but it would be fun to play around with it nonetheless. I would be analyzing the Hebrew / Greek / Syriac scripts, looking for omitted or missing verses, etc. It would make for interesting study, if anything.

You might be interested in Andrew Bannister's research on computer analysis of the Quran. He wrote a book on it [1], and there's also this paper which gives a high-level overview [2].

[1] http://www.amazon.com/Oral-Formulaic-Study-Quran-Andrew-Bann...

[2] http://www.academia.edu/9490706/Retelling_the_Tale_A_Compute...