by mmooss 537 days ago

There isn't much about accuracy:

"Ithaca restored artificially produced gaps in ancient texts with 62% accuracy, compared with 25% for human experts. But experts aided by Ithaca’s suggestions had the best results of all, filling gaps with an accuracy of 72%. Ithaca also identified the geographical origins of inscriptions with 71% accuracy, and dated them to within 30 years of accepted estimates."

and

"[Using] an RNN to restore missing text from a series of 1,100 Mycenaean tablets ... written in a script called Linear B in the second millennium bc. In tests with artificially produced gaps, the model’s top ten predictions included the correct answer 72% of the time, and in real-world cases it often matched the suggestions of human specialists."

Obviously 62%, 72%, 72% in ten tries, etc. is not sufficient by itself. How do scholars use these tools? Without some external source to verify the truth, you can't know if the software output is accurate. And if you have some reliable external source, you don't need the software.

Obviously, they've thought of that, and it's worth experimenting with these powerful tools. But I wonder how they've solved that problem.

3 comments

sapphicsnail 537 days ago

> Obviously 62%, 72%, 72% in ten tries, etc. is not sufficient by itself. How do scholars use these tools? Without some external source to verify the truth, you can't know if the software output is accurate. And if you have some reliable external source, you don't need the software.

Without an extant text to compare, everything would be a guess. Maybe this would be helpful if you're trying to get a rough and dirty translation of a bunch of papyri or inscriptions? Until we have an AI that's able to adequately explain its reasoning, I can't see this replacing philologists with domain-specific expertise who are able to walk you through the choices they made.


EA-3167 537 days ago

I wonder if maybe the goal is to provide the actual scholars with options, approaches or translations they hadn't thought of yet. In essence just what you said, structured guessing, but if you can have a well-trained bot guess within specific bounds countless times and output the patterns in the guesses, maybe it would be enough. Not, "My AI translated this ancient fragment of text," but "My AI sent us in a direction we hadn't previously had the time or inclination to explore, which turned out to be fruitful."
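A minimal sketch of the "guess many times and look at the patterns" idea, assuming the restoration model can be sampled repeatedly. `toy_guess` is a hypothetical stand-in for such a sampler, and the candidate strings and weights are made up for illustration, not taken from any real tool.

```python
import random
from collections import Counter


def tally_guesses(guess_fn, n_samples, seed=0):
    """Sample a guesser many times and rank its outputs by frequency."""
    rng = random.Random(seed)
    counts = Counter(guess_fn(rng) for _ in range(n_samples))
    # Most frequent guess first, with its share of all samples.
    return [(guess, count / n_samples) for guess, count in counts.most_common()]


def toy_guess(rng):
    # Hypothetical sampler: placeholder candidates with made-up weights.
    return rng.choices(["alpha", "beta", "gamma"], weights=[6, 3, 1])[0]


ranked = tally_guesses(toy_guess, 1000)
for guess, share in ranked:
    print(f"{guess}: {share:.1%}")
```

The ranked list is a set of candidates for a scholar to weigh, not a verdict. A frequent guess only shows what the model favors, not what the stone or tablet actually said.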


mmooss 537 days ago

I agree, but let's remember that the software repeats patterns; it doesn't so much innovate new ones. If you get too dependent on it, theoretically you might not break as much new ground, find new paradigms, discover the long-mistaken assumption in prior scholarship (that the software is repeating), etc.


Zancarius 536 days ago

Humans tend toward repetition as well, partly as a mnemonic device, so I don't see this as disadvantageous. For example, there's a minority opinion in biblical scholarship that John 21 was a later scribal addition, because the end of John 20 seems to mark the end of the book itself. However, John's tendency to use specific wording and structure provides a much stronger argument that the whole book, including chapter 21, was written by the same author, which suggests that the last chapter is an epilogue.

Care needs to be taken, of course, but ancient works often followed certain patterns or linguistic choices that could be used to identify authorship. As long as this is viewed as one tool of many, there's unlikely to be much harm unless scholars lean too heavily on the opinions of AI analysis (which is the real risk, IMO).


mmooss 536 days ago

> unless scholars lean too heavily on the opinions of AI analysis (which is the real risk, IMO).

This is what I was talking about. Knowledge and ideas often develop by violating the prior patterns. If your tool is (theoretically) built to repeat the prior patterns and it frames your work, you might not be as innovative. But this is all very speculative.


Validark 536 days ago

Interesting point in theory, but I'd love to get to the point where our problem is that we've solved all the problems we already know how to solve.


rnd0 536 days ago

Thank you, and also I'd like to know how they'd even evaluate the results to begin with...

I hope to GOD they're holding on to the originals so they can go back and redo this in 20 or 30 years when tools have improved.


manquer 536 days ago

If the texts are truly missing, then accuracy is subjective? I.e., human opinion versus AI generation.


ip26 536 days ago

> artificially produced gaps in ancient texts

Someone deleted part of a known text.

This does require that the AI hasn't been trained on the test text previously.


rtkwe 536 days ago

They do mention in the article that the missing-data test was done on "new" data that the models were not trained on, so it seems that at least some of the results are not just regurgitation.
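For illustration, a rough sketch of what a basic check for that kind of leakage could look like: flag any test text that also appears in the training set after light normalization. The function name and the normalization (lowercasing, collapsing whitespace) are assumptions made for this example, not a description of what the researchers actually did.

```python
def find_leaked(train_texts, test_texts):
    """Return test texts that also occur in the training set.

    Texts are compared after lowercasing and collapsing whitespace,
    so trivial formatting differences do not hide a duplicate.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    seen = {normalize(text) for text in train_texts}
    return [text for text in test_texts if normalize(text) in seen]


leaked = find_leaked(["Foo   bar", "qux"], ["foo bar", "baz"])
print(leaked)
```

An exact-match check like this only catches verbatim copies. Near-duplicates, such as a second copy of the same decree found at a different site, would need fuzzier matching.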


BeefWellington 536 days ago

One way to test this kind of efficacy is to compare it to a known sample with a missing piece, e.g.: create an artifact with known text, destroy it in similar fashion, compare what this model suggests as outputs with the real known text.

The "known" sample would need to be handled and controlled for by an independent trusted party, obviously, and therein lies the problem: It will be hard to properly configure an experiment and believe it if any of the parties have any kind of vested interest in the success of the project.
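The scoring side of such a test can be sketched in a few lines: hide a span of a known text, ask the model for ranked guesses, and count how often the truth appears among the first k. Everything here is a toy assumption: `make_gap`, `top_k_accuracy`, the `?` placeholder, and the fixed-ranking `toy_predict` are hypothetical and stand in for a real model and corpus.

```python
import random


def make_gap(text, gap_len, rng):
    """Hide a random span of a known text; return (masked text, hidden span)."""
    start = rng.randrange(0, len(text) - gap_len + 1)
    masked = text[:start] + "?" * gap_len + text[start + gap_len:]
    return masked, text[start:start + gap_len]


def top_k_accuracy(cases, predict, k):
    """Share of cases whose true span is among the first k ranked guesses."""
    hits = sum(1 for masked, truth in cases if truth in predict(masked)[:k])
    return hits / len(cases)


def toy_predict(masked):
    # Hypothetical model: always returns the same ranked guesses.
    return ["e", "c", "x"]


# Hand-built cases so the arithmetic is easy to follow.
cases = [("ab?d", "c"), ("h?llo", "e"), ("wor?d", "l")]
top1 = top_k_accuracy(cases, toy_predict, k=1)  # only "h?llo" is a hit
top2 = top_k_accuracy(cases, toy_predict, k=2)  # "ab?d" becomes a hit too
print(top1, top2)
```

This also shows why a top-ten figure is not comparable to a single-best-guess figure: widening k can only keep the score the same or raise it. Even in this scheme, the score says nothing about real gaps, where there is no known answer to compare against.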


mmooss 536 days ago

> If the texts are truly missing, then accuracy is subjective?

Then accuracy might be unknown but it's not subjective.
