by giraffe_lady
53 days ago
I'm not in this field, but I know someone who used to be and we've talked about it a fair bit. A quick overview of what's needed, from what I understand: old books aren't that neat. You tend to have a lot of notes and other documents, translations, and scribal annotations from different eras interleaved or in the margins. You need to make decisions about that stuff as you go, which requires being informed about the context and meaning of those documents, which may well be in another language, or from hundreds of years before or after the document you're trying to process. For any given physical object it's quite likely that no single scholar has all the information necessary. It is also extremely important to preserve all the context: things like which exact pages a fragment is stuck between, or even its orientation, can be critical information to later scholars. And in all of this you're handling ancient and precious one-of-a-kind paper documents. It's just slow going, and well beyond what I would even consider "skilled labor"; this is very much the work of research and scholarship. By the time you get a camera pointed at a page, you're at the easy part.
|
As for imaging, there is Irish Scripts on Screen (https://www.isos.dias.ie/) which covers many different places and time periods.
Answering the grandparent comment, LLMs are not good at Old Irish. Seriously, they are awful at it. There is just too little data for it to work. I wrote a little bit about text clustering in Old/Middle Irish (see https://doi.org/10.1515/9783110680744-005). I think the better place to start is transcription, and there are some tools out there that help, like Transkribus (https://www.transkribus.org/), which I haven't used but which looks useful.
edit: typos