
My first way of showcasing this was to take a spare computer sitting around the office and write a little Python script that used an LLM to parse information out of the file names our finance team used to label rebilling invoices. The invoices included the client, payment date, amount, late payment status, etc., all written into a convoluted and completely inconsistent file name. The little office PC had 16GB of RAM, so it was usable for running an LLM on the CPU, and I just let it run for like 2 days. I continued with my normal work, and when it finished I had an intern spend 1 whole day validating just 6% of the data, which turned out to be 97% accurate. I made some obvious changes and was able to fill in that 3% gap. (Later we did find a handful of errors, but overall you could consider it 99% accurate.)
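For anyone wanting to try something similar, here is a rough sketch of what such a script could look like. It is not the script I actually ran: the field names, the prompt wording, and the `ask_llm` callable (whatever function sends a prompt to your local model and returns its reply as a string) are all assumptions for illustration.

```python
import json
import random
from typing import Callable, Optional

# Hypothetical field names, based on the fields mentioned above.
FIELDS = ["client", "payment_date", "amount", "late_payment"]

# Hypothetical prompt wording; tune it for your own model and file names.
PROMPT_TEMPLATE = (
    "Extract the following fields from this invoice file name and reply "
    "with a single JSON object using exactly these keys: {keys}.\n"
    "Use null for anything you cannot find.\n"
    "File name: {name}\n"
)


def build_prompt(file_name: str) -> str:
    """Fill the prompt template for one file name."""
    return PROMPT_TEMPLATE.format(keys=", ".join(FIELDS), name=file_name)


def parse_reply(reply: str) -> Optional[dict]:
    """Return the extracted fields, or None if the reply has no usable JSON."""
    # Small models often wrap the JSON in chatter, so cut out the {...} part.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Keep only the expected keys; missing ones become None.
    return {key: data.get(key) for key in FIELDS}


def extract_all(file_names, ask_llm: Callable[[str], str]) -> dict:
    """Map each file name to its parsed fields (None when parsing failed)."""
    return {name: parse_reply(ask_llm(build_prompt(name))) for name in file_names}


def sample_for_review(file_names, fraction=0.06, seed=0):
    """Pick a random subset of file names for a human to validate."""
    names = list(file_names)
    if not names:
        return []
    count = max(1, round(len(names) * fraction))
    return random.Random(seed).sample(names, count)
```

Anything that comes back as `None` goes straight to a human, and `sample_for_review` is how you would pick the 6% slice for the intern to check by hand.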

While it really resonated with my management, I felt worried I wouldn't be able to replicate these kinds of results on other projects.

THE ONLY REAL ADVICE I CAN GIVE ON AI PROJECTS IS . . . don't let your management's expectations of LLMs outweigh their capabilities.

I'm sure I speak for many people here when I say that your non-tech-fluent directors get together and think GPT4 is some sort of deity. GPT4 is smart (or used to be at least), I'll give it that, but small locally hosted 7B/13B LLMs are very limited. People for whatever reason get AI infatuation: the second they finally see you show direct value with it, they will lose their shit over its assumed capabilities. You've got to be direct with them that no matter what dumb video they saw about Sam Altman, what you are proposing is not that. Be very clear about its possible scope, because there is some idiot in your organization who will assume you can programmatically answer prayers. I actually had this guy from our networking team try to raise a concern about the LLM going sentient and us having a "Skynet" problem. Granted, this was back in March 2023, so AI hysteria was a little more rampant, but still.

tl;dr: my recommendation for your PDF project is to run https://github.com/oobabooga/text-generation-webui. If you can get a 30-series GPU in your company, then run a 13B 4-bit model that can pull info, assign tags, and run minor analysis on your text. Otherwise, find a spare 16GB machine and do the same, but over a longer time scale.
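As a rough sanity check on why 13B at 4-bit is the sweet spot here, the arithmetic below estimates the size of the weights alone. It is a back-of-the-envelope sketch: the function name is made up for illustration, and real memory use is higher because the context cache and runtime overhead come on top, so check it against the VRAM of the actual card you get.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare 4-bit quantized weights with unquantized 16-bit weights.
for label, params in [("7B", 7), ("13B", 13)]:
    four_bit = weight_memory_gb(params, 4)
    sixteen_bit = weight_memory_gb(params, 16)
    print(f"{label}: ~{four_bit} GB at 4-bit, ~{sixteen_bit} GB at 16-bit")
```

By this estimate a 13B model is about 6.5 GB of weights at 4-bit versus about 26 GB at 16-bit, which is why the quantized version is the one that has a chance of fitting on a consumer GPU or in 16GB of system RAM.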

Run a second prompt that checks for hallucinations: "Does the following text make sense?" followed by the previous prompt + text. If yes, keep it; otherwise make the intern do it.
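A minimal sketch of that check-and-route step, under the same caveats: the prompt wording is made up, and `ask_llm` is a stand-in for whatever function sends a prompt to your model and returns its reply. A model checking its own output is not a guarantee of anything, so anything that is not a clear YES goes to the human pile.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical wording for the follow-up check prompt.
CHECK_TEMPLATE = (
    "Does the following text make sense as an answer to the prompt? "
    "Reply with only YES or NO.\n\n"
    "Prompt:\n{prompt}\n\nText:\n{text}\n"
)

Pair = Tuple[str, str]


def passes_check(prompt: str, text: str, ask_llm: Callable[[str], str]) -> bool:
    """Ask the model whether its earlier output makes sense."""
    verdict = ask_llm(CHECK_TEMPLATE.format(prompt=prompt, text=text))
    # Only a reply that starts with YES counts as a pass.
    return verdict.strip().upper().startswith("YES")


def route(items: Iterable[Pair], ask_llm: Callable[[str], str]) -> Tuple[List[Pair], List[Pair]]:
    """Split (prompt, text) pairs into kept results and ones for a human."""
    kept, for_human = [], []
    for prompt, text in items:
        if passes_check(prompt, text, ask_llm):
            kept.append((prompt, text))
        else:
            for_human.append((prompt, text))
    return kept, for_human
```

Usage is just `kept, for_human = route(pairs, ask_llm)`, where `pairs` holds the original prompts and the model's answers to them.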

GPT-J-6B is still one of the best models because it has indexing & categorizing as its main purpose. Other models are great, but the core idea behind LLMs is that it's just a high-level autocomplete.