by internetter 882 days ago
Marginally related, but this is one of the things I'm bullish on ChatGPT for. Too frequently, I've gotten hundreds of lines of malformed textual data that I need to standardize. This is close to impossible with regex, but I can drop it into GPT and it handles it wonderfully.
4 comments

With ChatGPT, however, there is no indication if it failed on a line; it could give you a slightly incorrect result.
Yeah, that's always been a fear, but I always dogfood and I've had no issues yet.
I've done it with transforming data (for example, pasting a table in and asking it to turn it into LaTeX) and had the occasional issue with it misordering or forgetting things. It didn't take long for me to spot the error, though.
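One cheap way to catch the "misordering or forgetting things" case is a token check. This is a rough sketch, not anything the commenters described: it assumes the conversion is format-only (so every word and number on an input line should reappear on the matching output line), and the names `tokens` and `suspicious_lines` are made up for illustration.

```python
import re
from collections import Counter
from typing import List


def tokens(line: str) -> Counter:
    """Multiset of lowercase alphanumeric tokens found in a line."""
    return Counter(re.findall(r"[a-z0-9]+", line.lower()))


def suspicious_lines(inputs: List[str], outputs: List[str]) -> List[int]:
    """Return indices where the output lost at least one input token.

    Assumption: the transformation only changes formatting, so extra
    tokens in the output (markup, separators) are fine but missing
    ones are not.
    """
    if len(inputs) != len(outputs):
        raise ValueError("input and output have different line counts")
    flagged = []
    for i, (src, out) in enumerate(zip(inputs, outputs)):
        missing = tokens(src) - tokens(out)  # keeps only positive counts
        if missing:
            flagged.append(i)
    return flagged
```

It will not catch a value that was changed into another valid-looking one, only values that went missing or moved to a different row, so it narrows down what to eyeball rather than replacing the check.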
You could run it through thrice with a different prompt/temperature/model and pick the majority result (or exit with success on the first two passing runs).
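A minimal sketch of that voting idea, with the caveat that it is only an illustration: each entry in `runs` is a hypothetical callable wrapping one prompt/temperature/model combination (no real API is assumed here), and `majority_result` is an invented name.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_result(
    text: str,
    runs: Sequence[Callable[[str], str]],
) -> Optional[str]:
    """Return the output a strict majority of runs agree on, else None.

    Stops as soon as one output has enough votes, so with three runs
    the third is skipped when the first two already match.
    """
    needed = len(runs) // 2 + 1
    votes: Counter = Counter()
    for run in runs:
        output = run(text).strip()  # ignore leading/trailing whitespace
        votes[output] += 1
        if votes[output] >= needed:
            return output
    return None  # no majority: treat as a failure and inspect by hand
```

Exact string comparison is strict; for free-form output you would probably want to normalize or vote per line instead of on the whole blob.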
Good idea. If the data is a list of records where the order isn't important, randomly permuting them (ETA: then sorting the final outputs) would be another option.

ETA2: Would the downvoter care to explain why? Genuinely puzzled.
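For what it's worth, the permute-then-sort check could look roughly like this. It is a sketch under assumptions: `transform` stands in for whatever sends a list of records to the model and returns the converted records (hypothetical, nothing here calls a real API), and `order_independent` is a name invented for the example.

```python
import random
from typing import Callable, List


def order_independent(
    records: List[str],
    transform: Callable[[List[str]], List[str]],
    trials: int = 3,
    seed: int = 0,
) -> bool:
    """Check that shuffling the input does not change the sorted output.

    Runs `transform` once on the original order, then on `trials`
    random permutations, and compares the sorted results.
    """
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    baseline = sorted(transform(list(records)))
    for _ in range(trials):
        shuffled = list(records)
        rng.shuffle(shuffled)
        if sorted(transform(shuffled)) != baseline:
            return False
    return True
```

A `False` here doesn't say which run was wrong, only that the runs disagree, which is the signal you were missing in the first place.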

I have no idea how regex became the standard. The syntax is impossible to remember unless you write regular expressions daily. Most people only rarely need regex, so it needs to be relearned every time. It is also incredibly unsatisfying to write (and read).
I used it a lot for a few years decades ago, and only use it rarely now. I remember the syntax well, and I am not known for having a great memory. I think its terseness suits the extreme focus of its use perfectly.
As much as I hate to hop on the AI bandwagon, this is definitely where tools like ChatGPT shine.

Not to mention, non-tech people will now be able to do what once could’ve only been done with cryptic regex.

The only good thing to come out of regular expressions is https://regexcrossword.com/
What would you replace regex with?
I tried using ChatGPT (4) for format conversion. I had a draft YAML file and needed some differently structured JSON, mostly with the same content.

If you just want to change the format, it works. If you need more than programming skills, it seems to fail due to the amount of text.

E.g., if you have a list of items and want ChatGPT to generate a meta field that it cannot generate using simple Python code, it stops after 10 to 20 elements.

Thus at least the cloud version doesn't work so well here.

I also wanted it to help me fill out my i18n file with translations and plural forms. Even though it got every word correct, I needed to split it into multiple requests. Not sure if the API would have worked better (I used the web frontend).

In the end I added the plural forms myself, as that was way faster for my natural language than copy-pasting all the small chunks. I had really hoped for more help there.
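If the splitting were scripted rather than copy-pasted, it might look something like the sketch below. Everything here is an assumption for illustration: `call_model` is a hypothetical function that takes a small batch of items and returns one result per item, and the default batch size of 10 just mirrors the "10 to 20 elements" observation above.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def process_in_batches(
    items: Sequence[str],
    call_model: Callable[[List[str]], List[str]],
    size: int = 10,
) -> List[str]:
    """Send items in small batches and stitch the results back together.

    Raises if a batch comes back with the wrong number of results,
    which is how a silently truncated response would show up.
    """
    results: List[str] = []
    for batch in batches(items, size):
        out = call_model(batch)
        if len(out) != len(batch):
            raise RuntimeError(
                f"expected {len(batch)} results, got {len(out)}"
            )
        results.extend(out)
    return results
```

The length check is the main point: it turns "it stops after 10 to 20 elements" from something you notice later into an immediate error.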

Hey, if you are searching for really seamless i18n with nice DX, check out https://inlang.com: a JS library, web editor, automation CLI, and VS Code extension are just some of the completely free and open source offerings.
Agreed. It works especially well for formatting where semantics matter, such as separating the terms and definitions of flashcards. Hard to do with code, but easy with GPT.