by janstice
1422 days ago
I spent a decade parsing text-with-angle-brackets with regexes, and it sucks. It’s always tempting to try an HTML parser, but if the markup was written by a human (or worse, a mixture of human and machine, especially if the machine involves MS Word), it just doesn’t work. Rather than writing big regexes that capture a bunch of stuff in one call, I’d suggest breaking the job down into a bunch of smaller, more targeted calls - one call to capture the text of the whole record, another with 3 variants to get the title, another with 2 variants to pick up a tag line, etc.
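To make that concrete, here is a rough Python sketch of the "many small calls" idea. The markup is entirely made up: the `record` and `tagline` class names, the choice of `<h1>`/`<h2>`/`<b>` for titles, and the helper names `first_match` and `parse_records` are all assumptions for illustration, not anything from the original poster's data. It also assumes records are not nested inside each other.

```python
import re

# Hypothetical markup: every tag and class name here is invented for illustration.
# Step 1: one small pattern that only isolates the text of a whole record.
RECORD_RE = re.compile(r'<div class="record">(.*?)</div>', re.S | re.I)

# Step 2: three variants for the title, tried in order of preference.
TITLE_VARIANTS = [
    re.compile(r"<h1[^>]*>(.*?)</h1>", re.S | re.I),
    re.compile(r"<h2[^>]*>(.*?)</h2>", re.S | re.I),
    re.compile(r"<b>(.*?)</b>", re.S | re.I),
]

# Step 3: two variants for the tag line.
TAGLINE_VARIANTS = [
    re.compile(r'<p class="tagline">(.*?)</p>', re.S | re.I),
    re.compile(r"<em>(.*?)</em>", re.S | re.I),
]


def first_match(patterns, text):
    """Return the first capture group of the first pattern that matches, else None."""
    for pattern in patterns:
        match = pattern.search(text)
        if match:
            return match.group(1).strip()
    return None


def parse_records(page):
    """Split the page into records, then run the small targeted patterns on each one."""
    records = []
    for body in RECORD_RE.findall(page):
        records.append({
            "title": first_match(TITLE_VARIANTS, body),
            "tagline": first_match(TAGLINE_VARIANTS, body),
        })
    return records
```

The upside of this shape is that when a new oddity turns up, you add one more variant to one short list instead of reworking a giant pattern, and a field that fails to match comes back as `None` without taking the rest of the record down with it.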
|
Regex isn't really the problem, though (even though it technically shouldn't be the solution in this case either, but I can't dictate the tech stack). It was just the last straw on top of my frustration with the situation and with myself for not being able to do what my colleague does, even though I want to. I felt the need for help, and I got it. Awesome community around here.