Hacker News
by
willwade
186 days ago
I wonder if this would have been useful
https://github.com/microsoft/presidio
- it's heavy but looks really good. There is a lite version.
3 comments
shaoz
186 days ago
I've used it. Lots of false positives out of the box; you need to do a ton of tuning or pair it with a transformer/BERT model, but at that point it's basically the same thing as the OP's project.
threecheese
186 days ago
Looks like it uses Google's LangExtract, which uses only LLMs for NLP, while OP is using a small NER model that runs locally.
winchester6788
186 days ago
Full of false positives, though. But definitely good for some types of entities and regexes.