Hacker News
by ketralnis 6683 days ago
I cannot over-recommend crm114. I've been using it to classify some database entries, and its accuracy is second to none. Its custom language makes working with strangely stored data (like database entries) easy, once you learn the strange language.

Is there any documentation that stands out for learning the alien language?

No doubt, I'll be combing through all of the CRM114 information on the website. Is there anything that is not referenced there that will be of use?

If you are doing a ham/spam type classification, then you won't need the alien language. I am almost a total tech novice and I was able to do well with just some bash scripts. Of course the docs will teach you about better ways to train the system, if you are interested in going from 98% correct classification to 99.5% correct.

learn ham.css < file_to_learn.txt

learn spam.css < file_to_learn.txt

classify < file_to_classify.txt

I am NOT doing ham/spam type classification. I need to define some classifications for specific types of content.
Then replace ham/spam with whatever those types of content are.
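To make that substitution concrete, here is a rough sketch of how the same commands could be looped over arbitrary categories. It assumes one .css file per category and a directory of sample .txt files for each one. The names `css_file`, `train_category`, the `LEARN` variable, and the directory layout are made up for illustration. It also assumes `learn` is a wrapper command that takes a .css file as its argument and reads text on stdin, as the commands above suggest; the thread does not show what is inside that wrapper.

```shell
#!/bin/sh
# Sketch only: generalize the ham/spam commands to any set of categories.
# Assumption: `learn` is a wrapper that accepts "<category>.css" as its
# argument and reads the training text from stdin.

LEARN="${LEARN:-learn}"   # training command; override it for a dry run

# Print the statistics file name for a category, e.g. invoice -> invoice.css
css_file() {
    printf '%s.css\n' "$1"
}

# Feed every .txt file in a directory to the trainer for one category.
train_category() {
    category=$1
    dir=$2
    for sample in "$dir"/*.txt; do
        [ -f "$sample" ] || continue   # skip if the glob matched nothing
        $LEARN "$(css_file "$category")" < "$sample"
    done
}

# Hypothetical usage, with one directory of samples per content type:
#   train_category invoice  training/invoice
#   train_category support  training/support
#   train_category feedback training/feedback
```

Setting `LEARN=echo` before calling `train_category` prints the .css file name once per sample instead of training, which is a cheap way to check the layout first.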
Thanks for the affirmation! It helps when jumping into territory with which I have no previous experience.