Hacker News
by mrkrwtsn 4028 days ago
I'm unable to get it to run either. Turning text into tokens is something I've wanted to be able to do in golang before, so it would be super nice if it was easy to install with `go get`. I still need to work through some of the other libraries here: http://biosphere.cc/software-engineering/go-machine-learning...

I agree though, seems promising.

1 comment

If you just want to segment larger blocks of text into tokens, you can try the segment library (it implements the word-boundary portion of Unicode Standard Annex #29, "Unicode Text Segmentation"):

https://github.com/blevesearch/segment
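
To give a feel for what "text into tokens" means here, below is a rough stdlib-only sketch. To be clear, this is not the segment library's API and it does not implement the UAX #29 word-boundary rules; it just splits on anything that is not a letter or digit. The `tokenize` helper is a made-up name for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize is a hypothetical helper for this sketch. It treats every rune
// that is neither a letter nor a digit as a separator. This is a crude
// approximation and NOT the UAX #29 word-boundary algorithm.
func tokenize(text string) []string {
	return strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

func main() {
	// Print each token with its position in the stream.
	for i, tok := range tokenize("Turning text into tokens, 2 at a time!") {
		fmt.Printf("%d: %q\n", i, tok)
	}
}
```

A real segmenter handles cases this one gets wrong (apostrophes inside words, scripts without spaces between words, and so on), which is the reason to reach for a library that follows the annex.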

If you need more manipulation of tokens after segmentation/tokenization, you could look at the analysis sub-package of bleve. It's intended to be usable independently of the rest of the library.

https://github.com/blevesearch/bleve
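
By "manipulation of tokens" I mean things like lowercasing and stop-word removal applied as a chain of filters. Here is a minimal stdlib-only sketch of that idea. It is not bleve's API: `TokenFilter`, `lowercase`, `dropStopWords`, and `apply` are all hypothetical names, and the stop-word list is just an example.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenFilter is a hypothetical type for this sketch (not bleve's API):
// it takes a token stream and returns a transformed one.
type TokenFilter func(tokens []string) []string

// lowercase returns a copy of the stream with every token lowercased.
func lowercase(tokens []string) []string {
	out := make([]string, 0, len(tokens))
	for _, t := range tokens {
		out = append(out, strings.ToLower(t))
	}
	return out
}

// dropStopWords builds a filter that removes tokens found in stop.
func dropStopWords(stop map[string]bool) TokenFilter {
	return func(tokens []string) []string {
		out := make([]string, 0, len(tokens))
		for _, t := range tokens {
			if !stop[t] {
				out = append(out, t)
			}
		}
		return out
	}
}

// apply runs the filters in order, feeding each one the previous output.
func apply(tokens []string, filters ...TokenFilter) []string {
	for _, f := range filters {
		tokens = f(tokens)
	}
	return tokens
}

func main() {
	stop := map[string]bool{"the": true, "a": true, "of": true}
	tokens := []string{"The", "Analysis", "of", "a", "Token", "Stream"}
	fmt.Println(apply(tokens, lowercase, dropStopWords(stop)))
}
```

Order matters in a chain like this: lowercasing has to run before the stop-word filter, otherwise "The" would not match "the".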