by olh 3631 days ago
Does anyone know of good resources or research on generating latent vector representations with iterative numerical-analysis algorithms rather than neural networks?

The black-box nature of word2vec and similar methods holds back some applications, such as generalizing linguistic methods to bioinformatics.

5 comments

hmmh... I don't believe word2vec or item2vec would be considered neural network algorithms.

You come up with a model in which a numerical vector represents the attributes of each word or item. You predict the likelihood of a match between two words or items by multiplying their vectors together (a dot product), and then you use numerical optimization, i.e. an iterative gradient descent algorithm starting from randomly initialized vectors, to estimate the vectors that work best.
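A minimal sketch of that recipe in NumPy, under stated assumptions: the function name `train_embeddings`, the toy match/no-match pairs, and the logistic loss are illustrative choices, not word2vec's actual training code. It only shows the shape of the idea: randomly initialized vectors, a dot product turned into a match probability, and plain gradient descent.

```python
import numpy as np

def train_embeddings(pairs, labels, n_items, dim=8, lr=0.1, epochs=300, seed=0):
    """Fit vectors so that sigmoid(U[i] . V[j]) predicts whether items i and j match.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_items, dim))  # randomly initialized "input" vectors
    V = rng.normal(scale=0.1, size=(n_items, dim))  # randomly initialized "output" vectors
    pairs = np.asarray(pairs)
    labels = np.asarray(labels, dtype=float)
    i, j = pairs[:, 0], pairs[:, 1]
    losses = []
    for _ in range(epochs):
        scores = np.sum(U[i] * V[j], axis=1)        # dot product per pair
        probs = 1.0 / (1.0 + np.exp(-scores))       # predicted match likelihood
        losses.append(-np.mean(labels * np.log(probs + 1e-12)
                               + (1.0 - labels) * np.log(1.0 - probs + 1e-12)))
        err = (probs - labels)[:, None]             # d(log loss)/d(score) per pair
        grad_U = np.zeros_like(U)
        grad_V = np.zeros_like(V)
        np.add.at(grad_U, i, err * V[j])            # accumulate over repeated indices
        np.add.at(grad_V, j, err * U[i])
        U -= lr * grad_U                            # gradient descent step
        V -= lr * grad_V
    return U, V, losses

# Toy data (made up): items 0/1 match each other, items 2/3 match each other.
toy_pairs = [(0, 1), (1, 0), (2, 3), (3, 2), (0, 2), (0, 3), (1, 2), (1, 3)]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
U, V, losses = train_embeddings(toy_pairs, toy_labels, n_items=4)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Nothing here is specific to neural networks: it is a bilinear model fitted by an iterative optimizer, which is the point the comment is making.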

They're NNs because you learn the representation using RNNs. Everything afterwards is trivial since you're in a Hilbert space. But getting the representations is the hard part.
word2vec does not use RNNs. The network is trained on a simple classification task, "neighborhood" -> "word". Each word in the corpus is an independent example; there's no sequential dependence.
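To make the "neighborhood" -> "word" framing concrete, here is a rough sketch of how such training examples could be extracted from a token list. The function name `context_target_pairs` and the `window` parameter are assumptions for illustration, not taken from any word2vec implementation.

```python
def context_target_pairs(tokens, window=2):
    """Return one (neighborhood, word) example per corpus position.

    Hypothetical helper: each example is built from its own position only,
    so examples carry no state from one position to the next.
    """
    examples = []
    for pos, word in enumerate(tokens):
        left = tokens[max(0, pos - window):pos]       # up to `window` words before
        right = tokens[pos + 1:pos + 1 + window]      # up to `window` words after
        examples.append((left + right, word))
    return examples

for neighborhood, word in context_target_pairs("the quick brown fox".split(), window=1):
    print(neighborhood, "->", word)
```

Because every example stands alone, they can be shuffled and fed to the classifier in any order, which is what separates this setup from a recurrent model.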
Word2vec doesn't use an RNN. It has only a single softmax layer after the embedding.
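As a hedged illustration of "one softmax layer after the embedding", the forward pass can be sketched in a few lines of NumPy. The name `cbow_forward`, the averaging of context vectors (the CBOW variant), and the full softmax are simplifying assumptions; real implementations typically replace the full softmax with hierarchical softmax or negative sampling for speed.

```python
import numpy as np

def cbow_forward(context_ids, E, W):
    """Embedding lookup, average, then a single softmax layer over the vocabulary.

    E: (vocab, dim) embedding matrix; W: (vocab, dim) output weights.
    Hypothetical helper for illustration only.
    """
    h = E[context_ids].mean(axis=0)   # average the context word embeddings, shape (dim,)
    logits = W @ h                    # one linear layer, shape (vocab,)
    logits = logits - logits.max()    # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()        # softmax: probability of each word being the target

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 3))           # toy vocabulary of 5 words, 3-dimensional vectors
W = rng.normal(size=(5, 3))
print(cbow_forward([1, 3], E, W))
```

There is no hidden state and no recurrence anywhere in this computation, only a lookup and one linear map.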
oh, ok. Do you have to use RNNs? I think I've done them without RNNs.

Would love a good RNN word2vec-type example with TensorFlow if anyone knows one.

Or you could use a pre-trained list like the ones from Google [1]. If not, you have probably solved an open problem in the area, and publishing it would save the rest of us from spending time solving it again.

[1] - https://code.google.com/archive/p/word2vec/

Edit: word2vec on tensorflow tutorial https://www.tensorflow.org/versions/r0.7/tutorials/word2vec/...

Yeah, I implemented something based on the code from the Udacity course on TensorFlow that Googlers (Vincent Vanhoucke) did. It's basically the same, I think.

their version https://github.com/tensorflow/tensorflow/blob/master/tensorf...

my version https://github.com/druce/streeteye_word2vec/blob/master/word...

That's standard word2vec, not an RNN.
Iteratively reweighted least squares? https://en.wikipedia.org/wiki/Iteratively_reweighted_least_s...

Unless I misunderstood the question...

GloVe might be what you're looking for: http://nlp.stanford.edu/projects/glove/
"generating latent vector representations with iterative processes using numerical analysis algorithms"

Sounds like word2vec.