Hacker News new | ask | show | jobs
by hackerlight 843 days ago
> this would likely require an additional tensor channel, so that each "feature" carries information in a high enough dimensional space

Suppose input data is [batch_size, num_features]. Then you do x.unsqueeze(1) giving you [batch_size, num_features, 1]. Then what?

1 comments

You probably want something equivalent to (however you make it fast in your chosen framework):

einsum('bf,fc->bfc', batched_inputs, channel_embedding)

Then carry that info through the network and project it down at the end. It's roughly equivalent to the token embedding step in an LLM.