Hacker News
by LZ_Khan 497 days ago
Shouldn't there be a comparison in performance on common benchmarks to other models?

Like a 7B toponet model vs a 7B Llama model?

As a layperson I don't understand why topology is a thing to optimize for.


The only potential benefit shown in the paper is that the topologically local models seem to be more resilient after pruning.

So you may be able to prune a 7B model down to 6B while maintaining most of the capability.

> The only potential benefit

Other benefits:

1. Significantly lower dimensionality of internal representations
2. More interpretable (see: https://toponets.github.io)

> 7B model down to 6B

We remove ~80% of the parameters in the topographic layers and the model retains the same performance. The drop in total parameter count is not significant because we did not experiment with applying TopoLoss to all of the layers of the model (that did not align with the goal of the paper).
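
To make the arithmetic concrete, here is a rough sketch (not from the paper) of how pruning ~80% of only the topographic layers translates into the total parameter count. The 7B total is the figure from the comment above; the function name and the topographic-layer fractions are made-up assumptions for illustration, not measured values.

```python
def remaining_params(total: float, topo_fraction: float, prune_rate: float) -> float:
    """Parameters left after pruning only the topographic layers.

    total         -- total parameter count of the model
    topo_fraction -- share of parameters living in topographic layers (assumed)
    prune_rate    -- share of those parameters that gets removed
    """
    if not (0.0 <= topo_fraction <= 1.0 and 0.0 <= prune_rate <= 1.0):
        raise ValueError("topo_fraction and prune_rate must be in [0, 1]")
    removed = total * topo_fraction * prune_rate
    return total - removed


if __name__ == "__main__":
    total = 7e9        # the 7B example used in this thread
    prune_rate = 0.80  # ~80% removed in topographic layers
    # Hypothetical shares of the model covered by topographic layers.
    for topo_fraction in (0.10, 0.50, 1.00):
        left = remaining_params(total, topo_fraction, prune_rate)
        print(f"{topo_fraction:.0%} topographic -> {left / 1e9:.2f}B parameters remain")
```

The overall reduction is just the product of the two fractions, so the total only shrinks substantially once most layers are topographic.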

We are currently performing those strong sparsity experiments internally, and the results look very promising!