Hacker News
by shgidi 1902 days ago
Right, that's kinda nasty. The paper titles refer to deep learning, but I don't think fully connected networks can be considered deep learning.
2 comments

What? No. Fully connected networks are deep learning, and are actually the most important deep learning workload. See "Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective":

https://research.fb.com/publications/applied-machine-learnin...

Table 1 shows that the News Feed service uses a fully connected network model, and Table 3 shows that this workload dominates all other workloads.

Transformers, which are currently waging a successful campaign to conquer all of deep learning, are largely stacked feed-forward networks, matrix multiplies and maps. Some ideas for making attention more scalable, such as LSH or large sparse attention matrices, seem like they'd be well suited to this approach.
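
To make the "matrix multiplies and maps" point concrete, here is a rough sketch of a single transformer block in plain Python. It is my own simplification, not code from the paper: the function names are made up for illustration, and it assumes one attention head, no biases and no layer norm.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax_rows(a):
    """Row-wise softmax: an elementwise map plus a per-row normalization."""
    out = []
    for row in a:
        m = max(row)  # shift by the row max for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def relu(a):
    """Elementwise map."""
    return [[max(0.0, v) for v in row] for row in a]

def add(a, b):
    """Elementwise sum, used for the residual connections."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def attention(x, wq, wk, wv):
    """Single-head self-attention: four matrix multiplies and one softmax."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = 1.0 / math.sqrt(len(wq[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)

def feed_forward(x, w1, w2):
    """Two-layer fully connected network applied to every position."""
    return matmul(relu(matmul(x, w1)), w2)

def block(x, wq, wk, wv, w1, w2):
    """One simplified transformer block (assumption: no layer norm, no biases)."""
    h = add(x, attention(x, wq, wk, wv))
    return add(h, feed_forward(h, w1, w2))
```

Everything outside `softmax_rows`, `relu` and `add` is a dense matrix multiply, which is the same kind of work a fully connected layer does.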

Their approach should also be readily adaptable to RNNs, including LSTMs.

Certainly worth investigating as an alternative for efficiently running and training giant networks on less expensive hardware.