by niclane7 1189 days ago
Yes, this would be exciting to see. One approach wouldn't require federated learning at all, however: if you had direct access to the data, you could train a large language model conventionally (i.e., collect all the data in one data center). Given the context of this discussion, though, you are probably asking whether Flower could be used to train in a federated manner. I believe so, although we'd still be training an LLM, which brings added complications due to its size (and other factors). Internally at Flower we have been testing methods to overcome this and are confident we can pull it off. One could imagine someone hosting a pre-trained LLM, with contributing institutions acting as nodes in the network, each performing a small part of the training on the fraction of the literature it has access to. We plan to release LLM-based federated technology in the coming months.
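
To make the "institutions as nodes" picture a bit more concrete, here is a minimal sketch of one common aggregation scheme, federated averaging (FedAvg). To be clear about the assumptions: this is plain Python, not Flower's API; the comment above doesn't say which aggregation method would be used, so FedAvg is just a familiar stand-in; a two-parameter linear model stands in for the LLM; and every name here (`local_update`, `federated_average`, `run_rounds`) is made up for illustration.

```python
from typing import List, Tuple

Weights = List[float]                 # [slope, intercept] of a toy linear model
Dataset = List[Tuple[float, float]]   # (x, y) pairs held privately by one node


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one node's private data."""
    w, b = weights
    for _ in range(epochs):
        # Gradients of mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average node weights, weighting each node by its number of examples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_rounds(global_weights: Weights, node_datasets: List[Dataset],
               rounds: int = 30) -> Weights:
    """Each round: send weights out, train locally, average the results."""
    for _ in range(rounds):
        updates = [(local_update(global_weights, data), len(data))
                   for data in node_datasets]
        global_weights = federated_average(updates)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical institutions, each holding samples of y = 2x + 1.
    node_a = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]
    node_b = [(-2.0, -3.0), (0.0, 1.0), (2.0, 5.0)]
    print(run_rounds([0.0, 0.0], [node_a, node_b]))
```

Only the weights travel between the host and the nodes; the raw `(x, y)` pairs never leave the node that holds them. With an LLM the weights themselves are enormous, which is roughly where the size complications mentioned above come in.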

For those who are interested: the best work I've seen so far on training very large models under federated learning, and one that also makes very realistic assumptions about the hardware likely to participate, is this: https://arxiv.org/abs/2206.11239 -- although I expect more in this direction to come soon.