by peadarohaodha
1962 days ago
Thanks for the feedback! It's a good question re compute. There are some fun engineering and ML research challenges related to this that we are constantly iterating on. A few examples:
- how to most efficiently share compute resources in a JIT manner (e.g. GPU memory) during model serving for both training and inference (where the use case and privacy requirements permit)
- how to construct model training algorithms that operate effectively in a more online manner (so you don't have to retrain on the whole dataset when you see new examples)
- how to significantly reduce the model footprint (in terms of memory and FLOPs) of modern deep transformer models, given they are highly over-parameterised and can contain a lot of redundancy. This stuff helps us a lot on the margins point!
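
To make the online training point a bit more concrete, here is a toy sketch of the general idea, not our actual system: a tiny logistic regression that takes one gradient step per incoming example instead of retraining on the full dataset. The class name `OnlineLogisticRegression`, the `partial_fit` method, and the synthetic `make_stream` data are all made up for illustration.

```python
import math
import random


class OnlineLogisticRegression:
    """Toy binary classifier updated one example at a time with SGD."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, y):
        """Take one gradient step on a single (x, y) pair.

        No previously seen example is revisited, which is the property
        that makes the update 'online'.
        """
        err = self.predict_proba(x) - y  # gradient of log loss w.r.t. z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def make_stream(n, seed=0):
    """Hypothetical data stream: label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


model = OnlineLogisticRegression(n_features=2)
for x, y in make_stream(2000):
    model.partial_fit(x, y)  # the model improves as each new example arrives
```

The cost of each update is constant in the number of examples seen so far, which is the appeal. Doing this well for large transformer models is a much harder problem than this toy suggests.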