by mempko
134 days ago
I'm curious about something. If this is based on historical datasets, and people build strategies using LLMs, then in theory this is deeply flawed: the LLMs' training data would include some of those datasets, and certainly the prices of the biotech stocks. You can't use this approach to figure out which strategies are good, because the models already know the future outcome. How do you prevent this problem? It's a classic problem in backtesting, where future information leaks into the model (lookahead bias). EDIT: Some context, I ran a quant fund before.
|
One solution could be to get experts to write similar press releases, so that the text itself is out of distribution. Or, if an actual quant firm has internal models, it can just make sure the pre-training data has a cutoff date earlier than the backtest period.
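To make the cutoff-date idea concrete, here is a minimal sketch, assuming you know the model's pre-training cutoff and each press release has a publication date. The `Event` type, the `out_of_sample` helper, the tickers, and the dates are all made up for illustration; the optional embargo is just extra margin in case the stated cutoff is fuzzy.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Event:
    """A hypothetical press-release record used in a backtest."""
    ticker: str
    published: date
    text: str


def out_of_sample(events, training_cutoff, embargo_days=0):
    """Keep only events published strictly after cutoff + embargo.

    Anything on or before that date may have been seen during
    pre-training, so it is dropped from the evaluation set.
    """
    earliest = training_cutoff + timedelta(days=embargo_days)
    return [e for e in events if e.published > earliest]


# Illustrative data only: tickers and dates are invented.
events = [
    Event("AAA", date(2023, 3, 1), "Phase 2 readout ..."),
    Event("BBB", date(2024, 1, 15), "FDA decision ..."),
    Event("CCC", date(2024, 6, 30), "Trial halted ..."),
]

# Assumed model cutoff; a 30-day embargo adds a safety margin.
cutoff = date(2023, 12, 31)
usable = out_of_sample(events, cutoff, embargo_days=30)
usable_tickers = [e.ticker for e in usable]
```

With these made-up dates, only the last event clears the cutoff plus the embargo, so the backtest would score the strategy on that one alone.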
I'm curious, when you ran a quant fund, what was your approach?