| HN Mirror


by Jugurtha 373 days ago

When I was in EE at university, I worked on heart anomaly detection and multi-phase flow classification for oil & gas. The papers I was reading used synthetic data with a nice dusting of noise sprinkled on it. Meanwhile, I worked on data from hospitals, acquired from restless, sweaty, hairy dudes with rusty, banged-up electrodes and abused probes.

Needless to say, the data I saw in these papers looked nothing like the data I worked with, whether from hospitals or what I saw at Schlumberger in the Sahara.

The real world tends to be ... interesting.

1 comment

cpard 372 days ago

That makes sense. Do you think LLMs have changed that, or potentially can, so that we end up with more realistic synthetic data than what you've seen in the past? I guess the data you were working with was more like time series data, but still: if a large language model can be perceived as a universal approximator of some sort, it might be able to generate more realistic synthetic data than the approach you described, with noise dust sprinkled on the data.
