| ProtoScience is a deterministic pipeline that takes raw numerical data and autonomously discovers governing equations. It does not use LLMs for discovery, only sparse regression, power-law fitting, and statistical validation.

Results so far:
- Kepler's Third Law (P² = a³ / M) from 3,519 NASA exoplanets — R² = 0.998
- Sun’s ~27-day rotation period from solar wind plasma data — 93% accuracy
- Power law T ~ v^3.40 in solar wind (NOAA/NASA spacecraft data)
- 5/5 General Relativity predictions from simulated black hole observables — all R² = 1.000
- Chirp mass relationship from 219 LIGO gravitational wave events — R² = 0.998

It also detects when no meaningful law exists: Bitcoin daily prices returned R² = 0.00.

Pipeline: raw data → feature extraction → candidate law generation → fitting → verification

An LLM (Claude) is only used at the end to interpret results in natural language; it is never involved in the discovery step.

All experiments are fully reproducible.

Code:
https://github.com/SaulVanCode/protoscience-nasa-experiments |
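
For readers who want a feel for the "fitting → verification" steps, here is a minimal sketch of power-law fitting with an R² check. It is not taken from the ProtoScience repository: the function names, the synthetic data, and the `R2_THRESHOLD` cutoff are all assumptions made for illustration. It fits y = c·x^k by ordinary least squares in log-log space, then applies the same fit to data with no underlying relationship to show how a near-zero R² flags the absence of a law.

```python
import math
import random

R2_THRESHOLD = 0.9  # hypothetical acceptance cutoff, not taken from the project


def fit_power_law(x, y):
    """Fit y = c * x**k by least squares on (log x, log y).

    Returns (c, k, r2); r2 is the coefficient of determination in log space.
    Inputs must be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    k = sxy / sxx                # slope in log space = exponent
    log_c = my - k * mx          # intercept in log space = log of prefactor
    ss_res = sum((b - (log_c + k * a)) ** 2 for a, b in zip(lx, ly))
    ss_tot = sum((b - my) ** 2 for b in ly)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_c), k, r2


def synthetic_kepler(n=500, seed=0):
    """Synthetic orbits around a 1-solar-mass star: P = a**1.5 with 2% scatter.

    Units are assumed to be AU and years, so the prefactor is 1.
    """
    rng = random.Random(seed)
    a = [math.exp(rng.uniform(math.log(0.05), math.log(5.0))) for _ in range(n)]
    p = [ai ** 1.5 * math.exp(rng.gauss(0.0, 0.02)) for ai in a]
    return a, p


def synthetic_noise(n=500, seed=1):
    """x and y drawn independently, so no law links them."""
    rng = random.Random(seed)
    x = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(n)]
    y = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(n)]
    return x, y


if __name__ == "__main__":
    datasets = [("kepler", synthetic_kepler()), ("noise", synthetic_noise())]
    for name, (x, y) in datasets:
        c, k, r2 = fit_power_law(x, y)
        verdict = "law accepted" if r2 >= R2_THRESHOLD else "no law found"
        print(f"{name}: y = {c:.3f} * x^{k:.3f}, R^2 = {r2:.3f} -> {verdict}")
```

On the synthetic Kepler data the fitted exponent should land close to 1.5, which is P² = a³ at fixed stellar mass. Recovering the 1/M dependence would need a multi-variable fit, which this sketch leaves out.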