by chrisaycock
1637 days ago
I built my own table-oriented language out of frustrations I had with time-series analysis: https://www.empirical-soft.com
Empirical has statically typed Dataframes. It can infer the type of a file's contents at compile time using a ton of metaprogramming techniques.
>>> let trades = load("trades.csv")
>>> trades
symbol timestamp price size
AAPL 2019-05-01 09:30:00.578802 210.5200 780
AAPL 2019-05-01 09:30:00.580485 210.8100 390
BAC 2019-05-01 09:30:00.629205 30.2500 510
CVX 2019-05-01 09:30:00.944122 117.8000 5860
AAPL 2019-05-01 09:30:01.002405 211.1300 320
AAPL 2019-05-01 09:30:01.066917 211.1186 310
AAPL 2019-05-01 09:30:01.118968 211.0000 730
BAC 2019-05-01 09:30:01.186416 30.2450 380
CVX 2019-05-01 09:30:01.639577 118.2550 2880
... ... ... ...
Functions have generic typing by default; the caller determines the type instantiation. Here is a weighted average:
>>> func wavg(ws, vs) = sum(ws * vs) / sum(ws)
Queries are built into the language. Here is a five-minute volume-weighted average price:
>>> from trades select vwap = wavg(size, price) by symbol, bar(timestamp, 5m)
symbol timestamp vwap
AAPL 2019-05-01 09:30:00 210.305724
BAC 2019-05-01 09:30:00 30.483875
CVX 2019-05-01 09:30:00 119.427733
AAPL 2019-05-01 09:35:00 202.972440
BAC 2019-05-01 09:35:00 30.848397
CVX 2019-05-01 09:35:00 119.431601
AAPL 2019-05-01 09:40:00 204.671388
BAC 2019-05-01 09:40:00 30.217362
CVX 2019-05-01 09:40:00 117.224763
... ... ...
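For anyone who doesn't read Empirical, here is roughly what that query computes, sketched in plain Python. This is only an illustration of the arithmetic, not how Empirical implements it: the `bar` and `vwap_by_bar` helpers are hypothetical names written for this example, and `bar` is assumed to round a timestamp down to the start of its five-minute window. The three sample rows are copied from the `trades` output above, so the numbers will not match the full-file results.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def wavg(ws, vs):
    """Weighted average: sum(ws * vs) / sum(ws)."""
    return sum(w * v for w, v in zip(ws, vs)) / sum(ws)


def bar(ts, width):
    """Round a timestamp down to the start of its bar (assumed semantics)."""
    epoch = datetime(1970, 1, 1)
    return ts - (ts - epoch) % width


def vwap_by_bar(trades, width=timedelta(minutes=5)):
    """Group trades by (symbol, bar start) and compute the VWAP per group."""
    groups = defaultdict(lambda: ([], []))
    for symbol, ts, price, size in trades:
        sizes, prices = groups[(symbol, bar(ts, width))]
        sizes.append(size)
        prices.append(price)
    return {key: wavg(sizes, prices) for key, (sizes, prices) in groups.items()}


# A few rows copied from the trades table above.
trades = [
    ("AAPL", datetime(2019, 5, 1, 9, 30, 0, 578802), 210.52, 780),
    ("AAPL", datetime(2019, 5, 1, 9, 30, 0, 580485), 210.81, 390),
    ("BAC", datetime(2019, 5, 1, 9, 30, 0, 629205), 30.25, 510),
]

for (symbol, start), vwap in vwap_by_bar(trades).items():
    print(symbol, start, round(vwap, 6))
```

Note that nothing here is checked ahead of time: a wrong tuple layout or a misspelled field only shows up when the loop runs.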
Everything is statically typed. Misspelled column names, for example, result in an error before the script is even run!