by plg 2511 days ago
Something I always try with new (to me) languages: write a short script to

(1) load a .txt file containing space-delimited columns of data;

(2) fit a linear model in which one column is predicted by a linear combination of the others;

(3) plot the predicted values against the actual values using dots and overlay a y=x line.

Tried this with Julia a short while ago and basically gave up, couldn’t figure out how to get something to plot. Has the Julia-verse changed? Is it easier now?

I can do this in MATLAB in basically 3 or 4 lines of code. Python, not much more than that.


Not familiar with MATLAB, but it's not as terse as R:

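```
using GLM, CSV, Plots

# Returns a DataFrame. CSV.jl 0.7 and later need the sink spelled out:
# `using DataFrames` and CSV.read("data.csv", DataFrame; header=..., types=...)
data = CSV.read("data.csv", header=["x","y"], types=[Float64, Float64])

ols = lm(@formula(y ~ x), data)   # ordinary least squares fit of y on x
ypred = predict(ols)              # fitted values for the rows in data
yall = Base.hcat(data.y, ypred)   # actual and predicted side by side, one column each

plot(data.x, yall, linewidth=2, title="Linear regression", label=["y", "ypred"], xlabel="x", ylabel="y")
```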

The comment I was going to reply to disappeared, but for something closer in form to the MATLAB example that used to be here:

  using Plots, DelimitedFiles
  d = readdlm("data.tsv",'\t')
  A = [ones(10,1) d[:,1:2]]; B = copy(d[:,3]); X = A\B
  plot(B, seriestype=:scatter, color=:blue); plot!(A*X, seriestype=:scatter, color=:red)

I find Julia syntax feels closer to MATLAB than to Python or R, just different enough to be frustrating for the first month or so (followed by a period of "oh, that's why Julia does it this way instead!").
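For what it's worth, here is a rough sketch of the same least-squares fit, arranged as predicted versus actual with a y=x line as the original question asked. The data is made up and read from an in-memory string so the snippet stands alone. The row count comes from `size` instead of a hardcoded 10, and the names `raw`, `pred`, `lo` and `hi` are just for illustration. The plotting calls are left as comments since they assume Plots is installed.

```julia
using DelimitedFiles  # standard library; provides readdlm

# Made-up, whitespace-delimited data: two predictor columns, then the response.
# Constructed so that col3 = 1 + 2*col1 + 3*col2 exactly.
raw = """
1.0 2.0  9.0
2.0 1.0  8.0
3.0 5.0 22.0
4.0 3.0 18.0
5.0 8.0 35.0
"""

d = readdlm(IOBuffer(raw))   # with no delimiter argument, columns split on whitespace
n = size(d, 1)               # row count taken from the data, not hardcoded

A = [ones(n) d[:, 1:2]]      # design matrix: intercept column plus the predictors
B = d[:, 3]                  # actual values
X = A \ B                    # least-squares coefficients
pred = A * X                 # predicted values

# Endpoints for the y = x reference line, spanning both actual and predicted values.
lo, hi = extrema(vcat(B, pred))

# With Plots loaded, the figure would be something like:
#   scatter(B, pred, xlabel="actual", ylabel="predicted", label="data")
#   plot!([lo, hi], [lo, hi], label="y = x")
```

Swapping `IOBuffer(raw)` for a file name should be the only change needed to run it on a real file.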
No need to `copy` from `d[:,3]`. Slicing already creates copies (so you're copying twice), but even if it made a view you could still just write `B = d[:, 3]`.

Ah, didn't realize slicing always made copies. A view/alias/whatever would have been fine for the example, but having multiple variable names referring to the same memory has caused me enough problems to try to avoid it by default.
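A quick way to see the difference between a slice and a view, as a minimal sketch on a made-up 2×3 matrix (the variable names are only for illustration):

```julia
d = [1.0 2.0 3.0;
     4.0 5.0 6.0]

B = d[:, 3]              # slicing allocates a new vector
B[1] = 99.0              # modify the copy
after_slice = d[1, 3]    # still 3.0: the matrix is untouched

V = @view d[:, 3]        # a view aliases the matrix's memory
V[1] = 99.0              # write through the view
after_view = d[1, 3]     # now 99.0: the write landed in d itself
```

So `copy(d[:, 3])` only guards against aliasing if the right-hand side is a view; on a plain slice it just allocates a second time.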
That is strange. It is literally as easy as `using Plots; plot(...)`