Range joins in DuckDB (duckdb.org)
111 points by hfmuehleisen 1491 days ago
5 comments

DuckDB has become my preferred tool for hardcore data wrangling. Excel is fine for like 80% of data processing tasks, but the remaining 20% are a pain, especially when you're CPU bound on a remote desktop. Smuggling the DuckDB JDBC driver onto said remote machine was the most productive infosec violation I've ever committed.
I don't know why I waited so long to try it.

I wrangle a ton of raw and aggregate data locally every day. I've had a 10-year habit of massaging via unix CLI tools and pipes then moving to excel. I guess I didn't wanna write code. Funny thing is I love SQL.

But with `duckdb_cli` it's a game-changer. I'm truly truly impressed.

> I've had a 10-year habit of massaging via unix CLI tools and pipes then moving to excel.

Have you, at any point, considered or used dgsh [0] or a similar tool? If so, how has your experience with it been?

[0] https://news.ycombinator.com/item?id=13352659

Interesting. Never tried dgsh before, but yeah, it looks like my poor-man's CLI workflow for a lot of tasks.
Our lips are sealed.
I've been beating my head trying to get duckdb to statically link into a Go program (I'm neither an expert with cgo nor ld). If anyone else has been able to do this I'd love to see your build steps.

https://github.com/marcboeker/go-duckdb produces a non-static binary by default.

I'm not familiar with the project. Does it use any net-related code? That won't be static because it will want to load C-libs for using /etc/nsswitch.conf to handle DNS/name stuff.

https://stackoverflow.com/questions/33228809/why-is-my-go-ap...

I don't have the source code in a good state to publish yet but here's where I'm at. At some point before this CGO_LDFLAGS does work and the header is found (omit the -ldflags args). But when it goes to statically link it can no longer find the header.

  CGO_LDFLAGS="-L$(pwd)/duckdb/src/include" CGO_CFLAGS="-I$(pwd)/duckdb/src/include" go build -ldflags '-extldflags " -lstdc++ -lm -lduckdb -static"'
  # github.com/marcboeker/go-duckdb
  ../../go/pkg/mod/github.com/marcboeker/go-duckdb@v0.0.0-20220427142532-cd9f33e64d9a/connection.go:4:10: fatal error: duckdb.h: No such file or directory
    4 | #include <duckdb.h>
      |          ^~~~~~~~~~
  compilation terminated.
Edit: never mind about not being in a good state! Here's my code: https://github.com/multiprocessio/duckdb-tests.
Put the header name in quotes. Angle brackets are for system headers: #include "duckdb.h"
That's not my code.
But also, just to double check, I modified the vendored code and no difference:

  CGO_LDFLAGS="-L$(pwd)/duckdb/src/include" CGO_CFLAGS="-I$(pwd)/duckdb/src/include" go build -ldflags '-extldflags " -lstdc++ -lm -lduckdb -static"'
  # github.com/marcboeker/go-duckdb
  vendor/github.com/marcboeker/go-duckdb/connection.go:4:10: fatal error: duckdb.h: No such file or directory
    4 | #include "duckdb.h"
      |          ^~~~~~~~~~
  compilation terminated.
I tried to replace SQLite with DuckDB for a customized install of better-sqlite3[1] and failed.

[1] https://github.com/JoshuaWise/better-sqlite3

We have a node client if that would be helpful! https://duckdb.org/docs/api/nodejs
I tried the same thing, also failed… I am also not an expert however. But I am very interested in this. Anyone reading this that could point me to some resources that might help?
Per this post [0] by Andrew Kelley, Zig's lead developer, projects with "large dependency trees" are better off using other tools than relying on Zig's cross-compile magic.

DuckDB needs Python3 to build as well, so I'm not sure how easy it might be to get it to cross-compile with Zig CC.

[0] https://archive.is/7SuAf

Also, the issue isn't cross-compiling; it's just static linking.
Gotcha. Targeting musl instead of glibc with Zig CC should get you a statically linked binary, though I'm unsure whether duckdb and its deps play nice with musl.

Personally, a duckdb golang binary interests me. But: I haven't yet mustered enough patience to sit through a time-consuming duckdb build.

Great to see new features being implemented. I'm using DuckDB for a thesis project and integrating it into my own Python CLI/web tool has been super easy -- I especially love the direct integration with DataFrames, it makes things really seamless.
I've been using DuckDB a fair bit recently and really enjoy it... When it has slightly better IDE support (e.g., I can use it in PyCharm) and can take in geospatial data, I'll be ecstatic.
Well, I feel silly. Based on a slight misreading of the title, I totally thought that Range was some company that was acquired by DuckDB.
Oh dear I can see that - sorry for the confusion! I'll see if we can come up with something a bit longer. It was a bit nerdy...
I also thought that DuckDB had acquired a company named Range. Interesting article regardless!
I was thinking there was some 10x programmer known only as Range that had joined.