Tiedot - Your NoSQL document database engine powered by Go

	Tiedot - Your NoSQL document database engine powered by Go (github.com)
	85 points by howardg 4742 days ago

11 comments

vladev 4742 days ago

I'm hoping to start seeing projects in Go because Go gives them an advantage, not for the sake of it being in Go.

hox 4742 days ago

Does the project have to be open source?

My company currently is using Go to monitor and respond to events on hundreds of thousands of RabbitMQ message queues. We create a goroutine for handling each queue, and Go handles all of the concurrency and threading in the runtime while avoiding the resource overhead of standard threads. All of this is done in an application that took about 6 hours to write.

It really made a difference in our experience.
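
The goroutine-per-queue pattern described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's application: the queues are modelled as plain Go channels rather than RabbitMQ connections, and `consumeAll` and the `queue-N` names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// consumeAll starts one goroutine per queue and returns how many messages
// each queue delivered. Queues are modelled as channels (an assumption);
// a real deployment would read from a message-broker client instead.
func consumeAll(queues map[string]<-chan string) map[string]int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		counts = make(map[string]int, len(queues))
	)
	for name, q := range queues {
		wg.Add(1)
		go func(name string, q <-chan string) {
			defer wg.Done()
			n := 0
			for range q { // loops until the queue channel is closed
				n++
			}
			mu.Lock() // the result map is shared, so guard the write
			counts[name] = n
			mu.Unlock()
		}(name, q)
	}
	wg.Wait()
	return counts
}

func main() {
	// Build 1000 hypothetical queues with three pending messages each.
	queues := make(map[string]<-chan string)
	for i := 0; i < 1000; i++ {
		ch := make(chan string, 3)
		for j := 0; j < 3; j++ {
			ch <- "event"
		}
		close(ch)
		queues[fmt.Sprintf("queue-%d", i)] = ch
	}
	counts := consumeAll(queues)
	fmt.Println(len(counts), counts["queue-0"])
}
```

Each goroutine blocks cheaply on its own channel, which is why the pattern can scale to very many queues without one OS thread per queue.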

sbarre 4742 days ago

Can you share any details about how many resources this program requires to run?

hox 4742 days ago

For the vast majority of cases that we deal with, we don't really need much for resources to run this application.

One specific test uses approximately 20k goroutines and averages 15-20MB of RAM depending on test load. As for CPU utilization, the impact is minimal; RabbitMQ is the biggest bottleneck, as our peak message throughput for a single RabbitMQ broker is about 50k messages per second, which our Go process is able to handle without much issue. The worst-case scenario that I've been able to test for is one where those 50k messages are evenly spread across different queues; even then, our CPU utilization wasn't any higher than 15% per core on a 12-core server.

arms 4742 days ago

We'll never know where Go excels unless we build lots of things with it first :)

jurre 4742 days ago

Go seems like a reasonable choice for this type of thing though, doesn't it?

AYBABTME 4742 days ago

I'm reading over the source with my coffee this morning. I'll pseudo-CR it if I see mistakes (commented on a typo already). Hope you don't mind! I'm interested to see how you are doing this because I'm playing with my own toy key-value store on the weekends [1].

In [2], wondering why you make GOMAXPROCS=2*Cpu by default?

[1]: https://github.com/aybabtme/dskvs/blob/proto/

[2]: https://github.com/HouzuoGuo/tiedot/blob/master/src/loveonea...

howardg 4742 days ago

Thanks for noticing the typo, it's been fixed and will commit shortly.

Regarding point 2, I made a note here:

https://github.com/HouzuoGuo/tiedot/wiki/Embedded-Usage

I experimented with different GOMAXPROCS settings on three machines and noticed that 1*CPU does not run tiedot to its full potential, while 3*CPU seems to be slower than 2*CPU.
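
For readers who want to repeat this kind of experiment, a minimal sketch of setting GOMAXPROCS to a multiple of the CPU count is shown below. The helper name `setProcs` is hypothetical and this is not tiedot's code. Note also that since Go 1.5 the runtime defaults GOMAXPROCS to the number of CPUs, whereas at the time of this thread the default was 1.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to factor*NumCPU. It returns the value it
// requested and the value the runtime reports afterwards.
func setProcs(factor int) (requested, applied int) {
	requested = factor * runtime.NumCPU()
	if requested < 1 {
		requested = 1 // values below 1 would leave the setting unchanged
	}
	runtime.GOMAXPROCS(requested)
	// Passing 0 queries the current setting without changing it.
	return requested, runtime.GOMAXPROCS(0)
}

func main() {
	// Try the three multipliers mentioned in the comment above.
	for _, factor := range []int{1, 2, 3} {
		_, applied := setProcs(factor)
		fmt.Printf("factor %d: GOMAXPROCS=%d\n", factor, applied)
	}
}
```

A benchmark would then be run once per setting; which multiplier wins depends on the workload and the machine, so the 2*CPU result above should not be assumed to generalize.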

pkulak 4742 days ago

With GOMAXPROCS=n*CPU, n is roughly the amount of pre-emptive (vs the built-in cooperative) multitasking that you want going on, with 1 being none. Handled by the OS, of course. Interesting that you noticed a speed up > 1.

AYBABTME 4742 days ago

I didn't think about that, I'll write that down in my checklist of things to do when testing/benchmarking my projects... like another dimension to take care of when testing. Aside from domain-range, good data, bad data, edge cases... and other parameters - now add to that concurrency scenarios.

jzs 4742 days ago

Nice to see a NoSQL database written in Go. By the way, your repository structure is incompatible with "go get".

howardg 4742 days ago

Much appreciated! I will soon configure my web server to be compatible with `go get`.

dlsspy 4742 days ago

We've done a few of them here. Notably:

* http://cbgb.io/

* http://dustin.github.io/2012/09/09/seriesly.html

* https://github.com/couchbaselabs/sync_gateway

cbgb is an API-compatible Couchbase implementation in Go. We use it in place of Couchbase when we need something tiny to play around with.

seriesly is a time series database for storing and aggregating sample data and doing things like this: http://bleu.west.spy.net/~dustin/seriesly/

sync_gateway is how our mobile team synchronizes data across all your phones and tablets and your central DB.

VeejayRampay 4742 days ago

Is there anything Golang can't do fairly well with a small code base? Damn, I wish I were still 20 and had shitloads of time on my hands to invest in learning the ins and outs of the language.

jzelinskie 4742 days ago

It does take Google ~100k lines to load balance their MySQL servers, so the size of your program is still highly dependent on how simple you want to make it. I'm 21 and spending most of my time writing Go -- there aren't many "ins and outs" required to learn. I find that it is very idiomatic to write simple solutions and simply take advantage of interfaces if someone wants to implement a more specific piece of your code base. If you want to learn Go well all you have to do is read the spec[1] and the source code of at least some portion of the standard library.

As an aside, this project is interesting. I've been kinda curious about experimenting with a project like this on my own. However, I wish the author's documentation opened with which ideas from which papers inspired the project.

[1] http://golang.org/ref/spec

wilsonfiifi 4741 days ago

You should really make the effort to take a look at it, it's well worth it! If you have a C background it should be relatively easy to pick up.

This should be an easy weekend read http://www.golang-book.com/

throwit1979 4742 days ago

What does being 20 have to do with anything? I'm 34 and I've been deeply acquainting myself with Go over the course of the last month.

hnriot 4742 days ago

20 - as in before life starts to throw serious time sinks at you, wives, children, mortgages, high stress jobs etc etc.

At 20, I just did whatever the hell I wanted, bummed around Europe before figuring out where to do my masters. Responsibility was not paramount on my mind. Maybe 20-somethings today are different, but not the ones I know; it's still all about having fun, learning new stuff and exploring the possibilities.

I fail to see how anyone could not have understood what the comment meant.

hox 4742 days ago

I think he's referring to his 20-year-old self, not being 20 in general. I definitely had more time to look at stuff when I was 20.

kingmanaz 4742 days ago

A re-write of "starbase" in Go would be a fun project for someone with the time.

http://hopper.si.edu/wiki/mmti/Starbase

Starbase is patterned after "/rdb", a flat-file relational database adhering to the Unix philosophy, i.e., piping together small, single-purpose tools. The approach is covered in "Unix Relational Database Management" ( http://www.amazon.com/Relational-Database-Management-Prentic... ), a book which anticipated the "suckless" movement by a couple of decades ( http://suckless.org/philosophy ).

It would be nice to see something like /rdb, except with: 1. Better transparent support for optional indexes when querying. 2. Automatic updating of indexes when deleting/updating data. 3. Scripts included in the package written in "rc" rather than "sh". 4. BSD license.

Perhaps something like tiedot could be built on top of the above: a single, statically-compiled binary to expose the flat file database through a JSON/REST interface and to honor the Unix user/group table-file permissions through standard HTTP authentication. Forms could be designed against the web service while system administration is handled with as much of the unix system as possible.

Such a stack would be great for smaller startups and wherever *nix experience is available.

jedc 4742 days ago

Does anyone know how this does/doesn't compare to Camlistore? http://camlistore.org/ (which is also written in Go)

howardg 4742 days ago

According to my (limited) understanding, Camlistore is a generic BLOB storage, which is not something tiedot addresses. tiedot is a generic unstructured data storage, more like CouchDB/Cassandra; it stores serialized JSON data rather than BLOBs.
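
To make "serialized JSON data" concrete, here is a minimal sketch of a schemaless document being encoded to JSON bytes and decoded back with Go's standard `encoding/json` package. It does not use tiedot's API; the `roundTrip` helper and the sample document are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip serializes a schemaless document to JSON bytes (the form a
// document store could persist) and decodes those bytes back into a map.
func roundTrip(doc map[string]interface{}) ([]byte, map[string]interface{}, error) {
	raw, err := json.Marshal(doc)
	if err != nil {
		return nil, nil, err
	}
	var back map[string]interface{}
	if err := json.Unmarshal(raw, &back); err != nil {
		return nil, nil, err
	}
	return raw, back, nil
}

func main() {
	// A hypothetical document; no schema is declared anywhere.
	doc := map[string]interface{}{
		"title": "tiedot",
		"tags":  []string{"go", "nosql"},
	}
	raw, back, err := roundTrip(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
	fmt.Println(back["title"])
}
```

Unlike a BLOB, the stored bytes keep their field structure, so a document store can index and query individual fields.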

throwit1979 4742 days ago

Except Cassandra is a columnar data store and has absolutely nothing in common with CouchDB (or tiedot) other than that it stores and retrieves data.

farmdawgnation 4742 days ago

I'd like to see how this compares performance-wise with MongoDB and other JSON-based document stores, especially with data sets that are at a larger scale. I know Mongo tends to start crumbling if it can't fit an entire index in the available memory (which happens when you have a 4GB data set, unfortunately). Have you done any of those comparisons?

That said, this looks really interesting. Though I can't imagine for the life of me why you'd indent such a wonderful project with tabs. ;)

howardg 4742 days ago

Thank you for the feedback! I noted down your recommendation here:

https://github.com/HouzuoGuo/tiedot/issues/3

And actually `go fmt` prefers tabs over spaces ;)

karaziox 4742 days ago

The formatting style is standardized by Go (enforced by gofmt), and they made the choice of tabs over spaces :)

http://golang.org/doc/effective_go.html#formatting
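
The tab behaviour is easy to check with the standard library's `go/format` package, which applies gofmt-style formatting. The sketch below formats a snippet that was indented with four spaces; the `formatGo` wrapper is a hypothetical helper, not part of any project discussed here.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

// formatGo runs gofmt-style formatting over Go source text.
func formatGo(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	// Input deliberately indented with four spaces.
	src := "package main\n\nfunc main() {\n    println(\"hi\")\n}\n"
	out, err := formatGo(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	// Reports whether the body line is now indented with a tab.
	fmt.Println(strings.Contains(out, "\n\tprintln"))
}
```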

willvarfar 4742 days ago

How does it stand on ACID, joins, redundancy, scaling etc?

It's a hobby project so I'm going to presume the worst, but I want to know what the author intends to do for each of these.

howardg 4742 days ago

Indeed, its wiki pages should have mentioned ACID properties, I noted the issue down here:

https://github.com/HouzuoGuo/tiedot/issues/4

Currently, tiedot's stance on ACID is similar to MongoDB's.

I have not spent enough time on the project to support redundancy in it, sorry. I totally agree that redundancy is a must-have if someone wants to use tiedot in serious scenarios, so I will definitely spend time on making this feature available.

Scalability on symmetric multiprocessing architectures has been seriously considered and implemented: basically, tiedot can demonstrate that more CPUs = more performance. However, scaling by replication has not been considered yet.

jerf 4742 days ago

You should decide if you want to make a serious run at being an option for true production-quality deployment, or if this is a fun project. If it's just a fun project, you may want to consider not worrying about replication/redundancy; it's tricky, quirky, and if you haven't been considering it from day one, likely to require a near-complete rewrite, which may be an awful lot of work for a fun project. Of course, if you are going to be serious, it is a must.

I am completely neutral on which direction you go; my point here is just that if you are just having some fun, you may find replication will turn out to be, well, potentially rather unfun. Educational as can be, though. It's a far, far more subtle problem than initially meets the eye.

arethuza 4742 days ago

Or it could be an embedded database (like SQLite) that keeps replication and redundancy out of scope?

phasevar 4742 days ago

Cool to see something like this in Go. :-)

nickpresta 4742 days ago

Looks pretty interesting. I like the embedded use. I submitted a PR to enhance your software: https://github.com/HouzuoGuo/tiedot/pull/6

trailfox 4742 days ago

Might help to give it a well-known open source license, e.g. BSD, MIT, Apache.

howardg 4742 days ago

The license is the 2-clause BSD license.
