by tomnipotent 1686 days ago
Thanks! I take it this file is where I can get started to learn more:

https://github.com/jitsucom/jitsu/blob/0aaa74b59eb9d8c885c80...

I see that it instantiates an "AsyncLogger" - does the service wait until data is written to the log prior to returning success to the client?

Is the WAL the same source used to feed both database storage destinations and other SaaS destinations?


Hi! My name is Sergey, I'm a Jitsu product engineer, and I'll gladly answer your question. AsyncLogger is asynchronous by design: JSON events are sent over a Go channel and written to the log file from there. So no, the service doesn't wait until data is written to the log before returning success to the client. The WAL is designed to keep JSON events across Jitsu instance restarts to prevent data loss. When you deploy your Jitsu application, it handles service restart signals (e.g. SIGTERM) and closes database connections and other resources. All events that arrive during this time are stored in the WAL. After Jitsu starts again, all events from the WAL are passed to the main JSON event pipeline and stored in the destinations.
Is the WAL only used during restart, or also during normal operations? Trying to create a mental model of how data flows through the system and into destinations.
During normal operations as well. Jitsu supports destinations in two modes: stream and batch. In batch mode, all JSON events are written to the WAL asynchronously (the client doesn't have to wait); the batch destination then processes WAL files in the background and stores the data in batches. In stream mode, all JSON events are put into a persistent queue, processed one by one, and stored in the data warehouse with insert statements.
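As a rough mental model of the two modes, here is a small sketch. Everything in it is hypothetical and not taken from the Jitsu codebase: the Event and Store types, runStream and runBatch, a Go channel standing in for the persistent queue, and slices standing in for WAL files.

```go
package main

import "fmt"

// Event is one JSON event (hypothetical type).
type Event map[string]interface{}

// Store is a stand-in for a destination; it only counts what it receives.
type Store struct {
	Inserts int // number of insert calls
	Rows    int // total events stored
}

// Insert stores one or more events in a single call.
func (s *Store) Insert(events ...Event) {
	s.Inserts++
	s.Rows += len(events)
}

// runStream models stream mode: events are taken from the queue
// one by one and each is stored with its own insert.
func runStream(queue <-chan Event, s *Store) {
	for e := range queue {
		s.Insert(e)
	}
}

// runBatch models batch mode: each WAL file is read in the background
// and its events are stored together as one batch.
func runBatch(walFiles [][]Event, s *Store) {
	for _, file := range walFiles {
		if len(file) == 0 {
			continue // nothing to store for an empty file
		}
		s.Insert(file...)
	}
}

func main() {
	events := []Event{{"id": 1}, {"id": 2}, {"id": 3}}

	// Stream mode: a channel stands in for the persistent queue.
	queue := make(chan Event, len(events))
	for _, e := range events {
		queue <- e
	}
	close(queue)
	stream := &Store{}
	runStream(queue, stream)

	// Batch mode: one slice stands in for one WAL file.
	batch := &Store{}
	runBatch([][]Event{events}, batch)

	fmt.Printf("stream: %d inserts for %d events\n", stream.Inserts, stream.Rows)
	fmt.Printf("batch: %d inserts for %d events\n", batch.Inserts, batch.Rows)
}
```

With three events this prints 3 inserts for stream mode and 1 insert for batch mode, which is the difference described above: per-event insert statements versus periodic bulk loads from WAL files.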