by StreamBright 2338 days ago
I have created a simple workflow using AWS Lambda + Kinesis + S3 to track our customers without any 3rd party dependency. It took roughly 2 weeks, but it is worth it since we do not leak customer data and we have much tighter control over what we collect (no PII except the source IP, which gets hashed in the process).
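
For illustration only, here is a minimal Python sketch of the hashing step, assuming a keyed SHA-256 digest. The post does not say which hash is used; `SALT`, `hash_ip`, `build_record`, and the record fields are hypothetical names, not the actual setup.

```python
import hashlib
import hmac
import json

# Hypothetical secret key; in practice it would come from a secrets store.
SALT = b"replace-with-a-secret-salt"


def hash_ip(source_ip: str, salt: bytes = SALT) -> str:
    """Return a keyed SHA-256 digest so the raw IP is never stored."""
    return hmac.new(salt, source_ip.encode("utf-8"), hashlib.sha256).hexdigest()


def build_record(event_name: str, source_ip: str) -> dict:
    """Build the Data / PartitionKey pair for one tracking event."""
    hashed = hash_ip(source_ip)
    payload = {"event": event_name, "ip_hash": hashed}
    return {
        "Data": json.dumps(payload).encode("utf-8"),  # raw IP is not included
        "PartitionKey": hashed,  # assumption: partition by hashed IP
    }
```

Inside a Lambda handler, the two values could then be passed to boto3's Kinesis `put_record(StreamName=..., Data=..., PartitionKey=...)`; the stream name is left out here because the post does not give one.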
2 comments

FYI, if your setup relies on API Gateway, you could probably use VTL mapping templates to send directly from API Gateway to Kinesis and skip the Lambda altogether, like some do for DynamoDB.

See https://hackernoon.com/serverless-and-lambdaless-scalable-cr... and https://aws.amazon.com/blogs/compute/using-amazon-api-gatewa...
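
As a rough sketch of what such a mapping template has to produce: the Kinesis PutRecord request takes `StreamName`, a base64-encoded `Data` blob, and a `PartitionKey`. The Python below only mimics that shape so it can be checked locally; `to_put_record_request` is a hypothetical helper, and in the template itself the encoding would presumably be done with something like `$util.base64Encode($input.json('$'))`.

```python
import base64
import json


def to_put_record_request(body: dict, stream_name: str, partition_key: str) -> dict:
    """Mimic the JSON a mapping template would send to Kinesis PutRecord."""
    # Kinesis expects the payload as a base64-encoded string in the JSON request.
    data = base64.b64encode(json.dumps(body).encode("utf-8")).decode("ascii")
    return {
        "StreamName": stream_name,      # hypothetical stream name supplied by caller
        "Data": data,
        "PartitionKey": partition_key,  # caller decides how to partition
    }
```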

Woo, thanks! I did not know that. I might re-architect the workflow to use this.

> I have created a simple workflow using AWS Lambda + Kinesis + S3 to track our customers and not to have any 3rd party dependency.

Except for each of the 3 components you listed that make up your system. They are 3rd party dependencies.

Everything is a 3rd party dependency then. The only way to not have a 3rd party dependency is to build your own infrastructure and use open source solutions (and even with OS you're still dependent).

I think OP was clearly referring to a self-managed solution as opposed to a set of 3rd party services like GA, Segment, etc., where the flow of data is out of your control.

I meant no 3rd party dependency for storing customer data that requires extra legal work in GDPR land. Maybe we need to include AWS in that, though. I need to look into how cloud vendors count as 3rd parties in that sense. Is there a difference between using Google Analytics and storing data on S3, even if we do not collect PII?