by paulddraper 2318 days ago
> If I’m troubleshooting by logging into EC2 instances, there is something wrong with my logging infrastructure.

I suppose it's possible to build enough logging to account for an interactive SSH session for debugging problems...but that would be massive.

I ran out of disk space. Why?


If you’re logging to a local disk on ephemeral VMs, that doesn’t make the situation any better.

That’s why you need a central logging facility. If you’re using AWS, you could store your structured JSON logs in S3 and query them with Athena. (https://medium.com/quiq-blog/store-json-logs-on-s3-for-searc...)
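As a rough illustration of the "structured JSON logs" part, here is a minimal Python sketch that writes one JSON object per line, a layout that Athena's JSON SerDes can read once the files land in S3. The field names, the `context` convention, and the `build_logger` helper are assumptions made up for this example, not anything from the linked post; shipping the lines to S3 is left out.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}
        # and those keys are merged into the top-level JSON object.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    buf = io.StringIO()
    log = build_logger("orders", buf)
    log.info("disk usage high",
             extra={"context": {"instance_id": "i-123", "used_pct": 97}})
    print(buf.getvalue(), end="")
```

Because every line is a flat JSON object, fields such as `instance_id` become queryable columns rather than text to grep for.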

Of course there are other ways, using both AWS and third-party services. Centralized logging is a solved problem.

AWS isn’t going to run out of disk space any time soon. You could also use a lifecycle policy to delete old logs or move them to a lower-cost storage class, depending on your retention policy.
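A lifecycle policy along those lines might look like the sketch below. The `logs/` prefix, the 30-day and 365-day thresholds, the rule ID, and both helper names are assumptions picked for illustration; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
def build_log_lifecycle(prefix="logs/", archive_after_days=30,
                        delete_after_days=365):
    """Build an S3 lifecycle configuration for log objects.

    Objects under `prefix` move to a cheaper storage class after
    `archive_after_days` and are deleted after `delete_after_days`.
    """
    if delete_after_days <= archive_after_days:
        raise ValueError("delete_after_days must exceed archive_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


def apply_log_lifecycle(bucket, config):
    """Apply the configuration to a bucket (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
```

The builder is kept separate from the API call so the retention numbers can be reviewed or tested without touching a real bucket.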

I’m not saying that I have never had to log on to a VM to troubleshoot, but that’s a sign that better logging is needed.

And if my logging infrastructure isn’t good, how pray tell will I troubleshoot my programs running on Lambda or Fargate?

I never said your disk was full with logs.

> how pray tell will I troubleshoot my programs running on Lambda or Fargate?

That is indeed a big problem when running on Lambda and Fargate.

In my experience, Fargate isn't very commonly used and Lambda is used for only relatively simple things.

It’s not a problem at all with Lambda or Fargate. Logging can be as simple as printing to the console; the output goes to CloudWatch.
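For example, a Python Lambda handler needs nothing more than `print`. This is only a sketch: the handler name, the event shape (a `Records` list), and the log fields are assumptions for illustration.

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point that logs by printing to stdout."""
    records = event.get("Records", [])
    # Anything written to stdout in Lambda ends up in CloudWatch Logs.
    print(json.dumps({
        "level": "INFO",
        "message": "processing event",
        "record_count": len(records),
    }))
    return {"statusCode": 200, "processed": len(records)}
```

Printing JSON rather than free text keeps the entries easy to filter and query later.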

It’s the same concept. If your troubleshooting at any point involves logging in to an EC2 instance, you might as well have a few bespoke servers called “Web01” and “Web02”. You’re just using an ASG to create pets at scale. We run an ASG in production that scales from 2 to 30 instances based on the number of messages in a queue, Lambdas running all of the time, some Fargate tasks, etc. It would be a nightmare to troubleshoot all of those processes without centralized, queryable logs.

> In my experience, Fargate isn't very commonly used and Lambda is used for only relatively simple things.

And that experience is representative of the entire AWS ecosystem?