by placardloop 408 days ago
This so-called “security risk” is a role in a nonprod account that can list metadata about things in your production accounts. It can list secret names, list bucket names, list policy names, and similar.

Listing metadata is hardly a security issue. The entire reason these List* APIs are distinct from Get* APIs is that they don’t give you access to the object itself, just metadata. And if you’re storing secret information in your bucket names, you have bigger problems.


Depending on what the metadata is, it can be a huge security risk.

For example, some US government agencies consider computer names sensitive, because the computer name can identify who works in what government role, which is very sensitive information. Yet, depending on context, the computer name can be considered "metadata."

AWS does not treat metadata with the same level of sensitivity as other data. The docs explicitly say that sensitive information should not be stored in, e.g., tags or policies. If you are attempting to do so, you’re fighting against the very tool you’re using.

To add to this point, in my interactions with AWS employees it seems that:

- The account manager and the enterprise support TAM can view a list of all resources on the account, including metadata like resource name, instance type and cost explorer tags. Enterprise support routinely presents a monthly cost review to us, so it is clear that they can always access this information without our explicit consent. They do not have the ability to view detailed internal information about the resources though, such as internal logs.

- When opening a support case, the ticketing system asks for the resource ARN, which may contain the name. It seems that the support team can view some data about that object, including monitoring data and internal logs, but potentially accessing "customer data" (such as ssh-ing into an RDS instance) requires explicit, one-off consent.

- I never opened any issues about IAM policy, so I don't know whether they can see IAM role policy documents.

- It seems that the account ID and account name are also often used by both AWS's sales side and the reseller's side. I think I read somewhere that it is possible to retrieve the AWS account ID if you know an S3 bucket name or something, and when exchanging data with an external partner via AWS (e.g. S3, VPC peering) you're required to share your account ID with the partner.

I used to work for an AWS support partner, essentially being tier 3 support. We were also involved in onboarding large customers to AWS and working with their teams to migrate their systems.

We always told them that AWS account IDs are not considered sensitive information by AWS, and neither are S3 bucket names. Metadata (tags etc.) is generally visible to AWS in a variety of ways if you ask for help, but is not public.

This helped them use the services so that the sensitive information was actually hidden, and apply the correct security policies.

Most of the access you're describing is based on AWSServiceRoleForSupport https://docs.aws.amazon.com/awssupport/latest/user/using-ser.... You can see both the IAM policy and the CloudTrail access logs in your account. There are internal tools which use IdP + business justification (e.g. an open support ticket for a specific service) before giving a human limited, predefined access to that role.

Some services will have an internal “admin” tool that is limited to a smaller group with similar limited access + review mechanisms. IME those tools are generally built into the service implementation and don't expose access via a similar service principal/role. The reduced customer visibility is mitigated by very restrictive access, like “team primary oncall + manager approval + high severity ticket”.

I invite you to consider the possibility that even though that’s the case, it’s Amazon’s fault for this design choice and one that can be critiqued especially since metadata disclosure can be paired with other exploits. For example, if I know a bucket name then I know the bucket’s domain name since buckets are by default created open to the public.

There’s no inherent reason for treating metadata as less sensitive and there would be fewer problems if it were treated with the same sensitivity as normal data.

Said another way, some users expect the metadata to be treated sensitively and Amazon’s subversion of this is an Amazon problem not a user problem since this user expectation is rather reasonable.

> Said another way, some users expect the metadata to be treated sensitively and Amazon’s subversion of this is an Amazon problem not a user problem since this user expectation is rather reasonable.

It's an Amazon problem to the extent that they lose business over it. But if people choose to use AWS, despite having different requirements for data security than AWS provides, that is a user problem. At some point the onus is on the user to understand what a tool does and doesn't do, and not choose a tool that doesn't meet their requirements.

S3 buckets being public by default was stopped 2 or 3 years ago: https://aws.amazon.com/about-aws/whats-new/2022/12/amazon-s3...

longer if using the console

S3 buckets were never public by default. From the link you posted:

"...Amazon S3 buckets are and always have been private by default. Only the bucket owner can access the bucket or choose to grant access to other users..."

The feature and announcement you linked were about enabling an additional safety feature that blocks buckets from becoming public, even if you intentionally (or accidentally) configure them with public access.

The well known accidents in the past, of Facebook or the Pentagon having private data in public S3 buckets, I can only attribute to the modern practices of self-paced learning, skipping videos on Udemy courses or deciding formal training is no longer necessary because I can Google it...

The globally unique namespace of S3 bucket names could be problematic even if only the name leaks as metadata.

You could figure out how a company names their S3 buckets. It's subtle, but you could create a bunch of typo'd variants of the buckets and sit around waiting for S3 server access logs/CloudTrail to tell you when someone hits one of the objects.

When that happens, you could get the accessing AWS account number (which isn't inherently private, but something that you wouldn't want to tell the world about), the IAM user accessing the object, and which object they attempted to access.

Say the IAM user is a role with a terribly insecure assume-role policy... Or one could put an object where the misconfigured service was looking and it'd maybe get processed.

This kind of attack is preventable but I doubt most people are configuring SCPs to the level of detail you'd need to completely prevent this.
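
As a rough illustration of why typo'd names matter, here is a minimal sketch you could use defensively: it lists single-edit lookalikes of one of your own bucket names, so you can pre-register them or grep your client logs for them. The function name and the choice of edits (dropped, doubled and swapped characters) are my own assumptions, not anything AWS provides.

```python
def typo_variants(name: str) -> set[str]:
    """Return single-edit lookalikes of a bucket name (hypothetical helper)."""
    variants = set()
    for i in range(len(name)):
        # One character dropped, e.g. "acme-logs" -> "acme-log"
        variants.add(name[:i] + name[i + 1:])
        # One character doubled, e.g. "acme-logs" -> "acmme-logs"
        variants.add(name[:i] + name[i] * 2 + name[i + 1:])
        # Two adjacent characters swapped, e.g. "acme-logs" -> "acme-lgos"
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)
    # S3 bucket names are between 3 and 63 characters long.
    return {v for v in variants if 3 <= len(v) <= 63}


if __name__ == "__main__":
    for candidate in sorted(typo_variants("acme-logs")):
        print(candidate)
```

This only covers the simplest typos; it does not check the other bucket naming rules or whether a name is already taken.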

That’s why Amazon recommends the use of the expected owner parameter for S3 operations.
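
For reference, a minimal sketch of what that looks like from the client side, assuming boto3 (where the parameter is called `ExpectedBucketOwner`). The helper name and the bucket, key and account ID values are made up for illustration; the helper only builds the request parameters so it runs without AWS credentials.

```python
def get_object_params(bucket: str, key: str, expected_owner: str) -> dict:
    """Build GetObject parameters that pin the bucket to one owning account.

    Hypothetical helper: if the bucket turns out to be owned by a different
    account (e.g. a typo-squatted name), S3 rejects the request instead of
    serving or accepting data.
    """
    if not (len(expected_owner) == 12 and expected_owner.isascii() and expected_owner.isdigit()):
        raise ValueError("expected_owner must be a 12-digit AWS account ID")
    return {"Bucket": bucket, "Key": key, "ExpectedBucketOwner": expected_owner}


# With boto3 installed and credentials configured, usage would look like:
#   import boto3
#   s3 = boto3.client("s3")
#   s3.get_object(**get_object_params("acme-logs", "2024/01/app.log", "111122223333"))
```

The catch is that every caller has to remember to pass it, which is why an organization-wide guardrail is a useful complement.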

ISTR it’s also possible to apply an SCP that limits S3 reads and writes outside your organization. If not via an SCP then via a permission boundary at the least.
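
A sketch of what such an SCP might look like, assuming the `aws:ResourceOrgID` global condition key; the function name and the org ID are placeholders, and I haven't tested this exact policy. A real one would likely need exceptions for AWS-owned buckets that some services read from.

```python
import json


def s3_org_boundary_scp(org_id: str) -> dict:
    """Build an SCP that denies S3 actions on resources outside the given org.

    Assumption: the aws:ResourceOrgID condition key is used to identify the
    organization that owns the bucket being accessed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyS3OutsideOrg",
                "Effect": "Deny",
                "Action": "s3:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEqualsIfExists": {"aws:ResourceOrgID": org_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # "o-exampleorgid" is a placeholder organization ID.
    print(json.dumps(s3_org_boundary_scp("o-exampleorgid"), indent=2))
```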

I know because it was one of the key decisions we made with R2 and pushed this point in the community.

The majority of S3 buckets, especially valuable ones, were created back when it was the default, and thus the metadata sensitivity of bucket names remains (and that isn’t the only metadata issue).

AWS S3 buckets have always been private by default.

> since buckets are by default created open to the public.

This is false

"We kill people based on metadata."

- Ex-NSA chief Michael Hayden

Metadata is data. In a large corporation, metadata can also reveal projects under NDA that only a select few employees are supposed to know about.

That sounds more like the government’s fault for putting a secret in the name.

Make the computer name a random string or random set of words, no relation to the person or department who uses it. Problem solved.

And more problems created.

Now you have to have another system that decodes the random words to human usable words. Is that information going to be stored all in one system? Is each team going to be responsible for the translation? How is that going to be protected from information loss?

I work with systems like this so, yea, it can be done. But it cannot be done trivially.

As someone who has been in the IT department for a company, this is a basic and trivial function of an asset tracking system. It’s literally just matching the serial number or computer name of the asset to the employee.
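
To make that concrete, a minimal sketch of the idea: hand out opaque computer names and keep the name-to-person mapping in one lookup table. The class, word list and name format are all invented for illustration; a real asset tracking system would persist this and restrict who can read it.

```python
import secrets

# Arbitrary word list, for illustration only.
WORDS = ["amber", "birch", "cedar", "delta", "ember", "flint", "grove", "heron"]


class AssetRegistry:
    """Maps opaque computer names to (serial number, employee)."""

    def __init__(self) -> None:
        self._assets: dict[str, tuple[str, str]] = {}

    def assign(self, serial: str, employee: str) -> str:
        """Generate an unused random name and record who it belongs to."""
        while True:
            name = (
                f"{secrets.choice(WORDS)}-{secrets.choice(WORDS)}"
                f"-{secrets.randbelow(10_000):04d}"
            )
            if name not in self._assets:
                self._assets[name] = (serial, employee)
                return name

    def lookup(self, name: str) -> tuple[str, str]:
        """Translate a computer name back to (serial number, employee)."""
        return self._assets[name]


if __name__ == "__main__":
    registry = AssetRegistry()
    hostname = registry.assign("SN-001", "alice")
    print(hostname, registry.lookup(hostname))
```

The hostname itself reveals nothing about the user; only someone with access to the registry can do the translation.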

I don't think the US government is representative of any kind of advisable behavior. Perhaps if they weren't doing stuff that makes people want to murder them we wouldn't have to light piles of cash on fire to protect the perpetrators.

Whether or not it’s advisable doesn’t really change the fact that if the US government is commonly doing something, then it is not correct to describe a security impact to those SOPs as “hardly a security risk”.

> And if you’re storing secret information in your bucket names, you have bigger problems.

Yeah but the design should be made on the assumption that some customers will do stupid things, and protect them.

Not an identical case, but I once bought a Cisco router for a home lab/learning and it turned out to be hardware decommissioned by a European bank, not wiped before being handed over to some asset disposal contractor. It eventually landed on an auction portal with the bank's configuration still on it. The bank was very meticulous about documenting things like the address of the branch where it was installed in the device's config, and ACL names/descriptions included employees' names and room numbers. You could easily extract the names of people granted extended access to internal systems.

So while I agree with you in principle, even financial institutions do stupid things, lack procedures, or don't always follow the procedures they have. A cloud provider's design should assume that its customers will not follow best practices.

At the end of the day if you deploy a tool that can access production data, you need to treat it like production. That's the reality here.

No, that’s not the reality. “Production data” isn’t as black and white as that.

Metadata about your account, regardless of if you call it “production” or not, is not guaranteed to be treated with the same level of sensitivity as other data. Your threat model should assume that things like bucket names, role names, and other metadata are already known by attackers (and in fact, most are, since many role names managed by AWS have default names common across accounts).

Hey, author of the blog here :)

Just wanted to point out that it is not just names of objects in sensitive accounts exposed here - as I wrote, the spoke roles also have iam:ListRoles and iam:ListPolicies, which is IMO much more sensitive than just object names. These contain a whole lot of information about who is allowed to do what, and can point at serious misconfigurations that can then be exploited onwards (e.g. misconfigured role trust policies, or knowing about over-privileged roles to target).

ListPolicies does not show the contents of policies, so the information you mentioned isn’t possible to obtain from there.

Things like GetKeyPolicy do, but as I mentioned in my comments already, the contents of policies are not sensitive information, and your security model should assume they are already known by would-be attackers.

“My trust policy has a vulnerability in it but I’m safe because the attacker can’t read my policy to find out” is security by obscurity. And chances are, they do know about it, because you need to account for default policies or internal actors who have access to your code base anyway (and you are using IaC, right?)

You’re right to raise awareness about this because it is good to know about, but your blog hyperbolizes the severity of this. This world of “every blog post is a MAJOR security vulnerability” is causing the industry to think of security researchers as the boy who cried wolf.

> “My trust policy has a vulnerability in it but I’m safe because the attacker can’t read my policy to find out”

The goal in preventing enumeration isn't to hide defects in the security policy. The goal is to make it more difficult for attackers to determine what and how they need to attack to move closer to their target. Less information about what privileges a given user/role has = more noise from the attacker, and more dwell time, all other things being equal. Both of which increase the likelihood of detection prior to full compromise.

iam:ListRoles tells you ARNs and policy rules — at least, in their example response.

https://docs.aws.amazon.com/IAM/latest/APIReference/API_List...

I don’t think this is a major or severe issue, but it certainly would provide information for pivots, e.g., ARNs to request and information about where to request them from.
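
To show the kind of thing that response exposes, here is a small sketch that audits role records shaped like a ListRoles result for trust policies that allow any AWS principal without a condition. The sample data, account IDs and function name are made up, and I'm assuming the `AssumeRolePolicyDocument` has already been decoded into a dict (which I believe boto3 does for you).

```python
# Hypothetical sample shaped like the "Roles" list of a ListRoles response.
SAMPLE_ROLES = [
    {
        "Arn": "arn:aws:iam::111122223333:role/legacy-deploy",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}
            ],
        },
    },
    {
        "Arn": "arn:aws:iam::111122223333:role/ci-runner",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"AWS": "arn:aws:iam::444455556666:root"},
                    "Action": "sts:AssumeRole",
                }
            ],
        },
    },
]


def wildcard_trust_roles(roles: list[dict]) -> list[str]:
    """Return ARNs of roles whose trust policy allows any AWS principal unconditionally."""
    flagged = []
    for role in roles:
        statements = role.get("AssumeRolePolicyDocument", {}).get("Statement", [])
        if isinstance(statements, dict):  # a single statement may not be wrapped in a list
            statements = [statements]
        for stmt in statements:
            if stmt.get("Effect") != "Allow" or "Condition" in stmt:
                continue
            principal = stmt.get("Principal") or {}
            aws = principal if isinstance(principal, str) else principal.get("AWS", [])
            if isinstance(aws, str):
                aws = [aws]
            if "*" in aws:
                flagged.append(role["Arn"])
                break
    return flagged


if __name__ == "__main__":
    print(wildcard_trust_roles(SAMPLE_ROLES))
```

Running the same check against your own accounts is cheap, which is sort of the point: anyone holding iam:ListRoles can do it too.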

I disagree with your opinion here: The contents of security policies can easily be sensitive information.

I think what you mean to say is, "Amazon has decided not to treat the contents of security policies as sensitive information, and told its customers to act accordingly", which is a totally orthogonal claim.

It's extremely unlikely that every decision Amazon makes is the best one for security. This is an example of where it likely is not.

It’s not orthogonal. The foundation of good security is using your tools correctly. AWS explicitly tells users to not store sensitive information in policies. If you’re doing so, it’s not AWS making the mistake.