by pc 1827 days ago
(Stripe cofounder.)

> Considering that Stripe was originally known for letting websites accept credit card payments without seeing your credit card number, one might assume that Stripe Identity only allows websites to see the verification result, and not your selfies and scans of your identity documents.

A few points:

- Fundamentally, Identity makes it possible to choose how much of this data traverses / is stored on your servers, just as Stripe did with card numbers.

- There's a basic difference between card numbers and identity verification. With card numbers, you (generally) don't really care about the number -- you just want the payment. With ID verification, however, many businesses have good reason to want more than just the verification result. For example, they are often subject to compliance requirements that mandate that they themselves possess or have access to the raw information. They may need or wish to perform additional checks on their side. Etc.

- The relevant UI in Identity is deliberately very clear on this point in order to avoid the assumption you're stating. The flow explicitly says "Stripe and [Business] may each use your data." Even though an end user might consider it suboptimal for the business to have their data, we still view it as an improvement to the usual status quo, where this data is frequently stored in very ad hoc fashion and without rigorous security protections.

- While many of the businesses initially building on Identity wanted access to the raw information, it may well make sense for us to enable them to restrict themselves in the future. In this world, Stripe could tell their customers that the business doesn't have access to the raw details. (This might even make sense for Stripe payments in the future.) As a philosophical matter, we consider ourselves to serve the business, which means that limiting access to what we consider to be the business's own information feels a bit strange. That said, it might sometimes be in the interests of the business to allow them to limit themselves in this fashion (especially as Stripe's brand recognition among consumers grows).

- There's a separate concern about compromise of the business's credentials leading to inadvertent disclosure of this information (a situation analogous to an S3 bucket key getting leaked). This is of general concern to us in lots of situations, not just with Identity. We have some new functionality on the way here.

13 comments

Thanks for your reply.

> Fundamentally, Identity makes it possible to choose how much of this data traverses / is stored on your servers, just as Stripe did with card numbers.

There's a stark difference in how Stripe treats exports of card numbers versus exports of raw identity verification data. This makes it way easier, and more likely, for Stripe customers to choose to store raw identity verification information.

> With ID verification, however, many businesses have good reason to want more than just the verification result. For example, they may be subject to compliance requirements that mandate that they themselves possess or have access to the raw information. They may need or wish to perform additional checks on their side. Etc.

I acknowledge that some businesses have a need for this. But I see Discord and Clubhouse among your customer logos, and your product page talks about non-KYC use cases. Many of your customers will have access to identity documents without really needing it. That sucks for the end users of Stripe Identity, because it makes it more likely their data will be misused.

A concrete suggestion: make it possible for businesses to choose whether they have access to the raw data, and expose the choice to the end user in the Stripe Identity flow. Ideally, businesses that want the raw data would be subject to security compliance requirements. This is an opportunity for Stripe to be a leader in setting high standards on how this type of data should be handled.

Appreciate your feedback. On the first point, limitations on what the secret key can access are coming very soon.

> A concrete suggestion: make it possible for businesses to choose whether they have access to the raw data, and expose the choice to the end user in the Stripe Identity flow. Ideally, businesses that want the raw data would be subject to security compliance requirements. This is an opportunity for Stripe to be a leader in setting high standards on how this type of data should be handled.

Yes, per GP comment, I think this is a good idea. I suspect we'll do it.

+1 on being able to choose. I’m building a personal finance app right now, and where I can I’m choosing to not ingest or retain sensitive data. While the origin of this is scratching my own itch, I suspect that I’ll get better traction if I can overtly say I’m not collecting data I don’t need or holding onto it for longer than you want me to. I’d love to be able to just get a Boolean back.
Here we go, online IDs. It seems inevitable that some entity will leak this data at some point. Then what?
Businesses collecting identity information is nothing new. Somebody like Stripe putting a concerted effort out there to make it more secure and improve the experience so that identity information is stored in a less ad-hoc way is a win and will reduce the odds of some catastrophic leak. If you are only worried about identity leaks now then you are simply miscalibrated on your assumptions about the nature of online identities. If you are seriously this worried, then you probably shouldn't be using the internet for anything.
> so that identity information is stored in a less ad-hoc way

It will be more ad hoc. Stripe does not decide how their client stores such data. Stripe will make asking for an ID very easy and that will vastly expand the number of businesses utilizing this method of registration.

Right now I think of Stripe as a reliable service. When one of their customers' data is breached or leaked, I don't know that everyone will still trust Stripe as a brand. News articles about such breaches won't be able to relate the nuance of who's at fault.

I'm not concerned about my online personas being linked to me. I'm concerned about making it easy for bad actors to perform identity theft en masse.

I'm not sure you understand. When a business needs your ID to do business, they ask you for it and store it in their infrastructure. This already happens today. Nothing Stripe is doing necessarily changes this. Stripe is simply providing a streamlined mechanism by which businesses can fulfill their KYC requirements and obtain this information. And now they have the choice to continue to store it in their infrastructure or look it up via the API as needed. If somebody breaches Wells Fargo and dumps all the identity info of their customers, clearly Wells Fargo is at fault. Nobody will care if the entry form where they put their info in when they signed up for a bank account was hosted by Stripe and white labeled by Wells Fargo, or if there was a permission box that popped up from Stripe asking if you'd like to allow Wells Fargo access to your info, or if it was simply hosted by Wells Fargo. I don't see the problem here.
It’s great that you think that limiting the firehose-style wild-west dissemination of people’s identity data might be a good idea, and I have good feelings about your suspicions; I suspect they might be well founded.

Might as well wait until anybody that can drag and drop Stripe code into their app gets as many photos of people’s IDs and faces and security questions from their users and squirrels it away into their private databases.

Once that’s done it’ll be a good time to fire off a blog post about how not doing that was always in the works and announce that groundbreaking features like “basic privacy permissions for identity data” will become default.

Maybe it’ll be a paid feature for end users?

Fully agree here - I would say that I am a bit shocked at the lack of regulation regarding access to people’s identity documents as compared to credit cards. Credit/debit cards are your money, and there’s an entire network of both regulations and intermediaries working against fraud in this space.

Your identity can create new credit cards. It can take out loans. It is inherently a higher order security risk, and therefore should by default have more restrictions. I as a consumer trust Stripe to do the right thing, but I do not trust its customers. This seems to be the most reasonable stance, yet the policy does not reflect that. I am concerned that this wedges open a really big new avenue for cybercrime without having any sort of regulations in place à la PCI audits.

> Your identity can create new credit cards. It can take out loans. It is inherently a higher order security risk, and therefore should by default have more restrictions.

It's a security risk because of the first couple things you listed. The problem is that identity cannot be simultaneously a secret and a public identifier. As the name should suggest, identity serves a much better use as a public identifier. So we should stop treating it like a secret and start creating real infrastructure for actual secrets.

By the way, this is completely analogous to credit cards. There's a reason the industry has moved to chip cards physically and tokenized cards virtually. And that's because the card number was serving as both identity and secret, and that doesn't work. The deviation is that, in this case, we've decided to make the credit card numbers a secret which is cryptographically protected (chips) or at the very least stored in an opaque manner (tokens).

> I would say that I am a bit shocked at the lack of regulation regarding access to people’s identity documents as compared to credit cards.

To some degree it's because there isn't much point. You can call up my home state today, pinky promise that you're me, hand over $20, and they'll ship you my birth certificate or other important documents. We don't have private keys or other kinds of unique identifiers assigned at birth, so attempts to lock it down further would lock people out of their own identities.

Scale does matter, and a breached database of identity documents is definitely worse than having to pay a nominal fee and wait a few days, but given the context of other manual labor like securing loans I'm not sure the extra ease would result in much more fraud.

It's supposed to work in quite a few countries, and not all make it so easy. Given the requirement in my country for ID when obtaining any other ID, I'm actually puzzled about what happens if you lose everything.

https://stripe.com/docs/identity/verification-checks

For me, the general process would require a police report for lost/stolen ID (mandatory, so that it can be marked as lost/stolen so that it would be detected if someone tries to use it) and verification with the data they have on file - nowadays with EU biometric IDs they can be quite sure that I'm the same person as the one who got the previous ID as the face and fingerprints can be verified.
There's an honor system in many places. You sign a document stating you are who you say you are, and have it witnessed by someone who is "deemed trustworthy" - local police, teacher, clergy.
When Stripe handles the data of residents of the European Economic Area it is subject to the General Data Protection Regulation [0].

[0] https://en.wikipedia.org/wiki/General_Data_Protection_Regula...

Just from an end user POV, would I be able to request from Stripe a log of metadata about which types of my personal data, and how much of it, have been shared with the companies?
> Ideally, businesses that want the raw data would be subject to security compliance requirements.

Isn’t that already true for businesses that store this data from any source?

No. Unfortunately, most businesses in the US are not under any compliance requirements or regulations around identification. Certain states have special rules (like California I think?) but in most places US businesses can generally do anything they want with an ID card or relevant information, so long as they don't impersonate you or commit a crime with it.

Given the way Stripe has implemented this today, Stripe might as well be selling their business customers a <input type="file" /> tag for driver's licenses, because that's the level of security 99% of all businesses will be using around this. There are going to be Amazon S3 buckets filled up with driver's license JPEGs provided by Stripe Identity in a few months' time.

> There are going to be Amazon S3 buckets filled up with driver's license JPEGs provided by Stripe Identity in a few months' time.

What makes you think these don't already exist? Have you ever needed to provide your identity information to use a service online (e.g. an insurance service, bank, alcohol/weed delivery, crypto market, etc.)? Where do you think the identity information you provided is stored?

If you don't use these types of services, then nothing will change--Stripe won't magically have all your identity info. If you do use these services maybe they'll partner with Stripe, maybe not. The only outcome I can see from this news is that it's likely there will be fewer AWS buckets with your identity info moving forward, because Stripe can do that for you now.

Putting my lazy developer hat on for a second here… I think I would choose to store the Stripe Identity token in my db and then pull the JPEGs on demand from Stripe’s API. Saving the image to S3 would be additional work, and well, I’m a lazy developer.
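
A minimal sketch of that "store the token, fetch on demand" approach, in Python. Everything here is an assumption for illustration: the `users` table, the `verification_ref` column, and the `fetch` callable are hypothetical stand-ins, and `fetch` is not a real Stripe API call. The point is only that the business database holds an opaque reference and never the image bytes.

```python
import sqlite3
from typing import Callable

# Hypothetical: any callable that turns a stored reference into document
# bytes by asking the verification provider. Not a real Stripe API.
FetchFn = Callable[[str], bytes]


def open_db() -> sqlite3.Connection:
    """Create an in-memory database that stores only an opaque reference."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, "
        "verification_ref TEXT NOT NULL)"
    )
    return db


def save_user(db: sqlite3.Connection, user_id: int, verification_ref: str) -> None:
    """Persist the reference returned by the provider, never the image."""
    db.execute(
        "INSERT INTO users (id, verification_ref) VALUES (?, ?)",
        (user_id, verification_ref),
    )


def load_document(db: sqlite3.Connection, user_id: int, fetch: FetchFn) -> bytes:
    """Look up the stored reference and fetch the document on demand."""
    row = db.execute(
        "SELECT verification_ref FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    return fetch(row[0])
```

With this shape, a breach of the business database exposes references rather than documents, although anyone holding the business's provider credentials could still call `fetch`.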
Depending on where you're located, there is a responsibility to only take information you require.

I get your point, but you seem to be implying this data is captured without the customer being aware. That will not be the case, surely.

Hey Patrick,

> As a philosophical matter, we consider ourselves to serve the business, which means that limiting access to what we consider to be the business's own information feels a bit strange.

Maybe I'm wrong, but once a customer uploads a document to Stripe Identity, it is supposed to be YOUR document.

I worked in Bank as a Service. Fundamentally, when a customer goes through a verification process, the documents uploaded are not owned by the partner using our APIs. They are owned by us, the Bank.

For Stripe Identity the same should apply. Here the goal is not to "lock the partner" but rather to protect them.

Now that Discord has access to my passport, in case of an identity theft could you tell me EXACTLY who's liable for the leak in regards to the law?

With BaaS it's pretty clear: the Bank carries the responsibility to keep those documents safe, thus it's safer not to give an ordinary business access to the raw details.

With the current API design you are offering, it's more ambiguous and more prone to very large leaks within a business information system like Discord or Uber etc.

Those leaks will happen.

> Now that Discord has access to my passport, in case of an identity theft could you tell me EXACTLY who's liable for the leak in regards to the law?

Discord only has access to your passport if you upload it to them. They don't have access to it by virtue of them being a Stripe customer.

Do you verify when a business downloads our identity documents from your servers that they're only doing so to meet regulatory requirements? What promise do we have you're not just making it as easy as possible to obtain drivers licenses, passports, birth certificates, etc. so that every little monster who has something we want will start making it a requirement? Have you considered how your service might impact trans people or undocumented citizens?
> With card numbers, you (generally) don't really care about the number -- you just want the payment.

I don't ever want to have a card number in my database or accessible via an administration system (my own or my provider's).

So I care... but just perhaps not in quite the way you're thinking :)

There are many use cases where it's enough to verify that the user is an actual person, and also to prevent the same person from having multiple accounts. So, it would make sense that Stripe verifies the person, but keeps the details from the business itself.

I trust Stripe more than a random online forum, a dating app, or a social network, which might offer a higher quality service when people are verified. There's a high risk that the ID documents will leak from these services at some point if they get access to them. I don't want them to know who I am at all, if they don't need to know.

It would also offer a way of preventing Sybil attacks on P2P networks, or help with connecting to non-evil nodes on a P2P network (such as the Bitcoin Lightning Network) without knowing the other person. In these cases there could be some kind of signature generated by Stripe that could be used as an additional trust factor without centralizing the system.
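
To make the idea concrete, here is a rough Python sketch of a "verified person" attestation that reveals no identity details. It is purely hypothetical and not something Stripe offers: the function names, the claim fields, and the key handling are all assumptions. It uses an HMAC from the standard library to stay self-contained, which means only the key holder can check a tag; a real design would more likely use an asymmetric signature so third parties can verify without being able to forge. The per-service pseudonym lets one service detect duplicate accounts without letting two services link the same person.

```python
import hashlib
import hmac
import json


def pseudonym(provider_key: bytes, person_id: str, service_id: str) -> str:
    """Stable per-service identifier that does not reveal person_id."""
    msg = b"pseudonym|" + f"{service_id}:{person_id}".encode()
    return hmac.new(provider_key, msg, hashlib.sha256).hexdigest()


def issue_attestation(provider_key: bytes, person_id: str, service_id: str) -> dict:
    """Build a claim with no raw identity data and attach a MAC tag."""
    claim = {
        "service": service_id,
        "subject": pseudonym(provider_key, person_id, service_id),
        "verified": True,
    }
    payload = b"claim|" + json.dumps(claim, sort_keys=True).encode()
    tag = hmac.new(provider_key, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "tag": tag}


def check_attestation(provider_key: bytes, attestation: dict) -> bool:
    """Recompute the tag over the claim and compare in constant time."""
    payload = b"claim|" + json.dumps(attestation["claim"], sort_keys=True).encode()
    expected = hmac.new(provider_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, attestation["tag"])
```

The same person always maps to the same `subject` within one service, so a second signup is detectable, while the `subject` values issued to two different services do not match each other.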

One of the points brought up by privacy folks in review of Apple’s plan to have your ID in your digital wallet is that the mere convenience of allowing access to ID may create ID requirements for users where none existed before, which is a loss for privacy. Do you think that Identity is going to create such new requirements?
I sure hope so! Anonymity is not a fundamental human right, it is a tool that should be used sparingly and only when the situation is appropriate (whistleblower, for example). The internet would be a better place if there were more identity requirements SO LONG AS companies are not legally allowed to sell or transmit that information to advertisers or other third parties without explicit opt-in consent ideally on a per-use basis. Or simply at all. If easier access to online identity systems means we as a society turn focus on legal ground rules governing how that data is treated and used, then we'll be in a really good position (: I'm excited.
What a terrible, broad statement to make, and on an anonymous forum of all places. There are plenty of places where default anonymity makes a lot of sense and it is important to a good societal structure. History has shown time and again that those in positions of advantage will abuse their access to information for their own gains. Increasing the surface of your online activity trail can and will be used against you by a bad actor when the opportunity arises. There is simply no good reason to make identity requirements the default. There is a reason identity requirements have traditionally been restricted to highly regulated entities, but of late there seems to be a trend of "internet companies" freely exchanging KYCs with each other. This blurring of boundaries between banks and regular companies is a dangerous precedent and I'm afraid it will be too late before we realise the net damage to society as a result.
> There are plenty of places where default anonymity makes a lot of sense and it is important to a good societal structure.

Can you list some examples of the types of places where you think this property holds true and explain what you mean by "good social structure"?

> History has shown time and again that those in positions of advantage will abuse their access to information for their own gains.

What are some examples of scenarios where this has happened in relation to online identity where there have been legal restrictions in place that would have otherwise prevented it? The healthcare industry and credit card industry seem to do a pretty good job of protecting sensitive information, for example.

> Increasing the surface of your online activity trail can and will be used against you by a bad actor when the opportunity arises.

How anonymous do you think you are online? If you're not deliberately taking steps to conceal your identity, your trail is thick and clear for the people who know how to track it. And that's an actual problem: people track you even if you think you're anonymous and we have no legal protection in place to prevent abuse of data that can identify you online. If you are in a position where you need to *depend* on anonymity, you simply can't because nobody will respect your wish. So the internet operates in this grey zone where because we have no rules governing abuse of PII, everyone throws on the cloak and turns to anonymity as the answer. This degrades our ability to fight spam and makes things like strong mutual authentication very very hard to do because platform vendors can't ever expose any sort of fixed identifier because privacy. Look at the insane things Apple does: zero out your MAC address when scanning for wifi networks and recently issue a new certificate for every single use so that a persistent identifier does not show up. And look at IPv6, we invented "privacy extensions" where you generate a random IP every few minutes. These hacks break functional systems because we don't understand how to regulate the internet as a society.

All that is somewhat irrelevant, though. We're talking about the identity relationship between you and a service, not necessarily "the features of interacting with the internet that can be recorded and tracked either on purpose or incidentally". Do you think your email address makes you anonymous? Again, unless you're deliberately taking steps to maintain pristine op sec with your online browsing, you identify yourself to service providers one way or another. And again, the problem is people think they're anonymous when they really aren't so they misinterpret what it means to be anonymous and its importance in good societal structure. I honestly don't see a difference between providing a service your email address or your physical address or telephone number. What's so bad about having a third party say "yeah, this person is who they say they are" and optionally "and here's the list of verified fields"? The internet is the only place where people get weirded out when someone asks for an ID. Do you not show the bartender your ID when asked because you need to be anonymous at a restaurant? How about at the gas station, the liquor store, the axe throwing range, the DMV, the hospital, when making a purchase on a credit card, taking out a loan, etc.? What real world interactions do you have that are primarily anonymous? It's not normal.

Strong identity combats spam and abuse. I would choose strong identity over spam almost every single time. I do not disagree that there are some online communities that are respectfully anonymous. But do you think e.g. Reddit is one of those? Because I do not. Regardless, you can still both a) identity check and b) run an anonymous community (and c. not store identity information). You don't have to expose the identity data in the product/community/forum itself, so nothing about making identity easier to use and more streamlined defeats the ability to operate pseudonymous services in the least. I really don't understand the "anonymity by default is good for a wholesome society" angle whatsoever.

Oh no, I'm not going to go down that slippery slope. We are not talking about CIA whistleblower levels of anonymity here. This is just basic sanity. You may never be able to fight abuse 100%, so it's good practice to reduce the surface of compromise as much as possible. If the information is not needed, just don't send it. It's about de-risking the possibilities. The fact that banks, healthcare institutions etc. are trusted within a boundary does not automatically mean every Tom, Dick, and Harry company out there should be trusted as well. There must be a strong justification for access to identity and spam is certainly the weakest out there. Fake identity is not hard to create. Bank fraud is rampant in many countries where fraudsters run large rings using such fake accounts. If banks are not able to stop these, online communities for the purpose of bot detection most certainly won't.
Fake identity is not hard to create online. You’re right! That is the problem. Fake identity is orders of magnitude harder to create in meatspace. You don't solve that problem by saying “welp I guess we just have to deal with spam to realize pseudo-security via anonymity”. I don't disagree about privacy, even. I think you’d find we agree about not sending information you don't need. Where we're talking past each other is on the topic of anonymity vs privacy. I want strong identity and privacy and tools and laws that protect my identity and privacy online as well as offline. Tools that let me manage who has access to my private information and for what use cases. Tools that alert me when that information is accessed or shared. Tools to allow me to verify the information provided by others is genuine. This has nothing to do with anonymity.
> The internet would be a better place if there were more identity requirements

This is a completely baseless claim, as most arguments against weak (ie pseudo) anonymity seem to be. Outside of banks, healthcare providers, and payment processors, I see little of benefit. Before bringing up any arguments that involve poor behavior or misinformation, please refresh yourself on the current state of Facebook (where nearly everyone is using their full name).

I already think twice before (and often decide against) using a service that requires my phone number. I will _never_ use Discord or Twitter (in my personal life at least) for this reason. Except for banks, liquor, and the pharmacy, I am almost certain to decline doing business rather than providing my ID.

I'm curious, do you take this same stance in meat space? Would you rather not know who your friends are and address them by a changing handle? Would you rather be given a pseudonymous name to use for the duration of your trip to the grocery store? Would you prefer to be delivered a new car every time you need to go somewhere so people can't associate you with a vehicle? Do you really have these anonymity requirements?

The claim is not baseless. There are strong technical reasons why identifying the components in your system is a good thing, and there are practical social reasons.

> I'm curious, do you take this same stance in meat space? Would you rather not know who your friends are and address them by a changing handle?

There are many people I'm friendly with that I know little about. They could very well be giving me fake information about their life. I don't see this as a problem.

> Would you rather be given a pseudonymous name to use for the duration of your trip to the grocery store?

Well in most cases I wouldn't give anyone any name at all. Why does the grocery store require my name?

> The claim is not baseless. There are strong technical reasons why identifying the components in your system is a good thing, and there are practical social reasons.

There are also strong technical reasons not to. And there are practical social reasons not to. As far as I can tell, you've provided essentially no argument supporting this general claim:

> The internet would be a better place if there were more identity requirements

We already have a society that identifies people when doing business. The burden of proof is on an anonymity advocate to demonstrate why that is harmful and should be changed. I may not have convinced you that having strong identity enables strong security and reduces spam (that is my argument). But it’s also not my problem if you aren’t aware of the nuances surrounding how security, privacy and anonymity work. You haven’t made any compelling argument as to why we don't need identity in cyberspace beyond a naive axiomatic assertion that “businesses don’t need them so they shouldn’t collect them” and some FUD level fear that strong identity is an Orwellian technology hell bent on ruining your life. There is so much nuance I don't feel like we’re doing the topic justice. There is a huge spectrum between “ad tech tracking everything you do” and “everyone looks like a spam bot”. The mindshare is heavily skewed toward spam bot because ad tech is abusive. You can have strong identity and privacy without invoking anonymity. You can be anonymous and still fall victim to phishing attempts and scams. Anonymity is not synonymous with security or privacy. Security means you know who you’re communicating with online so you can establish trust. Privacy means you don't need to share invasive personal details in the regular course of existing in society. Anonymity means nobody knows who you are. I want a society where my digital communication with other people is authenticated and a baseline of trust is established. Do you use a secure messenger app that has E2E encryption? Guess what, that depends on strong identity. You are not anonymous but you are private. I would take a secure and private society every time over an anonymous one that offers weak, if any, guarantees of security and/or privacy.

I work on a product that doesn't collect any PII. We made the decision very early on not to collect any information we don’t need because that’s literally not our business. I am deeply aware of the landscape on these topics. However, as a society we cannot run in a “normal meatspace anonymous cyberspace” mode. We need to bridge civil identity in a secure and private (those are fundamental human rights) way into the online era. That is the core focus of the product I’ve been working on. In reality people have identities whether they use them offline or online. The goal is to protect those identities so they cannot be abused, not remove them altogether.

So do you provide your full name, street address, phone number, drivers license, and social, to everyone you meet? And do you require that from everyone you wish to be friends with? Otherwise how do either party know the other is not providing false information? This is essentially what you are stating you are hoping for on the internet by allowing every company to request identity information.
> The internet would be a better place if there were more identity requirements SO LONG AS companies are not legally allowed to sell or transmit that information to advertisers or other third parties without explicit opt-in consent ideally on a per-use basis. Or simply at all

This is a pipe dream. The online world spans the globe and we can only enforce the law in our own respective countries.

And even if all countries were cooperative about enforcement, distributed communication tools already exist. The internet has always been a place where you can go to share your thoughts without worrying about what your family or friends think. I don't think that will change in our lifetime, if ever.

Anyway, the market can sort this out. If using an ID to authenticate your Twitter account makes Twitter more successful than its competitors, great! I would not count on it.

A fully anonymous society is also a pipe dream. It doesn't work.

You already provide your name and phone number and email to Twitter. You already identify yourself. We're talking about making that exchange more reliable and more secure...

I haven't called for a fully anonymous society. I said realistically we cannot force people to identify themselves across the world. And, once there is a breach of identities, we will be back to where we are now where we can't reliably sort out who's who. It is a pointless exercise that potentially enables authoritarian regimes to silence dissent indefinitely. No thanks.
> it may well make sense for us to enable them to restrict themselves in the future. In this world, Stripe could tell their customers that the business doesn't have access to the raw details

This sounds great -- I don't want to be handling sensitive data of users, and I don't want to give sensitive data to businesses. But I'd rather this be a separate Verification product, with different branding, docs, and UI, so users and businesses are all clear on what's happening to user data.

> subject to compliance requirements that mandate that they themselves possess or have access to the raw information

It's literally called "(K)now (Y)our (C)ustomer".

And such a short edit distance from CYA!
Very glad to see that 4th bullet point there. I really like the option of, as a business, being able to say "No, I want to know whether the ID matches their Name/Address, but I don't want to be able to access the image data".
Any plans to add developing countries, in particular the Philippines?
How are you going to handle EEA residents? It seems that the GDPR applies here. The only real solution I see is to have a separate EEA-based company.
Do you feel in doing this that you're making the web worse? As a business, you certainly have no obligation to be ethical, but doesn't it feel a bit strange as a person who presumably grew up with the web to be playing such a big role in harming the people who use it?
Emphasis mine.

> They may need or wish to perform additional checks on their side. Etc.

So they get all the data in the off chance that a Stripe customer might want to do something with the data aside from the basic “yeah our large global identity verification service says this person is legit.”

I’m not super clear what a company might “wish to” do with that data that isn’t served by the basic “this person is who they say they are” function. (Does Stripe need their clients to act as guinea pigs to see if the service actually works as intended? If their mysterious black box “wishes” turn up a case where this isn’t working as intended, are your customers required to share that data with you to ensure the overall reliability of the Stripe Identity service? Or do they just get to build a database of info they get from Stripe Identity?)

> While many of the businesses initially building on Identity wanted access to the raw information, it may well make sense for us to enable them to restrict themselves in the future.

Oh never mind, asked and answered! Just turn on the data hose to whoever has a website and will pay Stripe for identity data and maybe adjust it later if you catch some flak for this practice?

It’s kinda hilarious that the whole “people trust Stripe with their data” line is part of the sales pitch, as if this didn’t come across to me (a layperson) as a direct violation of that particular trust.