by drewkim 1588 days ago
You nailed it!

The first guarantee is that nobody manually goes in and pokes around your account details, since the process you've described happens entirely programmatically.

Now, we could program our system to do things other than what's mentioned. However, we're quite uninterested in (actually, emphatically against) ruining our trust/reputation with customers (plus the general public), given our dependence on such relationships and our desire to succeed as a company. All that's to say, the second guarantee is that we won't touch such sensitive data unless the user gives us permission to do so.

An example of when we might need to is if the user wants to pay their utility bill using stored payment options instead of submitting payment information. Even in this case, there won't be human eyes on this data; only our Python backend will be interacting with it.


Take this as constructive feedback.

I don't doubt your intentions, but these guarantees don't carry enough weight relative to the sensitivity of the data that you will safeguard. Despite the process happening programmatically, developers will still have access to the backend where this occurs. Who has access to this backend? What's stopping any of your engineers from peeking at the database where the credentials are stored? Is this data encrypted at rest and in transit? What sort of information is this process logging to first-party or third-party services? Will the code be audited? What sort of compliance certifications are you planning to obtain?

Maybe you do have answers to these questions; if so, I suggest that you communicate how credentials are safeguarded. The guarantees that you mention in this comment don't inspire confidence because a) they can't be taken at face value and b) they make me doubt that you are doing the due diligence required to manage this data.

Take a look at these examples of companies supporting their claimed guarantees:

* https://1password.com/soc/

* https://plaid.com/safety/

Thanks for the feedback! To answer your questions:

- Myself and my co-founder

- Credentials aren't stored in plaintext and the encryption key isn't universally available; "peeking" at the db is quite difficult

- Data (I'm assuming you mean credentials) is encrypted at rest and in transit

- Only business logic and errors are logged: e.g. when processes are completed and why things are breaking

- Yes, eventually

- Definitely ISO 27701 & SOC 2, perhaps others
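A minimal sketch of what "only business logic and errors are logged" could look like in a Python backend, assuming the standard `logging` module. The `RedactingFilter` class, the `redact` helper, and the list of sensitive field names are hypothetical illustrations, not anything taken from this thread or from the product's actual code:

```python
import logging
import re

# Hypothetical set of field names treated as sensitive; not from the thread.
SENSITIVE_KEYS = ("password", "username", "card_number", "token")

_PATTERN = re.compile(
    r"(?P<key>" + "|".join(SENSITIVE_KEYS) + r")=(?P<value>\S+)",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value of any sensitive key=value pair with a placeholder."""
    return _PATTERN.sub(lambda m: f"{m.group('key')}=[REDACTED]", text)


class RedactingFilter(logging.Filter):
    """Scrub sensitive key=value pairs from a record before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already fully formatted
        return True


logger = logging.getLogger("billing")
handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The event is still logged, but the credential values never reach the handler.
logger.info("login failed for username=%s password=%s", "alice", "hunter2")
```

A filter like this is only a safety net; it catches `key=value` patterns and nothing else, so the stronger approach is to never pass credentials to logging calls in the first place.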

Our process for safeguarding credentials is mentioned further down in the thread.

I'm not sure what further guarantees we can give to inspire confidence other than statements taken at face value. We don't have the scale or resources to undergo rigorous third-party auditing at the moment. On the other hand, one of the first conversations my co-founder and I had was about hiring a security engineer as soon as we could afford one; we definitely take the matter seriously. Do you have any other ideas for how we can showcase our commitment to security/privacy other than "trust us"? I agree it's not the best method, but I'm unsure of the alternatives.

This should be written down somewhere! I glanced through the website and I couldn't find any mention of the security/safety measures taken other than the UI screenshot where it says "securely connect your utility account".

If I were interested in purchasing this service, I would want to know how much I can trust you with my credentials. Perhaps having a page or section in the docs that explains the security measures would be an improvement. There are other ideas in another comment similar to this one.

You could let the utility companies, which users already "trust", evaluate and then recommend or sign off on your engine? Or just have the utility embed your engine and bill them?

While I respect your current intentions, I cannot take you at your word in perpetuity because I understand the pressures that come from:

1. Running such a business as an entrepreneur.

2. Running such a system as an engineer.

As an entrepreneur, are you promising your customers clean data regardless of the source? Are you promising AI magic? Are you promising a maximum failure rate?

From an engineering standpoint, how will you deal with portals changing their frontends and breaking your scrapers at any time?
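One common way to at least notice this kind of breakage is to validate every scraped record against the shape you expect and fail loudly when it doesn't match, rather than passing bad data downstream. A minimal sketch, assuming scraped values arrive as strings in a dict; the `Bill` type, the field names, and the `ScraperBroken` exception are hypothetical:

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


class ScraperBroken(Exception):
    """Raised when scraped output no longer matches the expected shape."""


@dataclass(frozen=True)
class Bill:
    account_id: str
    amount_due: Decimal


# Hypothetical field names; a real portal would have its own.
REQUIRED_FIELDS = ("account_id", "amount_due")


def parse_bill(raw: dict) -> Bill:
    """Validate one scraped record, failing loudly instead of guessing."""
    missing = [f for f in REQUIRED_FIELDS if not raw.get(f)]
    if missing:
        raise ScraperBroken(f"missing fields: {missing}")
    try:
        # Strip currency formatting before parsing the amount.
        amount = Decimal(raw["amount_due"].replace("$", "").replace(",", ""))
    except InvalidOperation as exc:
        raise ScraperBroken(f"unparseable amount: {raw['amount_due']!r}") from exc
    return Bill(account_id=raw["account_id"], amount_due=amount)
```

This only detects a changed frontend; it does not repair the scraper, so the question of who fixes it, and how, still stands.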

I cannot know what you will do in response to these pressures, but I do know that the temptation will exist to build a system that puts a human in the loop to manually collect data from portals, to manually evaluate scrapers, to manually sift through the data and figure out what kind of Machine Learning models you can use to make your business function more effectively.

Have you considered sending users to the website to update payment information? Or if not, do you perform the PCI auditing process for your handling of payment information through your systems?

I don’t personally need the answers to this pair of questions, but I wanted to put them on your radar if they’re not already. I trust Plaid far enough to scrape account providers for me, but I do not trust Plaid far enough to give it my payment details. Even if Plaid could theoretically construct them, that’s just not the relationship I want with a data conduit provider.