by yann_eu 2196 days ago
I'm Yann, one of the founders of Koyeb. Koyeb is a platform for developers and businesses to run serverless data processing apps in minutes.

We provide an easy-to-use platform to build production-grade workflows for all your data, including image, video, audio, and document processing.

To provide a little bit of context, we previously developed Scaleway (https://scaleway.com/), a European Cloud Service Provider, and initially started Koyeb around multi-cloud object storage (https://news.ycombinator.com/item?id=21005524). We are now going a step further: we are also trying to provide an easy way to process data and to orchestrate distributed processing from various sources.

Currently, we provide an S3-compatible API to push your data. You can implement processing workflows using ready-to-use integrations (https://www.koyeb.com/catalog) and store results on the cloud storage provider of your choice (e.g. GCP, Azure Blob, AWS S3, Vultr, DigitalOcean, Wasabi, Scaleway, or even Minio servers).

We're working on adding support for Docker containers and custom functions to let our users combine catalog integrations with their own code in workflows. We will also add support for new data sources to send, ingest, and import data from different services.

We of course take care of all the infrastructure management and scaling of the platform.

The platform is in an early access phase, and I'd love to hear your impressions and feedback.

Thanks a lot!

3 comments

I wasn't able to find a reference for the yaml schema faster than I can open the "add comment" page, so apologies if this is addressed by some docs somewhere:

Given that `steps:` is a list, isn't having `after: video-clipping` redundant, since it already comes after the video-clipping step?

The `after` attribute is there to let you express your processing logic. The steps list contains the processing actions, but they are not executed sequentially. A workflow can have multiple processing branches and run a series of processing steps on the result of a specific step.
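
To make the branching concrete, here is a minimal Python sketch of how a list of steps with `after` references can be resolved into an execution order. This is not our actual scheduler or schema: only `video-clipping` and the `after` key come from this thread, and the other step names and the `execution_waves` helper are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: the list order carries no meaning, only `after` does.
steps = [
    {"name": "video-clipping"},
    {"name": "thumbnail", "after": "video-clipping"},
    {"name": "transcode", "after": "video-clipping"},
    {"name": "watermark", "after": "transcode"},
]


def execution_waves(steps):
    """Group steps into waves; steps in the same wave do not depend on each other."""
    # Map each step name to the set of steps it has to wait for.
    graph = {s["name"]: ({s["after"]} if "after" in s else set()) for s in steps}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(steps)
print(waves)
# [['video-clipping'], ['thumbnail', 'transcode'], ['watermark']]
```

Here `thumbnail` and `transcode` both branch off the result of `video-clipping`, which is something a purely sequential list could not express.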
I'm curious how developing on it locally works compared with a staging server and prod, i.e. how do you make sure the workflows stay in sync and are easy to reason about?
We’re currently working on a deep Git integration where you basically push your updated workflow configuration along with the function code or the Docker tag reference.

We have some tooling to develop individual catalog integrations locally and to test that the integration with object storage works as expected. We plan to publicly release this tooling so all users can test their functions locally before using them in a workflow.

For workflow environments, you currently have to create one processing Stack for each environment, e.g. dev, staging, and prod. Later on, we want to spawn environments for each Git branch.
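
Until then, one way to keep the per-environment Stacks in sync is to generate them from a single shared definition. The sketch below is only an assumption about how you might script this on your side: the field names, the `media` prefix, and the `build_stacks` helper are hypothetical and not part of the platform.

```python
import copy

# Hypothetical shared workflow definition; field names are illustrative only.
BASE_WORKFLOW = {
    "steps": [
        {"name": "video-clipping"},
        {"name": "thumbnail", "after": "video-clipping"},
    ],
}

ENVIRONMENTS = ["dev", "staging", "prod"]


def build_stacks(base, environments, prefix="media"):
    """Return one independent copy of the workflow per environment."""
    stacks = {}
    for env in environments:
        stack = copy.deepcopy(base)  # deep copy so edits to one env never leak into another
        stack["name"] = f"{prefix}-{env}"
        stacks[env] = stack
    return stacks


stacks = build_stacks(BASE_WORKFLOW, ENVIRONMENTS)
print(sorted(s["name"] for s in stacks.values()))
```

Every environment then starts from the same steps and differs only in the Stack name, which keeps dev, staging, and prod easy to compare.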

How does this compare to the Serverless framework?
We handle both multi-cloud processing and storage as a managed platform, whereas the Serverless Framework simply lets you configure and deploy functions on the main cloud service providers.

We allow you to build, deploy, and run (i.e. we operate the infrastructure) processing workflows using ready-to-use integrations, containers, or custom functions. We also provide a multi-cloud storage layer: you can push and use data stored on multiple cloud storage providers through a simple S3 interface instead of having to deal with each object storage provider's implementation.

We also plan to be compatible with the Serverless Framework for the custom functions part.