| I could go pretty deep here, so let me know if I should elaborate on anything. The backend is a Ruby on Rails application that serves the frontend app's API. It interfaces with the database (user tables and so on) and handles all the "state" of the app. The serverless stuff has changed over the months, but primarily it handles the things I don't want Rails to handle: file uploads, video processing, and transcription. First, huge props to the Mux (https://mux.com) team and product. I cannot express how easy it has been to build video (and audio) products. File uploads go to AWS/GCP (depending on a few things) and then trigger a serverless callback to Mux. Mux was the fastest way we found to turn an arbitrary video file (mp4/mov/etc) into HLS format for quick streaming. Then once the video is uploaded, we have another serverless callback that sends the video for transcription using Assembly AI (https://assemblyai.com). There are a ton of transcription services and they vary dramatically in quality, depending on the media content. I believe Google/Amazon services were largely built around the need to process phone calls, so unless you pay for their "enhanced" models, the quality is surprisingly bad (and surprisingly slow). I *highly highly* recommend Mux and Assembly AI if you are doing any video or transcription work. To get an immediate update to the end user, we actually process two transcript requests: one for just the first 60 seconds, and one for the remainder of the video. This lets us render a preview transcript within the first 15-20 seconds. We also have a serverless pipeline for generating the videos, but I won't go into that unless you're interested. In short, a serverless function kicks off a Docker instance running on ECS. The requests to the serverless apps (mostly Node) include a callback to the Rails app, which then updates the end user's state over websockets (which are very easy to use with Rails' ActionCable). |
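
To make the two-request transcript trick a bit more concrete, here is a minimal sketch of how the split could be planned in one of the Node functions. Everything named here (`TranscriptSegment`, `planTranscriptSegments`, the `previewSeconds` parameter) is hypothetical and for illustration only. It is not the actual code and not Assembly AI's API; how each segment is then cut or submitted for transcription is left out, since the comment above doesn't describe it.

```typescript
// Hypothetical shape for one transcription request (not a real Assembly AI type).
interface TranscriptSegment {
  label: "preview" | "remainder";
  startSeconds: number;
  endSeconds: number;
}

// Plan the two requests: a short preview first, then the rest of the video.
// The 60-second default mirrors the figure mentioned above.
function planTranscriptSegments(
  durationSeconds: number,
  previewSeconds: number = 60
): TranscriptSegment[] {
  if (!Number.isFinite(durationSeconds) || durationSeconds <= 0) {
    throw new RangeError("durationSeconds must be a positive number");
  }

  // Videos shorter than the preview window only need a single request.
  const previewEnd = Math.min(previewSeconds, durationSeconds);
  const segments: TranscriptSegment[] = [
    { label: "preview", startSeconds: 0, endSeconds: previewEnd },
  ];

  if (durationSeconds > previewEnd) {
    segments.push({
      label: "remainder",
      startSeconds: previewEnd,
      endSeconds: durationSeconds,
    });
  }
  return segments;
}
```

Each planned segment would go out as its own transcription job, so the short preview can come back and be pushed to the user while the longer remainder is still processing.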