Hacker News
Ask HN: What would be involved in building your own YouTube clone?
26 points by mrprogrammerguy 1237 days ago
I don't exactly want to clone youtube. But let's say I wanted to make a website where people can upload videos and others can play them in their browser. What would be actually involved in doing this?

I know quite a bit about backend and devops, but I have to admit I'd have no idea how to do this if I were tasked with it.

Also would this even be possible without flash or third party video players? I know youtube used to have an html5 version, but I can't recall what happened to that.

My question is more on the programming level: what you would use to build the actual video player, and how the video player would connect to the backend.

Not so much an infrastructure question.

13 comments

Moderation. The hard part is Moderation.

Especially the larger you scale, the less this becomes a technical problem, and more of a 'how will people abuse this' problem.

Second hardest part? Paying for it. Streaming 1080p video can get expensive, especially if it's unpaid. Costs climb steeply with video quality: 4K has four times the pixels of 1080p and needs significantly more bandwidth.

There are tons of ways to stream video to people. That's the easy part.

Is the first part actually true though? YouTube goes well beyond just moderating actually dangerous or undeniably extremist (think ISIS videos) viewpoints. And to what profit? Have any major platforms failed due to lax moderation policies?
> Have any major platforms failed due to lax moderation policies?

Yes, that's why you haven't heard of them.

They fail before they get big enough to be popular.

File sharing platforms - defined broadly enough to include Youtube - will attract adversaries with asymmetric incentives.

It will be easy for people trying to upload illegal content to upload illegal content to the degree you make it easy for people to upload content. [1]

When your file sharing server starts serving child-pornography - and it will - law enforcement will contact you and your best option will be shutting down.

You will never get to the point where radicalization videos are the problem.

[1]: What is illegal depends on which legal jurisdiction. What can legally be said about Atatürk is different in Turkey than Mexico. Libel in the UK is more broad than the US. Pictures of naked humans engaged in activity vary in legality around the world.

Sure, you don't hear about them because they were so full of spam no one bothered with them.
> Have any major platforms failed due to lax moderation policies?

Tumblr

YikYak
> website where people can upload videos and others can play them in their browser. What would be actually involved in doing this?

The minimum would be a stateless server handling <input type=file>, storing it in the file system and responding with the path exposed by nginx or something. The user agent will take care of the playback, be it Firefox, VLC or mpv.
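A minimal sketch of that storage step, assuming the upload has already been read into memory as bytes. The names `store_upload` and `PUBLIC_PREFIX` are made up for illustration, and the sketch assumes nginx is configured to serve the media directory at that prefix.

```python
import hashlib
from pathlib import Path

# Hypothetical layout: nginx is assumed to serve the media directory at this prefix.
PUBLIC_PREFIX = "/media"


def store_upload(data: bytes, original_name: str, media_root: Path) -> str:
    """Save uploaded bytes under a content-derived name and return the public path."""
    # Keep only the extension of the client-supplied name; the rest is untrusted.
    suffix = Path(original_name).suffix.lower()
    # Naming the file after a hash of its content avoids collisions and path tricks.
    digest = hashlib.sha256(data).hexdigest()[:16]
    media_root.mkdir(parents=True, exist_ok=True)
    target = media_root / f"{digest}{suffix}"
    target.write_bytes(data)
    return f"{PUBLIC_PREFIX}/{target.name}"
```

The handler keeps no state of its own: everything it knows ends up in the file system, and the returned path is what you would hand back to the browser.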

Now if your average users upload high bitrate videos but have shitty bandwidth, you'll need to transcode them down to lower resolution or higher compression to save them from rebuffering. Still, no client-side scripting needed.
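One way to sketch the transcoding step is to build one ffmpeg command per target resolution. The ladder below (heights and bitrates) and the function names are assumptions for illustration, not recommended values; it also assumes an ffmpeg build with libx264.

```python
# Hypothetical rendition ladder: (name, height in pixels, video bitrate).
LADDER = [("360p", 360, "800k"), ("720p", 720, "2500k")]


def ffmpeg_args(src: str, dst: str, height: int, bitrate: str) -> list[str]:
    """Build an ffmpeg argument list that downscales src to the given height."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{height}",   # keep the aspect ratio, force an even width
        "-c:v", "libx264", "-b:v", bitrate,
        "-c:a", "aac", "-b:a", "128k",
        "-movflags", "+faststart",     # put the index first so playback can start early
        dst,
    ]


def plan_transcodes(src: str, stem: str) -> dict[str, list[str]]:
    """Return one command per rendition in the ladder, keyed by rendition name."""
    return {
        name: ffmpeg_args(src, f"{stem}_{name}.mp4", height, bitrate)
        for name, height, bitrate in LADDER
    }
```

Each argument list can be passed to `subprocess.run(args, check=True)` from a background worker, so the upload request itself does not block on encoding.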

In case their connection quality is unstable, HLS finally becomes necessary for on-the-fly adjustment of playback bitrate. This is the secondary purpose of JS viewers around the web these days (the primary one being DRM). The other possibility with a custom viewer is to lighten the server load by enabling P2P transferring, as PeerTube does. Realistically, you'd either deploy the bare-bones one I mentioned in the beginning or set up a PeerTube instance; anything in between is probably a waste of engineering effort.
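To make the bitrate switching concrete: with HLS the player reads a master playlist that lists each rendition with its bandwidth, then picks among them as the connection changes. A rough sketch of generating such a playlist follows; the function name and the variant URIs in the test data are placeholders.

```python
def master_playlist(variants: list[tuple[int, int, int, str]]) -> str:
    """Render an HLS master playlist.

    Each variant is (bandwidth in bits per second, width, height, media playlist URI).
    """
    lines = ["#EXTM3U"]
    # List the lowest bandwidth first; the player chooses among the entries.
    for bandwidth, width, height, uri in sorted(variants):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}"
        )
        lines.append(uri)
    return "\n".join(lines) + "\n"
```

The per-rendition media playlists and segments would still have to be produced by the transcoder; this only shows the index that lets the player choose.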

Thanks this is a very good reply!
Funny enough, this is a system design question asked in many interviews:

https://www.educative.io/courses/grokking-modern-system-desi...

https://leetcode.com/discuss/interview-question/system-desig...

I've never designed such a system, but these are "baseline" solutions. It'd be interesting to see if anyone's actually implemented some of these canonical system design solutions and seen how well they actually scaled.

Thanks for your answer!
Depends what sort of thing you want to do.

If you want something popular, focus on getting people to use the platform. If you want more of a controllable platform, you'll need to build a platform with video transcoding, playback, stats etc.

You could always set up a PeerTube instance https://joinpeertube.org/ and/or customize the code behind that to your needs

I don't think there's much of a challenge in that, except (most critically) for scaling it..

Since web browsers now have such excellent support for video codecs, you really only need to receive the files and maybe re-encode them to the most well supported format..

HTML has the <video> tag so you literally just link to that from your html, you don't need to "build a video player", browsers come with that already. And the data transport is just HTTP or HTTPS if you want to get fancy..
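For what "just HTTP" means in practice: when the user seeks, the browser typically asks for a byte range of the file, and the server answers with 206 Partial Content. A static file server such as nginx already does this for you; the sketch below only illustrates the parsing, and `parse_range` and `content_range` are hypothetical helper names.

```python
from typing import Optional, Tuple


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return (start, end), inclusive, for a single 'bytes=' range, or None if unusable."""
    if not header.startswith("bytes=") or "," in header:
        return None  # this sketch ignores other units and multi-range requests
    first, sep, last = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form 'bytes=-N' means the last N bytes of the file.
            count = int(last)
            if count <= 0:
                return None
            start, end = max(size - count, 0), size - 1
        else:
            start = int(first)
            end = min(int(last), size - 1) if last else size - 1
    except ValueError:
        return None
    if start > end or start >= size:
        return None
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    """Value of the Content-Range header for a 206 response."""
    return f"bytes {start}-{end}/{size}"
```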

Everything else is ops-related, and that is where it gets difficult.. Actually, it only gets difficult if your platform succeeds, but then it becomes majorly expensive and difficult.. Then you have to manage instances all around the world, probably negotiate with ISPs to place caches inside their datacenters, manage infrastructure to distribute videos on demand, manage abuse (harmful/illegal content) and oh my god it'd be a major pain and challenge..

Shameless plug, I wrote a minimalistic audio-player, it's basically the same thing, except I didn't use the <video> tag.. Look at the server, there's pretty much nothing there except a bit of plumbing to query a sql database. https://github.com/DusteDdk/dstream

You may find some of HS' posts on YouTube's [early] architecture helpful: http://highscalability.com/display/Search?moduleId=4876569&s...

Or the Designing Instagram[0] and Designing Netflix[1] exercises or YouTube Scalability in 30 Minutes[2]

Also found this[3] which may be helpful

--------------

[0] http://highscalability.com/blog/2022/1/11/designing-instagra...

[1] http://highscalability.com/blog/2021/12/13/designing-netflix...

[2] http://highscalability.com/blog/2012/3/26/7-years-of-youtube...

[3] https://scaleyourapp.com/youtube-architecture-how-does-it-se...

Google :) https://www.w3schools.com/tags/tryit.asp?filename=tryhtml5_v...

Let's say Rails.

1. Add Devise so users are users.

2. Add a Video controller/model.

3. Connect the data field as Active Storage, backed by Amazon.

4. Have a video tag with a link to the uploaded file stored on Amazon.

Wow that's crazy. Feels a bit overly simple. So why do people use third party video players on their website then?
To embed advertising, avoid simple downloading of video content, and add bells and whistles like timeline preview thumbs, custom context menus, etc.. I guess.
Before writing any code, you'll probably want to lobby the government to completely overhaul the broken copyright and DMCA laws first. Then design the hosting around a decentralized infrastructure like PeerTube.
Mux or similar to serve videos with the right codecs for the right end user devices.

And only show comments from the people the user is following/friends with. Kills all spammers in the bud in one fell swoop.
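The comment filter is simple enough to sketch. The `Comment` type and `visible_comments` function are invented for illustration, and including the viewer's own comments is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    author: str
    text: str


def visible_comments(
    comments: list[Comment], viewer: str, following: set[str]
) -> list[Comment]:
    """Keep only comments written by the viewer or by accounts the viewer follows."""
    allowed = following | {viewer}
    return [comment for comment in comments if comment.author in allowed]
```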

Honestly, money, and being able to manage a budget and partners.

Start with a way to distribute video content as cost effectively as you can. CDN and transfer will be your most expensive cost factors.

Take into account that you might need to have signed URLs for specific users in order to comply with content distribution rights.
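A generic sketch of what a signed URL can look like, using an HMAC over the path, the user and an expiry time. Real CDNs each have their own signing scheme, so the parameter names (`user`, `expires`, `sig`) and both function names here are assumptions rather than any vendor's API.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(path: str, user_id: str, expires: int, secret: bytes) -> str:
    message = f"{path}|{user_id}|{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, user_id: str, expires: int, secret: bytes) -> str:
    """Return the path plus query parameters binding it to one user and an expiry."""
    query = urlencode(
        {"user": user_id, "expires": expires, "sig": _signature(path, user_id, expires, secret)}
    )
    return f"{path}?{query}"


def verify_url(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Check that the signature matches and the link has not expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        user_id = query["user"][0]
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    expected = _signature(parts.path, user_id, expires, secret)
    # Constant-time comparison, so the check does not leak how much of the signature matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Changing the user, the path or the expiry invalidates the signature, so a link handed to one user cannot simply be edited and reused.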

Once you have decided on a CDN/traffic partner, you need to tailor your software to it, and it will become the most crucial and expensive technical debt you'll have.

You would need some ingestion that takes in raw video from your creators and makes it playable by as many people as possible. That would probably lead to HLS, CMAF, or similar.

Then you'd need some server to serve those assets. That would lead to a CDN or similar.

Then you'd need a player. HTML seems to play back mpeg fine but you'll probably need a player for HLS. video.js seems fine here.

Source: built this for $dayjob

It says front end to YouTube, but it will give you a good idea of what's required: https://github.com/TeamPiped/Piped
Thanks :)
Copyright offences on an industrial scale.
lol