by everythingctl 898 days ago
Cool toy and a nice piece for the CV perhaps, but it is difficult to take it seriously if you refuse to offer source code or an implementable specification.

I would give you the benefit of the doubt that it might just be code shyness or perfectionism about something in its early stages, but it looks like the last codec you developed (“HALIC”) is still only available as Windows binaries after a year.

I struggle to see an upside to withholding source code in a world awash with performant open source media codecs.


Maybe it’s just me, but every lossless codec that’s:

1. Not FLAC

2. Not as open-source as FLAC

comes across as a patent play.

FLAC is excellent and widely supported (and where it’s not supported some new at-least-open-enough codec will also not be supported). I have yet to see a compelling argument for lossless audio encoders that are not FLAC.

FLAC’s compression algorithm was pretty much garbage when it came out, and is much worse now compared to the state of the art. Even mp3 + gzip residuals would probably compress better.

FLAC doesn’t support more modern sampling formats (e.g. floating point for mastering), or complex multi channel compression for surround sound formats.

There just isn’t something better (and free) to replace it yet.

> There just isn’t something better (and free) to replace it yet.

Apple's ALAC (Apple Lossless Audio Codec) format is an open-source and patent-free alternative. I believe both ALAC and FLAC support up to 8 channels of audio, which allows them to support 5.1 and 7.1 surround. https://en.wikipedia.org/wiki/Apple_Lossless_Audio_Codec#His...

These are distribution formats, so I'd be surprised if there were demand for floating-point audio support. And in contexts where floating point audio is used, audio size is not really a problem.

When FLAC compresses stereo audio, it does a diff of the left and right channels and compresses that. This often results in a 2x additional compression ratio because the left and right channels are tightly correlated.
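A minimal sketch of the left/right difference idea described above, assuming integer PCM samples. The function names are made up for illustration, and the mid/side arithmetic is one common lossless formulation, not a copy of any particular encoder's source:

```python
def to_mid_side(left, right):
    """Turn left/right integer samples into a mid (average) and side (difference) pair."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]  # floor of the average
    side = [l - r for l, r in zip(left, right)]        # small when channels are correlated
    return mid, side


def from_mid_side(mid, side):
    """Invert to_mid_side exactly; no information is lost."""
    left, right = [], []
    for m, s in zip(mid, side):
        # l + r and l - r always share parity, so the bit dropped by >> 1
        # can be recovered from the low bit of the side value.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


left = [1000, 1004, -1010, 995]
right = [998, 1001, -1013, 997]
mid, side = to_mid_side(left, right)
assert from_mid_side(mid, side) == (left, right)
```

The side values stay small as long as the two channels track each other, which is what lets a later entropy-coding stage spend fewer bits on them.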

Unless things have changed substantially and I missed it, FLAC does not do similar tricks for other multichannel audio modes. Meaning that for surround sound, each channel is independently compressed and it is unable to exploit signal correlation between channels.

Proprietary formats like Dolby on the other hand do support rather intelligent handling of multichannel modes.

FLAC is not solely a distribution format. Indeed as a distribution format it sucks in a number of ways. It is chiefly used as an archival format, and would in fact be ideal as a mastering format if these deficiencies could be addressed.

In what ways does flac suck for distribution? All the music I download from Bandcamp is in that format, it works great for me.
It could be much smaller, maybe 2-3x better compression. Better support for surround sound / multichannel audio. If an AAC stream were used for the lossy predictive stage, then existing hardware acceleration could be used for energy efficient playback.
> FLAC’s compression algorithm was pretty much garbage when it came out, and is much worse now compared to the state of the art. Even mp3 + gzip residuals would probably compress better.

MP3 is a lossy format so I would practically guarantee that you’d end up with a smaller file but that’s not the purpose of FLAC. Lossless encoding makes a file smaller than WAV while still being the same data.

> e.g. floating point for mastering

I’m 0% sold on floating point for mastering. 32-bit yes, but anyone who’s played a video game can tell you about those flickering textures, and those are caused not by bad floating point calculations but by good floating point calculations (the bad part is putting textures “on top” of each other at the same coordinates). Floating point math is “fast” but not accurate. Why would anyone want that for audio? (Not trying to bash here; I’m genuinely puzzled and would love some knowledgeable insight.)

> MP3 is a lossy format so I would practically guarantee that you’d end up with a smaller file but that’s not the purpose of FLAC. Lossless encoding makes a file smaller than WAV while still being the same data.

You misunderstood what you are replying to. FLAC works by running a lossy compression pass, and then LZ encoding the residual. The better the lossy pass, the less entropy in the residual and the smaller it compresses. FLAC’s lossy compressor pass was shit when it came out, and hasn’t gotten any better.

Flickering textures is caused by truncation and wouldn’t be any better with integer math. The same issues apply (and are solved the same way, with explicit biases; flickering shouldn’t be a thing in any quality game engine).

Floating point math is largely desired for mastering because compression (technical term overloaded meaning! Compression here means something totally different than above) results in samples having vastly different dynamic ranges. If rescaled onto the same basis, one would necessarily lose a lot of precision to truncation in intermediate calculations. Using floating point with sufficient precision makes this a non-concern.
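A toy illustration of the truncation point above, assuming a single 16-bit-range sample that is attenuated and later brought back up. The helper names and the factor of 1000 are invented for the example; real mastering chains are far more involved:

```python
def attenuate_then_restore_int(sample: int, factor: int) -> int:
    """Scale down and back up while storing the intermediate value as an integer."""
    quiet = sample // factor   # the fractional part is truncated here
    return quiet * factor


def attenuate_then_restore_float(sample: int, factor: int) -> int:
    """Same round trip, but the intermediate value is kept as a float."""
    quiet = sample / factor    # the fractional part survives in the mantissa
    return round(quiet * factor)


sample = 12345
print(attenuate_then_restore_int(sample, 1000))    # low-order detail is gone
print(attenuate_then_restore_float(sample, 1000))  # original value comes back
```

With an integer intermediate the low-order digits are lost for good, while the float intermediate keeps enough precision to recover the original sample in this case.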

> FLAC works by running a lossy compression pass, and then LZ encoding the residual.

Since when does FLAC run a lossy pass? You can recover the original soundwave from a FLAC file, you can't do the same with an MP3.

I'm pretty sure FLAC does not run a lossy compression pass.

Flickering textures in game engines are likely due to z-fighting, unless you're referring to some other type of flickering.

If you're looking to preserve as much detail as possible from your masters then floating points make sense. But it's really overkill.

> The FLAC encoding algorithm consists of multiple stages. In the first stage, the input audio is split into blocks. If the audio contains multiple channels, each channel is encoded separately as a subblock. The encoder then tries to find a good mathematical approximation of the block, either by fitting a simple polynomial, or through general linear predictive coding. A description of the approximation, which is only a few bytes in length, is then written. Finally, the difference between the approximation and the input, called residual, is encoded using Rice coding.
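The stages in that quote can be sketched roughly as follows, assuming a fixed second-order polynomial predictor and a simple Rice code written out as a string of "0"/"1" characters. The function names, the bit convention, and the parameter `k` are illustrative assumptions, not the actual FLAC bitstream layout:

```python
def predict_residuals(samples):
    """Predict each sample as 2*s[n-1] - s[n-2]; keep only the prediction error."""
    warmup = samples[:2]  # stored verbatim so the decoder can start predicting
    residuals = [samples[n] - (2 * samples[n - 1] - samples[n - 2])
                 for n in range(2, len(samples))]
    return warmup, residuals


def restore(warmup, residuals):
    """Re-run the same predictor and add the residuals back: exact reconstruction."""
    out = list(warmup)
    for r in residuals:
        out.append(r + 2 * out[-1] - out[-2])
    return out


def zigzag(v):
    """Map signed to unsigned: 0, -1, 1, -2, 2 ... -> 0, 1, 2, 3, 4 ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Unary-coded quotient, a terminating '0', then k remainder bits per value."""
    bits = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0")
        if k:
            bits.append(format(remainder, "0{}b".format(k)))
    return "".join(bits)


def rice_decode(bits, k, count):
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating '0'
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values


samples = [0, 3, 7, 12, 18, 23, 27, 30]
warmup, residuals = predict_residuals(samples)
bits = rice_encode(residuals, k=1)
decoded = restore(warmup, rice_decode(bits, 1, len(residuals)))
assert decoded == samples
```

In this sketch the predictor on its own only approximates the signal, but storing the residuals alongside it makes the round trip exact.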

Linear predictor is a form of lossy encoding.

Yes exactly. What you’re saying lines up with what I’ve learned through experience.

> If you're looking to preserve as much detail as possible from your masters then floating points make sense.

I’ve been searching for hours and gotten nothing more than the classic floats vs ints handwaving. Can you explain what you know about why using floats preserves detail?

what do you suggest instead?
I suggest that people who care enough about these things (not me, I’m just informed about it), come together and make a new lossless encoder format that has feature parity with the proprietary/“professional” codecs.
what codec are you suggesting is better, and how much better is it? unless encoders have wildly improved, alac from apple is not better than flac. ape and wavpack seem to do a bit better, but not much
Support for >8 channels led me to use WavPack instead of FLAC.
What's the use case?
You are right about this. But there are things I should add to Halic and Halac. When I complete them and realize that it will really be used by someones, it will of course be open source.
One of the cool things about open source is that other people can do that for you! I've released a few bits of (rarely-used) software to open-source and been pleasantly surprised when people contribute. It helps to have a visible todo list so that new contributors know what to aim for.

By the way, there will always be things to add! That feeling should not stop you from putting the source out there - you will still own it (you can license the code any way you like!) and you can choose what contributions make it in to your source.

From the encode.su thread and now the HA thread, you've clearly gotten people excited, and I think that by itself means that people will be eager to try these out. Lossless codecs have a fairly low barrier for entry: you can use them without worrying about data loss by verifying that the decoder returns the original data, then just toss the originals and keep the matching decoder. So, it should be easy to get people started using the technology.
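The round-trip check described above could look something like this. It is only a sketch: `zlib` stands in for the codec because HALAC/HALIC expose no API that I know of, and `verify_roundtrip` is a made-up helper name:

```python
import hashlib
import zlib


def verify_roundtrip(original: bytes, encode, decode) -> bool:
    """Encode, decode, and confirm the decoder returns the original bytes."""
    restored = decode(encode(original))
    return hashlib.sha256(restored).digest() == hashlib.sha256(original).digest()


# Stand-in data and codec; swap in calls to the real encoder/decoder binaries.
pcm = bytes(range(256)) * 64
if verify_roundtrip(pcm, zlib.compress, zlib.decompress):
    print("decoder output matches; safe to keep only the compressed copy")
```

Only once the check passes would you discard the original and keep the compressed file together with the matching decoder.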

Open-sourcing your projects could lead to some really interesting applications: for example, delivering lossless images on the internet is a very common need, and a WASM build of your decoder could serve as a very convenient way to serve HALIC images to web browsers directly. Some sites are already using formats like BPG in this way.

> One of the cool things about open source is that other people can do that for you!

This is a very valid point, but we should all recognise that some people⁰ explicitly don't want that for various reasons, at least not until they've got the project to a certain point in their own plans. Even some who have released other projects already prefer to keep their new toy more to themselves and only want more open discourse once they are satisfied their core itch is sufficiently scratched. Open source is usually a great answer/solution, but it is not always the best one for some people/projects.

Even once open, “open source not open contribution”¹ seems to be becoming more popular as a stated position² for projects, sometimes for much the same reasons, sometimes for (future) licensing control, sometimes both.

--

[0] I'm talking about individual people specifically here, not groups, especially not commercial entities: the reasons for staying closed initially/forever can be very different away from an individual's passion project.

[1] “you are free to do what you want, but I/we want to keep my/our primary fork fully ours”.

[2] it has been the de facto position for many projects since a long time before this phrase was coined.

> I/we want to keep my/our primary fork fully ours

The "primary" fork is the one that the community decides it to be, not what the author "wants". Does it really matter what is the "primary fork" for those working on something to "scratch their own itch"?

Hence I said my/our primary fork, not the primary fork.

If I were in the position of releasing something⁰: the community, should one or more coalesce around a work, can do/say what it likes, but my primary fork is what I say it is¹. It might be theirs, it might be not. I might consider myself part of that community, or not.

It should be noted that possibility of “the community” or other individual/team/etc taking a “we are the captain now” position (rather than “this is great, look what we've done with it too” which I would consider much more healthy and friendly) is what puts some people off opening their toy projects, at all or just until they have them to a point they are happy with or happy letting go at.

> Does it really matter what is the "primary fork" for those working on something to "scratch their own itch"?

It may do further down the line, if something bigger than just the scratching comes from the idea, or if the creator is particularly concerned about acknowledgement of their position as the originator².

--

[0] I'm not ATM. I have many ideas/plans, some of which I've mused on for many years, but I'm lacking in time/organisation/etc!

[1] That sounds a lot more combative than I intend, but trying to reword just makes it too long-winded/vague/weird/other

[2] I wouldn't be, but I imagine others would. Feelings on such matters vary widely, and rightly so.

I don’t get it. What the community does has no bearing on your fork, so why do you care? You can open source it and just not accept patches. Community development will end up happening somewhere else, but who cares?
Whatever position you are trying to argue seems to be so antithetical to Free Software, I'd say those sharing this view are completely missing the point of openness and would be better off by keeping all their work closed instead.

> other individual/team/etc taking a “we are the captain now” position rather than “this is great, look what we've done with it too”

The scenario is that someone opens up a project but says "I am not going to take any external contribution". Then someone else finds it interesting, forks it, that fork starts receiving attention and the original developer thinks they are entitled to control the direction of the fork? Is this really about "scratching your own itch" or is this some thinly-veiled control issue?

I'm sorry, after you open it up you can't have it both ways. Either it is open and other people are free to do whatever they want with it, or you say "it's mine!" and people will have to respect whatever conditions you impose to access/use/modify it.

> if the creator is particularly concerned about acknowledgement of their position as the originator.

That is what copyright and the patent system are for: those who worry about being rewarded for their initial idea and creation.

If one is keeping their work to themselves out of fear of losing "recognition", they should look into the guarantees and rights given by their legal systems, because "feelings on this matter" are not going to save them from anything.

> When I complete them and realize that it will really be used by someones, it will of course be open source

There is a chicken and egg problem with this strategy: Few people will want to, or even be able to, use this unless it’s open source and freely licensed.

The alternatives are mature, open or mostly open, and widely implemented. Minor improvements in speed aren’t enough to get everyone to put up with any difficulties in getting it to work. The only way to get adoption would be to make it completely open and as easy as possible to integrate everywhere.

It’s a cool project, but the reality is that keeping it closed until it’s “completed” will prevent adoption.

Hakan: if you are going to go open source just do it now. You have nothing to gain and much to lose by keeping it closed.
Maybe he is just waiting for the right investor that has a purpose for the codec so he can reimburse his time investment.

Making it open source now would just ruin that leverage.

I am with you OP

Looking at history, it seems trying to build a business model around a codec doesn't tend to work out very well. It's not clear what the investor would be investing in. It's a better horse.
When I bring my work to a certain stage, I would like to deliver it to a team that can claim it. However, I want to see how much I can improve my work alone.
Being open source doesn't mean you have to accept contributions from other people.
When you do decide to open the codec, you should talk to xiph.org about patent defensibility. If you want it open but don’t build a large enough moat (multichannel, other bitrates, bit depth, echo and phase control, etc.), then the licensing org will offensively patent or extend your creation.
Thanks for the information about the license and patent. It can work with any bitrates for Halac. However, more than 2 channels and 24/32 bit support outside 16 can be added.
I understand the forward compatibility concern, but have you considered putting an attention-grabbing alert in the encoder that clearly states that future official releases won't be able to decompress the output? Also, your concern may be overblown; there have been more than 100 PAQ versions with mutually incompatible formats, but such issues didn't come up too often. (Not to say that ZPAQ was pointless, of course.)
You may be trying to kill all criticisms; this is not possible. Not everyone will like you and not everyone will like your code. Fortunately, people IRL who have personal differences tend to be a bit more tactful than the software crowd can be online, but something like this is bound to get overwhelming amounts of love.

No great project started out great and the best open source projects got to their state because of the open sourcing.

Consider that the problems you might be spending a lot of time solving might be someone else's trivial issue, so unless this is an enjoyable academic exercise for you (which I fully support), why suffer?

I have no problem trying to kill criticism. I'm just trying to do what I love as a hobby(academic).

Or maybe it's better for me to do things like fishing, swimming as a hobby.

Don't let perfect be the enemy of good. If Linus didn't open source Linux until it was "complete", it wouldn't be anywhere near as popular as it is.
Thank you for your valuable thoughts.
You could open it now and crowd-source the missing pieces. I really see nothing to lose by making this oss-first.
Sounds like some words in Filipino:

Halic = kiss, Halac = raspy voice

You got that backwards buddy. Nobody will use them so long as they remain closed source like this.
Maybe they want to sell it to a streaming service or something.
This. It's almost ragebait posting this: "I'm better but I won't show you."