HN Mirror



	by cauchyk 956 days ago
	Author of the blog here, curious what a better alternative would be in this context. The channel has to be passed around for the producer and consumer to interface with each other. Are there better patterns for this?

5 comments

atombender 956 days ago

Not the parent, but I personally dislike it when Go libraries use channels in their public APIs, as it forces a specific concurrency model on the consumer. In particular, channels are relatively slow because every operation goes through an internal lock, so you always pay that overhead whether you need it or not.

You also have to be very careful about managing the channel lifecycle. If you're not pulling (selecting from) the channel, the library will be permanently stuck. So you must now have a way to tell the library to stop sending, and it must cancel any in-flight send operations if you call producer.Stop() or whatever. In my experience libraries often have bugs in their channel code. It's far too easy to get deadlocks with channels that have interdependencies, and you have to be very careful about buffered versus unbuffered channels, as they behave differently.

A better API, in my opinion, is to offer a callback or single-method interface. Then the implementer of that callback or interface can choose to use channels internally if they desire, or they can use something else. You get the same backpressure support since you can treat it as synchronous.

After all, a channel's send interface is essentially just:

    type Channel[T any] interface {
        Send(T)
    }

But a "chan T" doesn't offer this flexibility.
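As a rough sketch of what that could look like: the `Channel[T]` interface is the one above, while `Produce`, `SendFunc`, and `ChanSender` are hypothetical names invented for this example. The library only sees the interface, and the caller decides whether a channel is involved at all.

```go
package main

import "fmt"

// Channel is the single-method interface from the comment above.
type Channel[T any] interface {
	Send(T)
}

// SendFunc adapts a plain function to Channel (hypothetical helper).
type SendFunc[T any] func(T)

func (f SendFunc[T]) Send(v T) { f(v) }

// ChanSender adapts a real channel to Channel (hypothetical helper).
// Send blocks until a receiver is ready, so backpressure is preserved.
type ChanSender[T any] chan T

func (c ChanSender[T]) Send(v T) { c <- v }

// Produce stands in for a library function that emits n values.
// It never touches a channel itself.
func Produce(n int, out Channel[int]) {
	for i := 0; i < n; i++ {
		out.Send(i)
	}
}

func main() {
	// Callback consumer: no channel and no extra goroutine.
	var got []int
	Produce(3, SendFunc[int](func(v int) { got = append(got, v) }))
	fmt.Println(got)

	// Channel consumer: the caller opts into channels on its own side.
	ch := make(chan int)
	go func() {
		defer close(ch)
		Produce(3, ChanSender[int](ch))
	}()
	for v := range ch {
		fmt.Println(v)
	}
}
```

The trade-off is that the library can no longer put the send in its own `select`, so cancellation has to be handled inside the `Send` implementation or through a separate mechanism.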

My rule of thumb for channels is that they're goroutine glue, not an API primitive. Build APIs out of interfaces, not channels. The only thing that uses channels should be the one that's controlling the goroutines, because it's the thing that orchestrates them.

That said, it's not a hard rule. Channels may have their place in a public API, though I can't think of any examples off-hand.


foobiekr 956 days ago

This breaks the ability to send from within a select statement, which is a terrible reduction in capability.

You can always wrap channels to make them worse and less capable, but your API should expose the more capable option.


__turbobrew__ 956 days ago

I think it is a matter of preference. For me personally I use raw channels and goroutines all day every day and I really like using them. Channels are a core primitive in golang so I think it is worth getting familiar with them.

As you say, being able to select is really nice too.


tw1984 956 days ago

> as it forces a specific concurrency model on the consumer

I find the argument above to be nonsense. When your program is in Golang, you've already picked a side; the concurrency model in question has already been chosen by the user.

We are not talking about some random concurrency model; we are talking about channel-based synchronization and communication in Golang. If you don't want that and consider it an issue, you shouldn't be using Golang in the first place.


skybrian 956 days ago

Looks like the channel field is private in CDCRecordStream, but exposed by GetRecords. The callers mostly loop over Record objects. [1]

If I wanted to encapsulate iterating over a channel of Records, maybe it would be something like Go's io.Pipe function [2], which returns a PipeReader and PipeWriter? Except that it would work on Records rather than byte streams.

I don't have enough context to know if the extra encapsulation is a good idea in this case, though.

[1] https://github.com/search?q=repo%3APeerDB-io%2Fpeerdb%20GetR...

[2] https://pkg.go.dev/io#Pipe


JyB 956 days ago

Please see this great talk by Bryan C. Mills touching on the subject: https://youtu.be/5zXAHh5tJqQ?t=421


candiddevmike 956 days ago

Why have consumers and producers vs doing it all in one goroutine, utilizing some kind of connection pool?


reactordev 956 days ago

Because then you are either consuming or producing; you can't do both at the same time. You are either reading from a stream of data or writing it. Using goroutines to separate the two lets you do both at once, reacting as soon as data is available on the channel or you receive the signal to stop.


cauchyk 956 days ago

To get higher throughput we would need one goroutine to pull from the replication slot while the other is pushing to the target. The idea is to keep the Postgres connection useful and reading the slot while also pushing to the target asynchronously.
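A simplified sketch of that arrangement, with integers standing in for change records. `pull`, `push`, the batch size, and the `flush` callback are assumptions for the example; the real code reads from Postgres and writes to an actual target.

```go
package main

import "fmt"

// pull stands in for the goroutine that reads from the replication slot.
// It closes out when the source is exhausted.
func pull(n int, out chan<- int) {
	defer close(out)
	for i := 0; i < n; i++ {
		out <- i
	}
}

// push drains the channel and hands batches to flush, which stands in for
// the write to the target. The batch slice is reused between calls, so
// flush must not retain it.
func push(in <-chan int, batchSize int, flush func([]int)) {
	batch := make([]int, 0, batchSize)
	for v := range in {
		batch = append(batch, v)
		if len(batch) == batchSize {
			flush(batch)
			batch = batch[:0]
		}
	}
	if len(batch) > 0 {
		flush(batch) // final partial batch
	}
}

func main() {
	// The buffer lets pull keep reading while push is busy flushing.
	records := make(chan int, 4)
	go pull(10, records)
	push(records, 4, func(b []int) { fmt.Println("flushed", b) })
}
```

When the buffer fills up, `pull` blocks on its send, which is the backpressure that keeps the reader from running arbitrarily far ahead of the target.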


earthboundkid 955 days ago

Use an iterator object that can use channels behind the scenes.
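Something along these lines, as a minimal sketch. The `Iter` type, the `Count` constructor, and the `Next`/`Value`/`Close` method set are hypothetical and only meant to show the shape; callers never see the channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Iter hides a producer goroutine and its channel behind Next/Value/Close.
type Iter struct {
	ch   <-chan int
	stop chan struct{}
	once sync.Once
	cur  int
}

// Count returns an iterator over 0..n-1, produced by a background goroutine.
func Count(n int) *Iter {
	ch := make(chan int)
	it := &Iter{ch: ch, stop: make(chan struct{})}
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			select {
			case ch <- i:
			case <-it.stop: // consumer gave up early; exit instead of leaking
				return
			}
		}
	}()
	return it
}

// Next advances the iterator and reports whether a value is available.
func (it *Iter) Next() bool {
	v, ok := <-it.ch
	it.cur = v
	return ok
}

// Value returns the element fetched by the last successful Next.
func (it *Iter) Value() int { return it.cur }

// Close tells the producer to stop. It is safe to call more than once.
func (it *Iter) Close() { it.once.Do(func() { close(it.stop) }) }

func main() {
	it := Count(5)
	defer it.Close()
	for it.Next() {
		if it.Value() == 3 {
			break // early exit; the deferred Close releases the producer
		}
		fmt.Println(it.Value())
	}
}
```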
