by cgarciae 2827 days ago

Point taken! Thanks a lot for your feedback. Just a few points:

* pypeline is already taken :(
* My main reason for the short names was that I initially assumed you would do `import pypeln as pl` and then call things like `pl.pr.map`. Since you can't abbreviate a module inside `pl`, I picked short names. But then I decided to go for an import-the-module kind of strategy instead.

I am thinking about expanding the module names to their worker names:

* pr --> process
* th --> thread
* io --> task

And then have the conventions:

* `from pypeln import process as pr`
* `from pypeln import thread as th`
* `from pypeln import task as io  # as ta?`
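To make the trade-off between the two import styles concrete, here is a small sketch that uses the standard library's `concurrent.futures` package as a stand-in, since it happens to have `thread` and `process` submodules too. It is not pypeln code and says nothing about pypeln's actual layout; it only illustrates the Python behaviour being discussed: a package alias cannot shorten submodule names, while importing the submodules directly lets the caller pick the alias.

```python
import concurrent.futures as cf
from concurrent.futures import process as pr, thread as th

# Style 1: alias the package only. The submodule names ("thread", "process")
# are fixed by the package author, so they appear in full at every call site.
executor_via_package = cf.thread.ThreadPoolExecutor

# Style 2: import the submodules directly. Now the caller chooses the alias.
executor_via_alias = th.ThreadPoolExecutor

# Both spellings resolve to the same objects.
same_class = executor_via_package is executor_via_alias
same_module = cf.process is pr

with th.ThreadPoolExecutor(max_workers=2) as pool:
    doubled = list(pool.map(lambda x: 2 * x, [1, 2, 3]))

print(same_class, same_module, doubled)
```

With full submodule names in the package, both styles stay readable: `pl.process.map` for people who alias the package, `pr.map` for people who prefer short aliases.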

This conversation is very valuable, thank you all for the feedback.

4 comments

anentropic 2827 days ago

Hi, just coming back to say congrats on your library and I wish you all the best with it :)

I see your reasoning here (`import pypeln as pl`) but I still think where you have submodules you should use unabbreviated words for their names.

For me I'd be happy with `pl.process.map` in my code, but `pl.pr.map` feels a bit too obscure to have as the default.

These things are quite subjective of course, but part of that subjective judgement comes from the experience of what is commonly done in other Python libraries (the stdlib is a bit of a mixed bag in this regard unfortunately, riddled with CamelCase and other abominations).

bb88 2827 days ago

First of all, it's super cool. :)

There are a lot of "pipe" projects on PyPI, but your project is also about process management. Maybe you should avoid "pipe" in your name? FlowProcessor? nFlow? xFlow?

I do agree that you should avoid io for asyncio. You should probably at least use aio, but there's no reason you can't have asyncio_task, thread_task, multiprocessing_task.

Lastly, in my mind the killer app for this would be something that works on top of Celery in production but can fall back to, say, multiprocessing or threading when running locally. That would allow me to prototype something, and then when I want to scale, I can just change a config setting.
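A rough sketch of what "just change a config setting" could look like, using only `concurrent.futures` from the standard library. The names `BACKENDS`, `make_executor`, `run` and the config keys are hypothetical and not part of pypeln or Celery; a Celery-backed entry would need its own adapter exposing the same `Executor` interface, which is not shown here.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical registry mapping a config value to an executor class.
# A distributed backend would be registered here under its own key.
BACKENDS = {
    "thread": ThreadPoolExecutor,
    "process": ProcessPoolExecutor,
}


def make_executor(backend: str, workers: int = 4) -> Executor:
    """Build the executor named in the config, or fail loudly."""
    try:
        factory = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return factory(max_workers=workers)


def square(x: int) -> int:
    """Module-level so that it can also be pickled for a process pool."""
    return x * x


def run(config: dict, items: list[int]) -> list[int]:
    """Run the same work on whichever backend the config selects."""
    with make_executor(config["backend"], config.get("workers", 4)) as pool:
        return list(pool.map(square, items))


if __name__ == "__main__":
    # Switching to processes is a one-word change: {"backend": "process"}.
    print(run({"backend": "thread", "workers": 2}, [1, 2, 3]))
```

The calling code never names a concrete executor, so moving from local threads to another backend only touches the config.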

cgarciae 2827 days ago

Hey, thanks for all the feedback. I will change the naming since it's something most of you have agreed is a good change.

The goal I have for Pypeline is much simpler: let you easily set up data pipelines where you leverage processes, threads and asyncio for what each is good at. So in my mind a killer app would be a pipeline that starts with an asyncio stage for e.g. downloading images, then a multiprocess stage for e.g. image processing, and finally a threading stage for e.g. interacting with the OS.
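For readers who want to picture that three-stage shape, here is a minimal standard-library sketch. It does not use pypeln and does not try to reproduce how the library connects its stages: the stages here run as successive batches, and `fetch`, `transform`, `store` and `run_pipeline` are hypothetical stand-ins (no real downloads, images, or OS calls).

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


async def fetch(url: str) -> bytes:
    """Stand-in for an I/O-bound download (no real network access)."""
    await asyncio.sleep(0.01)
    return url.encode()


def transform(payload: bytes) -> int:
    """Stand-in for CPU-bound work such as image processing."""
    return sum(payload)


def store(checksum: int) -> str:
    """Stand-in for a blocking OS interaction such as writing a file."""
    return f"stored:{checksum}"


async def download_all(urls: list[str]) -> list[bytes]:
    # asyncio stage: every download is in flight at the same time.
    return await asyncio.gather(*(fetch(u) for u in urls))


def run_pipeline(urls: list[str], cpu_pool: Executor, io_pool: Executor) -> list[str]:
    payloads = asyncio.run(download_all(urls))
    # CPU stage: meant for a process pool, to sidestep the GIL.
    checksums = cpu_pool.map(transform, payloads)
    # Blocking stage: threads are enough when the work mostly waits.
    return list(io_pool.map(store, checksums))


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as cpu_pool, \
            ThreadPoolExecutor(max_workers=4) as io_pool:
        print(run_pipeline(["a.png", "b.png"], cpu_pool, io_pool))
```

Passing the executors in as arguments keeps each stage's concurrency model swappable, which is roughly the flexibility the comment describes.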

Right now I see Pypeline more as an easy-to-use single-machine tool than as a higher-level distributed abstraction like Celery. Maybe other frameworks could leverage Pypeline to ease their work.

bb88 2821 days ago

So then I would want to have one more stage... a celery stage for when you want to cluster work across multiple machines. :)

PurpleRamen 2826 days ago

> My main reason for this was because initially I was thinking that you did an `import pypeln as pl` and then called things like e.g. `pl.pr.map` since you can't abbreviate the module inside `pl` then I picked short names

"Assumptions are the root of all evil."

With autocomplete, a coder has no reason to use short names anyway.

neuromantik8086 2827 days ago

"import pypeln as pl" could cause quite a bit of confusion in Poland I would imagine.

bb88 2827 days ago

And maybe perl...

cgarciae 2827 days ago

hahahahaha
