by jasongrout 2973 days ago
We would love to hear what you like about Rmarkdown and Sweave. Jupyter tooling is always improving, and we are very interested in engaging with users about their needs, and helping grow the ecosystem to be able to address those.
5 comments

My biggest frustrations with Jupyter are (see #4 for comments on Sweave etc):

1. The default front-end is a weak platform for getting work done.

It's a JavaScript code editor. It will never be as good as my personal text editor configuration. It will never be as good as an IDE like RStudio, Spyder, or PyCharm. It's good that there are keyboard shortcuts for doing things like adding cells, and extensions for things like folding cells and adding a table of contents. But it still isn't terribly comfortable to use all day. Also, I personally hate doing everything in a browser. Apart from some useful notebook extensions, there are no viable alternative front ends yet.

2. Running a remote kernel is a pain in the ass (cat a config file then manually tunnel 4 ports over SSH), and I can't seem to get it to work on Windows at all.

This is an issue at my company because we do a lot of work on remote servers that can be accessed only through SSH or JupyterHub. Individual users do not have control over the latter, so we are stuck with the inadequate default experience I just described above.
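
For readers unfamiliar with the manual route described in point 2, here is a minimal sketch of what "cat a config file then tunnel the ports" amounts to. It assumes the usual kernel connection file layout; the connection files I am aware of list five port fields rather than four. The host name, port numbers, and key below are made up for illustration, and `tunnel_command` is a hypothetical helper, not part of Jupyter.

```python
import json

# Port fields found in a Jupyter kernel connection file.
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def tunnel_command(connection_info, ssh_target):
    """Build an `ssh` argument list that forwards every kernel port to localhost."""
    cmd = ["ssh", "-N"]  # -N: forward ports only, run no remote command
    for key in PORT_KEYS:
        port = connection_info[key]
        cmd += ["-L", f"{port}:127.0.0.1:{port}"]
    cmd.append(ssh_target)
    return cmd


# Hypothetical connection file contents; a real file is written by the kernel
# itself and carries a real signing key.
example = json.loads("""
{
  "shell_port": 50001, "iopub_port": 50002, "stdin_port": 50003,
  "control_port": 50004, "hb_port": 50005,
  "ip": "127.0.0.1", "key": "not-a-real-key",
  "transport": "tcp", "signature_scheme": "hmac-sha256"
}
""")

print(" ".join(tunnel_command(example, "user@server.example.com")))
```

The printed command only sets up the forwards; a local front end would still need a copy of the connection file to find the ports and the key.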

3. No kernel other than IPython is mature.

IRkernel is getting there. Everything else is at best a beta-quality product.

4. Notebooks are not a plain text file format.

Hand editing a notebook is messy. Notebooks do not play well with version control systems and diff tools. RMarkdown and Knitr/Sweave are just preprocessors for established plain text formats (Markdown and LaTeX with some extra syntax). With those formats you can take advantage of a wealth of existing tooling, as well as having the freedom to edit the file in a normal text editor without having to rely on a special front end. Ironically, having everything formatted as JSON should make it easier to write those special front ends, but I have not seen any good ones yet.
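
One common workaround for the diff problem is to strip outputs before committing, since outputs and execution counters account for much of the noise. The sketch below assumes the version 4 notebook layout (a top-level `cells` list, with `outputs` and `execution_count` on code cells) and uses only the standard library. `strip_outputs` is a hypothetical helper and the notebook is a hand-written stand-in for a real `.ipynb` file.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a v4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook, standing in for a real .ipynb file.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

print(json.dumps(strip_outputs(notebook), indent=1, sort_keys=True))
```

This makes diffs smaller but does not turn the file into something pleasant to edit by hand, which is the deeper complaint here.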

Wow this was really useful. I was feeling guilty for not trying Jupyter, after using Python for 15 years and doing data science for much of that time.

I hear so many good things about it. I wrote this comment about it:

https://news.ycombinator.com/item?id=16979057

But ANY of those four is a dealbreaker for me. I want to use languages other than Python, with remote kernels, and I want version control. And I like my text editor to be really fast.

I think it comes down to a scientific background vs. a software background. I've memorized a boatload of tools and weird shell incantations, but the result is that I have a more solid workflow than Jupyter provides. Solid in the sense that it is likely to produce reliable results, not that it's "easier".

But if you don't have that software engineering background then I understand that Jupyter makes a whole bunch of things easier. It's not optimal in my view, but it's easier.

Great set of complaints about Jupyter. I think the project is an excellent idea, but it needs to become more mature. I've used it for some prototyping, but it isn't the polished experience you'd like. The web browser is good for looking at the results, but pretty terrible for editing and developing. I'm hoping that with time there will be:

- better front end integration - e.g. a separate vim process connecting/editing cells of a running notebook and updating the browser view on each change

- fewer bugs and more parity between the Python kernel and non-Python kernels

To address the “hand editing” problem I wrote a little vim plugin that wraps around notedown (https://github.com/aaren/notedown) to edit notebooks on the fly in a markdown format. It’s not perfect, but it goes a long way for quickly editing notebooks “as a whole”: https://github.com/goerz/ipynb_notedown.vim

Other than that, I try to put any lengthy code in functions that are in a module alongside the notebook, so that the notebook mostly contains one-line commands to kick off a calculation or to generate a plot. I also have a shortcut that copies the content of the current browser text field (notebook cell) into MacVim, and pastes it back automatically as soon as I close the editor.

1. Jupyter Lab (note: NOT Jupyter Notebook) is an attempt to make the interface more IDE-like. It's still not RStudio due to Jupyter's notebook nature, but it's close enough for me.

I do prefer RStudio's REPL approach of being able to run code by line or by blocks (likely inspired by MATLAB's IDE), rather than Jupyter's approach of executing code by cell (which was inspired by Mathematica). They both let you try stuff out easily while maintaining state, but the former is far easier to productionize.

2. Remote kernels over SSH aren't that hard -- I do this all the time via SSH tunnels. I start Jupyter Lab in an SSH console (usually on a cloud-based VM), and create a tunnel to port 8888 (the default) using my Windows SSH app (Bitvise). 1 port. That's it.

3. No comment - I only use the Python kernel.

4. Correct. Notebooks do present challenges for version control.

> Remote kernels over SSH aren't that hard -- I do this all the time via SSH tunnels. I start Jupyter Lab in an SSH console (usually on a cloud-based VM), and create a tunnel to port 8888 (the default) using my Windows SSH app (Bitvise). 1 port. That's it.

I want the opposite. I want to use a remote kernel with a local client.

Umm, yes, in my case, the kernel is running remotely on a cloud VM. My client is a local browser (Chrome) which connects to localhost:8888, which is a tunnel set up to connect to the remote machine on port 8888.

This lets me run computationally heavy Jupyter calculations on a beefy remote backend in the cloud. My local browser merely talks to that backend via a tunnel.

Here's something on the web that describes this [1] -- except with Bitvise on Windows, you don't have to enter any SSH commands. The tunnel setup etc. is all done via a GUI. This is a pretty standard SSH tunnel technique. You can use this for more than just Jupyter.

[1] http://www.vickyfu.com/2017/04/using-jupyter-notebook-remote...
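
For comparison with the GUI setup described above, the single-port tunnel corresponds to one `ssh -L` forward. The sketch below just assembles that command. The host name is hypothetical and `server_tunnel` is an illustrative helper; port 8888 is the default mentioned in the comment.

```python
import shlex


def server_tunnel(ssh_target, remote_port=8888, local_port=8888):
    """Build the ssh command that forwards a local port to the remote server port."""
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, ssh_target]  # -N: no remote command


cmd = server_tunnel("user@cloud-vm.example.com")
print(shlex.join(cmd))
# prints: ssh -N -L 8888:localhost:8888 user@cloud-vm.example.com
```

With the tunnel up, the local browser talks to localhost:8888 while both the Jupyter server and the kernel run on the remote machine.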

Again, that's not what I mean. I want to run Jupyter (or some other front-end) on my laptop and have it talk to a kernel running on a server. You're describing running both Jupyter and the kernel on the server.

Oh I see now. You want to run the raw kernel with no front-end on the remote machine and communicate with it via the 0MQ/JSON transport layer. I'm curious, what is the advantage of doing this vs. simply running an instance of Jupyter on a remote machine?
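
To make "the 0MQ/JSON transport layer" a bit more concrete, here is a rough sketch of how a front end might assemble a signed `execute_request` as a list of byte frames, based on my reading of the Jupyter messaging specification. The key, session name, username, and protocol version string are placeholders, and `build_execute_request` is a hypothetical helper. Actually sending the frames would need a ZeroMQ socket connected to the kernel's shell port, which is not shown.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

DELIMITER = b"<IDS|MSG>"  # separates routing identities from the message proper


def build_execute_request(code, session, key):
    """Return the list of byte frames for a signed execute_request message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "sketch",  # placeholder
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",  # assumed protocol version
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    # Serialized in order: header, parent_header, metadata, content.
    parts = [json.dumps(d).encode("utf-8") for d in (header, {}, {}, content)]
    # The signature is an HMAC-SHA256 hex digest over the four serialized parts.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    signature = mac.hexdigest().encode("ascii")
    return [DELIMITER, signature] + parts


frames = build_execute_request("print(1 + 1)", session="demo-session", key=b"not-a-real-key")
print(len(frames), json.loads(frames[-1])["code"])
```

The signing key comes from the kernel's connection file, which is one reason a local front end needs that file in addition to the forwarded ports.
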
Remote kernels: I'm working on some infrastructure that should make this easier, but it's still some way off being ready.

Kernel maturity: the Julia and Haskell kernels are pretty well supported, I understand, though I haven't used them myself.

Alternative frontends: Emacs IPython Notebook is pretty well maintained, if that's to your taste.

See whether you'd like Org-mode + Babel. Example: https://youtu.be/dljNabciEGg

Using a web browser as an IDE just seems like a solution in search of a problem. Also, maybe I just never spent enough time working with Jupyter, but it seemed to me that it encourages a sort of exploratory workflow, really well suited to teaching programming and data science. It was less clear to me how it could be used well in production environments.

I use Rmarkdown and Sweave to write homework for my students in a very Jupyter way. I also use them to generate data-driven static webpages and to procedurally generate production-quality, easily formatted PDF and HTML reports. I also use them as a templating system for auto-generated model diagnostic emails. Perhaps I need to return to Jupyter to see what I'm missing, but I don't really know what purpose it would serve, or what kind of work it would make easier.

Doing the interface in the web browser has its ups and downs: it makes some things more awkward locally, but it's easy to deliver the same interface remotely - e.g. a university can run a JupyterHub instance for a course, and students visit a URL and log in. nteract is an attempt to make a notebook interface as a local application.

For sure, I totally see how it's useful as a teaching tool. I don't see how it fits into a proper production system.

Different systems and requirements need different tools. If you prefer a text editor, look at Emacs IPython Notebook, the Jupyter VS Code extension, or Atom Hydrogen, depending on your preference. These basically let you select a chunk of code and execute it in a kernel. You do not have to use the notebook format or the browser-based editor.
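
As a toy illustration of the "select a chunk and execute it" workflow, the sketch below splits a plain script at `# %%` marker lines, a convention that several editors recognize as far as I know, and runs the chunks in one shared namespace. The in-process `exec` is only a stand-in for a real kernel, and `split_cells` and `run_cells` are hypothetical helpers.

```python
import re

CELL_MARKER = re.compile(r"^# %%.*$", re.MULTILINE)


def split_cells(source):
    """Split script text into chunks at lines that start with '# %%'."""
    chunks = CELL_MARKER.split(source)
    return [chunk.strip("\n") for chunk in chunks if chunk.strip()]


def run_cells(cells, namespace=None):
    """Execute chunks in order in one shared namespace, like a kernel session."""
    namespace = {} if namespace is None else namespace
    for cell in cells:
        exec(cell, namespace)
    return namespace


script = """\
# %% load data
values = [1, 2, 3]

# %% compute
total = sum(values)
"""

cells = split_cells(script)
state = run_cells(cells)
print(len(cells), state["total"])
```

The file stays an ordinary script, so it diffs and imports like any other Python source.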

JupyterLab also allows an Rmarkdown-like workflow where code blocks in a Markdown document can be executed to display graphs.

I believe the important part is to allow interoperability between the different ways people want to work. One size can't fit all, and there is still a lot of work to be done to cover some use cases.

This may depend on what you mean by 'proper production system'. It's definitely meant to complement, not replace, Python modules, scripts and so on. I wouldn't write a web app in a notebook. But the LIGO team that discovered gravitational waves published a notebook demonstrating their data analysis.

I see it as useful where illustrating and explaining some computational steps is at least as important as executing them. Teaching is one obvious use case, but it's also valuable for sharing scientific methods, documenting a library with runnable examples, or presentations at programming conferences.

Rmarkdown has the better version control story.

The number one feature request by far at this point would be collaborative features in the notebook. Not really because I actually want to collaboratively edit a notebook with others, but because I would like to open the same notebook in two separate browser windows (side-by-side), and edit a cell at the top of the notebook in one, and a cell at the bottom in the other, and have things get merged automatically. Or, not having to worry that I left a (remote) notebook open on my work computer when I connect to the same server from home, causing parts of the notebook to be accidentally overwritten.

JupyterLab has the ability to open multiple "views" on the same file. http://jupyterlab.readthedocs.io/en/latest/user/notebook.htm...

That's amazing! I'll have to try out JupyterLab sometime very soon. I was holding out because I'm using a whole bunch of plugins for the classic notebook from https://github.com/ipython-contrib/jupyter_contrib_nbextensi... that I think haven't been ported yet.

I guess it would be nice to make it run within an Electron environment, without extra console and browser tabs to run it in. Just Jupyter running as an Electron app with maybe tabs for each notebook to give it a more native desktop app feel.

Check out https://nteract.io/. It's based on Jupyter standards (protocol, kernels, notebook format), written in JS, and runs on your desktop.