Hacker News
by sheepleherd 3651 days ago
The computer science / computer programming problem I'd like to see solved is keeping projects "fresh" and open/accessible enough that people like this could feel they were learning in an unencumbered way while contributing something useful to an existing project, and at the same time pushing the capabilities of what available open source projects can provide.

"Reinventing the wheel" projects absolutely litter public source nodes; believe me, I know why people do it; but my dream is the dream of software that most of us have given up on, code reuse, "reentrancy", shared libraries, etc.

Maybe something like a "wikipedia of source code".

I'm not discounting the benefit of doing a project to learn about it; what I'm saying is, too bad it's not code that will be useful for anything else without a lot more work, and too bad work is going into something that is not reusable.

5 comments

Any nontrivial project requires lots of time just to understand its design. Even minimal contributions will likely require comparable amounts of work to completing a toy project. And as any professional programmer knows, reading other people's code is a lot less fun than writing your own.

It's like you are saying "it's a problem", and I'm saying "that's the problem I'd like to see solved".

As an example, what they teach us in school, and what large projects like NASA have to do, is to first agree on a specification for interfaces, then to write code to the interface, then iron out the kinks. Working on a project like that, and more so the bigger the project, we soon discover that there are many local wins if only we could change the interface we agreed on, because "we didn't know enough when we agreed", etc.

As a thought-experiment solution: if a real live compiler project were written to clean specs (even if the specs came after the code), then there'd be a lexer, parser, etc., and for a little homebrew project like this one you could write your own lexer from scratch, testing it all the while against the rest of a functioning compiler. You probably would not finish it, because you would learn in a series of "aha" moments what "the hard parts" are and how they are solved.

So you could abandon your own piece, but at the same time you would be now equipped to contribute to the real project.

Or you could move on to working on the parser... lather, rinse, repeat.
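To make the thought experiment concrete, here is a minimal sketch in Python of what "write your own lexer and test it against the real one" could look like. Everything in it is hypothetical: the `Token` shape, the `Lexer` interface, and `reference_lexer` (a stand-in for an existing project's lexer) are illustrations, not any real compiler's API.

```python
import re
import string
from typing import Callable, List, Tuple

Token = Tuple[str, str]                # (kind, text): a hypothetical token shape
Lexer = Callable[[str], List[Token]]   # the agreed interface for the lexer stage

NAME_START = string.ascii_letters + "_"
NAME_REST = NAME_START + string.digits


def reference_lexer(source: str) -> List[Token]:
    """Stand-in for the existing project's lexer (regex based).

    Note: it silently skips characters it does not recognise.
    """
    tokens = []
    for match in re.finditer(r"[0-9]+|[A-Za-z_][A-Za-z0-9_]*|[-+*/()]", source):
        text = match.group()
        if text[0] in string.digits:
            kind = "NUM"
        elif text[0] in NAME_START:
            kind = "NAME"
        else:
            kind = "OP"
        tokens.append((kind, text))
    return tokens


def homebrew_lexer(source: str) -> List[Token]:
    """A learner's from-scratch lexer: a hand-written character loop."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch in string.digits:
            j = i
            while j < len(source) and source[j] in string.digits:
                j += 1
            tokens.append(("NUM", source[i:j]))
            i = j
        elif ch in NAME_START:
            j = i
            while j < len(source) and source[j] in NAME_REST:
                j += 1
            tokens.append(("NAME", source[i:j]))
            i = j
        elif ch in "+-*/()":
            tokens.append(("OP", ch))
            i += 1
        else:
            # Unlike the reference, the homebrew version rejects unknown input.
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


def mismatches(candidate: Lexer, reference: Lexer, samples: List[str]) -> List[str]:
    """Return the sample inputs on which the two lexers disagree."""
    return [s for s in samples if candidate(s) != reference(s)]


SAMPLES = ["x1 + 42", "(a*b) - c/2", "foo_bar", ""]
print(mismatches(homebrew_lexer, reference_lexer, SAMPLES))
```

Because both functions satisfy the same `Lexer` signature, the homebrew version could in principle be swapped into the rest of the pipeline once the mismatch list is empty; the samples that still disagree are where the "aha" moments would come from.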

No need to tell me what all "the reasons that doesn't work" are... I know the reasons, and it's useful to identify the laundry list of them, but the part I'm interested in is the attitude that "hey, this is worth solving" and "hey, this could be solved..."

I've done something like that for the CPython bytecode compiler: https://github.com/darius/500lines/blob/master/bytecode-comp...

"From clean specs" didn't really hold because the bytecode VM needs better documentation. I started to address that with a version of the VM in Python (cutting down Ned Batchelder's byterun): https://github.com/darius/tailbiter and I've started a very spare-time project to redo the parser as well (in the same repo). It'd be neat to see this getting used in a compiler course -- CPython's about as simple as a popular compiler gets.

(I agree with the grandparent that it took a lot of time to learn all I needed to about CPython internals -- and for the parser there's more to learn.)
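For readers who have not looked at CPython bytecode before, the standard library's `dis` module gives a quick view of what the bytecode compiler emits for a function. This is only a small illustration using `dis.get_instructions`; the `add` and `opnames` helpers are made up for the example, and the exact opcodes printed vary between Python versions.

```python
import dis


def add(a, b):
    """A tiny function whose bytecode we want to inspect."""
    return a + b


def opnames(func):
    """Return the opcode names CPython compiled `func` into."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print one line per instruction: the opcode name and a readable argument.
for ins in dis.get_instructions(add):
    print(ins.opname, ins.argrepr)
```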

Does the LLVM introduction tutorial[0] kind of fit what you're suggesting? You learn how to implement a toy language called Kaleidoscope on top of the LLVM infrastructure with one data type (64-bit float), if/else, for loop, and a few other things.

It covers the lexer, parser, AST generation, and a few other things.

There's also one for writing a backend targeting a fake hardware architecture.

[0] http://llvm.org/docs/tutorial/index.html
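As a rough picture of the lexer, parser, and AST stages mentioned above, here is a small Python sketch for a Kaleidoscope-like expression language. It is not the tutorial's code (which is C++ and goes on to generate LLVM IR); the node classes, the token shapes, and the precedence table are assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Num:
    value: float          # the toy language has a single numeric type


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]

# Illustrative operator precedences: higher binds tighter.
PRECEDENCE = {"<": 10, "+": 20, "-": 20, "*": 40}

TOKEN_RE = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z][A-Za-z0-9]*)|(\S))")


def tokenize(src: str) -> List[Tuple[str, str]]:
    """Split source text into (kind, text) pairs."""
    tokens = []
    for number, name, other in TOKEN_RE.findall(src):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("id", name))
        elif other:
            tokens.append(("op", other))
    return tokens


class Parser:
    def __init__(self, tokens: List[Tuple[str, str]]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Tuple[str, str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def advance(self) -> Tuple[str, str]:
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_primary(self) -> Expr:
        kind, text = self.advance()
        if kind == "num":
            return Num(float(text))
        if kind == "id":
            return Var(text)
        if (kind, text) == ("op", "("):
            inner = self.parse_expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token {text!r}")

    def parse_expr(self, min_prec: int = 0) -> Expr:
        """Precedence climbing: fold in operators binding at least min_prec."""
        left = self.parse_primary()
        while True:
            kind, text = self.peek()
            prec = PRECEDENCE.get(text) if kind == "op" else None
            if prec is None or prec < min_prec:
                return left
            self.advance()
            # prec + 1 makes operators of equal precedence left-associative.
            right = self.parse_expr(prec + 1)
            left = BinOp(text, left, right)


def parse(src: str) -> Expr:
    parser = Parser(tokenize(src))
    tree = parser.parse_expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree


print(parse("a + b * 2"))
```

The split into `tokenize`, `Parser`, and the dataclass nodes mirrors the stage boundaries discussed in this thread: each piece could be rewritten by a learner without touching the others.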

thanks, that's very good.

quick critique (wanted to contribute to this conversation while it's active rather than delve deeply into LLVM for the rest of the day :) it's (naturally and understandably) written from the perspective of "this is how it is, if you want to connect with what we do here's what you need to do".

As a pedagogical tool (that is still a compiling tool) it could use an intro of more "here is what a lexer needs to do, here's how/why we chose to do it, here is why what is downstream belongs downstream, here is an example using a language syntax that is extremely simple" (C is not), "here is an alternative way you could try to do it", etc.

But definitely you point up a good way to start toward [mystic music] "my dream goal" in this example.

Again tho, I'm wishing that there were tools and "a way" that ALL projects could be managed this way, not just one great compiler, but the several great compilers and editors, and all-the-types-of-things-people-keep-having-the-urge-to-reinvent.

Seems like the ETH work on Oberon etc did that. The students kept learning by improving on or porting both the compilers and OS's. A2 Bluebottle is a fast, usable OS as a result. Barely usable given no ecosystem and few features. Usable, though, in a safe, simple, GC-ed language. :)

Far as C compilers, would this one previously posted fit your requirements?

http://c9x.me/compile/

https://news.ycombinator.com/item?id=11555527

It seems to be quite similar to what you describe. Designed to be compact, easy to understand, maybe easy to extend, and useful in practice. I remember liking it more than most of these submissions for those attributes.

Well that's an interesting idea. What sort of programming language would let us build a package repo that could survive a Wikipedia-style edit war?

Wikipedia works because articles are mostly independent. There are links, but if they're broken it's not fatal. There are also a fair number of stub articles and a bureaucracy around what counts as a notable subject.

In practice, projects have owners, and they're looking for help, but not just anyone's help.

> What sort of programming language would let us build a package repo that could survive a Wikipedia-style edit war?

Really, the programming language isn't important; it's the package management and repository system that matters for that. A repository that keeps history, plus a package/dependency management system that lets you specify the particular version of a dependency, would seem to suffice to address the particular problem you relate.
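A minimal sketch of that idea in Python, under stated assumptions: the `Repo` class, its integer version numbers, and the package names are all hypothetical and do not model any real package manager. The point is only that an append-only history plus pinned dependency versions keeps a later hostile edit from reaching existing dependents.

```python
from typing import Dict, List, Optional


class Repo:
    """Append-only store: every published version of a package is kept."""

    def __init__(self) -> None:
        self._history: Dict[str, List[dict]] = {}

    def publish(self, name: str, source: str,
                deps: Optional[Dict[str, int]] = None) -> int:
        """Record a new version and return its number (starting at 1)."""
        versions = self._history.setdefault(name, [])
        versions.append({"source": source, "deps": dict(deps or {})})
        return len(versions)

    def get(self, name: str, version: int) -> dict:
        """Fetch one exact version; older versions are never overwritten."""
        return self._history[name][version - 1]

    def resolve(self, name: str, version: int) -> Dict[str, int]:
        """Return the pinned version of `name` and everything it depends on."""
        pinned: Dict[str, int] = {}
        stack = [(name, version)]
        while stack:
            pkg, ver = stack.pop()
            if pinned.get(pkg, ver) != ver:
                raise ValueError(f"conflicting pins for {pkg}")
            if pkg in pinned:
                continue
            pinned[pkg] = ver
            stack.extend(self.get(pkg, ver)["deps"].items())
        return pinned


repo = Repo()
good = repo.publish("strutil", "def shout(s): return s.upper()")
app = repo.publish("app", "import strutil", deps={"strutil": good})

# A later "edit war" revision becomes version 2; it does not replace version 1.
repo.publish("strutil", "def shout(s): return 'vandalized'")

print(repo.resolve("app", app))
```

Here `app` keeps resolving to the `strutil` version it pinned, and moving to the newer version is an explicit choice by whoever maintains `app`.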

I find Kartik Agaram's writings interesting with respect to that problem.

http://akkartik.name/about

Thanks for saving me the decision of whether or not to do another shameless plug :)

I recently wrote this (in a very similar thread to this one) about the tension between real-world and teaching software: https://www.reddit.com/r/Compilers/comments/4jmb88/open_sour...

more details?
more detailed question please? I'm happy to discuss but I'm not sure whether you are looking for bottom up nitty gritty details or more top down grounded philosophizing.
Looking at the rest of the thread helped. Thanks!