by moocow01 4942 days ago
Maybe someone knows of a solution, but it would be great if SVN/version control were able to store a standard formatted version of source code and then, depending on client/user preferences, could reformat the file accordingly on checkout and also factor in / unformat when doing diffs/checkins. I feel like if it could work like this it would mitigate a lot of the inevitable battles that take place over small things like formatting, etc. There are obviously things like variable names that it wouldn't be able to solve, but at least nobody would be complaining about where Joe put his curly bracket. I guess you can do something similar with an SVN hook, but it's hard to get a seamless process.
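A rough sketch of the "store one canonical form, diff on that" idea, using Python source only because its standard library ships a parser and unparser (Python 3.9+). The function names `canonical` and `canonical_diff` are made up for illustration, and this is not how SVN or any real VCS hook works; it only shows that formatting-only differences disappear once both sides are normalized.

```python
import ast
import difflib


def canonical(source: str) -> str:
    """Return a formatting-independent rendering of Python source.

    Assumption: ast.unparse's layout stands in for the "standard format".
    Note that it drops comments, so this is a toy, not a real formatter.
    """
    return ast.unparse(ast.parse(source)) + "\n"


def canonical_diff(old: str, new: str) -> list[str]:
    """Diff two versions after normalizing both, so layout changes vanish."""
    return list(
        difflib.unified_diff(
            canonical(old).splitlines(),
            canonical(new).splitlines(),
            lineterm="",
        )
    )


# Two developers format the same function differently.
joe = "def area(w,h):\n        return w*h\n"
ann = "def area( w, h ):\n  return w * h\n"
# A third version actually changes behaviour.
changed = "def area(w, h):\n    return w * h * 2\n"

print(canonical_diff(joe, ann))      # formatting-only change
print(canonical_diff(joe, changed))  # real change
```

The catch raised elsewhere in this thread shows up immediately: the round trip throws away comments and any deliberate manual layout, which is exactly the kind of thing that makes a seamless process hard.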
5 comments

What I've always found an intriguing idea is to store the abstract semantic graph of the program instead of (just) the text of the program in the revision control system. The source code could then be formatted / visualized in any way for the programmer while manipulating it. Kind of like TeX, or CSS, but for source code. And I suppose it could also help with making sensible diffs: no more being annoyed by added spaces / indentation / number of columns / etc.
"A language should be designed in terms of an abstract syntax and it should have, perhaps, several forms of concrete syntax: One which is easy to write and maybe quite abbreviated; another which is good to look at and maybe quite fancy, but after all, the computer is going to produce it; and another, which is easy to make computers manipulate."

--John McCarthy, http://www.infoq.com/interviews/Steele-Interviews-John-McCar...

CoffeeScript / JavaScript duality?
This won't work, except for trivial codebases, I guess. Dean Whelton argued a similar thing once [1]. There are always things that resist automated formatting or that are downright destroyed when using it. I use Ctrl+Shift+F or Ctrl+E,D sparingly for that reason. Eclipse has a tendency to mangle Javadoc comments sometimes. Sometimes you're breaking a long method invocation or formula over multiple lines for readability and not exactly at the 80-char boundary, etc.

Of course, trivial things like braces on the same or next line or whitespace around operators are solvable that way, but the general problem isn't, as long as source code is text (and it will remain that way for quite some time). What perpetuates this state is obviously that we have lots of tools that deal with text and very few that deal with more specialised content. In general I find this sad, though, as text is often neither the easiest to work with nor the most expressive, despite what diehard Unix users say.

What would be really lovely in my eyes would be a source control tool that actually understood its content and could say "order of parameters of that function was changed" or "method added", etc. It's sometimes astonishing how the pursuit of optimal diffs masks the intent of a change where an added method diff starts with the closing brace of the previous method, for example.
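As a minimal sketch of what such content-aware messages could look like, here is a toy comparison of two versions of a Python file using the standard `ast` module. The helper names `signatures` and `describe_changes` are invented for this example, it only looks at top-level functions and their positional parameters, and no existing source control tool is implied to work this way.

```python
import ast


def signatures(source: str) -> dict[str, list[str]]:
    """Map each top-level function name to its positional parameter names."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.args]
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def describe_changes(old: str, new: str) -> list[str]:
    """Describe changes in terms of functions rather than text lines."""
    before, after = signatures(old), signatures(new)
    notes = []
    for name in sorted(after.keys() - before.keys()):
        notes.append(f"function added: {name}")
    for name in sorted(before.keys() - after.keys()):
        notes.append(f"function removed: {name}")
    for name in sorted(before.keys() & after.keys()):
        if before[name] != after[name]:
            # Same names in a different order vs. a genuinely new signature.
            if sorted(before[name]) == sorted(after[name]):
                notes.append(f"parameter order changed: {name}")
            else:
                notes.append(f"parameters changed: {name}")
    return notes


old = "def move(x, y):\n    pass\n"
new = "def move(y, x):\n    pass\n\ndef stop():\n    pass\n"
print(describe_changes(old, new))
```

Because the comparison runs on parsed structure, where the closing brace (or dedent) of the previous method falls never enters into it.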

[1]: http://welbog.homeip.net/glue/71/Whitespace_is_not_a_problem

"It's sometimes astonishing how the pursuit of optimal diffs masks the intent of a change where an added method diff starts with the closing brace of the previous method, for example."

In Python you can get the whole return statement. Especially when adding a similar type of method, e.g.: Django view functions. Makes

    git add -p
more fun.
I've found that experimenting with the various git diff algorithms helps out. IIRC there is a "--patience" flag you can pass to git-diff and tools which use that, which spends more time to get "better" diffs.
Eclipse has pretty good code-formatting control for Java, quite customizable. It's very nice in that:

  * you can elect to turn off formatting for sections of your code
  * you can save/export your formatting convention in an XML file that looks like this: http://pastebin.com/fKjKRuZg
  * you can import that code convention into new/other projects
  * you can select any number of files or projects and choose to apply the formatting to the Java source
  * the result is very readable
This might not tie in directly to your source control tool as is but would make taming and standardizing source from multiple developers a bit easier. As long as every contributor's code eventually is coerced into this form your own/central copy can easily be diff'ed across versions. Also, you could apply other conventions with other formatters for clients that see things differently.
I know a solution: Agree on the broad points of formatting, i.e. tabs/spaces, where the braces generally go. Make engaging in formatting battles a firing offense, but listen to anyone who can make substantial arguments in favour of a practice.

Automatic reformatting is evil - sometimes code is more readable if formatted in a particular way. Readability and maintainability always trump adhering to rules.

Automatic reformatting is the only solution, because you want to be able to make unreadable code readable without conscious effort. My preferred policy would be something like:

1. There is an agreed autoformatting template, checked into version control.

2. It is always ok to format the lines you are working on however you like, with the understanding that other people may run the autoformatter on the file.

3. It is always ok to run the autoformatter on a file you are working on (autoformatters only really work at file granularity).

4. It is not ok to format code you are not working on.

I mostly agree, although I'd like to enhance the bit in points 3 and 4 about only ever reformatting code you're actually working on: you take responsibility for making sure any code your autoformatter changes is just as readable as before.

Changing the name of a constant on line three doesn't give you license to autoformat the rest of the file.

I think it does. Carefully maintained manual formatting takes too much programmer effort. If you open a file to actually work on, you should be permitted to hit the autoformat button without thinking, because then formatting becomes something you just don't think about. It's not as good as good manual formatting, but it's good enough, and frees up mental resources for more important things.
I'd say this depends on the project and the developers. In some projects reading code is done a lot more often (and by more people) than actually writing code, so then you might want to spend more time on e.g. formatting, documenting, etc.
I guess what I'm saying is if you could have a tool that could compare 2 pieces of source code, be able to merge them, and maintain the formatting of each person's local copy, it would be great. It's probably a pipe dream because your version control would have to have an intimate knowledge of the specs of any particular language used in the source code (never mind the version of the language you are using).
Yes, for merging, that would be incredibly helpful.

But what I'm saying is that the formatting of a file may contain contextual clues that will help someone reading the code to understand what it does. Formatting isn't just chrome around the code.

Example: In Java, when writing a custom predicate to filter by foos that are bars, I prefer

        filter(myList, new Predicate<Foo>() {
            @Override public boolean apply(Foo foo) { return foo.isBar(); } });
to

        filter(myList, new Predicate<Foo>() {
            @Override
            public boolean apply(Foo foo) {
                return foo.isBar();
            }
        });
(filter is statically imported from Collections2 in Guava)

especially when there are multiple of them and they line up neatly underneath each other.

Any meaningful autoformatter would change the former to the latter, and in the process lose readability.

EDIT: Another example would be when using the builder pattern - getting those methods to line up to be neatly readable often takes some non-standard indentation.

so I would have something on L249 and you would have it on L321? nonsense.
Yeah, you'd likely need some other way to refer to a position in the code than the line number. Maybe an anchor (like HTML) or a path in the syntax tree.
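To make the "path in the syntax tree" idea concrete, here is a small sketch in Python with the standard `ast` module. The dotted-path convention (`Shape.name`) and the helper `line_of` are assumptions made up for this example; the point is only that the same anchor resolves to a different line number in each person's formatted copy.

```python
import ast


def line_of(source: str, path: str) -> int:
    """Resolve a dotted path such as 'Shape.name' to its line in this copy."""
    scope = ast.parse(source).body
    node = None
    for part in path.split("."):
        # Look for a class or function with this name in the current scope.
        node = next(
            (
                n for n in scope
                if isinstance(n, (ast.ClassDef, ast.FunctionDef)) and n.name == part
            ),
            None,
        )
        if node is None:
            raise LookupError(f"no definition named {part!r} on path {path!r}")
        scope = node.body
    return node.lineno


# The same code, laid out according to two different preferences.
compact = (
    "class Shape:\n"
    "    def area(self): return 0\n"
    "    def name(self): return 's'\n"
)
spacious = (
    "class Shape:\n"
    "\n"
    "    def area(self):\n"
    "        return 0\n"
    "\n"
    "\n"
    "    def name(self):\n"
    "        return 's'\n"
)

print(line_of(compact, "Shape.name"), line_of(spacious, "Shape.name"))
```

Two people could then talk about `Shape.name` and each tool would translate that to the right line in the local view.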
These tools already exist: JavaScript tools like minifiers or CoffeeScript can generate source maps which map code lines from one format to code lines in another format.

http://www.html5rocks.com/en/tutorials/developertools/source...

but they are used for debugging, not bi-directional transformation between two incompatible sources.