Show HN: Parse and output TODOs and FIXMEs from comments in your files (github.com)
49 points by pgilad 4041 days ago
18 comments

Many IDEs have built-in support for this and I find it very useful. Good that it is coming to the command line too!

It works great as a code review tool, too. Agree on a marker, add review comments to the code, commit to source control, and tell your teammate about it. It's just about as at-your-fingertips as diffs-with-comments tools (like GitHub PRs), and way lighter on the tooling. You also see all the context, not just the changed lines. E.g.:

    // REVIEW(me): This is getting big, maybe split into two classes?

In smaller teams (say, <5 people on a single shared codebase) I've found this to work remarkably well.

My developer has just shown me this.

My first reaction was to say we already have that in our Rails project via `rake notes`. As he mentioned, the rake task is very slow, though. It will also ignore a lot of our code which is not Rails-specific.

On second thought, I realized this should be hooked into git. That allows very fast lookup and ensures we do not search untracked, and thus irrelevant, files such as logs.

I came up with this alias for git:

    todo = grep -E 'FIXME|TODO'
Formatting is nowhere near as pleasant as Leasot or `rake notes`, but it's blazing fast and relevant.

Anyway, thanks for having made us think about that :)

Is there any advantage here over using Ag or Awk or similar:

  ag "(TODO|FIXME)"
Well, you can see my comment above, but overall ag|awk|pt will be much faster (though possibly less accurate). Leasot tries to weed out some false positives by using comment regexes specific to each file type. It also provides several reporters (JSON, XML, Markdown...) if you want to integrate with other tools (Jenkins, Travis, etc.).
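
To illustrate the file-type-specific comment regexes mentioned above, here is a minimal Python sketch. It is not Leasot's code (Leasot is a Node.js tool); the `COMMENT_PREFIXES` table and the `find_todos` function are hypothetical names, and only single-line comment markers at the start of a line are handled.

```python
import re

# Hypothetical mapping from file extension to its line-comment prefix.
COMMENT_PREFIXES = {".js": "//", ".py": "#", ".rb": "#"}


def find_todos(text, extension):
    """Return (line_number, kind, message) for comments that start with TODO or FIXME."""
    prefix = re.escape(COMMENT_PREFIXES[extension])
    # The comment marker must be the first non-blank thing on the line and the
    # keyword must follow it directly, which skips strings and identifiers.
    pattern = re.compile(r"^\s*" + prefix + r"\s*(TODO|FIXME)\b[:\s]*(.*)$")
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = pattern.match(line)
        if match:
            results.append((number, match.group(1), match.group(2).strip()))
    return results


sample = """var TODO_LIST = [];
// TODO: split this module
alert('TODO: not implemented');
  // FIXME handle leap years
"""
print(find_todos(sample, ".js"))
```

A plain `grep TODO` would report all four sample lines; this sketch reports only lines 2 and 4, at the cost of missing comments that trail code on the same line.
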
I guess the question is how this compares to standard existing tools such as grep, i.e. something like

   $> grep -Hn TODO test.js
   test.js:241:TODO ...
A lot of comments here comparing it with grep or some other native CLI. I think this fares better at least on three metrics:

1. Fun to build.

2. Output looks prettier.

3. Easier to add more syntactic stuff as edge cases for different languages are factored in. Won't be too hard to extend it for GitHub issues, for example.

Since there would be very few cases of partial matches of "TODO" I think grep would still be much faster.

1. That's a moot point because it doesn't explain why other people might want to use it.

2. $ alias grep="grep --color"

3. It would be even easier to extend grep via pipes. Plus, pipes are language-agnostic, whereas extending this tool requires knowledge of Python.

As a personal project, this is fine. But I really don't see the point in anyone else using it when it's slower than existing tools, no more user friendly, requires more dependencies, and isn't part of the default install like grep and find (Windows CLI) are. Plus this tool doesn't even support all instances where a TODO might appear (https://github.com/pgilad/leasot#comment-format) nor all programming languages (https://github.com/pgilad/leasot#supported-languages) like grep and find do.

Leasot is written in Node.js. Regarding pretty output - that could definitely be argued, but Leasot also allows for different reporters, say you want the output in JSON/XML for an external tool. That is extendable, whereas grep over regex in CLI is fast & powerful but not as flexible
> Leasot is written in Node.js.

Ah yes, my mistake. Though being written in JavaScript makes it even worse for requiring dependencies, as at least Python ships with most distros' default install.

> Regarding pretty output - that could definitely be argued, but Leasot also allows for different reporters, say you want the output in JSON/XML for an external tool. That is extendable, whereas grep over regex in CLI is fast & powerful but not as flexible

I'd already addressed that point. UNIX pipes allow you to extend grep using any language (including JavaScript / Node.js) you want. Leasot is only extensible if you already know JavaScript and Node.js. Plus there are plenty of CLI tools available that can read list input from STDIN and spit out JSON or XML, so you don't even need to learn how to program to convert data files.

I've lost count of the number of times I've seen people write multi-functional programs to solve problems that would have been quicker to solve (both in development time and in execution) by piping a couple of existing programs together. Heck, I've even fallen into this trap myself before.

I like the idea, but it's a deal-breaker if it can't parse TODOs and FIXMEs at the end of lines—that's where most of mine go.
This could actually be implemented, but it was more work with the regex so I skipped it for now, seeing that most TODOs are at the beginning of the line.

Also, you run into problems with strings, which might produce false positives.

I just do this: `alias TODO='ack TODO'`
Visual Studio does this automatically.

1. View -> Task List

2. At the top of the task list click the drop-down that is by default set to "User Tasks" and choose "Comments."

You can also customize the trigger words[1]. I very rarely use the feature myself as failing unit tests are far superior when compared to TODOs.

[1]: https://msdn.microsoft.com/en-us/library/zce12xx2.aspx

Eclipse has this as well; the feature is called "Task Tags" under General->Editors->Structured Text Editors in preferences. (Can just search for "task" in Preferences; Eclipse settings are a nightmare)

You can also configure the text for each tag (TODO, FIXME, etc.) as well as the priority.

Xcode does this natively as well, as of a fairly recent version (6.3, I think).
Thanks for the comments. Regarding simple `ag` or `grep` usage: yeah, that will definitely be faster (as with git grep). The problem arises when you have false positives due to strings, variable names or other things (perhaps template or pattern matches). Then you will need either an extremely good regex (which Leasot implements) or to be really good at filtering the results.

Now what happens when you want the output in different formats? One person contacted me about exporting XML because he wants to plug it into CI (Jenkins). What if you want JSON for your own tool?

Leasot is far from the perfect solution to TODOs, but if your use case requires anything other than simple regex, you will run into the same issues that Leasot tries to solve.

As for speed: in my work project, parsing 552 JavaScript files takes around 0.2s on my Mac, and some of these files are really big.

Outputting the data in your own format is actually remarkably simple from the command line - you could build up a complete JSON document by piping the input through `jq`. XML might be a bit harder, but with a bit of find and replace across a template (or by using xmllint), you can create pretty much any document you'd like.
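
A minimal Python sketch of the same idea, for anyone who would rather not assemble the document with `jq`: it takes `file:line:text` lines, the shape `grep -Hn` prints, and emits a JSON array. The function name `grep_lines_to_json` and the field names are assumptions chosen for this example, not a format that any of the tools in this thread define.

```python
import json


def grep_lines_to_json(lines):
    """Convert 'file:line:text' lines (as printed by grep -Hn) to a JSON array."""
    records = []
    for raw in lines:
        raw = raw.rstrip("\n")
        # Split on the first two colons only, so colons in the matched text survive.
        parts = raw.split(":", 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # skip anything that is not in the expected shape
        path, number, text = parts
        records.append({"file": path, "line": int(number), "text": text.strip()})
    return json.dumps(records, indent=2)


print(grep_lines_to_json(["test.js:241:// TODO: remove this"]))
```

Paths that themselves contain a colon would need smarter splitting than this sketch does.
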

In short, a cool tool, which I hope you had fun creating to fill your needs, but it's not something I could personally justify using at the moment. My lint tools (for Python) and `godoc` already alert me to "XXX" and "TODO" comments, letting me see them as part of my normal programming flow.

I wouldn't expend too much effort trying to convince others of its utility - you'll get frustrated. Instead, let the tool's utility speak for itself (if it's useful to you, it will also be useful to someone else), and move on and create more useful tools.

There is a plugin for Sublime Text with similar functionality.

For ST2: https://github.com/SublimeLinter/SublimeLinter-for-ST2#subli...

For ST3: https://github.com/SublimeLinter/SublimeLinter-annotations

I've been using a one-liner git alias to do this for a while now. Works really well.

It uses git grep for grepping/colouring the output and sed to remove prefixed whitespace.

This is a great solution. I would only use Leasot if you need to weed out some false positives (variable names, strings etc...) and perhaps want to output in a special format (JSON, XML, markdown...) for another tool (Think jenkins CI for example).

If CLI regex matching (grep, ag, git grep, pt, ack...) works for you, I would stick with it ;)

looks great, any plans on integrating with github issues?

the watson[1] gem (for ruby) has a nice feature list you could go off of. leasot + some of those features might be more suitable for people who want to stay in npmland

[1]: https://github.com/nhmood/watson-ruby

Watson looks really nice... Could definitely learn from it.

By GitHub issues, do you mean exporting a TODO to a GitHub issue? If so, I don't think that belongs in Leasot, but rather in an external tool for creating/manipulating GitHub issues (and I'm sure that kind of tool exists).

ah nice, I've done something similar that integrates with github issues and PRs, mine is not as polished tho https://sevki.org/joker
I always thought TODO and such were a huge anti-pattern. Let's be real, these are never fixed. I always thought that you should be logging this as cards in your backlog rather than littering the codebase.
I use them primarily when developing something large from scratch. All sorts of edge cases / things that can go wrong will pop into my mind when coding a method but I don't want to get out of the flow of whatever I'm doing. In this case a simple "TODO: fix for leap years" or whatever is helpful. When I get closer to completion I will search for these and attend to them.

In a production codebase, you are absolutely correct, you should standardize these items into a ticket manager.

I do this as well. The codebase becomes my scratchpad, and I leave all kinds of implementation notes throughout as I sketch out what the end product will end up looking like.

Then it's a simple matter of grepping through the codebase for "XXX" and "TODO" and implementing the changes.

Even when starting from scratch, my usual flow is to implement the top level logic using stubs (commented with "XXX"), then go back and fill in the logic from there.

I just do a global search using my text editor.
This functionality is already built into every major OS and this new tool isn't any easier to use than the existing ones:

Linux / UNIX:

  grep -n TODO *.js
Windows:

  find /n "TODO" *.js
This library is also doing regex matching, so that you don't have to worry about matches that exist outside of comments...

But like you said...you could always use fgrep (and most of us probably don't have "TODO" outside of comments anyways).

grep supports regular expressions as well (hence its name: http://en.wikipedia.org/wiki/Grep). In fact GNU grep even has support for PCRE.

However it should be noted that regex isn't the right tool for parsing source code to begin with. There could be instances in that code where false positives are matched. And there are already known instances where positives are missed (namely "//" comments). So if you really want to exclude anything that isn't a comment then the only accurate way to do so would be full source code parsing. Failing that, it's better to match all instances since "TODO" generally isn't a string that occurs frequently outside of comments (unless it's a visual prompt to the user, eg

  alert("TODO: this feature hasn't been implemented yet")
but in those instances you'd want the source code captured as well).
Yep, to truly avoid false positives and capture all TODOs you would need to really parse the source code (perhaps by building an AST or doing lexical analysis).

Even if you build that tool (which handles many languages) - it has an extra headache cost with the parsing time, which might be really slow for large projects.

I actually started Leasot with JavaScript AST checking, which never misses TODOs but is very hard to extend to other languages, and parsing was an order of magnitude slower.
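
For a sense of what the lexer-based route looks like in a single language, here is a minimal sketch that uses Python's standard `tokenize` module on Python source. It illustrates the general approach only and is not how Leasot works; `todo_comments` is a hypothetical helper name. Because the tokenizer knows where strings end and comments begin, a `TODO` inside a string literal is not reported, and comments at the end of a line are found.

```python
import io
import re
import tokenize

MARKER = re.compile(r"\b(TODO|FIXME)\b[:\s]*(.*)")


def todo_comments(source):
    """Return (line_number, kind, message) for TODO/FIXME markers in real comments."""
    found = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for token in tokens:
        if token.type != tokenize.COMMENT:
            continue  # strings, names and everything else are ignored
        match = MARKER.search(token.string)
        if match:
            found.append((token.start[0], match.group(1), match.group(2).strip()))
    return found


source = '''message = "TODO: this is only a string"
total = 0  # TODO: handle leap years
# FIXME(me) rounding is wrong
'''
print(todo_comments(source))
```

The string on the first sample line is skipped, while the trailing comment on line 2 and the full-line comment on line 3 are both reported. The catch raised above still applies: every additional language needs its own tokenizer.
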

Just out of interest, how often do you find yourself using "TODO" as a variable name or elsewhere in your code?

This isn't a dig, it's just something I've genuinely never stumbled across in 2 decades of programming so wondered if there's a culture out there I've missed.

Probably never. False positives is just a nice catch-phrase ;) But I don't presume to know other people's naming patterns
If you have TODO outside of comments which are false positives, you need to see those, and weed them out along with the true TODO items.

This TODO scanner project is completely, utterly, pointless.

There is no need for your second line. Your first line is a valid, relevant opinion. The second one has no use but to belittle the OP (or anyone else who has written a TODO scanner). You're being a dick, and missing the forest for the trees while you're at it. This project can be integrated into a larger gulp workflow, for example. It also has the ability to write the output to custom formats.
I see, I can build a cloud-powered dashboard to give my mobile workforce a semantically-enhanced visibility into the proliferation of TODO comments throughout the enterprise, integrated into the social networking applications they already use.

TODO comments could automatically be turned into action items, against the developer who introduced them (git blame!)

The software could be intelligent enough to uncover relationships among the TODO's, and schedule a meeting for specific subsets of TODO's, calling in all the relevant peo^H^H^H stake holders.

;) I'm in no way offended. I guess kazinator is spending his time much better than me. But anyway, if anyone is more comfortable using the CLI (or any IDE that provides this) with grep/ack/ag/pt and that solves parsing TODOs for him, that's great. I would only use Leasot if you need easy integration with other tools (think JSON/XML) or want to trim down false positives (such as variable names, strings, etc.).

There are probably other use cases, but overall, if cli regex works great for you, stick with it.

Regarding if Leasot is pointless or not, well everyone is entitled to their own opinion (kazinator). In the greater sense I guess the world needs more todo doers than todo parsers ;)

You probably want fgrep.
fgrep / grep -F only really comes into its own if you have regular expression patterns that need to be matched as plain strings. "TODO" is purely alpha characters so regular grep is fine.
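
The distinction only shows up once the marker contains regex metacharacters. A minimal Python sketch of the same idea, with `re.escape` standing in for what `grep -F` does (the marker `TODO[me]` and the sample lines are made up for illustration):

```python
import re

lines = ["// TODO[me]: split this class", "// TODOmorrow is not a marker"]

# As a regular expression, [me] is a character class, so this pattern means
# "TODO followed by m or e" and matches only the second line.
as_regex = [line for line in lines if re.search(r"TODO[me]", line)]

# Escaping the pattern makes every character literal, like grep -F.
as_fixed = [line for line in lines if re.search(re.escape("TODO[me]"), line)]

# A plain alphabetic marker behaves the same either way.
plain = [line for line in lines if re.search("TODO", line)]

print(as_regex)
print(as_fixed)
print(plain)
```
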
Did not know that. Thank you.
What did you think regular grep did, then?
This looks good. Very similar functionality is built into Rails via `rake notes`.