	by clon 1673 days ago
	At what point should you give up writing a sed "program" and just write a few lines of Python to arrive at an answer? Seems he went way past that point.

3 comments

jolmg 1673 days ago

It was even easier to just count with my finger. :)

The point was a sed challenge. Challenges are fun sometimes.

It was also a good exercise for doing things with sed. There are times when sed is the most terse way to shape text to how you want it. It's also very portable.

azalemeth 1673 days ago

Is solving the challenge he tackled actually easier in Python? I know counting the weeks by hand is easier, but is parsing text like a ninja far less verbose in POSIX-like tools? Learning awk made me much more productive, at the very least...

pcwalton 1673 days ago

Ruby and Perl have awk modes, activated with command-line switches, that are strictly more powerful than awk. I tried using awk recently for a text-processing task and it was a fun learning experience, but I ended up rewriting in Ruby. There's basically no advantage that awk has over Ruby or Perl, aside from portability to embedded systems, and plenty of disadvantages.

jolmg 1673 days ago

> Ruby and Perl have awk modes

Yes, but not Python. azalemeth was wondering if Python could be more terse than sed for this type of work. I think Python tends to be the most verbose of the languages discussed here for text handling.

> There's basically no advantage that awk has over Ruby

There are a few. The main advantage of awk over Ruby for me is that it's more terse for some simple jobs. That's because of features like

- how uninitialized variables can be used as numbers, strings, or associative arrays without prior declaration. For example, in the common snippet `awk '!a[$0]++'`, `a` is used as an array without being declared as one first, and the element's value is used as a number (by `++`) without being declared as one either.

- how number strings can be implicitly used as numbers

- how records are split by default in a very useful way, with each field made available in a very terse manner with the `$` unary operator.

- how plain regexes can be used as booleans that match over the line.

- how simply passing a boolean expression means to print records that match. It also helps that records can be multi-line or be defined by regexes.

- how ranges can be specified with a couple of boolean expressions

- how files and co-processes can be implicitly opened on first use

It also has an advantage in that

- It launches much faster than most languages. I recently wrote a small shell DSL that uses many piped calls to awk, to write some extensions for i3, and performance was surprisingly decent. jq calls were a far bigger bottleneck.

- It's available on even really old systems. It can be refreshing at times when your only other options are sh, C, or 4GL.

There are many cases where awk is more convenient than Ruby.
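
To make the terseness comparison concrete, here is a rough Python sketch of two of the awk idioms above: `awk '!a[$0]++'` (print each line only the first time it appears) and `awk '{print $2}'` (print the second whitespace-separated field). The function names `dedupe_lines` and `second_fields` are made up for illustration, and this is only one of several ways to write it.

```python
def dedupe_lines(lines):
    """Yield each line the first time it is seen, roughly like awk '!a[$0]++'."""
    seen = set()  # awk creates the array `a` implicitly; here it must be declared
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def second_fields(lines):
    """Yield the second whitespace-separated field, roughly like awk '{print $2}'."""
    for line in lines:
        fields = line.split()  # awk does this split for every record by default
        # awk yields an empty string for a missing field; mimic that here
        yield fields[1] if len(fields) > 1 else ""
```

Fed from `sys.stdin`, `dedupe_lines` would behave like the one-liner in a pipeline. The explicit set, loop, and split are what awk's defaults save you from typing.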

nrdvana 1672 days ago

That pretty much describes Perl, except for the co-processes. Probably in part due to Larry Wall initially setting out to make "a better awk".

helsinki 1673 days ago

cal -y | sed -E '1,2d;s/(.{20}) /\1\n/g;' | python3 -c "import sys, calendar;[print(ln.strip()) for ln in sys.stdin.readlines() if ln.strip() in calendar.month_name[1:]]"

easier to do:

python3 -c "import calendar;[print(m) for m in calendar.month_name[1:]]"

t0mas88 1673 days ago

I think you've only read the title? :-) The article is about a script that prints the full calendar format, with all the days/dates in between to be able to count some weeks. Not just a list of month names.
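
For comparison, a minimal Python sketch that prints a full-year calendar grid, similar in spirit to `cal -y`, could look like the following. It is not the article's method; the function name `full_year_calendar` is hypothetical, and starting weeks on Sunday is an assumption chosen to resemble the usual `cal` layout.

```python
import calendar


def full_year_calendar(year):
    """Return a text calendar for the whole year, three months per row."""
    # firstweekday=calendar.SUNDAY is an assumption to mimic a typical `cal` layout
    cal = calendar.TextCalendar(firstweekday=calendar.SUNDAY)
    return cal.formatyear(year)
```

Calling `print(full_year_calendar(2024))` gives a grid with every date in it, which is what you would need in order to count weeks by eye.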

fragmede 1673 days ago

> Of course, I just looked at the calendar and counted
