by akarambir 2793 days ago
It was a sed substitute command on a ~800 MB file on a ThinkPad T470 with an SSD. Each substitution took around 40-50 seconds. Though, as others have pointed out, it may not be directly related to the article under discussion.
2 comments

>It was taking around 40-50 sec for each substitution.

The time per substitution isn't really a relevant metric, since the substitution itself barely affects the result. Sed/Awk still have to scan the whole file to find every occurrence to replace, and once an occurrence is found, the replacement itself takes nanoseconds.

The size of the file is a better metric (e.g. how many seconds in total for that 800 MB).
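
To make the "seconds for the whole file" measurement concrete, here is a rough sketch. The generated sample file and the `foo` pattern are made up for illustration, and `date +%s` only has one-second resolution, so this only says something useful on a file large enough to take several seconds.

```shell
# Build a throwaway sample file (hypothetical data standing in for the real 800 MB file).
big=$(mktemp)
awk 'BEGIN { for (i = 0; i < 200000; i++) print "line " i " foo bar" }' > "$big"

# Time one full pass over the file, not the individual substitutions.
start=$(date +%s)
sed 's/foo/FOO/g' "$big" > /dev/null
end=$(date +%s)

elapsed=$((end - start))
size=$(( $(wc -c < "$big") ))
echo "processed $size bytes in ${elapsed}s"
```

Dividing the file size by the elapsed time gives a throughput figure that can be compared across tools and regexes.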

It also matters whether you used a regex in your awk/sed command, and what kind. A badly written regex can slow the search down considerably.

Did you use a regex with quadratic or worse behavior, such as one with more than one .* in it?

Did you set LANG=C before running sed, to bypass the UTF-8 logic?
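
A minimal sketch of the locale suggestion, assuming a made-up sample file and pattern. It uses `LC_ALL=C` rather than `LANG=C` because `LC_ALL` takes precedence over `LANG` and the other `LC_*` variables, so it still works if one of those is already set in the environment. Whether it speeds things up depends on the sed implementation and the pattern.

```shell
# Throwaway sample file (hypothetical data).
sample=$(mktemp)
printf 'alpha beta\ngamma beta\n' > "$sample"

# Set the locale for this one command only; the rest of the shell session is unaffected.
c_out=$(LC_ALL=C sed 's/beta/BETA/g' "$sample")
echo "$c_out"
```

In the C locale sed treats the input as plain bytes, so this is only safe when the pattern does not rely on multibyte character classes or case folding.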

Also, if you had a list of substitutions to perform, did you try writing them as a single sed script?
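
As a sketch of the single-script idea (the file names and the three substitutions are invented for illustration), compare one pass per substitution with one pass for all of them:

```shell
# Throwaway working directory and sample input (hypothetical data).
tmpdir=$(mktemp -d)
printf 'foo one\nbar two\nbaz three\n' > "$tmpdir/input.txt"

# One full pass over the file per substitution: the file is read three times.
sed 's/foo/FOO/g' "$tmpdir/input.txt" > "$tmpdir/pass1.txt"
sed 's/bar/BAR/g' "$tmpdir/pass1.txt" > "$tmpdir/pass2.txt"
sed 's/baz/BAZ/g' "$tmpdir/pass2.txt" > "$tmpdir/multi.txt"

# Single pass: list all substitutions in one script file and run it with -f.
cat > "$tmpdir/subs.sed" <<'EOF'
s/foo/FOO/g
s/bar/BAR/g
s/baz/BAZ/g
EOF
sed -f "$tmpdir/subs.sed" "$tmpdir/input.txt" > "$tmpdir/single.txt"

# Both approaches should produce identical output.
cmp -s "$tmpdir/multi.txt" "$tmpdir/single.txt" && echo "outputs match"
```

With N substitutions, the script version reads the file once instead of N times, which is where most of the time goes on a large file.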