by 38leinad 5108 days ago
just use cygwin

I feel like cygwin manages to combine the worst of UNIX and the worst of Windows. And the commands that use the cygwin runtime are noticeably slower than the native counterparts.

Also, AFAIK there is no 64bit version yet, which for me is a dealbreaker (I often work with files larger than 2GB).

Almost nothing, except in-memory processing, requires 64-bit for dealing with files over 2GB. Piping with utilities is the Unix way, and it works well with files of any size in Cygwin.

Cygwin commands that run slower than Windows counterparts are typically those that are syscall heavy, where those syscalls are significantly different on Windows and need lots of work for emulation. The biggie is fork(); it's better by far to write scripts etc. in such a way that they stream results rather than iterate and create new processes.

So, for example, rather than write a script that converts Unix paths to Windows paths with iterated calls to cygpath -w, instead pipe the paths to cygpath -w -f -. Rather than use pipe-to-sed (like "$(echo $foo | sed 's|bar|baz|')") for ad-hoc edits, try to use shell substitutions instead (like "${foo/bar/baz}").
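
A minimal bash sketch of the two suggestions above. The variable names and sample values are made up for illustration, and the cygpath call is guarded so the snippet also runs on systems without Cygwin.

```shell
#!/usr/bin/env bash
# Illustrative value only; not taken from the comment above.
foo="foo bar qux"

# Pipe-to-sed: forks a subshell and execs sed for a single edit.
via_sed="$(echo "$foo" | sed 's|bar|baz|')"

# Shell substitution (bash): same result for this input, no child process.
via_shell="${foo/bar/baz}"

echo "$via_sed"
echo "$via_shell"

# Batch path conversion: one cygpath process for the whole list,
# instead of one `cygpath -w` call per path.
if command -v cygpath >/dev/null 2>&1; then
    printf '%s\n' /usr/bin /tmp | cygpath -w -f -
fi
```

Note that `${foo/bar/baz}` replaces only the first match and treats `bar` as a glob pattern rather than a regex, so it is not a drop-in replacement for every sed expression.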

Another thing that can be slower in Cygwin is find, when run over very large directory trees. I wrote a wrapper script (I call it rdir) that runs "cmd.exe /c dir /b" and massages the output into a Cygwin-style format. I also have the same script written in terms of find, so that my scripts that use it work on Windows, Solaris, Linux and OS X.
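
The wrapper itself isn't shown in the comment, so the following is only a guess at what such a script could look like. The function body, the `/s` and `/a-d` switches, and the use of `cygpath -u -f -` to convert the output are all assumptions, and the Cygwin branch is an untested sketch.

```shell
# rdir: hypothetical sketch of a portable recursive file lister.
# Usage: rdir [directory]
rdir() {
    case "$(uname -s)" in
        CYGWIN*)
            # `dir /s /b /a-d` prints bare, fully qualified paths of files only.
            # tr strips the CRs from cmd.exe's CRLF output; `cygpath -u -f -`
            # converts the whole list in a single process.
            cmd /c dir /s /b /a-d "$(cygpath -w "${1:-.}")" \
                | tr -d '\r' | cygpath -u -f -
            ;;
        *)
            # Everywhere else, plain find does the same job.
            find "${1:-.}" -type f -print
            ;;
    esac
}
```

One difference to keep in mind: the `dir` branch prints absolute paths, while `find` prints paths relative to the argument it was given, so callers that care would need to normalise one or the other.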

But I have to say, the biggest limiting factor in me solving ad-hoc problems is composing the tools available, rather than the actual runtime speed of the tools themselves. Having all the Unix tools available makes my life far easier in this respect. They could be even slower, and I wouldn't mind, because I would still be saving lots of time compared to what Windows provides; and my scripts usually also work on all my other systems running different OSes.

PowerShell doesn't even support simple fork-join like bash does trivially:

    for x in {1..10}; do (sleep $x; echo $x) & done; wait

I use this idiom a lot when dealing with lots of multi-gigabyte files. PowerShell is mostly useful to me when I need to access Windows-specific stuff that Cygwin doesn't do well, like WMI.
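
Applied to files, the same fork-join idiom might look like the sketch below. The file names and the per-file job (a byte count) are placeholders, not anything from the comment above.

```shell
# Fork-join: start one background subshell per file, then wait for all.
workdir="$(mktemp -d)"
printf 'aaaa\n'   > "$workdir/one.dat"
printf 'bb\n'     > "$workdir/two.dat"
printf 'cccccc\n' > "$workdir/three.dat"

for f in "$workdir"/*.dat; do
    # Each subshell runs concurrently and writes its own result file,
    # so the jobs never interleave their output.
    ( wc -c < "$f" | tr -d ' ' > "$f.count" ) &
done
wait    # join: returns once every background job has exited

total=0
for c in "$workdir"/*.count; do
    total=$((total + $(cat "$c")))
done
echo "total bytes: $total"
rm -rf "$workdir"
```

Writing each result to its own file is one way to keep the join step simple; the jobs could just as well write to stdout if interleaved output is acceptable.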

> Almost nothing, except in-memory processing, requires 64-bit for dealing with files over 2GB. Piping with utilities is the Unix way, and it works well with files of any size in Cygwin.

Even ls and wc -c report bogus results with >2GB files. less does not work even if I just want to look at the first few hundred lines (and "head -n 1000 | less" is a horrible workaround).

> Cygwin commands that run slower than Windows counterparts are typically those that are syscall heavy

Most unix commands are syscall/filesystem/IO heavy, after all they are file utilities. What you say with find is exactly what I'm talking about. I find that the unix tools ported to Win32 and compiled with mingw are significantly faster.

Eh? What you state about ls, wc and less is directly contrary to my experience. I'm so astonished I created a 30GB test file and tested it:

     $ cmd /c dir k.txt
     Volume in drive C is CobraRoot
     Volume Serial Number is 02D8-502C
    
     Directory of C:\Users\barrkel\AppData\Local\Temp
    
    2012-07-01  15:24    31,292,160,000 k.txt
                   1 File(s) 31,292,160,000 bytes
                   0 Dir(s)  142,087,471,104 bytes free
    
    $ du -h k.txt
    30G     k.txt
    $ ls -l k.txt
    -rw-r--r--+ 1 barrkel None 31292160000 Jul  1 15:24 k.txt
    $ wc -c k.txt
    31292160000 k.txt
    $ time wc -l k.txt
    6400000000 k.txt
    
    real    1m5.651s
    user    0m46.207s
    sys     0m8.642s

66 seconds to read 30GB isn't too bad; that's over 400MB/sec. (It's an SSD.) When I said directory listings could be slow, I meant directory listings, not general I/O; simple read() and write() do not need translation (provided you aren't using Cygwin text-mode mount options, which are not recommended).
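
For anyone wanting to repeat the size checks without writing 30GB of data, a sparse file is enough. This sketch assumes GNU coreutils `truncate` (shipped with Cygwin and most Linux distributions); the 3G size and variable names are arbitrary choices for illustration.

```shell
# Create a sparse file larger than 2 GiB and compare the reported sizes.
big="$(mktemp)"
truncate -s 3G "$big"          # sparse: takes almost no real disk space

expected=$((3 * 1024 * 1024 * 1024))
# `ls -ln` prints numeric owner/group, so field 5 is always the size.
ls_size="$(ls -ln "$big" | awk '{ print $5 }')"
wc_size="$(wc -c < "$big" | tr -d ' ')"

echo "expected=$expected ls=$ls_size wc=$wc_size"
rm -f "$big"
```

If all three numbers agree, the tools are handling 64-bit file offsets correctly; a wrapped or negative value would point to the kind of breakage described earlier in the thread.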

    $ less k.txt

This works just fine; when I press > to go to the end of the file, it goes there immediately, but stays busy calculating line numbers (it's scanning the whole file); if I cancel with Ctrl+C, it stops, just like it does on other Unix OSes.

PS: It's the mingw tools that don't work properly! I tried it a couple of times, but all the incompatibilities made me give up pretty quickly.

Thanks, that's very interesting, maybe I should give cygwin another try. Last time it was a couple of years ago and I had all the mentioned problems, then I decided to wait until a 64bit version before trying it again...

If I had to guess, I'd say somehow you ended up with text-mode mounts in your previous experience. The default, and recommended, is binary mode, but you're given a choice on install. It affects C programs that specify "t" to fopen() and friends, and causes Cygwin to convert line endings to and from DOS. But it's more trouble than it's worth.
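
To see why line-ending conversion can make sizes and offsets look wrong, here is a small sketch. It only simulates the effect by stripping carriage returns with tr; it does not use a real text-mode mount, and the sample data is made up.

```shell
# Two lines with DOS (CRLF) endings: 6 bytes on disk.
dos="$(mktemp)"
printf 'a\r\nb\r\n' > "$dos"
raw_bytes="$(wc -c < "$dos" | tr -d ' ')"

# Stripping the CRs approximates what a text-mode read hands to the program.
unix_bytes="$(tr -d '\r' < "$dos" | wc -c | tr -d ' ')"

echo "raw=$raw_bytes after-CR-strip=$unix_bytes"
rm -f "$dos"

# On Cygwin, the mount table shows which mode each mount uses.
case "$(uname -s)" in
    CYGWIN*) mount ;;   # look for "binary" or "text" in the option list
esac
```

The byte count a program sees no longer matches the size on disk once conversion is involved, which is one plausible way tools end up disagreeing about a file.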

I'm with you there -- I find Cygwin starts slow and unstable, and then seems to rot at an astonishing speed, to the point where it's usually unusable after I've had it installed for a couple of months.

Cygwin doesn't rot. It doesn't automatically update, and nor does it self-configure, so there's nothing to cause the rot. I've never had problems like you describe.

The biggest problem - and what I suspect is happening to you - is when you have third-party programs and utilities that interfere with Cygwin, most often by putting an older or newer version of cygwin1.dll on the $PATH (i.e. you may be using Cygwin as part of some other program and not be aware of it). Cygwin uses shared memory; multiple versions of cygwin1.dll disagree on the format of this shared memory, and things go pear-shaped pretty quickly from there.
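
One rough way to check for stray copies is to walk $PATH and report every directory that holds a cygwin1.dll. The helper name `find_on_path` is made up for this sketch, and it only covers directories on the Cygwin shell's $PATH; a copy sitting in some application's own folder would not show up.

```shell
# List every directory on $PATH that contains the named file.
# More than one hit for cygwin1.dll suggests a conflicting copy.
# The body runs in a subshell so IFS and `set -f` don't leak out.
find_on_path() (
    IFS=:
    set -f   # no glob expansion of PATH entries
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        if [ -f "${dir:-.}/$1" ]; then
            printf '%s\n' "${dir:-.}/$1"
        fi
    done
)

find_on_path cygwin1.dll
```

On a healthy install this would be expected to print a single line, the copy in Cygwin's own bin directory.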

Also some antivirus programs can trip up Cygwin; in its emulation, it sometimes has cause to open, close then open files in quick succession, but AV programs sometimes analyse files when they are opened by programs, and cause bogus timing-dependent sharing errors.

I love Cygwin, but I have experienced what I would describe as rot (manifested as errors on fork()). I think that's caused by things changing around it, via Windows DLL updates, but I don't have a lot of insight into Windows' DLL handling and whatnot, so I don't know for sure. I'm also pretty sure installing new Cygwin packages can cause this. Usually, rebaseall fixes such problems.

In any event, I much prefer Cygwin to the alternatives.

"there's nothing to cause the rot"

The rot tends to set in as I install packages to Cygwin; the more I add, the slower and less stable Cygwin seems to become, to the point of taking tens of seconds to reach a prompt after opening. Make of that what you will.

"Also some antivirus programs can trip up Cygwin"

That could well be a contributing factor in my case.

Did you try disabling bash completion[1]?

[1] http://cfc.kizzx2.com/index.php/tag/cygwin-slow-performance-...

No, I didn't -- good spot, thank you.

I haven't bothered re-installing cygwin since the last nuke-and-pave, but I'll bear that in mind if I do and it's still relevant.

Windows lacks fork(), or at least it's not documented. The Cygwin folks have to go through a lot of tricks to implement it, and one of them is rebasing all DLLs to start at different addresses.

There is a tool to rebase everything, and it's usually started after install. It's possible that after you have recompiled your own apps/dlls they might need rebasing too (speaking as a cygwin user, not developer).

The only thing that I found annoying in Cygwin is that I can't use all of its commands from cmd.exe, because some of them only make sense under Cygwin. For example, "gcc" is just a Cygwin symbolic link to "gcc-something.exe", and when you run it from cmd.exe it gives "Access denied". The workaround is to run it like this from cmd.exe: sh -c "gcc <args>"
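
An alternative that might work is to look up the real executable behind the symlink from inside Cygwin and call that from cmd.exe instead. This is only a sketch: the helper name `real_tool` is made up, and it assumes GNU `readlink -f`, which Cygwin's coreutils provides.

```shell
# Print the real file behind a command that may be a Cygwin symlink.
real_tool() {
    readlink -f "$(command -v "$1")"
}

# Example (Cygwin only): show the Windows path of the binary behind gcc.
if command -v cygpath >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
    cygpath -w "$(real_tool gcc)"
fi
```

Whether the resolved binary behaves identically when started directly is not guaranteed, and Cygwin's bin directory still has to be on the Windows PATH so that cygwin1.dll can be found.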

But it can solve very hard problems. For example, redis (from antirez's depot) compiled with Cygwin worked for me, and although there is a much better Windows version by the MSOpenTech guys, it shows that sometimes Cygwin might be the only reasonable way. The other app that comes to mind is rsync.

> Windows lacks fork(), or at least it's not documented.

The NT kernel supports forking, but the Win32 subsystem does not. Because Interix lives outside the Win32 subsystem, it can provide a proper fork() implementation, whereas Cygwin has to live with its somewhat brittle emulation.

Maybe I misunderstand this, but couldn't someone write a driver for Cygwin to have a better fork implementation?

As I understand it, the problem lies with making the different subsystems interact: forking itself is reasonably easy using NtCreateProcess(), but the Win32 subsystem won't know how to deal with the forked process and stuff will break, including console output.

I don't see Microsoft adding forking support to the Win32 subsystem any time soon, so you'd end up rewriting Cygwin from scratch by reverse engineering Interix...

When I sometimes have to work in Windows (which really makes no sense at those companies, since the products I build run on Linux 99% of the time), Cygwin makes Windows usable.