
Shoot. I mixed up the module name. It's the sh module https://sh.readthedocs.io/en/latest/

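    import os
    import sh

    # sh.Command looks up programs whose names aren't valid Python identifiers.
    sort_env = os.environ.copy()
    sort_env["LANG"] = "C"
    sh.wc("-l", _in=sh.uniq("-c", _piped=True, _in=sh.sort(_env=sort_env, _piped=True, _in=sh.Command("some-command")(_piped=True, _err_to_out=True))))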

It's less natural for most people because the nested form reads inside out (right to left) instead of left to right as with the pipe syntax. Splitting it over multiple lines helps:

    some_command = sh.Command("some-command")(_piped=True, _err_to_out=True)
    sorted_lines = sh.sort(_env=sort_env, _piped=True, _in=some_command)  # avoid shadowing the builtin sorted()
    unique = sh.uniq("-c", _piped=True, _in=sorted_lines)
    word_count = sh.wc("-l", _in=unique)

There's also all sorts of helpful stuff you can do, like invoking a callback per line or chunk of output, using "with" contexts to run a sequence of commands as sudo, etc.
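For the per-line callback, here's a rough sketch of how I'd wire it up, assuming sh's "_out" argument accepts a callable that gets handed each line of output. The helper names (make_line_counter, count_lines_of) are made up for illustration and aren't part of sh:

```python
from collections import Counter


def make_line_counter(counts):
    """Return a callback that tallies each line of output it is handed."""
    def on_line(line):
        # Lines arrive with their trailing newline; strip it before counting.
        counts[line.rstrip("\n")] += 1
    return on_line


def count_lines_of(command_name, *args):
    """Run a command via sh and tally its output lines as they arrive."""
    # Imported lazily so the callback above is usable without sh installed.
    import sh

    counts = Counter()
    # Assumption: passing a callable as _out makes sh call it once per line.
    sh.Command(command_name)(*args, _out=make_line_counter(counts), _err_to_out=True)
    return counts
```

Something like count_lines_of("some-command") would then give you the per-line tallies without needing uniq -c at all.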

And of course, you don't actually need to shell out to sort/uniq either:

    output_lines = sh.Command("some-command")(_err_to_out=True).splitlines()
    num_lines = len(set(output_lines))

This is also cheaper because it skips the sort, which isn't needed just to count the unique lines. For large inputs, sorting is typically the more expensive way to do that compared to a hash set: O(n log n) string comparisons vs O(n) hash operations.
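To make the comparison concrete, here's a small pure-Python sketch of both approaches. The function names are just for illustration, and the sorted version only imitates what sort | uniq | wc -l does rather than calling the real tools:

```python
def count_unique_hashed(lines):
    """Count distinct lines with a hash set: O(n) hash operations on average."""
    return len(set(lines))


def count_unique_sorted(lines):
    """Imitate `sort | uniq | wc -l`: sort, then count runs of equal neighbours."""
    ordered = sorted(lines)  # O(n log n) string comparisons
    runs = 0
    previous = None
    for line in ordered:
        # A new run starts at the first line or whenever the value changes.
        if runs == 0 or line != previous:
            runs += 1
        previous = line
    return runs


sample = ["b", "a", "b", "c", "a"]
hashed_count = count_unique_hashed(sample)
sorted_count = count_unique_sorted(sample)
```

Both give the same count; the difference is only in how much work it takes to get there.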

It's really quite amazing, and way less error-prone when maintaining anything more complicated. That said, I haven't found it easy to develop muscle memory with it, but that's my general experience with libraries.



Oh neat, I guess I missed the "_piped" arg when I looked. That does make it a lot better.

> And of course, you don't actually need to shell out to sort/uniq either:

Yeah, it's a contrived example. Imagine something more important happening there. :)



