Plus One I Use Less Than People Say I Should
Python isn’t my main language as an embedded software developer, but I use it a lot to create tools and test harnesses for projects. Here are some of the language and library features I have found most useful over the years.
1: The “with” Statement
Context managers were one of the best inventions of the early years of Python. Suddenly it took very little care to manage resources and behave gracefully in error conditions. Yes, sure, they are syntactic sugar for try/finally blocks, but for once they are syntactic sugar that works for the programmer rather than just making things look pretty. I seldom write my own context managers (it’s rare that I use resources that there aren’t already managers for), but I use the built-in ones all the time. File handling in particular is just easier to think about when all I have to do is:
    with open("myfile.dat", "wb") as outfile:
        important_data.save(outfile)
I don’t have to explicitly close the file (so I can’t forget to do that), and if something goes wrong saving my data the file will still be properly closed, which can be a massive aid while debugging.
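To make the "still properly closed" claim concrete, here is a minimal sketch. The FailingData class is a hypothetical stand-in for important_data, invented purely so that save() fails part-way through, and save_with_try_finally shows roughly what the with statement spares you from writing by hand.

```python
import os
import tempfile


class FailingData:
    """Hypothetical stand-in for important_data: save() fails part-way."""

    def save(self, outfile):
        outfile.write(b"partial")
        raise ValueError("simulated failure while saving")


def save_with_try_finally(data, path):
    """Roughly what the with statement saves you from writing by hand."""
    outfile = open(path, "wb")
    try:
        data.save(outfile)
    finally:
        outfile.close()  # runs whether or not save() raised


path = os.path.join(tempfile.mkdtemp(), "myfile.dat")

try:
    with open(path, "wb") as outfile:
        FailingData().save(outfile)
except ValueError as error:
    print("save failed:", error)

# The exception propagated out of the with block, but the file was closed first.
print("closed after the error:", outfile.closed)
```

The exception still reaches the caller in both versions; the only thing the context manager takes over is the clean-up.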
If you haven’t learned to love context managers, go and do so immediately.
2: Comprehensions
List, dictionary and set comprehensions (and their close cousin generator expressions) are a bit of a mixed blessing. On the plus side, they are compact ways to build data structures that you would otherwise have to generate element by element in a for loop. On the minus side, they are compact ways to build data structures that might make more sense written out as a for loop.
Some people find expressions like “[2*n + 1 for n in some_list_of_ints]” hard to understand. I’ve never had a problem with that, having a background in degree-level maths, so I use this shorthand all the time. In fact my usual problem is remembering that these things are called “comprehensions”, because that term makes no sense to me. The term seems to have been coined in 1977, but it’s not one I met as a mathematician and I have to do a lot of torturing of the English language to relate building a list like that to comprehending it.
I don’t use comprehensions when the expressions start getting complicated. If the calculation is complex, or there would have to be a non-trivial filtering of the original data structure, then it is generally better to write it out in full rather than abbreviate with a comprehension. The full version will be more, er, comprehensible.
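As a rough illustration of where that line might fall, the sketch below shows the simple expression from above next to its for-loop equivalent, followed by a case with enough filtering that the loop arguably reads better. The parse_settings function and its "name=value" input format are hypothetical, made up for this example.

```python
some_list_of_ints = [1, 2, 3, 4]

# Simple transformation: the comprehension says it all in one line.
odds = [2 * n + 1 for n in some_list_of_ints]

# The same thing written out as a loop.
odds_loop = []
for n in some_list_of_ints:
    odds_loop.append(2 * n + 1)


def parse_settings(lines):
    """Hypothetical parser for "name=value" lines with integer values.

    Blank lines, comments and lines without "=" are skipped. This much
    filtering is easier to follow as a loop than as a comprehension.
    """
    settings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if not separator:
            continue
        settings[name.strip()] = int(value)
    return settings


print(odds == odds_loop)
print(parse_settings(["# comment", "", "baud = 9600", "garbage", "retries=3"]))
```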
3: pathlib
Structured handling of file and directory names is a joy of the universe, and you will not persuade me otherwise. Python’s standard library module “pathlib” does exactly that, returning objects that you can manipulate in ways that make sense.
It isn’t possible to completely hide the operating system-specific nature of filenames, but pathlib gives it a pretty good go. The elements of a path are defined to cover both Windows and Unix variations, in such a way that the differences can be smoothed away when you don’t need to know them. Gotchas start springing up whenever symbolic links get involved, as they always do, but Path objects do have methods you can use to work around even those.
If you ever have to process file or directory names, even for as little as checking they exist, then pathlib is the way to go.
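A short sketch of the kind of manipulation this makes possible follows. The file names and directory layout are invented for illustration. The "pure" path classes are used for the first part so that both Unix and Windows flavours can be shown on any operating system, since they never touch the filesystem.

```python
import tempfile
from pathlib import Path, PurePosixPath, PureWindowsPath

# Hypothetical paths in both flavours; pure paths do no filesystem access.
posix_log = PurePosixPath("/var/log/myapp/mylog.txt")
windows_log = PureWindowsPath(r"C:\logs\myapp\mylog.txt")

# The same attributes work on both, whatever the separator.
for log in (posix_log, windows_log):
    print(log.name, log.stem, log.suffix, log.parent)

# The / operator joins path elements without any string fiddling.
summary = posix_log.parent / "reports" / "summary.csv"

# Concrete Path objects can also ask the filesystem questions.
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "mylog.txt"
    existed_before = candidate.exists()
    candidate.write_text("boot ok\n")
    existed_after = candidate.is_file()
```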
4: argparse
With the standard library module “argparse” there is really no excuse for not doing parameter parsing properly in your Python command line tools. I use it even for trivial cases because it’s easy to do and simple to extend, and one thing I’ve discovered over the years is that I always end up tweaking my tools.
It’s easy to be put off using the ArgumentParser class because it is very flexible, but I don’t find I need things like argument groups or mutually exclusive options very often. Types, defaults and a few of the actions are all you need most of the time, and the benefit of getting clean data to your application code far outweighs the ten minutes you may need to refresh your memory about what is what.
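As a sketch of how little is needed in the common case, the parser below uses only a positional argument, a type, a couple of defaults and one action. The tool and its option names (--output, --max-lines, --verbose) are hypothetical, chosen just for this example.

```python
import argparse


def build_parser():
    """Build the parser for a hypothetical log-cleaning tool."""
    parser = argparse.ArgumentParser(description="Clean up a log file.")
    parser.add_argument("infile", help="log file to clean")
    parser.add_argument("-o", "--output", default=None,
                        help="output file name (default: chosen by the tool)")
    parser.add_argument("-n", "--max-lines", type=int, default=1000,
                        help="maximum number of lines to keep")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print progress information")
    return parser


# Passing a list stands in for the real command line here.
args = build_parser().parse_args(["mylog.txt", "-n", "50", "-v"])
print(args.infile, args.output, args.max_lines, args.verbose)
```

The application code then gets an int for max_lines and a bool for verbose without doing any conversion or validation itself.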
The only real fly in the ointment is file handling. I find that I rarely use the FileType feature that will automatically open files for you. There are two reasons for that. First, it bypasses the file context manager I mentioned above, so I should really use a try/finally combination to ensure that the file is properly closed when I’ve finished with it. That’s a bit messy and unlovely. Second, I often find myself deriving my default output file name from the input file name. For example, a log-cleaning tool might automatically turn the input filename “mylog.txt” into an output filename of “mylog_clean.txt” if I don’t feel like specifying something else. If I have to haul the filename out of the open file object ArgumentParser hands me, that’s another source of mess, not to mention that I would have to repeat the special casing for stdin and stdout. It turns something that shortens my code into something that lengthens it, which is not exactly a win.
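One possible way to get the behaviour described here, sketched under the assumption that the arguments are taken as pathlib.Path objects rather than opened files. The default_output and resolve_files helpers and the --output option are hypothetical names for this example, not part of argparse.

```python
import argparse
from pathlib import Path


def default_output(infile):
    """Derive e.g. mylog_clean.txt from mylog.txt, keeping the directory."""
    return infile.with_name(f"{infile.stem}_clean{infile.suffix}")


def resolve_files(argv):
    """Parse argv and return (input path, output path) as Path objects."""
    parser = argparse.ArgumentParser(description="Hypothetical log cleaner.")
    # type=Path converts the string; nothing is opened at parse time.
    parser.add_argument("infile", type=Path)
    parser.add_argument("-o", "--output", type=Path, default=None)
    args = parser.parse_args(argv)
    outfile = args.output if args.output is not None else default_output(args.infile)
    return args.infile, outfile


print(resolve_files(["mylog.txt"]))
print(resolve_files(["logs/mylog.txt", "-o", "tidy.txt"]))
```

Because the parser hands back names rather than open file objects, the files can still be opened later inside an ordinary with block.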
5: pdb
The Python Debugger is another one of those great things about the standard library. Debug environments for Python code are as varied as the IDEs and specialist editors that exist for the language, but there are circumstances where the irresistible force of your IDE comes up against the immovable object of a structure that just isn’t going to play ball with it. For example I use Qt for Python at home, and my usual debug environments fail the moment the Qt event loop takes over.
Not so pdb. I do have to manually insert “import pdb; pdb.set_trace()” at the points in the code I’m suspicious of, and then run my application from the command line (which I was doing anyway), but when my code hits the breakpoint it will stop and give me a debug prompt. It may look primitive compared with those flashy development apps, but it has the advantage of working.
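A minimal sketch of what that looks like in practice follows. The checksum function and its debug flag are hypothetical, added so that the example can run without stopping unless you ask it to. In a real session you would more likely drop the set_trace() line in temporarily and remove it afterwards.

```python
import pdb


def checksum(data, debug=False):
    """Hypothetical 8-bit additive checksum with an optional breakpoint."""
    total = 0
    if debug:
        # Execution pauses here and a (Pdb) prompt appears in the terminal.
        pdb.set_trace()
    for byte in data:
        total = (total + byte) & 0xFF
    return total


# Runs straight through; pass debug=True to stop at the breakpoint.
print(checksum(b"\x01\x02\x03"))
```

From Python 3.7 onwards the built-in breakpoint() function does the same job without the import, since by default it calls pdb.set_trace().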
+1: venv
Don’t get me wrong, virtual environments are not a bad thing. I can see the value in being able to pop up specific versions of Python packages for specific development purposes, though at work I am more likely to do that through Docker containers. It’s just that on my home computer, which I am the only user of, I almost never need that. There are few good reasons not to install a package in the generic site-packages directory when it will never clash with anything. At that point, having to activate a virtual environment is an unnecessary and unwelcome overhead.
Virtual environments are like Docker containers in that they solve an irritating problem, but they do it in an irritating way. The only way I find to avoid irritation is to stop doing whatever it was that required me to use them. I have plenty of bits of code I have stopped playing around with at home because doing anything with them required me to find and activate a relevant virtual environment that I have long forgotten about. That’s not a good thing.
