Obfuscated Python code

Sounds like a contradiction, does it not? But if the “feature” I’m about to describe doesn’t qualify as obfuscated, then I don’t know what on Earth does.

It all started with me documenting a considerable chunk of JavaScript (JS) code. As you might know, when a web page is rendered, JS code is sent to the client browser as is, unless some pre-processing is done, and executed there. So, to avoid sending the code documentation to the client, I started searching for a way of stripping comments out of JS code, and eventually found jsstrip. It’s a very simple script, and two versions of it are provided: one in Perl, the other in Python.

I surveyed both, and when I was looking at the latter, the following snippet of code drew my attention:

while (i < slen):
    # skip all "boring" characters. This is either
    # reserved word (e.g. "for", "else", "if") or a
    # variable/object/method (e.g. "foo.color")
    j = i
    while (j < slen and chars.find(s[j]) == -1):
        j = j + 1
    if i != j:
        token = s[i:j]
        i = j

Python is quirky about indentation, but apparently not as strict as I thought. (Oh, and please don’t copy the sample code: HTML mangles the indentation…) From all I’d read about Python’s indentation rules, I expected that code to yield a downright error. But it does not! It took me a while to find out why: all the indentation is done with blocks of 4 spaces, except for the line with ‘j = i’ and the lines right above and below it (the fourth, fifth and sixth lines of the snippet above). Those lines are indented with a tab. What’s more surprising, not only does this yield no errors when running the script (unless you pass the -tt option; -t only issues a warning), it also has a curious (to say the least…) effect: the tab characters act like an extra indent block!
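Since tabs and spaces look identical in most editors, a small helper makes the culprit lines visible. This is only a sketch and not part of jsstrip: `find_tab_indented_lines` is a hypothetical name, the sample text is made up, and all it does is report the lines whose leading whitespace contains a tab.

```python
def find_tab_indented_lines(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indentation is whatever lstrip() would remove from the front.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits


# Made-up sample: tabs on lines 2 and 3, spaces on line 4.
sample = (
    "while i < slen:\n"
    "\tj = i\n"
    "\twhile j < slen:\n"
    "            j = j + 1\n"
)
print(find_tab_indented_lines(sample))
```

Running something like this over a script before trusting what the editor shows is a cheap way to spot the mix without needing the -t or -tt options.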

So the above code is actually equivalent to a version in which those lines carry an extra level of indentation. I think this works because a tab is translated to 8 spaces, and that accounts for the extra indentation (although in my test at the Python prompt, a tab got translated to 7 spaces…).
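The eight-column guess can be illustrated with `str.expandtabs`, which advances each tab to the next multiple of the given tab size. The sketch below assumes the interpreter measures indentation the same way; `indent_width` is a hypothetical helper, not something taken from Python itself or from jsstrip.

```python
def indent_width(line, tabsize=8):
    """Column at which the first non-blank character of `line` starts."""
    # Expand tabs to the next multiple of `tabsize`, then count leading spaces.
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))


print(indent_width("    token = s[i:j]"))   # indented with four spaces
print(indent_width("\tj = i"))              # indented with one tab
print(indent_width("        j = j + 1"))    # indented with eight spaces
```

Under that assumption a tab-indented line lands on the same column as eight spaces, which is one level deeper than a 4-space block, and that would explain the extra indent.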

I’m not saying this ain’t useful sometimes (in this script’s case it saves having a big chunk of code with an extra indent…), but it really shook the idea that I had of Python as a clean language. Moreover, I only noticed the extra indent because, when I ran the code under the pdb debugger (for which this is a great intro), the indentation was “unrolled”, so to speak (it appeared as in the second code snippet).

Roughly translating a former Portuguese reporter: «And what about this one, eh?»

