Thursday 22 September 2016

Magic behind python WAT moments

Standard

So once upon a time you find that thing in your programming language that behaves not really how you would expect it to.
This post will be about those moments when you are programming in python.

1. 'is' on small ints

This is one of the most popular one, let's say you write something along those lines (which you probably never should actually, I will explain why in a sec):

>>> a = 5

>>> b = 5

>>> a is b

True

Okay, that's seems reasonable, 5 is definitely a 5, let's try the next one.

>>> a = 300

>>> b = 300

>>> a is b

False

wat? This seems pretty inconsistent... But why?
For that we will have to look inside python implementation!
(This one is from python 3.5.2):

Objects/longobject.c

#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS 257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS 5
#endif

...

/* Small integers are preallocated in this array so that they
   can be shared.
   The integers that are preallocated are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];


Uh, now everything makes sense! Integers between -5 and 256 are kept in an array for quick access.
So with 'is' operator that checks if tested objects are the same one, it's true only for -5  <= x <= 256.
That's why I said you shouldn't use this code in the first place(I've seen it a few times), it's misuse of 'is' operator which in this case should be replaced with '=='.

2. type comparison

Let's play with REPL just for a bit longer.

>>> 5 < "a"
True

>>> True > "a"
False

>>> {} > 5
True  

At a glance it doesn't make much sense, but fast lookup into documentation tells us everything we need to know.
CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.

Note: This is only true for Python 2.*, Python 3.* documentation states this:
The <, <=, > and >= operators will raise a TypeError exception when comparing a complex number with another built-in numeric type, when the objects are of different types that cannot be compared, or in other cases where there is no defined ordering.
Which you can confirm with a simple check.

3. default function value

More fun:

>>> def fun(bar=[]):
...   bar.append("yoyo")
...   return bar
... 
>>> fun()
['yoyo']
>>> fun()
['yoyo', 'yoyo']

That "feature" is the one that everyone should know. It's really surprising for newcomers, for explanation and workaround I advise to read this.

0 comments:

Post a Comment