The l (lowercase L) and the 1 (one) look really similar. Could that be the cause of some confusion? Of course, the function name helps, but most developers have learned not to trust function names to be an accurate description of what the function does, especially in tricky interview questions.
Still, I'd change this to something like:
def append_five(l=[]):
    l.append(5)
    return l
It tests the same thing (knowledge of how default parameters work), but without the confounding problem of similar-looking characters. Of course, syntax highlighting would help the applicant out.
All of that being said, I still don't doubt that many developers don't know what they should about default parameters.
I'm not a Python expert, but iirc from various blog posts, the "l" variable does not get reset between function calls, which will cause undesired behavior. So calling the function 3 times without an argument would return the same list each time, holding 1, 2, and then 3 elements, rather than 3 separate lists of size 1. Can any Python gurus confirm?
The expression presented in the parameter list is only evaluated once, and that is when the method is defined. The confusion is that people assume the expression is evaluated every time the method is called.
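This is easy to check by calling the function three times with no argument. A minimal sketch, reusing the append_five function from earlier in the thread; the names first, second, and third are only for illustration:

```python
def append_five(l=[]):
    # The [] above is evaluated once, when this def statement runs.
    l.append(5)
    return l

first = append_five()
second = append_five()
third = append_five()

# All three names refer to the same list object, which has grown to three items.
print(first is second is third)  # True
print(third)                     # [5, 5, 5]
```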
I doubt that's the only reason. I fell for it myself at first without ever seeing a line of Ruby. It initially feels intuitive, and that's why I think most fall for it.
I think it's because the arguments are bound when the function is called. It's just natural that you'd expect the default values to also be bound at the same time.
> The default value gets created when the function is interpreted ("compiled").
No. The default value gets "created" (the expression is evaluated and stored) when the def statement is executed. Take the following example:
In [1]: def foo():
   ...:     def append_five(l=[]):
   ...:         l.append(5)
   ...:         return l
   ...:     return append_five
   ...:
In [2]: a = foo()
In [3]: b = foo()
In [4]: a()
Out[4]: [5]
In [5]: b()
Out[5]: [5]
In [6]: _4 is _5
Out[6]: False
We only wrote one function definition, but multiple lists are created. (They are created when the "def append_five" definition executes, during the execution of foo.)
I thought that was what he meant. Is there any sharp distinction between "interpreting" and "evaluating" in Python that I am unaware of? I've always used the words more or less interchangeably. But now that I think about it, that might be a little naive, since I have no idea how it works under the hood.
You could say that interpreting is first parsing and second executing/evaluating. The parser tokenizes and does a small amount of optimization such as ignoring unassigned values.
> The key is object mutability. A list type is mutable and a tuple type is immutable.
I don't think the question has much to do with mutability. It isn't surprising to me, nor I'd imagine to most programmers, that a list is mutable; that's very common.
The surprising part of this question is that the default value of 'l' continues to exist outside the lexical scope of the function. The expected behavior is that the value of 'l' is initialized at function call time and garbage collected after each call. As it stands, using default values in Python is sort of like defining a global that only has a named reference inside the function block, which is very strange.
It has something to do with mutability, because if an object is immutable, the behavior of Python matches what the naive developer expects. It's only mutable objects that break those expectations.
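To make that concrete, here is a small sketch comparing an immutable default with a mutable one. The function names add_one and append_one are made up for this example:

```python
def add_one(n=0):
    # Integers are immutable: += rebinds the local name n to a new object,
    # so the stored default 0 is never changed.
    n += 1
    return n

def append_one(items=[]):
    # Lists are mutable: append changes the stored default object in place.
    items.append(1)
    return items

# Copy each list result so later calls cannot change what was recorded.
int_results = [add_one(), add_one()]
list_results = [list(append_one()), list(append_one())]

print(int_results)   # [1, 1]
print(list_results)  # [[1], [1, 1]]
```

Both defaults are evaluated once at def time; only the mutable one lets a call leave a visible trace behind.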
Don't even get into unexpected behavior in classes:
In [1]: class A(object):
   ...:     l = []
   ...:
In [2]: a, b = A(), A()
In [3]: a.l.append("Something")
In [4]: a.l
Out[4]: ['Something']
In [5]: b.l
Out[5]: ['Something']
In [6]: class B(object):
   ...:     l = None
   ...:     def __init__(self):
   ...:         self.l = []
   ...:
In [7]: c, d = B(), B()
In [8]: c.l.append("Something")
In [9]: c.l, d.l
Out[9]: (['Something'], [])
The other scoping issue in Python that always struck me as strange is that loop variables aren't scoped to the loop; they continue to exist after the loop completes. I can see the logic for this feature even if I don't agree with it, but what I really don't get is that the loop variables are not defined if you iterate over something that is empty:
>>> for item in [1]:
...     print(item)
...
1
>>> item
1
>>> for i in []:
...     print(i)
...
>>> i
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
I would expect i == None. That oddity makes it dangerous to use the feature unless you're really careful (e.g. using a for - else construct).
In [1]: for i in []:
   ...:     pass
   ...: else:
   ...:     print('Else!')
   ...:
Else!
In [2]: for i in []:
   ...:     break
   ...: else:
   ...:     print('Else!')
   ...:
Else!
In [3]: for i in range(2):
   ...:     break
   ...: else:
   ...:     print('Else!')
   ...:
In [4]: for i in range(2):
   ...:     pass
   ...: else:
   ...:     print('Else!')
   ...:
Else!
The syntax could easily be misread as meaning:
if len(l) == 0:
    print("Else!")
else:
    for i in l:
        pass
The "detect whether a `break` was triggered" use case isn't common enough for this syntax feature to be encountered very often, which leads to confusion when people do come across it (though at least it's not a case where a common use case behaves strangely for newcomers).
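For what it's worth, the typical use is a search loop where the else clause handles "nothing found". A minimal sketch; find_first_negative is a made-up name for this example:

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case where numbers is empty.
        result = None
    return result

print(find_first_negative([3, -1, -2]))  # -1
print(find_first_negative([1, 2]))       # None
print(find_first_negative([]))           # None
```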
> but what I really don't get is that the loop variables are not defined if you iterate over something that is empty
If you conceptualize how a for-loop has to work as a while-loop using Python's iterator protocol (which is the only way the iterator protocol itself makes sense), it seems pretty intuitive.
If you have an empty loop, the first assignment doesn't complete (instead raising StopIteration in evaluating the right side, which raises the notional exception __NormalLoopExit, which invokes the else: clause, if any) so the variable never gets around to being created.
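Roughly, the while-loop view looks like the sketch below. It is a simplification that leaves out the else clause, and last_loop_value is a made-up helper name:

```python
def last_loop_value(iterable):
    """Hand-written equivalent of `for i in iterable: pass`, then reading i."""
    iterator = iter(iterable)
    while True:
        try:
            i = next(iterator)  # i is bound only if next() actually returns
        except StopIteration:
            break               # a real for loop would run its else clause here
        # the loop body would go here
    try:
        return i
    except NameError:  # UnboundLocalError is a subclass of NameError
        return "never bound"

print(last_loop_value([1, 2]))  # 2
print(last_loop_value([]))      # never bound
```

With an empty iterable, the very first next() raises before the assignment completes, so there is nothing for i to refer to afterwards.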
Most immutable objects don't have methods that would mutate the value, but fail because the object is immutable...
I guess the clarification to what I was saying is that, in the simple case (integers, strings, None) the objects are immutable. It's only getting into cases where the value of the object itself is mutable, that you run into issues. If all objects (or all objects 'allowed' as default values) were immutable, then this behavior would not trigger.
So saying that mutability has nothing to do with it isn't entirely true. It's the immutability of the types of values used in most simple cases that hides this issue from developers until they run into a more complex case.
I believe the former is more idiomatic, but I don't have a reference.
You want to explicitly check against `None` so that you're not overwriting all falsey values of `val`. Even though you should generally try to enforce argument types, your second example would cause unexpected behavior in some cases, particularly those that have non-falsey 'default' assignments.
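To illustrate the difference, here is a sketch of the two styles side by side. These are not the snippets from the parent comment, and both function names are made up:

```python
def append_five_is_none(val=None):
    # Only a missing argument (None) is replaced with a fresh list.
    if val is None:
        val = []
    val.append(5)
    return val

def append_five_or(val=None):
    # Replaces ANY falsey argument, including a caller's own empty list.
    val = val or []
    val.append(5)
    return val

mine = []
append_five_is_none(mine)
after_is_none = list(mine)   # the caller's list was updated

theirs = []
append_five_or(theirs)
after_or = list(theirs)      # the caller's list was silently swapped out

print(after_is_none)  # [5]
print(after_or)       # []
```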
I'm sorry, but I don't understand why you would think of this as unexpected behaviour. For class A, the list l is a class-level attribute, hence it can be referred to via either the a or the b object; but for class B, after initialisation, l is an instance attribute, so it is different for c and d.
It's not the concept that can be confusing, it's the syntax Python chose.
In most of the languages I'm familiar with, there are very clear syntax differences when working with class attributes. For example, in many languages class attributes have to be accessed via the class name instead of from an instance of the class, making it clear to the programmer that they are working with a class attribute, e.g. MyClass.myClassVariable, not myInstance.myClassVariable. Additionally, the way you define class attributes in Python is the way you define instance attributes in many languages, which just adds to the confusion. E.g. in Java or C# you can define class variables directly in the class body, but an explicit 'static' keyword is needed; undecorated definitions are assumed to be instance variables.
Finally, I think the definition of class B above is a little more nuanced: class B has both a class attribute named l AND an instance attribute named l.
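You can see both attributes by looking at the two namespaces directly. A small sketch using the same class B as in the session above:

```python
class B(object):
    l = None

    def __init__(self):
        self.l = []

c = B()
class_level = B.__dict__["l"]     # the class attribute
instance_level = c.__dict__["l"]  # the instance attribute that shadows it

del c.l           # remove only the instance attribute
after_delete = c.l  # lookup now falls back to the class attribute

print(class_level, instance_level, after_delete)  # None [] None
```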
It's been a while since I've done major OOP coding in any language other than Python, so I'm a little rusty. The issues you raise are perfectly legitimate and would be understandably confusing to newcomers to the language. :)
I'm not a Python dev, but I've been meaning to learn for a while. So this is really interesting stuff. A few questions, if you don't mind.
I understand mutability and immutability in other languages (and I gave your link a quick read to make sure there weren't any weird Python-specific rules), so I understand how the list can change and still be the same object, but a tuple or string would not. But why does that mean that the default parameter object remains in existence throughout all calls, instead of being recreated each time it is called?
Is there a reason for this being the default behavior? It seems like the majority of the time you would want to use a default parameter, you'd want it to behave like your bug-free examples.
Think of the default parameter values as arguments to the initializer for the function object. If you passed a list into the constructor of a class, you wouldn't be surprised that if you modified the list outside the class that it would modify the same list inside the class.
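The class half of that analogy, as a minimal sketch. Holder and shared are hypothetical names used only for this example:

```python
class Holder:
    def __init__(self, items):
        # Stores a reference to the caller's list, not a copy.
        self.items = items

shared = []
holder = Holder(shared)
shared.append(5)  # modified outside the class

print(holder.items)            # [5]
print(holder.items is shared)  # True
```

A def statement does the same thing with its default values: it stores a reference to whatever object the default expression produced.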
While that explains how it works, I actually completely agree with you. This is surprising behavior and, in a language that prides itself on not being surprising, seems, well, surprising.
I have to wonder if performance isn't the big reason for it. If your default is [], it isn't a big deal to re-evaluate, but if your default is get_default_cities_from_slow_web_service(), having that re-evaluated on every function call would be catastrophic. Given the choice between two negatives, the choice they made is probably reasonable.
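Whatever the actual design reason was, the cost difference is easy to demonstrate. In this sketch slow_lookup is a hypothetical stand-in for the slow web service call, and the evaluations list just counts how often it runs:

```python
evaluations = []

def slow_lookup():
    # Hypothetical stand-in for an expensive call such as a web service request.
    evaluations.append("called")
    return ["Springfield", "Shelbyville"]

def cities_at_def_time(cities=slow_lookup()):
    # slow_lookup() already ran once, when this def statement executed.
    return cities

def cities_per_call(cities=None):
    # The None sentinel defers the lookup to call time, so it runs on every call.
    if cities is None:
        cities = slow_lookup()
    return cities

count_after_def = len(evaluations)

for _ in range(3):
    cities_at_def_time()
count_after_def_time_calls = len(evaluations)

for _ in range(3):
    cities_per_call()
count_after_per_call = len(evaluations)

print(count_after_def, count_after_def_time_calls, count_after_per_call)  # 1 1 4
```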
Before I ever ask this question (I do a lot of tech interviews sadly) I always ask the candidate about object mutability vs immutability. Almost everyone knows the textbook answer, and only a few know the actual implications of it. This tests which they know :)
Default kwargs of a function are evaluated at function definition. However, they are only in scope within said function. It is a weird but important subtle difference.
When you use + on two lists, a new list is created, and elements from both are copied into the new one, whereas the append operation modifies the list in place and simply adds a value. Keep in mind that a Python "list" is really like a C++ vector, so while an append sometimes allocates a new array and copies all the values, in general it is amortized O(1). The + operation is O(n).
And besides all that, there is nothing wrong with doing an append on one line, and returning the variable on the next. It's clear and readable.
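A quick sketch of the difference between the two operations; the variable names are only for illustration:

```python
original = [1, 2]

combined = original + [5]                   # builds a brand-new list
combined_is_new = combined is not original
original_after_plus = list(original)        # snapshot: original is untouched

result = original.append(5)                 # mutates original in place

print(combined)             # [1, 2, 5]
print(combined_is_new)      # True
print(original_after_plus)  # [1, 2]
print(result)               # None
print(original)             # [1, 2, 5]
```

Since append returns None, writing `return l.append(5)` would hand back None rather than the list, which is one more reason to keep the append and the return on separate lines.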
Yes, that's correct. The default value is only evaluated once, when the `def` statement is executed. After that point, it's completely mutable. You have to see Python functions as objects and default parameter values as object variables.
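You can actually inspect those "object variables": function objects keep their positional defaults in the `__defaults__` tuple. A short sketch using the append_five function from earlier in the thread:

```python
def append_five(l=[]):
    l.append(5)
    return l

# Copy the stored default before any call, so the snapshot cannot change later.
before = list(append_five.__defaults__[0])

returned = append_five()
same_object = returned is append_five.__defaults__[0]

print(before)                    # []
print(same_object)               # True
print(append_five.__defaults__)  # ([5],)
```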
The problem is not the existence of the object variables, but that they are in such an unfortunate place. The rest of the parameter list is declaring fresh local variables. It's inconsistent that the left side of the equals sign is per-invocation, and the right side is per-def.
In [1]: a = []
In [2]: a.append(a)
In [3]: a
Out[3]: [[...]]
In [4]: a[0]
Out[4]: [[...]]
In [5]: a[0][0]
Out[5]: [[...]]
In [6]: a[0][0][0]
Out[6]: [[...]]
In [7]: a[0][0][0][0]
Out[7]: [[...]]
In [8]: a.append(a)
In [9]: a
Out[9]: [[...], [...]]
In [10]: a[0][1][0] is a
Out[10]: True
In [11]: id(a)
Out[11]: 4547140064
In [12]: id(a[0][1][0])
Out[12]: 4547140064