
The l (lowercase L) and the 1 (one) look really similar. Could that be the cause of some confusion? Of course, the function name helps, but most developers have learned not to trust function names to be an accurate description of what the function does, especially in tricky interview questions.

Still, I'd change this to something like:

    def append_five(l=[]):
        l.append(5)
        return l
It tests the same thing (knowledge of how default parameters work), but without the confounding problem of similar-looking characters. Of course, syntax highlighting would help the applicant out.

All of that being said, I still don't doubt that many developers don't know what they should about default parameters.
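For reference, a quick session with the renamed function (a sketch, assuming CPython and Python 3 syntax) shows what the question is probing:

```python
def append_five(l=[]):
    l.append(5)
    return l

# The default list is created once, at def time, and reused by
# every call that omits the argument.
print(append_five())        # [5]
print(append_five())        # [5, 5]
print(append_five([1, 2]))  # [1, 2, 5] -- explicit arguments are unaffected
print(append_five())        # [5, 5, 5] -- the shared default keeps growing
```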



I'm not a Python expert, but IIRC from various blog posts the "l" variable does not get reset between function calls, which causes undesired behavior. So calling the function 3 times without an argument would produce lists of size 1, 2, and 3, rather than 3 lists of size 1. Can any Python gurus confirm?


The expression presented in the parameter list is only evaluated once, and that is when the method is defined. The confusion is that people assume the expression is evaluated every time the method is called.
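One way to see that the expression runs exactly once is to inspect the function object's `__defaults__` attribute (Python 3 name; it is `func_defaults` in Python 2):

```python
def append_five(l=[]):
    l.append(5)
    return l

# The default list already exists before any call is made:
print(append_five.__defaults__)  # ([],)
append_five()
print(append_five.__defaults__)  # ([5],) -- the call mutated the stored list
```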


The confusion is that people assume the expression is evaluated every time the method is called.

Because that's how it works in a lot of other languages, such as Ruby and Javascript.


I doubt that's the only reason. I fell for it myself at first without ever seeing a line of Ruby. It initially feels intuitive, and that's why I think most fall for it.


I think it's because the arguments are bound when the function is called. It's just natural that you'd expect the default values to also be bound at the same time.


Yes that is correct. The default value gets created when the function is interpreted ("compiled").


> The default value gets created when the function is interpreted ("compiled").

No. The default value gets "created" (the expression is evaluated and stored) when the def statement is executed. Take the following example:

  In [1]: def foo():
     ...:     def append_five(l=[]):
     ...:         l.append(5)
     ...:         return l
     ...:     return append_five
     ...:

  In [2]: a = foo()

  In [3]: b = foo()

  In [4]: a()
  Out[4]: [5]

  In [5]: b()
  Out[5]: [5]

  In [6]: _4 is _5
  Out[6]: False
We only wrote one function definition, but multiple lists are created. (They are created when the "def append_five" definition executes, during the execution of foo.)


I thought that was what he meant. Is there any sharp distinction between "interpreting" and "evaluating" in Python that I'm unaware of? I've always used the words more or less interchangeably. But now that I think about it, that might be a little naive, since I have no idea how it works under the hood.


You could say that interpreting is first parsing and second executing/evaluating. The parser tokenizes and does a small amount of optimization such as ignoring unassigned values.


The parent wrote "compiled", which is certainly more incorrect than either "interpreted" or "evaluated."


The key is object mutability. A list type is mutable and a tuple type is immutable.

If the candidate correctly deduces what will happen, I'll ask them to write a bug-free version, which looks like one of the below:

    def append_one(var=None):
        var = var or []
        var.append(1)
        return var

    def append_one(var=None):
        if var is None:
            var = []
        var.append(1)
        return var

Mutability is a very subtle but very important concept to understand in Python. Everyone who uses Python for non-trivial code should know it well: https://docs.python.org/2/reference/datamodel.html


> The key is object mutability. A list type is mutable and a tuple type is immutable.

I don't think the question has much to do with mutability. It isn't surprising to me, nor I imagine to most programmers, that a list is mutable; that's very common.

The surprising part of this question is that the default value of 'l' continues to exist outside the lexical scope of the function, the expected behavior is that the value of 'l' is initialized at function call time and is garbage collected after each call. As it sits, using default values in python is sort of like defining a global that only has a named reference inside the function block, which is very strange.
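That "global with a named reference inside the function" intuition can be checked directly: the list returned by argument-less calls is one and the same object every time (sketch, assuming CPython):

```python
def append_five(l=[]):
    l.append(5)
    return l

first = append_five()
second = append_five()
assert first is second        # the very same object both times
assert second == [5, 5]
# ...and it remains reachable through the function object itself:
assert append_five.__defaults__[0] is first
```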


It has something to do with mutability, because if an object is immutable, the behavior of Python matches what the naive developer expects. It's only mutable objects that break those expectations.
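To illustrate: with an immutable default, the same evaluate-once rule applies, but rebinding the parameter can't touch the stored default, so nothing leaks between calls (a small sketch):

```python
def add_one(n=0):
    n += 1      # ints are immutable: += rebinds the local name only
    return n

assert add_one() == 1
assert add_one() == 1            # same result every call
assert add_one.__defaults__ == (0,)  # the stored default 0 is never changed
```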

Don't even get into unexpected behavior in classes:

    In [1]: class A(object):
       ...:     l = []
       ...:

    In [2]: a, b = A(), A()

    In [3]: a.l.append("Something")

    In [4]: a.l
    Out[4]: ['Something']

    In [5]: b.l
    Out[5]: ['Something']

    In [6]: class B(object):
       ...:     l = None
       ...:     def __init__(self):
       ...:         self.l = []
       ...:

    In [7]: c, d = B(), B()

    In [8]: c.l.append("Something")

    In [9]: c.l, d.l
    Out[9]: (['Something'], [])


The other scoping issue in Python that always struck me as strange is that loop variables aren't scoped to the loop; they continue to exist after the loop completes. I can see the logic for this feature even if I don't agree with it, but what I really don't get is that the loop variable is not defined if you iterate over something that is empty:

   >>> for item in [1]:
   ...   print item
   1
   >>> item
   1

   >>> for i in []:
   ...   print i
   
   >>> i
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'i' is not defined
I would expect i == None. That oddity makes it dangerous to use the feature unless you're really careful (e.g. using a for-else construct).


Then there's for-else:

    In [1]: for i in []:
       ...:     pass
       ...: else:
       ...:     print 'Else!'
       ...:
    Else!

    In [2]: for i in []:
       ...:     break
       ...: else:
       ...:    print 'Else!'
       ...:
    Else!

    In [3]: for i in range(2):
       ...:     break
       ...: else:
       ...:     print 'Else!'
       ...:

    In [4]: for i in range(2):
       ...:     pass
       ...: else:
       ...:     print 'Else!'
       ...:
    Else!
As the transcript shows, the else clause runs whenever the loop finishes without a `break` (even over a non-empty iterable), so the syntax behaves as if written:

  broke_out = False
  for i in l:
      if some_condition(i):
          broke_out = True
          break
  if not broke_out:
      print "Else!"
The "catch cases where a `break` is triggered" case isn't common enough for this syntax feature to be encountered very often, leading to confusion when people come across it (though at least it's not a bug where a common use-case has weird behavior to new-comers).
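The intended use is the "search loop" idiom, where the else clause runs only when no `break` fired, i.e. the item was never found (a sketch, with a hypothetical helper):

```python
def find_first_even(numbers):
    result = None
    for n in numbers:
        if n % 2 == 0:
            result = n
            break
    else:
        # runs only when the loop finished without hitting break
        result = "no even number"
    return result

assert find_first_even([1, 3, 4]) == 4
assert find_first_even([1, 3, 5]) == "no even number"
```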


> but what I really don't get is that the loop variables are not defined if you iterate over something that is empty

If you conceptualize how a for-loop has to work as a while-loop using Python's iterator protocol (which is the only way the iterator protocol itself makes sense), it seems pretty intuitive.

That is, this:

  for item in items:
      ...1
  else:
      ...2
becomes, approximately:

  __hidden_iter = iter(items)
  try:
      while True:
          try:
              item = next(__hidden_iter)
          except StopIteration:
              raise __NormalLoopExit
          ...1
  except __NormalLoopExit:
      ...2
If you have an empty loop, the first assignment doesn't complete (instead raising StopIteration in evaluating the right side, which raises the notional exception __NormalLoopExit, which invokes the else: clause, if any) so the variable never gets around to being created.
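A runnable version of that expansion (using the actual `iter()` / `next()` / `StopIteration` protocol) behaves the same way, including the loop variable never being bound for an empty iterable:

```python
items = []
it = iter(items)            # the iterator is created once, before the loop
loop_ran = False
while True:
    try:
        item = next(it)     # raises StopIteration immediately: items is empty
    except StopIteration:
        break               # a "normal" exit; an else: clause would run here
    loop_ran = True         # loop body

assert not loop_ran
try:
    item                    # never assigned: next() raised before the binding
    print("item is defined")
except NameError:
    print("item is not defined")
```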


> if an object is immutable, the behavior of Python matches what the naive developer expects

If the object was immutable then append wouldn't work. That's hardly matching expectations.


Most immutable objects don't even have methods that would try to mutate the value and then fail because the object is immutable; they simply lack mutating methods altogether...

I guess the clarification to what I was saying is that, in the simple cases (integers, strings, None), the objects are immutable. It's only when the value of the object itself is mutable that you run into issues. If all objects (or all objects 'allowed' as default values) were immutable, then this behavior would never trigger.

So saying that mutability has nothing to do with it isn't entirely true. It's the immutability of the types of values used in most simple cases that hides this issue from developers until they run into a more complex case.


Read my post that has the "correct answers" which show you how to do it. The key is setting the default to None and then doing something like:

    if val is None: val = []

or the more idiomatic python way:

    val = val or []


I believe the former is more idiomatic, but I don't have a reference.

You want to explicitly check against `None` so that you're not overwriting all falsey values of `val`. Even though you should generally try to enforce argument types, the `val = val or []` version would cause unexpected behavior in some cases, particularly those where a falsey value (like an empty list) is a legitimate argument.


I'm well aware of that way to do it, but it doesn't excuse a different way being unintuitive.


Why? You would just get a new list back.


I can see why you might think so, but remember the Zen of Python is to have only one obvious way to do something.

    >>> [1,2,3] + [4,5]
    [1, 2, 3, 4, 5]
Thus appending should do something different than addition.

    >>> x = [1,2,3]
    >>> x.append([4,5])
    >>> x
    [1, 2, 3, [4, 5]]


I'm sorry, but I don't understand why you would consider this unexpected behaviour. For class A, the list l is a class-level attribute, hence it can be referred to via either the a or b objects; but for class B, after initialisation, l is an instance attribute, so it is different for c and d.


It's not the concept that can be confusing, it's the syntax python chose.

In most of the languages I'm familiar with, there are very clear syntax differences when working with class attributes. For example, in many languages class attributes have to be accessed via the class name instead of from an instance of the class, making it clear to the programmer they are working with a class attribute, e.g. MyClass.myClassVariable, not myInstance.myClassVariable. Additionally, the way you define class attributes in Python is the way you define instance attributes in many languages, which just adds to the confusion. E.g. in Java or C# you can define class variables directly in the class body, but an explicit 'static' keyword is needed; undecorated definitions are assumed to be instance variables.

Finally, I think the definition of class B above is a little more nuanced, class B has both a class attribute named l AND an instance attribute named l.

B.l == None and B().l == []
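That split is easy to verify (a sketch, reusing the class B definition from the transcript above):

```python
class B(object):
    l = None            # class attribute, stored on the class itself
    def __init__(self):
        self.l = []     # instance attribute, shadows the class attribute

assert B.l is None      # looked up on the class: the class attribute
assert B().l == []      # looked up on an instance: the instance attribute
# Both attributes really exist, in two different namespaces:
assert "l" in B.__dict__ and "l" in B().__dict__
```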


Ah, gotcha!

It's been a while since I've done major OOP coding in any language other than Python, so I'm a little rusty. The issues you raise are perfectly legitimate and would be understandably confusing to newcomers to the language. :)


In Python everything is an object, including a function. The default value isn't a global, it belongs to the function.

I wonder if people who weren't exposed to languages which work differently ala C++ would be as surprised?


I'm not a Python dev, but I've been meaning to learn for a while. So this is really interesting stuff. A few questions, if you don't mind.

I understand mutability and immutability in other languages (and I gave your link a quick read to make sure there weren't any weird Python-specific rules), so I understand how the list can change and still be the same object, but a tuple or string would not. But why does that mean that the default parameter object remains in existence throughout all calls, instead of being recreated each time it is called?

Is there a reason for this being the default behavior? It seems like the majority of the time you would want to use a default parameter, you'd want it to behave like your bug-free examples.


Think of the default parameter values as arguments to the initializer for the function object. If you passed a list into the constructor of a class, you wouldn't be surprised that modifying the list outside the class also modifies the same list inside the class.
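That analogy in code (a sketch, with a hypothetical Basket class):

```python
class Basket(object):
    def __init__(self, items):
        self.items = items       # stores a reference, not a copy

shared = []
basket = Basket(shared)
shared.append("apple")           # mutation outside the class...
assert basket.items == ["apple"] # ...is visible inside: same object
assert basket.items is shared
```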

While that explains how it works, I actually completely agree with you. This is surprising behavior and, in a language that prides itself on not being surprising, seems, well, surprising.

I have to wonder if performance isn't the big reason for it. If your default is [], it isn't a big deal to re-evaluate, but if your default is get_default_cities_from_slow_web_service(), having that re-evaluated on every function call would be catastrophic. Given the choice between two negatives, the choice they made is probably reasonable.


You pretty much nailed it right there.

Before I ever ask this question (I do a lot of tech interviews sadly) I always ask the candidate about object mutability vs immutability. Almost everyone knows the textbook answer, and only a few know the actual implications of it. This tests which they know :)

Default kwargs of a function are defined at function definition time. However, they are only in scope for the scope of said function. It is a weird but important subtle difference.


I think only the second is truly bug-free. The first only does what the user expects if they pass in a non-empty list:

  my_list = []
  append_one(my_list)
  # my_list didn't get anything appended to it
This shows up another subtle trap related to the "truthiness" (or falsiness in this case) of things like the empty list.
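Concretely, with the `var or []` version of the fix, an empty list passed by the caller is silently swapped for a fresh one:

```python
def append_one(var=None):
    var = var or []   # [] is falsey, so a passed-in empty list is discarded
    var.append(1)
    return var

my_list = []
result = append_one(my_list)
assert result == [1]
assert my_list == []          # the caller's list was never touched
assert result is not my_list  # a different object came back
```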


Since append doesn't return a value, how about:

  def append_one(var=None):
      return (var or []) + [1]
Would this take longer and/or use more storage for long lists passed as var?


When you use + on two lists, a new list is created, and elements from both are copied into the new one, whereas the append operation modifies the list in place and simply adds a value. Keep in mind that a Python "list" is really like a C++ vector, so while the append operation sometimes allocates a new array and copies all the values, in general it is amortized O(1). The + operation is O(n).

And besides all that, there is nothing wrong with doing an append on one line, and returning the variable on the next. It's clear and readable.
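One more behavioral difference worth noting: the `+` version returns a brand-new list and leaves the caller's list alone, while the append-based fixes mutate in place (a sketch comparing the two):

```python
def append_one_copy(var=None):
    return (var or []) + [1]   # builds and returns a new list

def append_one_inplace(var=None):
    if var is None:
        var = []
    var.append(1)              # mutates the caller's list
    return var

a = [0]
assert append_one_copy(a) == [0, 1] and a == [0]        # a unchanged
assert append_one_inplace(a) == [0, 1] and a == [0, 1]  # a mutated
```

Which one is "bug-free" depends on whether the caller expects their list to be modified.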


I like this, very elegant actually.


I'd argue that the key difference from other languages is (re)assignment rather than (im)mutability.


Yes, that's correct. The default value expression is only evaluated once, when the `def` statement is executed. After that point, the stored object can be mutated like any other. You have to see Python functions as objects and default parameter values as object variables.


The problem is not the existence of the object variables, but that they are in such an unfortunate place. The rest of the parameter list is declaring fresh local variables. It's inconsistent that the left side of the equals sign is per-invocation, and the right side is per-def.


Not the OP, but I'd accept the confused response of: [[]] [[],[]] [[],[],[]]

because the behavior is the same whether or not they misread an 'l' as a 1.


Python doesn't seem to agree with you :^)

  In [1]: a = []

  In [2]: a.append(a)

  In [3]: a
  Out[3]: [[...]]

  In [4]: a[0]
  Out[4]: [[...]]

  In [5]: a[0][0]
  Out[5]: [[...]]

  In [6]: a[0][0][0]
  Out[6]: [[...]]

  In [7]: a[0][0][0][0]
  Out[7]: [[...]]

  In [8]: a.append(a)

  In [9]: a
  Out[9]: [[...], [...]]

  In [10]: a[0][1][0] is a
  Out[10]: True

  In [11]: id(a)
  Out[11]: 4547140064

  In [12]: id(a[0][1][0])
  Out[12]: 4547140064


Yeah, the whole infinite loop thing. Wasn't fully thinking when I wrote my reply. Good catch.



