Mutability and immutability are basic concepts in Python (and in other programming languages). However, the fragmented and inconsistent online resources on these topics have confused me, and others seem confused as well. So here is one more write-up, reflecting my own partial understanding.

Many concise definitions of immutable data types read like this:

Immutability refers to the property of an object or data structure whose state cannot be altered after it is created

Instances of primitive built-in values, such as numbers, are immutable. The values themselves cannot change over the course of program execution.

An object with a fixed value. Immutable objects include numbers, strings and tuples. Such an object cannot be altered. A new object has to be created if a different value has to be stored. They play an important role in places where a constant hash value is needed, for example as a key in a dictionary.

What does "cannot be altered" really mean? The more I learned, the more complicated this subject seemed. Immutability often arises in various contexts.

Only after looking at enough examples did I realize that I had been guessing incorrectly about how immutable objects behave. Some of the questions I got wrong:

Are immutable data types unique to Python? No.

Is every iterable also mutable? No. str is a counter-example: it is iterable but immutable.

In Python, are immutable objects passed by value and mutable objects passed by reference? No. Every argument is passed as a reference to an object, and the parameter (variable) inside the function is a new local name bound to that same object.

Is Python immutability a replacement for a "const" variable/pointer in static languages? No. Immutability is intrinsic to the value (the concrete piece of data), while Python symbols can be reassigned at any time.

I approached this topic playfully at first and failed to grasp its essence: immutability relates closely to pure functions and side effects, which are fundamental ideas in functional programming. Python incorporates several programming paradigms, so I need to delve into pure functional programming to fully understand the purpose of its Pythonic tools. Knowing how to balance these paradigms is a key part of Pythonic practice.

Disambiguation

Immutability is on the Value

Refer to Python's Mutable vs Immutable Types: What's the Difference?

  • Variables (labels/symbols) point to the memory position where concrete objects live; in other words, they hold a reference to an object.
  • Objects are concrete pieces of information that live in specific memory positions.
    • Examples: numbers, strings, functions, classes, and modules; almost everything is an object
  • Three core characteristics of objects:
    1. Value (the concrete piece of data the object holds)
    2. Identity (in CPython, the memory address where the object lives); use id() to inspect it and the is operator to compare identities
    3. Type (determines which operations the object supports; Python uses it internally to know how to handle the object)
  • Mutability or immutability is intrinsic to objects rather than to variables.

An object that allows you to change its value without changing its identity is a mutable object. The changes that you can perform on a mutable object's value are known as mutations.

An object that doesn't allow changes to its value is an immutable object. To "modify" the value seen through a variable, you must explicitly assign to the variable again; under the hood, a new object is created and the variable is rebound to it.
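The difference between mutating an object and rebinding a name can be observed with id() and is. The following is only a minimal sketch; the variable names are made up for illustration.

```python
# Mutation: the value changes, the identity stays the same.
nums = [1, 2, 3]
id_before = id(nums)
nums.append(4)                         # in-place change of the same list object
same_object = id(nums) == id_before

# "Modifying" an immutable value: a new object is created and the name is rebound.
text = "hello"
alias = text                           # a second name for the same str object
text = text + " world"                 # builds a new str and rebinds `text`

print(same_object)      # True
print(alias)            # hello (the original str object is untouched)
print(text is alias)    # False
```

The list keeps its identity across the append, while the string "change" leaves the original object alone and simply points the name at a new one.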

Comparison

Starting with something familiar.

The discussion on basic data types concerning immutability is quite general. This concept appears in various programming languages, including Python and Java. Mostly, the primitive data types are immutable.

Here is a table of what is mutable and what is immutable:

┌────────┬─────────┬─────────┬──────────┬───────────┬─────────┬─────────┬─────────┐
│Language│Data Type│prim/ref?│Immutable?│Comparable?│Sortable?│Hashable?│Iterable?│
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │int      │primitive│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │double   │primitive│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │String   │reference│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │Class    │reference│    x     │ mem addr  │dflt. not│mem addr │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │int      │   NA    │    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │str      │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │list     │   NA    │    x     │   value   │    ✓    │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │tuple    │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │set      │   NA    │    x     │   value   │ subset  │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │frozenset│   NA    │    ✓     │   value   │ subset  │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │dict     │   NA    │    x     │   value   │    x    │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │class    │   NA    │    x     │ mem addr  │    x    │mem addr │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │bytes    │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │bytearray│   NA    │    x     │   value   │    ✓    │    x    │    ✓    │
└────────┴─────────┴─────────┴──────────┴───────────┴─────────┴─────────┴─────────┘

Note: for set and frozenset, the ordering operators test subset and superset relations, so they give only a partial order. For a user-defined class, equality and hashing fall back to object identity unless the class overrides them.

Some columns in this table may look unrelated to immutability, but they are connected:

  • Comparable: how equality is decided, by value (==) or by reference/address (is)
  • Sortable: has a defined comparison order
  • Hashable: implements __hash__ so that hash() works on the object, which allows it to be used as a dictionary key or a set member. See "hashable" in the glossary of Pydoc: most built-in immutable objects are hashable, while the built-in mutable containers (list, dict, set) are not. Instances of user-defined classes are hashable by default, with a hash derived from id().
  • Iterable: implements the __iter__ interface
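The Hashable and Iterable columns can be checked from the interpreter. Below is a small sketch using the abstract base classes in collections.abc; the sample values and the lookup dictionary are only illustrative.

```python
from collections.abc import Hashable, Iterable

samples = [1, "abc", (1, 2), frozenset({1}), b"ab",
           [1, 2], {1}, {"k": 1}, bytearray(b"ab")]

# Print: type name, is it hashable?, is it iterable?
for obj in samples:
    print(type(obj).__name__, isinstance(obj, Hashable), isinstance(obj, Iterable))

# Hashable objects can be dictionary keys; unhashable ones cannot.
lookup = {(1, 2): "tuple key", frozenset({1}): "frozenset key"}

try:
    lookup[[1, 2]] = "list key"
    list_key_rejected = False
except TypeError:
    list_key_rejected = True       # a list cannot be used as a key

print(list_key_rejected)
```

One caveat: isinstance(obj, Hashable) only checks that the type defines __hash__. A tuple that contains a list passes this check but still fails when hash() is actually called on it.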

List vs. Tuple

The mutable list and the immutable tuple are good examples to explain the difference.

A tuple is a linear, ordered collection like a list, but immutable. The two have different APIs. Likewise, set pairs with frozenset, and bytearray pairs with bytes.

  • list has mutating methods such as list.append() and list.pop(); tuple does not
  • Though both list and tuple are iterable, an element of a list can be reassigned to a new value, while an element of a tuple cannot. Note, however, that a tuple can contain a mutable element. I have not spotted a good use case for this, and it breaks the hashability of the tuple.

Syntax examples:

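alist = [1, 2, 3, '4']
atuple = ([1, 2, 3], [4, 5, 6])
astring = 'hello'

alist[1] = 22        # legal: list elements can be reassigned
# astring[1] = '3'   # TypeError: 'str' object does not support item assignment
# atuple[1] = 11     # TypeError: 'tuple' object does not support item assignment
atuple[1][0] = 4     # legal: mutates the list stored inside the tuple

alist.append('5')    # returns None; alist is now [1, 22, 3, '4', '5']
astring.upper()      # returns a new string 'HELLO'; astring is unchanged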
  • At line 8 (atuple[1][0] = 4), a mutable element inside a tuple is mutated. This is legal but not recommended: I have found no good purpose for it so far, and it destroys the hashability of the tuple.
  • At line 10, note that append() returns None; alist has been mutated in place and is now [1, 22, 3, '4', '5'].
  • At line 11, upper() is non-mutating: it returns a new instance, and astring is still 'hello'.

One obvious characteristic of mutable types is that they support individual element assignment or self-mutating methods. In contrast, immutable types offer no such API or operations.
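To see what a mutable element does to a tuple, here is a short sketch. The names point and mixed are hypothetical, chosen only for this illustration.

```python
point = (1, 2)
mixed = ([1, 2, 3], [4, 5, 6])

tuple_id = id(mixed)
mixed[1][0] = 40                       # mutates the inner list, not the tuple itself
print(mixed)                           # ([1, 2, 3], [40, 5, 6])
print(id(mixed) == tuple_id)           # True: still the same tuple object

print(hash(point) == hash((1, 2)))     # True: equal tuples of hashable items hash equally

try:
    hash(mixed)
    mixed_is_hashable = True
except TypeError:
    mixed_is_hashable = False          # the list elements make the tuple unhashable
print(mixed_is_hashable)               # False
```

The tuple never changes which objects it holds, so it is still immutable in the strict sense. But its overall value can change through the inner list, which is why Python refuses to hash it.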

Explicit is better than implicit

In Python, a function call passes a reference to the object that the argument symbol is bound to.

"Explicit is better than implicit" is one of the Python Zen principles.

li = [64, 93, 79, 59, 58, 51]
tp = tuple(li)

def reassign(li):
    li = li[1:4]  # slicing returns a new object; only the local name is rebound
    print(f'{li}')

def mutate(li):
    li[0] = 99  # in-place mutation of the object the caller passed in

reassign(li)
# prints [93, 79, 59]
li  # li is still [64, 93, 79, 59, 58, 51]
reassign(tp)
# prints (93, 79, 59)
tp  # tp is still (64, 93, 79, 59, 58, 51)

mutate(li)
li  # li is now [99, 93, 79, 59, 58, 51]
# mutate(tp) would raise TypeError: 'tuple' object does not support item assignment

The "li" inside the function has the same name as the "li" outside the function. This is intentional: it is easy to confuse the two unless one keeps the scopes clearly in mind.

Python does not have a runtime-enforced const/final equivalent to constrain a symbol (which is similar to a pointer). A symbol/label in Python can be reassigned freely, but a function can only hand a new object back to the caller "explicitly" through its return value, unlike pass-by-reference in other languages. There is a workaround, using global or nonlocal to leak the rebinding out, but the design implicitly tells us this is not a natural way to code in Python.

Note that the symbol "li" inside the function belongs to the function scope and is not the original "li". Which object the original li refers to cannot be changed from inside the function. But this is not the point of immutability: the referenced object can be mutable, and then its value can be changed inside the function.

To change the li outside the function, return the new object and assign it explicitly. The flow is similar, but with "return":
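For completeness, the global/nonlocal workaround mentioned above looks roughly like this. It is only a sketch with made-up names (counter, make_accumulator), not a recommended pattern.

```python
counter = 0

def rebind_local():
    counter = 10          # creates a separate local name; the module-level one is untouched

def rebind_global():
    global counter        # explicitly opt in to rebinding the module-level name
    counter = 10

def make_accumulator():
    total = 0
    def add(n):
        nonlocal total    # rebind the name in the enclosing function scope
        total += n
        return total
    return add

rebind_local()
after_local = counter     # 0
rebind_global()
after_global = counter    # 10

add = make_accumulator()
add(5)
running_total = add(7)    # 12
print(after_local, after_global, running_total)   # 0 10 12
```

Even here the rebinding has to be declared explicitly with a keyword, which fits the "explicit is better than implicit" theme.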

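def explicit_reassign(li):
    li = li[1:4]  # new object, bound to the local name only
    return li     # hand the new object back to the caller

li = explicit_reassign(li)
li  # li is now [93, 79, 59]
reassign(tp)  # prints (93, 79, 59); tp itself is unchanged
tp = explicit_reassign(tp)
tp  # tp is now (93, 79, 59)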

The design of the function and the use of the return statement are related to the concept of mutability and side effects in Python.

In Python, lists are mutable objects, which means that they can be modified in place. When a list is passed as an argument to a function, what is actually passed is a reference to the list. Any mutation made to the list within the function (through element-wise reassignment or mutating methods) affects the object that the original "li" refers to outside the function.

However, the "li" variable inside reassign() is reassigned to a new list that is a slice of the original list. This reassignment rebinds the local name li to the new list, but it doesn't change the original list outside the function. Therefore, the effect is limited to the function's scope.

This design pattern allows for more control over mutability and side effects. By explicitly returning the new list, the function makes it clear that it produces a result rather than modifying its input in place, and the caller can decide whether to capture the returned value or not.

A tuple is unsuitable for a large sequence that needs to change over time (think of the String vs. StringBuilder case). A Python tuple is more likely to be used for large static data, like data declared with the static and const keywords in statically typed languages. To "mutate" a tuple, copy it into a list, its mutable counterpart, and make the changes there. Mutating the list is more efficient than building a new tuple for every change.
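The rebinding-versus-mutation behavior described above can be checked by recording id() inside the functions. This is a minimal sketch; the parameter name seq and the list data are placeholders of my own.

```python
def reassign(seq):
    id_on_entry = id(seq)      # the parameter starts out bound to the caller's object
    seq = seq[1:4]             # rebinds the local name to a new list
    return id_on_entry, id(seq)

def mutate(seq):
    seq[0] = 99                # changes the caller's object in place
    return id(seq)

data = [64, 93, 79, 59, 58, 51]

entry_id, rebound_id = reassign(data)
print(entry_id == id(data))      # True: same object on entry
print(rebound_id == id(data))    # False: the slice was a different object
print(data)                      # [64, 93, 79, 59, 58, 51]

mutated_id = mutate(data)
print(mutated_id == id(data))    # True
print(data)                      # [99, 93, 79, 59, 58, 51]
```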
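The copy-to-list approach can be sketched as follows. The names readings, working, and updated are hypothetical.

```python
readings = (64, 93, 79, 59, 58, 51)

working = list(readings)     # mutable copy of the tuple
working[0] = 99
working.append(42)
working.sort()
updated = tuple(working)     # freeze the result into a new tuple

print(readings)   # (64, 93, 79, 59, 58, 51)
print(updated)    # (42, 51, 58, 59, 79, 93, 99)
```

The original tuple is untouched throughout. All the changes happen on the list, and only the final result is turned back into a tuple.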

From Pydoc's "faq/design.txt":

Tuples can be thought of as being similar to Pascal records or C structs; they're small collections of related data which may be of different types and are operated on as a group. For example, a Cartesian coordinate is appropriately represented as a tuple of two or three numbers.

Lists, on the other hand, are more like arrays in other languages. They tend to hold a varying number of objects, all of which have the same type and which are operated on one by one. CPython's lists are really variable-length arrays, not Lisp-style linked lists. The implementation uses a contiguous array of references to other objects and keeps a pointer to this array and the array's length in a list head structure.

This makes indexing a list "a[i]" an operation whose cost is independent of the list's size or the index's value.

When items are appended or inserted, the array of references is resized. Some cleverness is applied to improve the performance of appending items repeatedly; when the array must be grown, some extra space is allocated so the next few times don't require an actual resize.

This sounds like the resizing strategy one meets when implementing an array-based stack or queue.
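The over-allocation described in the FAQ excerpt can be observed indirectly with sys.getsizeof(). This sketch assumes CPython; the exact sizes and resize points are implementation details that vary between versions, so no specific numbers are shown.

```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    sizes.append(sys.getsizeof(items))   # size of the list object after each append

# Lengths at which the reported size changed, i.e. where the array was regrown.
resize_points = [n + 1 for n in range(1, len(sizes)) if sizes[n] != sizes[n - 1]]

print(len(set(sizes)), "distinct sizes for", len(sizes), "appends")
print("size changed at lengths:", resize_points)
```

If the list grew by exactly one slot per append, every append would report a new size. Under CPython the size changes only occasionally, which matches the "extra space is allocated" remark.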

Mutable default argument

Using a mutable object as a default argument in a function is generally undesirable, and it is a common pitfall, particularly for those who come from other languages: the default value is evaluated only once, when the function is defined.

def append_to(element, li=[]):
    # The default list is created once, when the function is defined,
    # so every call without "li" refers to the same list instance.
    li.append(element)
    return li

# Instead, use this pattern:
def append_to(element, li=None):
    if li is None:
        li = []  # a fresh list for each call
    li.append(element)
    return li

This is not a problem with immutable defaults, because an immutable default object cannot be changed between calls.
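Calling the two variants a few times makes the difference visible. In this sketch the functions are renamed append_shared and append_fresh so that both can coexist, and label is a made-up example of an immutable default.

```python
def append_shared(element, li=[]):
    li.append(element)            # mutates the single default list
    return li

def append_fresh(element, li=None):
    if li is None:
        li = []                   # new list on every call
    li.append(element)
    return li

first = append_shared(1)
second = append_shared(2)
print(second)                     # [1, 2]
print(first is second)            # True: both calls returned the same default list

print(append_fresh(1))            # [1]
print(append_fresh(2))            # [2]

def label(name, suffix=" (default)"):
    suffix += "!"                 # rebinds the local name; the default str is unchanged
    return name + suffix

print(label("a"))                 # a (default)!
print(label("a"))                 # a (default)!
```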

Conclusion

That is all for "immutability" for today. This subject continues in part II.

My conclusion: Pydoc is a gem.
