This is a continuation of What I didn't know about Immutability (I)

Previously, the terminology and the main differences between the immutable and the mutable were discussed.

This part extends the discussion to the wider context around immutability.

Comparison

Let's keep this table as a reference for ongoing discussion.

┌────────┬─────────┬─────────┬──────────┬───────────┬─────────┬─────────┬─────────┐
│Language│Data Type│prim/ref?│Immutable?│Comparable?│Sortable?│Hashable?│Iterable?│
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │int      │primitive│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │double   │primitive│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │String   │reference│    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Java    │Class    │reference│    x     │ mem addr  │dflt. not│dflt. not│    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │int      │   NA    │    ✓     │   value   │    ✓    │    ✓    │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │str      │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │list     │   NA    │    x     │   value   │    ✓    │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │tuple    │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │set      │   NA    │    x     │   value   │ subset  │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │frozenset│   NA    │    ✓     │   value   │ subset  │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │dict     │   NA    │    x     │   value   │    x    │    x    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │class    │   NA    │    x     │ mem addr  │    x    │ by id() │    x    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │bytes    │   NA    │    ✓     │   value   │    ✓    │    ✓    │    ✓    │
├────────┼─────────┼─────────┼──────────┼───────────┼─────────┼─────────┼─────────┤
│Python  │bytearray│   NA    │    x     │   value   │    ✓    │    x    │    ✓    │
└────────┴─────────┴─────────┴──────────┴───────────┴─────────┴─────────┴─────────┘

Some columns in this table may look out of scope for immutability, but they are related.

  • Comparable: how equality is tested, with == (by value) or is (by reference/address)
  • Sortable: has a defined comparison order
  • Hashable: implements __hash__, so hash() works on the object and the type can be used as a dictionary key or a set member. For user-defined classes the default hash is derived from id(). See "hashable" in the glossary of the Python docs: most immutable built-ins are hashable, mutable containers are not.
  • Iterable: implements the __iter__ interface
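
These columns can be probed at runtime. Here is a minimal sketch, assuming a hypothetical helper called describe() (my own name, not part of any library) that tries hash(), checks for the iterable interface, and attempts an ordering comparison:

```python
from collections.abc import Iterable


def describe(obj):
    """Return (hashable, iterable, orderable) flags for one object."""
    try:
        hash(obj)
        hashable = True
    except TypeError:
        hashable = False

    iterable = isinstance(obj, Iterable)

    try:
        obj < obj  # only checks that "<" is defined; the result is ignored
        orderable = True
    except TypeError:
        orderable = False

    return hashable, iterable, orderable


for sample in (1, "a", (1, 2), [1, 2], {1: 2}, b"a", bytearray(b"a"), object()):
    print(type(sample).__name__, describe(sample))
```

Note that "orderable" here only means that < is defined. For set and frozenset, < is the proper-subset test, so it does not give a total order for sorting.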

vs. Memory

Is immutability related to memory optimization, like the stack/heap distinction or the "static" keyword? I have no idea. On the web, people say, "Python is not C. It does not use the concepts of heap and stack."

Immutable and mutable types can behave differently in memory. Immutable data types are often more compact in memory and safer to share across a program. id() and is are good tools for inspecting identity (in CPython, id() returns the object's memory address).

A new mutable instance always gets its own memory, while different names bound to equal immutable values can sometimes share one object. Interned integers and strings are examples. By default, CPython caches only small integers and interns only some strings (typically short, identifier-like ones); sys.intern() interns a string explicitly.
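
To see explicit interning at work, here is a small sketch. The variable names are mine, and the strings are built at runtime on purpose so that the compiler cannot merge equal literals behind the scenes:

```python
import sys

# Build two equal strings at runtime so they start out as separate objects.
first = "".join(["immutable", " ", "value"])
second = "".join(["immutable", " ", "value"])

print(first == second)  # True: same value
print(first is second)  # usually False in CPython: two separate objects

# sys.intern() returns the one canonical object for a given string value.
first_interned = sys.intern(first)
second_interned = sys.intern(second)
print(first_interned is second_interned)  # True
```

Only the last result is guaranteed by sys.intern(); whether the two uninterned strings share an object is an implementation detail.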

equality vs. identity

== is for equality, is is for identity.

li = [1,2]
li2 = [1,2]
li == li2 #True, same value
id(li) == id(li2) #False, different memory address
li is li2 #False

char_a = 'a'
char_b = 'a'
id(char_a) == id(char_b) #True, same memory address
char_a is char_b #True

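Note that two equal objects are not necessarily identical objects. == compares values; is compares identity (the memory address in CPython). For bulky data, is is faster than == because it compares one reference instead of walking the contents, but it answers a different question.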

interning

Interning, a general idea for memory optimization, is something of an exception to what was just said: equal immutable values may turn out to be the very same object.

If a small integer or short str is assigned again, Python may reuse the existing object instead of allocating a new one.

In [1]: a=18374
In [2]: id(a)
Out[2]: 140653578873040
In [3]: a=18374
In [4]: id(a)
Out[4]: 140653578878064

# However

In [5]: a=1
In [6]: id(a)
Out[6]: 140653620953328
In [7]: a=1
In [8]: id(a)
Out[8]: 140653620953328

CPython pre-loads (caches) a global list of integers in the range [-5, 256]. Any time an integer in that range is referenced, Python uses the cached object.
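
The same experiment can be run as a script. This is a sketch that assumes CPython; same_object() is a hypothetical helper of mine. It parses the number from text because, inside one script, the compiler may merge equal literals and hide the effect:

```python
def same_object(text):
    """Parse the same integer text twice and report whether both results are one object."""
    return int(text) is int(text)


print(same_object("256"))  # True in CPython: inside the cached range
print(same_object("-5"))   # True in CPython: the lower edge of the range
print(same_object("257"))  # typically False in CPython: outside the range
```

This is an implementation detail of CPython, so real code should compare integers with == and never with is.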

Class attribute

Check whether the following behavior seems reasonable:

class Thing:
  num = 3 # immutable
  li  = [32,33] # mutable

t = Thing() # t.num is 3; t.li and Thing.li are the same list [32,33]
id(t.num) == id(Thing.num) #True
t.num = 4 # rebinds t.num on the instance to a new immutable value; Thing.num stays 3
id(t.num) != id(Thing.num) #True
t.li.append(35) # Thing.li and t.li are [32,33,35], and same id

t2 = Thing() # t2, Thing, and t share the same list instance, but t has its own num
t2.num # is 3
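
If the shared list is not what one wants, a common way out is to create the mutable attribute per instance. A minimal sketch, reusing the Thing class from above with an added __init__ (my addition, not part of the original example):

```python
class Thing:
    num = 3                 # class attribute, shared until an instance shadows it

    def __init__(self):
        self.li = [32, 33]  # a fresh list for every instance, nothing shared


t = Thing()
t2 = Thing()

t.num = 4                   # creates an instance attribute on t only
t.li.append(35)             # changes only t's own list

print(vars(t))              # {'li': [32, 33, 35], 'num': 4}
print(vars(t2))             # {'li': [32, 33]}
print(Thing.num)            # 3
```

vars() shows the instance's own attributes, which makes the difference between shadowing (num) and per-instance state (li) visible.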

vs. Symbol tables

The keys used in symbol tables are generally immutable. Using a mutable object (such as a list) as a Python dictionary key raises TypeError in most cases. Symbol tables can be implemented with structures such as hash tables (like dictionaries in Python), N-ary search trees, and (indexed) priority queues.
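
A quick sketch of that error, with a made-up grid dictionary as the symbol table:

```python
grid = {}
grid[(0, 1)] = "tuple key works"  # a tuple of ints is hashable

try:
    grid[[0, 1]] = "list key"     # a list is mutable and unhashable
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(grid)
```

The exact wording of the message varies between Python versions, but it names the unhashable type.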

An object is *hashable* if it has a hash value which never changes during its lifetime (it needs a "__hash__()" method), and can be compared to other objects (it needs an "__eq__()" method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

Most of Python's immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not; immutable containers (such as tuples and frozensets) are only hashable if their elements are hashable. Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their "id()".
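
The rules in this glossary entry can be checked directly. A minimal sketch: is_hashable(), Plain, and ValueEq are hypothetical names of mine. The ValueEq case goes one step beyond the quote: per the Python data model, a class that defines __eq__ without __hash__ becomes unhashable.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain:
    """No __eq__ or __hash__: falls back to the identity-based defaults."""


class ValueEq:
    """Defines __eq__ only, so Python sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ValueEq) and self.value == other.value


print(is_hashable((1, 2)))        # True: tuple of hashable elements
print(is_hashable((1, [2])))      # False: the tuple contains a list
print(is_hashable(Plain()))       # True: hashable by default
print(Plain() == Plain())         # False: instances only equal themselves
print(is_hashable(ValueEq(1)))    # False: __eq__ without __hash__
```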

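Immutability is what keeps a value-based hash, or a position in an ordered structure, valid over time. Instances of user-defined classes are hashable by default because id() returns a number that does not change during the object's lifetime.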

Symbol tables organize hashes or ordered data in structures. It becomes problematic if the "atomic" value changes unnoticed within the more complex data structure. This issue relates to the concepts of "safety" and "consistency."
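
Here is a sketch of how such an unnoticed change goes wrong. Point is a hypothetical class of mine that is mutable but hashes by value, which Python allows:

```python
class Point:
    """A mutable class with a value-based hash: legal, but risky as a key."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2)
seen = {p}
found_before = p in seen                        # True

p.x = 99                                        # the value changes inside the set
found_after = p in seen                         # False: the new hash no longer matches the stored one
still_stored = any(item is p for item in seen)  # True: the object is still in there

print(found_before, found_after, still_stored)
```

The set still holds the object, but lookups can no longer find it. With an immutable key this situation cannot arise.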

Applications

Examples of using immutable data occur in different contexts:

  • In Django, the QueryDict objects on a request (request.GET and request.POST) are immutable in a normal request/response cycle.
  • Unicode code points are a universal standard. A large table like this, kept somewhere in the operating system, can be stored as immutable data to make it more compact.
  • A coordinate (as a tuple) used as a key in a dictionary.
  • Immutable strings for keys and configuration values, which provide more safety.
  • In testing, objects created in setUpClass() of a 'unittest.TestCase' are a potential source of test isolation leakage, since tests could mutate them. Those objects should therefore be immutable or provide their own rollback behavior.
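
Two of these ideas, configuration and coordinates as keys, can be sketched in a few lines. Settings and its fields are made-up names; the frozen dataclass is just one of several ways to get an immutable record in Python:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical configuration record that cannot be modified after creation."""
    host: str
    port: int


settings = Settings(host="localhost", port=8000)

try:
    settings.port = 9000          # assignment on a frozen dataclass is rejected
except FrozenInstanceError:
    changed = False
else:
    changed = True

# Tuples of numbers are hashable, so coordinates work as dictionary keys.
visited = {(0, 0): "start", (2, 3): "goal"}

print(changed)                    # False
print(settings.port)              # 8000
print(visited[(2, 3)])            # goal
```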

Extras from LLM

Have I covered enough about immutability? I am not sure. The LLM answer fills in some of what I missed:

  • Immutability is a core principle in functional programming (e.g., Haskell, Elixir). Functions avoid side effects by treating data as immutable, enhancing code reliability and testability.
    • (This is actually a very good hint to trace the origin of the term Immutability)
  • Immutable data can be safely shared across threads without locks or synchronization, preventing race conditions in parallel processing.

  • "Tuples are always faster": True for creation and memory, but not for all operations. Use the right tool for the job.

  • "Use tuples for speed": The performance gain is often trivial unless one works at scale (e.g., millions of elements).
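
The memory part of the tuple claim is easy to check. A small sketch, assuming CPython; note that sys.getsizeof() measures only the container itself, not the elements it refers to:

```python
import sys

values = range(1000)
as_list = list(values)
as_tuple = tuple(values)

list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print(list_size, tuple_size)
print(tuple_size < list_size)  # True on CPython: the tuple has a smaller header and no spare capacity
```

The difference is a few bytes per container, which supports the point that the gain only matters at scale.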

Conclusion

Immutability touches several key programming concepts (equality, identity, hashing, ordering) that are detailed in the Python documentation, which remains a vital resource for Python developers.

Atomicity

Atomicity here isn't the 'A' in ACID from databases. It's more about the fundamental data level within a programming language context.

Under the hood, code manipulates bits and bytes, but not every bit is exposed. Some bits are hidden or grouped into atomic units.

Consider "a string is an immutable sequence of characters, whereas a character or a number is an immutable sequence of bits." As mentioned above, the sequence of bits, or value, is the atom: the most basic level in the programming context.

An immutable sequence is created or deleted as a whole. The bits have a valid meaning only together, so the value is atomic and untouchable in the context of programming. A mutable type such as an array, by contrast, pays some cost in memory to allow values to be changed in place.
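
The contrast between "created as a whole" and "changed in place" shows up when a second name is kept for the original object. A small sketch with names of my own choosing:

```python
text = "abc"
same_text = text            # second name for the same str object
text += "d"                 # builds a new str; "abc" itself is untouched

numbers = [1, 2, 3]
same_numbers = numbers      # second name for the same list object
numbers += [4]              # extends the existing list in place

print(same_text, text is same_text)           # abc False
print(same_numbers, numbers is same_numbers)  # [1, 2, 3, 4] True

try:
    text[0] = "x"           # str does not support item assignment
except TypeError:
    print("str cannot be changed in place")
```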

Such an atomic sequence of bits is optimized, to some extent, for creation and reading, with a more compact memory layout. Usually such data is small, and the bits don't change during its lifetime.

In physics, an atom is indivisible.

An immutable value is like an atom.

The joke is that nothing seems indivisible in the real world, neither the atom nor the immutable data type. Immutable data is merely presented as indivisible in the context of programming. There seem to be tools that can do whatever one wants to memory.

How immutability relates to the underlying Python internals, memory, or electronic principles is a mystery to me. Perhaps I should have some prior knowledge, but I am not sure. It is hard; sometimes I don't know what I don't know. Elements like memory, the runtime virtual machine, or the compiler remain hidden from the programmer. I lack access to that deeper level, which isn't essential for standard Python programming tasks. After all, coders just need to know the abstract model of the programming language. Going deeper into immutability could be a subject of its own.

Knowing what I don't know is better than not knowing what I don't know.
