Sequence Hacking, Hashing, and Slicing
Notes based on Fluent Python, summarized and rephrased in my own words.[file:16]
Goal: a vector type that behaves like a sequence
This chapter builds a multidimensional Vector class that behaves like a rich sequence:
- It is iterable and printable in a friendly way.
- It supports
len(v)andv[i]and slicing. - It works with
bytes(),repr(), andformat(). - It can be hashed and used as a dict key or set member.
- It exposes convenient attribute-style access for the first few components (
x,y,z,t).[file:16]
The implementation shows how far you can go by following the sequence protocol and using duck typing instead of complex inheritance hierarchies.[file:16]
A basic multidimensional Vector
We start with a vector backed by an array('d') of floats:[file:16]
from array import array
import reprlib
import math
class Vector:
typecode = 'd'
def __init__(self, components):
# "protected" attribute holding the components in an array of doubles
self._components = array(self.typecode, components)
def __iter__(self):
return iter(self._components)
def __repr__(self):
components = reprlib.repr(self._components)
components = components[components.find('['):-1]
return f'Vector({components})'
def __str__(self):
return str(tuple(self))
def __bytes__(self):
return bytes([ord(self.typecode)]) + bytes(self._components)
def __eq__(self, other):
return tuple(self) == tuple(other)
def __abs__(self):
return math.sqrt(sum(x * x for x in self))
def __bool__(self):
return bool(abs(self))
@classmethod
def frombytes(cls, octets):
typecode = chr(octets[0])
memv = memoryview(octets[1:]).cast(typecode)
return cls(memv)Features so far:[file:16]
- The components are stored efficiently in a contiguous C array.
__iter__makes the vector iterable and allowstuple(v).__repr__usesreprlib.reprto abbreviate large vectors nicely (e.g.Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])).__bytes__andfrombytesprovide a compact binary representation and a way to reconstruct from it.__abs__implements the Euclidean norm;__bool__makes zero vectors falsy.[file:16]
Protocols and duck typing
In Python, a protocol is an informal interface: a set of methods and behaviors a type is expected to provide, but not enforced by inheritance or declarations.[file:16]
For sequences, the core protocol often boils down to:
__len__(self)__getitem__(self, index)
Any class that implements these with the right semantics behaves like a sequence, regardless of its base classes.[file:16]
Example: a simple French deck of cards implemented as a sequence:[file:16]
import collections
Card = collections.namedtuple('Card', ['rank', 'suit'])
class FrenchDeck:
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()
def __init__(self):
self._cards = [Card(rank, suit) for suit in self.suits
for rank in self.ranks]
def __len__(self):
return len(self._cards)
def __getitem__(self, position):
return self._cards[position]Because it follows the sequence protocol, FrenchDeck works with len, slicing, random.choice, iteration, etc., even though it does not inherit from any sequence base class.[file:16]
This is duck typing: if it behaves like a sequence, we treat it as one.
Making Vector a proper sequence (length and indexing)
To make Vector fully sequence-like, we add __len__ and __getitem__:[file:16]
class Vector:
# ... previous methods ...
def __len__(self):
return len(self._components)
def __getitem__(self, index):
return self._components[index]At this point, Vector supports:
len(v)to get the number of components.- Indexing like
v[0],v[-1]. - Slicing like
v[1:4]becausearraysupports slicing.[file:16]
How slicing works under the hood
When you use the subscript operator obj[...], Python calls obj.__getitem__(index).
The index parameter can be:
- An integer (for single-element access), or
- A
sliceobject, or - A tuple containing integers and/or slices if there are commas inside the brackets.[file:16]
Toy example:[file:16]
class MySeq:
def __getitem__(self, index):
return index
s = MySeq()
s[1] # 1 (int)
s[1:4] # slice(1, 4, None)
s[1:4:2] # slice(1, 4, 2)
s[1:4:2, 9] # (slice(1, 4, 2), 9)
s[1:4:2, 7:9] # (slice(1, 4, 2), slice(7, 9, None))The built-in slice type has attributes start, stop, and step, and a helpful method indices(length) that normalizes negative indexes and trims them to sequence bounds.[file:16]
slice(None, 10, 2).indices(5) # (0, 5, 2)
slice(-3, None, None).indices(5) # (2, 5, 1)This is the same logic that built-in sequences use to handle missing and negative indices.[file:16]
Handling slicing explicitly in __getitem__
To return a new Vector when we slice, we refine __getitem__:[file:16]
import numbers
def __len__(self):
return len(self._components)
def __getitem__(self, index):
cls = type(self)
if isinstance(index, slice):
return cls(self._components[index])
elif isinstance(index, numbers.Integral):
return self._components[index]
else:
msg = '{cls.__name__} indices must be integers'
raise TypeError(msg.format(cls=cls))Now:
- Slicing
v[1:4]returns a newVectorbuilt from the sliced array.[file:16] - Integer indices return single components.
- Any other index type raises a clear
TypeError.
Dynamic attribute access (x, y, z, t)
For convenience, we can allow attribute-style access to the first few components using x, y, z, and t:[file:16]
shortcut_names = 'xyzt'
def __getattr__(self, name):
cls = type(self)
if len(name) == 1:
pos = cls.shortcut_names.find(name)
if 0 <= pos < len(self._components):
return self._components[pos]
msg = '{.__name__!r} object has no attribute {!r}'
raise AttributeError(msg.format(cls, name))Notes:[file:16]
__getattr__is only called if normal attribute lookup fails.- If
nameis a single character found inshortcut_namesand within range, we return the corresponding component. - Otherwise we raise a standard
AttributeError.
To prevent users from setting these attributes or from inventing arbitrary one-letter attributes, we customize __setattr__:[file:16]
def __setattr__(self, name, value):
cls = type(self)
if len(name) == 1:
if name in cls.shortcut_names:
error = 'readonly attribute {attr_name!r}'
elif name.islower():
error = "can't set attributes 'a' to 'z' in {cls_name!r}"
else:
error = ''
if error:
msg = error.format(cls_name=cls.__name__, attr_name=name)
raise AttributeError(msg)
super().__setattr__(name, value)This enforces that:
v.x,v.y, etc. are read-only views of components.[file:16]- Other lowercase one-letter attribute names are not allowed, to avoid confusing, short, magical names.
Hashing and efficient equality
To make Vector hashable, we define __hash__ along with a consistent __eq__:[file:16]
import functools
import operator
def __eq__(self, other):
return (len(self) == len(other) and
all(a == b for a, b in zip(self, other)))
def __hash__(self):
hashes = (hash(x) for x in self._components)
return functools.reduce(operator.xor, hashes, 0)__eq__does a structural comparison: lengths must match, and all components must be equal pairwise.[file:16]__hash__combines the hashes of all components using bitwise XOR viafunctools.reduce.[file:16]
This ensures equal vectors have equal hashes and can be used in sets and as dict keys.
Formatting vectors, including hyperspherical coordinates
The full implementation goes further and supports rich formatting via __format__:[file:16]
import itertools
def angle(self, n):
r = math.sqrt(sum(x * x for x in self[n:]))
a = math.atan2(r, self[n-1])
if (n == len(self)-1) and (self[-1] < 0):
return math.pi * 2 - a
else:
return a
def angles(self):
return (self.angle(n) for n in range(1, len(self)))
def __format__(self, fmt_spec=''):
if fmt_spec.endswith('h'): # hyperspherical coordinates
fmt_spec = fmt_spec[:-1]
coords = itertools.chain([abs(self)], self.angles())
outer_fmt = '<{}>'
else: # default: Cartesian
coords = self
outer_fmt = '({})'
components = (format(c, fmt_spec) for c in coords)
return outer_fmt.format(', '.join(components))Behavior:[file:16]
- Without a special suffix, vectors are formatted as Cartesian coordinates:
'(x, y, z, ...)'. - If the format spec ends with
'h', they are shown in hyperspherical coordinates inside angle brackets:'<r, θ1, θ2, ...>'.[file:16] - Any remaining format spec (precision, notation) is applied to each component with the built-in
format.
Examples from the docstring:[file:16]
format(Vector([3, 4]))→'(3.0, 4.0)'.format(Vector([1, 1]), '0.5fh')→'<1.41421, 0.78540>'.
Final Vector: a full-featured sequence type
The final Vector implementation supports:[file:16]
- Construction from any iterable of numbers.
- Iteration, length, indexing, slicing.
- Representation with
repr,str,bytes, and reconstruction withfrombytes. - Structural equality and hashing.
- Dynamic attribute access for the first few coordinates.
- Flexible formatting, including Cartesian and hyperspherical forms.
All of this is achieved by following Python’s protocols (iteration, sequence, numeric, formatting) and leaning on duck typing, rather than relying heavily on inheritance.[file:16]