I've been writing Python long enough to remember when print didn't need parentheses. And yet, every year, I stumble on a trick or pattern that makes me mutter, "Where the hell was this my whole career?" These aren't your run-of-the-mill "use list comprehensions" or "remember enumerate()" kind of tips. These are the buried treasures — the ones that save hours, make your code cleaner than a fresh pip install, and leave your colleagues wondering if you've been secretly talking to Guido himself.
1) Zero-copy IPC with multiprocessing.shared_memory + memoryview
Why it matters: sending megabyte-sized arrays between processes normally pickles and copies — expensive. Shared memory + memoryview moves bytes once, not N times.
# producer.py
from multiprocessing import shared_memory
data = b'A' * 10_000_000
shm = shared_memory.SharedMemory(create=True, size=len(data))
mv = memoryview(shm.buf)
mv[:len(data)] = data
print(shm.name) # pass this to the consumer
# consumer.py
from multiprocessing import shared_memory
name = input("shm name: ").strip()
shm = shared_memory.SharedMemory(name=name)
view = memoryview(shm.buf)
print(len(view))  # the full buffer is available without copying
view.release()    # release views before close(), or close() raises BufferError
shm.close()       # every process closes its own handle
shm.unlink()      # exactly one process should unlink, after everyone is done
Pro tip: for numeric arrays use numpy.frombuffer(shm.buf, dtype=...) — zero-copy and fast. Pitfall: lifecycle management — unlink() exactly once, after every process has closed its handle.
2) Audit sensitive actions with sys.addaudithook
Why it matters: see what code did (imports, sockets, file opens) across a running process — without sprinkling logs everywhere.
import sys
def auditor(event, args):
    if event.startswith("open"):
        print("AUDIT:", event, args)

sys.addaudithook(auditor)
open("/tmp/x", "w").close()  # the auditor receives the event
Use-case: hardening, compliance, or quickly discovering where a library uses open() or subprocess. Note: hooks can't be removed once added and run on every event, so keep them minimal and filter aggressively.
3) Import-time AST transforms for surgical instrumentation
Why it matters: change behavior of third-party modules on load — add invariants, auto-logging, or lightweight tracing — without editing the source.
Mini-example (toy): insert an entry log at the top of every function in a module as it's imported.
import ast, importlib.abc, importlib.util
class InstrumentLoader(importlib.abc.Loader):
    def __init__(self, path):
        self.path = path

    def create_module(self, spec):
        return None  # use default module creation

    def exec_module(self, module):
        with open(self.path, 'r', encoding='utf-8') as f:
            src = f.read()
        tree = ast.parse(src)
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                node.body.insert(0, ast.parse(f"print('enter {node.name}')").body[0])
        ast.fix_missing_locations(tree)
        code = compile(tree, self.path, 'exec')
        exec(code, module.__dict__)

# use importlib.util.spec_from_loader to load a module with this loader
Pro tip: use ast.NodeTransformer and be extremely targeted — broad transforms are brittle. Great for temporary instrumentation or building a small APM.
4) Pickle protocol 5 + out-of-band buffers (avoid huge copies)
Why it matters: serializing very large binary payloads (images, model weights) normally copies into the pickle stream. Protocol 5 lets you move buffers out-of-band.
import pickle
big = b'x' * 5_000_000
buffers = []
data = pickle.dumps({'payload': pickle.PickleBuffer(big)},
                    protocol=5, buffer_callback=buffers.append)
# send `data` (small) and `buffers` (out-of-band) separately;
# the receiver reconstructs with pickle.loads(data, buffers=buffers)
When to use: high-throughput RPC, multiprocessing, or network protocols where copy overhead kills latency. Caveat: both sides must support protocol 5 (Python 3.8+), and plain bytes are always copied in-band; wrap large payloads in pickle.PickleBuffer to move them out-of-band.
5) Propagate async context into threads with contextvars + copy_context()
Why it matters: threading.local() breaks under asyncio, where many tasks share one thread. contextvars keeps request-local state across await — and copy_context() carries that same context into a thread worker.
import contextvars, asyncio
from concurrent.futures import ThreadPoolExecutor
request_id = contextvars.ContextVar("request_id")
def blocking_task():
    # runs in a thread but sees the context values we propagated
    return request_id.get(None)

async def main():
    request_id.set("req-42")
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()  # snapshot the current context
    with ThreadPoolExecutor() as pool:
        fut = loop.run_in_executor(pool, ctx.run, blocking_task)
        print(await fut)  # prints "req-42"

asyncio.run(main())
Essential for structured tracing across async + sync boundaries. Don't forget to propagate context explicitly into thread pools: run_in_executor won't do it for you, although asyncio.to_thread (3.9+) copies the context automatically.
6) In-place binary edits with mmap + atomic commit (os.replace)
Why it matters: when you need to patch huge files (DBs, media) you don't want to read/write the whole thing. mmap edits bytes directly; os.replace gives you an atomic commit.
import mmap, struct, os, tempfile

def patch_file_inplace(path, offset, delta):
    # fast in-place edit via mmap; NOT atomic across processes
    with open(path, 'r+b') as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            val = struct.unpack_from('<I', mm, offset)[0]
            struct.pack_into('<I', mm, offset, val + delta)
            mm.flush()
# if you need atomicity across processes, write a copy then os.replace(tmp, path)

# safer pattern: write a small patched copy and os.replace for an atomic swap
def atomic_update(path, updater):
    # create the temp file in the same directory so os.replace stays atomic
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with open(path, 'rb') as src, os.fdopen(fd, 'w+b') as dst:
            dst.write(src.read())  # only OK if the file isn't huge; otherwise stream
            updater(dst)           # patch the copy in place
        os.replace(tmp, path)      # atomic commit
    finally:
        if os.path.exists(tmp):
            os.unlink(tmp)
Use mmap when edits are localized and latency matters. Use the atomic swap when you must never leave a half-written file.
7) Automatic cleanup and ephemeral caches with weakref.finalize + WeakValueDictionary
Why it matters: build caches that clean themselves when entries are no longer used, and attach cleanup actions to objects without __del__ hazards.
import weakref, gc

class Resource:
    def __init__(self, name):
        self.name = name

def report_closed(name):  # the callback must NOT reference the Resource itself
    print("closed", name)

cache = weakref.WeakValueDictionary()

def make(name):
    r = Resource(name)
    # pitfall: weakref.finalize(r, r.close) would pass a bound method, which
    # holds a strong reference to r and would keep it alive forever
    weakref.finalize(r, report_closed, name)
    cache[name] = r
    return r

r1 = make("A")
del r1
gc.collect()  # the finalizer fires once the object is truly collected
Use-case: caches of database connections, temporary native handles, or glue objects to third-party libs. weakref.finalize avoids __del__ pitfalls (ordering issues during interpreter shutdown).
Thanks for reading!