# TIL: Running Multiple Functions in Parallel with Fastcore
## Basic Usage: One Function, Many Items
The core pattern for `parallel` is applying one function across multiple items:
```python
from fastcore.parallel import parallel

def my_func(x):
    return x * 2

results = parallel(my_func, [1, 2, 3, 4, 5])
# Returns: [2, 4, 6, 8, 10]
```
**Key insight:** The first argument is the function; the second is the iterable of items to process.
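If you don't have fastcore handy, the same pattern can be sketched with only the standard library's `concurrent.futures` (note: this sketch uses threads for simplicity, while fastcore defaults to processes):

```python
from concurrent.futures import ThreadPoolExecutor

def my_func(x):
    return x * 2

# map() applies the function to each item and preserves input order
with ThreadPoolExecutor() as ex:
    results = list(ex.map(my_func, [1, 2, 3, 4, 5]))

print(results)  # [2, 4, 6, 8, 10]
```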
## Running Multiple Different Functions in Parallel
### The Challenge
What if you have different functions you want to run simultaneously?
- `func_one()`
- `func_two()`
- `func_three()`
### The Solution: Functions as Data
**Sticky Analogy:** Think of your functions as items on a to-do list. Instead of processing data in parallel, you're processing tasks, and each task happens to be "call this function."
```python
from fastcore.parallel import parallel
import time

def func_one():
    time.sleep(1)
    return "func_one done (1s)"

def func_two():
    time.sleep(2)
    return "func_two done (2s)"

def func_three():
    time.sleep(3)
    return "func_three done (3s)"

# Run different functions in parallel
results = parallel(lambda f: f(), [func_one, func_two, func_three], threadpool=True)
print(results)
```
**How it works:** We pass the functions themselves as the "items" list, and use a callable that invokes each one.
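The same functions-as-items trick works with the standard library's `ThreadPoolExecutor` directly. A minimal sketch (with shorter sleeps so it runs quickly) shows the waits overlapping instead of adding up:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def func_one():
    time.sleep(0.2)
    return "func_one done"

def func_two():
    time.sleep(0.4)
    return "func_two done"

# The functions are the work items; the worker just calls each one.
start = time.perf_counter()
with ThreadPoolExecutor() as ex:
    results = list(ex.map(lambda f: f(), [func_one, func_two]))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")  # ~0.4s, not 0.6s: the sleeps overlap
```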
## The Pickling Problem
### Why Lambdas Fail with Multiprocessing
If you try this without `threadpool=True`:
```python
results = parallel(lambda f: f(), [func_one, func_two, func_three])
# ❌ Error: Can't pickle <function <lambda>>
```
**Sticky Analogy:** Multiprocessing is like sending instructions to workers in different buildings. You need to write everything down (serialize/pickle) so they can understand it. A lambda is like saying "you know, that thing": it can't be written down because it has no name!
### Technical Explanation
- Multiprocessing creates separate Python processes
- Data must be serialized (pickled) to send between processes
- Lambdas are anonymous — Python can't look them up by name, so they can't be pickled
- Threads share memory — no serialization needed, so lambdas work fine
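You can see the failure without any process pools at all, since `pickle` refuses the lambda directly:

```python
import pickle

# Pickle serializes functions by reference (module + qualified name).
# A lambda has no importable name, so serialization fails immediately.
try:
    pickle.dumps(lambda f: f())
    lambda_picklable = True
except Exception as e:
    lambda_picklable = False
    print(f"pickle failed: {type(e).__name__}")

print("lambda picklable:", lambda_picklable)
```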
## Two Solutions to the Pickling Problem
### Solution 1: Use Threads (`threadpool=True`)
```python
results = parallel(lambda f: f(), funcs, threadpool=True)
```
- ✅ Lambdas work
- ✅ No serialization overhead
- ⚠️ Subject to Python's GIL (Global Interpreter Lock)
- 👍 Best for: I/O-bound tasks (network, file operations)
### Solution 2: Use a Named Function
```python
def call_func(f):
    return f()

results = parallel(call_func, funcs)
```
- ✅ Works with multiprocessing
- ✅ True parallelism (bypasses GIL)
- ⚠️ Serialization overhead
- 👍 Best for: CPU-bound tasks
## Quick Reference
| Scenario | Approach | Code |
|---|---|---|
| Same function, many items | Basic `parallel` | `parallel(fn, items)` |
| Different functions, I/O-bound | Threadpool + lambda | `parallel(lambda f: f(), funcs, threadpool=True)` |
| Different functions, CPU-bound | Named caller function | `parallel(call_func, funcs)` |
## Threads vs Processes
| Feature | Threads (`threadpool=True`) | Processes (default) |
|---|---|---|
| Memory | Shared | Separate |
| Pickling needed | No | Yes |
| GIL limitation | Yes | No |
| Best for | I/O-bound | CPU-bound |
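The "Memory" row is easy to demonstrate with the standard library: threads all see and mutate the same objects, so a single dict can be updated from several threads at once. A minimal sketch:

```python
import threading

counter = {"n": 0}       # one dict, visible to every thread
lock = threading.Lock()  # guard the increment against races

def bump():
    for _ in range(1000):
        with lock:
            counter["n"] += 1

threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["n"])  # 4000: every thread updated the shared dict
```

With processes, each worker gets its own copy of the interpreter state, so updates made in a child do not propagate back to the parent; that is exactly why results have to be pickled and sent across instead.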
## TL;DR
- `parallel(fn, items)` runs `fn` on each item in parallel
- Run different functions: pass functions as items, use `lambda f: f()` to call them
- Lambdas + multiprocessing don't mix: use `threadpool=True` or a named function
- Threads for I/O, processes for CPU