Perusing glossy magazines,1 I was made aware of CVE-2024-2912 which describes how a POST request can lead to Remote Code Execution (RCE) in BentoML servers. A feature most users would rather live without. Bugs happen and I don’t want to criticise the developers unjustly, but knowing the root of the issue was Python’s pickle module, I can only wonder: How the fuck is this still happening?
pickle is insecure by design
import pickle
pickle.loads(b'cos\nsystem\n'
             b'(S"echo evil"\ntR.')

Example of how ‘unpickling’ untrusted data leads to execution of a shell command.

pickle serialisation uses a stack-based virtual machine with a ‘reduce’ operation which allows calling arbitrary Python functions (as shown in the example above). It’s no surprise it keeps popping up in security vulnerabilities. It’s been known for decades that using pickle invites trouble.2 The documentation highlights the dangers quite clearly, but that’s apparently not enough.
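To see the virtual machine at work without running anything, the payload can be disassembled rather than loaded. The following is a minimal sketch using the standard pickletools module; the helper name opcode_names is made up for illustration.

```python
import pickletools

# The same payload as in the example above. It is only inspected here,
# never passed to pickle.loads, so no shell command runs.
payload = b'cos\nsystem\n(S"echo evil"\ntR.'


def opcode_names(data: bytes) -> list[str]:
    """Return the names of the pickle opcodes found in data."""
    # genops yields (opcode, argument, position) triples without
    # executing any of the instructions.
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


print(opcode_names(payload))
# ['GLOBAL', 'MARK', 'STRING', 'TUPLE', 'REDUCE', 'STOP']
```

GLOBAL pushes os.system onto the stack, MARK, STRING and TUPLE build the argument tuple, and REDUCE calls the function with it. Nothing in the format restricts which callable GLOBAL may name.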
Call to action
I call upon you to stop this madness. There are easy steps you can take to make everyone safer:
- If you see a junior developer type import pickle, mentor them and explain that the module must never be used due to security holes.
- If you see a +import pickle line during a code review, reject the patch.
- If you write code yourself, use an alternative serialisation method, e.g. one listed below.
- And finally, if you’re a Python project member, deprecate pickle. Many features have been deprecated already, so backwards compatibility is not a valid excuse. C managed to get rid of gets; I believe it’s possible to heal Python as well.
Alternatives
There are myriad alternatives, so — aside from sheer developer convenience at the cost of user safety — there’s no reason to use pickle in new projects. Some options are:
| Format | Implementation | Notes |
|---|---|---|
| JSON | json built-in | Claims it’s ‘easy for humans to read and write,’ but in practice that’s not true. Does not natively support custom types. |
| TOML | tomllib built-in | Easy for humans to read and write. Does not natively support custom types. |
| YAML | StrictYAML | Easy for humans to read and write. StrictYAML supports a subset of YAML. See below for discussion of security implications. |
| Parquet | PyArrow | Columnar format natively supported by pandas and Polars. |
| Protocol Buffers | protobuf package | Fast, binary format. Requires schema definitions and a compilation step. |
| Borsh | borsh-construct package | Fast, binary format. Requires schema definitions. |
| Safetensors | safetensors package | Designed for ML models. |
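The lack of native support for custom types is usually a small price to pay. As a rough sketch of what the extra work looks like with the built-in json module, here is a round trip for a dataclass. The Job type and the dump_job and load_job helpers are hypothetical names invented for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Job:
    """Hypothetical custom type used only for illustration."""
    name: str
    retries: int


def dump_job(job: Job) -> str:
    # Convert the dataclass to a plain dict which json can encode.
    return json.dumps(asdict(job))


def load_job(text: str) -> Job:
    # json.loads only ever produces dicts, lists, strings, numbers,
    # booleans and None. No callable is looked up or invoked.
    data = json.loads(text)
    if not isinstance(data, dict) or set(data) != {"name", "retries"}:
        raise ValueError("unexpected fields")
    name, retries = data["name"], data["retries"]
    # type(...) is int rejects booleans, which isinstance would accept.
    if not isinstance(name, str) or type(retries) is not int:
        raise ValueError("unexpected field types")
    # The custom type is rebuilt explicitly from validated values.
    return Job(name=name, retries=retries)


job = load_job(dump_job(Job("train", 3)))
```

The point of the design is that the receiving code decides which type gets constructed, not the incoming data.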
What about PyYAML?
The most popular Python library implementing YAML is PyYAML.3 It is better than pickle in that it can be used safely, but for some inexplicable reason it has insecure defaults as seen in the following snippet:
>>> import yaml
>>> payload = '''!!python/object/apply:eval
... args: ["__import__('os').system('echo evil')"]'''
>>> yaml.load(payload, yaml.Loader)
evil
0

To guarantee safety, yaml.SafeLoader has to be used. Why a safe loader isn’t the default is beyond my comprehension, but because that is the case, I don’t recommend using PyYAML. It’s too easy to do the wrong thing, as attested by CVE-2026-24009 published just a fortnight ago.
Instead, StrictYAML is a safe alternative, although it handles only a subset of YAML features and may not always be a feasible option. If full YAML support is necessary, it is prudent to create wrapper functions which hide the choice of the loader and explicitly document the security concerns.
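Such wrappers could look roughly like the sketch below. It assumes PyYAML is installed; load_yaml and dump_yaml are hypothetical names, and the rest of the code base would call them instead of touching the yaml module directly.

```python
import yaml


def load_yaml(text: str):
    """Parse YAML from a possibly untrusted source.

    Security: always uses yaml.SafeLoader, which builds only plain
    Python data and rejects python/object tags. Do not add a loader
    parameter to this function.
    """
    return yaml.load(text, Loader=yaml.SafeLoader)


def dump_yaml(obj) -> str:
    """Serialise plain Python data with yaml.SafeDumper."""
    return yaml.dump(obj, Dumper=yaml.SafeDumper)


# The payload from the snippet above is rejected instead of executed.
payload = '''!!python/object/apply:eval
args: ["__import__('os').system('echo evil')"]'''
try:
    load_yaml(payload)
except yaml.YAMLError as err:
    print("rejected:", type(err).__name__)
```

With the loader choice hidden in one place, a code review only has to check that nothing else imports yaml.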
Conclusion
Just say no to pickle. You should sooner do drugs than use pickle.

1 Robbe Van Roey. 2025. Unveiling BentoML Pickle-Based Serialization. Paged Out! Issue 7. https://pagedout.institute/download/PagedOut_007.pdf
2 The issue was discussed in 2001, and the oldest vulnerability in the CVE catalogue is the 15-year-old CVE-2011-2520. Overall, the database lists 51 entries which reference ‘pickle’ in the context of ‘Python’.
3 According to pypistats.org, PyYAML gets 673 M monthly downloads, ruamel 145 M, StrictYAML 18.4 M and poyo only 234 k.