RAII: Tragedy in three acts
Posted by Michał ‘mina86’ Nazarewicz on 1st of April 2023
In a recent Computerphile video, Ian Knight talked about the RAII idiom and its application in C++ and Rust. While the video described the general concepts, I felt different examples could more clearly convey the essence of the topic.
I’ve decided to give my own explanation to hopefully better illustrate what RAII is and how it relates to Rust’s ownership. Then again, for whatever reason I’ve decided to write it as a play with dialogue in faux Old English so it may well be even more confusing instead.
Cast of characters | |
(In the order of appearance) | |
Gregory | A software engineer and Puteal’s employee number #1 |
Sampson | A software engineer and a self-proclaimed 10× developer |
Paris | An apprentice returning to Puteal two summers in a row |
CTO | Puteal’s Chief Technical Officer spending most of his time in meetings |
Admin | Administrative assistant working in Puteal Corporation’s headquarters in Novear |
Act I | |
Scene I | Novear. A public place. |
Enter Sampson and Gregory, two senior engineers of the Puteal Corporation, carrying laptops and phones | |
Gregory | Pray tell, what doth the function’s purpose? |
Sampson | It doth readeth a number from a file. A task as trivial as can be and yet QA reports memory leak after my change. Hence, I come to thee for help. |
Both look at a laptop showing code Sampson has written [error handling omitted for brevity from all source code listings]: | |
double read_double(FILE *fd) { char *buffer = malloc(1024); /* allocate temporary buffer */ fgets(buffer, 1024, fd); /* read first line of the file */ return atof(buffer); /* parse and return the number */ } | |
Gregory | Thine mistake is apparent. Thou didst allocate memory but ne’er freed it. Verily, in C thou needs’t to explicitly free any memory thou dost allocate. Submit this fix and thy code shall surely pass. |
double read_double(FILE *fd) { char *buffer = malloc(1024); /* allocate temporary buffer */ fgets(buffer, 1024, fd); /* read first line of the file */ double result = atof(buffer); /* parse the line */ free(buffer); /* free the temporary buffer */ return result; /* return parsed number */ } | |
Scene II | A hall. |
Enter Puteal CTO, an apprentice called Paris and an Admin | |
Paris | I’ve done as Sampson beseeched of me. I’ve taken the read_double function and changed it so that it doth taketh file path as an argument. He hath warned me about managing memory and so I’ve made sure all temporary buffers are freed. Nonetheless, tests fail. |
double read_double(const char *path) { FILE *fd = fopen(path, "r"); /* open file */ char *buffer = malloc(1024); fgets(buffer, 1024, fd); double result = atof(buffer); free(buffer); return result; } | |
CTO | Thou didst well managing memory, but memory isn’t the only resource that needs to be freed. Just like allocations, if thou dost open a file, thou must close it anon once thou art done with it. |
Exit CTO and Admin towards sounds of a starting meeting | |
Paris | Managing resources is no easy task but I think I’m starting to get the hang of it. |
double read_double(const char *path) { FILE *fd = fopen(path, "r"); /* open file */ char *buffer = malloc(1024); fgets(buffer, 1024, fd); fclose(fd); /* close the file */ double result = atof(buffer); free(buffer); return result; } | |
Scene III | Novear. A room in Puteal’s office. |
Enter Paris and Sampson; they set them down on two low stools, and debug | |
Paris | The end of my apprenticeship is upon me and yet my code barely doth work. It canst update the sum once but as soon as I try doing it for the second time, nothing happens. |
double update_sum_from_file(mtx_t *lock, double *sum, const char *path) { double value = read_double(path); /* read next number from file */ mtx_lock(lock); /* reserve access to `sum` */ value += *sum; /* calculate sum */ *sum = value; /* update the sum */ return value; /* return new sum */ } | |
Sampson | Thou hast learned well that resources need to be acquired and released. But what thou art missing is that not only system memory or a file descriptor are resources. |
Paris | So just like memory needs to be freed, files need to be closed and locks need to be unlocked! |
double update_sum_from_file(mtx_t *lock, double *sum, const char *path) { double value = read_double(path); /* read next number from file */ mtx_lock(lock); /* reserve access to `sum` */ value += *sum; /* calculate sum */ *sum = value; /* update the sum */ mtx_unlock(lock); /* release `sum` */ return value; /* return new sum */ } | |
Paris | I’m gladdened I partook the apprenticeship. Verily, I’ve learned that resources need to be freed once they art no longer used. But also that many things can be modelled like a resource. |
I don’t comprehend why it all needs to be done manually? | |
Exit Sampson while Paris monologues leaving him puzzled | |
Act II | |
Scene I | Court of Puteal headquarters. |
Enter Sampson and Paris bearing a laptop before him | |
Paris | Mine last year’s apprenticeship project looks naught like mine own handiwork. |
Sampson | Thou seest, in the past year we migrated our code base to C++. |
Paris | Aye, I understandeth. But I spent so much time learning about managing resources and yet the new code doth not close its file. |
Enter Gregory and an Admin with a laptop. They all look at code on Paris’ computer: | |
double read_double(const char *path) { std::fstream file{path}; /* open file */ double result; /* declare variable to hold result */ file >> result; /* read the number */ return result; /* return the result */ } | |
Sampson | Oh, that’s RAII. The Resource Acquisition Is Initialisation idiom. C++ useth it commonly. |
Gregory | A resource is acquired when an object is initialised and released when it’s destroyed. The compiler tracks lifetimes of local variables and thusly handles resources for us. |
By this method, all manner of resources can be managed. And forsooth, for more abstract concepts without a concrete object representing them, such as the concept of exclusive access to a variable, a guard class can be fashioned. Gaze upon this other function: | |
double update_sum_from_file(std::mutex &mutex, double *sum, const char *path) { double value = read_double(path); /* read next number from file */ std::lock_guard<std::mutex> lock{mutex}; /* reserve access to `sum` */ value += *sum; /* calculate sum */ *sum = value; /* update the sum */ return value; /* return new sum */ } | |
Paris | I perceive it well. When the lock goes out of scope, the compiler shall run its destructor, which shall release the mutex. Such was my inquiry yesteryear. Thus, compilers can render managing resources more automatic. |
Scene II | Novear. Sampson’s office. |
Enter Gregory and Sampson | |
Sampson | Verily, this bug doth drive me mad! To make use of the RAII idiom, I’ve writ an nptr template to automatically manage memory. |
template<class T> struct nptr { nptr(T *ptr) : ptr(ptr) {} /* take ownership of the memory */ ~nptr() { delete ptr; } /* free memory when destructed */ T *operator->() { return ptr; } T &operator*() { return *ptr; } private: T *ptr; }; | |
Gregory | I perceive… And what of the code that bears the bug? |
Sampson | 'Tis naught but simple code which calculates the sum of numbers in a file: |
std::optional<double> try_read_double(nptr<std::istream> file) { double result; return *file >> result ? std::optional{result} : std::nullopt; } double sum_doubles(const char *path) { nptr<std::istream> file{new std::fstream{path}}; std::optional<double> number; double result = 0.0; while ((number = try_read_double(file))) { result += *number; } return result; } | |
Enter Paris with an inquiry for Sampson; seeing them talk he pauses and listens in | |
Gregory | The bug lies in improper ownership tracking. When ye call the try_read_double function, a copy of thy nptr is made pointing to the file stream. When that function doth finish, it frees that very stream, for it believes that it doth own it. Alas, then you try to use it again in the next loop iteration. |
Why hast thou not made use of std::unique_ptr ? | |
Sampson | Ah! I prefer my own class, good sir. |
Gregory | Thine predicament would have been easier to discern if thou hadst used standard classes. In truth, if thou wert to switch to the usage of std::unique_ptr , the compiler would verily find the issue and correctly refuse to compile the code. |
std::optional<double> try_read_double(std::unique_ptr<std::istream> file) { double result; return *file >> result ? std::optional{result} : std::nullopt; } double sum_doubles(const char *path) { auto file = std::make_unique<std::fstream>(path); std::optional<double> number; double result = 0.0; while ((number = try_read_double(file))) { /* compile error */ result += *number; } return result; } | |
Exit Gregory, exit Paris moment later | |
Scene III | Before Sampson’s office. |
Enter Gregory and Paris, meeting | |
Paris | I’m yet again vexed. I had imagined that with RAII, the compiler would handle all resource management for us? |
Gregory | Verily, for RAII to function, each resource must be owned by a solitary object. If the ownership may be duplicated then problems shall arise. Ownership may only be moved. |
Paris | Couldn’t the compiler enforce that just like it can automatically manage resources? |
Gregory | Mayhap the compiler can enforce it, but it’s not a trivial matter. Alas, if thou art willing to spend time to model ownership in a way that the compiler understands, it can prevent some of the issues. However, thou wilt still require an escape hatch, for in the general case, the compiler cannot prove the correctness of the code. |
Exit Gregory and Paris, still talking | |
Act III | |
Scene I | A field near Novear. |
Enter Gregory and Paris | |
Gregory | Greetings, good fellow! How hast thou been since thy apprenticeship? |
Paris | I’ve done as thou hast instructed and looked into Rust. It is as thou hast said. I’ve recreated Sampson’s code and the compiler wouldn’t let me run it: |
fn try_read_double(rd: Box<dyn std::io::Read>) -> Option<f64> { todo!() } fn sum_doubles(path: &std::path::Path) -> f64 { let file = std::fs::File::open(path).unwrap(); let file: Box<dyn std::io::Read> = Box::new(file); let mut result = 0.0; while let Some(number) = try_read_double(file) { result += number; } result } | |
Gregory | Verily, the compiler hath the vision to behold the migration of file’s ownership into the realm of try_read_double function during the first iteration and lo, it is not obtainable any longer by sum_doubles . |
error[E0382]: use of moved value: `file` let file: Box<dyn std::io::Read> = Box::new(file); ---- move occurs because `file` has type `Box<dyn std::io::Read>`, which does not implement the `Copy` trait let mut result = 0.0; while let Some(number) = try_read_double(file) { ^^^^ value moved here, in previous iteration of loop | |
Paris | Alas, I see not what thou hast forewarned me of. The syntax present doth not exceed that which wouldst be used had this been writ in C++: |
fn try_read_double(rd: &mut dyn std::io::Read) -> Option<f64> { todo!() } fn sum_doubles(path: &std::path::Path) -> f64 { let file = std::fs::File::open(path).unwrap(); let mut file: Box<dyn std::io::Read> = Box::new(file); let mut result = 0.0; while let Some(number) = try_read_double(&mut *file) { result += number; } result } | |
Gregory | Verily, the Rust compiler is of great wit and often elides lifetimes. Nonetheless, other cases may prove more intricate. |
struct Folder<T, F>(T, F); impl<T, F: for <'a, 'b> Fn(&'a mut T, &'b T)> Folder<T, F> { fn push(&mut self, element: &T) { (self.1)(&mut self.0, element) } } | |
Paris | Surely though, albeit this code is more wordy, it is advantageous if I cannot commit an error in ownership. |
Gregory | Verily, there be manifold factors in the selection of a programming tongue. And there may be aspects which may render other choices not imprudent. |
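Gregory’s rule from Act II, that ownership may only be moved, can be sketched by making Sampson’s nptr move-only. This is a minimal illustration and not code from the play: the deleted copy operations, the move operations, the bool conversion, and the Tracked and check_nptr helpers are all assumptions added here. In real code std::unique_ptr already provides this behaviour.

```cpp
#include <cassert>
#include <utility>

// Move-only variant of Sampson's nptr (illustrative sketch only).
template<class T>
struct nptr {
    explicit nptr(T *ptr) : ptr(ptr) {}        // take ownership of the memory
    ~nptr() { delete ptr; }                    // free memory when destructed

    nptr(const nptr &) = delete;               // ownership cannot be duplicated
    nptr &operator=(const nptr &) = delete;

    // Moving transfers ownership and leaves the source empty.
    nptr(nptr &&other) noexcept : ptr(std::exchange(other.ptr, nullptr)) {}
    nptr &operator=(nptr &&other) noexcept {
        if (this != &other) {
            delete ptr;
            ptr = std::exchange(other.ptr, nullptr);
        }
        return *this;
    }

    T *operator->() { return ptr; }
    T &operator*() { return *ptr; }
    explicit operator bool() const { return ptr != nullptr; }

private:
    T *ptr;
};

// Hypothetical helper: bumps a counter while it is alive so that the
// effect of the destructor can be observed.
struct Tracked {
    int &alive;
    explicit Tracked(int &alive) : alive(alive) { ++alive; }
    ~Tracked() { --alive; }
};

// Hypothetical self-check showing the intended usage.
void check_nptr() {
    int alive = 0;
    {
        nptr<Tracked> a{new Tracked{alive}};
        nptr<Tracked> b{std::move(a)};         // ownership moves from a to b
        assert(!a && b);
        assert(alive == 1);
        // nptr<Tracked> c{b};                 // would not compile: copy is deleted
    }
    assert(alive == 0);                        // the object was freed exactly once
}
```

With the copy constructor deleted, passing such an nptr by value without std::move is rejected at compile time, which is the same diagnosis Gregory obtains from std::unique_ptr.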
Aforeword
A thing to keep in mind is that the examples are somewhat contrived. For example, the buffer and file object present in the read_double function can easily live on the stack. Real-life code wouldn’t bother allocating them on the heap. Then again, I could see a beginner making the mistake of trying to bypass std::unique_ptr not having a copy constructor by creating objects on the heap and passing pointers around.
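As a rough sketch of that point, here is how the summing code from Act II might look with the stream borrowed by reference rather than owned through a heap pointer. It is not code from the play: taking std::istream & instead of a path, and the check_sum_doubles helper, are assumptions made so the example can run on an in-memory stream.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>

// Borrow the stream by reference: the caller keeps ownership, so nothing
// is copied, moved or allocated on the heap.
std::optional<double> try_read_double(std::istream &in) {
    double result;
    if (in >> result) {
        return result;
    }
    return std::nullopt;
}

// Sums whitespace-separated numbers until the first read fails.
double sum_doubles(std::istream &in) {
    double result = 0.0;
    while (auto number = try_read_double(in)) {
        result += *number;
    }
    return result;
}

// Hypothetical self-check showing the intended usage.
void check_sum_doubles() {
    std::istringstream in{"1.5 2.5 4"};        // stream object lives on the stack
    assert(sum_doubles(in) == 8.0);
}
```

A caller holding a file would declare a local std::ifstream and pass it in the same way; its destructor closes the file when the variable goes out of scope.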
In the end, is this a better explanation than the one in the aforementioned Computerphile video? I’d argue the code examples represent the discussed concepts better though, to be honest, the form of presentation hinders clarity of the explanation. Yet, I had too much fun messing around with this post so here it is in this form.
Lastly, I don’t know Old English so the dialogue is probably incorrect. I’m happy to accept corrections but otherwise I don’t care that much. One shouldn’t take this post too seriously.