In the beginning, there was C. In C, there are three types of memory allocation: static, automatic, and dynamic. Static variables are the constants embedded in the source file, and as they have known sizes and never change, they’re not all that interesting. Automatic allocation can be thought of as stack allocation - space is allocated when a lexical block is entered, and freed when that block is exited. Its most important feature is directly related to that. Until C99, automatically allocated variables were required to have their sizes known at compile time. This means that any string, list, map, and any structure which derived from these had to live on the heap, in dynamic memory.

Eliminating the Garbage Collector: The RAII Way

Dynamic memory was explicitly allocated and freed by the programmer using four fundamental operations: malloc, realloc, calloc, and free. The first two of these perform no initialization whatsoever, memory may contain cruft. All of them except free can fail. In that case, they return a null pointer, the access of which is undefined behavior; in the best case, your program explodes. In the worst case, your program appears to work for a while, processing garbage data before exploding.

Doing things this way is kind of painful because you, the programmer, have sole charge of maintaining a bunch of invariants which cause your program to explode when violated. There must be a malloc call before the variable is ever accessed. You must check that malloc returned successfully before using your variable. There must exist exactly one free call per malloc in the execution path. If zero, memory leaks. If more than one, your program explodes. There may be no access attempts to the variable after it has been freed. Let’s see an example of what this actually looks like:

int main() {
   char *str = (char *) malloc(7); 
   strcpy(str, "toptal");
   printf("char array = \"%s\" @ %u\n", str, str);

   str = (char *) realloc(str, 11);
   strcat(str, ".com");
   printf("char array = \"%s\" @ %u\n", str, str);

   free(str);
   
   return(0);
}
$ make runc
gcc -o c c.c
./c
char * (null terminated): toptal @ 66576
char * (null terminated): toptal.com @ 66576

That code, simple as it is, already contains one antipattern and one questionable decision. In real life, you should never write out byte counts as literals, but instead use the sizeof function. Similarly, we allocate the char * array to exactly the size of the string we need twice (one more than the length of the string, to account for null-termination), which is a fairly expensive operation. A more sophisticated program might construct a larger string buffer, allowing the string size to grow.

The Invention of RAII: A New Hope

All that manual management was unpleasant, to say the least. In the mid-80s, Bjarne Stroustrup invented a new paradigm for his brand-new language, C++. He called it Resource Acquisition Is Initialization, and the fundamental insights were the following: objects can be specified to have constructors and destructors which are called automatically at appropriate times by the compiler, this provides a much more convenient way to manage the memory a given object requires, and the technique is also useful for resources which are not memory.

This means the above example, in C++, is much cleaner:

int main() {
   std::string str = std::string ("toptal");
   std::cout << "string object: " << str << " @ " << &str << "\n";
   
   str += ".com";
   std::cout << "string object: " << str << " @ " << &str << "\n";
   
   return(0);
}
$ g++ -o ex_1 ex_1.cpp && ./ex_1
string object: toptal @ 0x5fcaf0
string object: toptal.com @ 0x5fcaf0

No manual memory management in sight! The string object is constructed, has an overloaded method called, and is automatically destroyed when the function exits. Unfortunately, that same simplicity can lead to other complications. Let’s look at an example in some detail:

vector<string> read_lines_from_file(string &file_name) {
	vector<string> lines;
	string line;
	
	ifstream file_handle (file_name.c_str());
	while (file_handle.good() && !file_handle.eof()) {
		getline(file_handle, line);
		lines.push_back(line);
	}
	
	file_handle.close();
	
	return lines;
}

int main(int argc, char* argv[]) {
	// get file name from the first argument
	string file_name (argv[1]);
	int count = read_lines_from_file(file_name).size();
	cout << "File " << file_name << " contains " << count << " lines.";
	
	return 0;
}
$ make cpp && ./c++ makefile
g++ -o c++ c++.cpp
File makefile contains 38 lines.

That all seems fairly straightforward. The vector lines get filled up, returned, and called. However, being efficient programmers who care about performance, something about this bothers us: in the return statement, the vector is copied into a new vector due to the value semantics in play, shortly before its destruction.

This isn’t strictly true in modern C++ anymore. C++11 introduced the notion of move semantics, in which the origin is left in a valid (so that it can still be properly destroyed) but unspecified state. Return calls are a very easy case for the compiler to optimize to move semantics, as it knows the origin will be destroyed shortly before any further access. However, the purpose of the example is to demonstrate why people invented a whole bunch of garbage-collected languages in the late 80s and early 90s, and in those times C++ move semantics were not available.

For large data, this can get expensive. Let’s optimize this, and just return a pointer. There are a few syntax changes, but otherwise it’s the same code:

Actually, vector is a value handle: a relatively small structure containing pointers to items on the heap. Strictly speaking, it’s not a problem to simply return the vector. The example would work better if it were a large array being returned. As attempting to read a file into a pre-allocated array would be nonsensical, we use the vector instead. Just pretend it’s an impractically large data structure, please.

vector<string> * read_lines_from_file(string &file_name) {
	vector<string> * lines;
	string line;
	
	ifstream file_handle (file_name.c_str());
	while (file_handle.good() && !file_handle.eof()) {
		getline(file_handle, line);
		lines->push_back(line);
	}
	
	file_handle.close();
	
	return lines;
}
$ make cpp && ./c++ makefile
g++ -o c++ c++.cpp
Segmentation fault (core dumped)

Ouch! Now that lines is a pointer, we can see that automatic variables are working as advertised: the vector is destroyed as its scope is departed, leaving the pointer pointing to a forward location in the stack. A segmentation fault is simply attempted access of illegal memory, and so we really should have expected that. Still, we want to get the file’s lines back from our function somehow, and the natural thing is to simply move our variable out of the stack and into the heap. This is done with the new keyword. We can simply edit one line of our file, where we define lines:

vector<string> * lines = new vector<string>;
$ make cpp && ./c++ makefile
g++ -o c++ c++.cpp
File makefile contains 38 lines.

Unfortunately, though this appears to work perfectly, it still has a flaw: it leaks memory. In C++, pointers to the heap must be manually deleted after they are no longer needed; if not, that memory becomes unavailable once the last pointer falls out of scope, and isn’t recovered until the OS manages it when the process ends. Idiomatic modern C++ would use a unique_ptr here, which implements the desired behavior. It deletes the object pointed to when the pointer falls out of scope. However, that behavior wasn’t part of the language until C++11.

In this example, this can be easily fixed:

vector<string> * read_lines_from_file(string &file_name) {
	vector<string> * lines = new vector<string>;
	string line;
	
	ifstream file_handle (file_name.c_str());
	while (file_handle.good() && !file_handle.eof()) {
		getline(file_handle, line);
		lines->push_back(line);
	}
	
	file_handle.close();
	
	return lines;
}

int main(int argc, char* argv[]) {
	// get file name from the first argument
	string file_name (argv[1]);
	vector<string> * file_lines = read_lines_from_file(file_name);
	int count = file_lines->size();
	delete file_lines;
	cout << "File " << file_name << " contains " << count << " lines.";
	
	return 0;
}

Unfortunately, as programs expand beyond toy scale, it rapidly becomes more difficult to reason about where and when exactly a pointer should be deleted. When a function returns a pointer, do you own it now? Should you delete it yourself when you’re done with it, or does it belong to some data structure which will all be freed at once later on? Get it wrong in one way and memory leaks, get it wrong in the other and you’ve corrupted the data structure in question and likely others, as they attempt to dereference pointers which are now no longer valid.

“Into the Garbage Collector, flyboy!”

Garbage Collectors are not a new technology. They were invented in 1959 by John McCarthy for Lisp. With Smalltalk-80 in 1980, garbage collection began to come into the mainstream. However, the 1990s represented the true flowering of the technique: between 1990 and 2000, a large number of languages were released, all of which used garbage collection of one sort or another: Haskell, Python, Lua, Java, JavaScript, Ruby, OCaml, and C# are among the best-known.

What is garbage collection? In short, it’s a set of techniques used to automate away the manual memory management. It’s often available as a library for languages with manual memory management such as C and C++, but it’s much more commonly used in languages which require it. The great advantage is that the programmer simply doesn’t need to think about memory; it’s all abstracted away. For example, the Python equivalent to our file-reading code above is simply this:

def read_lines_from_file(file_name):
	lines = []
	with open(file_name) as fp: 
		for line in fp:
			lines.append(line)
	return lines
	
if __name__ == '__main__':
	import sys
	file_name = sys.argv[1]
	count = len(read_lines_from_file(file_name))
	print("File {} contains {} lines.".format(file_name, count))
$ python3 python3.py makefile
File makefile contains 38 lines.

The array of lines comes into being when first assigned to and is returned without copying to the calling scope. It gets cleaned up by the Garbage Collector sometime after it falls out of that scope, as the timing is indeterminate. An interesting note is that in Python, RAII for non-memory resources is not idiomatic. It’s allowed - we could have simply written fp = open(file_name) instead of using a with block, and let the GC clean up afterward. But the recommended pattern is to use a context manager when possible so that they can be released at deterministic times.

As nice as it is to abstract away memory management, there is a cost. In reference counting garbage collection, all variable assignment and scope exits gain a small cost to update the references. In mark-and-sweep systems, at unpredictable intervals all program execution is halted while the GC cleans up the memory. This is often called a stop-the-world event. Implementations like Python, which use both systems, suffer from both penalties. These issues reduce the suitability of garbage-collected languages for cases where performance is critical, or real-time applications are necessary. One can see the performance penalty in action even on these toy programs:

$ make cpp && time ./c++ makefile
g++ -o c++ c++.cpp
File makefile contains 38 lines.
real    0m0.016s
user    0m0.000s
sys     0m0.015s

$ time python3 python3.py makefile
File makefile contains 38 lines.

real    0m0.041s
user    0m0.015s
sys     0m0.015s

The Python version takes almost three times as much real time as the C++ version. While not all of that difference can be attributed to garbage collection, it’s still considerable.

Ownership: RAII Awakens

Is that the end, then? Must all programming languages choose between performance and ease of programming? No! Programming language research continues, and we are beginning to see the first implementations of the next generation of language paradigms. Of particular interest is the language called Rust, which promises Python-like ergonomics and C-like speed while making dangling pointers, null pointers and such impossible - they won’t compile. How can it make those claims?

The core technology which permits these impressive claims is called the borrow checker, a static checker which runs on compilation, rejecting code which could cause these issues. However, before going too deeply into the implications, we’ll need to talk about the prerequisites.

Ownership

Recall in our discussion of pointers in C++, we touched on the notion of ownership, which in rough terms means “who is responsible for deleting this variable.” Rust formalizes and strengthens this concept. Every variable binding has ownership of the resource it binds, and the borrow checker ensures there is exactly one binding which has overall ownership of the resource. That is, the following snippet from the Rust Book, won’t compile:

let v = vec![1, 2, 3];
let v2 = v;
println!("v[0] is: {}", v[0]);
error: use of moved value: `v`
println!("v[0] is: {}", v[0]);
                        ^

Assignments in Rust have move semantics by default - they transfer ownership. It’s possible to give copy semantics to a type, and this is already done for numeric primitives, but it’s unusual. Due to this, as of the third line of code, v2 owns the vector in question and it can no longer be accessed as v. Why is this useful? When every resource has exactly one owner, it also has a single moment at which it falls out of scope, which can be determined at compile-time. This means in turn that Rust can deliver on the promise of RAII, initializing and destroying resources deterministically based on their scope, without ever using a garbage collector or requiring the programmer to manually release anything.

Compare this to reference-counting garbage collection. In an RC implementation, all pointers have at least two pieces of information: the object pointed to, and the number of references to that object. The object is destroyed when that count reaches 0. This doubles the memory requirement of the pointer and adds a small cost to its use, as the count is automatically incremented, decremented, and checked. Rust’s ownership system offers the same guarantee, that objects get destroyed automatically when they run out of references, but it does so without any runtime cost. The ownership of each object is analyzed and the destruction calls inserted at compile time.

Borrowing

If move semantics were the only way to pass data, function return types would get very complicated, very fast. If you wanted to write a function which used two vectors to produce an integer, which didn’t destroy the vectors afterwards, you’d have to include them in the return value. While that’s technically possible, it’s terrible to use:

fn foo(v1: Vec<i32>, v2: Vec<i32>) -> (Vec<i32>, Vec<i32>, i32) {
    // do stuff with v1 and v2

    // hand back ownership, and the result of our function
    (v1, v2, 42)
}

let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];

let (v1, v2, answer) = foo(v1, v2);

Instead, Rust has the concept of borrowing. You can write the same function like so, and it will borrow the reference to the vectors, giving it back to the owner when the function ends:

fn foo(v1: &Vec<i32>, v2: &Vec<i32>) -> i32 {
    // do stuff
    42
}

let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];

let answer = foo(&v1, &v2);

v1 and v2 return their ownership to the original scope after fn foo returns, falling out of scope and being destroyed automatically when the containing scope exits.

It is worth mentioning here that there are restrictions on borrowing, enforced by the borrow checker at compile time, which the Rust Book puts very succinctly:

Any borrow must last for a scope no greater than that of the owner. Second, you may have one or the other of these two kinds of borrows, but not both at the same time:

  • one or more references (&T) to a resource

  • exactly one mutable reference (&mut T)

This is noteworthy because it forms a critical aspect of Rust’s protection against data races. By preventing multiple mutable accesses to a given resource at compile time, it guarantees that code cannot be written in which the outcome is indeterminate because it depends on which thread arrived at the resource first. This prevents issues such as iterator invalidation and use after free.

The Borrow Checker in Practical Terms

Now that we know about some of Rust’s features, let’s look at how we implement the same file line counter we’ve seen before:

fn read_lines_from_file(file_name: &str) -> io::Result<Vec<String>> {
	// variables in Rust are immutable by default. The mut keyword allows them to be mutated.
	let mut lines = Vec::new();
	let mut buffer = String::new();
	
	if let Ok(mut fp) = OpenOptions::new().read(true).open(file_name) {
		// We enter this block only if the file was successfully opened.
		// This is one way to unwrap the Result<T, E> type Rust uses instead of exceptions.
		
		// fp.read_to_string can return an Err. The try! macro passes such errors 
		// upwards through the call stack, or continues otherwise.
		try!(fp.read_to_string(&mut buffer));
		lines = buffer.split("\n").map(|s| s.to_string()).collect();
	}
	
	Ok(lines)
}

fn main() {
	// Get file name from the first argument.
	// Note that args().nth() produces an Option<T>. To get at the actual argument, we use
	// the .expect() function, which panics with the given message if nth() returned None, 
	// indicating that there weren't at least that many arguments. Contrast with C++, which
	// segfaults when there aren't enough arguments, or Python, which raises an IndexError.
	// In Rust, error cases *must* be accounted for.
	let file_name = env::args().nth(1).expect("This program requires at least one argument!");
	if let Ok(file_lines) = read_lines_from_file(&file_name) {
		println!("File {} contains {} lines.", file_name, file_lines.len());
	} else {
		// read_lines_from_file returned an error
		println!("Could not read file {}", file_name);
	}
}

Beyond the items already commented on in the source code, it’s worth going through and tracing the lifetimes of the various variables. file_name and file_lines last until the end of main(); their destructors are called at that time without extra cost, using the same mechanism as C++’s automatic variables. When calling read_lines_from_file, file_name is loaned immutably to that function for its duration. Within read_lines_from_file, buffer acts the same way, destroyed when it falls out of scope. lines, on the other hand, persists and is returned successfully to main. Why?

The first thing to note is that as Rust is an expression-based language, the return call may not look like one at first. If the last line of a function omits the trailing semicolon, that expression is the return value. The second thing is that return values get special handling. They are assumed to want to live at least as long as the function’s caller. The final note is that due to the move semantics involved, there’s no copy necessary to transmute Ok(lines) into Ok(file_lines), the compiler simply makes the variable point at the appropriate bit of memory.

“Only at the end do you realize the true power of RAII.”

Manual memory management is a nightmare that programmers have been inventing ways to avoid since the invention of the compiler. RAII was a promising pattern, but crippled in C++ because it simply didn’t work for heap allocated objects without some odd workarounds. Consequently, there was an explosion of garbage-collected languages in the 90s, designed to make life more pleasant for the programmer even at the expense of performance.

However, that’s not the last word in language design. By using new and strong notions of ownership and borrowing, Rust manages to merge the scope-basing of RAII patterns with the memory security of garbage-collection; all without ever requiring a garbage collector to stop the world, while making safety guarantees not seen in any other language. This is the future of systems programming. After all, “to err is human, but compilers never forget.

About the author

Peter Goodspeed-Niklaus, Germany
member since August 31, 2015
Peter received a B.S. in Computer Science, with distinction, from WPI in 2005. Immediately after, he took a decade to broaden himself professionally. He's spent that time as an English teacher in Japan and a Blackhawk pilot for the US Army, and has recently left the Army. He writes computer games in his free time. [click to continue...]
Hiring? Meet the Top 10 Freelance Software Developers for Hire in December 2016

Comments

Tarik Zakaria Benmerar
Big thanks. It's not only memory leaks issue. Every week, we hear about security flaws because these old languages that are still widely used today. Time to innovate, may I say. The coming decades will be for effective technologies. Easy, secure and performant
fredzvt
Excellent article! Thanks for sharing!
aybSerti
i found you do promotion for Rust language much more then describing how doing garbage collection without garbage collector. First, i though you ae going to get in deep details about how Rust compilator and runtime deals with garbage collector. anyways. i like the way you introduce Rust for person coming from C++ and JS world
optiklab
Brilliant article, thanks! You forgot only one new technique... :)
Milan Marković
When returning a vector of strings in your first example, another mechanism might come first. Before going for move semantics, the compiler will try to perform RVO (return value optimization). This is true for every function that has a single return statements (no complicated logic with multiple returns), and is there for most modern C++ compilers and offers better performance to move semantics. https://en.wikipedia.org/wiki/Return_value_optimization
Kumar Sambhav Manu
Only thing I hated in the article were cute but big cartoons, and also the diversion to python and then the abrupt comeback to Rust. Otherwise, good read.
David Nichols
There is a garbage collection approach that supports RAII with completely automatic memory management with an algorithm that handles cyclic directed graphs of objects in a multithreaded context: https://github.com/qorelanguage/qore/wiki/Prompt-Collection AFAIK this is the first time such an approach has been implemented (disclaimer: I am the author of the language and the prompt collection implementation)
comments powered by Disqus
Subscribe
The #1 Blog for Engineers
Get the latest content first.
No spam. Just great engineering and design posts.
The #1 Blog for Engineers
Get the latest content first.
Thank you for subscribing!
You can edit your subscription preferences here.
Trending articles
Relevant technologies
About the author
Peter Goodspeed-Niklaus
Python Developer
Peter received a B.S. in Computer Science, with distinction, from WPI in 2005. Immediately after, he took a decade to broaden himself professionally. He's spent that time as an English teacher in Japan and a Blackhawk pilot for the US Army, and has recently left the Army. He writes computer games in his free time.