Friday, 10 February 2012

RAII is not Garbage

"RAII is the greatest contribution C++ has made to software development."
- Russel Winder

Managed and non-managed programming languages have very different ways of approaching resource management. Ever since I created my first resource leak (Windows handles in an MFC application) I have been fascinated by resource and memory management and the ways of preventing leaks. In this article I am going to compare the way a non-managed programming language like C++ manages resources with the way managed languages like Java and C# do.


Resource Acquisition Is Initialisation

Resource Acquisition Is Initialisation, often abbreviated to RAII, is badly named but is, as Dr. Winder says, the greatest contribution C++ has made to software development. Unlike in garbage collected languages, memory in C++ is not cleaned up automatically. If you, or something you are using, allocates memory on the free-store, you or that other something must delete it when it is finished with. In his article Garbage Collection and Object Lifetime, Ric Parkin discusses the different ways memory is handled in C++ and C# in a reasonable amount of detail, so I won't go into it too deeply here.

So what is RAII? It is a mechanism that is available as a direct result of C++ having classes with constructors and automatic, deterministic destructors. It is probably best demonstrated with a simple example. The following ScopedCFile class opens a C FILE pointer in its constructor and closes it in its destructor.
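
#include <cstdio>
#include <string>

class ScopedCFile
{
private:
    FILE* file;

public:
    ScopedCFile(const std::string& filename)
    {
        file = fopen(filename.c_str(), "r");
        // ..
    }

    ~ScopedCFile()
    {
        // ..
        // fopen returns a null pointer on failure, which must not be passed to fclose
        if (file)
        {
            fclose(file);
        }
    }
};

int main(int argc, char* argv[])
{
    ScopedCFile file(argv[1]);
    return 0;
}
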
If an instance of ScopedCFile is created on the stack, its destructor is guaranteed to be called when it goes out of scope. This is automatic and deterministic destruction. The destructor cleans up by closing the file. As the client of the ScopedCFile instance you do not have to take any action to ensure clean-up. Of course, if you create the instance on the free-store, you become responsible for deleting it. Deleting the instance causes the destructor to be called and ensures clean-up as before. The calling of the destructor is still deterministic, but it is no longer automatic.
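The order and timing of destruction can be made visible with a small sketch. The Tracer class and the traceScopeExit function below are hypothetical names used only for illustration. Tracer stands in for a resource-owning class like ScopedCFile, but records an event in a log instead of opening a file, so the sequence can be inspected for both a normal scope exit and an exit caused by an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource-owning class such as ScopedCFile.
// Instead of opening a file it records acquire/release events in a log.
class Tracer
{
private:
    std::vector<std::string>& events;
    std::string name;

public:
    Tracer(std::vector<std::string>& events, const std::string& name)
        : events(events), name(name)
    {
        events.push_back("acquire " + name);
    }

    ~Tracer()
    {
        events.push_back("release " + name);
    }

    // Non-copyable: each "resource" has exactly one owner.
    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;
};

// Returns the events produced by a normal and an exceptional scope exit.
std::vector<std::string> traceScopeExit()
{
    std::vector<std::string> events;

    {
        Tracer a(events, "a");
        Tracer b(events, "b");
    }   // b is destroyed first, then a: the reverse of construction order

    // Both objects have been acquired and released by this point.
    assert(events.size() == 4);

    try
    {
        Tracer c(events, "c");
        throw std::runtime_error("failure while using c");
    }
    catch (const std::runtime_error&)
    {
        // Stack unwinding has already destroyed c before this handler runs.
        events.push_back("handled");
    }

    return events;
}
```

Calling traceScopeExit() yields "acquire a", "acquire b", "release b", "release a", "acquire c", "release c", "handled". The release of c is recorded before the handler runs, which is what makes destructors a suitable place for clean-up in the presence of exceptions.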

Smart pointers such as std::unique_ptr can be used to manage free-store memory in C++. They are generally stack based objects that employ RAII to make free-store based object deletion automatic. They are not usually restricted to dealing just with memory and can also deal with resources.
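As a sketch of a smart pointer managing a resource other than memory, std::unique_ptr can be given a custom deleter that closes a C FILE handle. The names FileCloser, UniqueFile, openFile and roundTrip are assumptions made for this example and are not part of any library.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

// Deleter that closes a C FILE handle.
// unique_ptr only invokes its deleter when the held pointer is non-null.
struct FileCloser
{
    void operator()(std::FILE* file) const
    {
        std::fclose(file);
    }
};

using UniqueFile = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file; the returned pointer is empty if fopen fails.
UniqueFile openFile(const std::string& filename, const char* mode)
{
    return UniqueFile(std::fopen(filename.c_str(), mode));
}

// Writes a single line of text to a file and reads it back.
// Each handle is closed automatically at the end of its scope.
std::string roundTrip(const std::string& filename, const std::string& text)
{
    char buffer[256] = {};

    // This sketch only reads back what fits in the buffer.
    assert(text.size() < sizeof buffer);

    {
        UniqueFile out = openFile(filename, "w");
        if (!out)
        {
            return std::string();
        }
        std::fputs(text.c_str(), out.get());
    }   // out is closed (and therefore flushed) here

    UniqueFile in = openFile(filename, "r");
    if (!in)
    {
        return std::string();
    }
    if (std::fgets(buffer, static_cast<int>(sizeof buffer), in.get()) == nullptr)
    {
        return std::string();
    }
    return buffer;
}   // in is closed here on every return path
```

The client code never calls fclose. Ownership can also be transferred by moving a UniqueFile, which a hand-written class like ScopedCFile would have to implement for itself.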

Of course the C++ standard library has its own file handling classes that do all of the resource handling for you so you don't actually need to write a ScopedCFile class in most cases.
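A minimal sketch using the standard library's own stream classes is shown below. The countLines function is a hypothetical helper written for this example; std::ifstream closes the underlying file in its destructor, so no explicit clean-up appears anywhere in the code.

```cpp
#include <fstream>
#include <string>

// Counts the lines in a text file, or returns -1 if it cannot be opened.
long countLines(const std::string& filename)
{
    std::ifstream input(filename.c_str());
    if (!input)
    {
        return -1;
    }

    long lines = 0;
    std::string line;
    while (std::getline(input, line))
    {
        ++lines;
    }
    return lines;
}   // input's destructor closes the file on every return path
```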


Garbage Collection and Destructors

It seems to me that most modern languages are garbage collected. In fact I would go so far as saying that most languages have some sort of automatic memory management and C++ is an example of one of the very few languages that don't (leaving managed C++ to one side). Therefore I am going to pick two languages that I am familiar with, Java and C#, to demonstrate resource management in a garbage collected language.

In garbage collected languages memory deletion is done for you automatically. Objects are generally created on the heap, although there are some exceptions. Every so often the garbage collector is invoked and determines which objects are no longer referenced. These objects are then marked for deletion and subsequently deleted. The upshot of this is that after you create an object you don't have to worry about deleting it again as the garbage collector will do it for you. Sounds wonderful, doesn't it? The truth is it's not bad, but it's not perfect either. You have little control over when the garbage collector is called. Even when it is invoked directly the best you can hope for is that the runtime agrees it's a good time to garbage collect. This means that there is no way of determining when or even if an object will ever be destroyed.

What about destructors? C# has destructors and Java has finalizers. Both are methods that are called just before an object is deleted. Therefore it cannot be determined when or even if (in the case of Java) they will be called. C# destructors and Java finalizers are automatic, but not deterministic. That's not much good for resource clean up. If for example you're opening a lot of files you may well run out of operating system file handles before the garbage collector runs to free them all up again, if it runs at all.

So how do you make sure resources are cleaned up as soon as they are no longer needed? Both Java and C# support try-catch-finally.

public static void main(String[] args) throws IOException
{
    // In Java, args[0] is the first command line argument
    InputStream inputStream = new FileInputStream(args[0]);
    try
    {
        // ..
    }
    catch(Exception e)
    {
        // ..
    }
    finally
    {
        inputStream.close();
    }
}

Clean-up code is placed in the finally block so that resources are released even in the presence of exceptions. C# also has the IDisposable interface which, together with the using statement, provides a shorthand for try-finally.

static void Main(string[] args)
{
    // In C#, args[0] is the first command line argument
    using(var fileStream = new FileStream(args[0], FileMode.Open))
    {
        // ..
    }
}

In the newly released Java 7, try has been extended to provide a similar shorthand, known as try-with-resources, for classes that implement AutoCloseable.

public static void main(String[] args) throws IOException
{
    try(InputStream inputStream = new FileInputStream(args[0]))
    {
        // ..
    }
}


RAII vs try-catch-finally

Despite five wonderful years of C++, I am now a big fan of memory managed languages. That's not quite the same as being a fan of garbage collection. What I like is being able to allocate memory without having to worry about deallocating it. However the non-deterministic destruction behaviour of garbage collected languages pushes the responsibility for cleaning up resources from the encapsulating class to the client of the class.

As I have shown, when using RAII in C++, no clean-up is required by the client:

int main(int argc, char* argv[])
{
    ScopedCFile file(argv[1]);
    return 0;
}

but in a garbage collected language such as C# clean-up code must be written by the client or left for finalisation, which is never a good idea:

static void Main(string[] args)
{
    using(var fileStream = new FileStream(args[0], FileMode.Open))
    {
        // ..
    }
}

Therefore RAII is the clear winner in terms of resource management, as the client is not relied upon to notice that the class they are using requires clean-up. It also keeps the client code cleaner and more readable, as its intention is not obscured by code that serves no purpose other than to clean up.


Execute Around Method

There are ways to move the resource handling from the client into an encapsulating class, and Kevlin Henney discusses this in his article Another Tale of Two Patterns. The basic idea is that you have a class that manages the resource and passes it to something else that uses it, such as another class or a delegate. In the following C# example the Execute method creates a FileStream object and passes it to a delegate. The implementation of the delegate does what needs to be done with the FileStream object and, when it returns, the Execute method cleans up.

static class FileStreamTemplate
{
    // Action<FileStream> is needed so that the delegate can receive the stream
    public static void Execute(string filename,
                               FileMode fileMode,
                               Action<FileStream> action)
    {
        using (var fileStream = new FileStream(filename, fileMode))
        {
            action(fileStream);
        }
    }
}

static void Main(string[] args)
{
    FileStreamTemplate.Execute(args[0], FileMode.Open, delegate(FileStream fileStream)
    {
        // ..
    });
}

For such a simple example as a FileStream, the only advantage is making sure the client cannot forget to clean up in a timely manner. With more complex resources, such as the various classes that collaborate to access a database, the boilerplate becomes a far more useful encapsulation.

Although Execute Around Method in garbage collected languages solves the same problem as RAII in C++, it is still inferior due to the amount of boilerplate required and the complexity and verbosity of the client code.
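For comparison, the same shape can be sketched in C++, where the wrapper has no clean-up code of its own because RAII does the work inside it. The withInputFile and firstLine functions are hypothetical names chosen for this example, and throwing std::runtime_error when the file cannot be opened is an assumption rather than part of the pattern.

```cpp
#include <fstream>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical Execute Around helper: opens the file, hands the stream to
// the caller's action, and relies on std::ifstream's destructor to close it.
void withInputFile(const std::string& filename,
                   const std::function<void(std::istream&)>& action)
{
    std::ifstream input(filename.c_str());
    if (!input)
    {
        throw std::runtime_error("cannot open " + filename);
    }
    action(input);
}   // input is closed here, even if action throws

// Example client: reads the first line of a file.
std::string firstLine(const std::string& filename)
{
    std::string line;
    withInputFile(filename, [&line](std::istream& input)
    {
        std::getline(input, line);
    });
    return line;
}
```

The client in firstLine has to wrap its logic in a lambda, which is the extra verbosity the pattern brings. With RAII alone the same client would simply declare a std::ifstream and use it.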


Finally

It is clear that the encapsulated resource management provided by RAII in C++ is vastly superior to the client-responsible approach of the try-catch-finally pattern in garbage collected languages like Java and C#. I would go so far as to say that the language designers of garbage collected languages have dropped the ball where resource management is concerned. In fact C# using was added as an afterthought. Try-catch-finally is very much treating a symptom, rather than fixing the problem.

This is about the third version of this article. In the previous version I fell into the trap of trying to provide a solution, when really that is an article all to itself. When I started this article I set out only to highlight that what many people see as good enough in garbage collected languages is actually nowhere near enough. Yet all is not lost. In a follow-up article I'll describe some of the concerns in creating a solution to encapsulated resource management in garbage collected languages.


Acknowledgements

Thank you to Russel Winder for inspiration and Ric Parkin for review. Further thanks go to Phil Nash, Alan Stokes and Nick Butler.
