Comparing I/O Speeds of C, C++, & Lisp

Gene Michael Stover

created Thursday, 30 October 2003
updated Sunday, 23 May 2004

Copyright © 2003, 2004 Gene Michael Stover. All rights reserved. Permission to copy, store, & view this document unmodified & in its entirety is granted.



1 Introduction

On a whim (I swear, just a whim), I wrote a few programs to compare the speeds of input & output in C, C++, & Lisp. You know what I learned? Lisp reads data faster than C++ does. Yeah, I just about fell out of my chair, too.

For the record, I know that any kind of statistics are just statistics - & damned lies. The test I ran was informal, & it considers only a couple of simple cases. Nevertheless, the performance difference is big; I think it's undeniable.

By the way, C++ writes data faster than Lisp does. C is significantly faster than C++ & Lisp for both input & output.

2 The Test Procedure

I ran two tests, one for input & one for output.

For output, there is one test program for each language. A test program reads an integer N from standard input. Then it writes N records to standard output. A record is an integer & a floating point number. The record is written as text. The two fields are separated by a single space, & the entire record is terminated with a newline. A test program tracks the amount of time required to write all the records, & then it prints the number of records, the amount of time, & the output rate in records per second.

The source files for the output tests are:

  1. http://cybertiggyr.com/gene/ios0/out-c.c for C,
  2. http://cybertiggyr.com/gene/ios0/out-cxx.cxx for C++,
  3. http://cybertiggyr.com/gene/ios0/out-lisp for Lisp.

For input, I first generate a large file using a program called generator.c. It writes a lot of records to standard output. Each record contains an integer, a floating-point number, & a string of randomly selected characters. Each field is terminated by a newline. I save the random stuff to a file & then pipe it into the input test programs. There is one input test program for C, one for Lisp, & two for C++. One of the C++ input test programs uses std::cin to read string objects of type std::string. The other C++ input test program uses std::cin to read strings as character arrays (as C does).

The sources for the input test programs are:

  1. http://cybertiggyr.com/gene/ios0/cee.c for C,
  2. http://cybertiggyr.com/gene/ios0/creep.cxx for C++ using std::string objects,
  3. http://cybertiggyr.com/gene/ios0/strep.cxx for C++ using character arrays,
  4. http://cybertiggyr.com/gene/ios0/lee for Lisp.

If you want to reconstruct the programs as I created & ran them, you'll also want these files:

  1. http://cybertiggyr.com/gene/ios0/Makefile &
  2. http://cybertiggyr.com/gene/ios0/go.

For the record, I ran all the tests on Linux 2.2 on a Sony Vaio PCG-XG500 with a 750 MHz or 800 MHz Pentium something-or-other & 128 or 256 megabytes of RAM. The C programs were compiled with GNU gcc 3.0.4. The C++ programs were compiled with GNU g++ 3.0.4. The Lisp I used was GNU CLISP 2.28.

3 Output Test Results

The results of the output performance test are in Figure 1.

Figure 1: The results of the output performance tests.
  records    seconds    rate (records/second)    language    source file
  ...
  250,003    74         3,378.419                Lisp        out-lisp

The first column, records, is the number of records the test program printed. The second column, seconds, is the number of seconds the test program required to print those records. The third column, rate, is the number of records the program was able to print each second, on the mean. The fourth column is the language. The fifth & final column is the name of the source file.

C was the fastest by far. Since it required only 1 second to write 250,003 records, & the limit of resolution of the clock is 1 second, it's possible C is faster than 250,003 records per second.

C++ is more than three times faster than Lisp.

Here we have confirmation of the belief that Lisp's format function is notoriously slow.

4 Input Test Results

The results of the input performance test are in Figure 2.

Figure 2: The results of the input performance tests.
  records    seconds    rate (records/second)    language    source file
  ...
  ...        ...        2,000.020                C++         creep.cxx, std::string

The columns are the same as in Figure 1. Notice that there are two programs for C++. The creep.cxx program treats strings as objects using class std::string. The strep.cxx program treats strings as character arrays, as C does.

The C program uses scanf to read. Both C++ programs use std::cin to read. The Lisp program uses read to read. Notice that the Lisp program is doing full dynamic parsing on the input, whereas the other programs assume the types of the fields they are reading. The Lisp program also does more dynamic allocation because read probably creates fully dynamically allocated & dynamically typed objects from the input stream. So the Lisp program is doing more work (not that it does the Lisp program much good in this case).

As with the output test, C is many times faster than the other languages.

The Lisp program is more than eight times faster than the C++ programs. The C++ program which treats strings as character arrays is slightly faster than the C++ program which uses string objects.

5 Conclusion

Frankly, I'm surprised by the results of the input test. My guess is that the implementation of std::cin (the standard input stream object) in the C++ library that my compiler uses is inefficient; it's surely nothing to do with C++ the language. What cracks me up is how people insist that they need to use C++ for performance, but here it runs slower than a garbage-collected, dynamically typed, interpreted language that does full parsing on its input.

The results of the output tests were what I expected. In fact, they confirmed the belief among Lisp programmers that the format function is slow.

A. Optimizations

I've been asked what compiler optimizations I used, or why I didn't use any.

I intentionally ignored compiler optimizations for several reasons.

  1. Optimizations open a can of worms.
    1. Should I invoke every optimization that some compiler has? What if some of those optimizations are safe for the simple programs I have here but not for large applications? (This is an especially realistic question with Lisp compilers.) Are those optimizations invalid in a performance test because they couldn't be used in a large program?

    2. What if one compiler allows a bunch of optimizations that another compiler doesn't? Is it unfair to the latter compiler?

    3. If optimizations change the results of these tests, do they reflect the performance of large applications?

  2. If optimizations are so unconditionally good, why aren't they the default?

  3. When you rely heavily on optimizations in large programs, it's easy to work your way into hard-to-find bugs because an optimization that worked when the application was simpler doesn't work now. So I've developed the defensive habit of using defaults.

  4. What kind of performance improvement will optimizations create? Will they double the performance, so twenty seconds becomes ten? Increase it ten times, so twenty seconds becomes two? Big whoop.

B. Other File Formats

This document is available online in several formats.

There are no plans to make it available in Pointless Document Format (PDF).

Gene Michael Stover 2008-04-20