Top compiler Questions

List of Tags

I have a Flash project, and it has many source files. I have a fairly heavily-used class, call it Jenine. I recently (and, perhaps, callously) relocated Jenine from one namespace to another. I thought we were ready - I thought it was time. The new Jenine was better in every way - she had lost some code bloat, she had decoupled herself from a few vestigial class relationships, and she had finally come home to the namespace that she had always secretly known in her heart was the one she truly belonged to. She was among her own kind.

Unfortunately, Flash would have none of that. Perhaps it had formed an attachment. Perhaps it didn't want Jenine to be decoupled. Either way, it clung to the old, perfect version of Jenine in its memory. It refused to move on. It ignored her (function) calls. It tried to forget her new, public interfaces. Instead, every instance of Jenine that it constructed was always a copy of the old version, down to its classpath:

var jenineInstance:Jenine = new Jenine();
trace( getQualifiedClassName(jenineInstance));
// Should print: com.newnamespace.subspace::Jenine
// Prints: com.oldnamespace.subspace::Jenine
// Ah, young love!

We fought. I'm not proud of some of the things I said or did. In the end, in a towering fit of rage, I deleted all references of Jenine completely. She was utterly, completely erased from the system. My cursor fell upon the "Empty Trash" menu option like the cold lid of a casket.

I don't think Flash ever recovered. To this day it still clings to the memory of Jenine. Her old, imperfect definitions still float through my project like abandoned ghosts. Whenever I force Flash to compile, it still lovingly inserts her into my movie, nestling her definition in amongst the other, living classes, like a small shrine. I wonder if they can see her.

Flash and I don't really talk anymore. I write my code, it compiles it. There's a new girl in town named Summer who looks almost identical to Jenine, as if someone had just copied her source-code wholesale into a new class, but Flash hasn't shown any interest. Most days it just mopes around and writes bad poetry in my comments when it thinks I'm not looking.

I hope no one else has had a similar experience, that this is just a singular, painful ripple in the horrifying dark lagoon that is the Flash code-base. If, by some fluke chance you have, or you have any idea how to erase whatever damn cache the compiler is using, please, please help.

Answered By: murzeb ( 448)

Flash still has the ASO file, which is the compiled byte code for your classes. On Windows, you can see the ASO files here:

C:\Documents and Settings\username\Local Settings\Application Data\Adobe\Flash CS4\en\Configuration\Classes\aso

On a Mac, the directory structure is similar in /Users/username/Library/Application Support/

You can remove those files by hand, or in Flash you can select Control->Delete ASO files to remove them.

Preferred Languages : C/C++, Java, and Ruby

I am looking for some helpful books/tutorials on how to write your own compiler simply for educational purposes. I am most familiar with C/C++, Java, and Ruby, so I prefer resources that involve one of those three, but any good resource is acceptable.

Answered By: Michael Stum ( 574)

Big List of Resources:

¶ Link to a PDF
$ Link to a printed book

Dan Goldstein

Compiling a C++ file takes a very long time when compared to C#, Java. It takes significantly longer to compile a C++ file than it would to run a normal size Python script. I'm current using VC++ but it's the same with any compiler. Why is this?

The two reasons I could think of were loading header files and running the preprocessor, but that doesn't seem like it should explain why it takes so long.

Answered By: jalf ( 364)

Several reasons:

  • Header files: Every single compilation unit requires hundreds or even thousands of headers to be 1: loaded, and 2: compiled. Every one of them typically has to be recompiled for every compilation unit, because the preprocessor ensure that the result of compiling a header might vary between every compilation unit. (A macro may be defined in one compilation unit which changes the content of the header).

    This is probably the main reason, as it requires huge amounts of code to be compiled for every compilation unit, and additionally, every header has to be compiled multiple times (once for every compilation unit that includes it)

  • Linking: Once compiled, all the object files have to be linked together. This is basically a monolithic process that can't very well be parallelized, and has to process your entire project.

  • Parsing: The syntax is extremely complicated to parse, depends heavily on context, and is very hard to disambiguate. This takes a lot of time

  • Templates: In C#, List<T> is the only type that is compiled, no matter how many instantiations of List you have in your program. In C++, vector<int> is a completely separate type from vector<float>, and each one will has to be compiled separately.

    Add to this that templates make up a full turing-complete "sub-language" that the compiler has to interpret, and this can become ridiculously complicated. Even relatively simple template metaprogramming code can define recursive templates that create dozens and dozens of template instantiations. Templates may also result in extremely complex types, with ridiculously long names, adding a lot of extra work to the linker. (It has to compare a lot of symbol names, and if these names can grow into many thousand characters, that can become fairly expensive).

    And of course, they exacerbate the problems with header files, because templates generally have to be defined in headers, which means far more code has to be parsed and compiled for every compilation unit. In plain C code, a header typically only contains forward declarations, but very little actual code. In C++, it is not uncommon for almost all the code to reside in header files.

  • Optimization: C++ allows some very dramatic optimizations. C# or Java don't allow classes to be completely eliminated (they have to be there for reflection purposes), but even a simple C++ template metaprogrogram can easily generate dozens or hundreds of classes, all of which are inlined and eliminated again in the optimization phase.

    Moreover, a C++ program must be fully optimized by the compiler. A C# program can rely on the JIT compiler to perform additional optimizations at load-time, C++ doesn't get any such "second chances". What the compiler generates is as optimized as it's going to get.

  • Machine code: C++ is compiled to machine code which may be somewhat more complicated than the bytecode Java or .NET use (especially in the case of x86).
    (This is mentioned out of completeness only because it was mentioned in comments and such. In practice, this step is unlikely to take more than a tiny fraction of the total compilation time.)

Most of these factors are shared by C code, which actually compiles fairly efficiently. The parsing step is a lot more complicated in C++, and can take up significantly more time, but the main offender is probably templates. They're useful, and make C++ a far more powerful language, but they also take their toll in terms of compilation speed.