GNU Compiler Collection (GCC): Overview and Tips on the GCC Compiler Toolchain

GCC Background

Original author

Richard M. Stallman[1] is the original author of GCC. Richard is also the founder of the GNU Project and the Free Software Foundation (FSF), and the father and current maintainer of the One True Emacs.

Big Picture

GCC is a critical piece of several free operating systems. GNU Hurd[2] and GNU / Linux are two of these free operating systems.

Part of GNU Project (GNU’s Not UNIX)

Managed by Free Software Foundation (FSF)

Licensed under GNU General Public License (GNU GPL)

Initially named "GNU C Compiler"

Development History

GCC Development Dates

1984 - Started GCC
1987 - First release (1.0)
1992 - Release 2.0 included C++ support
1997 - Started experimental branch (EGCS)
1999 - EGCS became the official GCC (release 2.95)
2001 - Release 3.0
2007 - Current release is 4.1.1 (released May 2006)

GNU Philosophy

Free software

Not necessarily free of cost

But liberty to use, modify, and redistribute with limited restrictions designed to perpetuate freedom (no proprietary modifications)

Distributors can sell installations, and support, but must provide the source code and not restrict further copying and redistribution

Free Software Foundation operates on sales of books and donations of money, time, and equipment.

Provide a free operating system

To support an environment, free of proprietary software, in which to work and communicate

Supported Platforms

Supports the world's widest selection of processors and platforms

Supports most platforms available today

Processor and system manufacturers port GCC to their hardware first

32-bit CPU

Sun SPARC
Motorola M68k
HP HPPA 7100 / 8000

64-bit CPU

IA-64 / IPF
Sun UltraSparc
DEC Alpha
IBM Cell


Digital Signal Processor (DSP)

ADI Blackfin
TI TMS320C3x / TMS320C4x

Microcontroller

Motorola 68HC11 / 68HC12
Renesas H8/300

Other Embedded Processor

Argonaut ARC
Axis Communications CRIS
Vitesse IQ2000
Renesas M32C / M32R
Zilog Z80

OS / System

GNU Hurd
MS-Windows / MSDOS
FreeBSD / NetBSD
VAX Ultrix
Mac OS X
Cray Unicos/Mk

Supported Languages

List of languages supported by GCC

Built-in Languages

The following languages have built-in interfaces, and require no additional installation steps after GCC is downloaded and installed.

C (gcc command)
C++ (g++ command)
Objective-C / Objective-C++ (gcc command)
Fortran (gfortran command; g77 before GCC 4.0)
Java (gcj command)
Ada (gnat command)
Assembly language of every supported processor (as command)

Third-Party Supported Languages

The following languages are supported through "front-end" interfaces that are not built in to GCC. Users must obtain these front-ends from other sources.



Compiler Toolchain

ar – Code archiver

objdump – Object dump; disassembly

nm – Name; parses object file for symbols

strip – Removes the symbol table from an object file

gdb – GNU debugger

make – Project building; automates creation of executable and other object files

gprof – GNU profiling; collects runtime function timing

gcov – GNU coverage; collects line by line execution counts

Compiler Toolchain


gcc – C compiler front-end

g++ – C++ compiler front-end

cpp – Preprocessing (1st step)

gcc / g++ – Compilation (2nd step)

as – Assembly (3rd step)

ld – Linking (4th step)

Front-end command is typically used to perform all compile steps, thereby hiding the individual steps.

Major Features

Large number of supported platforms

Optimizing compiler

Substantial optional warning messages

Native compiler

Cross-compiler (target platform different than the one running GCC)

Written in C with modular design

Allows new languages to be used on any supported processor

Allows new processors to use any supported language

Compiles itself

Several currently supported languages

Language libraries increase developer productivity.

Well documented

Command line interface

Allows for Integrated Development Environments (IDE) like Eclipse

Easy to port

Supports dynamic (shared) libraries, which give smaller executables and let running programs share one copy of the library code

Free software (most important)

Allows anyone to add a language, a processor, a feature, or fix a bug

Allows enhancements to be shared

Large development community increases number of features and provides support

C / C++ File Extensions

Compile Sequence

Error / Warning Messages

Preprocessing

#if statement syntax

Comment syntax (nested C-style comments)

Macro syntax

Missing include files

Recursion for include files is too deep

Compilation (not exhaustive list)

Statement syntax

Un-initialized or undeclared variable (undeclared function)

Un-used variable / parameter

Type mismatch or conversion problem

Missing prototype

Standards compliance

Unknown #pragma type

Assembly

None (if the assembly file was generated by GCC rather than written by hand or produced by another tool)

Linking

Missing object file or static library file

Missing "extern" symbol

Wrong or corrupted file format of an input file

main() function not defined

Execution (run time)

Dynamic library not found or could not be loaded

Memory access privilege violation

Arithmetic exception (divide by zero)

Illegal instruction

Compiler Front-end (gcc / g++ commands)

Wrapper that coordinates compile sequence

Provides single (language dependent) command to interface to all the compile steps

Reduces complexity to one command

Chooses correct compile steps based on input file extension

Checks file extensions for non-default output filenames

Layered user interface approach allows simple usage as well as more powerful complex usage

Example below compiles source file into an executable and warns of any major issues.

gcc -Wall hello.c -o hello.exe

Command Line Options

Recommended options for warning:

-ansi -pedantic -Wall -W -Wconversion -Wshadow -Wcast-qual -Wwrite-strings -O2

The -ansi and -pedantic options are painful to use since they limit many common coding practices that most modern compilers properly handle anyway. See the -std option for some alternatives to selecting a specific language standard.

Note that the -ansi option selects the C90 standard for C source code. C90 does not allow C++ style comments (i.e. //), so code that contains them produces a confusing warning or error message, depending on the GCC version.

The -O2 option is recommended because some warnings (for example, use of an uninitialized variable) depend on data-flow analysis that is performed only when optimization is enabled.

For the most common warnings that typically indicate bugs, the -Wall and -W options should both be used (-W is also spelled -Wextra).

Keep all temporary files (-save-temps)

Show subcommand calls and other verbose messages (-v)

Add symbol table to object / executable files for debugging (-g[x])

Optimize the output to the assembly file (-O[x] capital letter o)

Specify path to includes (-I capital letter i)

Specify a directory to search for library files (-L)

Specify a library to link against (-l small letter L)

Preprocessing (cpp command)

Equivalent commands:

gcc -E hello.c -o hello.i
cpp hello.c > hello.i


Expands macros and removes their definitions (#define)

Handles compiler conditional directives (#if / #elif / #else / #endif)

Inserts header files recursively (#include)

Removes comments (/* */, // )

Adds line number and filename indications (#)

Does not remove pragma directives (#pragma)

Line number and filename indications allow for accurate localization of warning and error messages identified during compilation step.



Compilation (gcc / g++ command)

Example:

gcc -S hello.i -o hello.s


Parses preprocessed file

Processes statements

Optimizes (optional)

Converts statements into assembly

Outputs assembly file

Prints warning and error messages

Current GCC versions combine Preprocessing and Compilation steps into one integrated step.

Assembly (as command)

Equivalent commands:

gcc -c hello.s -o hello.o
as hello.s -o hello.o


Parses assembly file

Converts assembly to machine code

Outputs object file

External function calls and data are left with undefined references.

Linking (ld command)

Equivalent commands:

gcc hello.o -o hello.exe
ld -Bdynamic --dll-search-prefix=cyg -o hello.exe /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../crt0.o -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -L/usr/lib/gcc/i686-pc-cygwin/3.4.4/../../.. hello.o -lgcc -lcygwin -luser32 -lkernel32 -ladvapi32 -lshell32 -lgcc


Reads input object files and static libraries

Resolves undefined references

Sets up any dynamic library references

Links all the object files in order, as found on the command line

Outputs the executable file

Prints an error message if references can not be resolved

Dynamic library references will be resolved at execution time.

Extra C Run Time (CRT) object files are included to perform additional housekeeping required by the system. These are system dependent (specific to the processor hardware and to the OS kernel and utilities).

If the native compiler (or cross compiler) was built correctly, there is no need to understand or remember the "ld" form of the command. Just use the "gcc" form with all the project-specific object files included on the command line.

Archive (ar command)

Bundles multiple object files into one static library file

Example – create archive then use it:

ar cr libsockio.a send.o receive.o
gcc main.c libsockio.a -o talk.exe

Example – list archive’s table of contents:

ar t libsockio.a

Standard format for static library filename:

lib<user_defined>.a where "<user_defined>" is specified by the developer

Header files for the archive that are included by external source files (or by other header files) need to be distributed with the static library file.

Disassembly (objdump command)

Disassembles an object or executable file back into an assembly file with optional interspersed source code


gcc -g hello.c -o hello.exe
objdump -C -D -S -l hello.exe > hello.disassembly

Object or executable file must have been assembled (and linked) with debugging turned on (-g option) to include the interspersed source code.

Helpful for understanding what exactly the compiler is doing

Inserted (interspersed) lines of source code before blocks of assembly op-codes show what assembly was generated for specific C / C++ statements

The best results of interspersed source code occur when optimization is turned off (-O0 option (capital letter o followed by zero)).

Symbol Table Parsing (nm command)

Lists the symbols in the object file’s or executable file’s symbol table


nm hello.exe

The file must not have been stripped. Debugging information (-g option) is not required, because the linker symbol table is included by default.

"nm" stands for name

Useful for troubleshooting linking errors where a symbol is:

defined in multiple object or library files

referenced but not defined in any object or library file

Symbol Table Removal (strip command)

Removes the symbol table from an object or executable file


strip hello.exe

Significantly shrinks the file size

Useful after debugging is complete and the file is ready to be distributed

Helps to limit reverse engineering of proprietary software

A release version of a file can easily be stripped from the validated debug version, which simplifies an audit step. Otherwise more effort would be needed to show that a recompiled release version (without a symbol table) exactly matches the debug version.

Debugging (gdb command)

Supports the debugging of running code as well as diagnosis of core dump files

Example to start and debug an executable:

gdb -se=hello.exe

Example to debug a core file:

gdb -se=hello.exe -c core

Graphical front ends, such as DDD, offer a simplified user interface to gdb.

Project Building (make command)

Compiles the source code of a project and builds the executable(s)



Builds only the files that are out of date

Highly configurable

Text based Makefile is used to specify build steps.

Uses generic build rules, source specific steps, or a mix

Profiling (gprof command)

Analyzes the processing time for each function in an executable

Three step technique:

Compiler adds additional code to "instrument" the executable

Collect the timing data by running the executable

Use gprof tool to report the function timing


gcc -pg hello.c -o hello.exe
gprof hello.exe

Both compiler and linker must receive the profiler command line option (-pg)

Instrumented executable generates a gmon.out file containing data related to the function timing.

Useful when looking to improve the performance of an executable

The best effort to optimize code should concentrate on the functions that use the largest percentage of processor time.

Also gives information on the function call sequence

Coverage Testing (gcov command)

Analyzes which source lines of code have been executed or "covered"

Three step technique:

Compiler adds additional code to "instrument" the executable

Collect the coverage data by running the executable

Use gcov tool to report the number of times each line was executed


gcc -fprofile-arcs -ftest-coverage hello.c -o hello.exe
gcov hello.c

gcov creates a file (hello.c.gcov) containing an annotated version of the source code. The first column contains a count of how many times each line was executed.

In *.gcov files, the lines beginning with "#####" were not executed.

The gcov feature helps:

Find some bugs in branching code (i.e. "if" statements)

Assist testing of the code

Optimize code based on the likelihood that a branch is taken

Reference Books

An Introduction to GCC

By Brian Gough and Richard M. Stallman

Copyright 2005 Network Theory Limited

Published by Network Theory Limited

Overview and tips related to GCC

Well organized, well written, with many tips and tricks in 137 pages

A must-have if you use GCC and are not an expert on the tool. Sells new for $14.16.

Using GCC

By Richard M. Stallman, GCC Developer Community

Copyright 2003 Free Software Foundation, Inc.

Published by GNU Press

Official reference manual of GCC

The above hardcopy book is for GCC Version 3.3.1. Sells new for $45.00.

Online access (current documentation of current software)

Debugging with GDB

By Richard M. Stallman, Roland H. Pesch, and Stan Shebs

Copyright 2003 Free Software Foundation, Inc.

Published by GNU Press

The GNU source-level debugger. Sells new for $30.00.

GNU Make

By Richard M. Stallman, Roland McGrath, Paul D. Smith

Copyright 2006 Free Software Foundation, Inc.

Published by Free Software Foundation, Inc.

A program for directing recompilation (building targets). Sells new for $16.50.

Online access (current)

Reference URLs

GCC, the GNU Compiler Collection (main web site)

Many documents of different subcategories of GCC and different versions of GCC

GNU’s Not Unix! (GNU)

Free Software Foundation (FSF)

Cygwin – Full blown UNIX under Windows

MinGW – Minimalist GNU for Windows (easy to distribute executable files)

DDD – Data Display Debugger (GUI interface for GDB)

Eclipse – Integrated Development Environment (IDE)

Comparable to Microsoft Visual Studio


GCC offers many features and is the closest thing to a universal development system that exists. Using one toolchain everywhere builds familiarity over time, which improves productivity.

Strong current support translates to high probability of GCC being supported in the distant future.

GUIs and IDEs such as DDD and Eclipse offer more intuitive interfaces.


The price is right.

If you don’t like it, you can fix it and give it to everyone else.

-- 02:38, 25 March 2007 (UTC)
