| This chapter describes the target-independent part of the compiler. It |
| documents the options and extensions which are not specific to a certain |
| target. Be sure to also read the chapter on the backend you are using. It will |
| likely contain important additional information like data-representation |
| or additional options. |
| |
| @node General Compiler Options |
| @section General Compiler Options |
| |
| Usually @command{vbcc} will be called by @command{vc}. However, if called |
| directly it expects the following syntax: |
| |
| @example |
| @command{vbcc<target> [options] file} |
| @end example |
| |
| The following options are supported by the machine independent part |
| of @command{vbcc} (and will be passed through by @command{vc}): |
| |
| @table @option |
| |
| @item -quiet |
| Do not print the copyright notice. |
| |
| @item -ic1 |
| Write the intermediate code before optimizing to file.ic1. |
| |
| @item -ic2 |
| Write the intermediate code after optimizing to file.ic2. |
| |
| @item -debug=n |
| Set the debug level to n. |
| |
| @item -o=ofile |
| Write the generated assembler output to <ofile> rather than |
| the default file. |
| |
| @item -noasm |
| Do not generate assembler output (only for testing). |
| |
| @item -O=n |
| Turns optimizing options on/off; every bit set in n turns |
| on an option. Usually, one of the predefined optimization settings |
| provided by the compiler driver should be used instead. |
| @xref{Optimizations}. |
| |
| @item -speed |
| Turns on optimizations which improve speed even if they |
| increase code-size quite a bit. |
| |
| @item -size |
| Turns on optimizations which improve code-size even if |
| they have a negative effect on execution time. |
| |
| @item -final |
| This flag is useful only with higher optimization levels. |
| It tells the compiler that all relevant files have been |
| provided to the compiler (i.e. it is the link-stage). |
| The compiler will try to eliminate all functions and |
| variables which are not referenced. |
| |
| @xref{Unused Object Elimination}. |
| |
| @item -wpo |
| Create a high-level pseudo object for cross-module |
| optimization (@pxref{Cross-Module Optimizations}). |
| |
| |
| @item -g |
| Create debug output. Whether this is supported as well as the |
| format of the debug information depends on the backend. |
| Some backends may offer additional options to control the |
| generation of debug output. |
| |
| Usually DWARF2-output will be generated by default, if |
| possible. |
| |
| Also, options regarding optimization and |
| code-generation may affect the debug output |
| (@pxref{Debugging Optimized Code}). |
| |
| |
| @item -cmd=<file> |
| A file containing additional command line options can |
| be specified using this command. This may be useful for |
| very long command lines. |
| |
| @item -c89 |
| @item -c99 |
| Set the C standard to be used. |
| The default is the 1999 ISO C standard (ISO/IEC 9899:1999). |
| Currently the following changes of C99 are handled: |
| @itemize @minus |
| @item long long int (not supported by all backends) |
| @item flexible array members as last element of a struct |
| @item mixed statements and declarations |
| @item declarations within for-loops |
| @item @code{inline} function-specifier |
| @item @code{restrict}-qualifier |
| @item new reserved keywords |
| @item @code{//}-comments |
| @item vararg-macros |
| @item @code{_Pragma} |
| @item implicit int deprecated |
| @item implicit function-declarations deprecated |
| @item increased translation-limits |
| @item designated initializers |
| @item non-constant initializers for automatic aggregates |
| @item compound literals |
| @item variable-length arrays (incomplete) |
| @end itemize |
| |
| @item -unsigned-char |
| Make the unqualified type of @code{char} unsigned. |
| |
| @item -maxoptpasses=n |
| Set maximum number of optimizer passes to n. |
| @xref{Optimizations}. |
| |
| @item -inline-size=n |
| Set the maximum 'size' of functions to be inlined. |
| @xref{Function Inlining}. |
| |
| |
| @item -inline-depth=n |
| Inline functions up to n nesting-levels (including recursive |
| calls). The default value is 1. Be careful with values greater |
| than 2. |
| @xref{Function Inlining}. |
| |
| @item -unroll-size=n |
| Set the maximum 'size' of unrolled loops. |
| @xref{Loop Unrolling}. |
| |
| @item -unroll-all |
| Unroll loops with a non-constant number of iterations if |
| the number can be calculated at runtime before entering |
| the loop. @xref{Loop Unrolling}. |
| |
| @item -no-inline-peephole |
| Some backends provide peephole-optimizers which perform |
| simple optimizations on the assembly code output by @command{vbcc}. |
| By default, these optimizations will also be performed |
| on inline-assembly code of the application. This switch |
| turns off this behaviour. @xref{Inline-Assembly Functions}. |
| |
| @item -fp-associative |
| Floating point operations do not obey the law of |
| associativity, e.g. @code{(a+b)+c==a+(b+c)} is not true for all |
| floating point numbers @code{a}, @code{b}, @code{c}. Therefore, |
| certain optimizations depending on this property cannot be |
| performed on floating point numbers. |
| |
| This option tells @command{vbcc} to treat floating point |
| operations as associative and perform those optimizations |
| even if that may change the results in some cases (not |
| ISO conforming). |
| |
| @item -no-alias-opt |
| Do not perform type-based alias analysis. |
| @xref{Alias Analysis}. |
| |
| @item -no-multiple-ccs |
| If the backend supports multiple condition code |
| registers, @command{vbcc} will try to use them when optimizing. |
| This flag prevents @command{vbcc} from using them. |
| |
| @item -double-push |
| On targets where function-arguments are passed in registers |
| but also stack-slots are left empty for such arguments, |
| pass those arguments both in registers and on the stack. |
| |
| This generates less efficient code but some broken code |
| (e.g. code which calls varargs functions without correct |
| prototypes in scope) may work. |
| |
| @item -short-push |
| In the presence of a prototype, no promotion will be done |
| on function arguments. For example, <char> will be passed |
| as <char> rather than <int> and <float> will not be |
| promoted to <double>. This may be more efficient on small |
| targets. |
| |
| However, please note that this feature may not be |
| supported by all backends and that using this option |
| breaks ANSI/ISO conformance. For example, a function |
| with a <char> parameter must never be called without a |
| prototype in scope. |
| |
| @item -soft-float |
| On targets supporting this flag, software floating point |
| emulation will be used rather than a hardware FPU. Please |
| consult the corresponding backend documentation when |
| using this flag. |
| |
| @item -stack-check |
| Insert code for dynamic stack checking/extending if the |
| backend and the environment support this feature. |
| |
| @item -ansi |
| @itemx -iso |
| Switch to ANSI/ISO mode. |
| |
| |
| @itemize @minus |
| @item In ISO mode warning 209 will be printed by default. |
| @item Inline-assembly functions are not recognized. |
| @item Assignments between pointers to <type> and pointers |
| to unsigned <type> will cause warnings. |
| @end itemize |
| |
| @item -maxerrors=n |
| Abort the compilation after n errors; do not stop if n==0. |
| |
| @item -dontwarn=n[,n...] |
| Suppress warning number n; suppress all warnings if n<0. |
| Multiple warnings may be separated by commas. |
| @xref{Errors and Warnings}. |
| |
| @item -warn=n |
| Turn on warning number n; turn on all warnings if n<0. |
| @xref{Errors and Warnings}. |
| |
| @item -no-cpp-warn |
| Turn off all preprocessor warnings. |
| |
| @item -warnings-as-errors |
| Treat all enabled warnings as errors. |
| |
| @item -strip-path |
| Strip the path of filenames from error messages. |
| Error messages may be easier to read that way, but message |
| browsers or similar programs might need full paths. |
| |
| @item -no-include-stack |
| Do not display the include stack in error messages. |
| |
| |
| @item -+ |
| @itemx -cpp-comments |
| Allow C++ style comments (not ISO89 conforming). |
| |
| @item -no-trigraphs |
| Do not recognize trigraphs (not ISO conforming). |
| |
| @item -E |
| Write the preprocessor output to <file>.i. |
| |
| @item -deps |
| Write a make-style dependency-line to <file>.dep. |
| |
| @item -deps-for-libs |
| By default, @code{-deps} will not include files that are |
| included using the syntax @code{#include <...>}. Specify this option |
| to add those files as well. |
| |
| @item -depobj=<file> |
| Use the specified filename as target in the generated dependency |
| file instead of basing it on the input file name. |
| |
| @item -reserve-reg=<register> |
| Reserve the given register so that it is not used by the backend. |
| This option is dangerous and must only be used for registers that are |
| otherwise available to the register allocator. If it is used |
| for special registers or registers used internally by the |
| backend, it may be ignored, lead to corrupt code or even |
| cause internal errors from the compiler. |
| |
| Only use if you know what you are doing! |
| |
| @item -dontkeep-initialized-data |
| By default @command{vbcc} keeps all data of initializations in memory |
| during the whole compilation (it can sometimes make use |
| of this when optimizing). This can take some amount of |
| memory, though. This option tells @command{vbcc} to |
| keep as little of this data in memory as possible. |
| This has not yet been tested very well. |
| |
| @item -prefer-statics |
| Assign auto variables to static memory rather than the stack if it |
| can be deduced that the function is not called recursively, i.e. the |
| behaviour is still C compliant. This may be more efficient on targets |
| that can access static data faster than stack. While stack-usage is |
| reduced, total memory consumption is usually increased. |
| |
| Functions will not be re-entrant any more. |
| |
| This option only has effect on higher optimization levels (@code{-O3}). |
| |
| @item -force-statics |
| Like @code{-prefer-statics}, but assume that all functions are non-recursive. |
| This will break C compliance. |
| |
| This option only has effect on higher optimization levels (@code{-O}). |
| |
| @item -range-opt |
| Perform additional optimizations based on value range analysis. This |
| option is under development and considered experimental. The following |
| optimizations are currently implemented: |
| |
| @itemize @minus |
| @item Induction variables of some loops are transformed to smaller |
| types if it can be determined that they will only get assigned |
| values that fit into a smaller type. |
| @end itemize |
| |
| @item -merge-strings |
| Overlay identical string-constants to save memory. Currently |
| only strings identical to string-constants on top-level are |
| recognized. |
| |
| @item -sec-per-obj |
| Tells the backend to put every function/object into its own |
| separate section. This allows more fine-grained elimination |
| of unused functions/objects by the linker. On the other hand, |
| it may prevent some optimizations by the assembler. |
| |
| This option only has effect if it is supported by the backend. |
| |
| @item -mask-opt |
| Perform mask optimizations on suitable library functions. This |
| will create calls to optimized versions of e.g. the printf/scanf |
| family of functions. |
| |
| @end table |
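| |
| The effect that @option{-fp-associative} permits can be observed in plain C. |
| The following is a minimal sketch, assuming IEEE 754 double precision; the |
| function names are illustrative only and not part of @command{vbcc}. Both |
| functions add the same three values, but group them differently, and |
| may return different results. |

```c
/* Sum three values grouping from the left: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    /* volatile forces the intermediate result to be rounded to double */
    volatile double t = a + b;
    return t + c;
}

/* Sum the same values grouping from the right: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

| Under the stated assumption, calling both functions with the arguments |
| 1e16, -1e16 and 1.0 gives 1.0 for the left grouping and 0.0 for the right |
| grouping, because 1.0 is lost when it is added to -1e16 first. |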
| |
| |
| The assembler output will be saved to @file{file.asm}. |
| If @file{file} already contained a suffix, it will be removed |
| first; the same applies to the .ic1/.ic2 files. |
| |
| |
| @node Errors and Warnings |
| @section Errors and Warnings |
| |
| @command{vbcc} knows the following kinds of messages: |
| |
| @table @asis |
| |
| @item Fatal Errors |
| Something is badly wrong and further compilation is |
| impossible or pointless. @command{vbcc} will abort. |
| E.g. no source file or really corrupt source. |
| |
| @item Errors |
| There was an error and @command{vbcc} cannot generate useful |
| code. Compilation continues, but no code will be |
| generated. |
| E.g. unknown identifiers. |
| |
| @item Warnings (1) |
| Warnings with ISO-violations. The program is not |
| ISO-conforming, but @command{vbcc} will generate code that |
| could be what you want (or not). |
| E.g. missing semicolon. |
| |
| @item Warnings (2) |
| The code has no ISO-violations, but contains some |
| strange things you should perhaps look at. |
| E.g. unused variables. |
| @end table |
| |
| Errors and warnings of the first kind are always displayed and cannot |
| be suppressed. |
| |
| Only some warnings of the second kind are turned on by default. |
| Many of them are very useful for some but annoying to others, and |
| their usability may depend on programming style. |
| Everybody is recommended to find their own preferences. |
| |
| A good way to do this is starting with all warnings turned on by |
| @option{-warn=-1}. Now all possible warnings will be issued. Every time |
| a warning that is not considered useful appears, turn that one off with |
| @option{-dontwarn=n}. |
| |
| See @ref{List of Errors} for a list of all diagnostic messages available. |
| |
| See @ref{The Frontend} to find out how to configure @command{vc} to your |
| preferences. |
| |
| |
| @section Data Types |
| |
| @command{vbcc} can handle the following atomic data types: |
| |
| @table @code |
| @item signed char |
| @item unsigned char |
| @item signed short |
| @item unsigned short |
| @item signed int |
| @item unsigned int |
| @item signed long int |
| @item unsigned long int |
| @item signed long long int |
| (with @option{-c99}) |
| @item unsigned long long int |
| (with @option{-c99}) |
| @item float |
| @item double |
| @item long double |
| @end table |
| |
| The default signedness for integer types is @code{signed}. |
| |
| Depending on the backend, some of these types can have identical |
| representation. The representation (size, alignment etc.) of these types |
| usually varies between different backends. @command{vbcc} is able to support |
| arbitrary implementations. |
| |
| Backends may be restricted and omit some types (e.g. floating point on small |
| embedded architectures) or offer additional types. E.g. some backends |
| may provide special bit types or different pointer types. |
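| |
| Because the representation differs between backends, portable code should |
| query it rather than assume it. The following sketch uses only the standard |
| header @file{limits.h}; the function names are hypothetical. It assumes that |
| @code{int} has no padding bits. |

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this target, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Width of int in bits, assuming int has no padding bits. */
int int_width_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Returns 1 if the value ranges grow with the integer rank,
   as ISO C guarantees for every conforming implementation. */
int ranges_are_ordered(void)
{
    return SCHAR_MAX <= SHRT_MAX
        && SHRT_MAX <= INT_MAX
        && INT_MAX <= LONG_MAX;
}
```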
| |
| |
| @node Optimizations |
| @section Optimizations |
| |
| @command{vbcc} offers different levels of optimization, ranging from fast |
| compilation with straight-forward code suitable for easy debugging |
| to highly aggressive cross-module optimizations delivering very |
| fast and/or tight code. |
| |
| This section describes the general phases of compilation and gives |
| a short overview on the available optimizations. |
| |
| In the first compilation phase every function is parsed into a tree |
| structure one expression after the other. Type-checking and some |
| minor optimizations like constant-folding or some algebraic |
| simplifications are done on the trees. |
| This phase of the translation is identical in optimizing and |
| non-optimizing compilation. |
| |
| Then intermediate code is generated from the trees. In non-optimizing |
| compilation temporaries needed to evaluate the expression are |
| immediately assigned to registers while in optimizing |
| compilation, a new variable is generated for each temporary. |
| Slightly different intermediate code |
| is produced in optimizing compilation. |
| Some minor optimizations are performed while generating the intermediate |
| code (simple elimination of unreachable code, some optimizations on |
| branches etc.). |
| |
| After intermediate code for the whole function has been generated, |
| simple register allocation may be done in non-optimizing compilation |
| if bit 0 (value 1) has been set in the @option{-O} option. |
| Afterwards, the intermediate code is passed to the code generator and |
| then all memory for the function, its variables etc. is freed. |
| |
| In optimizing compilation flowgraphs are constructed, data flow analysis |
| is performed and many passes are made over the function's intermediate |
| code. Code may be moved around, new variables may be added, other |
| variables removed, and so on (for more detailed information on the |
| optimizations look at the description for the |
| @option{-O} option below). |
| |
| Many of the optimization routines depend on each other. If one |
| routine finds an optimization, this often enables other routines to |
| find further ones. Also, some routines only do a first step and let |
| other routines 'clean up' afterwards. Therefore @command{vbcc} usually |
| makes many passes until no further optimizations are found. |
| To avoid possible extremely long optimization times, the number of |
| those passes can be limited with @option{-maxoptpasses} (the |
| default is max. 10 passes). |
| @command{vbcc} will display a warning if more passes might be useful. |
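| |
| The iteration scheme described above can be sketched as follows. This is |
| not taken from the @command{vbcc} sources; all names (@code{opt_pass}, |
| @code{run_passes}, @code{countdown_pass}) are hypothetical. Every pass |
| reports whether it changed anything, and the driver repeats all passes |
| until nothing changes or the limit is reached. |

```c
#include <stddef.h>

/* A pass returns nonzero if it changed the code it was given. */
typedef int (*opt_pass)(void *code);

/* Toy pass for demonstration: "optimizes" by decrementing a work counter. */
int countdown_pass(void *code)
{
    int *remaining = code;
    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}

/* Run all passes repeatedly until a whole round changes nothing or
   max_passes rounds have been executed. Returns the number of rounds.
   *more_useful is set to nonzero if the limit was hit while the last
   round still found something, i.e. more passes might be useful. */
int run_passes(opt_pass *passes, size_t n, void *code,
               int max_passes, int *more_useful)
{
    int rounds = 0, changed = 1;

    while (changed && rounds < max_passes) {
        size_t i;
        changed = 0;
        for (i = 0; i < n; i++)
            if (passes[i](code))
                changed = 1;
        rounds++;
    }
    *more_useful = changed;
    return rounds;
}
```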
| |
| Depending on the optimization level, a whole translation-unit or |
| even several translation-units will be read at once. Also, the |
| intermediate code for all functions may be kept in memory during |
| the entire compilation. Be aware that higher optimization levels |
| can take much more time and memory to complete. |
| |
| The following table lists the optimizations which are activated by |
| bits in the argument of the @option{-O} option. Note that not all |
| combinations are valid. It is strongly recommended not to fiddle with |
| this option but just use one of the settings provided by @command{vc} |
| (e.g. @option{-O0} - @option{-O4}). These options also automatically |
| handle actions like invoking the scheduler or cross-module optimizer. |
| |
| @table @asis |
| |
| @item Bit 0 (1) |
| Perform Register allocation. @xref{Register Allocation}. |
| |
| @item Bit 1 (2) |
| This flag turns on the optimizer. If it is set to zero, no global |
| optimizations will be performed, no matter what the other flags are set |
| to. |
| Slightly different intermediate code will be generated |
| by the first translation phases and a flowgraph will be constructed. |
| @xref{Flow Optimizations}. |
| |
| @item Bit 2 (4) |
| Perform common subexpression elimination |
| (@pxref{Common Subexpression Elimination}) and copy propagation |
| (@pxref{Copy Propagation}). |
| This can be done globally or only within basic blocks |
| depending on bit 5. |
| |
| @item Bit 3 (8) |
| Perform constant propagation (@pxref{Constant Propagation}). |
| This can be done globally or only within basic blocks |
| depending on bit 5. |
| |
| @item Bit 4 (16) |
| Perform dead code elimination (@pxref{Dead Code Elimination}). |
| |
| |
| @item Bit 5 (32) |
| Some optimizations are available in local and global versions. This |
| flag turns on the global versions. Several major optimizations |
| will not be performed and only one optimization pass is done unless |
| this flag is set. |
| |
| @item Bit 6 (64) |
| Reserved. |
| |
| @item Bit 7 (128) |
| @command{vbcc} will try to identify loops and perform some loop optimizations. |
| See @ref{Strength Reduction} and @ref{Loop-Invariant Code Motion}. |
| These only work if bit 5 (32) is set. |
| |
| |
| @item Bit 8 (256) |
| @command{vbcc} tries to place variables at the same memory addresses if possible |
| (see @ref{Unused Object Elimination}). |
| |
| |
| @item Bit 9 (512) |
| Reserved. |
| |
| @item Bit 10 (1024) |
| Pointers are analyzed and more precise alias-information is generated |
| (@pxref{Alias Analysis}). |
| Using this information, better data-flow analysis is possible. |
| |
| Also, @command{vbcc} tries to place global/static variables and variables which |
| have their address taken in registers, if possible |
| (@pxref{Register Allocation}). |
| |
| |
| @item Bit 11 (2048) |
| More aggressive loop optimizations are performed (see |
| @ref{Loop Unrolling} and @ref{Induction Variable Elimination}). |
| Only works if bit 5 (32) and bit 7 (128) are set. |
| |
| @item Bit 12 (4096) |
| Perform function inlining (@pxref{Function Inlining}). |
| |
| @item Bit 13 (8192) |
| Reserved. |
| |
| @item Bit 14 (16384) |
| Perform inter-procedural analysis (@pxref{Inter-Procedural Analysis}) |
| and cross-module optimizations (@pxref{Cross-Module Optimizations}). |
| |
| @end table |
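| |
| The bit values in the table above combine by bitwise OR. The following |
| sketch shows this, together with a check of the two dependencies stated in |
| the table (bit 7 needs bit 5; bit 11 needs bits 5 and 7). The constant and |
| function names are hypothetical and do not come from @command{vbcc}; other |
| invalid combinations may exist that this check does not cover. |

```c
/* Bit values as listed in the table; the names are hypothetical. */
enum {
    OPT_REGALLOC   = 1 << 0,   /* register allocation */
    OPT_OPTIMIZER  = 1 << 1,   /* turn on the optimizer */
    OPT_CSE_COPY   = 1 << 2,   /* CSE and copy propagation */
    OPT_CONSTPROP  = 1 << 3,   /* constant propagation */
    OPT_DEADCODE   = 1 << 4,   /* dead code elimination */
    OPT_GLOBAL     = 1 << 5,   /* global versions of optimizations */
    OPT_LOOPS      = 1 << 7,   /* loop optimizations */
    OPT_MERGE_VARS = 1 << 8,   /* place variables at same addresses */
    OPT_ALIAS      = 1 << 10,  /* pointer/alias analysis */
    OPT_LOOPS_AGGR = 1 << 11,  /* more aggressive loop optimizations */
    OPT_INLINE     = 1 << 12,  /* function inlining */
    OPT_IPA        = 1 << 14   /* inter-procedural/cross-module */
};

/* Returns 1 if the mask respects the dependencies named in the table. */
int opt_mask_is_consistent(unsigned mask)
{
    /* Bit 7 only works if bit 5 is set. */
    if ((mask & OPT_LOOPS) && !(mask & OPT_GLOBAL))
        return 0;
    /* Bit 11 only works if bits 5 and 7 are set. */
    if ((mask & OPT_LOOPS_AGGR)
        && (!(mask & OPT_GLOBAL) || !(mask & OPT_LOOPS)))
        return 0;
    return 1;
}
```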
| |
| Also look at the documentation for the target-dependent part of @command{vbcc}. |
| There may be additional machine specific optimization options. |
| |
| |
| @node Register Allocation |
| @subsection Register Allocation |
| |
| This optimization tries to assign variables or temporaries into machine |
| registers to save time and space. The scope and details of this optimization |
| vary with the optimization level. |
| |
| With @option{-O0} only temporaries during expression-evaluation are put |
| into registers. This may be useful for debugging. |
| |
| At the default level (without the optimizer), additionally local variables |
| whose address has not been taken may be put into registers for a whole |
| function. The decision which variables to assign to registers is based |
| on very simple heuristics. |
| |
| In optimizing compilation a different algorithm will be used which uses |
| hierarchical live-range-splitting. This means that variables may be assigned |
| to different registers at different times. This typically allows the |
| most used variables to be kept in registers in all inner loops. Note that this |
| means that a variable can be located in different registers at different |
| locations. Most debuggers cannot handle this. |
| |
| Also, the use of |
| registers can be guided by information provided by the backend, if |
| available. For architectures which are not very orthogonal this allows |
| choosing registers which are better suited to certain operations. |
| Constants can also be assigned to registers, if this is beneficial for |
| the architecture. |
| |
| The options @option{-speed} and @option{-size} change the behaviour of the |
| register-allocator to optimize for speed or size of the generated code. |
| |
| On low optimization levels, only local variables whose address has not |
| been taken will be assigned to registers. On higher optimization levels, |
| @command{vbcc} will also try to assign global/static variables and variables which |
| had their address taken, to registers. Typically, this occurs during |
| loops. The variables will be loaded into a register before entering a loop |
| and stored back after the loop. However, this can only be done if @command{vbcc} |
| can detect that the variable is not modified in unpredictable ways. |
| Therefore, alias-analysis is crucial for this optimization. |
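| |
| Expressed in C, the effect of keeping a global variable in a register |
| during a loop looks roughly like the following sketch. The function names |
| are made up for illustration, and the second function only shows the idea, |
| not actual compiler output. The transformation is valid only if |
| @code{v} cannot point to @code{counter}, which is what alias analysis |
| has to establish. |

```c
int counter;   /* global variable, normally accessed in memory */

/* As written: every increment reads and writes the global. */
void count_positive(const int *v, int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (v[i] > 0)
            counter++;
}

/* Conceptual result: the global is loaded once before the loop,
   kept in a local (a register candidate) and stored back afterwards.
   Only correct if v does not alias counter. */
void count_positive_cached(const int *v, int n)
{
    int i, tmp = counter;

    for (i = 0; i < n; i++)
        if (v[i] > 0)
            tmp++;
    counter = tmp;
}
```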
| |
| During register-allocation @command{vbcc} will use information on |
| register usage of functions to minimize loading/saving of registers between |
| function-calls. Therefore, other optimizations will affect register |
| allocation. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Flow Optimizations |
| @subsection Flow Optimizations |
| |
| When optimizing @command{vbcc} will construct a flowgraph for every function and |
| perform optimizations based on control-flow. For example, code which is |
| unreachable will be removed and branches to other branches or branches |
| around branches will be simplified. |
| |
| Also, unused labels will be removed and basic blocks united to allow further |
| optimizations. |
| |
| For example, the following code |
| |
| @example |
| void f(int x, int y) |
| @{ |
| if(x > y) |
| goto label1; |
| q(); |
| label1: |
| goto label2; |
| r(); |
| label2: |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int x, int y) |
| @{ |
| if(x <= y) |
| q(); |
| @} |
| @end example |
| |
| Identical code at the beginning or end of basic blocks will be moved to |
| the successors/predecessors under certain conditions. |
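| |
| As an illustration of moving identical code, the sketch below shows a |
| statement that ends both branches being moved into their common successor. |
| The function names are hypothetical, and whether @command{vbcc} performs the |
| transformation on this exact input depends on the conditions mentioned |
| above. |

```c
/* Before: both branches end with the same statement. */
int scale_before(int c, int x)
{
    int r;

    if (c) {
        r = x + 1;
        r = r * 2;
    } else {
        r = x - 1;
        r = r * 2;
    }
    return r;
}

/* After: the identical tail has been moved to the common successor. */
int scale_after(int c, int x)
{
    int r;

    if (c)
        r = x + 1;
    else
        r = x - 1;
    r = r * 2;
    return r;
}
```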
| |
| |
| @node Common Subexpression Elimination |
| @subsection Common Subexpression Elimination |
| |
| If an expression has been computed on all paths leading to a second |
| evaluation and @command{vbcc} knows that the operands have not been changed, |
| then the result of the original evaluation will be reused instead of |
| recomputing it. Also, memory operands will be loaded into registers and |
| reused instead of being reloaded, if possible. |
| |
| For example, the following code |
| |
| @example |
| void f(int x, int y) |
| @{ |
| q(x * y, x * y); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int x, int y) |
| @{ |
| int tmp; |
| |
| tmp = x * y; |
| q(tmp, tmp); |
| @} |
| @end example |
| |
| Depending on the optimization level, @command{vbcc} will perform this optimization |
| only locally within basic blocks or globally across an entire function. |
| |
| As this optimization requires detecting whether operands of an expression |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Copy Propagation |
| @subsection Copy Propagation |
| |
| If a variable is assigned to another one, the original variable will be |
| used as long as it is not modified. This is especially useful in |
| conjunction with other optimizations, e.g. common subexpression elimination. |
| |
| For example, the following code |
| |
| @example |
| int y; |
| |
| int f() |
| @{ |
| int x; |
| x = y; |
| return x; |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int y; |
| |
| int f() |
| @{ |
| return y; |
| @} |
| @end example |
| |
| Depending on the optimization level, @command{vbcc} will perform this optimization |
| only locally within basic blocks or globally across an entire function. |
| |
| As this optimization requires detecting whether a variable |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Constant Propagation |
| @subsection Constant Propagation |
| |
| If a variable is known to have a constant value (this includes addresses |
| of objects) at some use, it will be replaced by the constant. |
| |
| For example, the following code |
| |
| @example |
| int f() |
| @{ |
| int x; |
| x = 1; |
| return x; |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int f() |
| @{ |
| return 1; |
| @} |
| @end example |
| |
| Depending on the optimization level, @command{vbcc} will perform this optimization |
| only locally within basic blocks or globally across an entire function. |
| |
| As this optimization requires detecting whether a variable |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Dead Code Elimination |
| @subsection Dead Code Elimination |
| |
| If a variable is assigned a value which is never used (either because it |
| is overwritten or its lifetime ends), the assignment will be removed. |
| This optimization is crucial to remove code which has become dead due |
| to other optimizations. |
| |
| For example, the following code |
| |
| @example |
| int x; |
| |
| void f() |
| @{ |
| int y; |
| x = 1; |
| y = 2; |
| x = 3; |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int x; |
| |
| void f() |
| @{ |
| x = 3; |
| @} |
| @end example |
| |
| As this optimization requires detecting whether a variable |
| may be read, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| @node Loop-Invariant Code Motion |
| @subsection Loop-Invariant Code Motion |
| |
| If the operands of a computation within a loop will not change |
| during iterations, the computation will be moved outside of the |
| loop. |
| |
| For example, the following code |
| |
| @example |
| void f(int x, int y) |
| @{ |
| int i; |
| |
| for (i = 0; i < 100; i++) |
| q(x * y); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int x, int y) |
| @{ |
| int i, tmp = x * y; |
| |
| for (i = 0; i < 100; i++) |
| q(tmp); |
| @} |
| @end example |
| |
| As this optimization requires detecting whether operands of an expression |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Strength Reduction |
| @subsection Strength Reduction |
| |
| This is an optimization applied to loops in order to replace more costly |
| operations (usually multiplications) by cheaper ones (typically additions). |
| Linear functions of an induction variable (a variable which is changed by |
| a loop-invariant value in every iteration) will be replaced by new |
| induction variables. If possible, the original induction variable will be |
| eliminated. |
| |
| As array accesses are actually composed of multiplications and additions, |
| they often benefit significantly from this optimization. |
| |
| For example, the following code |
| |
| @example |
| void f(int *p) |
| @{ |
| int i; |
| |
| for (i = 0; i < 100; i++) |
| p[i] = i; |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int *p) |
| @{ |
| int i; |
| |
| for (i = 0; i < 100; i++) |
| *p++ = i; |
| @} |
| @end example |
| |
| As this optimization requires detecting whether operands of an expression |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
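| |
| The replacement of a multiplication by an addition can also be shown with an |
| explicit linear function of the induction variable. This is a sketch with |
| hypothetical function names; the second function only illustrates the idea |
| and is not actual compiler output. |

```c
/* Original: i * stride is a linear function of the induction variable i
   and requires a multiplication in every iteration. */
long sum_strided(const int *p, int n, int stride)
{
    long s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += p[i * stride];
    return s;
}

/* Reduced: a new induction variable idx takes the values of i * stride
   and is advanced by an addition instead. */
long sum_strided_reduced(const int *p, int n, int stride)
{
    long s = 0;
    int i, idx = 0;

    for (i = 0; i < n; i++) {
        s += p[idx];
        idx += stride;
    }
    return s;
}
```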
| |
| |
| @node Induction Variable Elimination |
| @subsection Induction Variable Elimination |
| |
| If an induction variable is only used to determine the number of |
| iterations through the loop, it will be removed. Instead, a new variable |
| will be created which counts down to zero. This is generally faster |
| and often enables special decrement-and-branch or decrement-and-compare |
| instructions. |
| |
| For example, the following code |
| |
| @example |
| void f(int n) |
| @{ |
| int i; |
| |
| for (i = 0; i < n; i++) |
| puts("hello"); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int n) |
| @{ |
| int tmp; |
| |
| for(tmp = n; tmp > 0; tmp--) |
| puts("hello"); |
| |
| @} |
| @end example |
| |
| As this optimization requires detecting whether operands of an expression |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| |
| @node Loop Unrolling |
| @subsection Loop Unrolling |
| |
| @command{vbcc} reduces the loop overhead by replicating the loop body |
| and reducing the number of iterations. Also, additional optimizations between |
| different iterations of the loop will often be enabled by creating larger |
| basic blocks. However, code-size as well as compilation-times can |
| increase significantly. |
| |
| This optimization can be controlled by @option{-unroll-size} and |
| @option{-unroll-all}. @option{-unroll-size} specifies the maximum number |
| of intermediate instructions for the unrolled loop body. @command{vbcc} will try |
| to unroll the loop as many times as this value permits. |
| |
| If the number of iterations is constant and the size of the loop body |
| multiplied by this number is less than or equal to the value specified by |
| @option{-unroll-size}, the loop will be unrolled completely. If |
| the loop is known to be executed exactly once, it will always be unrolled |
| completely. |
| |
| For example, the following code |
| |
| @example |
| void f() |
| @{ |
| int i; |
| |
| for (i = 0; i < 4; i++) |
| q(i); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f() |
| @{ |
| q(0); |
| q(1); |
| q(2); |
| q(3); |
| @} |
| @end example |
| |
| If the number of iterations is constant, the loop will be unrolled as many |
| times as permitted by the size of the loop and @option{-unroll-size}. If the |
| number of iterations is not a multiple of the number of replications, the |
| remaining iterations will be unrolled separately. |
| |
| For example, the following code |
| |
| @example |
| void f() |
| @{ |
| int i; |
| |
| for (i = 0; i < 102; i++) |
| q(i); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f() |
| @{ |
| int i; |
| q(0); |
| q(1); |
| for(i = 2; i < 102;)@{ |
| q(i++); |
| q(i++); |
| q(i++); |
| q(i++); |
| @} |
| @} |
| @end example |
| |
| By default, only loops with a constant number of iterations will be |
| unrolled. However, if @option{-unroll-all} is specified, @command{vbcc} will also |
| unroll loops if the number of iterations can be calculated at entry to |
| the loop. |
| |
| For example, the following code |
| |
| @example |
| void f(int n) |
| @{ |
| int i; |
| |
| for (i = 0; i < n; i++) |
| q(i); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| void f(int n) |
| @{ |
| int i, tmp; |
| |
| i = 0; |
| tmp = n & 3; |
| switch(tmp)@{ |
| case 3: |
| q(i++); |
| case 2: |
| q(i++); |
| case 1: |
| q(i++); |
| @} |
| while(i < n)@{ |
| q(i++); |
| q(i++); |
| q(i++); |
| q(i++); |
| @} |
| @} |
| @end example |
| |
| As this optimization requires detecting whether operands of an expression |
| may have changed, it will be affected by other optimizations. |
| See @ref{Alias Analysis}, @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
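The remainder-first scheme shown for @option{-unroll-all} can be checked with
ordinary C. The following is a minimal sketch, not @command{vbcc} output: the
helper names (@code{loop_plain}, @code{loop_unrolled}, @code{same_behaviour})
and the checksum kept by @code{q} are made up for illustration, and the early
return for non-positive @code{n} is an assumption added here because the
illustration above leaves that case out.

```c
/* Order-sensitive record of every call to q(), used to compare two loops. */
static unsigned long trace;
static unsigned calls;

static void q(int i)
{
    trace = trace * 31u + (unsigned long)i + 1u;
    calls++;
}

static void reset(void)
{
    trace = 0;
    calls = 0;
}

/* The loop as written in the source. */
void loop_plain(int n)
{
    int i;

    for (i = 0; i < n; i++)
        q(i);
}

/* Hand-written equivalent of a four-fold unrolling with the
   remaining iterations executed first. */
void loop_unrolled(int n)
{
    int i = 0;

    if (n <= 0)          /* assumption: guard added for this sketch */
        return;

    switch (n & 3) {     /* n modulo 4 iterations are peeled off */
    case 3:
        q(i++);          /* fall through */
    case 2:
        q(i++);          /* fall through */
    case 1:
        q(i++);
    }
    while (i < n) {      /* n - i is now a multiple of 4 */
        q(i++);
        q(i++);
        q(i++);
        q(i++);
    }
}

/* Returns 1 if both loops call q() with the same arguments in the same order. */
int same_behaviour(int n)
{
    unsigned long t;
    unsigned c;

    reset();
    loop_plain(n);
    t = trace;
    c = calls;

    reset();
    loop_unrolled(n);
    return t == trace && c == calls;
}
```

Calling @code{same_behaviour} with a few values on either side of a multiple of
four is a quick way to convince yourself that peeling the remainder first
preserves the order of the calls.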
| |
| @node Function Inlining |
| @subsection Function Inlining |
| |
| To reduce the overhead, a function call can be expanded inline. Passing |
| parameters can be optimized as the arguments can be directly accessed |
| by the inlined function. Also, further optimizations are enabled, e.g. |
| constant arguments can be evaluated or common subexpressions between |
| the caller and the callee can be eliminated. An inlined function call is |
| as fast as a macro. However (just as with using large macros), code size and |
| compilation time can increase significantly. |
| |
| Therefore, this optimization can be controlled with @option{-inline-size} and |
| @option{-inline-depth}. @command{vbcc} will only inline functions which contain |
| fewer intermediate instructions than specified with @option{-inline-size}. |
| |
| For example, the following code |
| |
| @example |
| int f(int n) |
| @{ |
| q(&n,1); |
| return n; |
| @} |
| |
| void q(int *x, int y) |
| @{ |
| if(y > 0) |
| *x = *x + y; |
| else |
| abort(); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int f(int n) |
| @{ |
| return n + 1; |
| @} |
| |
| void q(int *x, int y) |
| @{ |
| if(y > 0) |
| *x = *x + y; |
| else |
| abort(); |
| @} |
| @end example |
| |
| If a function to be inlined calls another function, that function can also be |
| inlined. This also includes a recursive call of the function. |
| |
| For example, the following code |
| |
| @example |
| int f(int n) |
| @{ |
| if(n < 2) |
| return 1; |
| else |
| return f(n - 1) + f(n - 2); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int f(int n) |
| @{ |
| if(n < 2) |
| return 1; |
| else@{ |
| int tmp1 = n - 1, tmp2, tmp3 = n - 2, tmp4; |
| if(tmp1 < 2) |
| tmp2 = 1; |
| else |
| tmp2 = f(tmp1 - 1) + f(tmp1 - 2); |
| if(tmp3 < 2) |
| tmp4 = 1; |
| else |
| tmp4 = f(tmp3 - 1) + f(tmp3 - 2); |
| return tmp2 + tmp4; |
| @} |
| @} |
| @end example |
| |
| By default, only one level of inlining is done. The maximum nesting of |
| inlining can be set with @option{-inline-depth}. However, this option |
| should be used with care. The code-size can increase very fast and in |
| many cases the code will be slower. Only use it for fine-tuning after |
| measuring if it is really beneficial. |
| |
| |
| At lower optimization levels a function must be defined in the same |
| translation-unit as the caller to be inlined. With cross-module |
| optimizations, @command{vbcc} will also inline functions which are defined in |
| other files. @xref{Cross-Module Optimizations}. |
| |
| |
| See also @ref{Inline-Assembly Functions}. |
| |
| |
| |
| @node Intrinsic Functions |
| @subsection Intrinsic Functions |
| |
| This optimization will replace calls to some known functions |
| (usually library functions) with calls to different functions or |
| special inline-code. This optimization usually depends on the |
| arguments to a function. Typical candidates are the @code{printf} |
| family of functions and string-functions applied to string-literals. |
| |
| For example, the following code |
| |
| @example |
| int f() |
| @{ |
| return strlen("vbcc"); |
| @} |
| @end example |
| |
| will be optimized like: |
| |
| @example |
| int f() |
| @{ |
| return 4; |
| @} |
| @end example |
| |
| Note that there are also other possibilities of providing specially |
| optimized library functions. See @ref{Inline-Assembly Functions} and |
| @ref{Function Inlining}. |
| |
| |
| @node Unused Object Elimination |
| @subsection Unused Object Elimination |
| |
| Depending on the optimization level, @command{vbcc} will try to eliminate different |
| objects and reduce the size needed for objects. |
| |
| Generally, @command{vbcc} will try to use common storage for local non-static variables |
| with non-overlapping live-ranges. |
| |
| At some optimization levels and with @option{-size} specified, @command{vbcc} will try to |
| order the placement of variables with static storage-duration to minimize |
| padding needed due to different alignment requirements. This optimization |
| generally benefits from an increased scope of optimization. |
| @xref{Cross-Module Optimizations}. |
| |
| At higher optimization levels objects and functions which are not |
| referenced are eliminated. This includes functions which have always |
| been inlined or variables which have always been replaced by constants. |
| |
| When using separate compilation, objects and functions with external |
| linkage usually cannot be eliminated, because they might be referenced |
| from other translation-units. This also precludes elimination of anything |
| referenced by such an object or function. |
| |
| However, unused objects and functions with |
| external linkage can be eliminated if @option{-final} is specified. |
| In this case @command{vbcc} will assume that basically the entire program is presented |
| and eliminate everything which is not referenced directly or indirectly |
| from @code{main()}. If some objects are not referenced but must not be |
| eliminated, they have to be declared with the @code{__entry} attribute. |
| Typical examples are callback functions which are called from a library |
| function or from anywhere outside the program, interrupt-handlers or |
| other data which should be preserved. |
| @xref{Cross-Module Optimizations}. |
| |
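As a minimal sketch of the @code{__entry} case described above, the following
marks a callback that nothing in the program calls directly. The wrapper macro
@code{KEEP} and the function names are hypothetical; the fallback branch only
exists so that the same source also builds with other compilers, where the
attribute is not available.

```c
/* KEEP is a made-up wrapper: it expands to vbcc's __entry attribute
   when compiling with vbcc and to nothing elsewhere. */
#ifdef __VBCC__
#define KEEP __entry
#else
#define KEEP
#endif

static int ticks;

/* Hypothetical callback, assumed to be invoked only from outside the
   program (e.g. by a library or an interrupt). Without KEEP it would be
   a candidate for elimination when -final is specified. */
KEEP void on_timer_tick(void)
{
    ticks++;
}

/* Referenced from normal program code, so no attribute is needed. */
int timer_ticks(void)
{
    return ticks;
}
```

The attribute is placed like a storage-class-specifier, in front of the
declaration it applies to.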
| |
| |
| @node Alias Analysis |
| @subsection Alias Analysis |
| |
| Many optimizations can only be done if it is known that two expressions |
| are not aliased, i.e. they do not refer to the same object. |
| If such information is not available, worst-case assumptions have to be |
| made in order to create correct code. In the C |
| language aliasing can occur by use of pointers. As pointers are generally |
| a very frequently used feature of C and also array accesses are just |
| disguised pointer arithmetic, alias analysis is very important. |
| |
| @command{vbcc} uses the following methods to obtain aliasing information: |
| |
| @itemize @minus |
| @item The C language does not allow accessing an object using an lvalue |
| of a different type. Exceptions are accessing an object using a |
| qualified version of the same type and accessing an object using |
| a character type. In the following example @code{p1} and @code{p2} |
| must not point to the same object: |
| |
| @example |
| f(int *p1, long *p2) |
| @{ |
| ... |
| @} |
| @end example |
| |
| @command{vbcc} will assume that the source is correct and does not break |
| this requirement of the C language. If a program does break this |
| requirement and cannot be fixed, then @option{-no-alias-opt} must be |
| specified and some performance will be lost. |
| |
| @item At higher optimization levels, @command{vbcc} will try to keep track of all |
| objects a pointer can point to. In the following example, @command{vbcc} |
| will see that @code{p1} can only point to @code{x} or @code{y} |
| whereas @code{p2} can only point to @code{z}. Therefore it |
| knows that @code{p1} and @code{p2} are not aliased. |
| |
| @example |
| int x[10], y[10], z[10]; |
| |
| int f(int a, int b, int c) |
| @{ |
| int *p1, *p2; |
| |
| if(a < b) |
| p1 = &x[a]; |
| else |
| p1 = &y[b]; |
| |
| p2 = &z[c]; |
| |
| ... |
| @} |
| @end example |
| |
| As pointers themselves may be aliased and function calls might |
| modify pointers, this analysis sometimes benefits from a larger |
| scope of optimization. |
| See @ref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}. |
| |
| This optimization will alter the behaviour of broken code which uses |
| pointer arithmetic to step from one object into another. |
| |
| @item The 1999 C standard provides the @code{restrict}-qualifier to help |
| alias analysis. If a pointer is declared with this qualifier, the |
| compiler may assume that the object pointed to by this pointer is |
| only aliased by pointers which are derived from this pointer. |
| For a formal definition of the rules for @code{restrict} please |
| consult ISO/IEC9899:1999. |
| |
| @command{vbcc} will make use of this information at higher optimization |
| levels (@option{-c99} must be used to use this new keyword). |
| |
| A very useful application of @code{restrict} is in function |
| parameters. Consider the following example: |
| |
| @example |
| void cross_prod(float *restrict res, |
| float *restrict x, |
| float *restrict y) |
| @{ |
| res[0] = x[1] * y[2] - x[2] * y[1]; |
| res[1] = x[2] * y[0] - x[0] * y[2]; |
| res[2] = x[0] * y[1] - x[1] * y[0]; |
| @} |
| @end example |
| |
| Without @code{restrict}, a compiler has to assume that writing the |
| results through @code{res} can modify the object pointed to by |
| @code{x} and @code{y}. Therefore, the compiler has to reload all |
| the values on the right side twice. With @code{restrict} @command{vbcc} |
| will optimize this code like: |
| |
| @example |
| void cross_prod(float *restrict res, |
| float *restrict x, |
| float *restrict y) |
| @{ |
| float x0 = x[0], x1 = x[1], x2 = x[2]; |
| float y0 = y[0], y1 = y[1], y2 = y[2]; |
| |
| res[0] = x1 * y2 - x2 * y1; |
| res[1] = x2 * y0 - x0 * y2; |
| res[2] = x0 * y1 - x1 * y0; |
| @} |
| @end example |
| |
| |
| @end itemize |
| |
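The @code{restrict} qualifier is a promise made by the caller, so it is worth
showing what a conforming call looks like. The sketch below repeats
@code{cross_prod} from the example above and adds a hypothetical helper,
@code{cross_prod_demo}, which passes three distinct arrays. It assumes a
compiler in C99 mode (@option{-c99} for @command{vbcc}).

```c
/* Same function as in the example above: the three pointers are
   promised not to refer to overlapping objects. */
void cross_prod(float *restrict res,
                float *restrict x,
                float *restrict y)
{
    res[0] = x[1] * y[2] - x[2] * y[1];
    res[1] = x[2] * y[0] - x[0] * y[2];
    res[2] = x[0] * y[1] - x[1] * y[0];
}

/* Hypothetical helper: computes e1 x e2 using three separate arrays
   and returns 1 if the result is e3. */
int cross_prod_demo(void)
{
    float e1[3] = {1.0f, 0.0f, 0.0f};
    float e2[3] = {0.0f, 1.0f, 0.0f};
    float out[3];

    cross_prod(out, e1, e2);   /* out, e1 and e2 do not overlap */
    return out[0] == 0.0f && out[1] == 0.0f && out[2] == 1.0f;
}
```

A call such as @code{cross_prod(e1, e1, e2)}, which computes the result in
place, would break the promise. Its behaviour is undefined, and an optimizer
that loads all inputs first may produce a different result than one that
reloads them.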
| |
| @node Inter-Procedural Analysis |
| @subsection Inter-Procedural Analysis |
| |
| Apart from the number of different optimizations a compiler offers, another |
| important point is the scope of the underlying analysis. If a compiler only |
| looks at small parts of code when deciding whether to do an optimization, |
| it often cannot prove that a transformation does not change the |
| behaviour and therefore has to reject it. |
| |
| Simple compilers only look at single expressions, simple optimizing |
| compilers often restrict their analysis to basic blocks or extended basic |
| blocks. Analyzing a whole function is common in today's optimizing compilers. |
| |
| This already allows many optimizations but often worst-case assumptions |
| have to be made when a function is called. |
| To avoid this, @command{vbcc} will not restrict its analysis to single functions |
| at higher optimization levels. Inter-procedural data-flow analysis often |
| makes it possible, for example, to eliminate more common subexpressions or dead code. |
| Register allocation and many other optimizations also sometimes benefit |
| from inter-procedural analysis. |
| |
| Further extension of the scope of optimizations is possible by activating |
| cross-module optimizations. @xref{Cross-Module Optimizations}. |
| |
| |
| @node Cross-Module Optimizations |
| @subsection Cross-Module Optimizations |
| |
| Separate compilation has always been an important feature of the C language. |
| Splitting up an application into several modules does not only reduce |
| turn-around times and resource-requirements for compilation, but it also |
| helps writing reusable well-structured code. |
| |
| However, an optimizer has many more possibilities when it has access to the |
| entire source code. In order to provide maximum possible optimizations |
| without sacrificing structure and modularity of code, @command{vbcc} can do |
| optimizations across different translation-units. Another benefit is |
| that cross-module analysis also will detect objects which are declared |
| inconsistently in different translation-units. |
| |
| Unfortunately common object-code does not contain enough information to |
| perform aggressive optimization. To overcome this problem, @command{vbcc} offers |
| two solutions: |
| |
| @itemize @minus |
| @item If cross-module optimizations are enabled and several files are |
| passed to @command{vbcc}, it will read in all files at once, perform |
| optimizations across these files and generate a single object |
| file as output. This file is similar to what would have been |
| obtained by separately compiling the files and linking the |
| resulting objects together. |
| |
| @item The method described above often requires changes in makefiles and |
| somewhat different handling. Therefore @command{vbcc} also provides means |
| to generate some kind of special pseudo object files which retain |
| enough high-level information to perform aggressive optimizations |
| at link time. |
| |
| If @option{-wpo} is specified (which will automatically be done by |
| @command{vc} at higher optimization levels) @command{vbcc} will generate such files |
| rather than normal assembly or object files. These files cannot |
| be handled by normal linkers. However, @command{vc} will detect these |
| files and before linking it will pass all such files to @command{vbcc} again. |
| @command{vbcc} will optimize the entire code and generate real code which |
| is then passed to the linker. |
| |
| It is possible to pass @command{vc} a mixture of real and pseudo object |
| files. @command{vc} will detect the pseudo objects, compile them and link |
| them together with the real objects. Obviously, @command{vc} has to be used |
| for linking. Directly calling the linker with pseudo objects will |
| not work. |
| |
| Please note that optimization and code generation are deferred to |
| link-time. Therefore, all compiler options related to optimization and |
| code generation have to be specified at the linker command as well. |
| Otherwise they would be ignored. Other options (e.g. setting paths |
| or defining macros) have to be specified when compiling. |
| |
| Also, turn-around times will obviously increase as usually everything |
| will be rebuilt even if makefiles are used. While only the |
| corresponding pseudo object may be rebuilt if one file is changed, |
| all the real work will be done at the linking stage. |
| |
| @end itemize |
| |
| @node Instruction Scheduling |
| @subsection Instruction Scheduling |
| |
| Some backends provide an instruction scheduler which is automatically run |
| by @command{vc} at higher optimization levels. The purpose is to reorder |
| instructions to make better use of the different pipelines a CPU may |
| offer. |
| |
| The exact details depend heavily on the backend, but in general the |
| scheduler will try to place instructions which can be executed in parallel |
| (e.g. on super-scalar architectures) close to each other. Also, |
| instructions which depend on the result of another instruction will be |
| moved further apart to avoid pipeline-stalls. |
| |
| Please note that it may be crucial to specify the correct derivative of a |
| CPU family in order to get best results from the scheduler. Different |
| variants of an architecture may have a different number and behaviour of |
| pipelines requiring different scheduling decisions. |
| |
| Consult the backend documentation for details. |
| |
| |
| @node Target-Specific Optimizations |
| @subsection Target-Specific Optimizations |
| |
| In addition to those optimizations which are available for all targets, |
| every backend will provide a series of additional optimizations. These |
| vary between the different backends, but optimizations frequently done by |
| backends are: |
| |
| @itemize @minus |
| @item use of complex or auto-increment addressing-modes |
| @item implicit setting of condition-codes |
| @item instruction-combining |
| @item delayed popping of stack-slots |
| @item optimized function entry- and exit-code |
| @item elimination of a frame pointer |
| @item optimized multiplication/division by constants |
| @item inline code for block-copying |
| @end itemize |
| |
| |
| |
| @node Debugging Optimized Code |
| @subsection Debugging Optimized Code |
| |
| Debugging of optimized code is usually not possible without problems. |
| Many compilers turn off almost all optimizations when debugging. |
| @command{vbcc} allows debugging output together with optimizations and |
| tries to still do all optimizations (some restrictions have to be made |
| regarding instruction-scheduling). |
| |
| However, depending on the debugger and debugging-format used, the |
| information displayed in the debugger may differ from the real |
| situation. Typical problems are: |
| |
| @itemize @minus |
| |
| @item Incorrectly displayed values of variables. |
| |
| When optimizing, @command{vbcc} will often remove certain variables or |
| eliminate code which sets them. Sometimes it is possible to |
| tell the debugger that a variable has been optimized away, but |
| most of the time the debugger does not allow this and you |
| will just get bogus values when trying to inspect a variable. |
| |
| Also, variables whose location differs at various points |
| of the program (e.g. a variable is in a register at one place |
| and in memory at another) can only be correctly displayed, if |
| the debugger supports this. |
| |
| Sometimes, this can even occur in non-optimized code (e.g. |
| with register-parameters or a changing stack-pointer). |
| |
| @item Strange program flow. |
| |
| When stepping through a program, you may see lines of code |
| being executed out-of-order or parts of the code skipped. This |
| often occurs due to code being moved around or eliminated/combined. |
| |
| @item Missed break-points. |
| |
| Setting break-points (especially on source-lines) needs some |
| care when optimized code is debugged. E.g. code may have been |
| moved or even replicated at different parts. A break-point |
| set in a debugger will usually only be set on one instance of |
| the code. Therefore, a different instance of the code may have been |
| executed although the break-point was not hit. |
| |
| @end itemize |
| |
| @node Extensions |
| @section Extensions |
| |
| This section lists and describes all extensions to the C language provided |
| by @command{vbcc}. Most of them are implemented in a way which does not break |
| correct C code and still allows all diagnostics required by the C standard |
| by using reserved identifiers. |
| |
| The only exception (@pxref{Inline-Assembly Functions}) can be turned off |
| using @option{-iso} or @option{-ansi}. |
| |
| @subsection Pragmas |
| |
| @command{vbcc} accepts the following @code{#pragma}-directives: |
| |
| @table @code |
| |
| @item #pragma printflike <function> |
| @itemx #pragma scanflike <function> |
| @command{vbcc} will handle @code{<function>} specially. |
| @code{<function>} has to be an already declared |
| function, with external linkage, that |
| takes a variable number of arguments |
| and a @code{const char *} as the last fixed |
| parameter. |
| |
| If such a function is called with a |
| string-constant as format-string, @command{vbcc} |
| will check if the arguments seem to |
| match the format-specifiers in the |
| format-string, according to the rules |
| of printf or scanf. |
| Also, @command{vbcc} will replace the call by a |
| call to a simplified version according |
| to the following rules, if such a |
| function has been declared with external |
| linkage: |
| |
| @itemize @minus |
| @item If no format-specifiers are used at all, |
| @code{__v0<function>} will be called. |
| |
| @item If no qualifiers are used and only |
| @code{d,i,x,X,o,s,c} are used, @code{__v1<function>} |
| will be called. |
| |
| @item If no floating-point arguments are used, |
| @code{__v2<function>} will be called. |
| @end itemize |
| |
| @item #pragma dontwarn <n> |
| Disables warning number n. Must be followed by @code{#pragma popwarn}. |
| |
| @item #pragma warn <n> |
| Enables warning number n. Must be followed by @code{#pragma popwarn}. |
| |
| @item #pragma popwarn |
| Undoes the last modification done by @code{#pragma warn} or |
| @code{#pragma dontwarn}. |
| |
| @item #pragma only-inline on |
| The following functions will be parsed and are available for |
| inlining (@pxref{Function Inlining}), but no out-of-line code |
| will be generated, even if some calls could not be inlined. |
| |
| Do not use this with functions that have |
| local static variables! |
| |
| @item #pragma only-inline off |
| The following functions are translated |
| as usual again. |
| |
| @item #pragma opt <n> |
| Sets the optimization options to <n> |
| (similar to @option{-O=<n>}) for the following |
| functions. |
| This is only used for debugging purposes. Do not use! |
| |
| @item #pragma begin_header |
| Used to mark the beginning of a system-header. Must be followed |
| by @code{#pragma end_header}. Not for use in applications! |
| |
| @item #pragma end_header |
| The counterpart to @code{#pragma begin_header}. Marks the end |
| of a system-header. Not for use in applications! |
| |
| @item #pragma pack(n) |
| Set alignment of structure members to a multiple of @code{n} bytes. |
| |
| @item #pragma pack() |
| Restores structure alignment to the target's default alignment, |
| which was in effect when the compilation started. |
| |
| @item #pragma pack(push[,n]) |
| Pushes the current structure alignment onto an internal stack and |
| optionally sets a new alignment to a multiple of @code{n} bytes. |
| |
| @item #pragma pack(pop) |
| Restores the topmost structure alignment, saved by @code{pack(push)}, |
| from an internal stack. Restores the default alignment, when the |
| stack is empty. |
| |
| @end table |
| |
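The @code{pack} pragmas are easiest to follow with a concrete layout. The
following sketch uses made-up structure and function names and assumes a target
on which @code{int} normally requires an alignment larger than one byte. It
declares the same two members three times: before, inside and after a
@code{pack(push,1)} / @code{pack(pop)} pair.

```c
#include <stddef.h>

/* Declared with the alignment in effect at this point. */
struct natural_pair {
    char tag;
    int value;
};

#pragma pack(push,1)      /* save current alignment, then pack to 1 byte */
struct packed_pair {
    char tag;
    int value;            /* expected to follow tag without padding */
};
#pragma pack(pop)         /* restore the saved alignment */

/* Declared after the pop, so it should match natural_pair again. */
struct restored_pair {
    char tag;
    int value;
};

size_t natural_value_offset(void)
{
    return offsetof(struct natural_pair, value);
}

size_t packed_value_offset(void)
{
    return offsetof(struct packed_pair, value);
}

size_t restored_value_offset(void)
{
    return offsetof(struct restored_pair, value);
}
```

Pairing every @code{pack(push)} with a @code{pack(pop)} keeps a header from
changing the layout of structures declared in files that include it.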
| |
| @subsection Register Parameters |
| |
| If the parameters for certain functions should be passed in certain |
| registers, it is possible to specify the registers using |
| @code{__reg("<reg>")} in the |
| prototype, e.g. |
| |
| @example |
| void f(__reg("d0") int x, __reg("a0") char *y) @{ ... @} |
| @end example |
| |
| The names of the available registers depend on the backend and will |
| be listed in the corresponding part of the documentation. |
| Note that a matching prototype must be in scope when calling such |
| a function - otherwise wrong code will be generated. |
| Therefore it is not useful to use register parameters in an old-style |
| function-definition. |
| |
| If the backend cannot handle the specified register for a |
| certain type, this will cause an error. Note that this may happen |
| although the register could store that type, if the backend |
| does not provide the necessary support. |
| |
| Also note that this may force @command{vbcc} to create worse code. |
| |
| |
| @node Inline-Assembly Functions |
| @subsection Inline-Assembly Functions |
| |
| Only use them if you know what you are doing! |
| |
| A function-declaration may be followed by '=' and a string-constant. |
| If a function is called with such a declaration in scope, no |
| function-call will be generated but the string-constant will be |
| inserted in the assembly-output. |
| Otherwise the compiler and optimizer will treat this like a |
| function-call, i.e. the inline-assembly must not modify any callee-save |
| registers without restoring them. However, it is also possible to |
| specify the side-effects of inline-assembly functions like |
| registers used or variables used and modified |
| (@pxref{Specifying side-effects}). |
| |
| Example: |
| |
| @example |
| double sin(__reg("fp0") double) = "\tfsin.x\tfp0\n"; |
| @end example |
| |
| There are several issues to take care of when writing inline-assembly. |
| |
| @itemize @minus |
| @item As inline-assembly is subject to loop unrolling or function inlining |
| it may be replicated at different locations. Unless it is absolutely |
| known that this will not happen, the code should not define any |
| labels (e.g. for branches). Use offsets instead. |
| |
| @item If a backend provides an instruction scheduler, inline-assembly code |
| will also be scheduled. Some schedulers make assumptions about |
| their input (usually compiler-generated code) to improve the |
| code. Have a look at the backend documentation to see if there |
| are any issues to consider. |
| |
| @item If a backend provides a peephole optimizer which optimizes the |
| assembly output, inline-assembly code will also be optimized |
| unless @option{-no-inline-peephole} is specified. |
| Have a look at the backend documentation to see if there are any |
| issues to consider. |
| |
| @item @command{vbcc} assumes that inline-assembly does not introduce any new |
| control-flow edges. I.e. control will only enter inline-assembly |
| if the function call is reached and if control leaves |
| inline-assembly it will continue after the call. |
| |
| @end itemize |
| |
| Inline-assembly-functions are not recognized when ANSI/ISO mode is |
| turned on. |
| |
| |
| @node Variable Attributes |
| @subsection Variable Attributes |
| |
| @command{vbcc} offers attributes to variables or functions. These attributes |
| can be specified at the declaration of a variable or function and |
| are syntactically similar to storage-class-specifiers |
| (e.g. @code{static}). |
| |
| Often, these attributes are specific to one backend and will be |
| documented in the backend-documentation (typical attributes would |
| e.g. be @code{__interrupt} or @code{__section}). Attributes may |
| also have parameters. A generally available |
| attribute is @code{__entry} which is used to preserve unreferenced |
| objects and functions (@pxref{Unused Object Elimination}): |
| |
| @example |
| __entry __interrupt __section("vectab") void my_handler() |
| @end example |
| |
| Additional non-target-specific attributes are available to |
| specify side-effects of functions (@pxref{Specifying side-effects}). |
| |
| |
| Please note that some common extensions like @code{__far} are |
| variable attributes on some architectures, but actually type |
| attributes (@pxref{Type Attributes}) on others. This is due to |
| significantly different meanings on different architectures. |
| |
| |
| @node Type Attributes |
| @subsection Type Attributes |
| |
| Types may be qualified by additional attributes, e.g. @code{__far}, |
| on some backends. Regarding the availability of type attributes |
| please consult the backend documentation. |
| |
| Syntactically type attributes have to be placed like a type-qualifier |
| (e.g. @code{const}). |
| As an example, some backends know the attribute @code{__far}. |
| |
| Declaration of a pointer to a far-qualified character would be |
| |
| @example |
| __far char *p; |
| @end example |
| |
| whereas |
| |
| @example |
| char * __far p; |
| @end example |
| |
| is a far-qualified pointer to an unqualified char. |
| |
| Please note that some common extensions like @code{__far} are |
| type attributes on some architectures, but actually variable |
| attributes (@pxref{Variable Attributes}) on others. This is due to |
| significantly different meanings on different architectures. |
| |
| @subsection @code{__typeof} |
| |
| @code{__typeof} is syntactically equivalent to sizeof, but its result is of |
| type int and is a number representing the type of its argument. |
| This may be necessary for implementing @file{stdarg.h}. |
| |
| |
| @subsection @code{__alignof} |
| |
| @code{__alignof} is syntactically equivalent to sizeof, but its result is of |
| type int and is the alignment in bytes of the type of the argument. |
| This may be necessary for implementing @file{stdarg.h}. |
| |
| |
| @subsection @code{__offsetof} |
| |
| @code{__offsetof} is a builtin version of the @code{offsetof}-macro |
| as defined in the C language. The first argument is a structure |
| type and the second a member of the structure type. The result |
| will be a constant expression representing the offset of the |
| specified member in the structure. |
| |
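Because the result of @code{__offsetof} is a constant expression, it can be
used where C requires one, for example as an array size. The following is a
minimal sketch under stated assumptions: the macro @code{MEMBER_OFFSET}, the
structure and the function name are hypothetical, and the fallback to the
standard @code{offsetof} macro is only there so the code also builds with
compilers other than @command{vbcc}.

```c
#include <stddef.h>

/* MEMBER_OFFSET is a made-up wrapper around the builtin. */
#ifdef __VBCC__
#define MEMBER_OFFSET(type, member) __offsetof(type, member)
#else
#define MEMBER_OFFSET(type, member) offsetof(type, member)
#endif

struct packet_header {
    unsigned char kind;
    unsigned short length;
    long payload;
};

/* A buffer sized to hold everything that precedes the payload member.
   This only compiles if the offset is a constant expression. */
static unsigned char header_prefix[MEMBER_OFFSET(struct packet_header, payload)];

size_t header_prefix_size(void)
{
    return sizeof header_prefix;
}
```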
| @node Specifying side-effects |
| @subsection Specifying side-effects |
| |
| Only use if you know what you are doing! |
| |
| When optimizing and generating code, @command{vbcc} often has to take |
| into account side-effects of function-calls, e.g. which registers might |
| be modified by this function and what variables are read or modified. |
| |
| A rather imprecise way to make assumptions on side-effects is given by |
| the ABI of a certain system (that defines which registers have to be |
| preserved by functions) or rules derived from the language (e.g. local |
| variables whose address has not been taken cannot be accessed by another |
| function). |
| |
| At higher optimization levels (@pxref{Inter-Procedural Analysis} and |
| @ref{Cross-Module Optimizations}), @command{vbcc} will try to analyse |
| functions and often obtains much more precise information regarding |
| side-effects. |
| |
| However, if the source code of functions is not visible to @command{vbcc}, |
| e.g. because the functions are from libraries or they are |
| written in assembly (@pxref{Inline-Assembly Functions}), it is obviously |
| not possible to analyze the code. In this case, it is possible to specify |
| these side-effects using the following special variable-attributes |
| (@pxref{Variable Attributes}). |
| |
| The @code{__regsused(<register-list>)} attribute specifies the volatile |
| registers used or modified by a function. The register list is a list of |
| register names (as defined in the backend-documentation) separated by |
| slashes and enclosed in double-quotes, e.g. |
| |
| @code{ __regsused("d0/d1") int abs();} |
| |
| declares a function @code{abs} which only uses registers @code{d0} and |
| @code{d1}. |
| |
| @code{__varsmodified(<variable-list>)} specifies a list of variables with |
| external linkage |
| which are modified by the function. @code{__varsused} is similar, but |
| specifies the external variables read by the function. If a variable is |
| read and written, both attributes have to be specified. The variable-list |
| is a list of identifiers, separated by slashes and enclosed in double |
| quotes. |
| |
| The attribute @code{__writesmem(<type>)} is used to specify that the |
| function accesses memory using a certain type. This is necessary if the |
| function modifies memory accessible to the calling function which cannot |
| be specified using @code{__varsmodified} (e.g. because it is accessed via |
| pointers). @code{__readsmem} is similar, but specifies memory which is |
| read. |
| |
| If one of @code{__varsused}, @code{__varsmodified}, @code{__readsmem} and |
| @code{__writesmem} is specified, all relevant side-effects must be |
| specified. If, for example, only @code{__varsused("my_global")} |
| is specified, this implies that the function only reads @code{my_global} |
| and does not modify any variable accessible to the caller. |
| |
| All of these attributes may be specified multiple times. |
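| |
| The rule that a partial specification is treated as complete can be |
| illustrated with a hypothetical function @code{fill}, which reads the global |
| @code{fill_value} and stores through an @code{int} pointer. Both effects have |
| to be declared; naming only @code{__varsused} would claim that no caller-visible |
| memory is written. The sketch assumes that the type argument of |
| @code{__writesmem} is written without quotes, and it supplies empty fallback |
| macros and a C body only so that it builds with other compilers. |

```c
/* Fallbacks for compilers that do not provide these attribute macros
   (assumption: the attributes can simply be dropped there). */
#ifndef __varsused
#define __varsused(x)
#endif
#ifndef __writesmem
#define __writesmem(x)
#endif

int fill_value;   /* hypothetical global, only read by fill() */

/* Complete description of fill(): it reads fill_value and writes
   memory through a pointer to int. Unquoted type is an assumption. */
__varsused("fill_value") __writesmem(int) void fill(int *dst, int n);

/* Stand-in body; normally hidden in a library or written in assembly. */
void fill(int *dst, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = fill_value;
}
```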
| |
| @subsection Automatic constructor/destructor functions |
| |
| The linker @command{vlink} provides a feature to collect pointers to |
| all functions starting with the names @code{_INIT} or @code{_EXIT} in |
| a prioritized array, labeled by @code{__CTOR_LIST__} and |
| @code{__DTOR_LIST__}. The C-library (vclib) calls the constructor functions |
| before entering @code{main()} and the destructor functions on program |
| exit. |
| |
| The format of these special function names is: |
| @example |
| void _INIT[_<pri>][_<name>](void) |
| void _EXIT[_<pri>][_<name>](void) |
| @end example |
| The optional priority @code{<pri>} may be a digit between 1 and 9, where |
| a constructor with a priority of 1 is executed first while a destructor |
| with a priority of 1 is executed last. @code{<name>} is an optional name, |
| used to differentiate functions of the same level. |
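| |
| A small sketch of the naming scheme follows. The module state and the function |
| suffixes @code{table} and @code{sum} are hypothetical. When built with |
| @command{vbcc} and linked with @command{vlink} and vclib, the functions are |
| expected to be called automatically in priority order; with any other |
| toolchain they are ordinary functions that have to be called by hand. |

```c
/* Hypothetical module state, for illustration only. */
static int table[4];
static int table_sum = -1;

/* Priority 1: runs before constructors with larger priority digits. */
void _INIT_1_table(void)
{
    int i;
    for (i = 0; i < 4; i++)
        table[i] = i * i;
}

/* Priority 2: may rely on _INIT_1_table() having filled the table. */
void _INIT_2_sum(void)
{
    int i;
    table_sum = 0;
    for (i = 0; i < 4; i++)
        table_sum += table[i];
}

/* Destructor with priority 2: runs on exit, before any destructor
   with priority 1. */
void _EXIT_2_sum(void)
{
    table_sum = -1;
}
```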
| |
| @subsection @code{__noinline} |
| |
| @code{__noinline} will prevent inlining of a given function. The heuristic |
| used for deciding whether a function should be inlined generally makes a good |
| trade-off between code size and performance, but sometimes it can be useful to override |
| this behaviour. Use-cases include keeping "cold" functions out of line to reduce code |
| size, or allowing safe use of inline assembly with labels. |
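| |
| The "cold function" use-case might look like the following sketch. The names |
| @code{report_error} and @code{checked_div} are hypothetical, and the empty |
| fallback definition of @code{__noinline} is only there so the code also builds |
| with compilers that do not provide the macro. |

```c
/* Fallback for compilers without vbcc's predefined __noinline macro. */
#ifndef __noinline
#define __noinline
#endif

/* Rarely executed error path, kept out of line so that callers
   stay small. */
__noinline int report_error(int code)
{
    return -code;
}

/* Hot path: only the call to report_error() is emitted here. */
int checked_div(int a, int b, int *out)
{
    if (b == 0)
        return report_error(1);
    *out = a / b;
    return 0;
}
```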
| |
| @subsection Predefined macros |
| The following macros are defined by the compiler. |
| @example |
| #define __VBCC__ |
| #define __entry __vattr("entry") |
| #define __str(x) #x |
| #define __asm(x) do{static void inline_assembly()=x;inline_assembly();}while(0) |
| #define __regsused(x) __vattr("regused("x")") |
| #define __varsused(x) __vattr("varused("x")") |
| #define __varsmodified(x) __vattr("varchanged("x")") |
| #define __noreturn __vattr("noreturn()") |
| #define __alwaysreturn __vattr("alwaysreturn()") |
| #define __nosidefx __vattr("nosidefx()") |
| #define __stack(x) __vattr(__str(stack1(x))) |
| #define __stack2(x) __vattr(__str(stack2(x))) |
| #define __noinline __vattr("noinline()") |
| #define __STDC_VERSION__ 199901L |
| @end example |
| @code{__STDC_VERSION__} is defined in C99-mode only. |
| |
| @subsection Masked symbols |
| Together with vlink, this feature makes it possible to provide a family of |
| specially tailored functions in a library. |
| |
| A mask can be attached to a symbol using the @code{__mask} attribute, e.g. |
| |
| @example |
| __mask(15) void myfunc(void); |
| @end example |
| |
| The symbol of this function will get the suffix @code{.15} attached. |
| |
| A reference to a masked symbol can either be created by vbcc itself when using the |
| option @code{-mask-opt}, or manually by referencing a symbol prefixed with |
| @code{__maskm_<mask>}. |
| |
| @example |
| extern __mask(15) void myfunc(void); |
| ... |
| __maskm_15_myfunc(); |
| @end example |
| |
| If there are several masked references to a symbol, the linker will pull the first |
| symbol containing a mask that has at least all bits set that are set in any masked |
| reference. If no masked symbols matching that requirement are available, the symbol |
| without a mask is used. Also, a non-masked reference will only pull a non-masked |
| symbol from a library. |
| |
| Masked objects may only be defined in libraries and zero masks are not allowed. |
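| |
| The selection rule described above can be modelled in a few lines of C. This is |
| only an illustration of the rule as stated here, not code taken from vlink; the |
| function names are hypothetical, and "first" is assumed to mean array order. |
| @code{required_mask} stands for the bitwise OR of the masks of all masked |
| references to the symbol. |

```c
/* Nonzero if a symbol with symbol_mask has at least all bits set
   that are set in required_mask. */
int mask_matches(unsigned symbol_mask, unsigned required_mask)
{
    return (symbol_mask & required_mask) == required_mask;
}

/* Returns the index of the first candidate whose mask matches, or -1
   if none does (the case in which the unmasked symbol is used). */
int pick_symbol(const unsigned *masks, int n, unsigned required_mask)
{
    int i;
    for (i = 0; i < n; i++)
        if (mask_matches(masks[i], required_mask))
            return i;
    return -1;
}
```

| For candidates with masks 3, 7 and 15 and references with masks 1 and 4, the |
| required mask is 5; mask 3 lacks bit 4, so the candidate with mask 7 is chosen. |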
| |
| |
| @section Known Problems |
| |
| Some known target-independent problems of @command{vbcc} at the moment: |
| |
| @itemize @minus |
| |
| @item Some exotic scope-rules are not handled correctly. |
| |
| @item Debugging information may have problems at higher optimization levels. |
| |
| @item String-constants are not merged (partially done). |
| |
| @end itemize |
| |
| @section Credits |
| |
| All those who wrote parts of the @command{vbcc} distribution, made suggestions, |
| answered my questions, tested @command{vbcc}, reported errors or were otherwise |
| involved in the development of @command{vbcc} (in descending alphabetical order, |
| under work, not complete): |
| |
| @itemize |
| @item Frank Wille |
| @item Gary Watson |
| @item Andrea Vallinotto |
| @item Johnny Tevessen |
| @item Eero Tamminen |
| @item Gabriele Svelto |
| @item Dirk Stoecker |
| @item Ralph Schmidt |
| @item Markus Schmidinger |
| @item Thorsten Schaaps |
| @item Anton Rolls |
| @item Michaela Pruess |
| @item Thomas Pornin |
| @item Joerg Plate |
| @item Gilles Pirio |
| @item Bartlomiej Pater |
| @item Elena Novaretti |
| @item Gunther Nikl |
| @item Constantinos Nicolakakis |
| @item Timm S. Mueller |
| @item Robert Claus Mueller |
| @item Joern Maass |
| @item Aki M Laukkanen |
| @item Kai Kohlmorgen |
| @item Uwe Klinger |
| @item Andreas Kleinert |
| @item Julian Kinraid |
| @item Acereda Macia Jorge |
| @item Dirk Holtwick |
| @item Matthew Hey |
| @item Tim Hanson |
| @item Kasper Graversen |
| @item Jens Granseuer |
| @item Volker Graf |
| @item Marcus Geelnard |
| @item Franta Fulin |
| @item Matthias Fleischer |
| @item Alexander Fichtner |
| @item Olivier Fabre |
| @item Robert Ennals |
| @item Thomas Dorn |
| @item Walter Doerwald |
| @item Aaron Digulla |
| @item Lars Dannenberg |
| @item Sam Crow |
| @item Michael Bode |
| @item Michael Bauer |
| @item Juergen Barthelmann |
| @item Thomas Arnhold |
| @item Alkinoos Alexandros Argiropoulos |
| @item Thomas Aglassinger |
| @end itemize |