| 1 | This chapter documents the backend for the M68k and Coldfire |
| 2 | processor families. |
| 3 | |
| 4 | @section Additional options |
| 5 | |
| 6 | This backend provides the following additional options: |
| 7 | |
| 8 | @table @option |
| 9 | |
| 10 | @item -a2scratch |
| 11 | Allow using @code{A2} as scratch register. |
| 12 | |
| 13 | @item -amiga-softfloat |
| 14 | Call AmigaOS MathIEEE library functions via direct inline |
| 15 | code, instead of calling stub routines from @file{mieee.lib}. |
| 16 | It still requires that you either link with @file{mieee.lib} or |
| 17 | define @code{MathIeeeSingBasBase}, @code{MathIeeeDoubBasBase} |
| 18 | and @code{MathIeeeDoubTransBase} yourself. |
| 19 | |
| 20 | @item -conservative-sr |
| 21 | Restrict strength-reduction. Experimental. |
| 22 | |
| 23 | @item -const-in-data |
| 24 | By default constant data will be placed in the code |
| 25 | section (and therefore is accessible with faster pc-relative |
| 26 | addressing modes). Using this option it will be placed in the |
| 27 | data section. |
| 28 | |
| 29 | This could e.g. be useful if you want to use small data and |
| 30 | small code, but your code gets too big with all the constant |
| 31 | data. |
| 32 | |
| 33 | Note that on operating systems with memory protection this |
| 34 | option will disable write-protection of constant data. |
| 35 | |
| 36 | @item -cpu=n |
| 37 | Generate code for cpu n (e.g. @option{-cpu=68020}), |
| 38 | defaults to 68000. |
| 39 | |
| 40 | @item -d2scratch |
| 41 | Allow using @code{D2} as scratch register. |
| 42 | |
| 43 | @item -fastcall |
| 44 | Pass function arguments in volatile registers, when possible. |
| 45 | |
| 46 | @item -fp2scratch |
| 47 | Allow using @code{FP2} as scratch register. |
| 48 | |
| 49 | @item -fpu=n |
| 50 | Generate code for fpu n (e.g. @option{-fpu=68881}), default: 0. |
| 51 | |
| 52 | @item -gas |
| 53 | Create output suitable for the GNU assembler. |
| 54 | |
| 55 | @item -hunkdebug |
| 56 | When creating debug-output (@option{-g} option) create |
| 57 | Amiga debug hunks rather than DWARF2. |
| 58 | Does not work with @option{-gas}. |
| 59 | |
| 60 | @item -no-delayed-popping |
| 61 | By default arguments of function calls are not always popped |
| 62 | from the stack immediately after the call, so that the |
| 63 | arguments of several calls may be popped at once. |
| 64 | With this option @command{vbcc} can be forced to pop them after every |
| 65 | function call. |
| 66 | This may simplify debugging and reduce the |
| 67 | stack size needed by the compiled program. |
| 68 | |
| 69 | @item -no-fp-return |
| 70 | |
| 71 | Do not return floats and doubles in floating-point registers |
| 72 | even if code for an fpu is generated. |
| 73 | |
| 74 | @item -no-intz |
| 75 | When generating code for an FPU, do quick-and-dirty conversions |
| 76 | from floating-point to integer. The code may be somewhat |
| 77 | faster but will not correctly round to zero. |
| 78 | Only use it if you know what you are doing. |
| 79 | |
| 80 | @item -no-mreg-return |
| 81 | Do not use multiple registers to return types that do not |
| 82 | fit into a single register. This is mainly for backwards |
| 83 | compatibility with certain libraries. |
| 84 | |
| 85 | @item -no-peephole |
| 86 | Do not perform peephole-optimizations. |
| 87 | |
| 88 | @item -no-reserve-regs |
| 89 | Do not reserve temporary registers for the backend. Can lead to |
| 90 | worse code generation. |
| 91 | |
| 92 | @item -old-softfloat |
| 93 | Use old libcall mechanism for software floating point. |
| 94 | Should not be used, will usually generate worse code. |
| 95 | |
| 96 | @item -old-libcalls |
| 97 | Use old libcall mechanism for (some) integer support routines. |
| 98 | Should not be used, will usually generate worse code. |
| 99 | |
| 100 | @item -phxass |
| 101 | Generate assembly output for the PhxAss assembler. |
| 102 | |
| 103 | @item -prof |
| 104 | Insert code for profiling. |
| 105 | |
| 106 | @item -sc |
| 107 | Use small code model (see below). |
| 108 | |
| 109 | @item -sd |
| 110 | Use small data model (see below). |
| 111 | |
| 112 | @item -use-commons |
| 113 | Use real common symbols instead of bss symbols for |
| 114 | non-initialized external variables. |
| 115 | |
| 116 | @item -use-framepointer |
| 117 | By default automatic variables are addressed through a7 |
| 118 | instead of a5. This generates slightly better code, because |
| 119 | the function entry and exit overhead is reduced and a5 can be |
| 120 | used as a register variable etc. |
| 121 | |
| 122 | However, this can be confusing when debugging. With this option |
| 123 | @command{vbcc} is forced to use a5 as a fixed frame pointer. |
| 124 | |
| 125 | |
| 126 | |
| 127 | |
| 128 | @end table |
| 129 | |
| 130 | @section ABI |
| 131 | |
| 132 | The current version generates assembler output for use with |
| 133 | @command{vasmm68k_mot}. Most peephole optimizations are done by the |
| 134 | assembler, so @command{vbcc} only performs those that the assembler cannot. |
| 135 | The generated executables will probably only work with OS2.0 or higher. |
| 136 | |
| 137 | With @option{-gas} assembler output suitable for the GNU assembler is generated |
| 138 | (the version must understand the Motorola syntax - some old ones do not). |
| 139 | The output is only slightly modified from the @command{vasm}-output and will |
| 140 | therefore result in worse code on @command{gas}. |
| 141 | |
| 142 | The register names provided by this backend are: |
| 143 | |
| 144 | @example |
| 145 | a0, a1, a2, a3, a4, a5, a6, a7 |
| 146 | d0, d1, d2, d3, d4, d5, d6, d7 |
| 147 | fp0, fp1, fp2, fp3, fp4, fp5, fp6, fp7 |
| 148 | @end example |
| 149 | |
| 150 | The registers @code{a0} - @code{a7} can hold pointer |
| 151 | types. @code{d0} - @code{d7} can be used for integer types |
| 152 | (excluding @code{long long}), for pointers and, if no FPU code |
| 153 | is generated, for @code{float}. @code{fp0} - @code{fp7} can be used for |
| 154 | all floating point types if FPU code is generated. |
| 155 | |
| 156 | Additionally the following register pairs can be used for |
| 157 | @code{long long}: |
| 158 | |
| 159 | @example |
| 160 | d0/d1, d2/d3, d4/d5, d6/d7 |
| 161 | @end example |
| 162 | |
| 163 | The registers @code{d0, d1, a0, a1, fp0} and @code{fp1} are used as scratch registers |
| 164 | (i.e. they can be destroyed in function calls); all other registers are |
| 165 | preserved. |
| 166 | |
| 167 | By default, all function arguments are passed on the stack. |
| 168 | |
| 169 | All scalar types up to 4 bytes are returned in register @code{d0}, |
| 170 | @code{long long} is returned in @code{d0/d1}. |
| 171 | If compiled for FPU, floating point values are returned in |
| 172 | @code{fp0} unless @option{-no-fp-return} is specified. |
| 173 | Types which are 8, 12 or 16 bytes large will be returned in several |
| 174 | registers (@code{d0/d1/a0/a1}) unless @option{-no-mreg-return} is specified. |
| 175 | All other types are returned by passing the function the address |
| 176 | of the result as a hidden argument - such a function must not be called |
| 177 | without a proper declaration in scope. |
| 178 | |
| 179 | Objects which have been compiled with different settings must not be |
| 180 | linked together. |
| 181 | |
| 182 | @code{a7} is used as stack pointer. If @option{-sd} is used, |
| 183 | @code{a4} will be used as small data pointer. If |
| 184 | @option{-use-framepointer} is used, @code{a5} will be used as |
| 185 | frame pointer. All other registers will be used by the |
| 186 | register allocator and can be used for register parameters. |
| 187 | |
| 188 | The size of the stack frame is limited to 32KB for early members |
| 189 | of the 68000 family prior to 68020. |
| 190 | |
| 191 | |
| 192 | The basic data types are represented as follows: |
| 193 | |
| 194 | @example |
| 195 | type               size in bits   alignment in bytes |
| 196 | |
| 197 | char                     8                 1 |
| 198 | short                   16                 2 |
| 199 | int                     32                 2 |
| 200 | long                    32                 2 |
| 201 | long long               64                 2 |
| 202 | all pointers            32                 2 |
| 203 | float(fpu)              32                 2       see below |
| 204 | double(fpu)             64                 2       see below |
| 205 | long double(fpu)        64                 2       see below |
| 206 | @end example |
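
Since no basic type is aligned to more than 2 bytes, aggregates can be laid out more tightly than on targets that align @code{int} to 4 bytes. The following sketch works through the arithmetic for one example structure using the sizes and alignments from the table above. The helper names @code{align_up} and @code{example_struct_size} are illustrative only, and the sketch assumes members are placed in declaration order at the next suitably aligned offset.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Size of  struct { char c; int i; short s; }  under the table above:
   char: size 1, alignment 1; int: size 4, alignment 2;
   short: size 2, alignment 2.  Assumes declaration-order layout. */
size_t example_struct_size(void)
{
    size_t off = 0;

    off = align_up(off, 1) + 1;   /* c occupies offset 0        */
    off = align_up(off, 2) + 4;   /* i occupies offsets 2 to 5  */
    off = align_up(off, 2) + 2;   /* s occupies offsets 6 to 7  */
    return align_up(off, 2);      /* pad to the largest member alignment (2) */
}
```

With these rules the structure needs 8 bytes; the same calculation with @code{int} aligned to 4 bytes would give 12.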
| 207 | |
| 208 | |
| 209 | |
| 210 | @section Small data |
| 211 | |
| 212 | @command{vbcc} can access static data in two ways. By default all such data will |
| 213 | be accessed with full 32bit addresses (large data model). |
| 214 | However there is a second way. You can set up an address register |
| 215 | (@code{a4}) |
| 216 | to point into the data segment and then address data with a 16bit |
| 217 | offset through this register. |
| 218 | |
| 219 | The advantages of the small data model are that the program will |
| 220 | usually be smaller (because the 16bit offsets use less space and no |
| 221 | relocation information is needed) and faster. |
| 222 | |
| 223 | The disadvantages are that one address register cannot be used by the |
| 224 | compiler and that it can only be used if all static data occupies |
| 225 | less than 64KB. Also object modules and libraries that |
| 226 | have been compiled with different data models must not be mixed |
| 227 | (it is possible to call functions |
| 228 | compiled with large data model from object files compiled with small |
| 229 | data model, but not vice versa and only functions can be called that |
| 230 | way - other data cannot be accessed). |
| 231 | |
| 232 | If small data is used, functions which are called from code |
| 233 | that was not compiled with @command{vbcc}, or that was compiled without the small data |
| 234 | model, must be declared with the @code{__saveds} attribute |
| 235 | or call @code{geta4()} as the first statement (do not use |
| 236 | automatic initializations prior to the call to @code{geta4()}). |
| 237 | Note that @code{geta4()} must not be called through a function pointer! |
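
As a rough sketch of the @code{__saveds} case, consider a callback that touches static data and may be entered from code that does not maintain @code{a4}. The function and variable names are hypothetical, the placement of the attribute before the return type is an assumption, and the fallback definition relies on @code{__VBCC__} being predefined by @command{vbcc} (also an assumption here) so that the fragment still compiles elsewhere.

```c
/* Outside vbcc the attribute does not exist, so define it away.
   __VBCC__ is assumed to be predefined by vbcc. */
#ifndef __VBCC__
#define __saveds
#endif

static int tick_count;   /* static data, reached through a4 with -sd */

/* Hypothetical callback, entered from code that was not built with
   the small data model.  __saveds reloads the small data pointer
   before tick_count is accessed. */
__saveds void on_tick(void)
{
    tick_count++;
}

/* Ordinary function, only called from small data code. */
int ticks_seen(void)
{
    return tick_count;
}
```

Only the entry points reachable from foreign code need the attribute; functions called solely from small data code can rely on @code{a4} already being set up.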
| 238 | |
| 239 | |
| 240 | @section Small code |
| 241 | |
| 242 | In the small code model calls to external functions (i.e. from |
| 243 | libraries or other object files) are done with 16bit offsets through |
| 244 | the program counter rather than with absolute 32bit addresses. |
| 245 | |
| 246 | The advantage is slightly smaller and faster code. |
| 247 | The disadvantage is that all the code (including library functions) |
| 248 | must be small enough for every external call to reach its target |
| 249 | with a 16bit offset. Objects/libraries can be linked together even if |
| 250 | they have been compiled with different code models. |
| 251 | |
| 252 | @section CPUs |
| 253 | |
| 254 | The values of @option{-cpu=n} have the following effects: |
| 255 | |
| 256 | @table @option |
| 257 | |
| 258 | @item n<68000 |
| 259 | Code for the Coldfire family is generated. |
| 260 | |
| 261 | @item n>=68000 |
| 262 | Code for the 68k family is generated. |
| 263 | |
| 264 | @item n>=68020 |
| 265 | @itemize @minus |
| 266 | @item 32bit multiplication/division/modulo is done with the |
| 267 | mul?.l, div?.l and div?l.l instructions. |
| 268 | @item tst.l ax is used. |
| 269 | @item extb.l dx is used. |
| 270 | @item 16/32bit offsets are used in certain addressing modes. |
| 271 | @item link.l is used. |
| 272 | @item Addressing modes with scaling are used. |
| 273 | @end itemize |
| 274 | |
| 275 | @item n==68040 |
| 276 | @itemize @minus |
| 277 | @item 8bit constants are not copied into data registers. |
| 278 | @item Static memory is not subject to common subexpression elimination. |
| 279 | @end itemize |
| 280 | |
| 281 | @end table |
| 282 | |
| 283 | |
| 284 | @section FPUs |
| 285 | |
| 286 | At the moment the values of @option{-fpu=n} have the following effects: |
| 287 | |
| 288 | @table @option |
| 289 | @item n>68000 |
| 290 | Floating point calculations are done using the FPU. |
| 291 | @item n=68040 |
| 292 | @itemx n=68060 |
| 293 | Instructions that have to be emulated on these FPUs |
| 294 | will not be used; at the moment this only includes |
| 295 | the @code{fintrz} instruction in case of the 040. |
| 296 | @end table |
| 297 | |
| 298 | @section Math |
| 299 | |
| 300 | Long multiplication |
| 301 | on CPUs <68020 uses inline routines. This may increase code size a bit, |
| 302 | but it should be significantly faster, because function call overhead |
| 303 | is avoided. |
| 304 | Long division and modulo are handled by calls to library |
| 305 | functions. |
| 306 | (Some operations involving constants (e.g. powers of two) are always |
| 307 | implemented by more efficient inline code.) |
| 308 | |
| 309 | If no FPU is specified floating point math is done using |
| 310 | math libraries. 32bit IEEE format is used for float and 64bit IEEE |
| 311 | for double and long double. |
| 312 | |
| 313 | If floating point math is done with the FPU |
| 314 | floating point values are kept in registers and therefore may |
| 315 | have extended precision sometimes. This is not ANSI compliant but |
| 316 | will usually cause no harm. When floating point values are stored in |
| 317 | memory they use the same IEEE formats as without FPU. |
| 318 | Return values are passed in @code{fp0}. |
| 319 | |
| 320 | Note that you must not link object files together if they were not |
| 321 | compiled with the same @option{-fpu} settings and that |
| 322 | a proper math library must be linked. |
| 323 | |
| 324 | |
| 325 | @section Target-Specific Variable Attributes |
| 326 | |
| 327 | This backend offers the following variable attributes: |
| 328 | |
| 329 | @table @code |
| 330 | @item __amigainterrupt |
| 331 | Used to write interrupt-handlers for AmigaOS. Stack-checking |
| 332 | for a function with this attribute will be disabled and if a value |
| 333 | is returned in d0, the |
| 334 | condition codes will be set accordingly. |
| 335 | |
| 336 | @item __chip |
| 337 | Place variable in chip-memory. Only applicable on |
| 338 | AmigaOS to variables with static storage-duration. |
| 339 | |
| 340 | @item __far |
| 341 | Do not place this variable in the small-data segment |
| 342 | in small data mode. No effect in large data mode. |
| 343 | Only applicable to variables with static |
| 344 | storage-duration. |
| 345 | |
| 346 | @item __interrupt |
| 347 | This is used to declare interrupt-handlers. The |
| 348 | function using this attribute will save all registers |
| 349 | it destroys (including scratch-registers) and return |
| 350 | with @code{rte} rather than @code{rts}. |
| 351 | |
| 352 | @item __near |
| 353 | Currently ignored. |
| 354 | |
| 355 | @item __regargs |
| 356 | Declare function to use the @option{-fastcall} ABI. The |
| 357 | first arguments are passed in volatile registers. |
| 358 | |
| 359 | @item __saveds |
| 360 | Load the pointer to the small data segment at |
| 361 | function-entry. Applicable only to functions. |
| 362 | |
| 363 | @item __section(<string-literal>) |
| 364 | Places the variable/function in a section named |
| 365 | according to the argument. |
| 366 | |
| 367 | @item __stdargs |
| 368 | Declare function to use the standard ABI (default), |
| 369 | which passes all arguments on the stack. |
| 370 | @end table |
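
The fragment below sketches how some of these attributes might appear in source code. All object and function names are made up for illustration, and writing the attribute in front of the declaration is an assumption about the accepted syntax. The fallback definitions (which rely on @code{__VBCC__} being predefined, also an assumption) only serve to keep the fragment compilable with other compilers.

```c
/* Define the attributes away when not compiling with vbcc.
   __VBCC__ is assumed to be predefined by vbcc. */
#ifndef __VBCC__
#define __chip
#define __far
#define __regargs
#endif

/* Static data the Amiga custom chips must be able to read. */
__chip unsigned short sprite_data[4] = { 0x0ff0, 0x1ff8, 0x1ff8, 0x0ff0 };

/* Large table kept out of the 64KB small data segment. */
__far long big_table[1024];

/* Arguments arrive in registers regardless of -fastcall. */
__regargs unsigned long checksum(const unsigned short *p, int n)
{
    unsigned long sum = 0;

    while (n-- > 0)
        sum += *p++;
    return sum;
}
```

Callers must see the same attributes on the prototype, otherwise they would pass the arguments on the stack while the function expects them in registers.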
| 371 | |
| 372 | @section Target-specific pragmas |
| 373 | |
| 374 | This backend offers the following #pragmas: |
| 375 | |
| 376 | @table @code |
| 377 | |
| 378 | @item #pragma stdargs-on |
| 379 | Automatically declare the following functions with the |
| 380 | @code{__stdargs} attribute. |
| 381 | |
| 382 | @item #pragma stdargs-off |
| 383 | Stop automatically declaring the following functions with the |
| 384 | @code{__stdargs} attribute. |
| 385 | |
| 386 | @end table |
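
One conceivable use is a translation unit compiled with @option{-fastcall} that also contains functions which must keep the stack-based convention. The sketch below brackets such a function with the two pragmas. The function names are hypothetical, and the @code{__VBCC__} guard (assumed to be predefined by @command{vbcc}) is only there to avoid unknown-pragma warnings from other compilers.

```c
/* Everything between the two pragmas is treated as if it had been
   declared __stdargs, i.e. arguments are passed on the stack. */
#ifdef __VBCC__
#pragma stdargs-on
#endif

int legacy_add(int a, int b)   /* hypothetical stack-argument function */
{
    return a + b;
}

#ifdef __VBCC__
#pragma stdargs-off
#endif

/* From here on the default convention of the command line applies. */
int add_twice(int a, int b)
{
    return legacy_add(a, b) + legacy_add(a, b);
}
```

Without @option{-fastcall} the pragmas have no visible effect, because the stack-based convention is already the default.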
| 387 | |
| 388 | @section Predefined Macros |
| 389 | |
| 390 | This backend defines the following macros: |
| 391 | |
| 392 | @table @code |
| 393 | @item __AMIGADATE__ |
| 394 | This is set to the current date as @code{"(DD.MM.YYYY)"}, |
| 395 | useful with version strings. |
| 396 | |
| 397 | @item __COLDFIRE |
| 398 | (If a Coldfire CPU is selected.) |
| 399 | |
| 400 | @item __INTSIZE |
| 401 | Is set to the size of the @code{int} type. |
| 402 | Either 16 (vbccm68ks) or 32 (vbccm68k). |
| 403 | |
| 404 | @item __M680x0 |
| 405 | (Depending on the settings of @option{-cpu}, e.g. |
| 406 | @code{__M68020}.) |
| 407 | |
| 408 | @item __M68881 |
| 409 | (If @option{-fpu=68881} is selected.) |
| 410 | |
| 411 | @item __M68882 |
| 412 | (If code for another FPU is selected; |
| 413 | @option{-fpu=68040} or @option{-fpu=68060} will |
| 414 | set @code{__M68882}.) |
| 415 | |
| 416 | @item __M68K__ |
| 417 | |
| 418 | @item __SMALL_DATA__ |
| 419 | (If @option{-sd} is selected to enable small data.) |
| 420 | @end table |
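
These macros are mainly useful for conditional compilation. The following sketch selects code paths depending on the target; the function names and the @code{INT_BITS} macro are illustrative, and the fallbacks for compilers that define none of the macros are an addition for portability rather than something this backend provides.

```c
#include <limits.h>

/* Width of int in bits: taken from __INTSIZE when this backend
   defines it, otherwise computed from the host's limits. */
#ifdef __INTSIZE
#define INT_BITS __INTSIZE
#else
#define INT_BITS ((int)(sizeof(int) * CHAR_BIT))
#endif

/* Name of the processor family the code was compiled for. */
const char *target_family(void)
{
#if defined(__COLDFIRE)
    return "coldfire";
#elif defined(__M68K__)
    return "m68k";
#else
    return "other";
#endif
}

/* Non-zero if FPU code was requested with -fpu. */
int has_hardware_fpu(void)
{
#if defined(__M68881) || defined(__M68882)
    return 1;
#else
    return 0;
#endif
}

int int_bits(void)
{
    return INT_BITS;
}
```

@code{__COLDFIRE} is tested first so that the result is correct regardless of whether @code{__M68K__} is also defined for Coldfire targets.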
| 421 | |
| 422 | |
| 423 | @section Stack |
| 424 | |
| 425 | If the @option{-stack-check} option is used, every function-prologue will |
| 426 | call the function @code{__stack_check} with the stacksize needed by the |
| 427 | current function on the stack. This function has to consider its own |
| 428 | stacksize and must restore all registers. |
| 429 | |
| 430 | If the compiler is able to calculate the maximum stack-size of a |
| 431 | function including all callees, it will add a comment in the |
| 432 | generated assembly-output (subject to change to labels). |
| 433 | |
| 434 | |
| 435 | @section Stdarg |
| 436 | |
| 437 | A possible @file{<stdarg.h>} could look like this: |
| 438 | |
| 439 | @example |
| 440 | #ifndef __STDARG_H  /* include guard; the macro name is only illustrative */ |
| 441 | #define __STDARG_H |
| 442 | |
| 443 | typedef unsigned char *va_list; |
| 444 | /* Stack arguments are aligned to at least 4 bytes. */ |
| 445 | #define __va_align(type) (__alignof(type)>=4?__alignof(type):4) |
| 446 | /* Round vl up to the alignment required for type. */ |
| 447 | #define __va_do_align(vl,type) ((vl)=(va_list)((((unsigned int)(vl))+__va_align(type)-1)/__va_align(type)*__va_align(type))) |
| 448 | /* Align, step past the argument, then read it back. */ |
| 449 | #define __va_mem(vl,type) (__va_do_align((vl),type),(vl)+=sizeof(type),((type*)(vl))[-1]) |
| 450 | |
| 451 | #define va_start(ap, lastarg) ((ap)=(va_list)(&lastarg+1)) |
| 452 | |
| 453 | #define va_arg(vl,type) __va_mem(vl,type) |
| 454 | |
| 455 | #define va_end(vl) ((vl)=0) |
| 456 | |
| 457 | #define va_copy(new,old) ((new)=(old)) |
| 458 | |
| 459 | #endif |
| 460 | @end example |
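
For completeness, a variadic function that relies on these macros might look like the sketch below. It uses only the standard @file{<stdarg.h>} interface; the function name @code{sum_ints} is made up for the example.

```c
#include <stdarg.h>

/* Sum the `count` int arguments that follow the fixed parameter. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* point ap just past `count` */
    while (count-- > 0)
        total += va_arg(ap, int);     /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as @code{sum_ints(3, 1, 2, 3)} passes all four values on the stack, which is why variadic functions need the stack-based argument convention.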
| 461 | |
| 462 | @section Known problems |
| 463 | |
| 464 | @itemize @minus |
| 465 | @item The extended precision of the FPU registers can cause problems if |
| 466 | a program depends on the exact precision. Most programs will not |
| 467 | have trouble with that, but programs which do exact comparisons |
| 468 | with floating point types (e.g. to try to calculate the number |
| 469 | of significant bits) may not work as expected (especially if the |
| 470 | optimizer was turned on). |
| 471 | @end itemize |
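
The kind of code affected can be sketched as follows. The loop counts the significand bits of @code{double} by testing when @code{1.0 + eps} stops differing from @code{1.0}. Declaring the intermediate result @code{volatile} is one possible way to have every sum written to memory, where values use the 64bit IEEE format; whether this is sufficient with a particular optimizer setting is an assumption, not a guarantee made by this backend. The function name is illustrative.

```c
#include <float.h>

/* Count the significand bits of double by halving eps until
   1.0 + eps can no longer be distinguished from 1.0. */
int double_significand_bits(void)
{
    volatile double sum;   /* volatile: each result is written to memory */
    double eps = 1.0;
    int bits = 0;

    for (;;) {
        sum = 1.0 + eps;
        if (sum == 1.0)    /* compares the value read back from memory */
            break;
        bits++;
        eps /= 2.0;
    }
    return bits;
}
```

If the comparison were carried out on an extended-precision register value instead, the loop could run longer and report more bits than @code{DBL_MANT_DIG}.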
| 472 | |
| 473 | |
| 474 | |