
4.1 Intel x86

ld can create DLLs that operate with various runtimes available on a common x86 operating system. These runtimes include native (using the mingw "platform"), cygwin, and pw.
auto-import from DLLs
  1. With this feature on, DLL clients can import variables from a DLL without any effort on their part (for example, without any source code modifications). Auto-import can be enabled with the --enable-auto-import flag and disabled with the --disable-auto-import flag. Auto-import is disabled by default.

  2. This is done completely within the bounds of the PE specification (to be fair, there is a minor violation of the spec at one point, but in practice auto-import works on all known variants of that common x86 operating system). The resulting DLL can therefore be used with any other PE compiler/linker.

  3. Auto-import is fully compatible with the standard import method, in which variables are decorated using attribute modifiers. Libraries of either type may be mixed together.

  4. Overhead (space): 8 bytes per imported symbol, plus 20 bytes for each reference to it. Overhead (load time): negligible. Overhead (virtual/physical memory): should be less than the effect of DLL relocation.
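
The space figures in item 4 can be turned into a rough estimate. The following is only a sketch: the function name and the input mapping (symbol name to number of references) are hypothetical, and the constants are simply the 8-byte and 20-byte figures quoted above.

```python
# Space overhead figures quoted in the list above.
SYMBOL_OVERHEAD = 8       # bytes per imported symbol
REFERENCE_OVERHEAD = 20   # bytes per reference to that symbol


def auto_import_overhead(references_per_symbol):
    """Estimate auto-import space overhead in bytes.

    references_per_symbol maps a symbol name to the number of places
    in the client that reference it (a hypothetical input format).
    """
    return sum(
        SYMBOL_OVERHEAD + REFERENCE_OVERHEAD * refs
        for refs in references_per_symbol.values()
    )


# One variable referenced three times, another referenced once.
print(auto_import_overhead({"dll_var": 3, "dll_other": 1}))
```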

Motivation

The obvious and only way to get rid of dllimport insanity is to make the client access the variable directly in the DLL, bypassing the extra dereference imposed by ordinary DLL runtime linking. That is, whenever the client contains something like

mov dll_var,%eax,

the address of dll_var in the instruction should be relocated to point into the loaded DLL. The aim is to make the OS loader do so, and then make ld help with that. The import section of a PE file is laid out as follows: there is a vector of structures, each describing the imports from a particular DLL. Each such structure points to two other parallel vectors: one holding the imported names, and one which will hold the address of the corresponding imported name. So the solution is to de-vectorize these structures, making the import locations sparse and pointing directly into code.
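
As an illustration of the layout just described, here is a small model of the ordinary import section and of the extra dereference it imposes. It is a sketch under stated assumptions, not a description of any real loader: the class, the function names, foo.dll, and the addresses are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ImportDescriptor:
    """One entry of the import vector: imports from a single DLL."""
    dll_name: str
    names: list                                     # imported names
    addresses: list = field(default_factory=list)   # parallel vector, filled at load time


def loader_fill(descriptors, exports):
    """Model the loader: for each name, store its address in the parallel vector."""
    for desc in descriptors:
        desc.addresses = [exports[desc.dll_name][name] for name in desc.names]


def read_via_import_table(memory, desc, index):
    """Ordinary import: fetch the stored address first, then the variable."""
    return memory[desc.addresses[index]]


def read_direct(memory, patched_operand):
    """Auto-import goal: the instruction operand already is the variable's address."""
    return memory[patched_operand]


# Hypothetical DLL exporting two variables at made-up addresses.
exports = {"foo.dll": {"dll_var": 0x10002000, "dll_other": 0x10002004}}
memory = {0x10002000: 42, 0x10002004: 7}

table = [ImportDescriptor("foo.dll", ["dll_var", "dll_other"])]
loader_fill(table, exports)
```

In this model, de-vectorizing means giving each reference its own one-name descriptor whose address slot is the instruction operand itself, so that the client ends up performing read_direct rather than read_via_import_table.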

Implementation

For each reference to a data symbol that is to be imported from a DLL (a symbol named <sym> belongs to this set if __imp_<sym> is found in the implib), an import fixup entry is generated. That entry is of type IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3 subsection. Each fixup entry contains a pointer to the symbol's address within the .text section (marked with a __fuN_<sym> symbol, where N is an integer), a pointer to the DLL name (so the DLL name is referenced by multiple entries), and a pointer to the symbol name thunk. The symbol name thunk is a singleton vector (__nm_th_<symbol>) pointing to an IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) that directly contains the imported name.

Here comes the "on the edge" problem mentioned above: the PE specification insists that the name vector (OriginalFirstThunk) should run in parallel with the address vector (FirstThunk), i.e. that they should have the same number of elements and be terminated with zero. We violate this, since FirstThunk points directly into machine code. But in practice, the OS loader is implemented the sane way: it goes through OriginalFirstThunk and puts addresses into FirstThunk, not something else. It should be noted once again that the DLL name and symbol name structures are reused across fixup entries and have to be there anyway to support the standard import method, so the sustained overhead is 20 bytes per reference.

Another question is whether having several IMAGE_IMPORT_DESCRIPTORS for the same DLL is possible. The answer is yes; it is done even by the native compiler/linker (libth32's functions are in fact resident in windows9x kernel32.dll, so if you use it, you have two IMAGE_IMPORT_DESCRIPTORS for kernel32.dll). Yet another question is whether referencing the same PE structures several times is valid. The answer is: why not? Prohibiting that (detecting the violation) would require more work on the part of the loader than not doing it.
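
The fixup mechanism can be simulated on a toy byte image. This is a minimal sketch, not ld's implementation and not the behavior of any particular loader: the offsets, foo.dll, the export address, and the helper names are assumptions chosen for illustration. What it does rely on is that an IMAGE_IMPORT_DESCRIPTOR consists of five 32-bit fields (OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk), which is where the 20 bytes per reference come from.

```python
import struct

# IMAGE_IMPORT_DESCRIPTOR: OriginalFirstThunk, TimeDateStamp,
# ForwarderChain, Name, FirstThunk (five 32-bit fields, 20 bytes).
DESCRIPTOR = struct.Struct("<5I")
DWORD = struct.Struct("<I")


def read_cstring(image, offset):
    """Read a NUL-terminated ASCII string starting at offset."""
    end = image.index(0, offset)
    return image[offset:end].decode("ascii")


def build_image():
    """Build a toy image with one auto-import fixup (all offsets are made up)."""
    image = bytearray(0x80)
    image[0x00] = 0xA1                      # opcode byte of "mov dll_var,%eax"
    fixup_slot = 0x01                       # operand; plays the role of __fu0_dll_var
    DWORD.pack_into(image, 0x10, 0x20)      # __nm_th_dll_var: one entry, then a zero
    image[0x22:0x2A] = b"dll_var\x00"       # __nm_dll_var: 2-byte hint (zero) + name
    image[0x40:0x48] = b"foo.dll\x00"       # DLL name, shared by all fixup entries
    DESCRIPTOR.pack_into(image, 0x50, 0x10, 0, 0, 0x40, fixup_slot)
    return image, 0x50, fixup_slot          # an all-zero descriptor follows at 0x64


def load_imports(image, table_offset, exports):
    """Walk OriginalFirstThunk and store each resolved address via FirstThunk."""
    offset = table_offset
    while True:
        fields = DESCRIPTOR.unpack_from(image, offset)
        if not any(fields):                 # all-zero descriptor ends the table
            break
        name_thunk, _stamp, _forwarder, dll_offset, first_thunk = fields
        dll = read_cstring(image, dll_offset)
        index = 0
        while True:
            (entry,) = DWORD.unpack_from(image, name_thunk + 4 * index)
            if entry == 0:                  # zero terminates the name vector
                break
            symbol = read_cstring(image, entry + 2)   # skip the 2-byte hint
            DWORD.pack_into(image, first_thunk + 4 * index, exports[dll][symbol])
            index += 1
        offset += DESCRIPTOR.size


image, table_offset, fixup_slot = build_image()
load_imports(image, table_offset, {"foo.dll": {"dll_var": 0x10002000}})
print(hex(DWORD.unpack_from(image, fixup_slot)[0]))
```

Because the simulated loader stops at the zero terminator of the singleton name vector, it writes exactly one address into the code and never looks for a terminator on the FirstThunk side, which is the behavior the scheme depends on.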



This document was generated by system on December, 2 2004 using texi2html