Tuesday, November 5, 2024

Using Win32 Calling Conventions

When writing code for the Win32 platform, most developers don’t pay attention to selecting a “calling convention”, and in most cases it doesn’t really matter much. But as systems get larger and split into more modules (especially when third-party modules are to be included), this becomes something that cannot really be ignored.

In this Tech Tip, we discuss what the MSVC calling conventions are, why to choose one over the other, and “big system” considerations that may not be obvious.

Calling Conventions

Traditionally, C function calls are made with the caller pushing some parameters onto the stack, calling the function, and then popping the stack to clean up those pushed arguments.

/* example of __cdecl */
push arg3 // arguments are pushed right to left
push arg2
push arg1
call function
add esp,12 // effectively "pop; pop; pop"

It turns out that Microsoft compilers on Windows (and probably most others) support not just this convention, but two others as well. The technical details are found at Microsoft’s MSDN site, but we’ll touch on them here as well.

The default convention – shown above – is known as __cdecl, and the other most popular one is __stdcall. In __stdcall, the parameters are still pushed by the caller, but the stack is cleaned up by the callee.

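This is the standard convention for Win32 API functions (as defined by the WINAPI macro in <windows.h>), and it’s also sometimes loosely called the “Pascal” calling convention, though the original Pascal convention pushes its parameters in the opposite order (left to right).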

/* example of __stdcall */
push arg3 // same right-to-left pushes as __cdecl
push arg2
push arg1
call function
// no stack cleanup - callee does this

This looks like a minor technical detail, but if the caller and the callee disagree on who cleans up the stack, the stack is corrupted in a way the program is unlikely to recover from.

A mismatch in calling convention is catastrophic for a running program.

At first the distinction seems uninteresting (and to many it really is), but choosing one over the other has several implications.

We’ll note that there is also a __fastcall convention that uses registers, but we don’t believe it’s really that useful in the general case – the save and restore of the registers often removes any speed benefit of using the register for arg passing. We’ll only touch on it in passing.

  • Since the __stdcall callee does the stack cleanup, the (very tiny) code to perform this task is found in only one place, rather than being duplicated in every caller as it is in __cdecl. This makes the code very slightly smaller, though the size impact is only visible in large programs.
  • Variadic functions like printf() are almost impossible to get right with __stdcall, because only the caller really knows how many arguments were passed in order to clean them up. The callee can make some good guesses (say, by looking at a format string), but the stack cleanup would have to be determined by the actual logic of the function, not the calling-convention mechanism itself. Hence only __cdecl supports variadic functions so that the caller can do the cleanup.
  • There isn’t really a “right or wrong” with respect to which one is best, but it is positively fatal to “mix and match”. The general principle is “the stack-cleanup must match the arg-pushing”, and this only happens when caller and callee know what the other is doing. Calling a function with the “wrong” convention will destroy the stack.
Linker symbol name decorations

    As mentioned in a bullet point above, calling a function with the “wrong” convention can be disastrous, so Microsoft has a mechanism to prevent this from happening. It works well, though it can be maddening if one does not know the reasons behind it.

    They have chosen to resolve this by encoding the calling convention into the low-level function names with extra characters (which are often called “decorations”), and these are treated as unrelated names by the linker. The default calling convention is __cdecl, but each one can be requested explicitly with the /G? parameter to the compiler.

    __cdecl (cl /Gd …)

    All function names of this type are prefixed with an underscore, and the number of parameters does not really matter because the caller is responsible for stack setup and stack cleanup. It is possible for a caller and callee to be confused over the number of parameters actually passed, but at least the stack discipline is maintained properly.

    __stdcall (cl /Gz …)

    These function names are prefixed with an underscore and suffixed with an @ sign followed by the number of bytes of parameters passed. By this mechanism, it’s not possible to call a function with the “wrong” convention, or even with the wrong total size of parameters.

    __fastcall (cl /Gr …)

    These function names start with an @ sign and are suffixed with an @ sign followed by the number of bytes of parameters, much like __stdcall.

    Examples:

    We’ll note that the decorated names are never visible to a C program: they are strictly a linker facility, and the linker will never resolve one kind of reference with the “wrong” one.

    We can see this in action with a simple program that declares – but does not define – several functions that are not found by the linker.

    Note that since a double variable takes eight bytes (not four like an int), the three-parameter func2() is …@16 instead of …@12. But both of the __cdecl functions are undecorated in this manner.

    But doesn’t the compiler catch this?

    Yes, but the calling-convention decorations are solving a somewhat narrower problem than function prototypes do.

    C++ has always supported, and ANSI C introduced, “function prototypes”, which allow one to describe the parameters of a function in a declaration (previously, only the return type was part of a declaration). When a function is actually called, the call is compared with the declaration, and a diagnostic is issued if they disagree:

    /* somefile.c */

    extern int foo(int a); // prototype
    ...
    n = foo(1, 2, 3); // mismatch! bad parameter count!

    Here, the compiler expects the foo() function to take just one parameter, and when it sees a few more (or one of the wrong type), it objects. The Microsoft calling conventions would add nothing to this.

    But when the linker enters the picture, a mismatch can slip past the compiler. Consider two files, one that uses a function and the other that defines it:

    /* in file1.c */

    extern int __cdecl foo(int);
    ...
    n = foo(1);

    /* in file2.c */

    int __stdcall foo(int a)
    {
        ....
    }

    Since the compiler never looks at the two source files together, it could never detect that there has been a mismatch in the calling conventions used. The resulting code, if linked, would destroy the stack.

    Before one lambasts the programmer for making such a foolish mistake, consider that the default calling convention is usually __cdecl, so even if the file1.c example omitted the declaration for foo(), it would still default upon first use to __cdecl. This is a different (but very common) oversight.

    In a small project, the example shown is highly contrived, but as systems get larger, this situation arises more often. It’s common to use a third-party library (which exports many functions), and one cannot always tell which compiler flags were used by the library builder.

    It’s at this point where we get to the real reason for the calling-convention decorations. It’s not to keep a programmer from calling a function with the wrong number (or type) of arguments:

    Calling-convention symbol decorations exist only to maintain stack discipline

    When does it matter?

    In most cases, it makes no difference which calling convention is used by default throughout the program, or which convention any particular function uses, but there are a few notable exceptions when the default is something other than __cdecl.

  • The function main() (and the wide version wmain()) must always be __cdecl.
  • The function WinMain() — the starting point for GUI programs — is always __stdcall, though this is pre-declared by the <windows.h> include file to make this more or less automatic.
  • Variadic (“printf-like”) functions are __cdecl even if declared otherwise (i.e., the calling convention keyword is ignored). We are surprised that the compiler does not issue a warning against this misuse:
    int __stdcall myprintf(const char *fmt, ...); // it's really __cdecl
  • Some library functions take addresses of other functions as parameters, and these must be matched properly. A common example is qsort, which takes a “compare” function as the last parameter, and this function must be __cdecl.
    extern void __cdecl qsort( void *base, size_t num, size_t width,
        int (__cdecl *compare )(const void *, const void *) );
    ....
    int __stdcall mycompare(const void *p1, const void *p2)
    {
        // compare here
    }
    ....
    qsort(base, n, width, mycompare); // ERROR - mismatch

    We’ll note here that the calling convention of the qsort function itself doesn’t enter into this – it’s the convention of the parameter to qsort that does.

    This function-address-as-parameter issue also comes up with signal handlers.

    Building bigger systems

    As mentioned, smaller programs really just don’t care much about this, but when systems get larger, or when third-party libraries enter the picture, it becomes necessary to be aware of calling-convention issues (particularly on an inter-module basis). This is further complicated if the software in question must be ported to non-Windows platforms that have no notion of calling conventions.

    Even when one has the source to a third-party package (say, the excellent NET-SNMP library), one may still not be too excited about diving into the build system. Though UNIX build systems are almost always created based on “Makefiles”, Windows builds often use “project files” that are somewhat less transparent and more ad hoc.

    Our preference is to use __stdcall when possible, but “which convention to use” is less important than “all conventions match”. We also don’t like to insist that all parts be built the same way, so a library could be built mainly with one convention while the application uses another.

    We’ll start with the library, and with the first guideline:

    Rule #1:

    Library headers should explicitly name a calling convention everywhere – Do not rely on the default.

    When a library header includes the calling convention on every function, the default value won’t ever be considered, so the client application can use whatever it likes for its convention.

    For a Win32-only library, it’s straightforward enough to simply note the calling convention on every function:

    Typedefs carry the calling convention (implicit or explicit) along with the type information for function pointers, and in our simple library, we’ve done this with the FAILHANDLER typedef.

    In practice it’s not strictly necessary to mark the function definitions with the calling-convention keywords, because if the definition is seen in the presence of the keyword-endowed declaration from the header file, the declaration's convention overrides the default.

    Most libraries also have internal functions that are not exported or visible to the users, and these need not have the calling conventions noted. The idea is that since the library is built as a whole in one big step, all the parts will share the same default convention even if it’s different from how other unrelated modules are built. As long as these functions are not visible to the outside, their conventions are a private matter.

    Where this gets trickier is when the library will be used by non-Win32 platforms, almost all of which treat __stdcall and related keywords as syntax errors.

    Rule #2:

    Use the C preprocessor and a portability-related header file to make this work seamlessly on non-MSVC platforms.

    We generally create a “portable.h” header file that contains these (and many other) definitions related to portability. Since the calling conventions are meaningless on non-Windows compilers, all that’s required to support them is to make them go away.

    #ifndef _WIN32
    # define __cdecl /* nothing */
    # define __stdcall /* nothing */
    # define __fastcall /* nothing */
    #endif /* _WIN32 */

    This elides these keywords anytime they are seen on a non-Windows platform, though developers using compilers other than Microsoft Visual C may need to tune the definitions a bit.

    We’ve found it most helpful to define our libraries such that they all have a “portable.h” header file that all others can include to help iron out some of these compiler and platform differences. “Portability” extends far beyond just the calling conventions, though this is a habit that only “experience” can inform.

    As an aside, we’ll encourage those taking this approach to add support for the GNU C compiler __attribute__ facility, which can be used to provide high-level information tagging that the compiler uses to perform better error checking. Though the details of __attribute__ itself are not pertinent to our discussion, the “making it work for a non-GNU compiler” fits squarely in our “portable.h” scheme:

  • Unixwiz.net Tech Tip – Using GNU C __attribute__

    An important goal is that the calling convention applied to the functions at library compile time must be the same one that the client application sees when the header file is used. A contorted “portable.h” file that allows for (say) __stdcall to be defined sometimes and not other times on the same platform is asking for trouble.

    Steve Friedl is a software and network security consultant in Southern California. He runs Unixwiz.net which features a collection of tools, tech tips, and other information in the scope of his consulting practice.
