Virtual method table
In computer programming, a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding).
Whenever a class defines a virtual function (or method), most compilers add a hidden member variable to the class that points to an array of pointers to (virtual) functions called the virtual method table. These pointers are used at runtime to invoke the appropriate function implementations, because at compile time it may not yet be known if the base function is to be called or a derived one implemented by a class that inherits from the base class.
There are many different ways to implement such dynamic dispatch, but use of virtual method tables is especially common among C++ and related languages (such as D and C#). Languages that separate the programmatic interface of objects from the implementation, like Visual Basic and Delphi, also tend to use this approach, because it allows objects to use a different implementation simply by using a different set of method pointers. The method allows creation of external libraries, where other techniques perhaps may not.[1]
Suppose a program contains three classes in an inheritance hierarchy: a superclass, Cat, and two subclasses, HouseCat and Lion. Class Cat defines a virtual function named speak(), so its subclasses may provide an appropriate implementation (e.g. either meow() or roar()). When the program calls the speak function on a Cat reference (which can refer to an instance of Cat, or an instance of HouseCat or Lion), the code must be able to determine which implementation of the function the call should be dispatched to. This depends on the actual class of the object, not the class of the reference to it (Cat). The class cannot generally be determined statically (that is, at compile time), so the compiler cannot decide which function to call at that time either. The call must instead be dispatched to the right function dynamically (that is, at run time).
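The hierarchy described above might be sketched in C++ as follows. The return strings and the helper speakThroughBase() are illustrative assumptions, not part of any particular codebase; the point is only that the helper sees a Cat reference, yet the implementation that runs depends on the object's actual class.

```cpp
#include <string>

// Hypothetical rendering of the Cat hierarchy described above.
class Cat {
public:
    virtual ~Cat() = default;
    // Virtual: the implementation is chosen from the object's actual class.
    virtual std::string speak() const { return "..."; }
};

class HouseCat : public Cat {
public:
    std::string speak() const override { return "meow"; }
};

class Lion : public Cat {
public:
    std::string speak() const override { return "roar"; }
};

// The static type of `cat` is Cat; the call is dispatched at run time.
std::string speakThroughBase(const Cat& cat) {
    return cat.speak();
}
```

Passing a HouseCat to speakThroughBase() yields "meow" and passing a Lion yields "roar", even though the function was compiled knowing only about Cat.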
Implementation
An object's virtual method table contains the addresses of the object's dynamically bound methods. Method calls are performed by fetching the method's address from the object's virtual method table. The virtual method table is the same for all objects belonging to the same class, and is therefore typically shared between them. Objects belonging to type-compatible classes (for example siblings in an inheritance hierarchy) will have virtual method tables with the same layout: the address of a given method will appear at the same offset for all type-compatible classes. Thus, fetching the method's address from a given offset into a virtual method table will get the method corresponding to the object's actual class.[2]
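To make the mechanism concrete, the table can be built by hand. The following is a minimal sketch under stated assumptions: all names (CatObject, CatVTable, callSpeak, and so on) are hypothetical, and real compilers generate an equivalent structure automatically rather than exposing it like this.

```cpp
#include <string>

struct CatObject;  // forward declaration

// One table per "class": each slot holds the address of one method.
struct CatVTable {
    std::string (*speak)(const CatObject*);  // slot 0
    int (*legs)(const CatObject*);           // slot 1
};

// Every object carries only a pointer to its class's shared table.
struct CatObject {
    const CatVTable* vptr;
};

std::string houseCatSpeak(const CatObject*) { return "meow"; }
std::string lionSpeak(const CatObject*) { return "roar"; }
int fourLegs(const CatObject*) { return 4; }

// Sibling "classes" use the same slot layout with different addresses.
const CatVTable kHouseCatVTable{houseCatSpeak, fourLegs};
const CatVTable kLionVTable{lionSpeak, fourLegs};

CatObject makeHouseCat() { return CatObject{&kHouseCatVTable}; }
CatObject makeLion() { return CatObject{&kLionVTable}; }

// A "virtual call": fetch the address from a fixed slot, then call it,
// passing the object itself as the first argument.
std::string callSpeak(const CatObject& cat) {
    return cat.vptr->speak(&cat);
}
```

Two objects created by makeHouseCat() hold the same vptr value, which illustrates why the table can be shared per class, while callSpeak() reads the same slot for every object and still reaches different code.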
The C++ standards do not mandate exactly how dynamic dispatch must be implemented, but compilers generally use minor variations on the same basic model.
Typically, the compiler creates a separate virtual method table for each class. When an object is created, a pointer to this table, called the virtual table pointer, vpointer or VPTR, is added as a hidden member of this object. As such, the compiler must also generate "hidden" code in the constructors of each class to initialize a new object's virtual table pointer to the address of its class's virtual method table.
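One observable consequence of this per-constructor initialization is that a virtual call made inside a base-class constructor reaches the base-class implementation, because the derived part of the object has not been constructed yet. The sketch below uses hypothetical classes Shape and Circle to record which implementation ran at each stage.

```cpp
#include <string>

// Hypothetical classes; each constructor records which name() it reached.
struct Shape {
    std::string seenInShapeCtor;
    // While this constructor runs, the object is treated as a Shape
    // (on typical implementations its vptr refers to Shape's table).
    Shape() { seenInShapeCtor = name(); }
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    std::string seenInCircleCtor;
    // By now the object is treated as a Circle.
    Circle() { seenInCircleCtor = name(); }
    std::string name() const override { return "Circle"; }
};
```

After constructing a Circle, seenInShapeCtor holds "Shape" and seenInCircleCtor holds "Circle", while later calls through a Shape reference reach Circle::name().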
Many compilers place the virtual table pointer as the last member of the object; other compilers place it as the first; portable source code works either way.[3] For example, g++ previously placed the pointer at the end of the object.[4]
Example
Consider the following class declarations in C++:
import std;

class Base1 {
private:
    int b1 = 0;
public:
    explicit Base1(int b1):
        b1{b1} {}

    virtual ~Base1() = default;

    // Non-virtual: bound at compile time, no table entry
    void nonVirtual() {
        std::println("Base1::nonVirtual() called!");
    }

    virtual void fn1() {
        std::println("Base1::fn1() called!");
    }
};

class Base2 {
private:
    int b2 = 0;
public:
    explicit Base2(int b2):
        b2{b2} {}

    virtual ~Base2() = default;

    virtual void fn2() {
        std::println("Base2::fn2() called!");
    }
};

class Derived : public Base1, public Base2 {
private:
    int d = 0;
public:
    explicit Derived(int b1, int b2, int d):
        Base1(b1), Base2(b2), d{d} {}

    ~Derived() = default;

    // Non-virtual: specific to Derived
    void fn3() {
        std::println("Derived::fn3() called!");
    }

    // Overrides Base2::fn2()
    void fn2() override {
        std::println("Derived::fn2() called!");
    }
};

int main() {
    // Neither class has a default constructor, so arguments are required
    Base2* base2 = new Base2(2);
    Derived* derived = new Derived(1, 2, 3);
    // ...
    delete base2;
    delete derived;
}
g++ 3.4.6 from GCC produces the following 32-bit memory layout for the object base2:[nb 1]
base2:
  +0: pointer to virtual method table of Base2
  +4: value of b2

virtual method table of Base2:
  +0: Base2::fn2()

and the following memory layout for the object derived:
derived:
  +0: pointer to virtual method table of Derived (for Base1)
  +4: value of b1
  +8: pointer to virtual method table of Derived (for Base2)
 +12: value of b2
 +16: value of d

Total size: 20 bytes.

virtual method table of Derived (for Base1):
  +0: Base1::fn1()    // Base1::fn1() is not overridden

virtual method table of Derived (for Base2):
  +0: Derived::fn2()  // Base2::fn2() is overridden by Derived::fn2()
                      // The location of Base2::fn2 is not in the virtual method table for Derived
Note that functions not declared with the keyword virtual (such as nonVirtual() and fn3()) do not generally appear in the virtual method table. There are exceptions for special cases as posed by the default constructor.
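The practical difference can be seen when a derived class redeclares both kinds of function. In this sketch, which uses hypothetical classes Parent and Child, the non-virtual function is bound from the static type of the reference, while the virtual one is looked up at run time.

```cpp
#include <string>

struct Parent {
    virtual ~Parent() = default;
    // Non-virtual: bound at compile time, needs no table entry.
    std::string plain() const { return "Parent::plain"; }
    // Virtual: reached through the virtual method table.
    virtual std::string viaTable() const { return "Parent::viaTable"; }
};

struct Child : Parent {
    // Hides Parent::plain() but does not override it.
    std::string plain() const { return "Child::plain"; }
    std::string viaTable() const override { return "Child::viaTable"; }
};

// Both helpers see only the static type Parent.
std::string callPlain(const Parent& p) { return p.plain(); }
std::string callViaTable(const Parent& p) { return p.viaTable(); }
```

Given a Child object, callPlain() still returns "Parent::plain", whereas callViaTable() returns "Child::viaTable".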
Also note the virtual destructors in the base classes, Base1 and Base2. They are necessary to ensure that delete derived; destroys the complete Derived object, including its Base1 and Base2 parts, even if derived is a pointer of type Base1 or Base2. They were excluded from the memory layouts to keep the example simple.[nb 2]
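The effect of a virtual destructor can be observed by logging destructor calls. The sketch below uses hypothetical classes Resource and FileResource and a hypothetical destructionLog() helper; it deletes a derived object through a base-class pointer and returns the order in which the destructors ran.

```cpp
#include <string>

// Hypothetical log of destructor calls, shared by both classes.
std::string& destructionLog() {
    static std::string log;
    return log;
}

struct Resource {
    virtual ~Resource() { destructionLog() += "~Resource;"; }
};

struct FileResource : Resource {
    ~FileResource() override { destructionLog() += "~FileResource;"; }
};

// Deletes a FileResource through a pointer whose static type is Resource*.
std::string destroyThroughBase() {
    destructionLog().clear();
    Resource* r = new FileResource();
    delete r;  // virtual destructor: ~FileResource runs first, then ~Resource
    return destructionLog();
}
```

Without the virtual keyword on ~Resource(), deleting through the base pointer would have undefined behavior in C++.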
Overriding of the method fn2() in class Derived is implemented by duplicating the virtual method table of Base2 and replacing the pointer to Base2::fn2() with a pointer to Derived::fn2().
Multiple inheritance and thunks
The g++ compiler implements the multiple inheritance of the classes Base1 and Base2 in class Derived using two virtual method tables, one for each base class. (There are other ways to implement multiple inheritance, but this is the most common.) This leads to the necessity for "pointer fixups", also called thunks, when casting.
Consider the following C++ code:
Derived* derived = new Derived(1, 2, 3);
Base1* base1 = derived;
Base2* base2 = derived;
While derived and base1 will point to the same memory location after execution of this code, base2 will point to the location derived + 8 (eight bytes beyond the memory location of derived). Thus, base2 points to the region within derived that "looks like" an instance of Base2, i.e., has the same memory layout as an instance of Base2.
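The fixup can be measured directly. The following sketch restates simplified versions of the three classes (data members are public and constructors are omitted, which is an assumption made for brevity) and computes the byte distance between the start of a Derived object and each base subobject. The helper names are hypothetical.

```cpp
#include <cstddef>

struct Base1 { virtual ~Base1() = default; virtual void fn1() {} int b1 = 0; };
struct Base2 { virtual ~Base2() = default; virtual void fn2() {} int b2 = 0; };
struct Derived : Base1, Base2 { void fn2() override {} int d = 0; };

// Byte distance from the start of the Derived object to its Base1 part.
std::ptrdiff_t base1Offset(Derived& obj) {
    Base1* asBase1 = &obj;  // implicit conversion applies any fixup
    return reinterpret_cast<const char*>(asBase1) -
           reinterpret_cast<const char*>(&obj);
}

// Byte distance from the start of the Derived object to its Base2 part.
std::ptrdiff_t base2Offset(Derived& obj) {
    Base2* asBase2 = &obj;
    return reinterpret_cast<const char*>(asBase2) -
           reinterpret_cast<const char*>(&obj);
}

// Casting back down reverses the fixup and recovers the original address.
bool roundTrips(Derived& obj) {
    Base2* asBase2 = &obj;
    return static_cast<Derived*>(asBase2) == &obj;
}
```

The exact offsets are implementation details. The 32-bit layout above gives 0 and 8; on common 64-bit platforms the Base2 offset is typically 16, because the vpointer is 8 bytes wide and the following int is padded.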
Invocation
A call to derived->fn1() is handled by dereferencing derived's Derived::Base1 vpointer, looking up the fn1 entry in the virtual method table, and then dereferencing that pointer to call the code.
Single inheritance
In the case of single inheritance (or in a language with only single inheritance), if the vpointer is always the first element in derived (as it is with many compilers), this reduces to the following pseudo-C++:
(*((*derived)[0]))(derived)
Here, *derived refers to the virtual method table of Derived and [0] refers to the first method in the virtual method table. The parameter derived becomes the "this" pointer to the object.
Multiple inheritance
In the more general case, calling Base1::fn1() or Derived::fn2() is more complicated:
// Call derived->fn1()
(*(*(derived[0]/*pointer to virtual method table of Derived (for Base1)*/)[0]))(derived)
// Call derived->fn2()
(*(*(derived[8]/*pointer to virtual method table of Derived (for Base2)*/)[0]))(derived + 8)
The call to derived->fn1() passes a Base1 pointer as a parameter. The call to derived->fn2() passes a Base2 pointer as a parameter. This second call requires a fixup to produce the correct pointer. The location of Base2::fn2 is not in the virtual method table for Derived.
By comparison, a call to derived->nonVirtual() is much simpler:
(*Base1::nonVirtual)(derived)
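The this-pointer adjustment on a virtual call can be observed without inspecting the tables. In the sketch below, simplified versions of the classes are restated, and fn2() is changed to return its own this pointer purely for illustration. CallResult and callFn2ThroughBase2() are hypothetical names.

```cpp
struct Base1 { virtual ~Base1() = default; virtual void fn1() {} int b1 = 0; };

struct Base2 {
    virtual ~Base2() = default;
    virtual const void* fn2() { return this; }
    int b2 = 0;
};

struct Derived : Base1, Base2 {
    // Inside the override, `this` has type Derived*.
    const void* fn2() override { return this; }
    int d = 0;
};

struct CallResult {
    const void* callerPointer;  // the Base2* used to make the call
    const void* thisInsideFn2;  // the `this` seen by Derived::fn2()
    const void* objectStart;    // the address of the whole Derived object
};

// Calls fn2() through a Base2*, which points into the middle of the object.
CallResult callFn2ThroughBase2(Derived& obj) {
    Base2* asBase2 = &obj;
    return CallResult{asBase2, asBase2->fn2(), &obj};
}
```

Although the call starts from a Base2 pointer, thisInsideFn2 equals objectStart: the implementation adjusted the pointer back before entering Derived::fn2(). On common ABIs callerPointer differs from objectStart, since Base2 is not the first base.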
Efficiency
A virtual call requires at least an extra indexed dereference and sometimes a "fixup" addition, compared to a non-virtual call, which is simply a jump to a compiled-in pointer. Therefore, calling virtual functions is inherently slower than calling non-virtual functions. An experiment done in 1996 indicates that approximately 6–13% of execution time is spent simply dispatching to the correct function, though the overhead can be as high as 50%.[5] The cost of virtual functions may not be so high on modern CPU architectures due to much larger caches and better branch prediction.
Furthermore, in environments where JIT compilation is not in use, virtual function calls usually cannot be inlined. In certain cases it may be possible for the compiler to perform a process known as devirtualization in which, for instance, the lookup and indirect call are replaced with a conditional execution of each inlined body, but such optimizations are not common.
To avoid this overhead, compilers usually avoid using virtual method tables whenever the call can be resolved at compile time.
Thus, the call to fn1 above may not require a table lookup because the compiler may be able to tell that derived can only hold a Derived at this point, and Derived does not override fn1. Or the compiler (or optimizer) may be able to detect that there are no subclasses of Base1 anywhere in the program that override fn1. The call to Base1::fn1 or Base2::fn2 will probably not require a table lookup because the implementation is specified explicitly (although it does still require the this-pointer fixup).
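Two source-level forms that let a call be resolved at compile time are sketched below, using hypothetical classes Node and Leaf. A qualified call names the implementation explicitly, and a class marked final cannot have further overriders, so a call through its static type has only one possible target. Whether a given compiler actually skips the table lookup in the final case is an optimization decision, not a guarantee.

```cpp
#include <string>

struct Node {
    virtual ~Node() = default;
    virtual std::string fn() const { return "Node::fn"; }
};

struct Leaf final : Node {  // final: no class can derive from Leaf
    std::string fn() const override { return "Leaf::fn"; }
};

// Ordinary virtual call: dispatched on the object's actual class.
std::string viaDispatch(const Node& n) { return n.fn(); }

// Qualified call: names Node::fn explicitly, so no dynamic dispatch occurs.
std::string viaQualifiedName(const Node& n) { return n.Node::fn(); }

// The static type is a final class, so the target is known at compile time.
std::string viaFinalType(const Leaf& leaf) { return leaf.fn(); }
```

For a Leaf object, viaDispatch() returns "Leaf::fn" while viaQualifiedName() returns "Node::fn", which shows that the qualified form bypasses the override.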
Comparison with alternatives
The virtual method table is generally a good performance trade-off to achieve dynamic dispatch, but there are alternatives, such as binary tree dispatch, with higher performance in some typical cases, but different trade-offs.[1][6]
However, virtual method tables only allow for single dispatch on the special "this" parameter, in contrast to multiple dispatch (as in CLOS, Dylan, or Julia), where the types of all parameters can be taken into account in dispatching.
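The single-dispatch limitation shows up in C++ as follows: overloads are selected from the static type of an argument, and only the object the method is called on is examined at run time. A common workaround, often called double dispatch, chains two virtual calls. The sketch below uses hypothetical classes Body, Asteroid and Ship; it is one way to emulate dispatch on two arguments, not a feature of the virtual method table itself.

```cpp
#include <string>

struct Asteroid;
struct Ship;

struct Body {
    virtual ~Body() = default;
    // First dispatch: on the dynamic type of *this.
    virtual std::string collideWith(const Body& other) const = 0;
    // Second dispatch: on the dynamic type of the other body. The
    // overload is picked from the caller's own, statically known type.
    virtual std::string hitBy(const Asteroid&) const = 0;
    virtual std::string hitBy(const Ship&) const = 0;
};

struct Asteroid : Body {
    std::string collideWith(const Body& other) const override {
        return other.hitBy(*this);  // *this is statically an Asteroid
    }
    std::string hitBy(const Asteroid&) const override { return "asteroid hits asteroid"; }
    std::string hitBy(const Ship&) const override { return "ship hits asteroid"; }
};

struct Ship : Body {
    std::string collideWith(const Body& other) const override {
        return other.hitBy(*this);  // *this is statically a Ship
    }
    std::string hitBy(const Asteroid&) const override { return "asteroid hits ship"; }
    std::string hitBy(const Ship&) const override { return "ship hits ship"; }
};

// Plain overloading looks only at the static type of the argument.
std::string describe(const Body&) { return "some body"; }
std::string describe(const Ship&) { return "a ship"; }
std::string describeThroughBase(const Body& b) { return describe(b); }
```

Calling describeThroughBase() with a Ship still selects the Body overload, whereas collideWith() reaches a result that depends on the actual classes of both objects.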
Virtual method tables also only work if dispatching is constrained to a known set of methods, so they can be placed in a simple array built at compile time, in contrast to duck typing languages (such as Smalltalk, Python or JavaScript).
Languages that provide either or both of these features often dispatch by looking up a string in a hash table, or some other equivalent method. There are a variety of techniques to make this faster (e.g., interning/tokenizing method names, caching lookups, just-in-time compilation).
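A name-based lookup of this kind might be sketched as follows. DynamicObject, define(), send() and makeDuck() are hypothetical names chosen for illustration; real dynamic language runtimes are considerably more elaborate and usually add the caching techniques mentioned above.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical duck-typed object: methods are found by name at run time.
class DynamicObject {
public:
    using Method = std::function<std::string()>;

    // Methods can be added at any time, so no fixed table layout exists.
    void define(const std::string& name, Method body) {
        methods_[name] = std::move(body);
    }

    // Returns std::nullopt when the object does not respond to `name`.
    std::optional<std::string> send(const std::string& name) const {
        auto it = methods_.find(name);
        if (it == methods_.end()) {
            return std::nullopt;
        }
        return it->second();
    }

private:
    std::unordered_map<std::string, Method> methods_;
};

DynamicObject makeDuck() {
    DynamicObject duck;
    duck.define("speak", [] { return std::string("quack"); });
    return duck;
}
```

Each send() hashes the method name and searches the map, which is more flexible than indexing a fixed slot but also more expensive per call.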
Notes
- ^ G++'s -fdump-class-hierarchy (starting with version 8: -fdump-lang-class) argument can be used to dump virtual method tables for manual inspection. For the AIX VisualAge XlC compiler, use -qdump_class_hierarchy to dump the class hierarchy and virtual function table layout.
- ^ "C++ - why there are two virtual destructor in the virtual table and where is address of the non-virtual function (gcc4.6.3)".
References
- Margaret A. Ellis and Bjarne Stroustrup (1990). The Annotated C++ Reference Manual. Reading, MA: Addison-Wesley. ISBN 0-201-51459-1.
- ^ a b Zendra, Olivier; Colnet, Dominique; Collin, Suzanne (1997). Efficient Dynamic Dispatch without Virtual Function Tables: The SmallEiffel Compiler. 12th Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA'97), ACM SIGPLAN, Oct 1997, Atlanta, United States. pp. 125–141. inria-00565627. Centre de Recherche en Informatique de Nancy Campus Scientifique, Bâtiment LORIA. p. 16.
- ^ Ellis & Stroustrup 1990, pp. 227–232.
- ^ Danny Kalev (2003). "C++ Reference Guide: The Object Model II". Headings "Inheritance and Polymorphism" and "Multiple Inheritance".
- ^ "C++ ABI Closed Issues". Archived from the original on 25 July 2011. Retrieved 17 June 2011.
- ^ Driesen, Karel; Hölzle, Urs (1996). "The Direct Cost of Virtual Function Calls in C++" (PDF). OOPSLA.
- ^ Zendra, Olivier; Driesen, Karel (2002). "Stress-testing Control Structures for Dynamic Dispatch in Java". Proceedings of the USENIX 2nd Java Virtual Machine Research and Technology Symposium (JVM '02). pp. 105–118.