Operator Overloading - Théo's Fieldnotes

## Type Operations

Imagine you create a class called `Complex` to hold complex numbers.

```cpp
class Complex {
public:
    Complex(double real, double imag) : real_(real), imag_(imag) {}

    double real() const { return real_; }
    double imag() const { return imag_; }

private:
    double real_;
    double imag_;
};
```

How would you add two instances of `Complex` together? A first approach could be to implement an `add` static member method.

```cpp
int main() {
    Complex z1(34, 43);
    Complex z2(12, 7);
    Complex z3 = Complex::add(z1, z2);
    std::cout << z3.real() << "," << z3.imag() << "i" << std::endl;
    return 0;
}
```

A `static` method works, but wouldn't it be more convenient to use the `+` operator, so that we can write code like below?

```cpp
int main() {
    Complex z1(34, 43);
    Complex z2(12, 7);
    Complex z3 = z1 + z2;
    std::cout << z3 << std::endl;
    return 0;
}
```

We can achieve that by overloading the output (`<<`) and addition (`+`) operators. This means we're going to provide our own implementation for those. We can do the same for other operators, of course, whether they're arithmetic (`+`, `-`, `*`...), logical (`&&`, `||`, `!`...), comparison (`==`, `<=`...), increment and decrement (`++`, `--`), compound assignment (`+=`, `-=`...) or assignment (`=`) operators.

Indeed, whenever we use an operator on a class type, a previously defined function or member method is called. Operators act as a shorthand syntax to make code easier to read and write. Consider the following two lines.

```cpp
std::cout << "Hello World";
operator<<(std::cout, "Hello World");
```

They are exactly the same. The `operator<<` overload that takes a `const char*` is a non-member function of the standard library, with a signature along the lines of `std::ostream& operator<<(std::ostream&, const char*)`, and `std::cout` is an instance of `std::ostream`. (Calling the member form `std::cout.operator<<("Hello World")` would select the `const void*` member overload instead and print an address.) `<<` is usually related to bit shifting, but here it is overloaded to mean something different.

Overloading happens when two functions have the exact same name but different parameter lists. The two `max` functions below are said to be overloaded.
```cpp
int max(int, int);
double max(double, double);
```

They accept different parameters, so the compiler knows exactly which function to match with a specific call. Since operators are merely syntactic sugar for function calls, and since we can overload functions in C++, we can, in turn, implement operators for our own classes and structs by overloading them.

## External Overload

Going back to our `Complex` class example, we can overload the operator `+` using a non-member function, creating an external overload. External overloads are typically used for *symmetric* operators, such as `+`, `-`, `*`, `/`, `==`, `!=`, and for stream output with `<<`.

```cpp
Complex operator+(Complex lhs, const Complex& rhs) {
    // Here we return an anonymous instance of the Complex class
    return Complex(lhs.real() + rhs.real(), lhs.imag() + rhs.imag());
}
```

Since C++11, passing the first argument by value lets the compiler *move* a temporary into `lhs` instead of copying it. An lvalue argument is still copied.

In some situations, we *have* to use an external overload. This is for instance the case when we want to overload operators that implement operations on heterogeneous types. Let's say we want to make it possible to multiply a `double` with a `Complex` number.

```cpp
Complex operator*(double x, const Complex& z) {
    // Scaling by a real number scales both parts
    return Complex(x * z.real(), x * z.imag());
}
```

This is also the case when we want to define overloads on types we did not define. `std::cout` is an instance of `std::ostream`. We can't modify the `std::ostream` class, so we use an external overload.

```cpp
std::ostream& operator<<(std::ostream& os, const Complex& z) {
    os << '(' << z.real() << ", " << z.imag() << 'i' << ')';
    return os;
}
```

Here, the stream is passed by non-const reference: writing to it modifies its state, and streams cannot be copied. We can now write `Complex` objects to the console.
```cpp
int main() {
    Complex z(1.0, 2.0);
    std::cout << z << std::endl;
    return 0;
}
```

However, it would have been best to define a method to convert a `Complex` into a string, so that this logic stays encapsulated within its class, instead of relying on the operator overload to implement this logic.

```cpp
// Requires <sstream> and <string>
class Complex {
public:
    std::string to_string() const {
        std::stringstream ss;
        ss << '(' << real_ << ", " << imag_ << 'i' << ')';
        return ss.str();
    }
    // ...
};
```

Now we can rewrite the overload so that we can output complex numbers to the console.

```cpp
std::ostream& operator<<(std::ostream& os, const Complex& z) {
    os << z.to_string();
    return os;
}
```

Much cleaner!

>👉 The `operator<<` overload *needs* to return `std::ostream&`, so that we can chain calls to `std::cout`, like that: `std::cout << z << std::endl;`. This will be evaluated as `operator<<(operator<<(std::cout, z), std::endl);`.

## Making Friends

In some cases, you might need to allow functions that overload operators externally to access private elements from a class, where accessors are not enough. To do so, we use the keyword `friend` before the function declaration inside the class. Once the overload is marked as `friend`, there is no need for accessors: it can access the instance's `private` members directly.

```cpp
class Complex {
public:
    friend std::ostream& operator<<(std::ostream& os, const Complex& z);
    // ...
};

std::ostream& operator<<(std::ostream& os, const Complex& z) {
    return os << '(' << z.real_ << ", " << z.imag_ << 'i' << ')';
}
```

## Internal Overload

We can create internal overloads using member methods, too. They are typically used for:

- Operators that mutate the state of the left-hand operand, like `+=`, `/=`, `*=`
- Special operators that must be members, like `=`, `[]`, `()`, `->`

The operator `+=` mutates the left-hand object `*this`, so we'll use an internal overload.

```cpp
class Complex {
public:
    Complex& operator+=(const Complex& z);
    // ...
};

Complex& Complex::operator+=(const Complex& z) {
    real_ += z.real();
    imag_ += z.imag();
    return *this;
}
```

Now that we've defined the `+=` operator overload, we can leverage it to simplify the `+` operator overload. This is a C++ idiom to reduce code duplication and ensure consistency. We'll implement `operator+` as an external overload because it is symmetric.

```cpp
class Complex {
public:
    Complex& operator+=(const Complex& z);
    // ...
};

Complex& Complex::operator+=(const Complex& z) { /* ... */ }

// External overload
inline Complex operator+(Complex lhs, const Complex& rhs) {
    lhs += rhs;
    return lhs;
}
```

Passing the left-hand operand by value leverages copy and move semantics: an lvalue argument is copied into `lhs`, while a temporary is moved.

>👉 Historically, the `inline` keyword was used as a hint to the compiler, suggesting it inline the function, replacing the function call with the actual body of the function to avoid the overhead of a function call (setting up the stack, jumping to a new memory location and returning). Compilers have become extremely good at optimization and will often ignore this hint, making decisions based on their own rules about which functions to inline for the best performance.
>
>The reason we still use it today is that C++ has a **one definition rule** (ODR), which states that a non-inline function may be defined at most once in an entire program. Defining a function in a header file that you include in multiple places makes the linker see multiple definitions of the same function, yielding an error. `inline` tells the linker it's okay if there are multiple definitions of this function because they are all the same, so it can just pick one. This makes it possible to define small functions, like the `operator+` overload, directly in the header file for convenience.

When writing `z3 = z1 + z2`, `z1` is copied into the `lhs` parameter of `operator+`; `lhs` is then modified by `+=` before being returned.
## Type Conversions

Using external overloads is what makes mixed operations possible, like the multiplication between a `double` and a `Complex` number.

### Symmetric Operator, Asymmetric Operation

Let's review what happens when we rely on internal overload member methods for symmetric operators. When we write `z1 * z2`, `z1`'s `operator*` overload gets called. Assuming `Complex` has a constructor `Complex(double real)` that can convert a `double` to a `Complex`, let's consider two situations.

If we write `z1 * 2.0`, it's the left-hand term's operator overload that is called.

```cpp
z1.operator*(Complex(2.0)); // ✅ Works !
```

If we write `2.0 * z1`, we get a compile-time error, because `double` doesn't have a member function with the signature `Complex double::operator*(const Complex& z);`.

```cpp
(2.0).operator*(z1); // 🚫 Compiler not happy
```

What's more, there is no conversion from `Complex` to `double` either. More importantly still, `double` is a primitive type and has no member functions. The compiler cannot perform a conversion on the left-hand side, because it's not a parameter; it's the object the method is being called on. This makes the operation asymmetric: it works only if `Complex` is the left-hand side of the expression.

### Restoring Symmetry

This is obviously not ideal. To avoid such a situation, we use external operator overloads, making both sides of the operation parameters of the function.

```cpp
Complex operator*(const Complex& lhs, const Complex& rhs);
```

Now the compiler can perform its standard type conversion process on either argument to find a matching function. Writing `z * 2.0` translates to `operator*(z, Complex(2.0))`. Writing `2.0 * z` translates to `operator*(Complex(2.0), z)`. Again, because there is a non-`explicit` `Complex` constructor that accepts a `double` as a parameter, the compiler is able to implicitly convert `2.0`.
One thing you might ask is whether that means the compiler generates functions our program needs for all possible permutations of operator overloads and class constructors. At least I did. Turns out that this is not the case. The compiler follows a three-step process to determine the best course of action:

1. Name lookup, to collect the candidate functions
2. Viability check on both sides, to try to match the operands to each candidate's parameters, possibly through implicit conversions
3. Best match selection

The compiler translates `z + 2.0` into machine code just as if you had written `z + Complex(2.0)` because it found a `Complex` constructor for `double`. No extra function is generated, so this doesn't make your executable heavier, and the only runtime cost is building the temporary `Complex`, which you would pay when writing the conversion explicitly, too.

## Example Implementations

### Comparison Operators

The idiomatic C++ way is to implement `operator==` and then define `operator!=` in terms of `==`.

```cpp
class Complex {
public:
    friend bool operator==(const Complex& lhs, const Complex& rhs);
    // ...
};

inline bool operator==(const Complex& lhs, const Complex& rhs) {
    return lhs.real_ == rhs.real_ && lhs.imag_ == rhs.imag_;
}

inline bool operator!=(const Complex& lhs, const Complex& rhs) {
    return !(lhs == rhs);
}
```

>👉 You don't use the trailing `const` keyword for non-member functions marked as `friend` inside a class. The trailing `const` keyword can only be used on member functions because it specifies that the `this` pointer is a pointer-to-const. Since non-member functions (including friends) do not have a `this` pointer, the trailing `const` qualifier is not applicable to them.

C++20 simplifies this even further: `!=` is derived automatically from `==`, and a defaulted spaceship operator `<=>` lets the compiler provide the relational operators (`<`, `<=`, `>`, `>=`) along with a defaulted `==`.

### Increments & Decrements Operators

Because they mutate the instance's state, we implement them as internal overloads.
A pre-increment mutates the object in place and returns a reference to itself (increment then yield), while a post-increment mutates the original object but makes and returns a copy of the state before modification (yield then increment). Decrements follow the exact same pattern.

```cpp
class Complex {
public:
    Complex& operator++();
    Complex operator++(int);
    Complex& operator--();
    Complex operator--(int);
    // ...
};

// pre-increment
Complex& Complex::operator++() {
    real_ += 1.0;   // Let's increment the real part, for example
    return *this;   // Return a reference to the mutated instance
}

// post-increment
Complex Complex::operator++(int) {
    Complex temp = *this;
    ++(*this);      // using the pre-increment we defined above
    return temp;    // return a copy of the original object
}

// pre-decrement
Complex& Complex::operator--() {
    real_ -= 1.0;
    return *this;
}

// post-decrement
Complex Complex::operator--(int) {
    Complex temp = *this;
    --(*this);
    return temp;
}
```

>👉 For the compiler to distinguish between pre-increment and post-increment, notice we use a dummy `int` parameter.

>⚠️ You have to be careful when using post-increments, because the expression they're wrapped in will be evaluated with the state of the object before it was mutated.

### Copy Assignment

It's often the case that the default copy assignment operator overload provided by the compiler is enough. Nonetheless, let's see how we might implement it ourselves.

```cpp
class Complex {
public:
    Complex& operator=(const Complex& rhs);
    // ...
};

Complex& Complex::operator=(const Complex& rhs) {
    if (this == &rhs) {
        return *this;
    }
    real_ = rhs.real_;
    imag_ = rhs.imag_;
    return *this;
}
```

It's critical that we check we're not assigning an object to itself. In simple cases such as ours, self-assignment without the check is merely wasted work. If a class manages a resource, however, such as memory with a raw pointer, omitting this check can create major bugs that lead to crashes. You should always perform that check when implementing the copy assignment this way.
This does not mean you need to overload `operator==` first. Here, we are comparing memory addresses, not the objects themselves.

#### Copy-and-Swap Idiom

We also have the option of implementing the `operator=` overload using the copy-and-swap idiom, using `std::swap` (declared in `<utility>` since C++11). This results in a simpler and safer implementation. The idea is to rely on the copy constructor to make a copy, then swap the internal state of the two instances.

```cpp
#include <utility> // For std::swap

class Complex {
public:
    Complex(double real, double imag): real_(real), imag_(imag) {}
    Complex(const Complex& other): Complex(other.real_, other.imag_) {}

    void swap(Complex& rhs) noexcept {
        std::swap(real_, rhs.real_);
        std::swap(imag_, rhs.imag_);
    }

    Complex& operator=(Complex rhs);
    // ...
};

Complex& Complex::operator=(Complex rhs) {
    this->swap(rhs); // Same as (*this).swap(rhs)
    return *this;
}

int main() {
    Complex z1(1.0, 2.0);
    Complex z2(2.0, 1.0);
    z2 = z1;
    return 0;
}
```

When we write `z2 = z1`:

1. `operator=` is called on `z2`. The `rhs` parameter is created as a copy of the argument `z1` by invoking the copy constructor `Complex(const Complex& other)`
2. Inside of `operator=`'s body, the `swap` member function is called on the left-hand side: `this->swap(rhs)`
3. At this point, `z2` holds a copy of `z1`'s state, while `rhs` holds `z2`'s old state
4. `operator=` returns a reference to the mutated left-hand side, `*this`
5. The `rhs` parameter goes out of scope and its destructor is called, cleaning up `z2`'s old state

Lastly, it's idiomatic to provide a non-member version to call the member method.

```cpp
void swap(Complex& lhs, Complex& rhs) noexcept {
    lhs.swap(rhs);
}
```

This will allow generic code, like standard library algorithms, to swap instances of this class efficiently, making our class a good citizen of the C++ ecosystem.
A dedicated `swap` lets generic code avoid making wasteful copies of potentially heavy objects: our implementation only swaps internal pointers and values, which is much faster.

Now, when generic code writes `using std::swap;` followed by an unqualified call like `swap(z1, z2)`, the compiler will not only look at the current scope but also at the namespace of the function's arguments. This is **Argument-Dependent Lookup** (ADL). A qualified call such as `std::swap(z1, z2)` bypasses ADL and always uses the standard version. Since we provided a non-member `swap(Complex&, Complex&)` in the same namespace as the `Complex` class, ADL allows the compiler to find it automatically when two `Complex` objects are swapped. Neat!

#### Removing the Copy Assignment

Additionally, it might be safer in some cases, like when an instance is a heavy object that we want only one instance of, to `delete` `operator=` so that an instance cannot be copy-assigned. To forbid copying entirely, the copy constructor is usually deleted as well.

```cpp
class Complex {
public:
    Complex& operator=(const Complex& rhs) = delete;
    // ...
};
```

### Compound Assignment

As stated earlier, it's idiomatic to overload `operator+` in terms of `operator+=` in C++.

```cpp
class Complex {
public:
    Complex& operator+=(const Complex& rhs);
    // ...
};

// Internal overload
Complex& Complex::operator+=(const Complex& rhs) {
    real_ += rhs.real_;
    imag_ += rhs.imag_;
    return *this;
}

// External overload
inline Complex operator+(Complex lhs, const Complex& rhs) {
    lhs += rhs;
    return lhs;
}
```

### Negation Operator

It's worth noting the `operator-` overload for the subtraction operation should not be mistaken for that of the unary minus (the function that reverses the sign of the object). They have two distinct signatures: the binary form takes the object being subtracted as an argument, while the unary form, written as a member, takes no argument, since the function is applied to the object itself.

Additionally, just like with the `operator+` overload, subtraction will be defined as an external overload, while the sign function will be defined as a member method, since it's only related to the object it is applied to.
Completing the example pertaining to the compound assignment:

```cpp
class Complex {
public:
    Complex& operator+=(const Complex& rhs);
    Complex& operator-=(const Complex& rhs);
    Complex operator-() const; // takes no arguments, returns by value
    // ...
};

// Internal overloads
Complex& Complex::operator+=(const Complex& rhs) {
    real_ += rhs.real_;
    imag_ += rhs.imag_;
    return *this;
}

Complex& Complex::operator-=(const Complex& rhs) {
    real_ -= rhs.real_;
    imag_ -= rhs.imag_;
    return *this;
}

// Unary operators do not mutate the instance in place
// We return a new object by value
Complex Complex::operator-() const {
    return Complex(-real_, -imag_);
}

// External overloads
inline Complex operator+(Complex lhs, const Complex& rhs) {
    lhs += rhs;
    return lhs;
}

inline Complex operator-(Complex lhs, const Complex& rhs) {
    lhs -= rhs;
    return lhs;
}
```

## List of Overloadable Operators

### Arithmetic Operators

- `+` (Unary Plus): Indicates a positive value.
- `-` (Unary Minus): Negates a value.
- `+` (Binary Plus): Performs addition.
- `-` (Binary Minus): Performs subtraction.
- `*` (Multiplication): Performs multiplication.
- `/` (Division): Performs division.
- `%` (Modulo): Computes the remainder of a division.
- `++` (Increment): Increases a value by one (pre- and post-increment).
- `--` (Decrement): Decreases a value by one (pre- and post-decrement).

### Bitwise Operators

- `&` (Bitwise AND): Performs a bit-by-bit AND operation.
- `|` (Bitwise OR): Performs a bit-by-bit OR operation.
- `^` (Bitwise XOR): Performs a bit-by-bit exclusive OR operation.
- `~` (Bitwise NOT / Complement): Flips all bits of a value.
- `<<` (Left Shift): Shifts bits to the left.
- `>>` (Right Shift): Shifts bits to the right.

### Assignment Operators

- `=` (Assignment): Assigns a value.
- `+=` (Addition Assignment): Adds and assigns (`a = a + b`).
- `-=` (Subtraction Assignment): Subtracts and assigns (`a = a - b`).
- `*=` (Multiplication Assignment): Multiplies and assigns (`a = a * b`).
- `/=` (Division Assignment): Divides and assigns (`a = a / b`).
- `%=` (Modulo Assignment): Computes remainder and assigns (`a = a % b`).
- `&=` (Bitwise AND Assignment): Performs bitwise AND and assigns (`a = a & b`).
- `|=` (Bitwise OR Assignment): Performs bitwise OR and assigns (`a = a | b`).
- `^=` (Bitwise XOR Assignment): Performs bitwise XOR and assigns (`a = a ^ b`).
- `<<=` (Left Shift Assignment): Performs left shift and assigns (`a = a << b`).
- `>>=` (Right Shift Assignment): Performs right shift and assigns (`a = a >> b`).

### Relational and Comparison Operators

- `==` (Equal to): Tests for equality.
- `!=` (Not equal to): Tests for inequality.
- `<` (Less than): Tests if a value is less than another.
- `>` (Greater than): Tests if a value is greater than another.
- `<=` (Less than or equal to): Tests if a value is less than or equal to another.
- `>=` (Greater than or equal to): Tests if a value is greater than or equal to another.

### Logical Operators

- `!` (Logical NOT): Inverts a boolean value.
- `&&` (Logical AND): Performs a logical AND operation. It can be overloaded, but an overloaded version loses short-circuit evaluation (both operands are always evaluated), so doing so is rarely a good idea.
- `||` (Logical OR): Performs a logical OR operation. The same caveat about short-circuit evaluation applies.

### Member and Pointer Operators

- `[]` (Subscript): Provides array-like access using an index.
- `()` (Function Call): Makes an object callable like a function.
- `*` (Dereference): Accesses the value a pointer points to.
- `&` (Address-of): Gets the memory address of an object (rarely overloaded).
- `->` (Member Access by Pointer): Accesses a member of an object via a pointer.
- `->*` (Pointer-to-Member Access): Accesses a member pointed to by a pointer-to-member.

### Memory Management Operators

- `new` (New): Allocates memory for an object.
- `new[]` (New array): Allocates memory for an array.
- `delete` (Delete): Deallocates memory for an object.
- `delete[]` (Delete array): Deallocates memory for an array.

### Other Operators

- `,` (Comma): Evaluates the left operand, discards the result, then evaluates the right operand.

### Alternative Operator Keywords

It's also possible to use alternative spellings for a few common operators. It can make sense for keyboards that do not have the standard symbols, but it can also be easier to read, especially when it comes to overloads. When you write many overloads back to back, it can be easier to read `operator not_eq` than `operator!=`. But it's a matter of personal preference.

- `and` (`&&`)
- `or` (`||`)
- `xor` (`^`)
- `not` (`!`)
- `and_eq` (`&=`)
- `or_eq` (`|=`)
- `xor_eq` (`^=`)
- `not_eq` (`!=`)
- `bitand` (`&`)
- `bitor` (`|`)
- `compl` (`~`)

To overload an operator using its alternative spelling as a non-member function, you write:

```cpp
bool operator not_eq(const MyClass& lhs, const MyClass& rhs);
```