Jim T edited this page Mar 10, 2019 · 1 revision

The main thing that C++ adds to C is the concept of objects. Objects aren't just another feature available in the language; they are its foundational structure, and will be the primary tool you use when creating any project.

This change requires something of a shift in headspace. It's one of those things that, if you don't get it immediately, will become a massive "oh! now I get it!" moment later.

In essence, C++ encourages you to define your own types and write the code within those types to make them behave however you need to. This means you can write code that is entirely self-contained without polluting the rest of the application space with unexpected clutter.

The pattern is that you write a class and put your code & any variables you need into it. Then the outside program makes instances of your classes, called objects. Each object gets its own copy of all the variables it holds. When you're writing the code to implement an object, you only naturally get access to the variables you declare in your class.

As an example:

#include <climits>   // INT_MAX
#include <iostream>  // std::cout, std::endl

class Counter {
private:
  int count = 0;

public:
  void increment() {
    if (count == INT_MAX) {
      count = 0;
    } else {
      count += 1;
    }
  }

  int value() {
    return count;
  }
};

int main() {
  Counter a;
  Counter b;

  std::cout << "Initial values - a: " << a.value() << ", b: " << b.value() << std::endl;

  // Increment a twice
  a.increment();
  a.increment();

  // Increment b once
  b.increment();

  std::cout << "Final values - a: " << a.value() << ", b: " << b.value() << std::endl;
}

When run, this should output:

  Initial values - a: 0, b: 0
  Final values - a: 2, b: 1

This small snippet demonstrates quite a bit. First of all, notice that the main function declares 2 instances of the Counter type. They are completely independent, just as if you'd declared 2 integer variables instead. You can have as many instances of your class in your code as memory allows.

Also notice how the code inside the Counter class accesses only the member variables on its class. For example, inside the increment method, we only have access to the member variables of our class (in this case, count), any local variables we declare in the method, and the global variables.

One thing that's not obvious in the above example is that there is no way for the code in the main function to tamper with the counters. If you tried to write a.count = 4; in the main function, it wouldn't compile. This is because the count member of the Counter class is private. private means that a member can only be used by the class that defines it (or by code that class explicitly declares as a friend). This can be used to prevent misuse of objects by outside code that might not have the full picture of what needs to be done. The only things outside code can do with our counters are increment them by 1 and read their value. It can't reset them or randomise their values. How their state changes is defined solely by the methods on the class.

If you wanted to give the main() function access to the count member, change private: to public: in the above snippet. Then main can set the counter however it likes. Also notice that we now have no guarantees about what the counter value may be. The count value might be negative, something that couldn't happen before.
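
As a minimal sketch of that loss of guarantees, the variant below makes count public. OpenCounter and tamper_with_counter are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// Hypothetical variant of Counter with its data member made public.
class OpenCounter {
public:
  int count = 0;

  void increment() { count += 1; }
  int value() const { return count; }
};

// Outside code can now write to count directly, bypassing increment().
int tamper_with_counter() {
  OpenCounter c;
  c.increment();
  assert(c.value() == 1);

  c.count = -5;      // compiles only because count is public
  return c.value();  // the counter now reports a negative value
}
```

With the private version of Counter, the line c.count = -5; would be rejected by the compiler.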

There is another way of declaring a class, using the struct keyword. The only difference between a struct and a class is the default access level: in a struct, members (and base classes) are public by default, whereas in a class they are private until you say otherwise. The following definitions are exactly the same:

class Customer {
public:
  std::string name;
  std::string address;
};

struct Customer {
  std::string name;
  std::string address;
};

There is literally no difference between these declarations.

The definitions above don't actually create anything. Defining a class does not allocate any memory, nor does it run any code. It only describes a blueprint that can be used to create objects later.
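
A small sketch of the distinction: the struct definition only describes the type, and storage appears when objects are declared. The function name customers_are_independent is an assumption made for this example.

```cpp
#include <cassert>
#include <string>

// Only describes the type; no Customer exists yet.
struct Customer {
  std::string name;
  std::string address;
};

bool customers_are_independent() {
  Customer alice;  // an actual object is created here
  Customer bob;    // a second, separate object

  alice.name = "Alice";
  bob.name = "Bob";

  // Each object holds its own copy of every member.
  assert(alice.address.empty());
  return alice.name == "Alice" && bob.name == "Bob";
}
```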

Class and struct definitions can be nested. For example:

struct Customer {

  class Counter {
    int count = 0;
    
    public:
      void increment() {
        count = (count + 1) % INT_MAX;
      }

      int value() {
        return count;
      }
  };

  std::string name;
  std::string address;
  Counter orders;
  Counter complaints;

  void addOrder();
};

As an exercise, let's implement the addOrder method and see what's available to us:

void Customer::addOrder() {
  orders.count = 5; // This will fail, we don't have access to private members even if they're declared inside nested classes. How could we get access if we needed it and it made sense?
  complaints.increment(); // Yes, we have access to the increment method of our members.

  count = 4; // This will fail. We don't have our own count member variable. We can't access the member variables of other structs directly, we need to say which instance we're accessing - orders or complaints. Otherwise there's no way for the compiler to know.
}

Here, our Customer has 2 counters, the number of orders and the number of complaints. It makes sense that these are separate things that would be tracked independently. The code inside addOrder is not intended to make sense as an implementation, it's more to show what we have access to and what we don't.
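
The comment in addOrder asks how access could be granted if it made sense. One possible answer, sketched here as an assumption rather than as the intended solution, is a friend declaration inside Counter. The resetOrders method and the orders_after_reset helper are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <string>

struct Customer {
  class Counter {
    int count = 0;

    // Grants the enclosing struct access to Counter's private members.
    friend struct Customer;

  public:
    void increment() { count += 1; }
    int value() const { return count; }
  };

  std::string name;
  Counter orders;

  void resetOrders();  // hypothetical method, not in the original example
};

void Customer::resetOrders() {
  orders.count = 0;  // allowed only because of the friend declaration
}

int orders_after_reset() {
  Customer c;
  c.orders.increment();
  c.orders.increment();
  assert(c.orders.value() == 2);

  c.resetOrders();
  return c.orders.value();
}
```

A friend declaration widens access for exactly one named class, so the rest of the program still cannot touch count.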

Notice how this method is implemented outside of the class definition. This is the common way to implement methods. Generally the class definition is kept as short as possible to act as a quick reference to what's available in the class. The implementation, which can often be longer, gets broken out. Usually we put class definitions into a .hpp file, and the implementation code goes into a .cpp file. The implementation is still limited to accessing only the members on the current instance of the class.
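
As a rough sketch of that split, here is the earlier Counter with its definition and implementation separated. Both parts are shown in one block so it compiles on its own; the file names counter.hpp and counter.cpp, and the count_twice helper, are assumptions for illustration.

```cpp
#include <cassert>
#include <climits>

// ---- would live in counter.hpp (hypothetical file name) ----
class Counter {
  int count = 0;

public:
  void increment();   // declared here, implemented elsewhere
  int value() const;
};

// ---- would live in counter.cpp, after #include "counter.hpp" ----
void Counter::increment() {
  // Wrap around to 0 instead of overflowing.
  count = (count == INT_MAX) ? 0 : count + 1;
}

int Counter::value() const {
  return count;
}

// ---- code that uses the class only needs the header part ----
int count_twice() {
  Counter c;
  c.increment();
  c.increment();
  assert(c.value() == 2);
  return c.value();
}
```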

Circular dependencies

Every time a class references another class, that's a dependency. It is entirely possible to get into a situation where two classes each reference each other. This is a problem, because the C++ compiler will only read your code from top to bottom once. As it goes, it allocates tables of data sizes and offsets into structures. If at any point it can't work out how large something is or where a method is, it will fail.

Let's consider a situation where a widget has a button, and when the button is activated, it sets a variable in the widget. How might we try to model this?

struct Widget {
   bool flashing = false;
   Button startFlashing;
};

struct Button {
   Widget parent;
   void onAction() {
     parent.flashing = true;
   }
};

This approach just won't work. When the compiler comes across struct Widget, it tries to calculate how many bytes it needs to allocate when someone creates a widget. The widget contains two things. The first is a bool, which is fine: a single byte is typically enough for that. The second is a Button, but the compiler hasn't seen a Button yet. Remember, it only reads from top to bottom, and only once. So it can't work out how big the Widget should be, and it fails.

What if we reversed the definitions?

struct Button {
   Widget parent;
   void onAction() {
     parent.flashing = true;
   }
};

struct Widget {
   bool flashing = false;
   Button startFlashing;
};

Well, then we'd get the same situation, the Button needs to contain a Widget, but we don't know how big a widget is because we haven't seen it. So we don't know how many bytes to allocate for our Button objects, so we fail.

There's a deeper and more insidious problem here too. The button actually contains a new instance of a widget. Imagine if we could compile this. When we create our top-level widget, we create a button, because a widget contains a button. Then the button has its own complete Widget inside it, so we create one of those - but a widget has a button inside ... and so on. We don't want our button to have its own widget, we want our button to know about its parent widget.

We can do this with pointers.

struct Button {
   Widget *parent;
   void onAction() {
     parent->flashing = true;
   }
};

struct Widget {
   bool flashing = false;
   Button startFlashing;
};

Notice our button now contains a pointer to a Widget, not an entire widget instance. The compiler will still fail, because as it reads down, it doesn't know what a Widget is. But we're closer. One useful trick C++ gives us is a "forward declaration". The only job of this statement is to tell the compiler "don't worry, it's coming". A forward declaration looks like this:

struct Widget;

Notice there's no clue as to what's in the Widget, what its base class is, or anything else. It just says that there's something called a Widget and we'll find it later.

Now we have:

struct Widget;

struct Button {
   Widget *parent;
   void onAction() {
     parent->flashing = true;
   }
};

struct Widget {
   bool flashing = false;
   Button startFlashing;
};

The compiler is now perfectly happy with the pointer to the parent widget. It knows how many bytes to allocate for a pointer to a struct regardless of what the struct contains (typically 64 bits on a current 64-bit machine). The code will now fail when compiling the onAction method instead. This is because, as we read from top to bottom, the compiler knows Widget exists and knows how to create a pointer to it, but now we've got to access something inside the object we're pointing to. The compiler doesn't know where the flashing variable is inside the Widget structure, because all it has seen is the forward declaration. So it can't output the code needed to say, for example: "go 4 bytes after the address in parent and store the byte value 1 there". Basically, it hasn't seen the flashing variable defined, so it can't use it. We could try adding flashing to the forward declaration, but then we're back to the situation where Widget was defined first, which doesn't work.
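
The point about pointer sizes can be checked directly. This sketch assumes nothing beyond the forward declaration; widget_pointer_size is a name invented for the example.

```cpp
#include <cstddef>

struct Widget;  // forward declaration only: Widget is an incomplete type here

// The size of a pointer to Widget is known even though Widget's size is not.
constexpr std::size_t widget_pointer_size = sizeof(Widget*);

static_assert(widget_pointer_size > 0, "the pointer size is known at this point");

// By contrast, sizeof(Widget) at this point would be a compile error,
// because the compiler has not yet seen what Widget contains.
```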

What we need to do is break out the implementation of onAction. We can do it like this:

struct Widget;

struct Button {
   Widget *parent;
   void onAction(); // name the method, give its return type and list all the parameters, but no implementation
};

struct Widget {
   bool flashing = false;
   Button startFlashing;
};

void Button::onAction() { // Actually implement onAction
  parent->flashing = true;
}

Now as we read from top to bottom, we see: Widget exists and will be defined. Button exists, has a pointer to a Widget, and has a method called onAction which will be defined later. Then we see our Widget, which has a variable called flashing and a Button variable called startFlashing. Then we see a function that belongs to the Button class, which is fine, because we've seen that class. It's called onAction, which we've already seen declared, so we can tie those up. It accesses the parent variable, which is declared on the Button class, so that's good. And it accesses flashing; we've seen the flashing variable and where it sits inside the Widget, so that's all good.

The only problem now is that we've never set the parent pointer. If we ran the code as it stands, the pointer would hold an indeterminate value, whatever happened to be in that memory before, and we'd try to overwrite something that isn't actually a Widget. That would probably cause a crash and would almost certainly corrupt memory.

We need to set the pointer on the Button. Since that means introducing some logic, we need a constructor. A constructor is a method that gets run when an object is first created.

struct Widget;

struct Button {
   Widget *parent;
   void onAction(); // name the method, give its return type and list all the parameters, but no implementation
};

struct Widget {
   bool flashing = false;
   Button startFlashing;

   Widget() {
     startFlashing.parent = this; // startFlashing is a member object, not a pointer, so use '.'
   }
};

void Button::onAction() { // Actually implement onAction
  parent->flashing = true;
}

This will now work fine. We need to do some work whenever we create a Widget object, so we add a constructor. The constructor sets the parent value of the Button to itself, so now the Button can call back into us.
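
A short usage sketch of this version follows. The nullptr default on parent and the press_button helper are additions made for this example and are not part of the snippet above.

```cpp
#include <cassert>

struct Widget;

struct Button {
  Widget *parent = nullptr;  // assumption: default to nullptr until the owner sets it
  void onAction();
};

struct Widget {
  bool flashing = false;
  Button startFlashing;

  Widget() {
    startFlashing.parent = this;  // wire the button back to its owner
  }
};

void Button::onAction() {
  parent->flashing = true;
}

// Creates a widget, presses its button, and reports whether it is flashing.
bool press_button() {
  Widget w;
  assert(!w.flashing);        // nothing has happened yet
  w.startFlashing.onAction(); // the button calls back into its parent widget
  return w.flashing;
}
```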

More safety

The approach above was the cheap, nasty, minimal, but effective way to break this kind of circular dependency. There are ways to make this much better. For example, it's very easy to forget to set the parent of the Button. We can have the compiler give us an error if we forget to do this. We do this by creating a constructor on the Button that needs a Widget pointer passed into it. If we have this, we can't create a Button without passing in a Widget.

struct Widget;

struct Button {
   Widget *parent;

   Button(Widget* p) {
     parent = p;
   }

   void onAction(); // name the method, give its return type and list all the parameters, but no implementation
};

struct Widget {
   bool flashing = false;
   Button startFlashing;

   Widget() : startFlashing(this) {
   }
};

void Button::onAction() { // Actually implement onAction
  parent->flashing = true;
}

This is a lot safer as now we can't forget to set the parent widget. Notice that the form of the constructor for Widget has changed. This is the syntax for passing values into the constructor for our member variables.
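
One way to confirm that the compiler now enforces this is with the standard type traits. The static_assert checks and the button_points_at_owner helper are assumptions added for illustration; the Button constructor here uses an initializer list, which has the same effect as the assignment shown above.

```cpp
#include <cassert>
#include <type_traits>

struct Widget;

struct Button {
  Widget *parent;

  Button(Widget *p) : parent(p) {}

  void onAction();
};

struct Widget {
  bool flashing = false;
  Button startFlashing;

  Widget() : startFlashing(this) {}
};

void Button::onAction() {
  parent->flashing = true;
}

// A Button can no longer be created without being told about a Widget.
static_assert(!std::is_default_constructible<Button>::value,
              "Button requires a Widget pointer");
static_assert(std::is_constructible<Button, Widget *>::value,
              "Button can be built from a Widget pointer");

bool button_points_at_owner() {
  Widget w;
  assert(!w.flashing);
  return w.startFlashing.parent == &w;
}
```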

Another problem is that Button knows about Widget explicitly. This limits our button to only interacting with things that are derived from Widget. This might be fine, but if it's not, we'll need another more flexible approach. This would be to pull out the things that are common between Button and Widget into a separate struct and use that. This would solve all our circular dependency issues in one go.

struct FlashingThing {
  bool flashing = false;
};

struct Button {
   FlashingThing *parent;

   Button(FlashingThing* p) {
     parent = p;
   }

   void onAction() { // Actually implement inline, we know everything we need to about flashing things
     parent->flashing = true;
   }
};

struct Widget {
   FlashingThing isFlashing;
   Button startFlashing;

   Widget() : startFlashing(&isFlashing) {
   }
};

This is one way to break the cycle. The Widget has a flashing thing, which it shows to the Button and the Button can use it with impunity. Anything can contain a flashing thing, so we're not limited to dealing with Widgets. Notice there's also another option:
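
To illustrate that the Button is no longer tied to Widgets, here is a sketch with a second owner. Lamp, its members, and lamp_starts_flashing are hypothetical names invented for this example.

```cpp
#include <cassert>

struct FlashingThing {
  bool flashing = false;
};

struct Button {
  FlashingThing *parent;

  Button(FlashingThing *p) : parent(p) {}

  void onAction() {
    parent->flashing = true;
  }
};

// Hypothetical owner that has nothing to do with Widget.
struct Lamp {
  FlashingThing light;
  Button powerSwitch;

  Lamp() : powerSwitch(&light) {}
};

bool lamp_starts_flashing() {
  Lamp lamp;
  assert(!lamp.light.flashing);
  lamp.powerSwitch.onAction();  // the same Button code drives a Lamp
  return lamp.light.flashing;
}
```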

struct FlashingThing {
  bool flashing = false;
};

struct Button {
   FlashingThing *parent;

   Button(FlashingThing* p) {
     parent = p;
   }

   void onAction() { // Actually implement inline, we know everything we need to about flashing things
     parent->flashing = true;
   }
};

struct Widget : FlashingThing {
   Button startFlashing;

   Widget() : startFlashing(this) {
   }
};

Notice in this case, Widget derives from FlashingThing. We're saying Widget is a FlashingThing. And we tell our button about us with the this pointer in the constructor.
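
A brief usage sketch of the derived version: because Widget is a FlashingThing, the flashing member is reached directly on the widget. The derived_widget_flashes helper is an assumption for this example.

```cpp
#include <cassert>

struct FlashingThing {
  bool flashing = false;
};

struct Button {
  FlashingThing *parent;

  Button(FlashingThing *p) : parent(p) {}

  void onAction() {
    parent->flashing = true;
  }
};

struct Widget : FlashingThing {
  Button startFlashing;

  // this is a Widget*, which converts implicitly to FlashingThing*.
  Widget() : startFlashing(this) {}
};

bool derived_widget_flashes() {
  Widget w;
  assert(!w.flashing);         // inherited member, initially false
  w.startFlashing.onAction();
  return w.flashing;
}
```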
