Tech TP: September 2008

When does a cast change the address

Typically, when you cast a pointer to another type, you only change how the address is interpreted - the address itself stays the same. In C++, however, a cast between a derived class and one of its base classes can actually change the address.

Say you have this hierarchy:

class C : public A, public B
{
};

Then if you do this (not necessarily the recommended way, static_cast is better because it makes your intentions clearer and it is a bit safer):

C c;
A *a = (A *)&c;

Then, if c starts at 0xa34c0000 (on a 32-bit machine, say), a will hold that same address, i.e. (void *)a == (void *)&c.

But if you do this:

C c;
B *b = (B *)&c;

Then, if c starts at 0xa34c0000, b will point to 0xa34c0000 + the size of the A part of c (plus any alignment padding), because that is where the B part of c begins.

That means that if A were an abstract class whose only data is the vptr, b would be 0xa34c0004 on that 32-bit machine. Thus the above cast changes the memory address in the result, and rightly so.
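Here is a minimal sketch of that adjustment. The data members and virtual destructors are hypothetical, added only so that both bases occupy storage, and the zero offset for A assumes the layout mainstream compilers use (the language itself does not promise it).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes, named after the ones in the text. Each base has a
// virtual destructor (so it carries a vptr) and one data member.
struct A { virtual ~A() = default; int a_data = 1; };
struct B { virtual ~B() = default; int b_data = 2; };
struct C : public A, public B { int c_data = 3; };

// Byte distance from the start of the complete C object to its Base part.
template <typename Base>
std::ptrdiff_t base_offset(C &c) {
    Base *base = static_cast<Base *>(&c);   // this cast may adjust the address
    return reinterpret_cast<char *>(base) - reinterpret_cast<char *>(&c);
}

// The adjusted pointer really does point at the B part of the object.
int read_b_data(C &c) {
    B *b = static_cast<B *>(&c);
    // Assumption: A is laid out first, so B cannot share c's address.
    assert(static_cast<void *>(b) != static_cast<void *>(&c));
    return b->b_data;
}
```

With a typical compiler, base_offset<A>(c) comes out as 0 and base_offset<B>(c) as some positive number of bytes. The exact value depends on pointer size and padding.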

Probably something that is trivial for the experienced C++ programmer, but does trip you up when you are accustomed to casts not changing addresses on you - in C land, for example.

*PS: In the multiple inheritance example above, a reinterpret_cast would actually put 0xa34c0000 in b as well:
B *b = reinterpret_cast<B *>(&c);
But that is disastrous because the B part of c does not start at that address. In general, reinterpret_cast means you are telling the compiler that you know more about the underlying type than the compiler does. If you use a direct C-style cast, C++ tries a static_cast before it falls back to a reinterpret_cast, which is why it ends up doing the right thing - incrementing the address by 4.
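The three casts can be compared side by side. This is only a sketch: the classes are hypothetical stand-ins with a vptr and one data member each, and the reinterpret_cast result is inspected but never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the classes in the text.
struct A { virtual ~A() = default; int a_data = 1; };
struct B { virtual ~B() = default; int b_data = 2; };
struct C : public A, public B {};

// Returns how many bytes the C-style cast moved the pointer.
std::ptrdiff_t c_style_adjustment(C &c) {
    B *via_static      = static_cast<B *>(&c);
    B *via_c_style     = (B *)&c;
    B *via_reinterpret = reinterpret_cast<B *>(&c);  // inspected, never dereferenced

    // The C-style cast resolves to the static_cast, so both agree.
    assert(via_c_style == via_static);
    // reinterpret_cast leaves the address exactly where it was.
    assert(static_cast<void *>(via_reinterpret) == static_cast<void *>(&c));

    return reinterpret_cast<char *>(via_c_style) - reinterpret_cast<char *>(&c);
}
```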

Getting from a structure member pointer to the structure pointer

This is typically something that is required in system software - where you have a pointer to a member of a structure, and you want to get to the base address of the structure itself. In principle this is easy enough - the member sits at a fixed offset from the start of the structure, so subtracting that offset from the member's address gives you a pointer to the base. Adding up the sizes of every member that comes before it only approximates the offset, because the compiler may insert padding between members.

There are more elegant ways of getting the offset. One way I have seen is this:

Consider a structure like this:

struct mystruct {
member *a;
another_member b;
something *c;
...
};

So, if you have a pointer pc to the member c, you could get a pointer to the enclosing mystruct like this:

mystruct *s = (mystruct *)((byte *)pc - (size_t)&((mystruct *)0)->c);

What this is doing is casting the address 0 to a pointer to mystruct. Taking the address of the "c" member through that pointer gives you the relative offset of "c" in the structure, because the structure is pretending to start at address 0. All that remains is to subtract this offset from the pointer to "c", in order to get a pointer to the containing structure.
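The standard offsetof macro from <cstddef> computes the same offset without the null-pointer trick being spelled out by hand. The sketch below uses a hypothetical structure (the member types are made up for illustration) and also shows why adding up member sizes is not reliable.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical stand-in for the structure in the text.
struct mystruct {
    char   tag;   // 1 byte, usually followed by padding
    int   *a;
    double b;
    int    c;
};

// Recover the enclosing mystruct from a pointer to its member c.
mystruct *from_c(int *pc) {
    assert(pc != nullptr);
    char *raw = reinterpret_cast<char *>(pc) - offsetof(mystruct, c);
    return reinterpret_cast<mystruct *>(raw);
}

// The offset you would get by adding up the sizes of the members before c.
std::size_t naive_offset_of_c() {
    return sizeof(char) + sizeof(int *) + sizeof(double);
}
```

On common platforms offsetof(mystruct, c) is larger than naive_offset_of_c(), since the compiler pads after tag to align the pointer that follows.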

The Linux kernel does something similar - the macro container_of provides this functionality:

#define offsetof(TYPE, MEMBER) \
  ((size_t) &((TYPE *)0)->MEMBER)

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:        the pointer to the member.
 * @type:       the type of the container struct this is embedded in.
 * @member:     the name of the member within the struct.
 *
 */
#define container_of(ptr, type, member) ({ \
        const typeof( ((type *)0)->member ) *__mptr = (ptr); \
        (type *)( (char *)__mptr - offsetof(type,member) );})

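It does the same thing once you look past all the casting. The first line uses a 0-based structure only to work out the type of the member, and copies ptr into __mptr so that the compiler can complain if ptr has the wrong type. The second line subtracts the offset of the member from this address to get the base.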
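The kernel macro relies on GNU extensions (typeof and statement expressions). Below is a simplified sketch of the same idea in portable C++. CONTAINER_OF, list_node and task are hypothetical names, and this version drops the type check on ptr that the kernel macro performs.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Hypothetical list node and a structure that embeds it.
struct list_node { list_node *next = nullptr; };

struct task {
    int       id = 0;
    list_node link;
};

// Simplified container_of: step back from the member to the enclosing struct.
// No GNU extensions are used, so the type of ptr is not checked here.
#define CONTAINER_OF(ptr, type, member) \
    (reinterpret_cast<type *>(reinterpret_cast<char *>(ptr) - offsetof(type, member)))

// Typical use: code that only sees the embedded node gets back to the task.
task *task_from_link(list_node *node) {
    assert(node != nullptr);
    return CONTAINER_OF(node, task, link);
}
```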

ARM RISC philosophy

It is interesting how ARM does not take the RISC philosophy too far. Yes, it does favor simple instructions that execute in one CPU cycle. But it does allow you to have more complex instructions. For example, the load/store multiple instructions (LDM and STM) transfer several registers to or from consecutive memory locations. Such an instruction still fits into 32 bits with the op-code + operands, but it may take longer than one CPU cycle to execute. ARM accepts that trade-off.

Also, the load-store architecture is cool. It guarantees that data-processing instructions always operate on registers - never directly on memory. Memory access is slow, so it is better to keep it in separate instructions. This also helps pipelining - while an LDR is waiting on memory, the processor may be able to execute a following instruction that does not depend on the loaded data.
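To make the load-store idea concrete, here is a small sketch. The assembly in the comment is illustrative only: it is the kind of sequence a compiler targeting 32-bit ARM would typically emit, and the register choices are an assumption, not the output of any particular toolchain.

```cpp
#include <cassert>

// A read-modify-write on memory, written in C++.
void increment(int *counter) {
    assert(counter != nullptr);
    *counter += 1;
    // A load-store machine has no "add to memory" instruction, so the line
    // above becomes roughly three instructions:
    //   LDR r3, [r0]      ; load *counter from memory into a register
    //   ADD r3, r3, #1    ; arithmetic happens only on registers
    //   STR r3, [r0]      ; store the result back to memory
}
```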

Another point is code density - if ARM took the RISC philosophy too seriously, code density would take a hit, because each instruction takes 32 bits (at least in the ARM instruction set, ignoring the 16-bit Thumb set). So if you need a lot more instructions to say the same thing, your code becomes that much more bloated.


Tuesday, September 23, 2008
