A few days ago, I wanted two different versions a resizing array in C. One for int
, one for
char
. Of course, you can't do this is C. C doesn't have templates, doesn't have generics, doesn't
have anything like that. The only thing C does have is macros. So, I set to work.
First, let's write a simple version of a resizing array for a single type, T
(I haven't bothered
making this implementation super efficient, since this is just for illustrative purposes). The following goes in
vect.c
:
#define VECT_INIT_CAP 8
#define T int
typedef struct Vect {
size_t count, cap;
T *items;
} Vect;
Vect v_init(void) {
return {
0, VECT_INIT_CAP, malloc(VECT_INIT_CAP * sizeof(T))
};
};
void v_cleanup(Vect *v) {
free(v->items);
}
void v_push(Vect *v, T t) {
if (v->count == v->cap) {
v->cap *= 2;
v->items = realloc(v->items, v->cap * sizeof(T));
}
v->items[v->count++] = t;
}
T v_pop(Vect *v) {
return v->items[--v->count];
}
T v_get(Vect *v, size_t idx) {
return v->items[idx];
}
Looks good so far. This will work for a single type T
(in this case a typedef
of
int
).
There's a few problems we need to solve in order to get this working for whatever types we need. First, we need
a unique name for function, since C doesn't allow overloading. (So, we can't have both
v_push(vect(char), char);
and v_push(vect(int), int);
, since both have the same
name (v_push
), and C only allows one function with each name.
So, the first step to making this properly generic is to generate a unique name for the struct
and
functions, based on T
.
We can do this with a little bit of preprocessor magic:
#define Vect(T) v_##T
#define CAT(A,B) A##B
#define CAT_EXPAND(A, B) CAT(A,B)
#define VECT_FUN_NAME(T, F) CAT_EXPAND(Vect(T), F)
#define v_init(T) VECT_FUN_NAME(T, _init)
// ... and so on for the rest of the functions.
We also replace Vect
and the names of the functions with Vect(T)
and the appropriate
macros for each function.
So the signature for our push function is now void v_push(T)(Vect(T) *, T);
, for example.
Let's break down what's happening with v_push(T)
, since it involves a lot of
indirection with macros.
The preprocessor will do the following replacements (supposing we substitute have T
defined as int
, as above):
v_push(T)
-> VECT_FUN_NAME(int, _push)
VECT_FUN_NAME(int, _push)
-> CAT_EXPAND(Vect(int), _push)
CAT_EXPAND(Vect(int), _push)
-> CAT(v_int, _push)
CAT(v_int, _push)
-> v_int_push
The reason we need so many levels of indirection here is because of how the preprocessor expands things out.
##
will paste the macro arguments literally, so without the extra indirection,
CAT
would try to paste Vect(int)
and _push
literally, without expanding
them, which is an error.
Long story short, we've not got unique names for our vectors and related functions, based on T
.
(There are some obvious limitations: T
has to be a single identifier for this to work. We can get
around this by typedef
ing more complex types, if we need T
to be anything more
complicated.)
The last step to this is generating a unique version of our vector for each type we want.
Fortunately, this is pretty easy. We wrap our implementation in #define VECT_INSTANTIATE(T)
,
rather than having T
explicitly defined. Then, in order to get vectors of type int
,
we can just add VECT_INSTANTIATE(int)
to the bottom of vect.c
.
The last piece is the header. We add a similar macro to VECT_INSTANTIATE
(which I'll call
VECT_DECL
), which just declares the methods we want.
A full implementation can be found here, on my GitHub.
Since this directly generates a specific version of our vector for each type we need, there's 0 runtime overhead to doing this. There's also no code-bloat, since we get exactly one definition of each version of the vector for each type we need. (This is possibly better than C++, which could instantiate a template multiple times for the same type, but in practice I haven't found this to be a problem with C++.)
In fact, the only "overhead" we have compared with C++ is having to write VECT_DECL(T)
and
VECT_INSTANTIATE(T)
for each type we need, and having to write the type of the functions along with
each invocation. (This second one could be avoided if you want to add a few extra macros, but then it becomes
a bit more of a pain.)
In its current form, probably not. It's mostly a POC to show what kind of stuff is possible in C using the preprocessor.
Yup, I'm sure you can. And now you've got an extra dependency for your project that you have to deal with, while I've used the built-in code-generator that every C implementation comes with. I'm happy for you though.