
Monday, January 22, 2018

Strong Typed Identifier

Recently, I ran into one of those hard-to-track bugs where, after a while in the debugger, you go “ah-hah!” and facepalm at the root cause. In my case, it happened because two different classes used the same underlying type for their identifiers. For example, let’s assume two classes like this:

using CheeseId = unsigned int;
class Cheese
{
public:
    /* public interface */
    
    CheeseId GetId() const;
    
private:
    CheeseId m_CheeseId;
};


using SwordId = unsigned int;
class Sword
{
public:
    /* public interface */
    
    SwordId GetId() const;
    
private:
    SwordId m_SwordId;
};

The Cheese and Sword classes each have their own identifier type, set up with a using alias declaration. Since identifiers only make sense when several instances exist over time (as in collections), they can be used, or even stored, by whoever owns the instances. For example, consider this manager class:

/* Assume this class is responsible for cheese memory */
class CheeseCollectionManager
{
public:
    /* public interface */
    Cheese* InstantiateCheese();
    Cheese* FindCheese(CheeseId id) const;
    void EatCheese(CheeseId id); // not const: eating destroys the Cheese
    
    /* more stuff */
};

This is a simple example of a manager class that holds unique ownership of its instances and allows some basic operations when the identifiers are known. A Cheese is brought into the world with InstantiateCheese and is destroyed when it is eaten with EatCheese. Now, let’s suppose we have a CheeseEatingGladiator going about his day off:

class CheeseEatingGladiator
{
public:
    /* Some Other Code */
    
    void DayOffLunchTime()
    {
        m_CheeseCollectionManager->EatCheese(m_EquippedSword); // Oops
        m_CheeseForLunch = Cheese::NoCheeseId; // sentinel assumed to be declared in Cheese
    }

private:
    CheeseCollectionManager* m_CheeseCollectionManager;
    SwordId m_EquippedSword;
    CheeseId m_CheeseForLunch;
};

Oops!? While coding, I accidentally passed the SwordId to the EatCheese method. This compiled because both identifiers are plain unsigned int, and the resulting behavior is unpredictable. If both identifiers happen to hold the same value, the code works correctly, and careless automated tests might miss the bug for that very reason. If they differ, some random Cheese will suddenly disappear, and there will be a Cheese that no one may ever be able to eat, since the gladiator forgot its identifier right after eating. Worse yet, the SwordId might refer to a non-existent Cheese, and the whole world will stop existing because the gladiator segfaulted.
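The root of the problem can be checked directly: a using alias only gives an existing type a second name. Here is a minimal sketch, assuming C++11. The free functions EatCheese and EatWithSword are hypothetical stand-ins for the manager and the gladiator, not part of the original code.

```cpp
#include <type_traits>

// Both aliases name the very same type, so the compiler cannot tell them apart.
using CheeseId = unsigned int;
using SwordId = unsigned int;

static_assert(std::is_same<CheeseId, SwordId>::value,
              "type aliases do not create distinct types");

// Hypothetical stand-in for CheeseCollectionManager::EatCheese.
// constexpr only so that the result can also be checked at compile time.
constexpr CheeseId EatCheese(CheeseId id) { return id; }

// Passing a SwordId where a CheeseId is expected compiles without a diagnostic.
constexpr CheeseId EatWithSword(SwordId sword) { return EatCheese(sword); }
```

Nothing here even triggers a warning, which is exactly why the mix-up went unnoticed.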

The best way to avoid an error in a system, even one caused by its programmers, is to make the error impossible. If the compiler could detect that we mixed up a CheeseId with a SwordId and report a compile-time error, the mistake would be caught before the code ever ran. In other words, the problem is how to make a strongly typed identifier. The solution is very simple:

struct CheeseId
{
    unsigned int value;
};

struct SwordId
{
    unsigned int value;
};

Now CheeseId and SwordId can’t be mixed up! It is possible to make the code even safer by making it harder for the programmer to manipulate the value inside the structs. For example:

struct CheeseId
{
    explicit CheeseId(unsigned int value) : m_Value(value) {}
    CheeseId(const CheeseId& other) : m_Value(other.m_Value) {}
    
    CheeseId& operator=(const CheeseId& other)
    {
        m_Value = other.m_Value;
        return *this;
    }
    
    unsigned int Value() const { return m_Value; }
    
private:
    unsigned int m_Value;
};

With this, the problem of mixing different types of identifiers won’t happen again, and the risk of using an incorrect unsigned int as the identifier is mostly mitigated.
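Both claims can be verified at compile time. The following is a minimal sketch, assuming C++11 and leaving the copy operations to the compiler for brevity; the static_assert checks are my own illustration and were not part of the original code.

```cpp
#include <type_traits>

// Minimal strongly typed identifiers in the spirit of the struct above.
// Copy construction and assignment are compiler-generated here.
class CheeseId
{
public:
    explicit CheeseId(unsigned int value) : m_Value(value) {}
    unsigned int Value() const { return m_Value; }

private:
    unsigned int m_Value;
};

class SwordId
{
public:
    explicit SwordId(unsigned int value) : m_Value(value) {}
    unsigned int Value() const { return m_Value; }

private:
    unsigned int m_Value;
};

// A SwordId can no longer be passed where a CheeseId is expected...
static_assert(!std::is_convertible<SwordId, CheeseId>::value,
              "SwordId must not convert to CheeseId");

// ...and a raw unsigned int is not silently accepted either.
static_assert(!std::is_convertible<unsigned int, CheeseId>::value,
              "raw integers must not convert implicitly");

// Explicit construction still works when the programmer really means it.
static_assert(std::is_constructible<CheeseId, unsigned int>::value,
              "explicit construction must remain available");
```

The explicit keyword is what closes the second hole: without it, a call such as EatCheese(42) would quietly build a CheeseId from the integer.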

Problem solved!!!

Or not.

Now there’s a new problem: the amount of boilerplate code needed for each type of identifier. If we have dozens of identifier types, the code above gets repeated dozens of times. And once you start needing other functionality, such as hash codes or conversion to strings, DRY (Don’t Repeat Yourself) proves why it is such a good principle. Of course, C++ has a simple way to deal with this… templates! But whenever templates come up as an alternative, they bring a whole set of possibilities, each with pros and cons. I chose something like traits. Tag types or other techniques are also possible.

#include <string>

template <typename IdTraits>
struct IdType
{
    using underlying_type = typename IdTraits::underlying_type;

    IdType() : m_Value() {} // value-initialize so a default id is never garbage
    IdType(const IdType& other);
    explicit IdType(underlying_type id);
    
    underlying_type Value() const;
    std::string ToString() const;

    bool operator==(const IdType& right) const;
    bool operator!=(const IdType& right) const;
    
private:
    underlying_type m_Value;
};

I’ve omitted the function definitions. Creating a new identifier type is now simple:

struct CheeseIdTraits
{
    using underlying_type = unsigned int;
    
    // Called from IdType::ToString
    std::string ToString(underlying_type id) const
    {
        // Logic here
    }
};

using CheeseId = IdType<CheeseIdTraits>;
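To make the idea concrete, here is a minimal sketch of what the omitted definitions might look like, assuming C++11. Several details are my own assumptions rather than the original implementation: the traits function is static, the "Cheese#" and "Sword#" string formats are invented for illustration, copying is left to the compiler, and the std::hash specialization and the LunchExample function are hypothetical additions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

template <typename IdTraits>
class IdType
{
public:
    using underlying_type = typename IdTraits::underlying_type;

    IdType() : m_Value() {}  // value-initialized; copying is compiler-generated
    explicit IdType(underlying_type id) : m_Value(id) {}

    underlying_type Value() const { return m_Value; }

    // Formatting is delegated to the traits, so each identifier type can differ.
    std::string ToString() const { return IdTraits::ToString(m_Value); }

    bool operator==(const IdType& right) const { return m_Value == right.m_Value; }
    bool operator!=(const IdType& right) const { return !(*this == right); }

private:
    underlying_type m_Value;
};

struct CheeseIdTraits
{
    using underlying_type = unsigned int;
    // Assumed format, for illustration only.
    static std::string ToString(underlying_type id) { return "Cheese#" + std::to_string(id); }
};

struct SwordIdTraits
{
    using underlying_type = unsigned int;
    // Assumed format, for illustration only.
    static std::string ToString(underlying_type id) { return "Sword#" + std::to_string(id); }
};

using CheeseId = IdType<CheeseIdTraits>;
using SwordId = IdType<SwordIdTraits>;

// One hash specialization covers every identifier type built from IdType.
namespace std
{
template <typename IdTraits>
struct hash<IdType<IdTraits>>
{
    std::size_t operator()(const IdType<IdTraits>& id) const
    {
        return std::hash<typename IdTraits::underlying_type>()(id.Value());
    }
};
}

// Hypothetical usage example.
inline void LunchExample()
{
    const CheeseId brie(42u);
    assert(brie == CheeseId(42u));
    assert(brie != CheeseId(7u));
    assert(brie.ToString() == "Cheese#42");
    // brie == SwordId(42u);  // would not compile: distinct, unrelated types
}
```

Comparison, string conversion and hashing are now written once, and a new identifier type costs only a small traits struct and a using alias.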

Of course, there are some improvements that could be made, especially with the help of metaprogramming, but as it is, the IdType using type traits seems good enough for my needs. I took this chance to learn a bit more about how to use templates, and the final solution was achieved with the help of amazing people from the gamedev forums (on this topic).
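For comparison, the tag-type alternative mentioned earlier might look roughly like this. It is a sketch assuming C++11, not the approach used in this post, and the names TaggedId, CheeseTag and SwordTag are hypothetical.

```cpp
#include <type_traits>

// The Tag parameter is never instantiated; it only makes each alias a distinct type.
template <typename Tag, typename Underlying = unsigned int>
class TaggedId
{
public:
    explicit TaggedId(Underlying value) : m_Value(value) {}
    Underlying Value() const { return m_Value; }

    bool operator==(const TaggedId& right) const { return m_Value == right.m_Value; }
    bool operator!=(const TaggedId& right) const { return m_Value != right.m_Value; }

private:
    Underlying m_Value;
};

// Empty tag types, used purely as compile-time labels.
struct CheeseTag {};
struct SwordTag {};

using CheeseId = TaggedId<CheeseTag>;
using SwordId = TaggedId<SwordTag>;

static_assert(!std::is_same<CheeseId, SwordId>::value,
              "different tags must yield different types");
static_assert(!std::is_convertible<SwordId, CheeseId>::value,
              "identifiers with different tags must not convert");
```

Tags need even less code per identifier, but they leave no obvious place for per-identifier behavior such as a custom ToString, which is what the traits struct provides.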

Changing my code to use strongly typed identifiers revealed a couple more mistakes. These were harmless, but it is good to fix them before they contribute to a big, hard-to-track, hard-to-reproduce bug.

Syntax coloring on this page was done with hilite.me.
