Magic Numbers

Magic numbers are literal numeric values used directly in a program. They usually appear in expressions, as array sizes, or as constants. Some examples of magic numbers are:

    char buf[1024]; /* 1K buffer size */
    int retransmissionTimer = 100; /* Re-transmit after 100 ms */

Here the literals 1024 and 100 are the magic numbers. The literals 0 and 1 are usually not regarded as magic numbers.

So what's wrong with Magic Numbers?

Magic numbers should be avoided simply because they are confusing. A number dropped into the middle of an expression gives little indication of its meaning or purpose.

    void moveLeft()
    {
        if (curCol == 79){
            curCol = 2;
            if (curRow == 24){
                curRow = 3;
                shiftScreenUp(1);
            } else {
                curRow ++;
            }
        } else {
            curCol ++;
        }
        displayCursor();
    } 

The above code snippet is from a character-mode text editor. Because the editor operates in character mode, it is designed for a screen of 25 rows and 80 columns. However, the first row is used for the top menu, and the second and last rows are used for the border. The first and last columns are also reserved for the border, so the user can only work between the 3rd and 24th rows and between the 2nd and 79th columns. None of this information is conveyed by the literals in the code; to a person seeing the code for the first time, they might even look like a bug. However, there are ways in which this code can be improved.

Avoid Magic Numbers

If possible, avoid magic numbers by replacing them with functionality that the language already provides. For example, the following statement checks whether a character is a letter of the alphabet:

    if ((ch >='A' && ch <='Z')
      ||(ch >='a' && ch <='z'))

This check can be replaced by the function "isalpha", which the C standard library declares in <ctype.h>. Several other languages provide a similar mechanism, so the programmer does not have to code this functionality by hand.
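As a minimal sketch of this replacement, the range checks can be swapped for a call to the standard isalpha function. The helper name countLetters is hypothetical and exists only for illustration; the cast to unsigned char is there because passing a negative char value to isalpha is undefined behaviour.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the alphabetic characters in a string. */
size_t countLetters(const char *text)
{
    size_t count = 0;

    assert(text != NULL);
    for (; *text != '\0'; text++) {
        /* isalpha needs a value representable as unsigned char. */
        if (isalpha((unsigned char)*text)) {
            count++;
        }
    }
    return count;
}
```

Note that isalpha follows the current locale, so outside the default "C" locale it may accept more characters than the explicit A-Z and a-z comparison does.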

A language can help to replace magic numbers in other places too: in C, use NULL instead of the literal 0 for pointers, and use sizeof instead of hard-coding the size of a data type (which is highly recommended).
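The following sketch shows both ideas together, assuming a small record type invented for this example. The names Record, NAME_CAPACITY, clearRecord and setName are hypothetical; only NULL, sizeof, memset and strncpy come from the C language and its standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NAME_CAPACITY = 32 };   /* hypothetical size of the name field */

/* Hypothetical record type used only for this illustration. */
typedef struct {
    int  id;
    char name[NAME_CAPACITY];
} Record;

/* Clears a record without hard-coding its size in bytes. */
void clearRecord(Record *rec)
{
    assert(rec != NULL);            /* NULL instead of a literal 0 */
    memset(rec, 0, sizeof *rec);    /* sizeof instead of a byte count */
}

/* Copies a name, truncating it to fit and always terminating it. */
void setName(Record *rec, const char *name)
{
    assert(rec != NULL && name != NULL);
    strncpy(rec->name, name, sizeof rec->name - 1);
    rec->name[sizeof rec->name - 1] = '\0';
}
```

Because every size is derived with sizeof, changing NAME_CAPACITY or adding a field to Record needs no further edits in either function.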

However, literals cannot always be avoided in this manner. In such cases it is better to give the magic numbers a name.

Give Names to Magic Numbers

Instead of using literals in a program, it is better to give them names that convey their significance more clearly. These names can take the form of macros, enumerations, variables or constants. Consider the example of checking the expiry of a timer:

    if (timer.curMSecs > (timer.info.StartMSecs + 100))
       deleteTimer(timer);

The number 100 is confusing in this context. The magic number can be replaced by a name such as

    #define maxRetransmitTime 100
    ...
    if (timer.curMSecs > (timer.info.StartMSecs + maxRetransmitTime))
       deleteTimer(timer);

This makes it clearer that the literal 100 signifies the maximum time allowed for retransmission. In addition, a reader can now tell at a glance that this code deals specifically with retransmissions.

In general providing a name to a literal has the following benefits:

    1. It brings clarity to the code. Replacing 100 with maxRetransmitTime has not only made the code more understandable but also saved the programmer the effort of writing comments.
    2. The code becomes more maintainable. If the maximum retransmission time has to be increased to 200 in future, it has to be modified in a single place rather than in the hundreds of places where the literal 100 would have been used.
    3. It reduces the chance of errors. If the retransmission time needs to be changed, this has to be done in only one place. Had literals been used in the code, the programmer might forget to change the literal somewhere, leaving a bug in the code.

Use Macros

Using macros is the easiest way of replacing the literals. The above example of the text editor can be rewritten using macros as

    #define MAXROW 24 /* Last row for border */
    #define MINROW 3  /* First row for Menu and second row for border */
    #define MAXCOL 79 /* Last  Column for border */
    #define MINCOL 2  /* First Column for border */
    void moveLeft()
    {
        if (curCol == MAXCOL){
            curCol = MINCOL;
            if (curRow == MAXROW){
                curRow = MINROW;
                shiftScreenUp(1);
            } else {
                curRow ++;
            }
        } else {
            curCol ++;
        }
       
        displayCursor();
    } 

Use Enumeration

In a large application, the number of such macros would be huge. Enumerations can be used to group the literals logically.

    typedef enum screenSize {
        MAXROW = 24, /* Last row for border */
        MINROW = 3,  /* First row for Menu and second row for border */
        MAXCOL = 79, /* Last  Column for border */
        MINCOL = 2   /* First Column for border */
    } screenSize;

However, enumerations have some disadvantages. The first is the lack of type safety. In C, moreover, enumerations can represent only integer values, and not every language provides enumerations at all. What every language does provide is variables, which can be used in this case.
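The lack of type safety can be seen in a small sketch. The type name ColumnLimit and both functions are hypothetical; the point is only that a C compiler accepts an arbitrary integer where an enumeration value is expected, so a mistake can be noticed at run time at the earliest.

```c
#include <assert.h>

typedef enum {
    MINCOL = 2,   /* First column for border */
    MAXCOL = 79   /* Last column for border */
} ColumnLimit;

/* Returns 1 if value matches one of the named limits, 0 otherwise. */
int isNamedLimit(ColumnLimit value)
{
    return value == MINCOL || value == MAXCOL;
}

/* Shows that C accepts a plain integer for an enumeration type. */
ColumnLimit makeUncheckedLimit(void)
{
    ColumnLimit limit = 5;          /* accepted, though 5 is not a named limit */

    assert(!isNamedLimit(limit));   /* only a run-time check can notice */
    return limit;
}
```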

Use Variables or Constants

Almost every modern high-level programming language provides variables and constants. Constants are type-safe and can hold a value of any data type defined by the language.

    const int MAXROW = 24; /* Last row for border */
    const int MINROW = 3;  /* First row for Menu and second row for border */
    const int MAXCOL = 79; /* Last  Column for border */
    const int MINCOL = 2;  /* First Column for border */

Variables should be used only when there is a strong reason to do so, such as for achieving generality in the code. The above example can be re-written as

    void moveLeft(int MAXROW, int MAXCOL, int MINROW, int MINCOL)
    {
        if (curCol == MAXCOL){
            curCol = MINCOL;
            if (curRow == MAXROW){
                curRow = MINROW;
                shiftScreenUp(1);
            } else {
                curRow ++;
            }
        } else {
            curCol ++;
        }
       
        displayCursor();
    } 

With this approach we have not only removed the magic numbers from the original code but also generalised the function for text editors of varying size. This can be useful when multiple instances of the text editor are running, each with a different screen definition. Each instance only needs to maintain its own screen limits, while the code can be shared among the instances from a library.
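One way to let each instance maintain its own limits is to bundle them into a structure, as in the sketch below. The types ScreenLimits and Cursor and the function advanceCursor are hypothetical, and the function returns a flag instead of calling shiftScreenUp so that the example stays self-contained; otherwise it mirrors the logic of the function above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance description of the editable screen area. */
typedef struct {
    int minRow, maxRow;
    int minCol, maxCol;
} ScreenLimits;

/* Hypothetical cursor state owned by one editor instance. */
typedef struct {
    int row, col;
} Cursor;

/* Advances the cursor by one column, wrapping inside the given limits.
   Returns 1 when the caller should shift the screen up by one line. */
int advanceCursor(Cursor *cur, const ScreenLimits *lim)
{
    assert(cur != NULL && lim != NULL);

    if (cur->col != lim->maxCol) {
        cur->col++;
        return 0;
    }
    cur->col = lim->minCol;
    if (cur->row != lim->maxRow) {
        cur->row++;
        return 0;
    }
    cur->row = lim->minRow;
    return 1;
}
```

An instance for the 25 by 80 screen described earlier would initialise its limits once, for example with ScreenLimits lim = {3, 24, 2, 79};, and pass a pointer to them on every call.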

However, not every piece of functionality needs a generic implementation, and in such cases constants should be used.

Philosophy of Magic Numbers

Best practice says that no magic number other than 0 and 1 should appear in code. These two numbers are acceptable because they can be regarded as Boolean values; any other magic number is not. The rationale behind replacing magic numbers with an enum, const or macro, other than the ease of modifying values, is that we associate a kind of metadata (data about data) with the number. This metadata takes the form of the const/enum/macro name and the inline comments written where the placeholder is declared. Together, the name and the comments give a good idea of what the number means, and choosing good names for such placeholders is itself considered a good programming skill.

Someone may argue that commenting each place where a magic number is used is an equally good idea. This approach becomes a nightmare when a value is changed but the corresponding comments are not, because it is easy to miss a comment somewhere in the code. Magic numbers and comments that are out of sync can introduce more bugs at later stages, for example for maintenance engineers.

There are also a few cons to using substitutes for magic numbers. Maintenance engineers have to jump to the placeholder's declaration to see its real value (most debuggers do not show the value of a macro, so the only way to see it is to go to its declaration). This may increase development and maintenance time, but the advantages outweigh this cost.

Correct Location of the Placeholder

Even once you have agreed not to use magic numbers, the position of the substitute placeholder's declaration is important. The correct place is usually determined by the scope the placeholder needs. If the magic number was intended for use in a local scope, the placeholder can be declared as a private member of a class, or as static in C. For magic numbers with global scope, the placeholder should be visible across all the files in the project. This can be achieved by placing the enum/const/macro:

In a resource file (in Visual Studio) or a pre-compiled header (PCH), or

In a package from which it is exported, in Java, or

In a header file, which can then be included by every other file that uses the magic number.

The Magic of some numbers just won't let go

It should be kept in mind that we are only removing the literals from the body of the code; they are still present as the initialisation values of macros, variables or constants. The main intent of naming the literals is to make the code and the programmer dependent on the name rather than the value. The literals themselves still lurk inside the executable code, which cannot be avoided. In fact, these literals can sometimes be used to our benefit. Some file formats start with a unique literal value, for example:

Windows executable files start with the literal 0x4D5A, the ASCII equivalent of the string MZ (the initials of Mark Zbikowski, a designer of the MS-DOS executable file format).

JPEG files start with 0xFFD8; those in the JFIF format also contain 0x4A464946, the ASCII equivalent of the string JFIF, starting at the seventh byte.

PNG files start with 0x89504E470D0A1A0A (for \211 P N G \r \n \032 \n).

Java byte code starts with 0xCAFEBABE.

The Unix/Linux file program uses these magic numbers to identify the type of a file.
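The idea can be sketched in a few lines of C. This is a much simplified illustration and not how any real tool is implemented: the function guessFileType, its helper startsWith and the returned strings are all hypothetical, and only three of the signatures listed above are checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Leading bytes of three of the formats listed above. */
static const unsigned char PNG_MAGIC[]   = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
static const unsigned char CLASS_MAGIC[] = {0xCA, 0xFE, 0xBA, 0xBE};
static const unsigned char MZ_MAGIC[]    = {'M', 'Z'};

/* Returns 1 if the first magicLen bytes of data equal magic. */
static int startsWith(const unsigned char *data, size_t len,
                      const unsigned char *magic, size_t magicLen)
{
    return len >= magicLen && memcmp(data, magic, magicLen) == 0;
}

/* Hypothetical helper: names a format from the first bytes of a file. */
const char *guessFileType(const unsigned char *data, size_t len)
{
    assert(data != NULL);

    if (startsWith(data, len, PNG_MAGIC, sizeof PNG_MAGIC)) {
        return "PNG image";
    }
    if (startsWith(data, len, CLASS_MAGIC, sizeof CLASS_MAGIC)) {
        return "Java class file";
    }
    if (startsWith(data, len, MZ_MAGIC, sizeof MZ_MAGIC)) {
        return "MZ executable";
    }
    return "unknown";
}
```

A caller would read the first few bytes of a file into a buffer and pass the buffer and the number of bytes read; inputs shorter than a signature simply fail to match it.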

Apart from this, the programming community has been known to fill invalid memory areas with well-known magic numbers, or to return well-known magic numbers in case of memory faults, such as:

The Burroughs B6700 initialised its memory of 48-bit words to 0xBADBADBADBAD.

The Microsoft Windows LocalAlloc(LMEM_FIXED) API marked allocated but uninitialised heap memory with 0xBAADF00D.

Some IBM systems used 0xDEADBEEF to mark uninitialised memory, leading to the term "Dead Beef Error" (meaning the use of uninitialised memory).
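The technique itself is easy to sketch. The code below is not taken from any of the systems listed above; the names UNINIT_PATTERN, debugAllocWords and looksUninitialised are hypothetical, and the sketch assumes memory handed out as 32-bit words.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Marks memory that the program has not written to yet. */
#define UNINIT_PATTERN 0xDEADBEEFu

/* Hypothetical debug allocator: returns count 32-bit words,
   each filled with the pattern, or NULL if allocation fails. */
uint32_t *debugAllocWords(size_t count)
{
    uint32_t *words;

    assert(count > 0);
    words = malloc(count * sizeof *words);
    if (words == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        words[i] = UNINIT_PATTERN;
    }
    return words;
}

/* Returns 1 if the word still holds the fill pattern. */
int looksUninitialised(uint32_t word)
{
    return word == UNINIT_PATTERN;
}
```

A value such as 0xDEADBEEF showing up in a debugger or a crash dump is then a strong hint that the program read memory before writing to it.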

In the end, using names to replace literals is similar to using the DATA statement, as described in the FORTRAN manual for Xerox Computers:

"The primary purpose of the DATA statement is to give names to constants; instead of referring to pi as 3.141592653589793 at every appearance, the variable PI can be given that value with a DATA statement and used instead of the longer form of the constant. This also simplifies modifying the program, should the value of pi change."