Enum

Enum is an abbreviation for enumerated.  The overall concept is processing information where each value is one of a very limited set of values.  This is a look, from the basics to the more advanced, at how a dynamic language like Python treats enums, through to the more advanced treatment available in Kotlin.

TL;DR: Just read the headings until you find sections of interest.

ASCII characters are one of 128 possible characters (although now we use larger character sets), and the original ‘integer’ values were 0, 1, 2, … 65,535, but what about when you have your own finite set of possibilities, like expressing location as ‘at home’, ‘at work’ or simply ‘elsewhere’? Enter the Enum.

Background

With Digital Data Everything is a number.

In digital computers, every value becomes a number.  Some values, for example a speed or a length measurement, are numbers already, so storing the value as a number is obvious.  In fact, at the simplest level, we have two types of data: integers and floats, which are numbers.

Strings: All data can be a string.

But what about all the things that are not normally numbers? Since we use a language based on characters, any value we like can be reduced to a string of characters, and as there are established ways of transforming each character into a number for storage in a computer, we could simply store each item as a string of characters.  We call this the string type.  We can store anything, regardless of whether the characters happen to be numeric or not, so we could even store our numbers as strings.  The string is effectively the common representation we can use for all types of data.  In a sense, all other types are subsets of the possible strings.  Integers are the subset where only numeric characters are allowed.  Simplistically, the syntax of a float is like that of an integer with the addition of a single “.” character.  Both integer and float are just subsets of the valid strings.  In fact you could design a language where string is the only type.

Why not keep everything as a string?

There are advantages to not simply treating all data as a string.  An integer or float is not just a string restricted to numeric characters: there are also special operations that can only be done with numbers, and special relationships between numbers.  These special subsets have their own rules, which only apply to these subsets of the possible strings.  For any given length of number, there are also fewer possible alternatives than there are with strings.  Unicode, which UTF-8 encodes, has 1,112,064 valid code points, so that is the number of possible values for each character.  In a decimal integer, there are only 10 possibilities for each character.  This allows far more efficient storage of a value as an entry within the possible integers than keeping it in string format, which means keeping it as one of the far greater number of possible strings.  The fewer the possible values, the more efficient the storage.  Keeping numbers in a numeric format also simply fits the fact that computers are by nature digital, so they are most efficient with numbers.
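
As a rough illustration of the storage point, the sketch below compares the bytes needed to hold 65,535 as text with the bytes needed to hold it as a binary integer.  The number and the two-byte width are chosen only for this example.

```python
# Compare storing a number as text with storing it as a binary integer.
number = 65535

as_text = str(number).encode("utf-8")   # one byte per decimal digit
as_binary = number.to_bytes(2, "big")   # 16-bit unsigned integer

print(len(as_text))    # 5
print(len(as_binary))  # 2
```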

So now we have integer, float and string, and again, this would be sufficient, but there are advantages to even more types.

Boolean: The simplest ‘Enum’.

If you are reading this, it can be assumed you understand Boolean, but reviewing the basics is still worthwhile.  Boolean is a type where the values are not characters and not numbers, and therefore fits the classic definition of an Enum type: data with a limited number of possible values that we can ‘enumerate’. For Boolean, there are just two possible values: false and true.

Anything with a fixed number of choices fits the pattern of an enum.  Think of any data you fill out on a form where the answer is one of a fixed set of choices, such as gender: male/female/not-saying, or title: Mr/Mrs/Dr/Prof/Miss/Ms, or even agreement: strongly-agree/agree/unsure/disagree/strongly-disagree.

Each of the possible choices could be represented as a number, for example:

TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3
#  etc.......

The value for title could be kept in an integer.  In fact a language could do the same with Boolean: simply having false=0 and true=1 would cover most requirements.  So technically, no Enum values are needed at all, just as we could skip the constants altogether and simply remember that a title of Mrs is number 2.  But having enums which specifically declare what the possible values are makes the intent of the code far clearer, and the code more easily scanned for errors by a language or a reader.
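
A minimal sketch of that difference, using the Python Enum described later in this post.  The helper parse_title is hypothetical, written only for this illustration: with bare integer constants any number is accepted, while an Enum rejects a number that was never declared.

```python
from enum import Enum

# Plain integer constants: nothing stops an unknown number being stored.
TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3
title_code = 7  # accepted silently, although no title has this number


class Title(Enum):
    MR = 1
    MRS = 2
    DR = 3


def parse_title(code):
    """Return the Title for code, or None if code is not a declared value."""
    try:
        return Title(code)
    except ValueError:  # raised by Enum for a value that was never declared
        return None


print(parse_title(2))           # Title.MRS
print(parse_title(title_code))  # None
```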

History

Enums were added in Python 3.4, as described in PEP 435.  The module has also been backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4 on PyPI.

Prior to the official introduction of Enum in the language, code could use constants for the alternate values as described in the background above, or make use of a class to implement enums.  In fact libraries for Enum were sufficiently widely in use that these were the basis for Enum as added to the Python language.

Enum in Python

There is a very simple format for declaring an enum:


>>> from enum import Enum  # do import
>>>
>>> Title = Enum("Title", "Mr Mrs Dr")  # create type for title
>>> title = Title(2)  # instance using value
>>> title
<Title.Mrs: 2>
>>> title2 = Title["Dr"]  # instance using name
>>> title2
<Title.Dr: 3>
>>> title.name  # the attributes of 'title'
'Mrs'
>>> title.value
2
>>>

This example shows creating and using an Enum by specifying the name of each alternative within the Enum.  In Python, each alternative also has a value, and with this format the values are assigned automatically, in a sequence starting with 1, so this case uses the values 1, 2 and 3.

Why start at one and not zero? Because in Python the number zero is also considered false, and the design is for every member to be true by default.
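
The effect can be checked with a small sketch.  It assumes the start keyword of the same short format to force the sequence to begin at zero, and brings in IntEnum (not otherwise covered in this post) purely as a contrast, since its members behave like plain integers.

```python
from enum import Enum, IntEnum

# start=0 overrides the default first value of 1
Title = Enum("Title", "Mr Mrs Dr", start=0)
IntTitle = IntEnum("IntTitle", "Mr Mrs Dr", start=0)

print(bool(Title.Mr))        # True: members of a plain Enum are always true
print(bool(Title.Mr.value))  # False: the value 0 itself is false
print(bool(IntTitle.Mr))     # False: an IntEnum member behaves like the int 0
```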

An alternative form allows choosing values as well as names.

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>
>>> title2 = Title["Dr"]
>>> title2.name
'Dr'
>>> title2.value
5
>>>

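Note: this does allow starting at zero if you wish.  A member of a plain Enum is still true when its value is zero, but the value itself is false, so problems can arise if the value is tested directly.

This alternative form can also work with automatic values: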


>>> from enum import Enum, auto
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = auto()  # using all automatic values
...     Dr = auto()
...
>>> title2 = Title.Dr  # same value as Title["Dr"]
>>> title2
<Title.Dr: 3>
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = 5       # chosen value; auto() can be mixed with other values
...     Dr = auto()   # value will follow from previous chosen
...
>>> title2 = Title.Dr
>>> title2
<Title.Dr: 6>

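In all these examples, the value has been an integer, but a value can be anything: a float, a string, None, True or False, and even a mutable list or a user-created class.  In the Python versions available when this was written, auto() still works when mixed with other values, but ignores any non-numeric ones, so

Mr = auto()
Mrs = None # could also be a string or even [1,2,3]
Dr = auto()
#  will have values 1, None, 2

Treat this as version-dependent: from Python 3.11, mixing auto() with values that cannot be sorted and incremented gives a DeprecationWarning, which states that a later version will require such values.  Also note that True is equal to 1, so a member given the value True after an automatic 1 becomes an alias for that member rather than a separate one.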
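
A minimal sketch of non-integer values, avoiding auto() so that it does not depend on the Python version.  The particular values are arbitrary choices for illustration.

```python
from enum import Enum


class Title(Enum):
    Mr = "mister"      # string value
    Mrs = 2.5          # float value
    Dr = [1, 2, 3]     # mutable list value


print(Title("mister"))   # Title.Mr, found from its string value
print(Title.Mrs.value)   # 2.5
print(Title.Dr.value)    # [1, 2, 3]
```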

This has provided a simple review of enums in Python, and illustrated the general concept.  See the official Python documentation for more details.  Note that some cases provided by Python are oriented to supporting implementations made before Enum was added to the language.

Enum in Kotlin

enum class Title{
    Mr(),
    Mrs(),
    Dr()
}
val title = Title.Mrs
println("$title  ${title.ordinal} ${title.name}")

This is almost an exact match for the Python ‘auto()’ allocation of values.  The differences are about ordinal vs value.  Firstly, the name is different (ordinal vs value), and ordinal starts from the more usual zero, not one, since Kotlin does not need to preserve the ‘Boolean safety’ that Python does.

But the biggest difference is that ordinal is fixed.  Like the indexes of an array, it must be the sequence of integers from zero, e.g. 0, 1, 2, …  It cannot be of another type, nor any other integers.
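
For comparison, a Python Enum has no ordinal attribute, but it does keep its members in definition order, so an equivalent position can be computed.  The ordinal helper below is hypothetical, written only to show the contrast between a fixed position and a freely chosen value.

```python
from enum import Enum


class Title(Enum):
    Mr = 2
    Mrs = 3
    Dr = 5


def ordinal(member):
    """Zero-based position of member in definition order (hypothetical helper)."""
    return list(type(member)).index(member)


print([ordinal(t) for t in Title])  # [0, 1, 2]
print([t.value for t in Title])     # [2, 3, 5]
```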

However, with Kotlin, unlike the value in Python, ordinal is not the only attribute beyond the name: you can add whatever extra properties you wish by adding them to the enum class.

enum class Title(val value:Int){
    Mr(2),
    Mrs(3),
    Dr(5)
}
println(Title.Mrs.value)
 

This does not provide an exact equivalent for value in Python, as there is no built-in method to find the enum entry from the value, as there is here:

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>

This does not allow ‘value’ to vary in type between class members either, but that could be done in this manner (although it is difficult to see the use):

enum class Title{
    Mr() {val value = 2},
    Mrs() { val value = "abc" }
}

Although this can be done, creating value this way results in reflection being needed to access value, so this is not practical.  To have different properties, objects should have different classes, so enter sealed classes.

Generally, the Kotlin Enum is an effective replacement for the Python Enum.  Where it gets interesting is that in Kotlin it does not stop there.

Beyond Enum: Sealed Classes

So far our example has been ‘Title’, and keeping it brief has meant that the examples have not even attempted to cover all possibilities.  While any reasonable list would of course include Ms and Miss, what about Dame, Madam, Duchess, Duke or Sir?  In Italy Engineer is actually a title, so we need to add Eng.  Could we ever be sure to cover every title a person might consider as their appropriate title?  One solution is to simply have people tick “other”.  A problem is that simply recording “other” is almost the same as having no data.  While adding a “please specify” may feel better, without actually recording what is specified, this is almost misleading.

Neither a Python Enum nor a Kotlin Enum can handle the “other” case.  Every value in an enum is effectively mapped to a number.  Why not add a “specify” property?  Because it does not achieve the goal.  If we have four possible values “Mr/Mrs/Dr/Other”, these could be mapped to 1, 2, 3, 4 for storage.  Each choice is just one value.  Every time “Other” is recorded it is exactly the same, including any properties of Other, so there is only one value of “specify”.  To have additional values for specify we need new entries in the list in the Enum definition, something like “Mr/Mrs/Dr/Other1(“Miss”)/Other2(“Ms”) …”.  In other words, we still need to state every possible value of all properties at the time the Enum is declared.
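
A minimal sketch of the problem in Python.  The text attribute is a hypothetical stand-in for the “specify” property: because there is only one Other member, whatever is attached to it is shared by every use of Other.

```python
from enum import Enum


class Title(Enum):
    Mr = 1
    Mrs = 2
    Dr = 3
    Other = 4


first = Title.Other
second = Title(4)

print(first is second)  # True: both names refer to the one Other member

# Attaching text to one "Other" attaches it to all of them,
# because there is only a single Other object.
first.text = "Dame"
print(second.text)      # Dame
```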

Now consider having a custom type (and a type is really a class) for each possible value.  So we have a Mr class, a Mrs class, a Dr class and an Other class.  Then we can have the classes as the possible values for our enum.  Our Enum is no longer constrained (in this example) to four specific values for Mr/Mrs/Dr/Other, but instead to four specific classes, and each class can have any number of different objects of that class.  In a standard Enum type with four possible values, a variable of that type could be stored as a number between 1 and 4 (or 0 to 3).  With a sealed class with four possible types, a reference to an object matching one of the four classes is required, not just a number indicating which class is matched.  Despite this significant difference in how things work internally, the code can look very similar.  This sample attempts to look as similar as possible to the enum example, but needs adjustment before being useful.

sealed class Title {  // no class parameter required
    object mr   // each entry can be an object definition or class definition
    object mrs  // note this sample shows entries with no base class ...
    object dr   // ... which is valid syntax but not very useful
    class Other(val text:String)  // introduction of class member of 'enum'
}

The above example is difficult to use, as the alternative values have no common class, so a different variable type is required for each value.

The example below is not just valid syntax, but provides the recommended structure.

sealed class Title(val value:Int) {  // value to match previous enum examples
    object mr: Title(0)   // each object or class based on sealed class
    object mrs: Title(1)  // which gives a common base class
    class Dr: Title(2)   // entries matching the enum pattern can be class or object
    // but entries making use of the power of sealed classes must be a class
    class Other(val text:String): Title(5)
}

In Kotlin, object declares a singleton class where there is only ever a single instance of the object.  This means there will only ever be one instance of mr, which is referenced everywhere, similar to the way Enum is implemented.

Both mr and mrs could also be implemented as classes, but since there is no data at all in objects of these classes, every instance of Dr (or of a Mr or Mrs class) would be identical and therefore could be a reference to the same instance.  But the goal of using sealed classes is to provide for classes which do have data, so every instance has the potential to be different.
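
To tie this back to the Python half of this post, here is a rough analogue using ordinary Python classes.  It is only a sketch of the idea: Python has no sealed classes, so nothing stops further subclasses being added, and the describe helper is hypothetical.

```python
class Title:
    """Common base class; plays the role of the sealed class."""
    value = None


class Mr(Title):
    value = 0


class Mrs(Title):
    value = 1


class Dr(Title):
    value = 2


class Other(Title):
    value = 5

    def __init__(self, text):
        self.text = text  # data that differs from instance to instance


def describe(title):
    """Return display text for any Title (hypothetical helper)."""
    if isinstance(title, Other):
        return title.text
    return type(title).__name__


titles = [Mr(), Other("Dame"), Other("Sir")]
print([describe(t) for t in titles])  # ['Mr', 'Dame', 'Sir']
print([t.value for t in titles])      # [0, 5, 5]
```

Every Other shares the value 5, yet each instance carries its own text, which is exactly what the single Other entry of an Enum could not do.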

Another example of sealed classes can be found here, with the code here.  Plus the official kotlin information could also be useful.

If there are questions, add a comment and I will attempt to address them.

 
