Enum

Enum is an abbreviation for enumerated.  The overall concept is processing information where each value is one of a very limited set of values.  This is a look, from the basics to the more advanced, at how a dynamic language like Python treats enums, through to the more advanced treatment available in Kotlin.

TLDR; Just read the headings until you find sections of interest.

An ASCII character is one of 128 possible characters (although we now use larger character sets), and the original ‘integer’ values were 0, 1, 2, … 65,535. But what about when you have your own finite set of possibilities, like expressing location as ‘at home’, ‘at work’ or simply ‘elsewhere’? Enter the Enum.

Background

With Digital Data Everything is a number.

In digital computers, every value becomes a number.  Some values, for example a speed or a length measurement, just are numbers already, so storing the value as a number is obvious.  In fact at the simplest level, we have two types of data: integers and floats, which are numbers.

Strings: All data can be a string.

But what about all the things that are not normally numbers? Since we use a language based on characters, any value we like can be reduced to a string of characters, and as there are established ways of transforming each character into a number for storage in a computer, we could simply store each item as a string of characters.  We call this the string type.  We can store anything, regardless of whether the characters happen to be numeric or not, so we could even store our numbers as strings.  The string is effectively the common representation we can use for all types of data.  In a sense, all other types are subsets of the possible strings.  Integers are the subset where only numeric characters are allowed.  Simplistically, the syntax of a float is like an integer with the addition of a single “.” character.  Both integer and float are just subsets of the valid strings.  In fact you could design a language where string is the only type.

Why not keep everything as a string?

There are advantages to not simply treating all data as a string. An integer or float is not just a string restricted to numeric characters: there are also special operations that can only be done with numbers, and special relationships between numbers.  These special subsets have their own rules, which only apply to these subsets of the possible strings.  For any given length, there are also fewer possible alternatives than there are with strings. In Unicode (the character set that UTF-8 encodes) there are 1,112,064 possible values for each character.  In an integer, there are 10 possibilities for each character.  This allows far more efficient storage as an entry within the possible integers than as a string, which means keeping the value as one of the far greater number of possible strings.  The fewer the possible values, the more efficient the storage.  Keeping numbers in a numeric format also simply works with the fact that computers are by nature digital, so they are most efficient with numbers.
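As a rough illustration of the storage point, the sketch below compares the bytes needed to hold 65,535 as text and as a binary integer, and shows an operation that behaves differently on numbers and strings. It uses only built-in Python, and the variable names are just for illustration.

```python
# The same value held as a string of digit characters and as an integer.
as_text = "65535"
as_number = 65535

# Encoded as UTF-8 text, each digit takes one byte: 5 bytes in total.
text_bytes = as_text.encode("utf-8")

# As an unsigned binary integer the value fits in 2 bytes.
number_bytes = as_number.to_bytes(2, "big")

print(len(text_bytes), len(number_bytes))  # prints: 5 2

# Arithmetic is one of the 'special operations' that only numbers have.
print(as_number + 1)  # prints: 65536
print(as_text + "1")  # prints: 655351 (string joining, not addition)
```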

So now we have integer, float and string, and again, this would be sufficient, but there are advantages to even more types.

Boolean: The simplest ‘Enum’.

If you are reading this, it can be assumed you understand Boolean, but reviewing the basics is still worthwhile.  Boolean is a type where the values are not characters and not numbers, and therefore fits the classic definition of an Enum type: data with a limited number of possible values that we can ‘enumerate’. For Boolean, there are just two possible values: false and true.

Anything with a fixed number of choices fits the pattern of an enum.  Think of any data you fill out on a form where the answer is one of a fixed set of choices, such as gender: male/female/not-saying, or Title: Mr/Mrs/Dr/Prof/Miss/Ms, or even agreement: strongly-agree/agree/unsure/disagree/strongly-disagree.

Each of the possible choices could be represented as a number, for example:

TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3
#  etc.......

The value for title could be kept in an integer.  In fact a language could do the same with Boolean: simply having false=0 and true=1 would cover most requirements.  So technically, no Enum values are needed at all, just as we could not even bother with the constants and simply remember that a title of Mrs is number 2.  But having enums which specifically declare what the possible values are makes the intent of the code far clearer, and the code more easily scanned for errors by a language or a reader.
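A minimal sketch of that difference in Python follows. The helper `parse_title` and the upper-case member names are hypothetical, chosen only for this illustration: a bare integer accepts any number, while an enum rejects numbers that are not one of the declared alternatives.

```python
from enum import Enum

# Plain integer constants: nothing stops a meaningless value being stored.
TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3

title_as_int = 7  # not a valid title, but still a perfectly valid integer


class Title(Enum):
    MR = 1
    MRS = 2
    DR = 3


def parse_title(number):
    """Return the matching Title, or None if the number is not a valid title."""
    try:
        return Title(number)
    except ValueError:
        return None


print(parse_title(2))  # prints: Title.MRS
print(parse_title(7))  # prints: None
```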

History

Enums were added in Python 3.4, as described in PEP 435. They have also been backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4 on PyPI.

Prior to the official introduction of Enum in the language, code could use constants for the alternative values as described in the background, or make use of a class to implement enums.  In fact libraries for Enum were sufficiently widely in use that these were the basis for Enum as added to the Python language.

Enum in Python

There is a very simple format for declaring an enum:


>>> from enum import Enum # do import
>>>
>>> Title = Enum("Title", "Mr Mrs Dr") # create type for title
>>> title = Title(2) # instance using value
>>> title
<Title.Mrs: 2>
>>> title2 = Title["Dr"] # instance using name
>>> title2
<Title.Dr: 3>
>>> title.name # the attributes of 'title'
'Mrs'
>>> title.value
2
>>>

This example shows creating and using an Enum by specifying the names of each alternative within the Enum.  In Python, each alternative also has a value, and with this format, values for each alternative are assigned automatically, in a sequence starting with 1, so this case uses the values 1, 2 and 3.

Why start at one and not zero? Because in python the number zero can also be considered false, and the design is for every value by default to be true.
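The point can be checked with a short sketch using only the standard `enum` module. The variable `all_truthy` is just an illustrative name.

```python
from enum import Enum

# Functional form: values are assigned automatically, starting at 1.
Title = Enum("Title", "Mr Mrs Dr")
print([member.value for member in Title])  # prints: [1, 2, 3]

# Every member evaluates as true, so a truth test is safe for any member.
all_truthy = all(bool(member) for member in Title)
print(all_truthy)  # prints: True

# A plain integer, by contrast, is false when it is zero.
print(bool(0), bool(1))  # prints: False True
```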

An alternative form allows choosing values as well as names.

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>
>>> title2 = Title["Dr"]
>>> title2.name
'Dr'
>>> title2.value
5
>>>

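Note: this form does allow starting at zero if you wish, but a value of zero is itself false in Python, which can cause problems if code tests the value (rather than the enum member) for truth.

This alternative form can also work with automatic values: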


>>> from enum import Enum, auto
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = auto() # using all automatic values
...     Dr = auto()
...
>>> title2 = Title.Dr # same value as Title["Dr"]
>>> title2
<Title.Dr: 3>
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = 5 # chosen value: auto() can be mixed with other values
...     Dr = auto() # value will follow from previous chosen
...
>>> title2 = Title.Dr
>>> title2
<Title.Dr: 6>

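In all these examples the value has been an integer, but a value can be anything: a float, a string, None, True or False, and even a mutable list or a user-created class. In the Python versions that first provided auto() (3.6 onwards), auto() still works when mixed with other values, skipping back over any value it cannot add one to, so

Mr = auto()
Mrs = None # could be a string or even [1,2,3]
Dr = auto()
#  will have values 1, None, 2

More recent Python releases deprecate mixing auto() with values that cannot be incremented, so check the documentation for the version in use.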

This has provided a simple review of enums in python, and illustrated the general concept. See the official python documentation for more details. Note that some cases provided by python are oriented to supporting implementations made before Enum was added to the language.

Enum in Kotlin

enum class Title{
    Mr(),
    Mrs(),
    Dr()
}
val title = Title.Mrs
println("$title  ${title.ordinal} ${title.name}")

This is almost an exact match for the Python ‘auto()’ allocation of values. The differences are about ordinal vs value. Firstly, the name is different (ordinal vs value), and ordinal starts from the more usual zero, not one, since Kotlin does not need to avoid the clash with Python’s ‘Boolean safety’.

But the biggest difference is that ordinal is fixed. Like the indexes of an array, it must be the sequence of integers from zero (e.g. 0, 1, 2, …). It cannot be of another type, nor any other integers.

However, with kotlin, unlike the value in python, ordinal is not the only attribute beyond the name: you can add whatever extra fields you wish by adding them to the enum class.

enum class Title(val value:Int){
    Mr(2),
    Mrs(3),
    Dr(5)
}
println(Title.Mrs.value)
 

This does not provide an exact equivalent for value in python as there is no inbuilt method to find the Enum from the value as here:

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>

This does not allow ‘value’ to vary in type between class members either, but that could be done in this manner (although it is difficult to see the use):

enum class Title{
    Mr() {val value = 2},
    Mrs() { val value = "abc" }
}

Although this can be done, creating value this way results in reflection being needed to access value, so this is not practical. To have different properties, objects should have different classes, so enter sealed classes.

Generally, the Kotlin Enum is an effective replacement for the python Enum. Where it gets interesting, is that in kotlin it does not stop there.

Beyond Enum: Sealed Classes

So far our example has been ‘Title’, and just keeping it brief dictates that the examples have not even attempted to cover all possibilities. While any reasonable list would of course include Ms and Miss, what about Dame, Madam, Duchess, Duke or Sir?  In Italy Engineer is actually a title, so we need to add Eng.  Could we ever be sure to cover every title a person might consider as their appropriate title?  One solution is to simply have people tick “other”.  A problem is that simply recording “other” is almost the same as having no data.  While adding a “please specify” may feel better, without actually recording what is specified, this is almost misleading.

Neither a Python Enum nor a Kotlin Enum can handle the “other” case.  Every value in an enum is effectively mapped to a number.  Why not add a “specify” property?  Because it does not achieve the goal.  If we have four possible values “Mr/Mrs/Dr/Other”, these could be mapped to 1, 2, 3, 4 for storage.  Each choice is just one value.  Every time “Other” is recorded it is exactly the same, including any properties of Other, so there is only one value of “specify”.  To have additional values for specify we need new entries in our list in the Enum definition, something like “Mr/Mrs/Dr/Other1(“Miss”)/Other2(“Ms”) …”.  In other words, we still need to state every possible value of all properties at the time the Enum is declared.
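A small Python sketch of the problem, under the assumption of a four-entry Title enum with an Other member (the variable names are hypothetical): two people who tick “Other” for different reasons end up holding the very same value.

```python
from enum import Enum


class Title(Enum):
    Mr = 1
    Mrs = 2
    Dr = 3
    Other = 4


# Two people who both ticked "Other": one meant "Dame", one meant "Eng".
first = Title.Other
second = Title.Other

# Both variables refer to the one and only 'Other' member,
# so whatever was specified is lost.
print(first is second)  # prints: True
print(first.value)      # prints: 4
```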

Now consider having a custom type (and a type is really a class) for each possible value.  So we have a Mr class, a Mrs class, a Dr class and an Other class.  Then we can have the possible values for our enum as the classes.  Our Enum is no longer constrained (in this example) to four specific values for Mr/Mrs/Dr/Other, but instead to four specific classes, and each class can have any number of different objects of that class.  With a standard Enum type with four possible values, variables of that type could be stored as a number between 1 and 4 (or 0 to 3).  With a sealed class with four possible types, a reference to an object matching one of the four classes is required, not just a number indicating which class is matched.  Despite this significant difference in how things work internally, the code can look very similar. This sample attempts to look as similar as possible to the enum example, but needs adjustment before being useful.

sealed class Title {  // no class parameter required
    object mr   // each entry can be an object definition or class definition
    object mrs  // note this sample shows entries with no base class ...
    object dr   // ... which valid syntax but not very useful
    class Other(val text:String)  // introduction of class member of 'enum'
}

The above example is difficult to use as the alternative values have no common
class so a different variable type is required for each value.

The example below is not just valid syntax, but provides the recommended structure.

sealed class Title(val value:Int) {  // value to match previous enum examples
    object mr: Title(0)   // each object or class based on sealed class
    object mrs: Title(1)  // which gives a common base class
    class Dr: Title(2)   // entries matching the enum pattern can be class or object
    // but entries making use of the power of sealed classes must be a class
    class Other(val text:String): Title(5)
}

In kotlin, object declares a singleton class where there is only ever a single instance of the object.  This means there will only ever be one instance of mr which is referenced everywhere, which is similar to the way Enum is implemented.
Both mr and mrs could also be implemented as classes, but since there is no data at all in objects of these classes, every instance of Dr (or a Mr or Mrs class) would be identical and therefore could be references to the same instance.   But the goal of using sealed classes is to provide for classes which do have data so every instance has the potential to be different.
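For readers more at home in Python, the same idea can be approximated with a common base class and one small class per alternative. This is only a sketch: Python has no sealed keyword, so nothing stops further subclasses being added elsewhere, and the `describe` function is a hypothetical helper for the illustration.

```python
from dataclasses import dataclass


class Title:
    """Common base class, playing the role of the sealed class."""


@dataclass(frozen=True)
class Mr(Title):
    value: int = 0


@dataclass(frozen=True)
class Mrs(Title):
    value: int = 1


@dataclass(frozen=True)
class Dr(Title):
    value: int = 2


@dataclass(frozen=True)
class Other(Title):
    text: str        # the data that makes each 'Other' potentially different
    value: int = 5


def describe(title):
    """Return the text to display for any Title."""
    if isinstance(title, Other):
        return title.text
    return type(title).__name__


print(describe(Mr()))              # prints: Mr
print(describe(Other("Dame")))     # prints: Dame
print(Other("Dame") == Other("Eng"))  # prints: False
```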

Another example of sealed classes can be found here, with the code here.  Plus the official kotlin information could also be useful.

If there are questions, add a comment and I will attempt to address them.


Implementation: what is a practical approach?

Any software team that is considering moving to Kotlin must, by definition, be currently using at least one alternative language.  To change languages, and ecosystems, is a big step.  One of the key features of Kotlin is how easily and seamlessly a project can migrate from Java.  Currently, that same ease of migration is far less real from outside the Java ecosystem.

Cold Turkey? Or step by step?

On rare occasions, there may be the opportunity to commence a complete new project and build each component with no basis on any legacy system.  If starting an entirely new project but not already experienced in kotlin, it will still require a huge leap of faith to start an entire development in kotlin.

More often, and in the project we are currently working with, the realistic path is to choose system components that can move to kotlin.

The candidates:

Individual pages discuss these sections, but the spoiler alert is that mobile/android development may not be the logical first choice it would on the surface seem.

Kotlin DSL templates with Python (or any other) Server

The Concept: A replacement template engine written in Kotlin DSL

Kotlin can replace the template system for a python server, or ruby server, or any other server, with no change to the server other than the template system, regardless of what language is used for the server itself.

This allows replacing Mako, Jinja2 or Django templates with Kotlin DSL templates.  No Kotlin or JVM installation is needed on the server, as the Kotlin DSL templates can run as javascript in the browser.

Why? : A more dynamic and concise solution

Template engines are generally considered a method of producing ‘dynamic’ web pages.  While a ‘static’ web page always displays the exact same html, templates produce html which reflects the data presented to the template.  For example, a ‘member’ page will have the information for the member currently logged in,  while a static home page will display the same information to everyone.

However, pages generated by templates are not necessarily ‘dynamic’ in a web2.0 manner.  The page is ‘generated on the fly’ by the template engine, but does not necessarily run dynamically in the client browser.  To the client browser, the page appears static.

These kotlin DSL templates are a complete rethink of how templates work, and inherently produce code that is also dynamic on the browser and the entire system makes adding true dynamic content far simpler.

Further, the description of page content becomes more concise than with conventional templates.

More Languages? Or Less?

It could be seen that adding Kotlin to do templates means adding yet another language to the toolkit in use with a project. However, this is only proposed as a substitute for Mako, Jinja2 or some other template language, so there is also one language no longer required.

Kotlin is far more complex than any template language, but using the kotlin DSL as described here does not require learning the full kotlin language.   The further benefit is that learning a template language just for templates has no other uses, while it is very likely there are other possible uses of kotlin (e.g. Android?)  within a project.  If kotlin has an additional use within the project, then using kotlin for templates could mean one less language overall.

How do these Kotlin DSL templates work?

A conventional template is like a more powerful version of the python format function.

Consider:


"person: {name} age: {age}".format(name="fred",age=35)

This is equivalent to the string being a template that is supplied data in the form of the ‘name’ and ‘age’.  The string is modified by the data, to produce a final string.  Templates just allow more power with modifying the string.

The Kotlin DSL templates I am discussing actually run in the browser, not on the server.  The template can look similar to html, but it is code, not a string.  Using a simple Python format statement as a simple example of templating, the template might be:

<html>
    <body>
        person: {name} age: {age}
    </body>
</html>

and by processing the template with our data, the final page is prepared on the server and sent to browser.

However, with the Kotlin templates discussed, our layout inside the ‘html’ tag is not present inside the html sent to the browser, and in place of the content there will normally be an empty placeholder ‘div’ tag.  Javascript code to instance the ‘body’ and ‘p’ tags inside the placeholder ‘div’ is sent as the template, and this code will read the jsondata (where present) and provide the correct content inside the div tag.

Different perspective?  HTML vs DOM

One perspective is to think of a web page and the html as basically synonymous.  Another perspective is to think of the web page as the DOM, with html as just data describing that DOM.  With this second paradigm, we could consider: “what if the DOM itself is described by javascript, not by the html code?”

This makes our web page:

<html>
  <body>

<!-- first a json script to hold any json data  -->
<script id="data" type="application/json">{jsondata}</script>

<!-- now here is the div where our main content will be added -->
<div id="page"></div>
<!-- now the script for the page content -->
  <script src="{templatename}"></script>

  </body>
</html>

As you can see, the html to describe the page content is just an empty ‘div’ in the page.  So the DOM must get the page content part of the DOM from the Kotlin DSL.  In fact this same html is now used for every page on the web site. The only parts that change are the {jsondata} and the {templatename}, which are replaced by the actual values to be used.  The sample above is perfect for use with a Python format to substitute actual names, but when testing by just sending the html file, simply set these to the values for testing.
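As a minimal sketch of that substitution on a Python server, the skeleton can be filled in with `str.format` and `json.dumps`. The function name `render_page`, the script name `member.js` and the sample data are assumptions made up for this illustration. Real data would also need checking for text such as a closing script tag before being embedded this way.

```python
import json

# The same skeleton is used for every page; only the two fields change.
PAGE_SKELETON = """<html>
  <body>
    <script id="data" type="application/json">{jsondata}</script>
    <div id="page"></div>
    <script src="{templatename}"></script>
  </body>
</html>"""


def render_page(data, template_script):
    """Fill the skeleton with the page data and the name of the template script."""
    return PAGE_SKELETON.format(
        jsondata=json.dumps(data),
        templatename=template_script,
    )


page = render_page({"name": "fred", "age": 35}, "member.js")
print(page)
```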

What does the Kotlin DSL look like?

Of course the page above does nothing without the javascript, because the javascript is generating the tags for the main part of the web page.  The only HTML is the outside skeleton, and the main content of the page is described in Kotlin DSL instead of in HTML.  Here is a very simple ‘main part of the page’ with just a paragraph containing ‘text’, with a heading of ‘heading’ for content.

val div = document.create.div {
    h1 { +"heading" }
    p { +"text" }
}

The document.create.div is needed for the very outer layer, and all the html tags inside become very clean and simple.  Using data within the page, or having tags produced in a loop, is all automatic by just using more of the kotlin language.

See the link: kotlin javascript tutorial for more on the DSL for html and how to configure Intellij and install the jar file for kotlinx.html

For a working example of the template scheme described on this page, see the ‘hypothetical programmable calculator’ as described in the page Machine code and Global Memory.  The code for the calculator with code for a kotlin DSL template example can be found in the repository.

Machine Code, Global memory, the Stack and the Heap

Today we have computers that have multiple ‘cores’ (in fact complete processors) on a single chip. But one way to understand the fundamentals of computers is to analyse far simpler computers from the past.  This section will review those very fundamental concepts in terms of a hypothetical programmable calculator, one of the very first popular ‘mini-computers’ (the PDP-8 from 1965), and the Intel 8080 CPU from 1974, which still influences the instructions of Intel Xeon and ‘Core’ processors in 2017.

The JavaScript (with Kotlin source code) version of the ‘hypothetical programmable calculator’ is available here.  I will also add links to emulators for the PDP 8 and 8080.

Sections:

Machine Code and Global Memory

To understand how memory works, it can be useful to consider how the CPU itself works, and what follows from that is how memory works.  Languages build on the underlying principles and automate using the different types of memory.

This page uses a ‘hypothetical programmable calculator’ (Magic Calculator) to illustrate the principles of how instructions are executed and how global memory works with the instructions described below.

 

Hypothetical Programmable Calculator

How does a CPU work?

The example of the ‘hypothetical programmable calculator’ is used to show how a ‘Central Processing Unit’ (CPU) works.  The working memory inside a CPU is the CPU ‘registers’, and the value displayed on a calculator screen is very analogous to a simple computer with a main register.  For this exercise, consider this displayed value as the ‘a’ register (many original computers did have an ‘a’ register, or ‘accumulator’ register, and some even provided a continuous display of the value of this register).  To add two numbers on a calculator, we enter the first number into our display or ‘a’ register, then activate ‘+’, which has to store the operation as plus, and save the first number into a second register which we can call ‘b’. Then we enter the second number into the ‘a’ register (or display), and with the ‘=’ we do the stored operation, which adds ‘a’ and ‘b’ and leaves the result in ‘a’.

So we have some program steps, but how do we ‘run’ these steps? Well first we need some memory.

Program Memory and program counter

Many calculators have one or more ‘memories’.  Our programmable calculator is going to have 100 memories!  The simplest calculators have one memory, and you can save from the ‘a’ register to the memory, or load from the memory to the ‘a’ register.  On some calculators you can even add the ‘a’ register into the memory, but I digress. The big thing with our programmable calculator is that values in memories represent instructions.  Number ‘1’ when used as an instruction could be our ‘+’, number ‘2’ our ‘=’ and number ‘3’ could mean ‘set a,n’, to set ‘a’ from the value in the memory following the instruction.  To make this work, we need a new register, a ‘Program Counter’ register for pointing to instructions.  Every time we load an instruction, or load information with the ‘Program Counter’, the program counter increases by 1.

So our program to add 7 and 8 (in memory locations 0, 1, 2, 3, 4, 5, 6) now looks like:

  • 3  7  1  3  8  2  0  (enter this string into the emulator ‘code’ field)

The steps are:

  1. The ‘program counter’ (PC) starts at zero, so the instruction at location zero (3, ‘set a,n’) is run. This instruction loads the next value from the memory location specified by the PC register (and again adds one to the register), so the ‘7’ from location ‘1’ is loaded into ‘a’ and 7 is displayed.
  2. The PC register is now 2 (increased to 1 after loading the ‘3’ instruction, and again increased to 2 as that instruction loaded the ‘7’ from location 1), so the plus instruction (1) at location 2 is loaded.  The plus instruction sets the operation register to ‘add’ and copies the ‘7’ from the ‘a’ register to the ‘b’ register.
  3. The ‘set a,n’ instruction (3) at location ‘3’ is loaded using the program counter, and this instruction then loads the ‘8’ from memory location 4 into the ‘a’ register.
  4. The ‘=’ instruction (2) from memory location ‘5’ is loaded, and this causes the ‘7’ from ‘b’ to be added to ‘a’, so the calculator then displays our answer: ‘15’.
  5. The ‘stop’ instruction (0) from memory location 6 causes our program to stop.

This simple example illustrates how a program actually runs in a computer. The main memory can have both data and instructions.
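The steps above can be sketched as a few lines of Python. This is not the code of the ‘Magic Calculator’ emulator itself, just a minimal illustration under the instruction codes given in the text (0 stop, 1 ‘+’, 2 ‘=’, 3 ‘set a,n’); the function and constant names are assumptions.

```python
STOP, PLUS, EQUALS, SET_A = 0, 1, 2, 3  # instruction codes from the text


def run(program, memory_size=100):
    """Run a program on the hypothetical calculator and return the 'a' register."""
    # Main memory holds both instructions and data; unused locations are 0.
    memory = list(program) + [0] * (memory_size - len(program))
    a = 0             # the displayed value
    b = 0             # second register used by '+'
    operation = None  # the stored operation
    pc = 0            # program counter

    while True:
        instruction = memory[pc]  # fetch the instruction the PC points at
        pc += 1                   # every fetch moves the PC on by one
        if instruction == STOP:
            return a
        elif instruction == PLUS:
            operation = "add"
            b = a
        elif instruction == EQUALS:
            if operation == "add":
                a = a + b
        elif instruction == SET_A:
            a = memory[pc]        # the value follows the instruction
            pc += 1
        else:
            raise ValueError(f"unknown instruction {instruction}")


print(run([3, 7, 1, 3, 8, 2, 0]))  # prints: 15
```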

Adding global variables: the instructions.

Currently the binary program for the ‘programmable calculator’  just does the equivalent of  ‘7 + 8’ in python.

This is only useful because we can see the ‘a’ register on the calculator display, just as ‘7+8’ is only useful in ‘idle’ because idle prints the answer. Now consider the program ‘answer = 7 + 8’.  This program stores the answer in a variable.  The previous program is stored in 7 memory locations, so there are lots of free memory locations for variables.  If we plan to use half of the memories for code, and half for variables, then all memories below 50 would hold code and numbers used inside code, and memories 50 and above would be for variables.

None of the current instructions use variables, so consider  two new instructions, load a,(n) and  save a,(n) to load ‘a’ register from the memory location we want, or save the ‘a’ register. The ‘load’ (instruction code 4)  and ‘save’ (instruction code 5) will both use the memory following the instruction to specify which memory is to be loaded or saved.
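As a sketch of how the two proposed instructions might behave, the Python below extends a simple interpreter for the calculator with codes 4 (load a,(n)) and 5 (save a,(n)). It is an illustration only, not the emulator's own code, and choosing memory location 50 for the variable ‘answer’ is an assumption that follows the split suggested above.

```python
STOP, PLUS, EQUALS, SET_A, LOAD_A, SAVE_A = 0, 1, 2, 3, 4, 5


def run(program, memory_size=100):
    """Run a program; return the final 'a' register and the memory contents."""
    memory = list(program) + [0] * (memory_size - len(program))
    a = b = 0
    operation = None
    pc = 0
    while True:
        instruction = memory[pc]
        pc += 1
        if instruction == STOP:
            return a, memory
        elif instruction == PLUS:
            operation = "add"
            b = a
        elif instruction == EQUALS:
            if operation == "add":
                a = a + b
        elif instruction == SET_A:
            a = memory[pc]           # the value itself follows the instruction
            pc += 1
        elif instruction == LOAD_A:
            a = memory[memory[pc]]   # the address of the variable follows
            pc += 1
        elif instruction == SAVE_A:
            memory[memory[pc]] = a   # store 'a' at the address that follows
            pc += 1
        else:
            raise ValueError(f"unknown instruction {instruction}")


# answer = 7 + 8, with 'answer' kept in memory location 50
a, memory = run([3, 7, 1, 3, 8, 2, 5, 50, 0])
print(memory[50])  # prints: 15
```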

Currently the ‘Magic Calculator’ does not support these last two instructions (load and save), but if desired for experimentation, they could be added.