Classes, Objects and Instances

Topics for this page:

Background

Python, Kotlin or Java, which best supports Object Oriented Programming (OOP)?

History.

Simula 67 (in 1967) was probably the first Object Oriented Language.  It was a great language, but wow was it slow!  And slow even on the massive mainframe computers it lived on.

The origin of C++ was to add the power of Object Oriented Programming to the very efficient language C, while still delivering real performance. The first reference book on C++ was released in 1985 (tired of waiting for everyone to have access to the internet?) at a time when micro-computers were popular, but far slower than even old mainframe computers.  The result was a language with great control over how objects were created, but in reality one necessarily severely compromised by the features needed to deliver performance.

Java began life as Oak in 1991, but was only used by the original team until 1995 and then had a very short gestation to reaching 1.0 in 1996.  Java delivered greater ease of Object Oriented Programming than C++, and with good performance even with the added layer of a virtual machine.  While the language is hardly truly Object Oriented throughout, it struck a winning formula with a balance of Object Orientation and performance.

Python also had origins in 1991, but did not really get public exposure until python 2.0 in the year 2000, and did not get full object orientation until ‘new style classes’ were introduced at the end of 2001. Computing power had moved a long way between 1996 and 2001 (consider the 150MHz 1995 Pentium Pro vs the 1.3GHz 2000 Pentium 4), and python was targeting ease of programming over ultimate performance anyway.  As you can guess, as a result python was able to be Object Oriented throughout instead of just having an outer veneer of being Object Oriented.  But yes, still slower than Java.

Of course for Kotlin, designed in 2011 and having version 1.0 released in 2016, having Object Oriented structure throughout and adding features for functional programming as well was not even really a challenge. But slower than Java? Sometimes, but a 2016 technology compiler that can run on a 2016 computer with gigabytes of memory available (windows 3.1, the common version at the time of Java release, required 1MB of RAM!) can optimise and produce almost the same code.  Oh yes, the compiler is slower than a Java compiler can be, but is it a problem?

OOP Myths.

But Java has created some OOP myths with almost a generation of programmers learning OOP on Java.

“This program cannot be True OOP, where are the getters and setters!”

That encapsulation requires getters and setters is actually true.  That you have to write your own getters and setters just to access and store data is not true.  Both python and kotlin automatically provide getters and setters without the programmer noticing, unless the program requires non-standard behaviour from the getters and setters. Java requires the programmer to write their own getters and setters in order to allow breaking the rules of OO for performance reasons.

“True OOP requires functions and data to be inside a class so everything is an object!”

True OOP does have everything as an object.  Functions, all data types, everything.  For performance reasons Java broke this rule.  Understandable, but breaking the rules of OOP does not make the language more OO.  Functions should be ‘first class functions’ and in java they were not, for performance reasons.  Putting the non-OOP functions inside an object at least gave them an Object Wrapper, and the same applies for Java primitive data types.  Modern OO languages do not need to wrap functions or data in a class because functions and data are always already objects.
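
As a small illustration of "functions and data are always already objects", here is a minimal python sketch. The names double and apply_twice are hypothetical, invented only for this example.

```python
def double(x):
    """Return twice the argument."""
    return x * 2

def apply_twice(func, value):
    """Call func on value, then on the result: a function is passed like any other object."""
    return func(func(value))

# A function can be stored in a list, just like any other value.
operations = [double, abs]
result = apply_twice(double, 5)        # double(double(5))

# Integers and functions are both instances of object.
int_is_object = isinstance(3, object)
func_is_object = isinstance(double, object)
func_name = double.__name__            # functions carry attributes too
```

No wrapper class is needed to pass double around, which is the point being made about first class functions.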

defining classes: constructors and init

python vs kotlin Syntax

# syntax is:
class <Name>(<base classes>):
    def __init__(self, <parameter list>):
        # put init code here
#now example 1
class Fred:
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
        self.container = []
#and example 2
class Fred(BaseClass):
    def __init__(self, var1, var2):
        super().__init__(var1)
        self.var1 = var1
        self.var2 = var2
        self.container = []
        # rest of init here

Hopefully the above python code is self explanatory. Example 2 adds a base class.  I do not deal with multiple inheritance at this time, and will devote a specific page at some future time as for python it gets complex, and for kotlin it requires additional concepts.  Now here is the kotlin ‘imitate python’ equivalent to example 1.

//syntax is:
//
<optional modifier> class <name>(<default constructor params>): <base>
//
// example 1a:bad version- to be python like - not best kotlin
class Fred{
    val var1: Int
    val var2: String
    val container: MutableList<Int>

    constructor(var1: Int, var2:String){
        this.var1 = var1
        this.var2 = var2
        this.container = mutableListOf<Int>()
    }
}
// example 1b: still bad version- to be python like again- not best kotlin
//  move 'constructor' to class definition, now constructor body is 'init'
class Fred constructor(var1: Int, var2:String){
    val var1: Int
    val var2: String
    val container: MutableList<Int>

    init{
        this.var1 = var1
        this.var2 = var2
        this.container = mutableListOf<Int>()
    }
}

// example 1c:better - 'constructor' keyword omitted and defines variables
// in the primary constructor
class Fred(val var1: Int, val var2: String){
    val container: MutableList<Int>

    init{
        container = mutableListOf<Int>()
    }
}

First the ‘not the best kotlin way’ examples. Is constructor or init the best equivalent python __init__? The first example keeps the class definition more similar to python, and uses a constructor to perform the role of the python __init__. In kotlin, a class can have multiple different constructors to enable constructing an object from different types. So there could be another constructor accepting String in place of Int for the parameters.
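
For comparison, python allows only one __init__ per class, so the usual way to get the effect of a second constructor is a class method. The sketch below assumes a hypothetical from_strings method that accepts var1 as a String in place of an Int; it is one common pattern, not the only one.

```python
class Fred:
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
        self.container = []

    @classmethod
    def from_strings(cls, text1, var2):
        """Hypothetical alternative constructor: takes var1 as a string."""
        return cls(int(text1), var2)

fred_a = Fred(1, "one")                  # the normal constructor
fred_b = Fred.from_strings("1", "one")   # the alternative constructor
```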

But where the python code simply assigns to self.var1 without first declaring var1, in kotlin all variables must be declared. So aside from the declarations of the three instance variables (var1, var2 and container), and constructor in place of __init__, example 1a above looks almost exactly like the python version. However, in this form, there is more code than the python version.

Version 1b above moves the constructor(var1: Int, var2:String) to the class declaration. Doing this makes this the primary (default) constructor for the class, but the body of the constructor cannot be on the line declaring the class, so the body of the constructor is now called init, and the class declaration reads: class Fred constructor(var1: Int, var2:String).
init is the special reserved word to identify the block of code which is the body of the primary constructor.

So example 1b is very similar to example 1a, but introduces the concept of a default constructor and the init block.

Example 1c introduces some improvement. A common pattern is that values passed to the constructor (__init__) are saved as instance variables. Simply adding var or val in the constructor means the parameter is also the declaration, plus this results in the constructor automatically saving the values passed in. So we lose 4 lines of code as unnecessary (two declarations, plus two assignments). We can also omit the word ‘constructor’ for the primary constructor except for some rare special cases. So now the code is almost as brief as the python code. But there are still optimisations to come.

// example 1d:best
class Fred(val var1: Int, val var2: String){
    val container = mutableListOf<Int>()
}
// example 2
class Fred(val var1: Int, val var2: String):BaseClass(var1){
    val container = mutableListOf<Int>()
}

So for example 1d, the code is now more concise than python, despite the declaration of variables. Yes, container does now look like a python class variable, but this is how instance variables are declared in kotlin. So the code is brief with types, and perhaps more so than the python code without types. This is because the code remaining in the kotlin constructor prior to this step, the initialisation of container, can happen at the declaration of container. Normally, all the code needed in a constructor is setting initial values, so normally no init block is needed: initial values are set at the definition, either automatically in the case of primary constructor parameters, or in the main block of the class for other instance variables.
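
The remark that container "looks like a python class variable" matters, because writing the same thing in python really does create one shared list. A minimal sketch, with the hypothetical class names SharedFred and SeparateFred:

```python
class SharedFred:
    container = []             # class variable: one list shared by all instances

class SeparateFred:
    def __init__(self):
        self.container = []    # instance variable: a new list for each object

s1, s2 = SharedFred(), SharedFred()
s1.container.append(1)
shared_seen_by_s2 = s2.container       # the very same list as s1.container

p1, p2 = SeparateFred(), SeparateFred()
p1.container.append(1)
separate_seen_by_p2 = p2.container     # still empty
```

The kotlin declaration in example 1d behaves like SeparateFred, not like SharedFred.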

Example 2 covers a class with a base class, just for completeness, to have the syntax covered.

instance vs class variables

Consider the following python code:

class Person:
    age = 21

    def __init__(self, name):
       self.name = name
       self.otherName = ""
       self.fullName = name

name, otherName and fullName are instance variables or properties, which means for each Person there is a new copy of each variable. Without a Person object, there is no name, otherName or fullName. But age is a class variable, so it exists exactly once, even if there are no Person instances, and regardless of how many Person instances there are.

>>> p1= Person("Fred")
>>> p2= Person("Tom")
>>> p1.age  # access class variable just like instance variable
21
>>> p2.age  # same value both times
21
>>> Person.age = 22 # change value in class
>>> p1.age # and p1.age automatically has the new value
22
>>> p2.age # and so does p2.age
22
>>> p2.age = 19 # set p2.age creates an instance variable
>>> p1.age # p1 is still using the class so unchanged
22
>>> p2.age # but p2.age shows the instance, which hides the class variable
19
>>> del p2.age # now delete the instance variable
>>> p2.age  # but the class variable is still there
22
>>> 

The above code plays with how instance variables in python can have the same name as class variables, but hide the class variable.
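
Where each value lives can be made visible with vars(), which returns the dictionary of variables stored on the instance itself. This is a minimal sketch reusing a cut down Person class:

```python
class Person:
    age = 21                       # class variable

    def __init__(self, name):
        self.name = name           # instance variable

p2 = Person("Tom")
before = dict(vars(p2))            # only 'name' is stored on the instance
p2.age = 19                        # creates an instance variable hiding Person.age
during = dict(vars(p2))
class_age_during = Person.age      # the class variable is untouched
del p2.age                         # remove the instance variable again
after_age = p2.age                 # lookup falls back to the class variable
```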

Now to kotlin. As we saw in class definition, instance variables in kotlin are defined in the way closer to how class variables are defined in python.

Code which appears in the class definition and defines and sets the value of a variable, is actually run every time an object of the class is instanced. This is very useful, because it makes the most common case the one that is simpler, while with python the simpler code is the class variable, and they are less common.

With kotlin there is no automatic class object at run time; information on class methods and properties is held internal to the compiler. To have an object at run time that holds information for the class, including class variables, kotlin classes have a ‘companion object’. So, the reverse of python: kotlin instance variables are declared at the class level, and class variables are declared inside a container, the companion object.

The kotlin companion object is the parallel of the class object in python. Any methods or variables in the companion object will be class based and exist once per class, regardless of whether there are zero or more instances of that class.

Most access to class based data will happen from inside the class. Code outside the class can still reach it through the class name (as Person.age does below), but if you wish to access class based data through an instance from outside the class, you do need a ‘getter’ and/or a setter, which are not normally needed in kotlin. This is an unusual case, and for completeness it is covered here.

class Person(val name:String) {  // instance variable declared in constructor

    var otherName = "" // an instance variable not from a parameter
    var fullName = name //instance variable manually set from parameter

    companion object {
       var age = 21  // class variable age
    }

    var staticAge get()= age   // instance property as getter and
        set(value){age = value}   // setter for age - see properties
}

The staticAge is needed for the example, but most often access to get or set a class variable like age will happen from within the class, so no staticAge would be needed.

val p1 = Person("Fred")
val p2 = Person("Tom")
p1.staticAge  // access class variable like instance only within class
21
p2.staticAge  // use instance with getter from outside class
21
Person.age = 22 // change value in class
p1.staticAge // and p1.age automatically has the new value
22
p2.staticAge // and so does p2.age
22
// cannot create instance value at runtime
// no workable equivalent to python class and instance with same name

The main point is that while defining a variable in the class scope in python creates a class variable, in kotlin defining at this scope creates normal instance variables (or properties).  Class variables (also known as static class variables) in kotlin are created within the companion object for the class, which is a single object acting as a container for the class, rather than for each instance of the class.

self vs this

Access to variables of an object from outside the class definition (as in the previous example) is the same for python and kotlin: the code p1.name will access the name variable from the object p1.  Code inside the class must work without any actual object name, so another naming system is needed.  Python uses self to indicate the current object, and kotlin uses this.  So the object variable or property is self.name in python and this.name in kotlin.  But the python self is needed far more often than the kotlin this: in kotlin the this. is only needed when there is a parameter or local with the same name, and normally it can be omitted.  So a lot less this in kotlin than self in python.

Again, in python the first parameter to each method in a class should be self, while in kotlin this is not included in the parameter list.  Again, less this than self.

# consider in method definition
    def setNames(self, name): # self as first parameter
        self.name = name
        self.otherName = ""
        self.fullName = name # use name as fullname
//method definition in kotlin
    fun setNames(name: String) { // no 'this' in parameter list
        this.name = name // 'this.name' is property, 'name' is parameter
        otherName = "" // only one 'otherName', so do not need 'this.'
        fullName = name // also only one 'fullName'
    }
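
Why python needs self in the parameter list can be seen by writing a method call both ways. In this minimal sketch the greet method is hypothetical, added only to have something to call:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return greeting + " " + self.name

tom = Person("Tom")
via_instance = tom.greet("Hello")        # python passes tom as self
via_class = Person.greet(tom, "Hello")   # the same call with self written out
```

Both calls run the same function; the first form simply fills in self automatically.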

properties: getters and setters

Traditionally, java programmers have been taught that encapsulation (a key part of OO) requires building a class so that how things work can be changed without affecting code using the class. To do this ‘getters’ and ‘setters’ are required, to provide for changes to how data inside the class is used. Instead of allowing a variable to be accessed or set from outside the class, a getter method is created to get the value, and a setter method to set the value. The idea is to have functions already in place, ready for a possible time when getting or setting becomes more complex.
Modern languages have identified problems with this approach:
almost all getters and setters just get or set the value and do nothing else so they just bloat the program
it is much clearer for the calling code to get the value of a variable or have an assignment statement to set the value – even when what is happening inside the class is more complex

The solution is:
require code only for the complex cases
ensure setting and getting from outside the class looks the same for simple and complex and is most readable.

Consider this python class:

class Person:
    def __init__(self, name):
       self.name = name
       self.otherName = ""
       self.fullName = name

>>> tom = Person("Tom")  #instance object
>>> tom.fullName = "Tom Jones" # set property using object
>>> tom.fullName  # get property
'Tom Jones'

Getting and setting is as simple as possible when using the class, but what if we do wish to ‘complicate’ the fullName property, changing the value from being simply its own data to being the result of name together with otherName?
Consider:

class Person:

    def __init__(self, name):
        self.name = name
        self.otherName = ""

    @property
    def fullName(self):
        return " ".join([self.name,self.otherName])
    @fullName.setter
    def fullName(self,value):
        if " " in value:
            self.name,self.otherName = value.split(" ",1)
        else:
            self.name = value
            self.otherName = ""
>>> bob = Person("Bob")
>>> bob.otherName = "Jones"
>>> bob.fullName
'Bob Jones'
>>> bob.fullName = "Bobby Smith"
>>> bob.name
'Bobby'
>>> bob.fullName
'Bobby Smith'
>>> bob.otherName
'Smith'

Now we have the new implementation, and all code written before the change will still work.

class Person(var name:String) {  // instance variable declared in constructor

    var otherName = "" // an instance variable not from a parameter
    var fullName
        get()= listOf(name,otherName).joinToString(" ")
        set(value) {
            val (first,second) =
                    if(' ' in value) value.split(" " ,limit=2)
                    else listOf(value,"")
            name = first
            otherName = second
        }
}

The kotlin code for having getters and setters is less changed by adding getters and setters. Simply follow the variable (or value) property declaration with the get and/or set methods.
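
The "and/or" matters: supplying only a getter gives a read-only property in both languages. A minimal python sketch, assuming a hypothetical initials property that is computed from the two names and has no setter:

```python
class Person:
    def __init__(self, name, other_name=""):
        self.name = name
        self.other_name = other_name

    @property
    def initials(self):
        """Hypothetical computed property; no setter is defined, so it is read-only."""
        return "".join(part[0] for part in (self.name, self.other_name) if part)

bob = Person("Bob", "Jones")
bob_initials = bob.initials
try:
    bob.initials = "XY"            # no setter, so the assignment is rejected
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```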

More?

What is not covered?
Super, which I feel needs no explanation, and
Delegated properties and more complex cases, which do need more. I will add a separate page on these but for now see this page, and delegated properties are described here.

Extension functions will also be covered separately.


Enum

Enum is an abbreviation for enumerated.  The overall concept is processing information where each value is one of a very limited set of values.  This is a look, from the basics to the more advanced, at how a dynamic language like python treats enum, through to the advanced treatment available in kotlin.

TLDR; Just read the headings until you find sections of interest.

ASCII characters are one of 128 possible characters (although now we use larger character sets), and the original ‘integer’ values were 0,1,2, …. 65,535,  but what about when you have your own finite set of possibilities, like expressing location as ‘at home’, ‘at work’ or simply ‘elsewhere’? Enter, the Enum.

Background

With Digital Data Everything is a number.

In digital computers, every value becomes a number.  Some values, for example a speed, or a length measurement, just are numbers already so to store the value as a number is obvious.  In fact at the simplest level, we have two types of data: integers and floats, which are numbers.

Strings: All data can be a string.

But what about all the things that are not normally numbers? Since we use a language based on characters, any value we like can be reduced to a string of characters, and as there are established ways of transforming each character into a number for storage in a computer, we could simply store each item as a string of characters.  We call this the string type.  We can store anything, regardless of whether the characters happen to be numbers or not, so we could even store our numbers as strings.   The string is effectively the common representation we can use for all types of data.  In a sense, all other types are subsets of the possible strings. Integers are subsets where only numeric characters are allowed.  Simplistically, the syntax of a float is like an integer with the addition of a single “.” character. Both integer and float are just subsets of the valid strings.  In fact you could design a language where string is the only type.

Why not keep everything as a string?

There are advantages to not simply treating all data as a string. An integer or float is not just a string restricted to numeric characters; there are also special operations that can only be done with numbers, and special relationships between numbers.  These special subsets have their own rules which only apply to these subsets of possible strings.  For any given length of number, there are also fewer possible alternatives than there are with strings. In Unicode, as encoded by UTF-8, there are 1,112,064 possible values for each character.  In an integer, there are 10 possibilities for each character.  This allows for far more efficient storage as an entry within the possible integers than keeping them in string format, which means keeping them as one of the far greater number of possible strings.  The fewer the possible values, the more efficient the storage.  Keeping numbers in such a format also simply works with the fact that computers are by nature digital, so they are most efficient with numbers.
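
The storage argument can be checked with a rough python sketch. It assumes a 4 byte unsigned integer format (via the standard struct module) as the "number" representation; the exact sizes in any real language will differ.

```python
import struct

number = 1234567890

as_text = str(number).encode("utf-8")    # one byte for each decimal digit
as_binary = struct.pack("<I", number)    # fixed 4 byte unsigned integer

text_bytes = len(as_text)
binary_bytes = len(as_binary)

# A byte of text could hold 256 different values, but a digit uses only 10 of them.
values_in_four_digits = 10 ** 4
values_in_four_bytes = 256 ** 4

round_trip = struct.unpack("<I", as_binary)[0]   # nothing is lost in the compact form
```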

So now we have integer, float and string, and again, this would be sufficient, but there are advantages to even more types.

Boolean: The simplest ‘Enum’.

If you are reading this, it can be assumed you understand Boolean, but reviewing the basics is still worthwhile.  Boolean is a type where the values are not characters and not numbers, and therefore fits the classic definition of an Enum type: data with a limited number of possible values that we can ‘enumerate’. For Boolean, there are just two possible values: false and true.

Anything with a fixed number of choices fits the pattern of an enum.  Think of any data you fill out on a form where the answer is one of a fixed set of choices, such as gender: male/female/not-saying, or Title: Mr/Mrs/Dr/Prof/Miss/Ms  or even agreement: strongly-agree/agree/unsure/disagree/strongly-disagree.

Each of the possible choices could be represented as a number, for example:

TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3
#  etc.......

The value for title could be kept in an integer.  In fact a language could do the same with Boolean, and simply having false=0 and true=1 would cover most requirements.  So technically, no Enum values are needed at all.  Just as we could not even bother with the constants and just remember that a title of Mrs is number 2.  But having enums which specifically declare what the possible values are makes the code intent far clearer, and the code more easily scanned by a language or a reader for errors.
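
The difference in error checking can be sketched in python, using the Enum class covered in the sections that follow. With plain integer constants nothing objects to a value that matches no title, while an Enum refuses it.

```python
from enum import Enum

TITLE_MR = 1
TITLE_MRS = 2
TITLE_DR = 3

title_as_int = 7                   # matches no title, but is still a valid integer

class Title(Enum):
    MR = 1
    MRS = 2
    DR = 3

valid_title = Title(2)             # looks up the member with value 2
try:
    Title(7)                       # no member has the value 7
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```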

History

Enums have been added to Python 3.4 as described in PEP 435. It has also been backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4 on pypi.

Prior to the introduction of Enum in the language officially, code could use constants for the alternate values as described in background, or make use of a class to implement enums.   In fact libraries for Enum were sufficiently widely in use that these were the basis for Enum as added to the python language.

Enum in Python

There is a very simple format for declaring an enum:


>>> from enum import Enum # do import
>>>
>>> Title = Enum("Title", "Mr Mrs Dr") # create type for title
>>> title= Title(2) #instance using value
>>> title
<Title.Mrs: 2>
>>> title2=Title["Dr"] # instance using name
>>> title2
<Title.Dr: 3>
>>> title.name # the attributes of 'title'
'Mrs'
>>> title.value
2
>>>

This example shows creating and using an Enum by specifying the names of each alternative within the Enum.  In python, each alternative also has a value, and with this format, values for each alternative are assigned automatically, in a sequence starting with 1, so this case uses the values 1, 2 and 3.

Why start at one and not zero? Because in python the number zero can also be considered false, and the design is for every value by default to be true.
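
How a zero value behaves depends on the kind of enum. As a minimal sketch, with the hypothetical class names PlainTitle and IntTitle: members of a plain Enum are all treated as true whatever their value, while members of an IntEnum follow the integer rule, so a zero value is false.

```python
from enum import Enum, IntEnum

class PlainTitle(Enum):
    Mr = 0
    Mrs = 1

class IntTitle(IntEnum):
    Mr = 0
    Mrs = 1

zero_is_false = not bool(0)
plain_mr_is_true = bool(PlainTitle.Mr)   # plain Enum member: true despite value 0
int_mr_is_true = bool(IntTitle.Mr)       # IntEnum member: behaves like the int 0
```

Starting automatic values at one means the result is the same whichever kind of enum is in use.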

An alternative form allows choosing values as well as names.

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>
>>> title2 = Title["Dr"]
>>> title2.name
'Dr'
>>> title2.value
5
>>>

Note: this does allow starting at zero if you wish, and there are problems as a result
This alternative form can also work with automatic values:


>>> from enum import Enum, auto
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = auto() # using all automatic values
...     Dr = auto()
...
>>> title2 = Title.Dr # same value as Title["Dr"]
>>> title2
<Title.Dr: 3>
>>> class Title(Enum):
...     Mr = auto()
...     Mrs = 5 # choose value # can mix auto() with other values
...     Dr = auto() # value will follow from previous chosen
...
>>> title2 = Title.Dr
>>> title2
<Title.Dr: 6>

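In all these examples, the value has been an integer, but the value can be anything: a float, a string, None, True or False and even a mutable list or a user created class. auto() still works mixed with any other values, but ignores any non-numeric (this is the behaviour of the python versions current at the time of writing; check the enum documentation for your version), so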

Mr = auto()
Mrs = None # Could be True or even [1,2,3]
Dr = auto()
#  will have values 1, None, 2

This has provided a simple review of enums in python, and illustrated the general concept. See the official python documentation for more details. Note that some cases provided by python are oriented to supporting implementations made before Enum was added to the language.

Enum in Kotlin

enum class Title{
    Mr(),
    Mrs(),
    Dr()
}
val title = Title.Mrs
println("$title  ${title.ordinal} ${title.name}")

This is almost an exact match for the python ‘auto()’ allocation of values. The differences are about ordinal vs value. Firstly, the name is different (ordinal vs value), and ordinal starts from the more usual zero, not one: kotlin does not need the python ‘Boolean safety’ of avoiding zero.

But the biggest difference is that ordinal is fixed. Like the indexes of an array, it must be the sequence of integers from zero … eg 0,1,2 …. It cannot be of another type, nor any other integers.

However, with kotlin, unlike the value in python, ordinal is not the only attribute beyond the name: you can add whatever extra fields you wish by adding them to the enum class.

enum class Title(val value:Int){
    Mr(2),
    Mrs(3),
    Dr(5)
}
println(Title.Mrs.value)
 

This does not provide an exact equivalent for value in python as there is no inbuilt method to find the Enum from the value as here:

>>> class Title(Enum):
...     Mr = 0
...     Mrs = 2
...     Dr = 5
...
>>> title = Title(0)
>>> title
<Title.Mr: 0>

This does not allow ‘value’ to vary in type between class members either, but that could be done in this manner (although it is difficult to see the use):

enum class Title{
    Mr() {val value = 2},
    Mrs() { val value = "abc" }
}

Although this can be done, creating value this way results in reflection being needed to access value, so this is not practical. To have different properties, objects should have different classes, so enter sealed classes.

Generally, the Kotlin Enum is an effective replacement for the python Enum. Where it gets interesting, is that in kotlin it does not stop there.

Beyond Enum: Sealed Classes

So far our example has been ‘Title‘, and just keeping it brief dictates that the examples have not even attempted to cover all possibilities. While any reasonable list would of course include Ms and Miss, what about Dame, Madam, Duchess, Duke or Sir?  In Italy Engineer is actually a title so we need to add Eng.  Could we ever be sure to cover every title a person might consider as their appropriate title?  One solution is to simply have people tick “other”.  A problem is that simply recording “other” is almost the same as having no data.  While adding a “please specify” may feel better, without actually recording what is specified, this is almost misleading.

Neither a python Enum, nor a kotlin Enum, can handle the “other” case.  Every value in an enum is effectively mapped to a number.  Why not add a “specify” property?  Because it does not achieve the goal.  If we have four possible values “Mr/Mrs/Dr/Other” these could be mapped to 1,2,3,4 for storage.  Each choice is just one value.  Every time “Other” is recorded it is exactly the same, including any properties of Other, so there is only one value of “specify”.  To have additional values for specify we need new entries in our list in the Enum definition, something like “Mr/Mrs/Dr/Other1(“Miss”)/Other2(“Ms”) …”. In other words, we still need to state every possible value of all properties at the time the Enum is declared.
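
That "Other" is only ever one value can be seen in a short python sketch. The member names here are chosen for illustration only.

```python
from enum import Enum

class Title(Enum):
    MR = 1
    MRS = 2
    DR = 3
    OTHER = 4

first = Title.OTHER
second = Title(4)
same_object = first is second      # every use of OTHER is the very same object
member_count = len(Title)          # the full set of values is fixed at declaration

try:
    Title("Dame")                  # a title that was not declared cannot be created later
    dame_accepted = True
except ValueError:
    dame_accepted = False
```

With only one OTHER object, there is nowhere to record a different text for each person who ticks "other".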

Now consider having a custom type (and a type is really a class) for each possible value.  So we have a Mr class and a Mrs class, and a Dr class and an Other class.   Then we can have the possible values for our enum as the classes.  Our Enum is no longer constrained (in this example) to four (4) specific values for Mr/Mrs/Dr/Other, but instead to 4 specific classes, and each class can have any number of different objects of that class.  While in a standard Enum type, if there are four possible values then variables of that type could be stored as a number between 1 and 4 (or 0 to 3), with a sealed class with four possible types, a reference to an object matching one of the four classes is required, not just a number indicating which class is matched.   Despite this significant difference in how things work internally, the code can look very similar. This sample is attempting to look as similar as possible to the enum example, but needs adjustment before being useful.

sealed class Title {  // no class parameter required
    object mr   // each entry can be an object definition or class definition
    object mrs  // note this sample shows entries with no base class ...
    object dr   // ... which is valid syntax but not very useful
    class Other(val text:String)  // introduction of class member of 'enum'
}

The above example is difficult to use as the alternative values have no common class, so a different variable type is required for each value.

The example below fixes this: it is not just valid syntax, but provides the recommended structure.

sealed class Title(val value:Int) {  // value to match previous enum examples
    object mr: Title(0)   // each object or class based on sealed class
    object mrs: Title(1)  // which gives a common base class
    class Dr: Title(2)   // entries matching the enum pattern can be class or object
    // but entries making use of the power of sealed classes must be a class
    class Other(val text:String): Title(5)
}

In kotlin, object declares a singleton class where there is only ever a single instance of the object.  This means there will only ever be one instance of mr, which is referenced everywhere, and this is similar to the way Enum is implemented.
Both mr and mrs could also be implemented as classes (as Dr is here), but since there is no data at all in objects of these classes, every instance of Dr (or of a Mr or Mrs class) would be identical and therefore could be a reference to the same instance.   But the goal of using sealed classes is to provide for classes which do have data, so every instance has the potential to be different.
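
Python has no sealed classes, but the shape of the idea can be approximated with an ordinary base class and subclasses. This is only a sketch under that assumption: nothing here stops new subclasses being added elsewhere, which is exactly what kotlin's sealed keyword does prevent. The describe function is hypothetical.

```python
from dataclasses import dataclass

class Title:
    """Common base class, playing the role of the sealed class."""
    value = None

class Mr(Title):
    value = 0

class Mrs(Title):
    value = 1

class Dr(Title):
    value = 2

@dataclass
class Other(Title):
    text: str          # data held by each instance, which an enum member cannot have
    value = 5          # no annotation, so this stays a plain class attribute

def describe(title):
    """Return a display string for any Title."""
    if isinstance(title, Other):
        return title.text
    return type(title).__name__

titles = [Mr(), Dr(), Other("Dame"), Other("Eng")]
labels = [describe(t) for t in titles]
```

Two Other objects share a class and a value, yet each keeps its own text.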

Another example of sealed classes can be found here, with the code here.  Plus the official kotlin information could also be useful.

If there are questions, add a comment and I will attempt to address them.

 

Lists, Dictionaries Iterations & More

Python uses dictionary, list and tuple to hold collections of information, with set also available but not quite as common.

Kotlin provides List and Map and their mutable forms, MutableList and MutableMap, as the main solutions for collections.  Again, Set and MutableSet are available, matching the python ‘set’.  ‘List’ is equivalent to python ‘tuple’, MutableList to python ‘list’ and MutableMap to python ‘dictionary’.  In kotlin, there are many other options, some inherited from java, but they all have a logical role:

tuple -> List, Pair, data class

The python tuple is an immutable list, and the simple ability to use tuples  has a special role in the language with any expression with a comma creating a tuple.
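
A minimal sketch of what "immutable list" means in practice, and of the comma creating the tuple:

```python
point = (1, 2)
as_list = [1, 2]

as_list[0] = 10                  # a list can be changed in place
try:
    point[0] = 10                # a tuple cannot
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Being immutable (and so hashable), a tuple can be used as a dictionary key.
distances = {(0, 0): 0.0, (3, 4): 5.0}
lookup = distances[(3, 4)]

no_brackets = 1, 2               # the comma, not the brackets, makes the tuple
```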

Background: The good and bad of the ubiquitous python tuple

The good and bad of tuples can be seen in this example:

>>> a, b = 1, 2
>>> a, b = b, a
>>> a, b
(2, 1)

The good is that tuples have almost zero syntactic overhead: all you need is a ‘,’ (comma). The bad is that it can be confusing when there is a tuple, as opposed to another use of a ‘,’ in python.  Note that on the left of the assignment, it looks like a tuple, but the left of an assignment is special syntax and not a tuple.  While a tuple can hold values, it cannot hold names.  So a construct like the one below

>>> a, b = 1, 2
>>> my_tuple = a, b
>>> *(my_tuple), = 6, 7
>>> a, b  # see, a and b are not changed
(1, 2)

Does not assign to a name that does not appear on the left of the assignment. In this case the ‘,’  is not indicating a tuple, but indicating ‘destructured assignment’.  Not every ‘,’  in python indicates a tuple even though a comma alone may indicate a tuple.  The comma syntax being brief is very useful most of the time, but can lead to some quirks as to whether a ‘,’ means a tuple or has another use.

>>> a = 3
>>> type(a)
<class 'int'>
>>> a = 3,
>>> type(a) #just one comma and a is tuple
<class 'tuple'>
>>> def test(a):
       print(type(a))
>>> test(1,) # does the comma mean a tuple or not?
<class 'int'>
>>> test((1)) # add brackets to make a tuple?
<class 'int'>
>>> test((1,),) # finally, now we have a single parameter that is a tuple
<class 'tuple'>

coding python tuples in kotlin

Kotlin List is close to a direct replacement for tuple, but unlike tuples and lists in python, there is no special bracket syntax: use “List” for the type and “listOf()” to create an instance.

var (a1,b1) = listOf(1,2) // destructured assign a1 = 1, b1 = 2
var (a2,b2) = Pair(1,2)  // alternative destructured assign
data class XY(val x:Int, val y:Int)
var (a3,b3) = XY(1,2)  // destructured assign using data class

Note the ‘listOf()’ syntax to create a literal, while the type is ‘List’ in a declaration.

Even in a declaration, a list can be used as a ‘drop in replacement’ for a tuple.  The syntax of declaring from a list is not as brief as python, and it is not really within the kotlin ‘idiom’ to use List for a destructured declaration.

In reality, a destructured declaration is a clear indicator that each element of the data is distinct in nature, and not really a collection.  The most common alternatives to a data class in kotlin for destructured declarations are ‘Pair’ and ‘Triple’, which are actually data classes, but without usage specific property names.

So while List is a direct equivalent to tuple, consider Pair or even a data class as the best substitute depending on the usage.

list -> MutableList

The python list can also be used in situations that are not really usage as a collection, but this tends to occur less than with tuples. If the indexes to the list are literals, or the items in the list are not all of the same type, then consider whether the list should be replaced by something other than a MutableList.  A true collection will be indexed mostly within loops. Use ‘mutableListOf()’ to create a MutableList.

common list operations:

val myList = mutableListOf(1,2) // myList cannot be reassigned, but list is mutable
val added = myList.add(33)  // add the value 33 to the (end of the) list
//  note 'add' returns true if successful
myList.add(1,55) // insert into list at index 1 (2nd location)
//  list is now 1,55,2,33
myList.remove(55)  // find the first entry matching 55 and remove it (returns true if successful)
val popped = myList.removeAt(2) // remove the value found at index 2 and return that value (33 here)

dictionary -> Map or MutableMap (or data class)

The direct replacement for dictionary is the MutableMap, but with no python equivalent to the Map, dictionaries are often used as maps.  If the dictionary is declared with literal values in place, then a Map (declared as mapOf(key to value)) will be the replacement.  If the dictionary is declared empty, then a MutableMap (declared with mutableMapOf()) is likely to be the substitute.   Take care that dictionaries with literal strings for keys are probably really object substitutes and best replaced by a data class.
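To make the distinction concrete, here is a minimal python sketch of the two kinds of dictionary use. The variable names and sample data are made up for illustration. The first dictionary is a true map (keys come from the data), the second is an object substitute (keys are literal field names), which is the kind that would become a data class in kotlin.

```python
# A dictionary used as a true map: the keys are data, discovered at run time.
word_counts = {}
for word in "the cat saw the other cat".split():
    word_counts[word] = word_counts.get(word, 0) + 1

# A dictionary used as an object substitute: the keys are fixed field
# names typed into the source as string literals.
person = {"first_name": "bill", "last_name": "smith", "age": 23}

print(word_counts["cat"])    # key comes from the data
print(person["last_name"])   # key is a literal field name
```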

Common map operations.

val myMap = mutableMapOf<String,Int>()
//declared an empty map, type cannot be inferred without data
myMap.keys     // a property, not a function call
myMap.values   // also a property
myMap.toList()  // equivalent to python .items()

‘*’ and **

In python, any iterable can be used to provide parameters to a function by prefixing it with ‘*’.  Any dictionary can be used to provide keyword arguments by prefixing it with ‘**’.  kotlin has the concept of ‘vararg’, which can have values provided by the ‘*’ (spread) prefix, much as with python, although in kotlin the spread operator applies to an array rather than to any iterable.  However, there is currently no equivalent to the ‘**’.  (more notes on the ‘*’ to be added)
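As a reminder of the python side, a small sketch of ‘*’ and ‘**’ in a call. The function ‘describe’ and its parameters are hypothetical, invented only for this illustration.

```python
def describe(first_name, last_name, age=0, city="unknown"):
    """Hypothetical function used only to show argument unpacking."""
    return f"{first_name} {last_name}, {age}, {city}"

names = ("bill", "smith")                   # any iterable works with *
details = {"age": 23, "city": "new york"}   # a dictionary works with **

full = describe(*names, **details)
partial = describe(*iter(["tom", "jones"]))  # an iterator is fine too

print(full)
print(partial)
```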

list comprehension and other iterations vs map, filter, reduce

Python added map, filter and reduce together with lambda around 1994, and then list comprehensions around 2000 with python 2.0. In the linked post Guido (creator of python) presents strong arguments for list comprehensions, but notes that some people have suggested that limitations of the python lambda syntax are part of why comprehensions are favoured in python.  An argument is also presented on performance advantages, which in reality applies to python, but not to languages such as kotlin where compilation can support inline functions and other optimizations. In such languages, efficiencies depend on the compiler, not the technique itself.

Iterations vs map filter reduce, in both performance and appeal, is an argument that comes down to implementation details and personal taste.  Python has much stronger implementation of iterations in terms of both performance and appeal.  Kotlin has a map implementation with more performance than python list comprehensions,  but appeal is a personal choice.  Clearly, kotlin has better lambda than python, and that gives better map and filter, but comparing kotlin map and filter to the preferred python technique of iterations, cannot objectively produce a winner.   Just take the move from iterations in python to map and filter in kotlin with an open mind.

# first python
new_list = [map_func(it) for it in old_list]
//now kotlin
val newList = oldList.map{ mapFunc(it) }
#python for squaring list
new_list = [it * it for it in old_list]
//and kotlin
val newList = oldList.map{ it*it }

# now version filtering out odd numbers
new_list = [it * it for it in old_list if it % 2 == 0]
//kotlin
val newList = oldList.filter{ it % 2 == 0 }.map{ it*it }

With kotlin, a more complex, multi statement expression is possible without resorting to a ‘mapFunc’  (an external function to calculate the new value), but the two systems are similar.
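For completeness, python can also write the same thing with map and filter, which makes the comparison with kotlin more direct. This is only a sketch with made-up sample data, and the lambda-heavy form is shown for contrast rather than as a recommendation.

```python
from functools import reduce  # reduce lives in functools in python 3

old_list = [1, 2, 3, 4, 5, 6]

# the comprehension: square the even numbers
squares_comp = [it * it for it in old_list if it % 2 == 0]

# the same result using filter and map with lambdas
squares_map = list(map(lambda it: it * it,
                       filter(lambda it: it % 2 == 0, old_list)))

# reduce folds the list down to a single value, here the sum
total = reduce(lambda acc, it: acc + it, squares_comp, 0)

print(squares_comp, squares_map, total)
```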

Python added dictionary and set comprehensions in python 2.7, around 2010.  There is no tuple comprehension: the same syntax with round brackets in place of square brackets creates a generator expression, which can be passed to tuple() to build a tuple.  In kotlin, map syntax is unchanged between MutableList (python list equivalent) and List (python tuple equivalent).
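A short python sketch of the different comprehension forms, using made-up data. Note that round brackets produce a generator expression, so tuple() is needed to get an actual tuple.

```python
old_list = [1, 2, 2, 3]

squares_list = [it * it for it in old_list]        # list comprehension
squares_set = {it * it for it in old_list}         # set comprehension
squares_dict = {it: it * it for it in old_list}    # dictionary comprehension
squares_gen = (it * it for it in old_list)         # generator expression
squares_tuple = tuple(it * it for it in old_list)  # tuple built from a generator

print(type(squares_gen).__name__)
print(squares_tuple)
```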

Where the code is clearly cleaner for python is with a dictionary comprehension.  Starting with a map to produce a new map is a case I have yet to find clean kotlin code for:

#python
new_dict = {k + '2': v * 2 for k, v in old_dict.items()}
// kotlin
val newMap = oldMap.toList().associateBy({it.first+"2"},{it.second*2})

There is more optimised code for mapping the values of a map, or mapping the keys, but mapping both at once, as you can see from above, is a little clumsy.  There may be a cleaner solution, but if so, I have not yet found it.

Sets

Sets are largely the same in both languages, with kotlin once again adding an immutable variant. Sets are not used as substitutes for other things, and the uses of a set are generally the same in both languages.  Common set operations:

val mySet = mutableSetOf<Int>()
mySet.add(3) // add value 3 to set
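For comparison, a sketch of the equivalent operations in python, with made-up values. The built-in frozenset is the closest python counterpart to a read-only set.

```python
my_set = set()
my_set.add(3)   # add value 3 to set
my_set.add(3)   # adding a duplicate changes nothing
my_set.add(5)

frozen = frozenset(my_set)   # read-only snapshot of the set
other = {2, 3, 4}

common = my_set & other      # intersection
combined = my_set | other    # union

print(common, combined, 3 in frozen)
```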

Arrays.

Lists, Maps and Sets, as well as data class objects, are all objects stored as described in the variables and objects page, with a reference stored in static (or stack) memory to an object in dynamic memory.  The ultimate in flexibility, but not the ultimate in performance.

Arrays are effectively an alternative to a List.  Fixed in size, and constructed in place. This fixed size list has storage of data directly in static memory, at the expense of flexibility.  For Arrays of basic types, think of allocating a block of memory to hold a fixed number of items of the basic type.  If there are 10 four byte integers, then 40 bytes are required, and each integer can be addressed directly with no lookup required.  For other types, the static memory will be a block of references to each object in dynamic memory.

In addition to efficiency, Arrays also allow interoperability with Java programs which make use of these data structures.  I will return to this section, but for now, these are features which have no direct equivalence in python.

Data Classes: Alternative to ‘Faux Collections’?

In coding solutions to problems, the choice of how to store the data can be between objects, lists and dictionaries.   Kotlin data classes can change which is the best choice.  This page examines just how tuples, dictionaries and lists can be used as a ‘faux class’, and when to drop the ‘faux class’.

Page contents(TL;DR – the kotlin solution):

The ‘struct’ problem: precursor to class?

The ‘c’ language has the concept of a ‘struct’, which is a container for related, but not homogeneous, data.  Consider the following information about a person:

  • first name
  • last name
  • age
  • city

As long as the age is kept in string form, ‘c’ could keep this information as an ‘array’ of 4 strings, but referring to last name as ‘person[1]’ is far from ideal, and there is that problem of needing to keep age as a string.   The struct provides an improved solution, with descriptive names for the elements and a type for each individual field within the struct.  In c, structs can be passed by value (which means copied) as well as by reference, but comparison, ‘toString’ or other functions all have to be built separately.   The real lesson here is that every possible data type has a set of required methods.  In essence:  all data is an object.

Java: forced class hypocrisy

Java is a strange mixture.  The language was designed at a time when Object Oriented programming was seen as the ‘magic bullet’ to end all problems in programming. C++ provided objects bolted on to the language ‘C’,  but java sought to have ‘pure’ object oriented programs,  got the message wrong and decided ‘pure object oriented’ meant all code must be in classes, and missed that ‘all data is an object’.  The result is a language that is not really object oriented, but forces all code to be part of an object, even though the implementation of java does not even follow this edict itself.

Background: “one obvious solution” as a barrier to object oriented programming in python.

Python itself started out with an underlying structure that is very object oriented, but allowing a procedural style for code written in python.   Python appears to follow the plan that beginner programmers can embrace a procedural style, and allows for code to be procedural, often hiding object oriented underpinnings using procedural ‘syntactic sugar’.

Programmers can learn python with no concept of OOP, then later learn OOP as they advance.  The language concentrates on ‘one obvious way to code’, which requires that things done in a procedural way for learners should still appear procedural at all times. If you want one obvious way to solve a problem and the language allows a solution without OOP, then at least conceptually, an object oriented solution is not that one way.

In python, allowing beginners to code solutions without using objects, usually means allowing solutions substituting list, tuples, named tuples and dictionaries for data which might ideally be represented as ‘struct’ or objects.

In contrast, there has been no real work in the language to make it attractive to solve simple data requirements using classes.  Would this provide more than one logical way to solve a problem?  So data as an object still remains hard work. To start a useful object, an ‘__init__’ method, a ‘__str__’ method and a ‘__repr__’ method are all required just for basic functionality. Contrast this with named tuples, where all is done automatically!

The result is a language that allows those who have not learnt object oriented concepts to progress as far as possible without ever declaring a class. Learning classes can wait, and although code is built using a language with great object oriented foundations,  ‘faux objects’ built around collections (lists, tuples, dictionaries) are prevalent in python code.

‘faux collections’: python objects that appear as collections to the programmer.

But list, tuple, namedtuple and dictionary can all be used to describe data which is not really a collection, pretending that objects which are not collections are collections. The danger to programming is to forget that these ‘faux collections’ are not really collections.  The ‘named tuple’, where each item in the ‘collection’ has its own name, is inherently designed for use purely as a ‘faux collection’.

List, tuple, named tuple and dictionary types are all described as collections. The concept of a collection is that all members of the collection are the same in nature.  But it is possible to use these types very effectively to describe things which are not really collections at all.  Consider some data read from a file to describe some people.  Each line of the file has ‘first name’, ‘last name’, age, and city.

So two lines of the file might be:

  • bill, smith, 23, new york
  • tom, jones, 21, san Francisco

This file represents a true collection of ‘people’ because each line holds data which is the same in nature.  The first person or the ‘nth’ person are all people.  Every element in the collection has in common that it is a person.  But what do ‘first name’ and ‘age’ have in common?  The ‘collection’ of ‘first name’, ‘last name’, ‘age’ and ‘city’ can be held in a collection, but this is a ‘faux collection’.

In python:

people = []
with open("names") as lines:
    for line in lines:
        # strip the trailing newline, then split the line into its fields
        people.append(line.strip().split(","))

Would generate a list of people, but each person would be a list, where person[0] is the first name, person[1] is the last name etc.   So each line is using a collection for what really would be better as an object.  We could have a dictionary for each person so that person[‘first_name’] == ‘bill’ for our first person, and this may be more self documenting than person[0].
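The dictionary version could be sketched as below. To keep the sketch self contained it reads from a list of strings rather than a file, and the field names in ‘FIELDS’ are my own choice for the illustration.

```python
FIELDS = ("first_name", "last_name", "age", "city")  # hypothetical field names

# stand-in for the lines of the file
lines = ["bill, smith, 23, new york", "tom, jones, 21, san Francisco"]

people = []
for line in lines:
    # split on commas and trim the spaces around each value
    values = [value.strip() for value in line.split(",")]
    # pair each field name with its value to build one 'faux object'
    people.append(dict(zip(FIELDS, values)))

print(people[0]["first_name"])
print(people[1]["city"])
```

Note that the age is still held as a string here, the same compromise described for the ‘c’ array of strings.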

Python even gives named tuples, and each ‘person’ could be a named tuple.

>>> from collections import namedtuple
>>> Person=namedtuple("Person", "first_name last_name age city")
>>> person=Person("bill","smith",19,"new york")
>>> person
Person(first_name='bill', last_name='smith', age=19, city='new york')
>>> person.age
19

The named tuple works exactly like a class, with the limitation all values are immutable. Like the ‘c’ struct, again the elements have a name, but there are more methods like ‘toString()’ already available.
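The immutability limitation can be seen in a short sketch. The ‘_replace’ method is part of the standard namedtuple API and returns a modified copy rather than changing the original.

```python
from collections import namedtuple

Person = namedtuple("Person", "first_name last_name age city")
person = Person("bill", "smith", 19, "new york")

try:
    person.age = 20          # named tuple fields cannot be reassigned
    changed = True
except AttributeError:
    changed = False

older = person._replace(age=20)    # builds a new Person, original untouched
first, last, age, city = person    # still unpacks like any tuple

print(changed, person.age, older.age)
```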

A frequent request with python is for a ‘named list’, mirroring ‘named tuple’, to work just like a regular class.  But why not just make a class?  The reason is that a class definition requires a lot more code, with an __init__, an __str__ and a __repr__ increasing the one or two lines required to declare our named tuple into around 11 lines of code!
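For a sense of the extra code, here is one possible hand written version of the ‘person’ as a regular class. The exact formatting of __str__ and __repr__ is my own choice for the sketch, and the line count will vary with style.

```python
class Person:
    def __init__(self, first_name, last_name, age, city):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age
        self.city = city

    def __repr__(self):
        return (f"Person(first_name={self.first_name!r}, "
                f"last_name={self.last_name!r}, "
                f"age={self.age!r}, city={self.city!r})")

    def __str__(self):
        return f"{self.first_name} {self.last_name}, {self.age}, {self.city}"


person = Person("bill", "smith", 19, "new york")
person.age = 20   # mutable, which is what the 'named list' request is after
print(repr(person))
print(person)
```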

The kotlin solution: data classes

Consider this alternative to representing the ‘person’ from the previous section as a list, dictionary, tuple or named tuple.

data class Person(var first_name:String, var last_name:String, var age:Int, var city:String)

In one line we can define a class with a ‘constructor’ (equivalent to python __init__), a toString(), and even an equals comparator and a hashCode(). Using ‘val’ in place of var reproduces the ‘namedtuple’, but as used above it delivers on the request for a ‘namedlist’.  The python ‘namedtuple’ is really a class definition substitute, but in kotlin we can have an actual class just as easily.  This ease of use of a class makes many of the uses of dictionaries, lists and tuples in python redundant, and keeps the kotlin equivalents for use specifically as collections, and not the ‘object substitute’ usage that often occurs in python.

Lambdas: what do you mean I will use them?

Python has lambdas, but with python, the lambda is a very reduced lambda that Guido even hesitated about calling lambda.  I would suggest that with python, almost everything that can be done using lambdas, will be easier to understand if rewritten without the lambda.

Main Points of the page:

Background

Moving from python to kotlin in regards to lambdas, is not about the simple change in syntax. A major consequence of the syntax change is that lambdas play a greatly expanded role in programs written in the kotlin idiom. As the full power of kotlin lambdas becomes clear, a whole new tool in coding is revealed.

This page reviews lambdas, and explains why they are limited and usually undesirable in python but not limited and highly desirable in kotlin. TL;DR: read the headings and cherry pick which sections to read.

Ok, what is a lambda again?

Basically, lambdas are a ‘literal code’.  Imagine if you could not use literals in function calls or expressions, and instead literals have to be defined in a special ‘literal’ statement. The ‘hello world’ program would become:

lit greeting: "Hello World"
print(greeting)

simple code would get longer:

#this code would no longer be legal
people = 3
ears = people * 2

#instead we would have this
lit people: 3
lit doubler: 2
ears = people * doubler

Now it could be argued that by forcing our literals to have names, the code becomes more self documented. But it could be that the role of  “hello world” is no clearer by having it named ‘greeting’, and it could even be that naming ‘2’ as ‘doubler’ is actually confusing rather than helpful.  Note that while the declaration of ‘people’ can no longer use a general expression, this change has little real impact.

While with strings and integers forcing a special declaration and not allowing literals just seems silly, without lambdas, code literals are always moved to a special declaration. Here is an example in python declaring code with the special ‘def’ syntax, and also in the same way any other variable is declared by using a lambda.

def last_name(full_name_string):
     return full_name_string.split()[-1]
last_name_2 = last_name
last_lambda = lambda x: x.split()[-1]

>>> last_name("bill smith")
'smith'
>>> last_name_2("fred bloggs")
'bloggs'
>>> last_lambda("tom jones")
'jones'

This is not an example of the real use of lambdas, but to illustrate how a ‘def’ is just giving a name to a block of code.  All three variables reference functions that work exactly the same.  But to define ‘last_name’, the ‘def’ syntax was used.  It may not even seem obvious that the ‘def’ was simply defining a variable ‘last_name’ and giving it the value of a block of code. Once defined, ‘last_name’ works like any other variable, only the definition requires a special syntax.  In contrast ‘last_lambda’ is defined like any other variable, direct from the code.  The lambda code works just like a literal for code, allowing use of code anywhere.

The above example may help with understanding, but it is not the case where ‘code literals’ are needed.   In the example with numeric literals, forcing a declaration to have a special syntax did not really increase the code needed  ( syntax of ‘lit people: 3’  vs ‘people = 3’).  It was not being able to use a literal mid statement that resulted in more code.

ears = people * 2
# with no literals in an expression became
lit doubler: 2
ears = people * doubler

This is what happens with code without lambdas: a forced declaration (using ‘def’) in place of simply inserting the code where it is needed. It is where ‘code literals’ can be used mid expression that lambdas are really useful.

Simple example Use of Python Lambdas

def last_name(full_name_string):
  return full_name_string.split()[-1]
>>> my_list = ["bill smith", "fred blogs", "tom jones"]
>>> sorted(my_list, key=last_name)
['fred blogs', 'tom jones', 'bill smith']
>>> sorted(my_list, key=lambda full_name_string: full_name_string.split()[-1])
['fred blogs', 'tom jones', 'bill smith']

This is an example of the real use of lambdas.  This code takes a list of strings that is initially sorted alphabetically using the overall string, which results in sorting by first name, and returns a new list sorted by last name.

The code contrasts using a ‘def’ to define the code block, which of course gives the block of code a name and assigns it to a variable, with the lambda form where the code is a ‘literal’ within the call to sorted. Certainly the lambda saves declaring the function and saves two lines of code. The lambda also keeps the code definition where it is used, but not everyone will find things that readable.

Limitations of Python Lambdas

Kotlin Lambdas can inherit from a target environment, python Lambdas cannot.

Consider the use of kotlin in dsls like kotlinx.html.  The DSL (domain specific language) is built using lambdas.  The lambdas of the DSL can have their own language, because they have access to an enclosing environment.  In addition to the variables and methods of the enclosing scope of the lambda, there can be an entire additional vocabulary from a target environment.  There is no parallel to this in python, and the power comes from the combination of lambdas and extension functions. Consider this code:

class TestEnv(){
    var envVar = 3
    fun testInEnv(func: TestEnv.()->Unit){
        func()
    }
    fun envDoubler(value: Int) = value * 2
}
fun main(args: Array<String>){
    val test = TestEnv()
    val newNum = 4
    test.testInEnv ({ envVar = envDoubler(newNum) })
}

The lambda code can set ‘envVar’ in the target environment to a new value, and use the ‘envDoubler’ method for the calculation. This is a simplistic example to keep the sample code concise, but imagine a class with an embedded file or socket connection and the power of allowing lambda code to access that file or socket. Providing access to an additional environment revolutionises the power of lambdas in many ways other than enabling DSLs.

Significant whitespace is great, but it killed the lambda.

Lambdas in python are just clumsy.  I love the significant whitespace of python. It ensures easy to read code, and solves the ‘balancing braces’ challenge of other languages.  But it is hard to escape the fact a good lambda syntax and significant whitespace are difficult to combine.

Python blocks Rule! (But don’t do lambdas)

The rule for a block in python is that a block can follow any python line that ends with a ‘:’.   A block is indented from the line above, and ends when the line following the block is no longer indented.

To write a block in a ‘braces language’ like kotlin, most blocks will need a dedicated line following the block for just the closing brace.  Without this extra ‘}’ line the code is less readable.  In python a cleanly written minimal ‘block’ requires no additional lines, only one line for each line of code in the block.   The code is always readable without decisions by the programmer to make the code readable, and the code simply requires one less line.

These extra lines can add up and make code longer, and it is easy to get the indenting wrong making the code hard to read. Winner: python.

Python syntax: python blocks kill the lambda

With lambda, you need to place a block in the middle of an expression, and you just cannot do that with a conventional python block.  So python has a custom syntax, where the lambda code is limited to just one expression, and unlike every other ‘:’ in python, a block cannot follow the ‘:’ in a lambda.  The syntax is

lambda arguments: expression

Reasons to avoid lambdas in python:

    • A def is often more readable
    • Multi statement blocks are not even possible using lambdas
    • List comprehensions (and dict and set comprehensions) replace some of the most common lambda use cases

No ‘like lambda to the…’ jokes here! The result is that python lambdas are crippled.
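The single expression limit can be checked with a small sketch. The helper ‘compiles’ and the sample source strings are hypothetical and exist only for this illustration. It asks the python compiler whether a piece of source text is accepted, then shows the ‘def’ that is needed once more than one statement is involved.

```python
def compiles(source):
    """Return True if python accepts the source text, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

single_expression = "f = lambda x: x.split()[-1]"
with_statement = "f = lambda x: return x.split()[-1]"   # 'return' is a statement

print(compiles(single_expression))
print(compiles(with_statement))

# anything needing more than one statement has to become a def
def last_name_upper(full_name):
    parts = full_name.split()
    return parts[-1].upper()

print(last_name_upper("bill smith"))
```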

Introducing Kotlin Lambdas.

The approach in Kotlin is to make lambdas, or “code literals” as easy as using String literals or Int literals. In fact even the “hello world” can have a lambda added.

fun main(args: Array<String>){
   println({"hello world"}())
}

The block ‘{ “hello world” }’ acts as a simple lambda function that returns the string hello world.  If you try this without the () brackets, it will print “() -> kotlin.String” indicating a function that returns a string.

So creating a lambda is as simple as creating a code block.

However, most useful lambda functions will need at least one argument.  Below are two examples of the previous ‘last name’ sort.

val lst = listOf("bill smith", "fred blogs", "tom jones")

// first ....no 'shortcuts' used, the long way
lst.sortedBy({fullName -> fullName.split(" ").last()})
// result: [fred blogs, tom jones, bill smith]

// now...the preferred way
lst.sortedBy{it.split(" ").last()}
// result: [fred blogs, tom jones, bill smith]

Note that at the start of the block we can give names to the parameters: “fullName ->” is declaring the parameter as “fullName”.  The block then proceeds as any other block.  There is no limit to how many statements are in the block, nor on having constructs such as “if” or “when” or loops within the block.  The value of the final expression in the block is the return value.

The ‘preferred way’ example has two changes from the first ‘no shortcuts’ solution. Consider these two sentences in English, “I am looking for the correct key, is the correct key long?  And is the correct key also grey?”   Now consider “I am looking for the correct key, is it long? And is it also grey?”   Repeating the name of what we are talking about is not necessary, and the name can be replaced by “it”.  Kotlin takes the same approach. Instead of using “fullName ->” to give a name to what we are talking about, the parameter can just be called “it” during our block.  So the “fullName ->” at the beginning of the block can be omitted, and in place of ‘fullName.split(" ")’ the code just has ‘it.split(" ")’. As in English, ‘it’ should only be used when it is obvious what we mean by “it”, but using “it” is then often preferable to the longer way of saying what we mean.

So, firstly, omitting the parameter declaration gives the parameter a name of ‘it’ by default, reducing the clutter for simple lambdas such as this.

Next the brackets, ( ), around the parameter list to ‘sortedBy’ have been omitted. In place of ‘({ it.split(" ").last() })’, the code has simply ‘{ it.split(" ").last() }’.

This is actually the result of two separate rules which are designed to make lambdas more readable.  Firstly, a lambda that is the last parameter can follow the parameter list, to avoid needing to have brackets around the code.  This becomes most important when function calls are nested.  Secondly, if a function with lambda values as parameters would otherwise have an empty parameter list, then the brackets may be omitted. See the following:

// first ....no 'shortcuts' used, the long way
lst.sortedBy({fullName -> fullName.split(" ").last()})

// next step use 'it' to save an actual name for parameter
lst.sortedBy({ it.split(" ").last() })

// and now we can have our lambda move after the parameter list for 'sortedBy'
lst.sortedBy(){ it.split(" ").last()}

// finally, sortedBy now has an empty parameter list () ..so the () can be omitted
lst.sortedBy{ it.split(" ").last() }

Conclusion.

Kotlin allows for extremely powerful lambdas, with no limitation to a single expression, as well as very concise and highly readable lambda code.  This changes how problems are solved by making lambdas a far more important part of solutions in the idiom of the language.